Re: [Bioc-devel] Handling larger data in vignette

Martin Morgan Fri, 27 Jul 2018 02:50:04 -0700

Remember also that there are overall evaluation time limits.

One strategy might use existing, stable, publicly available SC data sets from, e.g.,


  http://imlspenticton.uzh.ch:3838/conquer/
  https://hemberg-lab.github.io/scRNA.seq.datasets/

These could be downloaded using BiocFileCache as a first step in the vignette; the download cost would be 'paid' the first time, but subsequent use would be from the locally cached data.
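
A minimal sketch of that caching step, as it might appear in an early vignette chunk. The helper name `cached_path` is invented for illustration and is not part of the suggestion above; `BiocFileCache()` and `bfcrpath()` are the package's own functions.

```r
library(BiocFileCache)

# Hypothetical helper: return a local path for `url`, downloading the
# resource only when it is not already present in the cache.
cached_path <- function(url, cache = BiocFileCache(ask = FALSE)) {
    # bfcrpath() looks the resource up by name and adds it if it is missing
    unname(bfcrpath(cache, url))
}
```

In the vignette this might be called as `path <- cached_path("https://example.org/sc_dataset.rds")` (a placeholder URL, to be replaced by the real location of the data), followed by whatever reader suits the file, e.g. `readRDS(path)`.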

A second approach would be to create an ExperimentHub package, and to use that in your examples.


  http://bioconductor.org/packages/devel/ExperimentHub

  http://bioconductor.org/packages/devel/bioc/vignettes/ExperimentHub/inst/doc/CreateAnExperimentHubPackage.html

The submission process would start by submitting the EH package, and then adding, once the kinks in the experiment data package are worked out, the software package to the issue. The data and software packages would be accepted together.
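
Once such a data package exists, using it from the software package's vignette could look roughly like the sketch below. The helper `first_hub_record` is hypothetical, and which record is wanted depends on how the data package's metadata is written.

```r
library(ExperimentHub)

# Hypothetical helper: fetch the first hub record whose metadata matches
# `pattern` (for example, the name of the experiment data package).
first_hub_record <- function(pattern, hub = ExperimentHub()) {
    hits <- query(hub, pattern)    # search the hub metadata
    stopifnot(length(hits) > 0)    # fail early if nothing matches
    hits[[1]]                      # downloaded on first use, cached afterwards
}
```

As with the first approach, the download happens once and later vignette builds read from the local cache.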


Martin

On 07/27/2018 02:38 AM, Sarah Williams via Bioc-devel wrote:

Hi,

I'm preparing a package for submission to Bioconductor, but hitting the 4 MB
size limit due to examples in my vignette.

I do have a demo toy sized dataset which I use for the bulk of the
vignette. But I wanted to show real-data examples at the end because the
approach doesn't work well on toy-sized data.

Conceivably everything except for the 'examples' section would make a
'complete' vignette (but probably not a very helpful one...). I'm wondering
if I should static-ify just those examples? Might hit the 50% runnable
code-chunk limit then though. Unfortunately it's a rank-based approach so I
can't really take the top 100 genes for these particular objects.

Not sure how best to solve this, any tips/suggestions? Thanks!

(The problematic vignette is here:
https://bioinformatics.erc.monash.edu/home/sarah.williams/projects/cell_groupings/doco/celaref_doco.html
)

NB: To make matters worse this is a tool for comparing datasets - so I have
multiples! They are public datasets and I haven't done anything exciting
with them (nor would anyone else want to reuse processed objects) - so I
don't think that I should make a data package.

NB: Using xz compression and some cleanup I got it down to 21 MB, from 49 MB
- so it's not huge, but I don't think I can get to <4 MB this way.


_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


