Sounds like a use case for drake...

On Tue, Dec 18, 2018 at 6:58 AM Aaron Lun <infinite.monkeys.with.keyboa...@gmail.com> wrote:
> @Michael In this case, the resource produced by vignette X is a
> SingleCellExperiment object containing the results of various processing
> steps (normalization, clustering, etc.) described in that vignette.
>
> I can imagine a lazy evaluation model for this, but it wouldn’t be pretty.
> If I had another vignette Y that depended on the SCE produced by vignette
> X, I would need Y to execute all of the steps in X if X hadn’t already been
> run before Y. This gets us into the territory of Makefile-like
> dependencies, which seems even more complicated than simply specifying a
> compilation order.
>
> You might ask why X and Y are split into two separate vignettes. The use
> of different vignettes is motivated by the complexity of the workflows:
>
> - Vignette 1 demonstrates core processing steps for one read-based
>   single-cell RNAseq dataset.
> - Vignette 2 demonstrates (slightly different) core steps for a UMI-based
>   dataset.
> - … so on for a bunch of other core steps for different types of data.
> - Vignette 6 demonstrates extra optional steps for the two SCEs produced
>   by vignettes 1 & 3.
> - … and so on for a bunch of other optional steps.
>
> The separation between core and optional steps into separate documents is
> desirable. From a pedagogical perspective, I would very much like to get
> the reader through all the core steps before even considering the extra
> steps, which would just be confusing if presented so early on. Previously,
> everything was in a single document, which was difficult to read (for
> users) and to debug (for me), especially because I had to use contrived
> variable names to avoid clashes between different sections of the workflow
> that did similar things.
>
> @Martin I’ve been using BiocFileCache for all of the online resources that
> are used in the workflow. However, this is only for my (and the reader’s)
> convenience.
> I use a local cache rather than the system default, to ensure
> that the downloaded files are removed after package build. This is
> intentional as it forces the package builder to try to re-download
> resources when compiling the vignette, thus ensuring the validity of the
> URLs. For a similar reason, I would prefer not to cache the result objects
> for use in different R sessions. I could imagine caching the result objects
> for use by a different vignette in the same build session, but this gets
> back to the problem of ensuring that the result object is generated by one
> vignette before it is needed by another vignette.
>
> -A
>
> > On 18 Dec 2018, at 14:14, Martin Morgan <mtmorgan.b...@gmail.com> wrote:
> >
> > Also perhaps using BiocFileCache so that the result object is only
> > generated once, then cached for future (different session) use.
> >
> > On 12/18/18, 8:35 AM, "Bioc-devel on behalf of Michael Lawrence"
> > <bioc-devel-boun...@r-project.org on behalf of lawrence.mich...@gene.com>
> > wrote:
> >
> > I would recommend against dependencies across vignettes. Ideally someone
> > can pick up a vignette and execute the code independently of any other
> > documentation. Perhaps you could move the code generating those shared
> > resources to the package. They could behave lazily, only generating the
> > resource if necessary, otherwise reusing it. That would also make it easy
> > for people to write their own documents using those resources.
> >
> > Michael
> >
> > On Tue, Dec 18, 2018 at 5:22 AM Aaron Lun
> > <infinite.monkeys.with.keyboa...@gmail.com> wrote:
> >
> >> In a number of my workflow packages (e.g., simpleSingleCell), I rely on a
> >> specific compilation order for my vignettes. This is because some vignettes
> >> set up resources or objects that are to be used by later vignettes.
> >>
> >> From what I understand, vignettes are compiled in alphanumeric ordering of
> >> their file names.
> >> As such, I give my vignettes fairly structured names,
> >> e.g., “work-1-reads.Rmd”, “work-2-umi.Rmd” and so on.
> >>
> >> However, it becomes rather annoying when I want to add a new vignette in
> >> the middle somewhere. This results in some unnatural numberings, e.g.,
> >> “work-0”, “3b”, which are ugly and unintuitive. This is relevant as
> >> BiocStyle::Biocpkg() links between vignettes require you to use the
> >> destination vignette’s file name; so difficult names complicate linking,
> >> especially if the names continually change to reflect new orderings.
> >>
> >> Is there an easier way to control vignette compilation order? WRE provides
> >> no (obvious) guidance, so I would like to know what non-standard hacks are
> >> known to work on the build machines. I can imagine something dirty whereby
> >> one “reference” vignette contains code to “rmarkdown::render” all other
> >> vignettes in the specified order… ugh.
> >>
> >> -A
> >>
> >> _______________________________________________
> >> Bioc-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> > [[alternative HTML version deleted]]
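
For illustration only, the "lazy" shared-resource idea discussed in this thread (generate the object only if it does not exist yet, otherwise reuse it) might be sketched roughly as follows in base R. The helper name lazy_resource(), the stand-in builder make_reads_sce(), and the cache location under tempdir() are all assumptions made for this sketch; none of them come from simpleSingleCell, BiocFileCache, or drake. Whether a tempdir()-based cache is actually shared between vignettes depends on how the build runs them, so a real package might need a different location.

```r
# Hypothetical helper, not part of any existing package.
# 'build' is a zero-argument function that produces the shared object.
lazy_resource <- function(name, build,
                          cache_dir = file.path(tempdir(), "workflow-cache")) {
    dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
    path <- file.path(cache_dir, paste0(name, ".rds"))
    if (!file.exists(path)) {
        # First request: run the processing steps and store the result.
        saveRDS(build(), path)
    }
    # Every caller reads the stored copy, whether or not it was just built.
    readRDS(path)
}

# Stand-in for the processing steps of "vignette X". A plain list is used
# here; the real workflow would return a SingleCellExperiment.
make_reads_sce <- function() {
    list(counts = matrix(1:6, nrow = 2), clusters = c(1L, 1L, 2L))
}

sce_first  <- lazy_resource("sce-reads", make_reads_sce)  # builds, then reads
sce_second <- lazy_resource("sce-reads", make_reads_sce)  # reads stored copy
stopifnot(identical(sce_first, sce_second))
```

With a helper along these lines, any vignette could request "sce-reads" without caring whether another vignette had already been compiled, at the cost of running the builder again whenever the cache is empty.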