Re: [Bioc-devel] Controlling vignette compilation order

Aaron Lun Fri, 21 Dec 2018 23:27:17 -0800

I gave it a shot:

https://github.com/LTLA/DrakeTest


This uses a single “controller” Rmd file to trigger drake::make(). Running this
file instructs drake to compile all of the other vignettes in the order
implied by the desired dependency structure.
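
To make the ordering idea concrete, here is a minimal base-R sketch of what
"following the dependency structure" amounts to: a topological sort of the
vignette dependency graph. This is not the code in the repository above and it
does not use drake; the file names, the `deps` map and the `compile_order()`
helper are all hypothetical, chosen only for illustration.

```r
# Hypothetical dependency map: each name is a vignette, each value lists the
# vignettes whose outputs it needs. File names are illustrative only.
deps <- list(
  "extra.Rmd" = c("reads.Rmd", "umi.Rmd"),
  "reads.Rmd" = character(0),
  "umi.Rmd"   = character(0)
)

# Return the vignettes in an order where every dependency comes before the
# vignette that needs it (depth-first topological sort). Stops with an error
# on circular dependencies or on names missing from the map.
compile_order <- function(deps) {
  done <- character(0)
  visit <- function(node, path) {
    if (node %in% done) return(invisible(NULL))
    if (node %in% path) stop("circular dependency involving ", node)
    if (!node %in% names(deps)) stop("unknown vignette: ", node)
    for (d in deps[[node]]) visit(d, c(path, node))
    done <<- c(done, node)  # record the node only after its dependencies
    invisible(NULL)
  }
  for (node in names(deps)) visit(node, character(0))
  done
}

ordered <- compile_order(deps)
print(ordered)  # order: reads.Rmd, umi.Rmd, extra.Rmd
```

A controller could loop over `ordered` and call rmarkdown::render() on each
file in turn. What drake adds on top of a plain ordering like this is that it
skips targets that are already up to date.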

The current sticking point is that I need to move the drake-controlled Rmd
files out of “vignettes/”; otherwise they’ll just be compiled as usual, without
consideration of their dependencies. This causes problems because R CMD build
recognizes only the controller Rmd file as a vignette, and doesn’t retain or
index the HTML files produced from the other Rmd files as side effects of
running the controller.

Are there any better ways to subvert the vignette building procedure to get the 
desired effect of running drake::make() and recognition of the resulting HTMLs 
as vignettes?

-A

> On 18 Dec 2018, at 17:41, Michael Lawrence <lawrence.mich...@gene.com> wrote:
> 
> Sounds like a use case for drake...
> 
> On Tue, Dec 18, 2018 at 6:58 AM Aaron Lun
> <infinite.monkeys.with.keyboa...@gmail.com> wrote:
> @Michael In this case, the resource produced by vignette X is a 
> SingleCellExperiment object containing the results of various processing 
> steps (normalization, clustering, etc.) described in that vignette. 
> 
> I can imagine a lazy evaluation model for this, but it wouldn’t be pretty. If 
> I had another vignette Y that depended on the SCE produced by vignette X, I 
> would need Y to execute all of the steps in X if X hadn’t already been run 
> before Y. This gets us into the territory of Makefile-like dependencies, 
> which seems even more complicated than simply specifying a compilation order.
> 
> You might ask why X and Y are split into two separate vignettes. The use of 
> different vignettes is motivated by the complexity of the workflows:
> 
> - Vignette 1 demonstrates core processing steps for one read-based 
> single-cell RNAseq dataset. 
> - Vignette 2 demonstrates (slightly different) core steps for a UMI-based 
> dataset.
> - … and so on for a bunch of other core steps for different types of data.
> - Vignette 6 demonstrates extra optional steps for the two SCEs produced by 
> vignettes 1 & 3.
> - … and so on for a bunch of other optional steps.
> 
> The separation between core and optional steps into separate documents is 
> desirable. From a pedagogical perspective, I would very much like to get the 
> reader through all the core steps before even considering the extra steps, 
> which would just be confusing if presented so early on. Previously, 
> everything was in a single document, which was difficult to read (for users) 
> and to debug (for me), especially because I had to use contrived variable 
> names to avoid clashes between different sections of the workflow that did 
> similar things.
> 
> @Martin I’ve been using BiocFileCache for all of the online resources that 
> are used in the workflow. However, this is only for my (and the reader’s) 
> convenience. I use a local cache rather than the system default, to ensure 
> that the downloaded files are removed after package build. This is 
> intentional as it forces the package builder to try to re-download resources 
> when compiling the vignette, thus ensuring the validity of the URLs. For a 
> similar reason, I would prefer not to cache the result objects for use in 
> different R sessions. I could imagine caching the result objects for use by a 
> different vignette in the same build session, but this gets back to the 
> problem of ensuring that the result object is generated by one vignette 
> before it is needed by another vignette.
> 
> -A
> 
> > On 18 Dec 2018, at 14:14, Martin Morgan <mtmorgan.b...@gmail.com> wrote:
> > 
> > Also perhaps using BiocFileCache so that the result object is only 
> > generated once, then cached for future (different session) use.
> > 
> > On 12/18/18, 8:35 AM, "Bioc-devel on behalf of Michael Lawrence"
> > <bioc-devel-boun...@r-project.org on behalf of
> > lawrence.mich...@gene.com> wrote:
> > 
> >    I would recommend against dependencies across vignettes. Ideally someone
> >    can pick up a vignette and execute the code independently of any other
> >    documentation. Perhaps you could move the code generating those shared
> >    resources to the package. They could behave lazily, only generating the
> >    resource if necessary, otherwise reusing it. That would also make it easy
> >    for people to write their own documents using those resources.
> > 
> >    Michael
> > 
> >    On Tue, Dec 18, 2018 at 5:22 AM Aaron Lun
> >    <infinite.monkeys.with.keyboa...@gmail.com> wrote:
> > 
> >> In a number of my workflow packages (e.g., simpleSingleCell), I rely on a
> >> specific compilation order for my vignettes. This is because some vignettes
> >> set up resources or objects that are to be used by later vignettes.
> >> 
> >> From what I understand, vignettes are compiled in alphanumeric ordering of
> >> their file names. As such, I give my vignettes fairly structured names,
> >> e.g., “work-1-reads.Rmd”, “work-2-umi.Rmd” and so on.
> >> 
> >> However, it becomes rather annoying when I want to add a new vignette in
> >> the middle somewhere. This results in some unnatural numberings, e.g.,
> >> “work-0”, “3b”, which are ugly and unintuitive. This is relevant as
> >> BiocStyle::Biocpkg() links between vignettes require you to use the
> >> destination vignette’s file name; so difficult names complicate linking,
> >> especially if the names continually change to reflect new orderings.
> >> 
> >> Is there an easier way to control vignette compilation order? WRE provides
> >> no (obvious) guidance, so I would like to know what non-standard hacks are
> >> known to work on the build machines. I can imagine something dirty whereby
> >> one “reference” vignette contains code to “rmarkdown::render” all other
> >> vignettes in the specified order… ugh.
> >> 
> >> -A
> >> 
> >> _______________________________________________
> >> Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org> mailing list
> >> https://stat.ethz.ch/mailman/listinfo/bioc-devel 
> >> <https://stat.ethz.ch/mailman/listinfo/bioc-devel>

