Re: [Bioc-devel] Controlling vignette compilation order

Martin Morgan Sat, 22 Dec 2018 11:07:49 -0800

...but in the end isn't it just simpler to name your vignettes in collation 
order? Who other than you will be able to parse what you've done?
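
A minimal sketch of what naming in collation order implies, assuming the build follows plain string ordering of file names as described in the original post quoted below. The name "work-10-extras.Rmd" and the zero-padded names are hypothetical, chosen only to show the pitfall.

```r
# Hypothetical vignette file names, loosely following the numbering scheme
# mentioned in this thread ("work-10-extras.Rmd" is made up for illustration).
vignettes <- c("work-2-umi.Rmd", "work-10-extras.Rmd", "work-1-reads.Rmd")

# method = "radix" sorts character vectors in the C locale, so the result
# does not depend on the locale of the R session.
sorted <- sort(vignettes, method = "radix")
print(sorted)

# Zero-padding the numbers keeps numeric order and string order in agreement.
padded <- c("work-02-umi.Rmd", "work-10-extras.Rmd", "work-01-reads.Rmd")
sorted_padded <- sort(padded, method = "radix")
print(sorted_padded)
```

With unpadded numbers, "work-10-extras.Rmd" sorts before "work-2-umi.Rmd"; zero-padding avoids that, though it does not by itself make it easier to insert a new vignette in the middle.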


Martin

On 12/22/18, 1:56 PM, "Bioc-devel on behalf of Michael Lawrence" 
<bioc-devel-boun...@r-project.org on behalf of lawrence.mich...@gene.com> wrote:

    Anything that eventually lands in inst/doc is a vignette, I think, so
    there might be a hack around that.
    
    On Fri, Dec 21, 2018 at 11:26 PM Aaron Lun
    <infinite.monkeys.with.keyboa...@gmail.com> wrote:
    >
    > I gave it a shot:
    >
    > https://github.com/LTLA/DrakeTest
    >
    > This uses a single “controller” Rmd file to trigger drake::make(). Running
    > this file will instruct drake to compile all of the other vignettes
    > following the desired dependency structure.
    >
    > The current sticking point is that I need to move the drake-controlled
    > Rmd files out of “vignettes/”, otherwise they’ll just be compiled as usual
    > without consideration of their dependencies. This causes problems as
    > R CMD build only recognizes the controller Rmd file as the sole vignette,
    > and doesn’t retain or index the HTML files produced from the other Rmd
    > files as side-effects of running the controller.
    >
    > Are there any better ways to subvert the vignette building procedure to
    > get the desired effect of running drake::make() and recognition of the
    > resulting HTMLs as vignettes?
    >
    > -A
    >
    > > On 18 Dec 2018, at 17:41, Michael Lawrence <lawrence.mich...@gene.com> wrote:
    > >
    > > Sounds like a use case for drake...
    > >
    > > On Tue, Dec 18, 2018 at 6:58 AM Aaron Lun
    > > <infinite.monkeys.with.keyboa...@gmail.com> wrote:
    > > @Michael In this case, the resource produced by vignette X is a
    > > SingleCellExperiment object containing the results of various processing
    > > steps (normalization, clustering, etc.) described in that vignette.
    > >
    > > I can imagine a lazy evaluation model for this, but it wouldn’t be
    > > pretty. If I had another vignette Y that depended on the SCE produced by
    > > vignette X, I would need Y to execute all of the steps in X if X hadn’t
    > > already been run before Y. This gets us into the territory of
    > > Makefile-like dependencies, which seems even more complicated than simply
    > > specifying a compilation order.
    > >
    > > You might ask why X and Y are split into two separate vignettes. The
    > > use of different vignettes is motivated by the complexity of the workflows:
    > >
    > > - Vignette 1 demonstrates core processing steps for one read-based
    > >   single-cell RNAseq dataset.
    > > - Vignette 2 demonstrates (slightly different) core steps for a
    > >   UMI-based dataset.
    > > - … and so on for a bunch of other core steps for different types of data.
    > > - Vignette 6 demonstrates extra optional steps for the two SCEs
    > >   produced by vignettes 1 & 3.
    > > - … and so on for a bunch of other optional steps.
    > >
    > > The separation of core and optional steps into separate documents is
    > > desirable. From a pedagogical perspective, I would very much like to get
    > > the reader through all the core steps before even considering the extra
    > > steps, which would just be confusing if presented so early on. Previously,
    > > everything was in a single document, which was difficult to read (for
    > > users) and to debug (for me), especially because I had to use contrived
    > > variable names to avoid clashes between different sections of the workflow
    > > that did similar things.
    > >
    > > @Martin I’ve been using BiocFileCache for all of the online resources
    > > that are used in the workflow. However, this is only for my (and the
    > > reader’s) convenience. I use a local cache rather than the system default,
    > > to ensure that the downloaded files are removed after package build. This
    > > is intentional as it forces the package builder to try to re-download
    > > resources when compiling the vignette, thus ensuring the validity of the
    > > URLs. For a similar reason, I would prefer not to cache the result objects
    > > for use in different R sessions. I could imagine caching the result objects
    > > for use by a different vignette in the same build session, but this gets
    > > back to the problem of ensuring that the result object is generated by one
    > > vignette before it is needed by another vignette.
    > >
    > > -A
    > >
    > > > On 18 Dec 2018, at 14:14, Martin Morgan <mtmorgan.b...@gmail.com> wrote:
    > > >
    > > > Also perhaps using BiocFileCache so that the result object is only
    > > > generated once, then cached for future (different session) use.
    > > >
    > > > On 12/18/18, 8:35 AM, "Bioc-devel on behalf of Michael Lawrence"
    > > > <bioc-devel-boun...@r-project.org on behalf of
    > > > lawrence.mich...@gene.com> wrote:
    > > >
    > > >    I would recommend against dependencies across vignettes. Ideally
    > > >    someone can pick up a vignette and execute the code independently of
    > > >    any other documentation. Perhaps you could move the code generating
    > > >    those shared resources to the package. They could behave lazily, only
    > > >    generating the resource if necessary, otherwise reusing it. That would
    > > >    also make it easy for people to write their own documents using those
    > > >    resources.
    > > >
    > > >    Michael
    > > >
    > > >    On Tue, Dec 18, 2018 at 5:22 AM Aaron Lun
    > > >    <infinite.monkeys.with.keyboa...@gmail.com> wrote:
    > > >
    > > >> In a number of my workflow packages (e.g., simpleSingleCell), I rely
    > > >> on a specific compilation order for my vignettes. This is because some
    > > >> vignettes set up resources or objects that are to be used by later
    > > >> vignettes.
    > > >>
    > > >> From what I understand, vignettes are compiled in alphanumeric
    > > >> ordering of their file names. As such, I give my vignettes fairly
    > > >> structured names, e.g., “work-1-reads.Rmd”, “work-2-umi.Rmd” and so on.
    > > >>
    > > >> However, it becomes rather annoying when I want to add a new vignette
    > > >> in the middle somewhere. This results in some unnatural numberings,
    > > >> e.g., “work-0”, “3b”, which are ugly and unintuitive. This is relevant
    > > >> as BiocStyle::Biocpkg() links between vignettes require you to use the
    > > >> destination vignette’s file name; so difficult names complicate
    > > >> linking, especially if the names continually change to reflect new
    > > >> orderings.
    > > >>
    > > >> Is there an easier way to control vignette compilation order? WRE
    > > >> provides no (obvious) guidance, so I would like to know what
    > > >> non-standard hacks are known to work on the build machines. I can
    > > >> imagine something dirty whereby one “reference” vignette contains code
    > > >> to “rmarkdown::render” all other vignettes in the specified order… ugh.
    > > >>
    > > >> -A
    > > >>
    > > >> _______________________________________________
    > > >> Bioc-devel@r-project.org mailing list
    > > >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
    > > >>
    > > >>
    > > >
    > > >       [[alternative HTML version deleted]]
    > > >
    > > >
    > >
    >
    >
    
