Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-29 Thread Aaron Lun
Thanks Martin. Looks like it's building happily now, which gives us some 
breathing space.


-Aaron


From: Martin Morgan <martin.mor...@roswellpark.org>
Sent: Tuesday, 26 September 2017 6:05:22 PM
To: Aaron Lun; Hervé Pagès; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange 
error in Jenkins build forsingleCellWorkflow

On 09/26/2017 03:04 AM, Aaron Lun wrote:
> Hi Herve,
>
>
> I tried out the .BBSoptions approach, but it seems that the build system
> is still having some trouble:
>
>
> http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/label=master/59/console
>
>
> I bumped up the maximum number of DLLs to 200 in .BBSoptions, but to no
> effect. Any ideas?

This is my bad advice; as Herve mentions the workflow builders do not
respect BBS options. We will adjust the max. DLLs on our end. Please be
patient.

Martin

>
>
> -Aaron
>
> 
> *From:* Hervé Pagès <hpa...@fredhutch.org>
> *Sent:* Thursday, 21 September 2017 3:06:18 PM
> *To:* Aaron Lun; Martin Morgan; bioc-devel@r-project.org
> *Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
> strange error in Jenkins build forsingleCellWorkflow
> Hi,
>
> @Martin: It's good news that the workflows have been standardized as
> packages but aren't we still using the traditional workflow builder?
> AFAIK .BBSoptions files are only honoured on the main build system
> (a.k.a. BBS).
>
> @Aaron: If we decide to use BBS (our main build system) to build the
> workflows, then you'll be able to control R_MAX_NUM_DLLS by adding
> the following lines to your .BBSoptions file:
>
> RbuildPrepend: R_MAX_NUM_DLLS=150
> RbuildPrepend.win: set R_MAX_NUM_DLLS=150&&
> RcheckPrepend: R_MAX_NUM_DLLS=150
> RcheckPrepend.win: set R_MAX_NUM_DLLS=150&&
>
> You might not need all of them but it doesn't hurt to have them
> all. Note that you should not try to put a space before && in the
> RbuildPrepend.win or RcheckPrepend.win value.
>
> H.
>
> On 09/19/2017 05:51 PM, Aaron Lun wrote:
>> Thanks Martin. I think I will stick to one workflow for now, until the
>> BioC-workflows page provides some formal support for multiple workflows
>> representing different components of the same workflow (i.e., other than
>> me manually writing in the abstract that "This workflow is based on the
>> concepts introduced in the previous workflow X").
>>
>>
>> @Herve can you help me out with the .BBSoptions configuration for
>> R_MAX_NUM_DLLS? I guess we should also indicate to the user that this
>> needs to be increased in order for the workflow to run.
>>
>>
>> -Aaron
>>
>>
>>
>> ------------------------
>> *From:* Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of
>> Martin Morgan <martin.mor...@roswellpark.org>
>> *Sent:* Wednesday, 20 September 2017 2:16 AM
>> *To:* Wolfgang Huber; bioc-devel@r-project.org
>> *Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
>> strange error in Jenkins build forsingleCellWorkflow
>> On 09/19/2017 09:50 AM, Wolfgang Huber wrote:
>>>
>>> My 3 cents:
>>> - I think this is a more and more common problem that I'm also
>>> encountering in everyday work and that asks for a general solution.
>>> - I agree with Martin that setting R_MAX_NUM_DLLS is better than
>>> unloading. AFAIK it is not even possible to cleanly unload every package
>>> ('as if it had never been loaded') due to irreversible global effects;
>>> although I'd be happy to be educated otherwise.
>>> - R_MAX_NUM_DLLS is not a sustainable solution either: the current
>>> default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an
>>> error. Upping to the maximum 152 will give us some temporary respite but
>>> seems not really future-proof.
>>
>> This was the R-core motivation for increasing the max to only 100, but
>> it's still surprising to me that a modern OS has such a tight limit.
>> I'll see if there are ideas in R-core.
>>
>>   From our internal discussions there is some willingness to (continue)
>> supporting large and complicated work flows, but it is valuable to think
>> carefully about the consequences for users following along. Maybe part
>> of this is clearly alerting the user to the fact that 500G of data are
>> going to be downloaded, the workflow requires advanced configuration of
>> R, etc.
>>
>> @Aaron -- if you'd

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-26 Thread Martin Morgan

On 09/26/2017 03:04 AM, Aaron Lun wrote:

Hi Herve,


I tried out the .BBSoptions approach, but it seems that the build system 
is still having some trouble:



http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/label=master/59/console


I bumped up the maximum number of DLLs to 200 in .BBSoptions, but to no 
effect. Any ideas?


This is my bad advice; as Herve mentions the workflow builders do not 
respect BBS options. We will adjust the max. DLLs on our end. Please be 
patient.


Martin




-Aaron


*From:* Hervé Pagès <hpa...@fredhutch.org>
*Sent:* Thursday, 21 September 2017 3:06:18 PM
*To:* Aaron Lun; Martin Morgan; bioc-devel@r-project.org
*Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: 
strange error in Jenkins build forsingleCellWorkflow

Hi,

@Martin: It's good news that the workflows have been standardized as
packages but aren't we still using the traditional workflow builder?
AFAIK .BBSoptions files are only honoured on the main build system
(a.k.a. BBS).

@Aaron: If we decide to use BBS (our main build system) to build the
workflows, then you'll be able to control R_MAX_NUM_DLLS by adding
the following lines to your .BBSoptions file:

RbuildPrepend: R_MAX_NUM_DLLS=150
RbuildPrepend.win: set R_MAX_NUM_DLLS=150&&
RcheckPrepend: R_MAX_NUM_DLLS=150
RcheckPrepend.win: set R_MAX_NUM_DLLS=150&&

You might not need all of them but it doesn't hurt to have them
all. Note that you should not try to put a space before && in the
RbuildPrepend.win or RcheckPrepend.win value.

H.

On 09/19/2017 05:51 PM, Aaron Lun wrote:

Thanks Martin. I think I will stick to one workflow for now, until the
BioC-workflows page provides some formal support for multiple workflows
representing different components of the same workflow (i.e., other than
me manually writing in the abstract that "This workflow is based on the
concepts introduced in the previous workflow X").


@Herve can you help me out with the .BBSoptions configuration for
R_MAX_NUM_DLLS? I guess we should also indicate to the user that this
needs to be increased in order for the workflow to run.


-Aaron




*From:* Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of
Martin Morgan <martin.mor...@roswellpark.org>
*Sent:* Wednesday, 20 September 2017 2:16 AM
*To:* Wolfgang Huber; bioc-devel@r-project.org
*Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
strange error in Jenkins build forsingleCellWorkflow
On 09/19/2017 09:50 AM, Wolfgang Huber wrote:


My 3 cents:
- I think this is a more and more common problem that I'm also
encountering in everyday work and that asks for a general solution.
- I agree with Martin that setting R_MAX_NUM_DLLS is better than
unloading. AFAIK it is not even possible to cleanly unload every package
('as if it had never been loaded') due to irreversible global effects;
although I'd be happy to be educated otherwise.
- R_MAX_NUM_DLLS is not a sustainable solution either: the current
default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an
error. Upping to the maximum 152 will give us some temporary respite but
seems not really future-proof.


This was the R-core motivation for increasing the max to only 100, but
it's still surprising to me that a modern OS has such a tight limit.
I'll see if there are ideas in R-core.

   From our internal discussions there is some willingness to (continue)
supporting large and complicated work flows, but it is valuable to think
carefully about the consequences for users following along. Maybe part
of this is clearly alerting the user to the fact that 500G of data are
going to be downloaded, the workflow requires advanced configuration of
R, etc.

@Aaron -- if you'd like to continue with one work flow, contact Herve
(cc'd) and he'll provide the .BBSoptions configuration to allow the
build system to use an appropriate R_MAX_NUM_DLLS. If instead you'd like
to produce two workflows, then the best strategy in your case would be
to simply have two independent packages (DESCRIPTION + vignettes/) each
with more modest numbers of DLLs; contact Lori (cc'd) when you've
decided on a second name, and we'll create the svn location for you.

Martin



  Wolfgang

19.9.17 12:02, Martin Morgan scripsit:

On 09/18/2017 10:42 PM, Shian Su wrote:

Hi Aaron,

Would you mind sharing the code for flushing DLLs? This is a problem
that others working with single cells and I have faced.



For the user encountering this problem I think a better solution is to
increase the number of DLLs allowed by R, for instance editing
.Renviron to contain the line

R_MAX_NUM_DLLS=120

or similar. This can be done on an installation-wide, user-wide, or
project-specific basis, as described in ?Startup
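
To see how close a session is to the limit before editing .Renviron, something like the following sketch may help. It uses only base R (getLoadedDLLs() and Sys.getenv()). The helper name dll_usage is made up for illustration, and the fallback of 100 is simply the default mentioned in this thread; other R versions may use a different default.

```r
# Hypothetical helper: compare the number of loaded DLLs with the configured limit.
dll_usage <- function(default_limit = 100L) {
  raw <- Sys.getenv("R_MAX_NUM_DLLS", unset = "")
  # Fall back to the assumed default when the variable is unset or not a number.
  limit <- suppressWarnings(as.integer(raw))
  if (is.na(limit)) limit <- default_limit
  loaded <- length(getLoadedDLLs())
  c(loaded = loaded, limit = limit, remaining = limit - loaded)
}

dll_usage()
```

Note that this only reads the environment variable; R applies the limit at startup, so calling Sys.setenv() inside a running session changes what the helper reports but not the actual limit.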

@Aaron -- we are still discussing things internally; for instance it
is 

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-26 Thread Aaron Lun
Hi Herve,


I tried out the .BBSoptions approach, but it seems that the build system is 
still having some trouble:


http://docbuilder.bioconductor.org:8080/job/simpleSingleCell/label=master/59/console


I bumped up the maximum number of DLLs to 200 in .BBSoptions, but to no effect. 
Any ideas?


-Aaron


From: Hervé Pagès <hpa...@fredhutch.org>
Sent: Thursday, 21 September 2017 3:06:18 PM
To: Aaron Lun; Martin Morgan; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange 
error in Jenkins build forsingleCellWorkflow

Hi,

@Martin: It's good news that the workflows have been standardized as
packages but aren't we still using the traditional workflow builder?
AFAIK .BBSoptions files are only honoured on the main build system
(a.k.a. BBS).

@Aaron: If we decide to use BBS (our main build system) to build the
workflows, then you'll be able to control R_MAX_NUM_DLLS by adding
the following lines to your .BBSoptions file:

RbuildPrepend: R_MAX_NUM_DLLS=150
RbuildPrepend.win: set R_MAX_NUM_DLLS=150&&
RcheckPrepend: R_MAX_NUM_DLLS=150
RcheckPrepend.win: set R_MAX_NUM_DLLS=150&&

You might not need all of them but it doesn't hurt to have them
all. Note that you should not try to put a space before && in the
RbuildPrepend.win or RcheckPrepend.win value.

H.

On 09/19/2017 05:51 PM, Aaron Lun wrote:
> Thanks Martin. I think I will stick to one workflow for now, until the
> BioC-workflows page provides some formal support for multiple workflows
> representing different components of the same workflow (i.e., other than
> me manually writing in the abstract that "This workflow is based on the
> concepts introduced in the previous workflow X").
>
>
> @Herve can you help me out with the .BBSoptions configuration for
> R_MAX_NUM_DLLS? I guess we should also indicate to the user that this
> needs to be increased in order for the workflow to run.
>
>
> -Aaron
>
>
>
> 
> *From:* Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of
> Martin Morgan <martin.mor...@roswellpark.org>
> *Sent:* Wednesday, 20 September 2017 2:16 AM
> *To:* Wolfgang Huber; bioc-devel@r-project.org
> *Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
> strange error in Jenkins build forsingleCellWorkflow
> On 09/19/2017 09:50 AM, Wolfgang Huber wrote:
>>
>> My 3 cents:
>> - I think this is a more and more common problem that I'm also
>> encountering in everyday work and that asks for a general solution.
>> - I agree with Martin that setting R_MAX_NUM_DLLS is better than
>> unloading. AFAIK it is not even possible to cleanly unload every package
>> ('as if it had never been loaded') due to irreversible global effects;
>> although I'd be happy to be educated otherwise.
>> - R_MAX_NUM_DLLS is not a sustainable solution either: the current
>> default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an
>> error. Upping to the maximum 152 will give us some temporary respite but
>> seems not really future-proof.
>
> This was the R-core motivation for increasing the max to only 100, but
> it's still surprising to me that a modern OS has such a tight limit.
> I'll see if there are ideas in R-core.
>
>   From our internal discussions there is some willingness to (continue)
> supporting large and complicated work flows, but it is valuable to think
> carefully about the consequences for users following along. Maybe part
> of this is clearly alerting the user to the fact that 500G of data are
> going to be downloaded, the workflow requires advanced configuration of
> R, etc.
>
> @Aaron -- if you'd like to continue with one work flow, contact Herve
> (cc'd) and he'll provide the .BBSoptions configuration to allow the
> build system to use an appropriate R_MAX_NUM_DLLS. If instead you'd like
> to produce two workflows, then the best strategy in your case would be
> to simply have two independent packages (DESCRIPTION + vignettes/) each
> with more modest numbers of DLLs; contact Lori (cc'd) when you've
> decided on a second name, and we'll create the svn location for you.
>
> Martin
>
>>
>>  Wolfgang
>>
>> 19.9.17 12:02, Martin Morgan scripsit:
>>> On 09/18/2017 10:42 PM, Shian Su wrote:
>>>> Hi Aaron,
>>>>
>>>> Would you mind sharing the code for flushing DLLs? This is a problem
>>>> that others working with single cells and I have faced.
>>>>
>>>
>>> For the user encountering this problem I think a better solution is to
>>> increase the number of DLLs allowed by R, for instance editing
>>

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-20 Thread Hervé Pagès

Hi,

@Martin: It's good news that the workflows have been standardized as
packages but aren't we still using the traditional workflow builder?
AFAIK .BBSoptions files are only honoured on the main build system
(a.k.a. BBS).

@Aaron: If we decide to use BBS (our main build system) to build the
workflows, then you'll be able to control R_MAX_NUM_DLLS by adding
the following lines to your .BBSoptions file:

RbuildPrepend: R_MAX_NUM_DLLS=150
RbuildPrepend.win: set R_MAX_NUM_DLLS=150&&
RcheckPrepend: R_MAX_NUM_DLLS=150
RcheckPrepend.win: set R_MAX_NUM_DLLS=150&&

You might not need all of them but it doesn't hurt to have them
all. Note that you should not try to put a space before && in the
RbuildPrepend.win or RcheckPrepend.win value.

H.

On 09/19/2017 05:51 PM, Aaron Lun wrote:

Thanks Martin. I think I will stick to one workflow for now, until the
BioC-workflows page provides some formal support for multiple workflows
representing different components of the same workflow (i.e., other than
me manually writing in the abstract that "This workflow is based on the
concepts introduced in the previous workflow X").


@Herve can you help me out with the .BBSoptions configuration for
R_MAX_NUM_DLLS? I guess we should also indicate to the user that this
needs to be increased in order for the workflow to run.


-Aaron




*From:* Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of
Martin Morgan <martin.mor...@roswellpark.org>
*Sent:* Wednesday, 20 September 2017 2:16 AM
*To:* Wolfgang Huber; bioc-devel@r-project.org
*Subject:* Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re:
strange error in Jenkins build forsingleCellWorkflow
On 09/19/2017 09:50 AM, Wolfgang Huber wrote:


My 3 cents:
- I think this is a more and more common problem that I'm also
encountering in everyday work and that asks for a general solution.
- I agree with Martin that setting R_MAX_NUM_DLLS is better than
unloading. AFAIK it is not even possible to cleanly unload every package
('as if it had never been loaded') due to irreversible global effects;
although I'd be happy to be educated otherwise.
- R_MAX_NUM_DLLS is not a sustainable solution either: the current
default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an
error. Upping to the maximum 152 will give us some temporary respite but
seems not really future-proof.


This was the R-core motivation for increasing the max to only 100, but
it's still surprising to me that a modern OS has such a tight limit.
I'll see if there are ideas in R-core.

  From our internal discussions there is some willingness to (continue)
supporting large and complicated work flows, but it is valuable to think
carefully about the consequences for users following along. Maybe part
of this is clearly alerting the user to the fact that 500G of data are
going to be downloaded, the workflow requires advanced configuration of
R, etc.

@Aaron -- if you'd like to continue with one work flow, contact Herve
(cc'd) and he'll provide the .BBSoptions configuration to allow the
build system to use an appropriate R_MAX_NUM_DLLS. If instead you'd like
to produce two workflows, then the best strategy in your case would be
to simply have two independent packages (DESCRIPTION + vignettes/) each
with more modest numbers of DLLs; contact Lori (cc'd) when you've
decided on a second name, and we'll create the svn location for you.

Martin



 Wolfgang

19.9.17 12:02, Martin Morgan scripsit:

On 09/18/2017 10:42 PM, Shian Su wrote:

Hi Aaron,

Would you mind sharing the code for flushing DLLs? This is a problem
that others working with single cells and I have faced.



For the user encountering this problem I think a better solution is to
increase the number of DLLs allowed by R, for instance editing
.Renviron to contain the line

R_MAX_NUM_DLLS=120

or similar. This can be done on an installation-wide, user-wide, or
project-specific basis, as described in ?Startup

@Aaron -- we are still discussing things internally; for instance it
is possible to set the maximum number of DLLs in the build system.

Martin


Better yet, would anyone know of code that would allow unused DLLs to
be identified and unloaded? I suspect not as it would require keeping
track of the dependency tree of your current environment but I’m
hopeful.

Kind regards,
Shian Su


On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:

Well, inertia won out in the end, and so I've just moved a whole
stack of packages into "Suggests" for now. This is probably not a
sustainable solution as the workflow can potentially get larger over
time; I would prefer to have some formal support for splitting up
the workflow into modules that can be independently installed.

-Aaron

From: Vincent Carey <st...@channing.harvard.edu>
Sent: Saturday, 16 September 2017 10:08:13 PM
To: A

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-19 Thread Aaron Lun
Thanks Martin. I think I will stick to one workflow for now, until the 
BioC-workflows page provides some formal support for multiple workflows 
representing different components of the same workflow (i.e., other than me 
manually writing in the abstract that "This workflow is based on the concepts 
introduced in the previous workflow X").


@Herve can you help me out with the .BBSoptions configuration for 
R_MAX_NUM_DLLS? I guess we should also indicate to the user that this needs to 
be increased in order for the workflow to run.
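
One way to alert the user could be a guard in the first chunk of the workflow, sketched below. This is only an illustration under stated assumptions: the function name require_dll_limit and the wording of the message are made up, the fallback of 100 is the default discussed in this thread, and the check reads the environment variable rather than R's internal limit.

```r
# Hypothetical guard: stop early with instructions if R_MAX_NUM_DLLS looks too low.
require_dll_limit <- function(needed, assumed_default = 100L) {
  limit <- suppressWarnings(as.integer(Sys.getenv("R_MAX_NUM_DLLS", unset = "")))
  # An unset or non-numeric value is treated as the assumed default.
  if (is.na(limit)) limit <- assumed_default
  if (limit < needed) {
    stop("This workflow loads many packages and needs R_MAX_NUM_DLLS >= ", needed,
         " (currently ", limit, "). Add 'R_MAX_NUM_DLLS=", needed,
         "' to .Renviron and restart R.", call. = FALSE)
  }
  invisible(limit)
}

# Example call for the first chunk of a vignette (left commented out here):
# require_dll_limit(150L)
```

Failing early with a clear message seems friendlier than letting the user hit the DLL error halfway through a long workflow.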


-Aaron



From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Martin Morgan 
<martin.mor...@roswellpark.org>
Sent: Wednesday, 20 September 2017 2:16 AM
To: Wolfgang Huber; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange 
error in Jenkins build forsingleCellWorkflow

On 09/19/2017 09:50 AM, Wolfgang Huber wrote:
>
> My 3 cents:
> - I think this is a more and more common problem that I'm also
> encountering in everyday work and that asks for a general solution.
> - I agree with Martin that setting R_MAX_NUM_DLLS is better than
> unloading. AFAIK it is not even possible to cleanly unload every package
> ('as if it had never been loaded') due to irreversible global effects;
> although I'd be happy to be educated otherwise.
> - R_MAX_NUM_DLLS is not a sustainable solution either: the current
> default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an
> error. Upping to the maximum 152 will give us some temporary respite but
> seems not really future-proof.

This was the R-core motivation for increasing the max to only 100, but
it's still surprising to me that a modern OS has such a tight limit.
I'll see if there are ideas in R-core.

 From our internal discussions there is some willingness to (continue)
supporting large and complicated work flows, but it is valuable to think
carefully about the consequences for users following along. Maybe part
of this is clearly alerting the user to the fact that 500G of data are
going to be downloaded, the workflow requires advanced configuration of
R, etc.

@Aaron -- if you'd like to continue with one work flow, contact Herve
(cc'd) and he'll provide the .BBSoptions configuration to allow the
build system to use an appropriate R_MAX_NUM_DLLS. If instead you'd like
to produce two workflows, then the best strategy in your case would be
to simply have two independent packages (DESCRIPTION + vignettes/) each
with more modest numbers of DLLs; contact Lori (cc'd) when you've
decided on a second name, and we'll create the svn location for you.

Martin

>
>  Wolfgang
>
> 19.9.17 12:02, Martin Morgan scripsit:
>> On 09/18/2017 10:42 PM, Shian Su wrote:
>>> Hi Aaron,
>>>
>>> Would you mind sharing the code for flushing DLLs? This is a problem
>>> that others working with single cells and I have faced.
>>>
>>
>> For the user encountering this problem I think a better solution is to
>> increase the number of DLLs allowed by R, for instance editing
>> .Renviron to contain the line
>>
>> R_MAX_NUM_DLLS=120
>>
>> or similar. This can be done on an installation-wide, user-wide, or
>> project-specific basis, as described in ?Startup
>>
>> @Aaron -- we are still discussing things internally; for instance it
>> is possible to set the maximum number of DLLs in the build system.
>>
>> Martin
>>
>>> Better yet, would anyone know of code that would allow unused DLLs to
>>> be identified and unloaded? I suspect not as it would require keeping
>>> track of the dependency tree of your current environment but I'm
>>> hopeful.
>>>
>>> Kind regards,
>>> Shian Su
>>>
>>>> On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:
>>>>
>>>> Well, inertia won out in the end, and so I've just moved a whole
>>>> stack of packages into "Suggests" for now. This is probably not a
>>>> sustainable solution as the workflow can potentially get larger over
>>>> time; I would prefer to have some formal support for splitting up
>>>> the workflow into modules that can be independently installed.
>>>>
>>>> -Aaron
>>>> 
>>>> From: Vincent Carey <st...@channing.harvard.edu>
>>>> Sent: Saturday, 16 September 2017 10:08:13 PM
>>>> To: Aaron Lun
>>>> Cc: Martin Morgan; bioc-devel@r-project.org
>>>> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in
>>>> Jenkins build forsingleCellWorkflow
>>>>
>>>> IMHO the pedagogic v

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-19 Thread Martin Morgan

On 09/19/2017 09:50 AM, Wolfgang Huber wrote:


My 3 cents:
- I think this is a more and more common problem that I'm also 
encountering in everyday work and that asks for a general solution.
- I agree with Martin that setting R_MAX_NUM_DLLS is better than 
unloading. AFAIK it is not even possible to cleanly unload every package
('as if it had never been loaded') due to irreversible global effects;
although I'd be happy to be educated otherwise.
- R_MAX_NUM_DLLS is not a sustainable solution either: the current 
default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an 
error. Upping to the maximum 152 will give us some temporary respite but 
seems not really future-proof.


This was the R-core motivation for increasing the max to only 100, but 
it's still surprising to me that a modern OS has such a tight limit. 
I'll see if there are ideas in R-core.


From our internal discussions there is some willingness to (continue) 
supporting large and complicated work flows, but it is valuable to think 
carefully about the consequences for users following along. Maybe part 
of this is clearly alerting the user to the fact that 500G of data are 
going to be downloaded, the workflow requires advanced configuration of 
R, etc.


@Aaron -- if you'd like to continue with one work flow, contact Herve 
(cc'd) and he'll provide the .BBSoptions configuration to allow the 
build system to use an appropriate R_MAX_NUM_DLLS. If instead you'd like 
to produce two workflows, then the best strategy in your case would be 
to simply have two independent packages (DESCRIPTION + vignettes/) each 
with more modest numbers of DLLs; contact Lori (cc'd) when you've 
decided on a second name, and we'll create the svn location for you.


Martin



 Wolfgang

19.9.17 12:02, Martin Morgan scripsit:

On 09/18/2017 10:42 PM, Shian Su wrote:

Hi Aaron,

Would you mind sharing the code for flushing DLLs? This is a problem 
that others working with single cells and I have faced.




For the user encountering this problem I think a better solution is to 
increase the number of DLLs allowed by R, for instance editing 
.Renviron to contain the line


R_MAX_NUM_DLLS=120

or similar. This can be done on an installation-wide, user-wide, or
project-specific basis, as described in ?Startup


@Aaron -- we are still discussing things internally; for instance it 
is possible to set the maximum number of DLLs in the build system.


Martin

Better yet, would anyone know of code that would allow unused DLLs to
be identified and unloaded? I suspect not as it would require keeping 
track of the dependency tree of your current environment but I’m 
hopeful.


Kind regards,
Shian Su


On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:

Well, inertia won out in the end, and so I've just moved a whole 
stack of packages into "Suggests" for now. This is probably not a 
sustainable solution as the workflow can potentially get larger over 
time; I would prefer to have some formal support for splitting up 
the workflow into modules that can be independently installed.


-Aaron

From: Vincent Carey <st...@channing.harvard.edu>
Sent: Saturday, 16 September 2017 10:08:13 PM
To: Aaron Lun
Cc: Martin Morgan; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in 
Jenkins build forsingleCellWorkflow


IMHO the pedagogic value of a unified document that treats a topic 
thoroughly
is quite high.  Building the whole workflow on an arbitrary user's 
system seems to
me to be a lower priority.  Thus using the environment variable in 
the build system

to avoid this limit seems an appropriate solution.

On Sat, Sep 16, 2017 at 7:43 AM, Aaron Lun 
<a...@wehi.edu.au<mailto:a...@wehi.edu.au>> wrote:
Thanks Martin. Yes, it's quite unfortunate that scater drags in 
dplyr and ggplot2, which - combined with Bioconductor's core 
packages - already puts us pretty close to the limit without doing 
anything else!



A solution might be to split my workflow into self-contained 
components, each of which can become its own workflow package (e.g., 
simpleSingleCell1, simpleSingleCell2, simpleSingleCell3 and so on). 
This should avoid all of the problems and our associated hacks.



I'm happy to do this, but is it possible for the website to indicate 
that there is a connection between the component workflows? For 
example, the link that ordinarily goes to the compiled workflow 
could instead go to an indexing page, which contains links to 
individual component workflows.



-Aaron



From: Martin Morgan 
<martin.mor...@roswellpark.org<mailto:martin.mor...@roswellpark.org>>

Sent: Saturday, 16 September 2017 8:18:09 PM
To: Aaron Lun; 
bioc-devel@r-project.org<mailto:bioc-devel@r-project.org>
Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in 
Jenkins build forsingleCellWorkflow


On 09/16/2017 01:53 AM,

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-19 Thread Wolfgang Huber


My 3 cents:
- I think this is a more and more common problem that I'm also 
encountering in everyday work and that asks for a general solution.
- I agree with Martin that setting R_MAX_NUM_DLLS is better than 
unloading. AFAIK it is not even possible to cleanly unload every package
('as if it had never been loaded') due to irreversible global effects;
although I'd be happy to be educated otherwise.
- R_MAX_NUM_DLLS is not a sustainable solution either: the current 
default is 100, but e.g. on my MacOS 10.12 any value >152 leads to an 
error. Upping to the maximum 152 will give us some temporary respite but 
seems not really future-proof.


Wolfgang

19.9.17 12:02, Martin Morgan scripsit:

On 09/18/2017 10:42 PM, Shian Su wrote:

Hi Aaron,

Would you mind sharing the code for flushing DLLs? This is a problem 
that others working with single cells and I have faced.




For the user encountering this problem I think a better solution is to 
increase the number of DLLs allowed by R, for instance editing .Renviron 
to contain the line


R_MAX_NUM_DLLS=120

or similar. This can be done on an installation-wide, user-wide, or
project-specific basis, as described in ?Startup


@Aaron -- we are still discussing things internally; for instance it is 
possible to set the maximum number of DLLs in the build system.


Martin

Better yet, would anyone know of code that would allow unused DLLs to be
identified and unloaded? I suspect not as it would require keeping 
track of the dependency tree of your current environment but I’m hopeful.


Kind regards,
Shian Su


On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:

Well, inertia won out in the end, and so I've just moved a whole 
stack of packages into "Suggests" for now. This is probably not a 
sustainable solution as the workflow can potentially get larger over 
time; I would prefer to have some formal support for splitting up the 
workflow into modules that can be independently installed.


-Aaron

From: Vincent Carey <st...@channing.harvard.edu>
Sent: Saturday, 16 September 2017 10:08:13 PM
To: Aaron Lun
Cc: Martin Morgan; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in 
Jenkins build forsingleCellWorkflow


IMHO the pedagogic value of a unified document that treats a topic 
thoroughly
is quite high.  Building the whole workflow on an arbitrary user's 
system seems to
me to be a lower priority.  Thus using the environment variable in 
the build system

to avoid this limit seems an appropriate solution.

On Sat, Sep 16, 2017 at 7:43 AM, Aaron Lun 
<a...@wehi.edu.au<mailto:a...@wehi.edu.au>> wrote:
Thanks Martin. Yes, it's quite unfortunate that scater drags in dplyr 
and ggplot2, which - combined with Bioconductor's core packages - 
already puts us pretty close to the limit without doing anything else!



A solution might be to split my workflow into self-contained 
components, each of which can become its own workflow package (e.g., 
simpleSingleCell1, simpleSingleCell2, simpleSingleCell3 and so on). 
This should avoid all of the problems and our associated hacks.



I'm happy to do this, but is it possible for the website to indicate 
that there is a connection between the component workflows? For 
example, the link that ordinarily goes to the compiled workflow could 
instead go to an indexing page, which contains links to individual 
component workflows.



-Aaron



From: Martin Morgan <martin.mor...@roswellpark.org>
Sent: Saturday, 16 September 2017 8:18:09 PM
To: Aaron Lun; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in 
Jenkins build forsingleCellWorkflow


On 09/16/2017 01:53 AM, Aaron Lun wrote:
Bumping this rather old thread. To re-iterate, I'm updating my 
simpleSingleCell workflow and I'm running into R's DLL limit. I've 
added a code block halfway through the workflow that unloads all 
DLLs and cleans them out, and this works fine during compilation on 
my local machine.



However, it seems that the BioC workflow builder uses a 
pre-processing step whereby it first tries to load all packages 
contained within library() calls. This hits the DLL limit as it 
doesn't execute the protective code block, which defeats the purpose 
of all my fiddling in the first place.



What options are there? I'm happy to split my workflow into multiple 
smaller Rmarkdown files that get compiled separately, provided there 
is appropriate support for this setup from the build system


The workflows have been standardized as packages. The packages put the
workflow dependencies in the 'Depends:' field, with the idea being that
the user installing the workflow package 'in the usual way' will get the
packages used in the vignette installed in their system 'in the usual
way'

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-19 Thread Aaron Lun
The simplest approach is to try unloading each package in turn (this will fail 
if another loaded package still depends on it) and repeat until all desired 
packages are unloaded. After this, you can call gcDLLs() from the R.utils 
package. There is a code chunk in my workflow.Rmd file from lines 1588 to 1605 
to do this, see https://github.com/MarioniLab/BiocWorkflow2016.



However, this is not without problems, as some packages do some funky 
database-related things upon loading and don't get unloaded properly. Trial and 
error suggests that AnnotationDbi and GenomeInfoDb (and maybe more) should not 
be unloaded, as they can't be properly loaded again in the same session.
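
The unload-and-retry loop described above might be sketched as follows. This is not the chunk from workflow.Rmd, just an illustration under some assumptions: `unload_packages()`, `keep` and `gc_dlls` are hypothetical names, the loop relies on base R's unloadNamespace() raising an error when another loaded namespace still imports the package, and gcDLLs() is only called if R.utils is installed.

```r
# Hypothetical helper: try to unload each package in 'pkgs', retrying the
# ones that fail until a full pass makes no further progress.
unload_packages <- function(pkgs,
                            keep = c("AnnotationDbi", "GenomeInfoDb"),
                            gc_dlls = TRUE) {
    # Only consider packages that are actually loaded and not protected.
    remaining <- setdiff(intersect(pkgs, loadedNamespaces()), keep)
    repeat {
        if (length(remaining) == 0L) break
        failed <- character(0)
        for (pkg in remaining) {
            ok <- tryCatch({
                unloadNamespace(pkg)  # errors if still imported elsewhere
                TRUE
            }, error = function(e) FALSE)
            if (!ok) failed <- c(failed, pkg)
        }
        # Stop if nothing could be unloaded in this pass.
        if (length(failed) == length(remaining)) break
        remaining <- failed
    }
    # Clean up DLLs left behind by packages that are no longer loaded.
    if (gc_dlls && requireNamespace("R.utils", quietly = TRUE)) {
        R.utils::gcDLLs()
    }
    invisible(remaining)  # packages that could not be unloaded
}
```

A call such as `unload_packages(c("scater", "dplyr", "ggplot2"))` would then return (invisibly) whichever of those could not be unloaded. The default `keep` simply reflects the two packages noted above as unsafe to unload; others may need adding.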


-Aaron


From: Shian Su
Sent: Tuesday, 19 September 2017 12:42:47 PM
To: Aaron Lun
Cc: Vincent Carey; bioc-devel@r-project.org
Subject: Re: [Untrusted Server]Re: [Bioc-devel] [Untrusted Server]Re: strange 
error in Jenkins build forsingleCellWorkflow

Hi Aaron,

Would you mind sharing the code for flushing DLLs? This is a problem that 
others working with single cells and I have faced.

Better yet, would anyone know of code that would allow unused DLLs to be 
identified and unloaded? I suspect not, as it would require keeping track of 
the dependency tree of your current environment, but I’m hopeful.

Kind regards,
Shian Su

> On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:
>
> Well, inertia won out in the end, and so I've just moved a whole stack of 
> packages into "Suggests" for now. This is probably not a sustainable solution 
> as the workflow can potentially get larger over time; I would prefer to have 
> some formal support for splitting up the workflow into modules that can be 
> independently installed.
>
> -Aaron
> 
> From: Vincent Carey <st...@channing.harvard.edu>
> Sent: Saturday, 16 September 2017 10:08:13 PM
> To: Aaron Lun
> Cc: Martin Morgan; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in Jenkins 
> build forsingleCellWorkflow
>
> IMHO the pedagogic value of a unified document that treats a topic thoroughly
> is quite high.  Building the whole workflow on an arbitrary user's system 
> seems to me to be a lower priority.  Thus using the environment variable in 
> the build system to avoid this limit seems an appropriate solution.
>
> On Sat, Sep 16, 2017 at 7:43 AM, Aaron Lun <a...@wehi.edu.au> wrote:
> Thanks Martin. Yes, it's quite unfortunate that scater drags in dplyr and 
> ggplot2, which - combined with Bioconductor's core packages - already puts us 
> pretty close to the limit without doing anything else!
>
>
> A solution might be to split my workflow into self-contained components, each 
> of which can become its own workflow package (e.g., simpleSingleCell1, 
> simpleSingleCell2, simpleSingleCell3 and so on). This should avoid all of the 
> problems and our associated hacks.
>
>
> I'm happy to do this, but is it possible for the website to indicate that 
> there is a connection between the component workflows? For example, the link 
> that ordinarily goes to the compiled workflow could instead go to an indexing 
> page, which contains links to individual component workflows.
>
>
> -Aaron
>
>
> 
> From: Martin Morgan <martin.mor...@roswellpark.org>
> Sent: Saturday, 16 September 2017 8:18:09 PM
> To: Aaron Lun; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in Jenkins 
> build forsingleCellWorkflow
>
> On 09/16/2017 01:53 AM, Aaron Lun wrote:
>> Bumping this rather old thread. To re-iterate, I'm updating my 
>> simpleSingleCell workflow and I'm running into R's DLL limit. I've added a 
>> code block halfway through the workflow that unloads all DLLs and cleans 
>> them out, and this works fine during compilation on my local machine.
>>
>>
>> However, it seems that the BioC workflow builder uses a pre-processing step 
>> whereby it first tries to load all packages contained within library() 
>> calls. This hits the DLL limit as it doesn't execute the protective code 
>> block, which defeats the purpose of all my fiddling in the first place.
>>
>>
>> What options are there? I'm happy to split my workflow into multiple 

Re: [Bioc-devel] [Untrusted Server]Re: [Untrusted Server]Re: strange error in Jenkins build forsingleCellWorkflow

2017-09-18 Thread Shian Su
Hi Aaron,

Would you mind sharing the code for flushing DLLs? This is a problem that 
others working with single cells and I have faced.

Better yet, would anyone know of code that would allow unused DLLs to be 
identified and unloaded? I suspect not, as it would require keeping track of 
the dependency tree of your current environment, but I’m hopeful.

Kind regards,
Shian Su

> On 19 Sep 2017, at 12:30 pm, Aaron Lun <a...@wehi.edu.au> wrote:
> 
> Well, inertia won out in the end, and so I've just moved a whole stack of 
> packages into "Suggests" for now. This is probably not a sustainable solution 
> as the workflow can potentially get larger over time; I would prefer to have 
> some formal support for splitting up the workflow into modules that can be 
> independently installed.
> 
> -Aaron
> 
> From: Vincent Carey <st...@channing.harvard.edu>
> Sent: Saturday, 16 September 2017 10:08:13 PM
> To: Aaron Lun
> Cc: Martin Morgan; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in Jenkins 
> build forsingleCellWorkflow
> 
> IMHO the pedagogic value of a unified document that treats a topic thoroughly
> is quite high.  Building the whole workflow on an arbitrary user's system 
> seems to me to be a lower priority.  Thus using the environment variable in 
> the build system to avoid this limit seems an appropriate solution.
> 
> On Sat, Sep 16, 2017 at 7:43 AM, Aaron Lun <a...@wehi.edu.au> wrote:
> Thanks Martin. Yes, it's quite unfortunate that scater drags in dplyr and 
> ggplot2, which - combined with Bioconductor's core packages - already puts us 
> pretty close to the limit without doing anything else!
> 
> 
> A solution might be to split my workflow into self-contained components, each 
> of which can become its own workflow package (e.g., simpleSingleCell1, 
> simpleSingleCell2, simpleSingleCell3 and so on). This should avoid all of the 
> problems and our associated hacks.
> 
> 
> I'm happy to do this, but is it possible for the website to indicate that 
> there is a connection between the component workflows? For example, the link 
> that ordinarily goes to the compiled workflow could instead go to an indexing 
> page, which contains links to individual component workflows.
> 
> 
> -Aaron
> 
> 
> 
> From: Martin Morgan <martin.mor...@roswellpark.org>
> Sent: Saturday, 16 September 2017 8:18:09 PM
> To: Aaron Lun; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] [Untrusted Server]Re: strange error in Jenkins 
> build forsingleCellWorkflow
> 
> On 09/16/2017 01:53 AM, Aaron Lun wrote:
>> Bumping this rather old thread. To re-iterate, I'm updating my 
>> simpleSingleCell workflow and I'm running into R's DLL limit. I've added a 
>> code block halfway through the workflow that unloads all DLLs and cleans 
>> them out, and this works fine during compilation on my local machine.
>> 
>> 
>> However, it seems that the BioC workflow builder uses a pre-processing step 
>> whereby it first tries to load all packages contained within library() 
>> calls. This hits the DLL limit as it doesn't execute the protective code 
>> block, which defeats the purpose of all my fiddling in the first place.
>> 
>> 
>> What options are there? I'm happy to split my workflow into multiple smaller 
>> Rmarkdown files that get compiled separately, provided there is appropriate 
>> support for this setup from the build system
> 
> The workflows have been standardized as packages. The packages put the
> workflow dependencies in the 'Depends:' field, with the idea being that
> the user installing the workflow package 'in the usual way' will get the
> packages used in the vignette installed in their system 'in the usual
> way' without having to execute special variants of biocLite() /
> install.packages() / funky code in the vignette itself to be able to
> build the vignette.
> 
> Loading a package loads its Depends: (and Imports:) so triggers the problem.
> 
> Writing separate vignettes would not help with this (but might make the
> workflow more palatable; I'm not 100% sure of support for separate work
> flows in a single package, there is no problem with having multiple
> workflow packages on the same general topic).
> 
> One could move (some?) packages to Suggests: and use your trick of
> unloading packages part-way through the vignette. But then users will
> find that they need to install packages to complete the vignette.
> 
> 'We' could add a support for a