[Bioc-devel] Memory usage for bplapply

2019-01-03 Thread Lulu Chen
Dear all,

I have run into a memory issue using bplapply with SnowParam(). I need to
compute something from a large matrix many times. From the discussion at
https://support.bioconductor.org/p/92587, I learned that bplapply copies
the current and parent environments to each worker. That means the large
matrix in my package will be copied many times. Do you have better
suggestions for the Windows platform?

Before I packaged my code, I used the doSNOW package with foreach
%dopar%. It seemed to consume less memory on each core (roughly the size of
the matrix the task needs). But bplapply seems to copy more than the objects
in the current environment and the environment one level above it. I am very
confused, and my guess is that it is copying everything.
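
A minimal sketch of the effect described above, using base R only and no BiocParallel. It relies on one fact about R: serialize() carries a closure's enclosing environment along with the function, unless that environment is the global environment or a namespace. The names make_fun, big, f and g are hypothetical and exist only for illustration; how much of this applies to a given bplapply call depends on where the worker function was defined.

```r
# Hypothetical factory: the returned function never uses 'big', but 'big'
# lives in the environment the function was created in.
make_fun <- function() {
    big <- matrix(0, nrow = 1000, ncol = 1000)  # about 8 MB of doubles
    function(i) i + 1
}

f <- make_fun()

# Bytes needed to serialize the function together with its environment.
size_with_env <- length(serialize(f, NULL))

# Same function body, re-parented to the global environment, which is
# serialized as a reference rather than by content.
g <- f
environment(g) <- globalenv()
size_without_env <- length(serialize(g, NULL))

c(with_env = size_with_env, without_env = size_without_env)
```

If the worker function needs only part of the data, passing that part as an explicit argument is one way to keep the rest out of what is sent to each worker.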

Thanks for any help!
Best,
Lulu

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Error in Bioc windows check report

2019-01-03 Thread Benjamin Tremblay
Thank you! I will do as you suggested for the documentation. I am still unsure 
what to do about the Rd warnings regarding BiocGenerics.

As for motifdb, for now I will just not allow the example to run on Windows. I 
hope that’s not considered bad form.

Thanks,

Benjamin

> On 1 Jan 2019, at 22:15, Martin Morgan wrote:
> 
> The warnings below can be understood by a careful reading of 
> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Cross_002dreferences
> 
> You use \link[pkg]{foo} and the docs say that foo is the name of the html 
> file where you would like to link. DNAStringSet is documented in the html 
> file XStringSet-class, so your links would be 
> \link[Biostrings:XStringSet-class]{DNAStringSet}. But actually this is just 
> making work for yourself, and it is more straight-forward to just use 
> \link{DNAStringSet}; R then does the necessary bookkeeping.
> 
> The error, later, looks like a problem loading 32-bit motifdb; I'm not sure 
> that I have anything immediately helpful to provide...
> 
> Martin
> 
> 
> On 12/31/18, 1:30 PM, "Bioc-devel on behalf of Benjamin Tremblay" wrote:
> 
>Hi,
> 
>I recently pushed a bug fix to the devel and release branches of my 
> package (universalmotif). The updated package in my GitHub repository 
> (https://github.com/bjmt/universalmotif) passed all checks with no errors or 
> warnings on Linux in Travis CI, and on Windows in AppVeyor 
> (https://ci.appveyor.com/project/bjmt/universalmotif). It also passed both 
> Mac and Linux checks on the Bioc machines. The Windows Bioc machine failed 
> the check, however, and I am not quite sure why 
> (http://bioconductor.org/checkResults/release/bioc-LATEST/universalmotif/tokay1-checksrc.html).
> 
>The following warning occurred:
>* checking whether package 'universalmotif' can be installed ... WARNING
>Found the following significant warnings:
>  Rd warning: C:/Users/biocbuild/bbs-3.8-bioc/tmpdir/RtmpaWjzyP/R.INSTALL3dcce5dc9/universalmotif/man/ArabidopsisPromoters.Rd:7: file link 'DNAStringSet' in package 'Biostrings' does not exist and so has been treated as a topic
>  Rd warning: C:/Users/biocbuild/bbs-3.8-bioc/tmpdir/RtmpaWjzyP/R.INSTALL3dcce5dc9/universalmotif/man/create_motif.Rd:124: file link 'DNAStringSet-class' in package 'Biostrings' does not exist and so has been treated as a topic
>  Rd warning: C:/Users/biocbuild/bbs-3.8-bioc/tmpdir/RtmpaWjzyP/R.INSTALL3dcce5dc9/universalmotif/man/create_motif.Rd:126: file link 'RNAStringSet-class' in package 'Biostrings' does not exist and so has been treated as a topic
>… etc
>There was also an error for the i386 arch:
>** running examples for arch 'i386' ... ERROR
>Running examples in 'universalmotif-Ex.R' failed
>Though the error message was not informative and did not reveal why the 
> example failed.
> 
>At this point I am at a loss as to what to do. Not only is this warning 
> and error specific to windows, it is specific to the Bioc windows machine. 
> This situation also occurred during package submission (though the 
> errors/warnings were slightly different), but my reviewer OK’d it regardless. 
> At this point, I would like to ask if there is some flaw in my package which 
> I should fix to allow it to pass the check? Or is there a problem specific to 
> the Bioc windows machine?
> 
>Thank you,
> 
>Benjamin Tremblay

Re: [Bioc-devel] Travis CI errors for BiocManager

2019-01-03 Thread Erik Fasterius
Ah, okay! Very good to know, thank you! Is there a way I can keep track of 
this, to see when it gets fixed?

Erik

> On 3 Jan 2019, at 14:00, Martin Morgan  wrote:
> 
> This seems to be a regression in R-devel, and has been reported.
> 
> Previously (at least svn r75833, I think)
> 
>  df <- data.frame(vers= package_version("1.2"))
>  rbind(df, df)$vers
> 
> returned
> 
>  [1] '1.2' '1.2'
> 
> now (r75945) we have
> 
>> rbind(df, df)$vers
> [[1]]
> [1] 1 2
> 
> [[2]]
> [1] 1 2
> 
> Martin
> 
> On 1/3/19, 5:25 AM, "Bioc-devel on behalf of Erik Fasterius" wrote:
> 
>Hi,
> 
>I’m currently updating my package `seqCAT` with some new code, and I 
> always run it through Travis CI before pushing changes to Bioconductor. The 
> last build errors with the following message:
> 
>Updating HTML index of packages in '.Library'
>Making 'packages.html' ... done
>Error: invalid version specification ‘c(3, 9)’
>Execution halted
>The command "eval Rscript -e 'if (!requireNamespace("BiocManager", 
> quietly=TRUE))  install.packages("BiocManager");if (TRUE) 
> BiocManager::install(version = "devel");cat(append = TRUE, file = 
> "~/.Rprofile.site", "options(repos = 
> BiocManager::repositories());")' " failed.
> 
>This happens even though both `devtools::check()` and `BiocCheck()` 
> complete without errors, as usual. I have never had this error before, 
> and it started with my most recent change: `n()` to `dplyr::n()` in a 
> single line of code (to account for the upcoming changes to the `dplyr` 
> package), so I don’t believe the code itself is to blame.
> 
>Does anybody know what the problem is here?
>Thanks in advance!
>Erik
> 

Re: [Bioc-devel] how to achieve reproducibility with BiocParallel regardless of number of threads and OS (set.seed is disallowed)

2019-01-03 Thread Lulu Chen
Thanks for teaching me how to set a seed for each job!

On Wed, Jan 2, 2019 at 9:45 AM Martin Morgan wrote:

> I'll back-track on my advice a little, and say that the right way to
> enable the user to get reproducible results is to respect the setting the
> user makes outside your function. So for
>
> your = function()
>  unlist(bplapply(1:4, rnorm))
>
> The user will
>
> register(MulticoreParam(2, RNGseed=123))
> your()
>
> to always produce the identical result.
>
> Following Aaron's strategy, the R-level approach to reproducibility might
> be along the lines of
>
> - tell the user to set RNGkind("L'Ecuyer-CMRG") and set.seed()
> - In your function, generate seeds for each job
>
> n = 5; seeds <- vector("list", n)
> # FIXME fails if set.seed or random nos. have not been generated...
> seeds[[1]] = .Random.seed
> for (i in tail(seq_len(n), -1))
>     seeds[[i]] = parallel::nextRNGStream(seeds[[i - 1]])
>
> - send these, along with the job, to the workers, setting .Random.seed on
> each worker
>
> bpmapply(function(i, seed, ...) {
>     oseed <- get(".Random.seed", envir = .GlobalEnv)
>     on.exit(assign(".Random.seed", oseed, envir = .GlobalEnv))
>     assign(".Random.seed", seed, envir = .GlobalEnv)
>     ...
> }, seq_len(n), seeds, ...)
>
> The use of L'Ecuyer-CMRG and `nextRNGStream()` means that the streams on
> each worker are independent. Using on.exit means that, even on the worker,
> the state of the random number generator is not changed by the evaluation.
> This means that even with SerialParam() the generator is well-behaved. I
> don’t know how BiocCheck responds to use of .Random.seed, which in general
> would be a bad thing to do but in this case with the use of on.exit() the
> usage seems ok.
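
The per-job stream strategy described above can be sketched in a self-contained form as follows. This is only an illustration under stated assumptions: base Map() stands in for bpmapply() so that the example runs without BiocParallel, the helper names job_seeds and run_job are hypothetical, and the seed 123 is arbitrary. Because each job carries its own stream, the result should not depend on how the jobs are later distributed across workers.

```r
library(parallel)  # for nextRNGStream()

# User side: choose the generator and seed once.
RNGkind("L'Ecuyer-CMRG")
set.seed(123)

# Hypothetical helper: one independent stream per job, derived from the
# user's current generator state.
job_seeds <- function(n) {
    seeds <- vector("list", n)
    seeds[[1]] <- get(".Random.seed", envir = globalenv())
    for (i in seq_len(n)[-1])
        seeds[[i]] <- nextRNGStream(seeds[[i - 1]])
    seeds
}

# Hypothetical job: draw i normal deviates from the stream in 'seed', then
# restore whatever generator state was in place before the call.
run_job <- function(i, seed) {
    oseed <- get(".Random.seed", envir = globalenv())
    on.exit(assign(".Random.seed", oseed, envir = globalenv()))
    assign(".Random.seed", seed, envir = globalenv())
    rnorm(i)
}

seeds <- job_seeds(4)
res1 <- Map(run_job, 1:4, seeds)   # Map() stands in for bpmapply() here
res2 <- Map(run_job, 1:4, seeds)
identical(res1, res2)
```
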
>
> Martin
>
>
> On 12/31/18, 3:17 PM, "Lulu Chen"  wrote:
>
> Hi Martin,
>
>
> Thanks for your help. But setting a different number of workers
> generates different results:
>
>
> > unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(1, RNGseed=123)))
>  [1]  1.0654274 -1.2421454  1.0523311 -0.7744536  1.3081934
> -1.5305223  1.1525356  0.9287607 -0.4355877  1.5055436
> > unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(2, RNGseed=123)))
>  [1] -0.9685927  0.7061091  1.4890213 -0.4094454  0.8909694
> -0.8653704  1.4642711  1.2674845 -0.2220491  2.4505322
> > unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(3, RNGseed=123)))
>  [1] -0.96859273 -0.40944544  0.89096942 -0.86537045  1.46427111
> 1.26748453 -0.48906078  0.43304237 -0.03195349
> [10]  0.14670372
> > unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(4, RNGseed=123)))
>  [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237
> -0.03195349 -1.03886641  1.57451249  0.74708204
> [10]  0.67187201
>
>
>
> Best,
> Lulu
>
>
>
> On Mon, Dec 31, 2018 at 1:12 PM Martin Morgan 
> wrote:
>
>
> The major BiocParallel objects (SnowParam(), MulticoreParam()) and use
> of bplapply() allow fully repeatable randomizations, e.g.,
>
> > library(BiocParallel)
> > unlist(bplapply(1:4, rnorm, BPPARAM=MulticoreParam(RNGseed=123)))
>  [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237
> -0.03195349
>  [7] -1.03886641  1.57451249  0.74708204  0.67187201
> > unlist(bplapply(1:4, rnorm, BPPARAM=MulticoreParam(RNGseed=123)))
>  [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237
> -0.03195349
>  [7] -1.03886641  1.57451249  0.74708204  0.67187201
> > unlist(bplapply(1:4, rnorm, BPPARAM=SnowParam(RNGseed=123)))
> [1] -0.96859273 -0.40944544  0.89096942 -0.48906078  0.43304237
> -0.03195349
>  [7] -1.03886641  1.57451249  0.74708204  0.67187201
>
> The idea then would be to tell the user to register() such a param, or
> to write your function to accept an argument rngSeed along the lines of
>
> f = function(..., rngSeed = NULL) {
> if (!is.null(rngSeed)) {
> param = bpparam()  # user's preferred back-end
> oseed = bpRNGseed(param)
> on.exit(bpRNGseed(param) <- oseed)
> bpRNGseed(param) = rngSeed
> }
> bplapply(1:4, rnorm)
> }
>
> (actually, this exercise illustrates a problem with bpRNGseed<-() when
> the original seed is NULL; this will be fixed in the next day or so...)
>
> Is that sufficient for your use case?
>
> On 12/31/18, 11:24 AM, "Bioc-devel on behalf of Lulu Chen" <
> bioc-devel-boun...@r-project.org on behalf of
> luluc...@vt.edu> wrote:
>
> Dear all,
>
> I posted the question on the Bioconductor support site
> (https://support.bioconductor.org/p/116381/) and was advised to direct
> future correspondence there.
>
> I plan to generate a vector of seeds (provided by users through
> argument of
> my R function) and use them by set.seed() in each parallel
> computation.
> However, set.seed() will 

Re: [Bioc-devel] Travis CI errors for BiocManager

2019-01-03 Thread Martin Morgan
This seems to be a regression in R-devel, and has been reported.

Previously (at least svn r75833, I think)

  df <- data.frame(vers= package_version("1.2"))
  rbind(df, df)$vers

returned

  [1] '1.2' '1.2'

now (r75945) we have

  > rbind(df, df)$vers
[[1]]
[1] 1 2

[[2]]
[1] 1 2
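
A possible workaround sketch for user code that hits the same behaviour. This is an assumption for illustration, not something proposed in the report and not a fix for BiocManager itself: keep the versions as character strings while row-binding, and convert back to version objects afterwards.

```r
# Store the version as a character string so rbind() has nothing special to do.
df <- data.frame(vers = as.character(package_version("1.2")),
                 stringsAsFactors = FALSE)
both <- rbind(df, df)

# Convert back to a version object only after combining.
vers <- package_version(both$vers)
vers
```
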

Martin

On 1/3/19, 5:25 AM, "Bioc-devel on behalf of Erik Fasterius" wrote:

Hi,

I’m currently updating my package `seqCAT` with some new code, and I always 
run it through Travis CI before pushing changes to Bioconductor. The last build 
errors with the following message:

Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Error: invalid version specification ‘c(3, 9)’
Execution halted
The command "eval Rscript -e 'if (!requireNamespace("BiocManager", 
quietly=TRUE))  install.packages("BiocManager");if (TRUE) 
BiocManager::install(version = "devel");cat(append = TRUE, file = 
"~/.Rprofile.site", "options(repos = 
BiocManager::repositories());")' " failed.

This happens even though both `devtools::check()` and `BiocCheck()` 
complete without errors, as usual. I have never had this error before, and 
it started with my most recent change: `n()` to `dplyr::n()` in a single 
line of code (to account for the upcoming changes to the `dplyr` package), so I 
don’t believe the code itself is to blame.

Does anybody know what the problem is here?
Thanks in advance!
Erik


Re: [Bioc-devel] Mixed species dataset for makeExperimentHubMetadata

2019-01-03 Thread Shepherd, Lori
We will look at expanding the field. For now you could either choose one 
species or use NA.


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel on behalf of Lu, Dongyi (Lambda)
Sent: Wednesday, January 2, 2019 3:18:18 PM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Mixed species dataset for makeExperimentHubMetadata

Hi all,

I have a dataset from 10x with a mixture of human and mouse cells, but the help 
page for makeExperimentHubMetadata says the "Species" field must have length 1, 
so I can’t use a list column. How, then, can I put both "Homo sapiens" and "Mus 
musculus" into the "Species" field?

Best,
Lambda


[Bioc-devel] Travis CI errors for BiocManager

2019-01-03 Thread Erik Fasterius
Hi,

I’m currently updating my package `seqCAT` with some new code, and I always run 
it through Travis CI before pushing changes to Bioconductor. The last build 
errors with the following message:

Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Error: invalid version specification ‘c(3, 9)’
Execution halted
The command "eval Rscript -e 'if (!requireNamespace("BiocManager", 
quietly=TRUE))  install.packages("BiocManager");if (TRUE) 
BiocManager::install(version = "devel");cat(append = TRUE, file = 
"~/.Rprofile.site", "options(repos = 
BiocManager::repositories());")' " failed.

This happens even though both `devtools::check()` and `BiocCheck()` complete 
without errors, as usual. I have never had this error before, and it started 
with my most recent change: `n()` to `dplyr::n()` in a single line of 
code (to account for the upcoming changes to the `dplyr` package), so I don’t 
believe the code itself is to blame.

Does anybody know what the problem is here?
Thanks in advance!
Erik
