Re: [Bioc-devel] C++ parallel computing

2021-05-26 Thread Oleksii Nikolaienko
Thank you for the information. I guess I'll try to stick to the R-level parallelization whenever possible. Best, Oleksii On Wed, 26 May 2021 at 13:47, Martin Morgan wrote: > The best way to process large files is in chunks using BamFile(…, > yieldSize = …) and by using ScanBamParam() to select

Re: [Bioc-devel] About the size limitation of the package

2021-05-26 Thread Vincent Carey
On Wed, May 26, 2021 at 5:55 PM Stuart Lee wrote: > Hi You and Lori, > > Are fitted models in scope for ExperimentHub? I thought it was more for > data. Maybe there should be a ModelHub for developers to include trained > models from papers in their packages? > > @You: if that model has been

Re: [Bioc-devel] About the size limitation of the package

2021-05-26 Thread Stuart Lee
Hi You and Lori, Are fitted models in scope for ExperimentHub? I thought it was more for data. Maybe there should be a ModelHub for developers to include trained models from papers in their packages? @You: if that model has been fitted in R take a look at https://github.com/tidymodels/butcher

Re: [Bioc-devel] %outside% on GRanges

2021-05-26 Thread Oleksii Nikolaienko
Thanks very much for the explanation, Jim. Best, Oleksii On Wed, 26 May 2021 at 16:28, James W. MacDonald wrote: > Hi Oleksii, > > That function is just a simplification of the negation of overlapsAny: > > > getAnywhere("%outside%") > A single object matching '%outside%' was found > It was

Re: [Bioc-devel] %outside% on GRanges

2021-05-26 Thread James W. MacDonald
Hi Oleksii, That function is just a simplification of the negation of overlapsAny: > getAnywhere("%outside%") A single object matching '%outside%' was found It was found in the following places package:IRanges namespace:IRanges with value function (query, subject) !overlapsAny(query,
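To make the "negation of overlapsAny" reading concrete, here is a minimal sketch in C++, not the IRanges implementation. The `Range` struct and the `overlaps`, `overlaps_any` and `outside` helpers are hypothetical names invented for illustration. The sketch assumes closed intervals and assumes that two ranges can only overlap when they sit on the same sequence, which is how the GRanges result in the original question differs from the plain ranges() result.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a single range; not a Bioconductor type.
// An empty seqname models a plain range with no sequence attached.
struct Range {
    std::string seqname;
    int start;  // closed interval [start, end]
    int end;
};

// Two ranges overlap only if they share a sequence and their intervals intersect.
bool overlaps(const Range& a, const Range& b) {
    assert(a.start <= a.end && b.start <= b.end);
    return a.seqname == b.seqname && a.start <= b.end && b.start <= a.end;
}

// For each query range: does it overlap at least one subject range?
std::vector<bool> overlaps_any(const std::vector<Range>& query,
                               const std::vector<Range>& subject) {
    std::vector<bool> hit(query.size(), false);
    for (std::size_t i = 0; i < query.size(); ++i) {
        for (const Range& s : subject) {
            if (overlaps(query[i], s)) {
                hit[i] = true;
                break;
            }
        }
    }
    return hit;
}

// "Outside" is the element-wise negation of overlaps_any.
std::vector<bool> outside(const std::vector<Range>& query,
                          const std::vector<Range>& subject) {
    std::vector<bool> result = overlaps_any(query, subject);
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = !result[i];
    }
    return result;
}
```

With chr1:100-200 as the query and chr2:150-250 as the subject, `outside` yields true because the sequences differ. With the same coordinates and no sequence names it yields false because 100-200 and 150-250 intersect. This matches the TRUE and FALSE reported in the original question.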

[Bioc-devel] %outside% on GRanges

2021-05-26 Thread Oleksii Nikolaienko
Dear Bioc team, The %outside% operator from IRanges works as one would expect even if GRanges objects are supplied as operands: > a <- as("chr1:100-200", "GRanges") > b <- as("chr2:150-250", "GRanges") > IRanges::`%outside%`(a, b) [1] TRUE > IRanges::`%outside%`(ranges(a), ranges(b)) [1] FALSE It

Re: [Bioc-devel] About the size limitation of the package

2021-05-26 Thread Kern, Lori
Please consider using ExperimentHub to host the large data file. More information can be found here: https://bioconductor.org/packages/devel/bioc/vignettes/AnnotationHub/inst/doc/CreateAHubPackage.html Cheers, Lori Shepherd Bioconductor Core Team Roswell Park Comprehensive Cancer Center

Re: [Bioc-devel] C++ parallel computing

2021-05-26 Thread Martin Morgan
The best way to process large files is in chunks using BamFile(…, yieldSize = …) and by using ScanBamParam() to select just the components of the bam files of interest. The number of cores is basically irrelevant for input -- you'll be using just one, so choose yieldSize to use modest amounts

[Bioc-devel] About the size limitation of the package

2021-05-26 Thread You Zhou
Dear Bioc team, I am compiling a package "m6Aboost" and planning to submit it to Bioconductor. This package uses a trained machine learning model to identify the correct m6A signals from the miCLIP2 data set (more detail about this machine learning model can be found in our paper

Re: [Bioc-devel] C++ parallel computing

2021-05-26 Thread Aaron Lun
Incidentally, I was reflecting on this topic the other day and was wondering whether BiocParallel could have something like OpenMPParam() that sets the number of threads to some non-zero value via omp_set_num_threads(). This would provide a consistent framework through which users could
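For readers unfamiliar with the C++ side of this suggestion, the following is a minimal sketch of what setting the thread count via omp_set_num_threads() looks like in compiled code. It is not part of BiocParallel, and OpenMPParam() remains only a proposal in this thread. The helper names `set_thread_count` and `parallel_sum` are hypothetical. The sketch assumes the compiler may or may not have OpenMP enabled, so it falls back to serial execution when `_OPENMP` is not defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: request a thread count for later parallel regions.
// Returns the number of threads OpenMP reports it may use, or 1 when the
// code was compiled without OpenMP support.
int set_thread_count(int n_threads) {
    assert(n_threads > 0);  // the thread asks for a non-zero value
#ifdef _OPENMP
    omp_set_num_threads(n_threads);
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical example workload: sum a vector, in parallel when OpenMP is on.
double parallel_sum(const std::vector<double>& x) {
    double total = 0.0;
    const long n = static_cast<long>(x.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : total)
#endif
    for (long i = 0; i < n; ++i) {
        total += x[static_cast<std::size_t>(i)];
    }
    return total;
}
```

A caller would invoke `set_thread_count()` once with the user's chosen value before running `parallel_sum()`. A wrapper at the R level could pass that value down in the same way, which is the consistency the message above is asking for.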