date:20190913

Re: [Bioc-devel] new package for accessing some chemical and biological databases

2019-09-13 Thread Pierrick Roger

Thank you all for your explanations and suggestions.

First, I'd like to state that the repository `biodb-repos` is not part of
the package.
Let me explain a bit about my package. biodb offers a unified framework
for accessing very different databases. The user has to create a central
instance of the class Biodb, and from it they can create connectors
to diverse databases. Every connector uses another central class
(BiodbRequestScheduler), which controls the frequency of access to the
databases. BiodbRequestScheduler in turn uses another central
class (BiodbCache) to cache requests and their results, thus
skipping the connection to the databases the next time the same
requests are made. The same cache is also used when accessing
individual entries from the databases. So no cache is provided with the
package when installed, only a cache system. When the user runs the
package for the first time, an empty cache folder is created if the
cache system is enabled (it can be disabled), and files are written
to it and read from it as needed.
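
The cache-first behaviour described above can be sketched in base R as
follows. This is only an illustration under stated assumptions: the names
`cached_request` and `fake_fetch` and the file-naming scheme are
hypothetical and are not biodb's API.

```r
# Hypothetical sketch of a request cache; none of these names come from biodb.
cached_request <- function(url, cache_dir, fetch) {
  # Create the (initially empty) cache folder on first use.
  if (!dir.exists(cache_dir)) dir.create(cache_dir, recursive = TRUE)

  # Derive a file name from the request; a real system would likely hash it.
  key <- paste0(gsub("[^A-Za-z0-9]", "_", url), ".txt")
  path <- file.path(cache_dir, key)

  # Cache hit: return the stored result without contacting the database.
  if (file.exists(path)) return(readLines(path, warn = FALSE))

  # Cache miss: run the request, store the result, then return it.
  result <- fetch(url)
  writeLines(result, path)
  result
}

# Stand-in for a web service call; it counts how often it is invoked.
calls <- 0
fake_fetch <- function(url) {
  calls <<- calls + 1
  paste("entry for", url)
}

cache_dir <- tempfile("sketch-cache-")
first  <- cached_request("https://example.org/entry/1", cache_dir, fake_fetch)
second <- cached_request("https://example.org/entry/1", cache_dir, fake_fetch)
```

The second call is served from the cache folder, so the stand-in service
is contacted only once.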

All databases differ in the way they provide access to their
services or their data. But to sum up, I would say there are two main
types that matter for the cache system:
 1. Databases that offer web services for accessing entries by
 identifier.
 2. Databases that do not offer such services, but allow downloading
 the whole database.
Among the second type are Massbank and HMDB Metabolites, which
are quite large. Downloading those databases takes some time. The
first type is not free of issues either, as all databases are regularly
subject to instability in their services.
So, in order to avoid failures on Travis-CI due to:
 1. Unavailability of web services.
 2. Long request or download times.
I decided to use a pre-built cache for running "R CMD check" on
Travis-CI.

I think the "longtest" feature you propose is interesting, but it does
not address the issue of "Unavailability of web services". Also, the issue
concerns not only tests but also examples and vignettes, since I've
tried to provide examples for all database connectors, and I've tried to
write real use cases in vignettes that run several requests that could
take a long time.

I could of course put `\dontrun{}` directives around all examples, and also
run only a minimal set of the tests if the pre-built cache is not
present (I could still run the full tests on Travis-CI with the GitHub
version of my package). But, apart from the fact that I do not find
these solutions satisfying, I wonder what to do with the vignettes.
Maybe I could provide pre-built vignettes? How would that be possible?
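
As a rough illustration of the "minimal set of tests when the pre-built
cache is absent" idea, here is a base R sketch. The helper names
(`has_prebuilt_cache`, `select_tests`) and the stand-in test functions are
hypothetical; they only show the selection logic, not how biodb's tests
are organised.

```r
# Hypothetical helper: TRUE when a cache folder exists and is non-empty.
has_prebuilt_cache <- function(cache_dir) {
  dir.exists(cache_dir) && length(list.files(cache_dir)) > 0
}

# Pick the full test set only when a pre-built cache is available.
select_tests <- function(cache_dir, minimal_tests, full_tests) {
  if (has_prebuilt_cache(cache_dir)) full_tests else minimal_tests
}

# Stand-in test sets: each test is a function returning TRUE on success.
minimal_tests <- list(offline = function() TRUE)
full_tests <- c(minimal_tests, list(online = function() TRUE))

# Case 1: the cache folder exists but is empty, so only minimal tests run.
empty_dir <- tempfile("no-cache-")
dir.create(empty_dir)
without_cache <- names(select_tests(empty_dir, minimal_tests, full_tests))

# Case 2: the cache folder holds at least one file, so all tests run.
filled_dir <- tempfile("cache-")
dir.create(filled_dir)
writeLines("cached entry", file.path(filled_dir, "entry-1.txt"))
with_cache <- names(select_tests(filled_dir, minimal_tests, full_tests))
```

This does not solve the vignette question; it only covers the tests.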

On Fri 13 Sep 19 15:36, Pages, Herve wrote:
> Hi,
> 
> On 9/13/19 06:38, Morgan, Martin wrote:
> > Putting bioc-devel back in the loop.
> > 
> > I think that the straight-forward answer to your original query is 'no, git 
> > modules are not supported'.
> > 
> > I think we'd carry on and say 'packages should be self-contained and 
> > conform to the Bioconductor size and time constraints', so you cannot have 
> > a very large package or a package that takes a long time to check, and you 
> > can't download part of the package from some alternative source (except 
> > perhaps AnnotationHub or ExperimentHub).
> > 
> > While we 'could' make special accommodations on the build systems to 
> > support your package, e.g., parsing a yaml file, we have found that this is 
> > not a fruitful endeavor.
> > 
> > A natural place to put files used in tests would be in the /tests 
> > directory; these are not included in the installed package. But it seems 
> > likely that including your tests would violate the time and / or space 
> > limitations we place on packages.
> 
> If you put your tests in the longtests/ folder of your package, we'll 
> run them once a week and they'll be allowed to run for 6h (instead of 40 
> min. for the tests in the tests/ folder) before we declare a timeout.
> 
> The results for the "long tests" checks are published here:
> 
>    https://bioconductor.org/checkResults/3.10/bioc-longtests-LATEST/
> 
> Only 1 package (beachmat) effectively uses this service so far (the 
> other 3 packages you see there are not running real tests).
> 
> This service was implemented a couple of years ago and is FREE!
> 
> > 
> > It seems likely that this leads to the question you pose below, which is 
> > how do you know that you're running on the build system so that you can 
> > perform more modest computations? This comes up here
> > 
> >
> >   https://stat.ethz.ch/pipermail/bioc-devel/2019-September/015518.html
> > 
> > and I'm a little surprised that Herve is not willing to commit to an easy 
> > answer, perhaps because this opens the door to people circumventing even

Re: [Bioc-devel] Duplicated method names in purrr and GenomicRanges

2019-09-13 Thread Henrik Bengtsson

Just an FYI, *if* all you're doing is rbind():ing those data frames,
then you're better off doing:

  df <- do.call(rbind, dfs)

than

  df <- Reduce(rbind, dfs)

because the former is faster and more memory efficient:

> dfs <- rep(list(iris), times=100)
> bench::mark(df <- Reduce(rbind, dfs))[,1:7]
# A tibble: 1 x 7
  expression                    min   median `itr/sec` mem_alloc `gc/sec` n_itr
1 df <- Reduce(rbind, dfs)   65.5ms   68.6ms      13.6     108MB     30.7     8
Warning message:
Some expressions had a GC in every iteration; so filtering is disabled.

> bench::mark(df <- do.call(rbind, dfs))[,1:7]
# A tibble: 1 x 7
  expression                     min   median `itr/sec` mem_alloc `gc/sec` n_itr
1 df <- do.call(rbind, dfs)   8.67ms   9.47ms      105.      14MB     49.5    34
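
For reference, a small base R check that the two forms build the same
data frame, so only speed and memory use differ. This is a sketch on a
shorter list than the one timed above, and the variable names are just
for illustration.

```r
# Ten copies of the built-in iris data set, as in the benchmark but smaller.
dfs <- rep(list(iris), times = 10)

# Reduce() binds pairwise, creating an intermediate data frame at each step.
df_reduce <- Reduce(rbind, dfs)

# do.call() passes all data frames to a single rbind() call.
df_docall <- do.call(rbind, dfs)

# Both results should have the same rows, columns and contents.
stopifnot(isTRUE(all.equal(df_reduce, df_docall)))
```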

/Henrik

On Fri, Sep 13, 2019 at 5:39 AM  wrote:
>
> Thank you for all of your answers. Michael's solution works fine for me.
> I had to merge a list of data.frames, and used the solution from this thread:
>
> https://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list
>
> On 12.09.19 at 13:05, Michael Lawrence via Bioc-devel wrote:
> > Third option: use Reduce() from base instead of purrr::reduce().
> >
> > On Thu, Sep 12, 2019 at 2:54 AM O'CALLAGHAN Alan
> >  wrote:
> >> Hi,
> >>
> >> Two options.
> >>
> >> First option: import either purrr::reduce or GenomicRanges::reduce, and
> >> call the other with [pkg]::reduce.
> >>
> >> Second option: remove the import for both of these. Use purrr::reduce
> >> and GenomicRanges::reduce to call both functions.
> >>
> >> I think the second option leads to clearer code and would be my definite
> >> preference.
> >>
> >>
> >> On 12/09/2019 10:07, bio...@posteo.de wrote:
> >>> Dear all,
> >>>
> >>> I am developing a Bioconductor package and have a problem with two
> >>> methods which have the same name. I am using the reduce() function
> >>> from the R packages GenomicRanges and purrr. All methods from other
> >>> packages are imported with @importFrom in all of my functions.
> >>>
> >>>
> >>> During devtools::document() I get the following Warning:
> >>>
> >>> ...
> >>>
> >>> replacing previous import ‘GenomicRanges::reduce’ by ‘purrr::reduce’
> >>> when loading ‘testPackage’
> >>>
> >>> ...
> >>>
> >>>
> >>> Here are my NAMESPACE entries:
> >>>
> >>> # Generated by roxygen2: do not edit by hand
> >>>
> >>> export(mergeDataFrameList)
> >>> export(reduceDummy)
> >>> importFrom(GenomicRanges,GRanges)
> >>> importFrom(GenomicRanges,reduce)
> >>> importFrom(IRanges,IRanges)
> >>> importFrom(dplyr,"%>%")
> >>> importFrom(dplyr,left_join)
> >>> importFrom(dplyr,mutate)
> >>> importFrom(dplyr,pull)
> >>> importFrom(magrittr,"%<>%")
> >>> importFrom(purrr,reduce)
> >>> importFrom(tibble,tibble)
> >>>
> >>>
> >>> I am not using both reduce functions in the same function. To use the
> >>> GenomicRanges reduce function, I have to call this function like this:
> >>> GenomicRanges::reduce().
> >>>
> >>> I understand the warning and why I have to call the reduce function
> >>> like this. Is there a solution for this problem? Compiling a R package
> >>> with warnings and calling functions like this is not the best way I
> >>> guess.
> >>>
> >>> I am using R version 3.6.1 (2019-07-05)
> >>>
> >>> Thanks for help!
> >>>
> >>> Best,
> >>>
> >>> Tobias
> >>>
> >>> ___
> >>> Bioc-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] new package for accessing some chemical and biological databases

2019-09-13 Thread Pages, Herve

Hi,

On 9/13/19 06:38, Morgan, Martin wrote:
> Putting bioc-devel back in the loop.
> 
> I think that the straight-forward answer to your original query is 'no, git 
> modules are not supported'.
> 
> I think we'd carry on and say 'packages should be self-contained and conform 
> to the Bioconductor size and time constraints', so you cannot have a very 
> large package or a package that takes a long time to check, and you can't 
> download part of the package from some alternative source (except perhaps 
> AnnotationHub or ExperimentHub).
> 
> While we 'could' make special accommodations on the build systems to support 
> your package, e.g., parsing a yaml file, we have found that this is not a 
> fruitful endeavor.
> 
> A natural place to put files used in tests would be in the /tests directory; 
> these are not included in the installed package. But it seems likely that 
> including your tests would violate the time and / or space limitations we 
> place on packages.

If you put your tests in the longtests/ folder of your package, we'll 
run them once a week and they'll be allowed to run for 6h (instead of 40 
min. for the tests in the tests/ folder) before we declare a timeout.

The results for the "long tests" checks are published here:

   https://bioconductor.org/checkResults/3.10/bioc-longtests-LATEST/

Only 1 package (beachmat) effectively uses this service so far (the 
other 3 packages you see there are not running real tests).

This service was implemented a couple of years ago and is FREE!

> 
> It seems likely that this leads to the question you pose below, which is how 
> do you know that you're running on the build system so that you can perform 
> more modest computations? This comes up here
> 
>
>   https://stat.ethz.ch/pipermail/bioc-devel/2019-September/015518.html
> 
> and I'm a little surprised that Herve is not willing to commit to an easy 
> answer, perhaps because this opens the door to people circumventing even 
> minimal tests of their package...

Yes, we want to encourage extensive testing. IMO we should encourage 
people to use the long-tests service for tests that would otherwise take 
too long, rather than telling them how to disable or reduce the size of 
the tests that get run on our machines.

H.

> 
> Martin
> 
> On 9/13/19, 7:49 AM, "Shepherd, Lori"  wrote:
> 
>  
>  I'm including Martin and Herve for their opinions and to chime in too 
> since you took this conversation off the mailing list...
>  
>  
>  Could you please describe what you mean by works transparently?
>  
>  
>  We realize there isn't a function to call -  we were suggesting you make 
> a function to call that could be utilized
>  
>  
>  How does your caching system work?  I would also advise looking into 
> BiocFileCache - the Bioconductor suggested package for data caching of files.
>  
>  
>  
>  
>  The relevant files to look at for the environment calls can be found
>  
> https://github.com/Bioconductor/Contributions
>  
>  esp.
>  
> https://github.com/Bioconductor/Contributions#r-cmd-check-environment
>  
>  
>  
>  Please also be mindful of:
>  
>  Submission Guidelines
>  
> https://bioconductor.org/developers/package-submission/
>  
>  Package Guidelines
>  
> https://bioconductor.org/developers/package-guidelines/
>  
>  
>  
>  
>  More specifically on the single package builder we use:
>  R CMD BiocCheckGitClone 
>  R CMD build --keep-empty-dirs --no-resave-data  
>  
>  R CMD check --no-vignettes --timings 
>  
>  R CMD BiocCheck --build-output-file= --new-package 
> 
>  
>  
>  
>  With the environment variables set up as described in the above link
>  
>  
>  special files are not encouraged and as far as I am aware not allowed.  
> Herve who has more experience

Re: [Bioc-devel] new package for accessing some chemical and biological databases

2019-09-13 Thread Mike Smith

I've lost track of whether the infrastructure is actually used, but
certainly some packages have a 'longtests' folder, e.g.
https://github.com/LTLA/beachmat

On Fri, 13 Sep 2019 at 16:02, Kasper Daniel Hansen <
kasperdanielhan...@gmail.com> wrote:

> We used to have (? or at least discussed the possibility of) occasional
> extensive checking so we could have
>   tests
>   long_tests
> (names made up).
>
> On Fri, Sep 13, 2019 at 9:50 AM Martin Morgan 
> wrote:
>
> > Putting bioc-devel back in the loop.
> >
> > I think that the straight-forward answer to your original query is 'no,
> > git modules are not supported'.
> >
> > I think we'd carry on and say 'packages should be self-contained and
> > conform to the Bioconductor size and time constraints', so you cannot
> > have a very large package or a package that takes a long time to check,
> > and you can't download part of the package from some alternative source
> > (except perhaps AnnotationHub or ExperimentHub). I agree that the hubs
> > are not suitable for regularly updated files, and that they are meant for
> > biologically motivated rather than purely test-related data resources.
> >
> > While we 'could' make special accommodations on the build systems to
> > support your package, we have found that this is not a fruitful endeavor.
> >
> > A natural place to put files used in tests would be in the /tests
> > directory; these are not included in the installed package. But it seems
> > likely that including your tests would violate the time and / or space
> > limitations we place on packages.
> >
> > It seems likely that this leads to the question you pose below, which is
> > how do you know that you're running on the build system so that you can
> > perform more modest computations? This is similar to here, where special
> > resources are normally required
> >
> >   https://stat.ethz.ch/pipermail/bioc-devel/2019-September/015518.html
> >
> > Herve seems not willing to commit to an easy answer, perhaps because this
> > opens the door to people circumventing even minimal tests of their
> > package...
> >
> > Martin
> >
> > On 9/13/19, 7:49 AM, "Shepherd, Lori" 
> > wrote:
> >
> >
> > I'm including Martin and Herve for their opinions and to chime in too
> > since you took this conversation off the mailing list...
> >
> >
> > Could you please describe what you mean by works transparently?
> >
> >
> > We realize there isn't a function to call -  we were suggesting you
> > make a function to call that could be utilized
> >
> >
> > How does your caching system work?  I would also advise looking into
> > BiocFileCache - the Bioconductor suggested package for data caching of
> > files.
> >
> >
> >
> >
> > The relevant files to look at for the environment calls can be found
> > https://github.com/Bioconductor/Contributions
> >
> > esp.
> >
> > https://github.com/Bioconductor/Contributions#r-cmd-check-environment
> >
> >
> >
> > Please also be mindful of:
> >
> > Submission Guidelines
> > https://bioconductor.org/developers/package-submission/
> >
> > Package Guidelines
> > https://bioconductor.org/developers/package-guidelines/
> >
> >
> >
> >
> > More specifically on the single package builder we use:
> > R CMD BiocCheckGitClone 
> > R CMD build --keep-empty-dirs --no-resave-data  
> >
> > R CMD check --no-vignettes --timings 
> >
> > R CMD BiocCheck --build-output-file= --new-package
> > 
> >
> >
> >
> > With the environment variables set up as described in the above link
> >
> >
> > special files are not encouraged and as far as I am aware not
> > allowed.  Herve who has more experience with the builders may be able to
> > chime in further here.
> >
> >
> >
> >
> >
> >
> >
> > Lori Shepherd
> > Bioconductor Core Team
> > Roswell Park Cancer Institute
> > Department of Biostatistics & Bioinformatics
> > Elm & Carlton Streets
> > Buffalo, New York 14263
> >
> >
> > 
> > From: Pierrick Roger 
> > Sent: Friday, September 13, 2019 2:48 AM
> > To: Shepherd, Lori 
> > Subject: Re: [Bioc-devel] new package for accessing some chemical and
> > biological databases
> >
> > Thank you for the example. However, I do not think it is relevant.
> > This package has no examples, no tests and just one vignette. The `get`
> > function is part of the interface, so it makes sense to use it inside
> > the vignette. But for my package biodb, there is no function to call;
> > the cache works transparently.
> >
> > Could you please give me more details about the build process of
> > packages in Bioconductor? Are there some environment variables set
> > during the build so a package can know it is being built or checked by
> > Bioconductor? If this is the case, maybe I could write a tweak in my
> > code in order to download the cache when needed.
> > If not, would it be

Re: [Bioc-devel] Experiment package does not appear in build report

2019-09-13 Thread Shepherd, Lori

The ENCODExplorerData package was added as an annotation package, not as a data 
experiment package.  While you should continue to push changes to the git 
repository, please let us know when you need this annotation package updated on 
Bioconductor.  It will need to be added manually, as annotation packages are 
currently not auto-propagated.  We are in the process of moving these types of 
packages to an automatic git process, but it is still in development.


Daniel, can you please grab the latest version of the package from our git 
server and make sure it gets on the builder?



Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Éric Fournier 

Sent: Friday, September 13, 2019 9:51 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] Experiment package does not appear in build report

Hello,

I maintain the ENCODExplorerData package. Recently, it has come to our 
attention that changes to some of the online ENCODE data we rely on broke one 
of our helper functions, and I pushed a fix to BioConductor devel. However, I 
can't see our package in the ExperimentData build log, either for BioConductor 
3.9 or 3.10 
(https://www.bioconductor.org/checkResults/3.10/data-experiment-LATEST/). Am I 
looking in the wrong place?

Thanks for your time,
-Eric


Re: [Bioc-devel] new package for accessing some chemical and biological databases

2019-09-13 Thread Kasper Daniel Hansen

We used to have (? or at least discussed the possibility of) occasional
extensive checking so we could have
  tests
  long_tests
(names made up).

On Fri, Sep 13, 2019 at 9:50 AM Martin Morgan 
wrote:

> Putting bioc-devel back in the loop.
>
> I think that the straight-forward answer to your original query is 'no,
> git modules are not supported'.
>
> I think we'd carry on and say 'packages should be self-contained and
> conform to the Bioconductor size and time constraints', so you cannot have
> a very large package or a package that takes a long time to check, and you
> can't download part of the package from some alternative source (except
> perhaps AnnotationHub or ExperimentHub). I agree that the hubs are not
> suitable for regularly updated files, and that they are meant for
> biologically motivated rather than purely test-related data resources.
>
> While we 'could' make special accommodations on the build systems to
> support your package, we have found that this is not a fruitful endeavor.
>
> A natural place to put files used in tests would be in the /tests
> directory; these are not included in the installed package. But it seems
> likely that including your tests would violate the time and / or space
> limitations we place on packages.
>
> It seems likely that this leads to the question you pose below, which is
> how do you know that you're running on the build system so that you can
> perform more modest computations? This is similar to here, where special
> resources are normally required
>
>   https://stat.ethz.ch/pipermail/bioc-devel/2019-September/015518.html
>
> Herve seems not willing to commit to an easy answer, perhaps because this
> opens the door to people circumventing even minimal tests of their
> package...
>
> Martin
>
> On 9/13/19, 7:49 AM, "Shepherd, Lori" 
> wrote:
>
>
> I'm including Martin and Herve for their opinions and to chime in too
> since you took this conversation off the mailing list...
>
>
> Could you please describe what you mean by works transparently?
>
>
> We realize there isn't a function to call -  we were suggesting you
> make a function to call that could be utilized
>
>
> How does your caching system work?  I would also advise looking into
> BiocFileCache - the Bioconductor suggested package for data caching of
> files.
>
>
>
>
> The relevant files to look at for the environment calls can be found
> https://github.com/Bioconductor/Contributions
>
> esp.
> https://github.com/Bioconductor/Contributions#r-cmd-check-environment
>
>
>
> Please also be mindful of:
>
> Submission Guidelines
> https://bioconductor.org/developers/package-submission/
>
> Package Guidelines
> https://bioconductor.org/developers/package-guidelines/
>
>
>
>
> More specifically on the single package builder we use:
> R CMD BiocCheckGitClone 
> R CMD build --keep-empty-dirs --no-resave-data  
>
> R CMD check --no-vignettes --timings 
>
> R CMD BiocCheck --build-output-file= --new-package
> 
>
>
>
> With the environment variables set up as described in the above link
>
>
> special files are not encouraged and as far as I am aware not
> allowed.  Herve who has more experience with the builders may be able to
> chime in further here.
>
>
>
>
>
>
>
> Lori Shepherd
> Bioconductor Core Team
> Roswell Park Cancer Institute
> Department of Biostatistics & Bioinformatics
> Elm & Carlton Streets
> Buffalo, New York 14263
>
>
> 
> From: Pierrick Roger 
> Sent: Friday, September 13, 2019 2:48 AM
> To: Shepherd, Lori 
> Subject: Re: [Bioc-devel] new package for accessing some chemical and
> biological databases
>
> Thank you for the example. However, I do not think it is relevant. This
> package has no examples, no tests and just one vignette. The `get`
> function is part of the interface, so it makes sense to use it inside
> the vignette. But for my package biodb, there is no function to call;
> the cache works transparently.
>
> Could you please give me more details about the build process of
> packages in Bioconductor? Are there some environment variables set
> during the build so a package can know it is being built or checked by
> Bioconductor? If this is the case, maybe I could write a tweak in my
> code in order to download the cache when needed.
> If not, would it be possible to have them defined, or to have a
> special file `bioc.yml` defined at the root of the package in which I
> could write a `prebuild_step` command for retrieving the cache from my
> public GitHub repository `biodb-cache`?
>
> On Thu 12 Sep 19 17:12, Shepherd, Lori wrote:
> > Please look at  SRAdb  for an example of how we would recommend
> keeping the data.
> >
> > Summary:
> > On github or wherever you would like to host and keep the data
> current, please make sure it is

[Bioc-devel] Experiment package does not appear in build report

2019-09-13 Thread Éric Fournier

Hello,

I maintain the ENCODExplorerData package. Recently, it has come to our 
attention that changes to some of the online ENCODE data we rely on broke one 
of our helper functions, and I pushed a fix to BioConductor devel. However, I 
can't see our package in the ExperimentData build log, either for BioConductor 
3.9 or 3.10 
(https://www.bioconductor.org/checkResults/3.10/data-experiment-LATEST/). Am I 
looking in the wrong place?

Thanks for your time,
-Eric


Re: [Bioc-devel] new package for accessing some chemical and biological databases

2019-09-13 Thread Martin Morgan

Putting bioc-devel back in the loop.

I think that the straight-forward answer to your original query is 'no, git 
modules are not supported'.

I think we'd carry on and say 'packages should be self-contained and conform to 
the Bioconductor size and time constraints', so you cannot have a very large 
package or a package that takes a long time to check, and you can't download 
part of the package from some alternative source (except perhaps AnnotationHub 
or ExperimentHub). I agree that the hubs are not suitable for regularly updated 
files, and that they are meant for biologically motivated rather than purely 
test-related data resources.

While we 'could' make special accommodations on the build systems to support 
your package, we have found that this is not a fruitful endeavor.

A natural place to put files used in tests would be in the /tests directory; 
these are not included in the installed package. But it seems likely that 
including your tests would violate the time and / or space limitations we place 
on packages.

It seems likely that this leads to the question you pose below, which is how do 
you know that you're running on the build system so that you can perform more 
modest computations? This is similar to here, where special resources are 
normally required

  https://stat.ethz.ch/pipermail/bioc-devel/2019-September/015518.html

Herve seems not willing to commit to an easy answer, perhaps because this opens 
the door to people circumventing even minimal tests of their package...

Martin

On 9/13/19, 7:49 AM, "Shepherd, Lori"  wrote:

I'm including Martin and Herve for their opinions and to chime in too since 
you took this conversation off the mailing list... 

Could you please describe what you mean by works transparently? 

We realize there isn't a function to call -  we were suggesting you make a 
function to call that could be utilized 

How does your caching system work?  I would also advise looking into 
BiocFileCache - the Bioconductor suggested package for data caching of files. 

The relevant files to look at for the environment calls can be found 
https://github.com/Bioconductor/Contributions

esp.
https://github.com/Bioconductor/Contributions#r-cmd-check-environment

Please also be mindful of: 

Submission Guidelines
https://bioconductor.org/developers/package-submission/

Package Guidelines
https://bioconductor.org/developers/package-guidelines/

More specifically on the single package builder we use:
R CMD BiocCheckGitClone 
R CMD build --keep-empty-dirs --no-resave-data  

R CMD check --no-vignettes --timings  

R CMD BiocCheck --build-output-file= --new-package 

With the environment variables set up as described in the above link

special files are not encouraged and as far as I am aware not allowed.  
Herve who has more experience with the builders may be able to chime in further 
here. 

Lori Shepherd
Bioconductor Core Team
Roswell Park Cancer Institute
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263

From: Pierrick Roger 
Sent: Friday, September 13, 2019 2:48 AM
To: Shepherd, Lori 
Subject: Re: [Bioc-devel] new package for accessing some chemical and 
biological databases 

Thank you for the example. However, I do not think it is relevant. This
package has no examples, no tests and just one vignette. The `get`
function is part of the interface, so it makes sense to use it inside
the vignette. But for my package biodb, there is no function to call;
the cache works transparently.

Could you please give me more details about the build process of packages in
Bioconductor? Are there some environment variables set during the build
so a package can know it is being built or checked by Bioconductor? If
this is the case, maybe I could write a tweak in my code in order to
download the cache when needed.
If not, would it be possible to have them defined, or to have a
special file `bioc.yml` defined at the root of the package in which I
could write a `prebuild_step` command for retrieving the cache from my
public GitHub repository `biodb-cache`?

On Thu 12 Sep 19 17:12, Shepherd, Lori wrote:
> Please look at SRAdb for an example of how we would recommend keeping
> the data.
> 
> Summary:
> On GitHub or wherever you would like to host and keep the data current,
> please make sure it is publicly accessible.  Within your package, have a
> download function that retrieves the file from the public location.
> 
> It's not recommended, but this will be acceptable in this case.
> 
> Thank you.
> 
> 
>

Re: [Bioc-devel] Duplicated method names in purrr and GenomicRanges

2019-09-13 Thread bioinf

Thank you for all of your answers. Michael's solution works fine for me.
I had to merge a list of data.frames, and used the solution from this thread:


https://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list

On 12.09.19 at 13:05, Michael Lawrence via Bioc-devel wrote:

Third option: use Reduce() from base instead of purrr::reduce().
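
A minimal base R sketch of that third option, applied to merging a list
of data frames. The example data and the key column `id` are made up for
illustration; they are not taken from the package under discussion.

```r
# Three small data frames sharing the (hypothetical) key column "id".
dfs <- list(
  data.frame(id = 1:3, a = c("x", "y", "z")),
  data.frame(id = 1:3, b = c(10, 20, 30)),
  data.frame(id = 2:3, c = c(TRUE, FALSE))
)

# Reduce() folds merge() over the list: merge(merge(dfs[[1]], dfs[[2]]), dfs[[3]]).
# merge() defaults to an inner join, so only ids present in every frame remain.
merged <- Reduce(function(x, y) merge(x, y, by = "id"), dfs)
```

Because base::Reduce() is always available, this avoids importing
purrr::reduce and therefore the name clash with GenomicRanges::reduce.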

On Thu, Sep 12, 2019 at 2:54 AM O'CALLAGHAN Alan
 wrote:

Hi,

Two options.

First option: import either purrr::reduce or GenomicRanges::reduce, and
call the other with [pkg]::reduce.

Second option: remove the import for both of these. Use purrr::reduce
and GenomicRanges::reduce to call both functions.

I think the second option leads to clearer code and would be my definite
preference.
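A minimal sketch of the second option follows, assuming purrr, dplyr, GenomicRanges and IRanges are installed. Neither `reduce()` is imported; each call names its package. The two function bodies are hypothetical stand-ins, since the original implementations of `mergeDataFrameList()` and `reduceDummy()` were not posted.

```r
# Hypothetical package functions: both reduce() generics are reached through
# an explicit pkg:: prefix, so the NAMESPACE needs no importFrom(...,reduce).
merge_data_frames <- function(dfs, by) {
    purrr::reduce(dfs, function(x, y) dplyr::left_join(x, y, by = by))
}

merge_ranges <- function(gr) {
    GenomicRanges::reduce(gr)
}

# Small demonstration with made-up data.
dfs <- list(data.frame(id = 1:2, a = c("x", "y")),
            data.frame(id = 1:2, b = c(10, 20)))
joined <- merge_data_frames(dfs, by = "id")

gr <- GenomicRanges::GRanges(
    "chr1", IRanges::IRanges(start = c(1, 5, 20), end = c(10, 15, 30)))
collapsed <- merge_ranges(gr)
```

With this layout the "replacing previous import" warning should not appear, because neither `reduce` is attached to the package namespace.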


On 12/09/2019 10:07, bio...@posteo.de wrote:

Dear all,

I am developing a Bioconductor package and have a problem with two
methods which have the same name. I am using the reduce() function
from the R packages GenomicRanges and purrr. All methods from other
packages are imported with @importFrom in all of my functions.


During devtools::document() I get the following Warning:

...

replacing previous import ‘GenomicRanges::reduce’ by ‘purrr::reduce’
when loading ‘testPackage’

...


Here are my NAMESPACE entries:

# Generated by roxygen2: do not edit by hand

export(mergeDataFrameList)
export(reduceDummy)
importFrom(GenomicRanges,GRanges)
importFrom(GenomicRanges,reduce)
importFrom(IRanges,IRanges)
importFrom(dplyr,"%>%")
importFrom(dplyr,left_join)
importFrom(dplyr,mutate)
importFrom(dplyr,pull)
importFrom(magrittr,"%<>%")
importFrom(purrr,reduce)
importFrom(tibble,tibble)


I am not using both reduce functions in the same function. To use the
GenomicRanges reduce function, I have to call this function like this:
GenomicRanges::reduce().

I understand the warning and why I have to call the reduce function
like this. Is there a solution for this problem? Compiling an R package
with warnings and calling functions like this is not the best way, I
guess.

I am using R version 3.6.1 (2019-07-05)

Thanks for help!

Best,

Tobias

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.

Re: [Bioc-devel] Extending GenomicRanges::`intra-range-methods`

2019-09-13 Thread Michael Lawrence via Bioc-devel

Thanks for these suggestions; I think they're worth considering.

I've never been totally satisfied with (my function) flank(), because
it's limited and its arguments are somewhat obscure in meaning. You
can check out what we did in plyranges:
https://rdrr.io/bioc/plyranges/man/flank-ranges.html. Your functions
are more flexible, because they are two-way about the endpoint, like
promoters(). Sometimes I've solved that with resize(flank()), but
that's not ideal.  Maybe a better name is "straddle" for when ranges
straddle one of the endpoints? In keeping with the current pattern of
the Ranges API, there would be a single function: straddle(x, side, left,
right, ignore.strand=FALSE). So straddle(x, "start", -100, 10) would
be like promoters(x, 100, 10) for a positive or "*" strand range. That
brings up strandedness, which needs to be considered here. For
unstranded ranges, it may be that direct start() and end()
manipulation is actually more transparent than a special verb. I
wonder what Stuart Lee thinks?

The functions that involve reduce() wouldn't fit into the intrarange
operations, as they are summarizing ranges, not transforming them.
They may be going too far.

Michael
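One possible reading of the straddle() idea is sketched below. It is an assumption-laden illustration, not an existing GenomicRanges function: the off-by-one convention is chosen so that straddle(x, "start", -100, 10) reproduces promoters(x, 100, 10), and the handling of side = "end" and of minus-strand ranges is a guess at the intended semantics.

```r
library(GenomicRanges)

# Hypothetical straddle(): ranges spanning [anchor + left, anchor + right - 1]
# around the chosen endpoint, mirrored for "-" strand ranges unless
# ignore.strand = TRUE. "*" is treated like "+", as promoters() does.
straddle <- function(x, side = c("start", "end"), left, right,
                     ignore.strand = FALSE) {
    side <- match.arg(side)
    minus <- !ignore.strand & as.character(strand(x)) == "-"
    if (side == "start") {
        anchor <- ifelse(minus, end(x), start(x))
    } else {
        anchor <- ifelse(minus, start(x), end(x))
    }
    new_start <- ifelse(minus, anchor - right + 1L, anchor + left)
    new_end   <- ifelse(minus, anchor - left, anchor + right - 1L)
    ranges(x) <- IRanges(start = new_start, end = new_end)
    x
}

gr <- GRanges(c("chr1", "chr1", "chr2"),
              IRanges(start = c(1000, 2000, 3000), width = 50),
              strand = c("+", "-", "*"))
s <- straddle(gr, "start", -100, 10)
p <- promoters(gr, upstream = 100, downstream = 10)
```

Under these assumptions `s` and `p` cover the same coordinates; whether a half-open or closed convention is preferable for the right offset is exactly the kind of design question raised above.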


[Bioc-devel] Extending GenomicRanges::`intra-range-methods`

2019-09-13 Thread Bhagwat, Aditya

Dear bioc-devel,

The ?GenomicRanges::`intra-range-methods` are very useful for range
arithmetic.

Feedback request: would it be of general use to add the methods below to the
GenomicRanges::`intra-range-methods` palette (after properly S4-ing them)?
Or shall I keep them in multicrispr?
Additional feedback welcome as well (e.g. re-implementation of already existing
functionality).


1) Left flank

#' Left flank
#' @param gr   \code{\link[GenomicRanges]{GRanges-class}}
#' @param leftstart number: flank start (relative to range start)
#' @param leftend   number: flank end   (relative to range start)
#' @return a \code{\link[GenomicRanges]{GRanges-class}}
#' @export
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' gr <- read_bed(bedfile, bsgenome)
#' left_flank(gr)
left_flank <- function(gr, leftstart = -200, leftend = -1){

    # Assert
    assert_is_identical_to_true(is(gr, 'GRanges'))
    assert_is_a_number(leftstart)
    assert_is_a_number(leftend)

    # Flank
    newranges <- gr
    end(newranges)   <- start(gr) + leftend
    start(newranges) <- start(gr) + leftstart

    # Return
    newranges
}


2) Right flank

#' Right flank
#' @param gr         \code{\link[GenomicRanges]{GRanges-class}}
#' @param rightstart number: flank start (relative to range end)
#' @param rightend   number: flank end   (relative to range end)
#' @param plot       TRUE or FALSE: whether to plot sites and flanks
#' @param verbose    TRUE or FALSE: whether to report the flanks created
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @export
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' gr <- read_bed(bedfile, bsgenome)
#' right_flank(gr)
# Note: plot and verbose were used in the body but missing from the posted
# signature; adding them with FALSE defaults is an assumption.
right_flank <- function(
    gr, rightstart = 1, rightend = 200, plot = FALSE, verbose = FALSE
){

    # Assert
    assert_is_identical_to_true(is(gr, 'GRanges'))
    assert_is_a_number(rightstart)
    assert_is_a_number(rightend)
    assert_is_a_bool(plot)
    assert_is_a_bool(verbose)

    # Flank
    newranges <- gr
    start(newranges) <- end(newranges) + rightstart
    end(newranges)   <- end(newranges) + rightend

    # Plot (plot_intervals is a multicrispr helper)
    if (plot)  plot_intervals(GRangesList(sites = gr, rightflanks = newranges))

    # Report (cmessage and csign are multicrispr helpers)
    if (verbose)  cmessage('\t\t%d right flanks : [end%s%d, end%s%d]',
                           length(newranges),
                           csign(rightstart),
                           abs(rightstart),
                           csign(rightend),
                           abs(rightend))

    # Return
    newranges
}
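Regarding re-implementation of existing functionality: with the default offsets, the left and right flanks above appear to match what GenomicRanges::flank() returns when strand is ignored. The sketch below checks this on made-up ranges; it assumes GenomicRanges is installed and does not cover offsets that straddle the endpoint, which flank() cannot express.

```r
library(GenomicRanges)

# Made-up ranges on both strands.
gr <- GRanges("chr1",
              IRanges(start = c(1000, 5000), end = c(1100, 5200)),
              strand = c("+", "-"))

# Strand-agnostic flanks of width 200 on either side of each range:
# left  covers [start - 200, start - 1], like left_flank(gr, -200, -1)
# right covers [end + 1, end + 200],     like right_flank(gr, 1, 200)
left  <- flank(gr, width = 200, start = TRUE,  ignore.strand = TRUE)
right <- flank(gr, width = 200, start = FALSE, ignore.strand = TRUE)
```

The proposed functions remain more general, since their start and end offsets are set independently.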


3) Slop

#' Slop (i.e. extend left/right)
#' @param gr        \code{\link[GenomicRanges]{GRanges-class}}
#' @param leftstart number: flank start (relative to range start)
#' @param rightend  number: flank end   (relative to range end)
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @export
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' gr <- read_bed(bedfile, bsgenome)
#' slop(gr)
slop <- function(gr, leftstart = -22, rightend = 22){

    # Assert
    assert_is_identical_to_true(methods::is(gr, 'GRanges'))
    assert_is_a_number(leftstart)
    assert_is_a_number(rightend)

    # Slop
    newranges <- gr
    start(newranges) <- start(newranges) + leftstart
    end(newranges)   <- end(newranges)   + rightend

    # Return
    newranges
}


4) Flank fourways

#' Flank fourways
#'
#' Flank left and right, for both strands, and merge overlaps
#' @param gr          \code{\link[GenomicRanges]{GRanges-class}}
#' @param leftstart   number: left flank start  (relative to range start)
#' @param leftend     number: left flank end    (relative to range start)
#' @param rightstart  number: right flank start (relative to range end)
#' @param rightend    number: right flank end   (relative to range end)
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' granges <- read_bed(bedfile, bsgenome)
#' flank_fourways(granges)
#' @export
flank_fourways <- function(
    gr, leftstart = -200, leftend = -1, rightstart = 1, rightend = 200
){

    # Comply
    . <- NULL

    # Flank
    left  <- left_flank( gr, leftstart,  leftend)
    right <- right_flank(gr, rightstart, rightend)
    newranges <- c(left, right)

    # Complement
    newranges %<>% c(invertStrand(.))

    # Merge overlaps
    newranges %<>% reduce() # GenomicRanges::reduce

    # Return
    newranges
}



5) Slop fourways

#' Slop granges for both strands, merging overlaps
#' @param gr   \code{\link[GenomicRanges]{GRanges-class}}
#' @param leftstart number
#' @param rightend  number
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <-

Re: [Bioc-devel] DMRcate_2.0.0 and updated DMRcatedata

2019-09-13 Thread Shepherd, Lori

I have commented on the issue on the SPB so Kayla knows to ignore these ERRORs.
Cheers


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Tim Peters 
Sent: Thursday, September 12, 2019 8:58 PM
To: Turaga, Nitesh 
Cc: Bioc-devel ; Shepherd, Lori 

Subject: Re: [Bioc-devel] DMRcate_2.0.0 and updated DMRcatedata


Thanks Nitesh,

The only reason DMRcatedata isn't passing BiocCheck is because of package 
numbering and the fact that the old version already exists - which this new 
version is supposed to overwrite 
(http://bioconductor.org/spb_reports/DMRcatedata_buildreport_20190912205035.html).
 I understand the reviewer won't look at the package until it passes, but 
because this is an update of a (now) ExperimentData package can you ask the 
reviewer to overlook this please?

Best,

Tim

On 12/9/19 11:51 pm, Turaga, Nitesh wrote:

Thanks Tim. The assigned reviewer will review your package and provide 
suggestions as needed.

Best,

Nitesh



On Sep 12, 2019, at 3:15 AM, Tim Peters 
 wrote:

Hi Nitesh,

Thanks, I have transformed DMRcatedata into an ExperimentHub package and
started an issue here
https://github.com/Bioconductor/Contributions/issues/1247.

Cheers,

Tim

On 23/8/19 3:46 am, Turaga, Nitesh wrote:


Hi Tim,

Based on what you have mentioned, it seems that DMRcatedata should become an
experiment hub package.

https://bioconductor.org/packages/devel/bioc/vignettes/ExperimentHub/inst/doc/CreateAnExperimentHubPackage.html

Please take a look at that vignette.

Let me know if you have any questions.

Nitesh



On Aug 21, 2019, at 7:49 PM, Tim Peters 
 wrote:

   Hello,

   I have a major update of DMRcate and its associated data package,
   DMRcatedata. They both pass R CMD check and BiocCheck for R 3.6.1.
   Because the new version of DMRcate depends on the updated DMRcatedata
   package, can I host DMRcatedata (it's ~28MB) somewhere for you to
   download and update Bioconductor, and then I can sync the new version
   of DMRcate please?

   Best,

   Tim

--

===

Tim Peters, PhD

Bioinformatics Research Officer | Immunogenomics Laboratory | Immunology 
Division

Garvan Institute of Medical Research

384 Victoria St., Darlinghurst, NSW, Australia 2010

E: t.pet...@garvan.org.au | W: 
http://www.garvan.org.au | P: +612 9295 8325

NOTICE
Please consider the environment before printing this email. This message and 
any attachments are intended for the addressee named and may contain legally 
privileged/confidential/copyright information. If you are not the intended 
recipient, you should not read, use, disclose, copy or distribute this 
communication. If you have received this message in error please notify us at 
once by return email and then delete both messages. We accept no liability for 
the distribution of viruses or similar in electronic communications. This 
notice should not be removed.





This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.




Re: [Bioc-devel] Problems with converting GPos to GRanges.

2019-09-13 Thread Pages, Herve

Hi Charles,

Cryptic (but short) answer: the methods package **automatically** 
creates a coercion method from CTSS to GRanges for you. Unfortunately 
this method is broken.

Decryption:

Warning, this will take us to the very dark side of the S4 coercion system!

First this automatic method cannot be seen in a fresh session i.e. 
selectMethod() does NOT show it:

   > library(CAGEr)
   > selectMethod("coerce", c("CTSS", "GRanges"))
   Method Definition:

   function (from, to = "GRanges", strict = TRUE)
   {
       if (!isTRUEorFALSE(strict))
           stop("'strict' must be TRUE or FALSE")
       if (!strict)
           return(from)
       class(from) <- "GRanges"
       from@ranges <- as(from@ranges, "IRanges")
       from
   }

   Signatures:
           from             to
   target  "CTSS"           "GRanges"
   defined "UnstitchedGPos" "GRanges"

As we will realize later, the method found by selectMethod() is not the 
one that is effectively used when coercing a CTSS object to GRanges. The 
method is defined in the GenomicRanges package and does the right thing. 
Note that its signature is UnstitchedGPos,GRanges not CTSS,GRanges but 
since CTSS extends UnstitchedGPos, it makes sense that it got picked up 
by selectMethod(). Note that, predictably, getMethod() doesn't find it 
because, unlike selectMethod(), it does not use inheritance:

   > getMethod("coerce", c("CTSS", "GRanges"))
   Error in getMethod("coerce", c("CTSS", "GRanges")) :
 no method found for function 'coerce' and signature CTSS, GRanges

Where it becomes really nasty is that this method is NOT the method that 
will get called when we do as(ctss, "GRanges"):

   gr0 <- as(new("CTSS"), "GRanges")

How do I know that? Let's use getMethod() again:

   > getMethod("coerce", c("CTSS", "GRanges"))
   Method Definition:

   function (from, to = "GRanges", strict = TRUE)
   if (strict) {
       from <- {
           class(from) <- "UnstitchedGPos"
           from
       }
       {
           from <- from
           {
               value <- new("GRanges")
               for (what in c("seqnames", "ranges", "strand", "seqinfo",
                   "elementMetadata", "elementType", "metadata")) slot(value,
                   what) <- slot(from, what)
               value
           }
       }
   } else from

   Signatures:
           from   to
   target  "CTSS" "GRanges"
   defined "CTSS" "GRanges"

Surprise! And don't trust the appearances: this method is NOT defined by 
the GenomicRanges package (despite the  line). It's an automatic coercion method that 
is defined on-the-fly by the methods package the first time the coercion 
is "needed". The methods package automatically defines these coercion 
methods between 2 classes when (1) one class is a parent of the other 
and (2) the developer didn't define its own method.

In addition to not being detectable by selectMethod() or getMethod(),
another problem with these automatic coercion methods is that they tend
to be wrong. And that's what happens with the coercion method from CTSS
to GRanges: it returns a broken GRanges instance (which is why calling
promoters() on it fails later).

So all you need to do is define your own coercion method from CTSS to 
GRanges. You can do this with:

   setMethod("coerce", c("CTSS", "GRanges"), from_GPos_to_GRanges)

The from_GPos_to_GRanges() function is defined in GenomicRanges. It used 
to be .from_GPos_to_GRanges() but I just renamed and exported it in 
GenomicRanges 1.37.16:

https://github.com/Bioconductor/GenomicRanges/commit/d7a0353830ae01d7776924045bebeabd035659d7

Note that it's important that you use setMethod("coerce", ...) instead 
of setAs(...) here. This is another long story but if you are curious I 
have some grumpy comments about this in GenomicRanges/R/GPos-class.R and 
in other places e.g. here:

https://github.com/Bioconductor/S4Vectors/blob/f4b4ee769d2e57ecc4a672bc117bff5c28edfad4/R/HitsList-class.R#L91-L116
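The pattern recommended above can be tried on toy classes with only the methods package. In the sketch below, `Base`, `Derived` and the `counter` environment are hypothetical stand-ins (not CTSS or GRanges); the counter is there only to show that as() dispatches to the explicitly registered method rather than to an automatically generated one.

```r
library(methods)

# Hypothetical parent/child pair standing in for GRanges/CTSS.
setClass("Base", representation(x = "numeric"))
setClass("Derived", contains = "Base", representation(y = "numeric"))

# Counts calls to the explicit coercion method.
counter <- new.env()
counter$n <- 0L

# Explicit child-to-parent coercion, registered with setMethod("coerce", ...)
# before as() is first used on this pair of classes.
setMethod("coerce", c("Derived", "Base"),
          function(from, to = "Base", strict = TRUE) {
              counter$n <- counter$n + 1L
              new("Base", x = from@x)
          })

d <- new("Derived", x = 1, y = 2)
b <- as(d, "Base")
```

For the real case the method body would be the exported from_GPos_to_GRanges() function rather than a hand-written constructor call.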

Cheers,
H.

On 9/12/19 21:29, Charles Plessy wrote:
> Hello,
> 
> I am trying to make the CAGEr package (1.27.2) pass its regression
> tests, and I am still struggling with the refactoring of GPos to
> UnstitchedGPos and StitchedGPos in the devel branch of Bioconductor...
> 
> Currently, my problem is with the following command:
> 
> promoters(GRanges(CTSScoordinatesGR(exampleCAGEexp)))
> 
> CTSScoordinatesGR(exampleCAGEexp) returns a GPos object of transcription
> start sites, that I want to transform into promoter ranges.
> 
> While it worked in the past, I now have the following error:
> 
> Error in (function (classes, fdef, mtable)  :
> unable to find an inherited method for function ‘update_ranges’ for
> signature ‘"UnstitchedIPos"’
> 
> I tried the same command with the gpos1a and gpos1b example objects from
> the GPos manual page, and they do not trigger the error.
> 
> Further inspection showed me that
> GRanges(CTSScoordinatesGR(exampleCAGEexp)) returns a GRanges object
> where the ranges are still in UnstitchedIPos class, while in the case of
>
