Re: [Bioc-devel] Is biocLite() working for you right now? Could be a problem on our side. Issue with DelayedArray install with a PC on R 3.5.0

2019-09-17 Thread Leonardo Collado Torres
Hi,

I'm happy to report that this was never a connection issue. I was just
misled by 
https://stackoverflow.com/questions/25487593/r-what-does-incomplete-block-on-file-mean
which suggested that the download was incomplete as well as the timing
of the connection issue with the non-bioc website. It turns out that
the installation of clusterProfiler's dependencies DO.db, GO.db and
other packages that have no Windows binary can be resolved by editing
the TMP and TEMP environment variables to a path with no spaces for a
Windows user whose username includes a space. Doing so leads to a
utils::tempdir() with no spaces, which is used to form the destination
directory with default arguments when using utils::install.packages().
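
As a quick check (a minimal sketch, not taken from the blog post), one can
test whether the session temporary directory contains a space. The helper
name has_space() and the C:/Rtmp location are made up for illustration. Note
that, per ?tempdir, TMPDIR, TMP and TEMP are only consulted when R starts,
so they have to be set outside of R (for example in the Windows user
environment variables) rather than with Sys.setenv() in a running session.

```r
# Hypothetical helper: TRUE when a path contains at least one space.
has_space <- function(path) grepl(" ", path, fixed = TRUE)

# A temp dir like the one in Richard's log would be flagged ...
has_space("C:/Users/Richard Straub/AppData/Local/Temp/RtmpQNrgbq")
# ... while a space-free location (C:/Rtmp is an assumed example) is not.
has_space("C:/Rtmp/RtmpQNrgbq")

# Check the current session; the three variables are shown for reference.
has_space(tempdir())
Sys.getenv(c("TMPDIR", "TMP", "TEMP"))
```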

I wrote more about this at
http://lcolladotor.github.io/2019/09/18/windows-user-space-issues-with-installing-r-packages/#.XYGsZJNKg0o

Thank you again for your help on this a year ago! If any of you have
other solutions, we'd love to learn about them.

Best,
Leo


On Thu, Aug 2, 2018 at 2:22 PM Leonardo Collado Torres  wrote:
>
> Thanks Hervé and no problem about the Windows binary.
>
> Richard did get DelayedArray installed today (I think using biocLite)! ^^
>
> Best,
> Leo
>
> On Thu, Aug 2, 2018 at 1:59 PM Hervé Pagès  wrote:
>>
>>
>> Hi Leonardo,
>>
>> Hope you were able to solve your connection problems.
>> Just to let you know that I fixed the DelayedArray timeout on
>> Windows and the zip files are now available for this
>> platform:
>>
>> https://bioconductor.org/packages/DelayedArray
>>
>> Sorry for the inconvenience.
>>
>> Cheers,
>> H.
>>
>>
>> On 08/01/2018 02:23 PM, Leonardo Collado Torres wrote:
>> > Hi,
>> >
>> > Just as a quick update. Everything worked yesterday when Richard used
>> > another PC computer from his home (another network). So it's
>> > definitely not a Bioc problem.
>> >
>> > Yet I have no idea how to troubleshoot it beyond burning it all and
>> > starting from scratch: re-installing R and everything and checking if
>> > that works. Well, or testing using a hotspot wifi connection with one
>> > of our phones and seeing if that works to bypass the wifi network from
>> > work.
>> >
>> > Best,
>> > Leo
>> >
>> > On Mon, Jul 30, 2018 at 4:29 PM Tim Triche, Jr.  
>> > wrote:
>> >>
>> >> are you sure his tmp directory isn't full
>> >>
>> >> --t
>> >>
>> >>
>> >> On Mon, Jul 30, 2018 at 3:25 PM Leonardo Collado Torres 
>> >>  wrote:
>> >>>
>> >>>  From Richard:
>> >>>
>> >>> > BiocManager::install("DelayedArray")
>> >>>
>> >>> Bioconductor version 3.7 (BiocManager 1.30.1), R 3.5.0 (2018-04-23)
>> >>>
>> >>> Installing package(s) 'BiocVersion', 'DelayedArray'
>> >>>
>> >>> trying URL 
>> >>> 'https://bioconductor.org/packages/3.7/bioc/bin/windows/contrib/3.5/BiocVersion_3.7.4.zip'
>> >>>
>> >>> Content type 'application/zip' length 8649 bytes
>> >>>
>> >>> downloaded 8649 bytes
>> >>>
>> >>>
>> >>>
>> >>> package ‘BiocVersion’ successfully unpacked and MD5 sums checked
>> >>>
>> >>>
>> >>>
>> >>> The downloaded binary packages are in
>> >>>
>> >>> C:\Users\Richard 
>> >>> Straub\AppData\Local\Temp\RtmpQNrgbq\downloaded_packages
>> >>>
>> >>> installing the source package ‘DelayedArray’
>> >>>
>> >>>
>> >>>
>> >>> trying URL 
>> >>> 'https://bioconductor.org/packages/3.7/bioc/src/contrib/DelayedArray_0.6.2.tar.gz'
>> >>>
>> >>> Content type 'application/x-gzip' length 486139 bytes (474 KB)
>> >>>
>> >>> downloaded 474 KB
>> >>>
>> >>>
>> >>>
>> >>> Error in untar2(tarfile, files, list, exdir, restore_times) :
>> >>>
>> >>>   incomplete block on file
>> >>>
>> >>> In R CMD INSTALL
>> >>>
>> >>>
>> >>>
>> >>> The downloaded source packages are in
>> >>>
>> >>> ‘C:\Users\Richard
>> >>> Straub\AppData\Local\Temp\RtmpQNrgbq\downloaded_packages’
>> >>>
>> >>> installation path not writeable, unable to update packages: foreign,
>> >>> MASS, mgcv, survival
>> >>>
>> >>> Update old packages: 'openssl', 'stringi'
>> >>>
>> >>> Update all/some/none? [a/s/n]:
>> >>>
>> >>> n
>> >>>
>> >>> Warning message:
>> >>>
>> >>> In install.packages(pkgs = doing, lib = lib, repos = repos, ...) :
>> >>>
>> >>>   installation of package ‘DelayedArray’ had non-zero exit status
>> >>>
>> 
>> >>>
>> >>>
>> >>>
>> >>> Also, at 
>> >>> http://bioconductor.org/packages/release/bioc/html/DelayedArray.html
>> >>> I don't see the tar for the Windows binary.
>> >>>

Re: [Bioc-devel] read_bed()

2019-09-17 Thread Michael Lawrence via Bioc-devel
I think you probably made a mistake when dropping the columns. When I
provide the extraCols= argument (inventing my own names for things),
it almost works, but it breaks due to NAs in the extra columns. The
"." character is the standard way to express NA in BED files.  I've
added support for extra na.strings to version 1.45.6.

For reference, the call is like:

import("SRF.bed", extraCols=c(chr2="character", start2="integer",
end2="integer", mDux="factor", type="factor", pos1="integer",
pos2="integer", strand2="factor", from="factor", n="integer",
code="character", anno="factor", id="character", biotype="character",
score2="numeric" ), na.strings="NA")
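
Since extraCols= is similar to colClasses on read.table(), the idea can be
sketched in base R. This is only an illustration with made-up column names
and data, not what import() does internally: every column gets an explicit
type, and "." is declared as the NA string.

```r
# Two BED-like lines with two made-up extra columns (n, type);
# "." marks a missing value.
bed_text <- c("chr1\t100\t200\tsite1\t.\t+\t.\tA",
              "chr2\t300\t400\tsite2\t5\t-\t7\tB")

# Explicit name -> type specification, in the spirit of extraCols=.
cols <- c(chrom = "character", start = "integer", end = "integer",
          name = "character", score = "numeric", strand = "character",
          n = "integer", type = "factor")

# na.strings = "." turns the placeholder into NA instead of a parse error.
bed <- read.table(text = bed_text, sep = "\t", col.names = names(cols),
                  colClasses = cols, na.strings = ".")
str(bed)
```

Without the na.strings setting, the "." in a numeric or integer column would
stop the read with a type error, which is the kind of breakage described
above.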


On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
 wrote:
>
> Hi Michael,
>
> I removed the additional metadata columns in SRF.bed
> https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
>
> But still can't get rtracklayer::import.bed working:
>
> > rtracklayer::import.bed(bedfile)
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, 
> : scan() expected 'a real', got '1.168.595'
> > bedfile
> [1] "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
>
> Never mind, multicrispr function read_bed, based on data.table::fread, is 
> doing the job, so I will stick to that.
>
> Thank you for all feedback,
>
> Cheers,
>
> Aditya
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Tuesday, September 17, 2019 2:48 PM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Oh :-) - Thank you for explaining!
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 2:40 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Having a "." in the function name does not make something "S3".
> There's no dispatch from import() to import.bed(). Had I not been a
> total newb when I created rtracklayer, I would have called the
> function importBed() or something like that. Sorry for the confusion.
>
> On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
>  wrote:
> >
> > Oh, superb, thx!
> >
> > Interesting ... here you use S3 rather than S4 - I wonder about the design 
> > intention underlying these choices (I'm asking because I am trying to 
> > figure out myself when to use S3 and when to use S4 and whether to mix the 
> > two).
> >
> > Aditya
> >
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:23 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > The generic documentation does not mention it, but see ?import.bed.
> > It's similar to colClasses on read.table().
> >
> > On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > > Thank you Michael,
> > >
> > > How do I use the extraCols argument? The documentation does not mention 
> > > an `extraCols` argument explicitly, so it must be one of the ellipsis 
> > > arguments, but `?rtracklayer::import` does not mention it. Should I say 
> > > extraCols = 10 (ten extra columns) or so?
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 2:05 PM
> > > To: Bhagwat, Aditya
> > > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > > Subject: Re: [Bioc-devel] read_bed()
> > >
> > > It breaks it because it's not standard BED; however, using the
> > > extraCols= argument should work in this case. Requiring an explicit
> > > format specification is intentional, because it provides validation
> > > and type safety, and it communicates the format to a future reader.
> > > This also looks a bit like a bedPE file, so you might consider using
> > > the Pairs data structure.
> > >
> > > Michael
> > >
> > > On Tue, Sep 17, 2019 at 4:51 AM Bhagwat, Aditya
> > >  wrote:
> > > >
> > > > Hi Michael,
> > > >
> > > > > Yeah, I also noticed that the attachment was eaten when it entered the 
> > > > > bioc-devel list.
> > > >
> > > > The file is also accessible in the extdata of the multicrispr:
> > > > https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
> > > >
> > > > A bedfile to GRanges importer requires columns 1 (chrom), 2 
> > > > (chromStart), 3 (chromEnd), and column 6 (strand). All of these are 
> > > > present in SRF.bed.
> > > >
> > > > I am curious as to why you feel that having additional columns in a 
> > > > bedfile would break it?
> > > >
> > > > Cheers,
> > > >
> > > > Aditya
> > > >
> > > > 
> > > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > > Sent: 

Re: [Bioc-devel] Fwd: TissueEnrich problems reported in the Multiple platform build/check report for BioC 3.9

2019-09-17 Thread Pages, Herve
Thanks Ashish for the quick fix.

For the record the change to tidyr::gather() in the latest version of 
tidyr (v 1.0.0) also breaks the CNPBayes and SummarizedBenchmark packages:

 
https://bioconductor.org/checkResults/3.9/bioc-LATEST/CNPBayes/malbec2-buildsrc.html

 
https://bioconductor.org/checkResults/3.9/bioc-LATEST/SummarizedBenchmark/malbec2-buildsrc.html

Also FWIW the man page for gather() actually recommends to switch to 
pivot_longer():

   Description:

  *Retired*

  Development on ‘gather()’ is complete, and for new code we
  recommend switching to ‘pivot_longer()’, which is easier to use,
  more featureful, and still under active development. ‘df %>%
  gather("key", "value", x, y, z)’ is equivalent to ‘df %>%
  pivot_longer(c(x, y, z), names_to = "key", values_to = "value")’

  See more details in ‘vignette("pivot")’.
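
As a sketch of what such a change can look like (toy data with assumed column
names; this is not necessarily the fix that went into TissueEnrich), the
column selection is passed to gather() unnamed, with key and value given
explicitly, or the call is moved to pivot_longer():

```r
# Guarded so the snippet is a no-op when tidyr is not installed.
if (requireNamespace("tidyr", quietly = TRUE)) {
  # Toy expression table: two tissue columns followed by a gene column.
  exp_df <- data.frame(Adipose = c(1.63, 0), Brain = c(0.49, 3.05),
                       Gene = c("g1", "g2"))

  # gather(): name the key and value, leave the column selection unnamed.
  long_old <- tidyr::gather(exp_df, key = "Tissue", value = "value",
                            1:(ncol(exp_df) - 1))

  # pivot_longer(): the same reshaping with the newer interface.
  long_new <- tidyr::pivot_longer(exp_df, cols = 1:(ncol(exp_df) - 1),
                                  names_to = "Tissue", values_to = "value")
}
```

Both results hold the same values; only the row order differs, since
gather() stacks column by column and pivot_longer() goes row by row.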

Cheers,
H.

On 9/17/19 09:17, Ashish Jain wrote:
> Hi Lori,
> 
> Thank you for your help. I was able to reproduce the issue locally after
> updating the packages. I updated the vignette with the changes and pushed
> it on both RELEASE_3_9 and the development (3_10) branches. Again thank you
> for your help!
> 
> Regards,
> Ashish Jain
> 
> On Tue, Sep 17, 2019 at 7:06 AM Shepherd, Lori <
> lori.sheph...@roswellpark.org> wrote:
> 
>> I am able to reproduce the ERROR.  It seems to be an issue of trying to
>> subset.
>>
>> > head(exp %>% gather(Tissue))
>>           Tissue             value
>> 1 Adipose Tissue  1.63226821549951
>> 2 Adipose Tissue                 0
>> 3 Adipose Tissue                 0
>> 4 Adipose Tissue 0.485426827170242
>> 5 Adipose Tissue 0.584962500721156
>> 6 Adipose Tissue  3.05311133645956
>> > head(exp %>% gather(Tissue=1:(ncol(exp)-1)))
>> Error: 1 components of `...` had unexpected names.
>>
>> We detected these problematic arguments:
>> * `Tissue`
>>
>> Did you misspecify an argument?
>> Call `rlang::last_error()` to see a backtrace
>>
>>>
>>
>> You can see if there were any recent changes to package you depend on that
>> could have caused this.  A package you depend on may have changed their
>> functionality.  I needed to update packages (with BiocManager::install()
>>   update all )  in order to reproduce.
>>
>> Cheers,
>>
>> Lori Shepherd
>>
>> Bioconductor Core Team
>>
>> Roswell Park Cancer Institute
>>
>> Department of Biostatistics & Bioinformatics
>>
>> Elm & Carlton Streets
>>
>> Buffalo, New York 14263
>> --
>> *From:* Bioc-devel  on behalf of Ashish
>> Jain 
>> *Sent:* Monday, September 16, 2019 3:08 PM
>> *To:* bioc-devel@r-project.org 
>> *Subject:* [Bioc-devel] Fwd: TissueEnrich problems reported in the
>> Multiple platform build/check report for BioC 3.9
>>
>> Hi All,
>>
>> I just got the message that my package is giving an error during the build
>> process on Bioconductor build server. I saw the logs and found it is
>> failing while building the vignette. I am not sure why it's happening as I
>> have not updated the package since release and it was building without any
>> errors. Can someone look into this?
>>
>> Regards,
>> Ashish Jain
>>
>> -- Forwarded message -
>> From: 
>> Date: Mon, Sep 16, 2019 at 1:59 PM
>> Subject: TissueEnrich problems reported in the Multiple platform
>> build/check report for BioC 3.9
>> To: 
>>
>>
>> [This is an automatically generated email. Please don't reply.]
>>
>> Hi TissueEnrich maintainer,
>>
>> According to the Multiple platform build/check report for BioC 3.9,
>> the TissueEnrich package has the following problem(s):
>>
>>   o ERROR for 'R CMD build' on malbec2. See the details here:
>>
>>
>> https://master.bioconductor.org/checkResults/3.9/bioc-LATEST/TissueEnrich/malbec2-buildsrc.html
>>
>> Please take the time to address this by committing and pushing
>> changes to your package at git.bioconductor.org
>>
>> Notes:
>>
>>   * This was the status of your package at the time this email was sent to
>>     you. Given that the online report is updated daily (in normal
>>     conditions) you could see something different when you visit the
>>     URL(s) above, especially if you do so several days after you received
>>     this email.
>>
>>   * It is possible that the problems reported in this report are false
>>     positives, either because another package (from CRAN or Bioconductor)
>>     breaks your package (if yours depends on it) or because of a Build
>>     System problem. If this is the case, then you can ignore this email.
>>
>>   * Please check the report again 24h after you've committed your changes
>>     to the package and make sure that all the problems have gone.
>>
>>   * If you have questions about this report or need help with the
>>     maintenance of your package, please use the 

Re: [Bioc-devel] which web browser on the linux build machines? was Re: R environment variable which indicates "running in the bioc build system"?

2019-09-17 Thread Pages, Herve
Hi Paul,

We have firefox on the Linux builders:

 > getOption("browser")
[1] "/usr/bin/firefox"
 > browseURL("https://www.bioconductor.org/packages/igvR")

However it could be that this works, not because firefox somehow gets 
started in headless mode, but because we have Xvfb (X11 virtual frame 
buffer) run as a service in the background so firefox is able to connect 
to that. I'm not sure.
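
A rough way to probe this from R on a Linux machine, as a sketch under the
assumption that a running Xvfb is visible to the session as a non-empty
DISPLAY variable (has_display() is a made-up name, not part of any package):

```r
# Hypothetical helper: TRUE when an X11 display is advertised to the session.
has_display <- function() nzchar(Sys.getenv("DISPLAY"))

has_display()
getOption("browser")
```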

Hope this helps,
H.

On 9/17/19 05:54, Paul Shannon wrote:
> On Sep 12, 2019, at 3:13 PM, Pages, Herve  wrote:
>>
>> AFAIK the build machines have web browsers.
> 
> Hi Herve,
> 
> I’d like to do my local linux testing of igvR using the same web browser you 
> have on the bioc linux build machines. I’d be grateful if you could tell 
> me which one is installed.
> 
> Thanks!
> 
>   - Paul
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Fwd: TissueEnrich problems reported in the Multiple platform build/check report for BioC 3.9

2019-09-17 Thread Ashish Jain
Hi Lori,

Thank you for your help. I was able to reproduce the issue locally after
updating the packages. I updated the vignette with the changes and pushed
it on both RELEASE_3_9 and the development (3_10) branches. Again thank you
for your help!

Regards,
Ashish Jain

On Tue, Sep 17, 2019 at 7:06 AM Shepherd, Lori <
lori.sheph...@roswellpark.org> wrote:

> I am able to reproduce the ERROR.  It seems to be an issue of trying to
> subset.
>
> > head(exp %>% gather(Tissue))
>           Tissue             value
> 1 Adipose Tissue  1.63226821549951
> 2 Adipose Tissue                 0
> 3 Adipose Tissue                 0
> 4 Adipose Tissue 0.485426827170242
> 5 Adipose Tissue 0.584962500721156
> 6 Adipose Tissue  3.05311133645956
> > head(exp %>% gather(Tissue=1:(ncol(exp)-1)))
> Error: 1 components of `...` had unexpected names.
>
> We detected these problematic arguments:
> * `Tissue`
>
> Did you misspecify an argument?
> Call `rlang::last_error()` to see a backtrace
>
> >
>
> You can see if there were any recent changes to package you depend on that
> could have caused this.  A package you depend on may have changed their
> functionality.  I needed to update packages (with BiocManager::install()
>  update all )  in order to reproduce.
>
> Cheers,
>
> Lori Shepherd
>
> Bioconductor Core Team
>
> Roswell Park Cancer Institute
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
> --
> *From:* Bioc-devel  on behalf of Ashish
> Jain 
> *Sent:* Monday, September 16, 2019 3:08 PM
> *To:* bioc-devel@r-project.org 
> *Subject:* [Bioc-devel] Fwd: TissueEnrich problems reported in the
> Multiple platform build/check report for BioC 3.9
>
> Hi All,
>
> I just got the message that my package is giving an error during the build
> process on Bioconductor build server. I saw the logs and found it is
> failing while building the vignette. I am not sure why it's happening as I
> have not updated the package since release and it was building without any
> errors. Can someone look into this?
>
> Regards,
> Ashish Jain
>
> -- Forwarded message -
> From: 
> Date: Mon, Sep 16, 2019 at 1:59 PM
> Subject: TissueEnrich problems reported in the Multiple platform
> build/check report for BioC 3.9
> To: 
>
>
> [This is an automatically generated email. Please don't reply.]
>
> Hi TissueEnrich maintainer,
>
> According to the Multiple platform build/check report for BioC 3.9,
> the TissueEnrich package has the following problem(s):
>
>   o ERROR for 'R CMD build' on malbec2. See the details here:
>
>
> https://master.bioconductor.org/checkResults/3.9/bioc-LATEST/TissueEnrich/malbec2-buildsrc.html
>
> Please take the time to address this by committing and pushing
> changes to your package at git.bioconductor.org
>
> Notes:
>
>   * This was the status of your package at the time this email was sent to
> you.
> Given that the online report is updated daily (in normal conditions)
> you
> could see something different when you visit the URL(s) above,
> especially if
> you do so several days after you received this email.
>
>   * It is possible that the problems reported in this report are false
> positives,
> either because another package (from CRAN or Bioconductor) breaks your
> package (if yours depends on it) or because of a Build System problem.
> If this is the case, then you can ignore this email.
>
>   * Please check the report again 24h after you've committed your changes
> to the
> package and make sure that all the problems have gone.
>
>   * If you have questions about this report or need help with the
> maintenance of your package, please use the Bioc-devel mailing list:
>
>   https://bioconductor.org/help/mailing-list/
>
> (all package maintainers are requested to subscribe to this list)
>
> For immediate notification of package build status, please
> subscribe to your package's RSS feed. Information is at:
>
> https://bioconductor.org/developers/rss-feeds/
>
> Thanks for contributing to the Bioconductor project!
>
>
>
> --
> Regards,
> Ashish Jain
> Ph.D. Candidate
> Bioinformatics and Computational Biology
> Iowa State University
> Ph +1-317-529-7973
> Website: https://ashishjain1988.github.io/
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
> This email message may contain legally privileged and/or confidential
> information. If you are not the intended recipient(s), or the employee or
> agent responsible for the delivery of this message to the intended
> recipient(s), you are hereby notified that any disclosure, copying,
> distribution, or use of this email message is prohibited. If you have
> received this message in error, please notify the sender immediately by
> e-mail and delete this email message from your 

Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Hi Michael,

I removed the additional metadata columns in SRF.bed
https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed

But still can't get rtracklayer::import.bed working:

> rtracklayer::import.bed(bedfile)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : 
scan() expected 'a real', got '1.168.595'
> bedfile
[1] "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
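
The value scan() trips over, '1.168.595', looks like an integer written with
"." as a thousands separator. That is only a guess from the error message,
but if it is right, such a field cannot be parsed as a number until the
separators are removed. A small sketch (parse_grouped() is a made-up helper):

```r
# Hypothetical helper: drop "." thousands separators, then convert.
# Only appropriate when "." is never used as a decimal mark in the column.
parse_grouped <- function(x) as.numeric(gsub(".", "", x, fixed = TRUE))

suppressWarnings(as.numeric("1.168.595"))  # NA: not a valid number as written
parse_grouped("1.168.595")                 # 1168595
```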

Never mind, multicrispr function read_bed, based on data.table::fread, is
doing the job, so I will stick to that.

Thank you for all feedback,

Cheers,

Aditya



From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
Aditya [aditya.bhag...@mpi-bn.mpg.de]
Sent: Tuesday, September 17, 2019 2:48 PM
To: Michael Lawrence
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

Oh :-) - Thank you for explaining!

From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Tuesday, September 17, 2019 2:40 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

Having a "." in the function name does not make something "S3".
There's no dispatch from import() to import.bed(). Had I not been a
total newb when I created rtracklayer, I would have called the
function importBed() or something like that. Sorry for the confusion.
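
To illustrate the distinction with a toy example (all names below are made up
and unrelated to rtracklayer): dispatch needs a generic that calls
UseMethod(); a dot in a function name does nothing by itself.

```r
# An S3 generic: UseMethod() dispatches on the class of x.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default method"
describe.data.frame <- function(x, ...) "data.frame method"

describe(1)             # "default method"
describe(data.frame())  # "data.frame method"

# A plain function that merely has a dot in its name. Nothing dispatches
# to it; it has to be called by its full name.
load.bed <- function(path) paste("reading", path)
load.bed("SRF.bed")     # "reading SRF.bed"
```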

On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
 wrote:
>
> Oh, superb, thx!
>
> Interesting ... here you use S3 rather than S4 - I wonder about the design 
> intention underlying these choices (I'm asking because I am trying to figure 
> out myself when to use S3 and when to use S4 and whether to mix the two).
>
> Aditya
>
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 2:23 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> The generic documentation does not mention it, but see ?import.bed.
> It's similar to colClasses on read.table().
>
> On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
>  wrote:
> >
> > > Thank you Michael,
> >
> > How do I use the extraCols argument? The documentation does not mention an 
> > `extraCols` argument explicitly, so it must be one of the ellipsis 
> > arguments, but `?rtracklayer::import` does not mention it. Should I say 
> > extraCols = 10 (ten extra columns) or so?
> >
> > Aditya
> >
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:05 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > It breaks it because it's not standard BED; however, using the
> > extraCols= argument should work in this case. Requiring an explicit
> > format specification is intentional, because it provides validation
> > and type safety, and it communicates the format to a future reader.
> > This also looks a bit like a bedPE file, so you might consider using
> > the Pairs data structure.
> >
> > Michael
> >
> > On Tue, Sep 17, 2019 at 4:51 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Hi Michael,
> > >
> > > > Yeah, I also noticed that the attachment was eaten when it entered the 
> > > > bioc-devel list.
> > >
> > > The file is also accessible in the extdata of the multicrispr:
> > > https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
> > >
> > > A bedfile to GRanges importer requires columns 1 (chrom), 2 (chromStart), 
> > > 3 (chromEnd), and column 6 (strand). All of these are present in SRF.bed.
> > >
> > > I am curious as to why you feel that having additional columns in a 
> > > bedfile would break it?
> > >
> > > Cheers,
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 1:41 PM
> > > To: Bhagwat, Aditya
> > > Cc: Shepherd, Lori; bioc-devel@r-project.org
> > > Subject: Re: [Bioc-devel] read_bed()
> > >
> > > I don't see an attachment, nor can I find the multicrispr package
> > > anywhere. The "addressed soon" was referring to the BEDX+Y formats,
> > > which was addressed many years ago, so I've updated the documentation.
> > > Broken BED files will never be supported.
> > >
> > > Michael
> > >
> > > On Tue, Sep 17, 2019 at 4:17 AM Bhagwat, Aditya
> > >  wrote:
> > > >
> > > > Hi Lori,
> > > >
> > > > I remember now - I tried this function earlier, but it does not work 
> > > > for my bedfiles, like the one in attach.
> > > >
> > > > > bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
> > > > >
> > > > > targetranges <- 

Re: [Bioc-devel] Failing to build vignette because of problems with ImageMagick

2019-09-17 Thread Christian Mertes
I just came across this issue on rmarkdown which links the same problem
to BiocStyle.
The post is from Nov 2017.

https://github.com/rstudio/rmarkdown/issues/1207

Maybe this helps to understand the underlying problem?

It was suggested to check this in BiocStyle:

Have you reported to the authors of BiocStyle? It seems they enabled
the *knitr* hook |knitr::hook_pdfcrop| unconditionally (i.e. without
checking if ImageMagick has been installed).
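
A guarded version of that hook setup could look roughly like the sketch
below. This is an assumption about how such a check might be written, not
BiocStyle's actual code, and has_tool() is a made-up helper. One caveat: on
Windows, "convert" can also resolve to an unrelated system utility, so
finding an executable by that name is not proof that ImageMagick is there.

```r
# Hypothetical helper: TRUE when an executable of this name is on the PATH.
has_tool <- function(name) unname(nzchar(Sys.which(name)))

# Enable knitr's cropping hook only when knitr is installed and both
# external tools (pdfcrop for PDF, convert for other formats) are found.
if (requireNamespace("knitr", quietly = TRUE) &&
    has_tool("pdfcrop") && has_tool("convert")) {
  knitr::knit_hooks$set(crop = knitr::hook_pdfcrop)
}
```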

Best,

Christian

On 9/11/19 5:50 PM, Pages, Herve wrote:
> New to me too. But it seems that knitr suggests magick, which itself has
>
>    SystemRequirements: ImageMagick++: ImageMagick-c++-devel (rpm) or 
> libmagick++-dev (deb)
>
> Don't know when this knitr dep on magick was introduced though... Bummer!
>
> H.
>
> On 9/11/19 06:13, Kasper Daniel Hansen wrote:
>> Yeah, does this imply that the render operation uses (or tries to use) 
>> ImageMagick? That's news to me, but I am not following this closely.
>>
>> On Wed, Sep 11, 2019 at 5:21 AM Pages, Herve wrote:
>>
>> On 9/11/19 00:50, Vincent Carey wrote:
>>  > I seem to be running into a similar problem with BiocOncoTK on
>> windows
>>  >
>>  > The build report for tokay1 shows:
>>  >
>>  > Loading required package: ontologyIndex
>>  > Invalid Parameter - /figure-html
>>  > Warning in shell(paste(c(cmd, args), collapse = " ")) :
>>  >    'convert "BiocOncoTK_files/figure-html/lkgbm-1.png" -trim
>>  > "BiocOncoTK_files/figure-html/lkgbm-1.png"' execution failed with
>>  > error code 4
>>  > Invalid Parameter - /figure-html
>>  >
>>  > The figure code is introduced with ```{r
>> lkgbm,fig=TRUE,message=FALSE}
>>  > ... the 'convert' process is not requested by me
>>  >
>>  > Is the fig=TRUE problematic for windows?  It seems unnecessary.
>>
>> Not sure what's going on. A few observations:
>>
>> a) About 500 software packages use fig=TRUE.
>>
>> b) The convert warning is just a warning. The actual error in the case
>> of BiocOncoTK is:
>>
>>     Error: processing vignette 'BiocOncoTK.Rmd' failed with diagnostics:
>> argument is of length zero
>>
>>     Note that the ndexr vignette also fails with this error on tokay1
>> only but it doesn't have the convert warning (this vignette does not
>> use
>> 'fig' at all). So it's not clear to me that the "argument is of length
>> zero" error is related to the convert warning.
>>
>> c) The devel build report shows the convert warning for 4 other
>> packages
>> (CAGEfightR, CATALYST, CTDquerier, specL) but each of them actually
>> fails with a different error message:
>>
>>     CAGEfightR:
>>       colData(object1) not identical to colData(object2)
>>
>>     CATALYST:
>>       no slot of name "reducedDims" for this object of class "daFrame"
>>
>>     CTDquerier:
>>       bfcadd() failed; see warnings()
>>
>>     specL:
>>       pandoc.exe: Out of memory
>>
>>     These errors don't seem related to the convert warning either.
>>
>> So I'm wondering: could it be that the convert warning is actually
>> common but we generally don't see it because 'R CMD build' doesn't
>> report warnings? And that we just happen to see the warning when 'R CMD
>> build' fails to build a vignette.
>>
>> We'll investigate more.
>>
>> H.
>>
>>
>>  >
>>  > On Tue, Sep 10, 2019 at 11:35 AM Christian Mertes wrote:
>>  >
>>  >> Thanks a lot for the info! So from my understanding we don't use any
>>  >> trimming or editing function from ImageMagick directly. I think this is
>>  >> rather knitr based since we just include png files in the vignette.
>>  >>
>>  >> I guess it was a hiccup, since the error went away overnight.
>>  >>
>>  >> Best regards,
>>  >>
>>  >> Christian
>>  >>
>>  >> On 9/9/19 4:34 PM, Kasper Daniel Hansen wrote:
>>  >>> You don't declare any systems requirements for ImageMagick
>> (doing so
>>  >>> will probably not solve your problem, but you really should).
>>  >>>
>>  >>> Alternatively you could look into using the tools provided by the
>>  >>> magick package, which wraps ImageMagick.
>>  >>>
>>  >>> But it looks like you're editing PNG files for your vignette. I
>> would
>>  >>> really recommend not doing so. It introduces a system
>> dependency which
>>  >>> is just going to increase headaches on your end, for (perhaps)
>> no real
>>  >>> tangible benefits. If you're trimming PNGs, you should be able to
>>  >>> achieve the same effect when using the png device(s) in R, and that
>>  >>> will make everything more portable anyway.
>>  >>>
>>  >>> On Mon, Sep 9, 2019 at 9:42 AM Christian Mertes
>> mailto:mer...@in.tum.de>
>>  >>> >> 

[Bioc-devel] read.table fails with https protocol

2019-09-17 Thread Christian Mertes
Hey Bioc-devel community,

My package OUTRIDER is again failing on the build system, but only
sporadically. At first I thought it was due to the ImageMagick problem I
posted about some days ago, but that is really only a warning.

I think I found the problem, but I don't really understand it. Any help
is appreciated.

I assume from the docs that *read.table* works for http and https. But
on the build system and also locally sometimes this fails with the error:

Error in read.table(URL, sep = "\t") : no lines available in input

I dug into it a bit and it looks like *readLines* has problems
reading from https connections. See my examples below:

library(data.table)
library(utils)
library(curl)

# Link to a count table in TSV format at nature.com
URL <-
"media.nature.com/original/nature-assets/ncomms/2017/170612/ncomms15824/extref/ncomms15824-s1.txt"

# Fails with https
read.table(paste0("https://", URL), sep="\t", nrows=10)[,1:10]

# Works with plain http
read.table(paste0("http://", URL),  sep="\t", nrows=10)[,1:10]

# Works if using curl to read lines first
read.table(text=readLines(curl(paste0("https://", URL))), sep="\t",
nrows=10)[,1:10]

# Fails if using only readLines
read.table(text=readLines(paste0("https://", URL)), sep="\t",
nrows=10)[,1:10]

# Works with fread from data.table package (it uses curl to dump first
# the file)
data.frame(fread(paste0("https://", URL), sep="\t", nrows=10)[,1:10],
row.names=1)
data.frame(fread(paste0("http://", URL),  sep="\t", nrows=10)[,1:10],
row.names=1)

I guess my solution is to use http or to move to fread or curl. But I
think the clean way would be to use read.table, wouldn't it?
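
One more option that stays with read.table would be to download the file
to a temporary location first and parse the local copy. The sketch below
assumes the failure lies in the streaming https connection rather than in
the file itself, which I have not confirmed. The helper name
read_table_via_tempfile() is made up for illustration, and the demo uses a
local file:// URL (built for a Unix-style path) so it runs without network
access.

```r
# Hypothetical helper: fetch the URL to a temporary file, then parse it
# with read.table. Arguments in ... are passed on to read.table().
read_table_via_tempfile <- function(url, ..., quiet = TRUE) {
  dest <- tempfile()
  on.exit(unlink(dest), add = TRUE)
  status <- utils::download.file(url, destfile = dest, mode = "wb",
                                 quiet = quiet)
  # download.file() returns 0 on success; also guard against empty files
  if (status != 0L || !file.exists(dest) || file.size(dest) == 0) {
    stop("download failed or returned an empty file: ", url)
  }
  utils::read.table(dest, ...)
}

# Offline demo: write a tiny TSV and read it back through a file:// URL.
local_file <- tempfile(fileext = ".txt")
writeLines(c("gene\ts1\ts2", "A\t1\t2", "B\t3\t4"), local_file)
local_url <- paste0("file://", normalizePath(local_file, winslash = "/"))
counts <- read_table_via_tempfile(local_url, sep = "\t", header = TRUE)
print(counts)
```

For the real file one would pass paste0("https://", URL) instead of
local_url. An empty or failed download then stops with an explicit message
instead of "no lines available in input".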

Best,

Christian

-- 

Christian Mertes
PhD Student / Lab Administrator
Gagneur lab
 
Computational Genomics
I12 - Bioinformatics Department
Technical University Munich
Boltzmannstr. 3
85748 Garching, Germany

Mail: mer...@in.tum.de
Phone: +49-89-289-19416
http://gagneurlab.in.tum.de


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Bioc-devel] which web browser on the linux build machines? was Re: R environment variable which indicates "running in the bioc build system"?

2019-09-17 Thread Paul Shannon
On Sep 12, 2019, at 3:13 PM, Pages, Herve  wrote:
> 
> AFAIK the build machines have web browsers.

Hi Herve,

I’d like to do my local linux testing of igvR using the same web browser you 
have on the bioc linux build machines. I’d be grateful if you could tell me 
which one is installed.

Thanks!

 - Paul



Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Oh :-) - Thank you for explaining!


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Michael Lawrence via Bioc-devel
Having a "." in the function name does not make something "S3".
There's no dispatch from import() to import.bed(). Had I not been a
total newb when I created rtracklayer, I would have called the
function importBed() or something like that. Sorry for the confusion.
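
To make the distinction concrete, here is a small base R sketch. All
function and class names in it are made up for illustration and none of it
is rtracklayer code. It contrasts a real S3 generic, which dispatches via
UseMethod(), with an ordinary function that merely has a dotted sibling.

```r
# Case 1: a real S3 generic. UseMethod() looks up summarise_file.<class>.
summarise_file <- function(x, ...) UseMethod("summarise_file")
summarise_file.default <- function(x, ...) "default method"
summarise_file.bedlike <- function(x, ...) "bedlike method"

# Case 2: an ordinary function. load_file() never calls UseMethod(), so
# load_file.bedlike() is not reached through it, despite its name.
load_file <- function(x, ...) "plain function"
load_file.bedlike <- function(x, ...) "dotted name, but not a method"

obj <- structure(list(), class = "bedlike")

dispatched <- summarise_file(obj)  # goes through S3 dispatch
not_dispatched <- load_file(obj)   # just runs load_file() itself

print(dispatched)
print(not_dispatched)
```

The second pair mirrors the situation described above: the dot is only
part of the name, and the dotted function has to be called directly.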


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Oh, superb, thx!

Interesting ... here you use S3 rather than S4 - I wonder about the design 
intention underlying these choices (I'm asking because I am trying to figure 
out for myself when to use S3, when to use S4, and whether to mix the two).

Aditya


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Michael Lawrence via Bioc-devel
The generic documentation does not mention it, but see ?import.bed.
It's similar to colClasses on read.table().
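
As a rough illustration of the analogy, the sketch below first reads a
small made-up BED6+2 table with read.table() and colClasses, then passes a
named character vector as extraCols. The column names "signal" and "hits"
and the file contents are invented for the example. The rtracklayer call
only runs if the package is installed.

```r
# Two made-up records: six standard BED columns plus two extra columns.
bed_lines <- c(
  "chr1\t100\t200\tpeak1\t0\t+\t5.5\t3",
  "chr2\t300\t450\tpeak2\t0\t-\t7.25\t9"
)

# Base R: colClasses declares the type of every column, in file order.
tab <- read.table(
  text = bed_lines, sep = "\t",
  col.names  = c("chrom", "start", "end", "name", "score", "strand",
                 "signal", "hits"),
  colClasses = c("character", "integer", "integer", "character",
                 "numeric", "character", "numeric", "integer")
)
str(tab)

# rtracklayer: extraCols names and types only the trailing extra columns.
extra_cols <- c(signal = "numeric", hits = "integer")
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  bedfile <- tempfile(fileext = ".bed")
  writeLines(bed_lines, bedfile)
  gr <- rtracklayer::import(bedfile, format = "BED",
                            extraCols = extra_cols)
  print(gr)
}
```

So extraCols is not a count such as 10 but a name-to-class mapping for the
columns that follow the standard BED fields.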


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Thank you Michael,

How do I use the extraCols argument? The documentation does not mention an 
`extraCols` argument explicitly, so it must be one of the ellipsis arguments, 
but `?rtracklayer::import` does not mention it. Should I say extraCols = 10 
(ten extra columns) or so?

Aditya


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Michael Lawrence via Bioc-devel
It breaks it because it's not standard BED; however, using the
extraCols= argument should work in this case. Requiring an explicit
format specification is intentional, because it provides validation
and type safety, and it communicates the format to a future reader.
This also looks a bit like a bedPE file, so you might consider using
the Pairs data structure.
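
The type-safety point can be sketched with plain read.table(), which is
only a stand-in here and not the rtracklayer implementation. The record
below is made up: it has a second chromosome name in column 4, where the
declared layout expects an integer. With explicit column classes the
mismatch is reported as an error instead of being read in silently.

```r
# Made-up bedPE-like record: column 4 holds a chromosome name.
bedpe_like <- "chr1\t100\t200\tchr2\t300\t400"

# Declared layout (an assumption for this sketch): column 4 must be an
# integer and the last two columns are skipped via "NULL".
declared <- c("character", "integer", "integer", "integer",
              "NULL", "NULL")

result <- tryCatch(
  read.table(text = bedpe_like, sep = "\t", colClasses = declared),
  error = function(e) e
)

# With the declared classes the mismatch surfaces as an error object.
print(inherits(result, "error"))
print(conditionMessage(result))

# Without declared classes the same record is read without complaint.
undeclared <- read.table(text = bedpe_like, sep = "\t",
                         stringsAsFactors = FALSE)
print(class(undeclared$V4))
```

The message printed here should resemble the scan() error reported earlier
in this thread.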

Michael


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Hi Michael,

Yeah, I also noticed that the attachment was eaten when it entered the 
bioc-devel list. 

The file is also accessible in the extdata of the multicrispr package:
https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed

A bedfile to GRanges importer requires columns 1 (chrom), 2 (chromStart), 3 
(chromEnd), and column 6 (strand). All of these are present in SRF.bed.
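
To illustrate what I mean, here is a base R sketch, which is not the actual
read_bed() code. The two records are made up. It keeps only columns 1, 2, 3
and 6, skips everything else with colClasses = "NULL", and shifts the
0-based start to 1-based.

```r
# Made-up BED-like records with two additional trailing columns.
bed_lines <- c(
  "chr1\t100\t200\tsiteA\t0\t+\textra1\t42",
  "chr2\t300\t450\tsiteB\t0\t-\textra2\t17"
)

# Count the fields on the first line, then mark every column as skipped
# ("NULL") except chrom, chromStart, chromEnd and strand.
n_cols <- length(strsplit(bed_lines[1], "\t", fixed = TRUE)[[1]])
col_classes <- rep("NULL", n_cols)
col_classes[c(1, 2, 3, 6)] <- c("character", "integer", "integer",
                                "character")

ranges <- read.table(text = bed_lines, sep = "\t",
                     colClasses = col_classes)
names(ranges) <- c("chrom", "start", "end", "strand")

# BED intervals are 0-based and half-open, so only the start moves.
ranges$start <- ranges$start + 1L
print(ranges)
```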

I am curious as to why you feel that having additional columns in a bedfile 
would break it?

Cheers,

Aditya


From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Tuesday, September 17, 2019 1:41 PM
To: Bhagwat, Aditya
Cc: Shepherd, Lori; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I don't see an attachment, nor can I find the multicrispr package
anywhere. The "addressed soon" was referring to the BEDX+Y formats,
which was addressed many years ago, so I've updated the documentation.
Broken BED files will never be supported.

Michael

On Tue, Sep 17, 2019 at 4:17 AM Bhagwat, Aditya
 wrote:
>
> Hi Lori,
>
> I remember now - I tried this function earlier, but it does not work for my 
> bedfiles, like the one attached.
>
> > bedfile  <- system.file('extdata/SRF.bed', package = 'multicrispr')
> >
> > targetranges <- rtracklayer::import(bedfile, format = 'BED', genome = 
> > 'mm10')
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  
> : scan() expected 'an integer', got 'chr2'
> >
>
> Perhaps this sentence in `?rtracklayer::import` points to the source of the 
> error?
>
> many tools and organizations have extended BED with additional columns. These 
> are not officially valid BED files, and as such rtracklayer does not yet 
> support them (this will be addressed soon).
>
> Which brings the question: how soon is soon :-D ?
>
> Aditya
>
>
> 
> From: Shepherd, Lori [lori.sheph...@roswellpark.org]
> Sent: Tuesday, September 17, 2019 1:02 PM
> To: Bhagwat, Aditya; bioc-devel@r-project.org
> Subject: Re: read_bed()
>
> Please look at rtracklayer::import()  function that we recommend for reading 
> of BAM files along with other common formats.
>
> Cheers,
>
>
> Lori Shepherd
>
> Bioconductor Core Team
>
> Roswell Park Cancer Institute
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
>
> 
> From: Bioc-devel  on behalf of Bhagwat, 
> Aditya 
> Sent: Tuesday, September 17, 2019 6:58 AM
> To: bioc-devel@r-project.org 
> Subject: [Bioc-devel] read_bed()
>
> Dear bioc-devel,
>
> I had two feedback requests regarding the function read_bed().
>
> 1) Did I overlook, and therefore, re-invent existing functionality?
> 2) If not, would `read_bed` be suited for existence in a more foundational 
> package, e.g. `GenomicRanges`, given the rather basal nature of this 
> functionality?
>
> It reads a bedfile into a GRanges, converts the coordinates from 0-based 
> (bedfile) to 1-based (GRanges), adds BSgenome info (to allow for implicit 
> range validity checking) and plots the karyogram.
>
> Thank you for your feedback.
>
> Cheers,
>
> Aditya

Re: [Bioc-devel] Fwd: TissueEnrich problems reported in the Multiple platform build/check report for BioC 3.9

2019-09-17 Thread Shepherd, Lori
I am able to reproduce the ERROR.  It seems to be an issue of trying to subset.

> head(exp %>% gather(Tissue))
  Tissue value
1 Adipose Tissue  1.63226821549951
2 Adipose Tissue 0
3 Adipose Tissue 0
4 Adipose Tissue 0.485426827170242
5 Adipose Tissue 0.584962500721156
6 Adipose Tissue  3.05311133645956
> head(exp %>% gather(Tissue=1:(ncol(exp)-1)))
Error: 1 components of `...` had unexpected names.

We detected these problematic arguments:
* `Tissue`

Did you misspecify an argument?
Call `rlang::last_error()` to see a backtrace


>

Check whether any package you depend on has changed recently: a change in its 
functionality could have caused this. I had to update all my packages (with 
BiocManager::install(), choosing to update all) before I could reproduce the 
error.
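
One possible rewrite of the failing call is sketched below. It assumes the intent was to gather every column except the last into a key column named Tissue; the `expr` data frame is a made-up stand-in for the real expression table, and the tidyr package is assumed to be installed.

```r
library(tidyr)

# Hypothetical stand-in for the expression table: two tissue columns
# followed by a gene identifier column.
expr <- data.frame(
  `Adipose Tissue` = c(1.6, 0),
  Liver            = c(0.5, 3.1),
  Gene             = c("g1", "g2"),
  check.names      = FALSE,
  stringsAsFactors = FALSE
)

# Name the key column through `key =` and pass the column selection
# unnamed, so that nothing in `...` carries a name.
long <- gather(expr, key = "Tissue", value = "value", 1:(ncol(expr) - 1))
```

The selection `1:(ncol(expr) - 1)` is the same as in the failing call; only the way the key name is supplied differs.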

Cheers,


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Ashish Jain 

Sent: Monday, September 16, 2019 3:08 PM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] Fwd: TissueEnrich problems reported in the Multiple 
platform build/check report for BioC 3.9

Hi All,

I just got the message that my package is giving an error during the build
process on Bioconductor build server. I saw the logs and found it is
failing while building the vignette. I am not sure why it's happening as I
have not updated the package since release and it was building without any
errors. Can someone look into this?

Regards,
Ashish Jain

---------- Forwarded message ---------
From: 
Date: Mon, Sep 16, 2019 at 1:59 PM
Subject: TissueEnrich problems reported in the Multiple platform
build/check report for BioC 3.9
To: 


[This is an automatically generated email. Please don't reply.]

Hi TissueEnrich maintainer,

According to the Multiple platform build/check report for BioC 3.9,
the TissueEnrich package has the following problem(s):

  o ERROR for 'R CMD build' on malbec2. See the details here:

https://master.bioconductor.org/checkResults/3.9/bioc-LATEST/TissueEnrich/malbec2-buildsrc.html

Please take the time to address this by committing and pushing
changes to your package at git.bioconductor.org

Notes:

  * This was the status of your package at the time this email was sent to
    you. Given that the online report is updated daily (in normal conditions)
    you could see something different when you visit the URL(s) above,
    especially if you do so several days after you received this email.

  * It is possible that the problems reported in this report are false
    positives, either because another package (from CRAN or Bioconductor)
    breaks your package (if yours depends on it) or because of a Build
    System problem. If this is the case, then you can ignore this email.

  * Please check the report again 24h after you've committed your changes
    to the package and make sure that all the problems have gone.

  * If you have questions about this report or need help with the
maintenance of your package, please use the Bioc-devel mailing list:

  https://bioconductor.org/help/mailing-list/

(all package maintainers are requested to subscribe to this list)

For immediate notification of package build status, please
subscribe to your package's RSS feed. Information is at:

https://bioconductor.org/developers/rss-feeds/

Thanks for contributing to the Bioconductor project!



--
Regards,
Ashish Jain
Ph.D. Candidate
Bioinformatics and Computational Biology
Iowa State University
Ph +1-317-529-7973
Website: https://ashishjain1988.github.io/

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.


Re: [Bioc-devel] read_bed()

2019-09-17 Thread Michael Lawrence via Bioc-devel
I don't see an attachment, nor can I find the multicrispr package
anywhere. The "addressed soon" was referring to the BEDX+Y formats,
which was addressed many years ago, so I've updated the documentation.
Broken BED files will never be supported.

Michael

On Tue, Sep 17, 2019 at 4:17 AM Bhagwat, Aditya
 wrote:
>
> Hi Lori,
>
> I remember now - I tried this function earlier, but it does not work for my 
> bedfiles, like the attached one.
>
> > bedfile  <- system.file('extdata/SRF.bed', package = 'multicrispr')
> >
> > targetranges <- rtracklayer::import(bedfile, format = 'BED', genome = 
> > 'mm10')
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  
> : scan() expected 'an integer', got 'chr2'
> >
>
> Perhaps this sentence in `?rtracklayer::import` points to the source of the 
> error?
>
> many tools and organizations have extended BED with additional columns. These 
> are not officially valid BED files, and as such rtracklayer does not yet 
> support them (this will be addressed soon).
>
> Which brings the question: how soon is soon :-D ?
>
> Aditya
>
>
> 
> From: Shepherd, Lori [lori.sheph...@roswellpark.org]
> Sent: Tuesday, September 17, 2019 1:02 PM
> To: Bhagwat, Aditya; bioc-devel@r-project.org
> Subject: Re: read_bed()
>
> Please look at rtracklayer::import()  function that we recommend for reading 
> of BAM files along with other common formats.
>
> Cheers,
>
>
> Lori Shepherd
>
> Bioconductor Core Team
>
> Roswell Park Cancer Institute
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
>
> 
> From: Bioc-devel  on behalf of Bhagwat, 
> Aditya 
> Sent: Tuesday, September 17, 2019 6:58 AM
> To: bioc-devel@r-project.org 
> Subject: [Bioc-devel] read_bed()
>
> Dear bioc-devel,
>
> I had two feedback requests regarding the function read_bed().
>
> 1) Did I overlook, and therefore, re-invent existing functionality?
> 2) If not, would `read_bed` be suited for existence in a more foundational 
> package, e.g. `GenomicRanges`, given the rather basal nature of this 
> functionality?
>
> It reads a bedfile into a GRanges, converts the coordinates from 0-based 
> (bedfile) to 1-based (GRanges), adds BSgenome info (to allow for implicit 
> range validity checking) and plots the karyogram.
>
> Thank you for your feedback.
>
> Cheers,
>
> Aditya

Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Hi Lori,

I remember now - I tried this function earlier, but it does not work for my 
bedfiles, like the attached one.

> bedfile  <- system.file('extdata/SRF.bed', package = 'multicrispr')
>
> targetranges <- rtracklayer::import(bedfile, format = 'BED', genome = 'mm10')
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
scan() expected 'an integer', got 'chr2'
>

Perhaps this sentence in `?rtracklayer::import` points to the source of the 
error?

many tools and organizations have extended BED with additional columns. These 
are not officially valid BED files, and as such rtracklayer does not yet 
support them (this will be addressed soon).

Which brings the question: how soon is soon :-D ?

Aditya



From: Shepherd, Lori [lori.sheph...@roswellpark.org]
Sent: Tuesday, September 17, 2019 1:02 PM
To: Bhagwat, Aditya; bioc-devel@r-project.org
Subject: Re: read_bed()

Please look at rtracklayer::import()  function that we recommend for reading of 
BAM files along with other common formats.

Cheers,


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Bhagwat, 
Aditya 
Sent: Tuesday, September 17, 2019 6:58 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] read_bed()

Dear bioc-devel,

I had two feedback requests regarding the function read_bed().

1) Did I overlook, and therefore, re-invent existing functionality?
2) If not, would `read_bed` be suited for existence in a more foundational 
package, e.g. `GenomicRanges`, given the rather basal nature of this 
functionality?

It reads a bedfile into a GRanges, converts the coordinates from 0-based 
(bedfile) to 1-based (GRanges), adds BSgenome info (to allow for implicit 
range validity checking) and plots the karyogram.

Thank you for your feedback.

Cheers,

Aditya



Re: [Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Aha - thx!

Aditya

From: Shepherd, Lori [lori.sheph...@roswellpark.org]
Sent: Tuesday, September 17, 2019 1:02 PM
To: Bhagwat, Aditya; bioc-devel@r-project.org
Subject: Re: read_bed()

Please look at rtracklayer::import()  function that we recommend for reading of 
BAM files along with other common formats.

Cheers,


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Bhagwat, 
Aditya 
Sent: Tuesday, September 17, 2019 6:58 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] read_bed()

Dear bioc-devel,

I had two feedback requests regarding the function read_bed().

1) Did I overlook, and therefore, re-invent existing functionality?
2) If not, would `read_bed` be suited for existence in a more foundational 
package, e.g. `GenomicRanges`, given the rather basal nature of this 
functionality?

It reads a bedfile into a GRanges, converts the coordinates from 0-based 
(bedfile) to 1-based (GRanges), adds BSgenome info (to allow for implicit 
range validity checking) and plots the karyogram.

Thank you for your feedback.

Cheers,

Aditya




Re: [Bioc-devel] read_bed()

2019-09-17 Thread Shepherd, Lori
Please look at rtracklayer::import()  function that we recommend for reading of 
BAM files along with other common formats.

Cheers,


Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Bhagwat, 
Aditya 
Sent: Tuesday, September 17, 2019 6:58 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] read_bed()

Dear bioc-devel,

I had two feedback requests regarding the function read_bed().

1) Did I overlook, and therefore, re-invent existing functionality?
2) If not, would `read_bed` be suited for existence in a more foundational 
package, e.g. `GenomicRanges`, given the rather basal nature of this 
functionality?

It reads a bedfile into a GRanges, converts the coordinates from 0-based 
(bedfile) to 1-based (GRanges), adds BSgenome info (to allow for implicit 
range validity checking) and plots the karyogram.

Thank you for your feedback.

Cheers,

Aditya




[Bioc-devel] read_bed()

2019-09-17 Thread Bhagwat, Aditya
Dear bioc-devel,

I had two feedback requests regarding the function read_bed().

1) Did I overlook, and therefore, re-invent existing functionality?
2) If not, would `read_bed` be suited for existence in a more foundational 
package, e.g. `GenomicRanges`, given the rather basal nature of this 
functionality?

It reads a bedfile into a GRanges, converts the coordinates from 0-based 
(bedfile) to 1-based (GRanges), adds BSgenome info (to allow for implicit 
range validity checking) and plots the karyogram.
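
The coordinate conversion mentioned above can be illustrated with a small worked example on plain integer vectors (the values are invented for illustration):

```r
# BED stores half-open, 0-based intervals [start, end); GRanges uses
# closed, 1-based intervals [start, end]. Only the start shifts by one.
bed_start <- c(0L, 99L)
bed_end   <- c(10L, 200L)

gr_start <- bed_start + 1L   # 0-based -> 1-based
gr_end   <- bed_end          # exclusive 0-based end equals inclusive 1-based end

# The number of bases covered is the same in both encodings.
width_bed <- bed_end - bed_start
width_gr  <- gr_end - gr_start + 1L
```

This is why only `start` needs to be incremented when a 0-based bedfile is converted.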

Thank you for your feedback.

Cheers,

Aditya


#' Read bedfile into GRanges
#'
#' @param bedfile        file path
#' @param bsgenome       BSgenome, e.g. BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' @param zero_based     logical(1): whether bedfile ranges are 0-based
#' @param rm_duplicates  logical(1): whether to drop duplicated ranges
#' @param plot           logical(1): whether to plot the karyogram
#' @param verbose        logical(1): whether to report progress
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @note By convention BED files are 0-based. GRanges are always 1-based.
#'       A good discussion on these two alternative codings is given
#'       by Obi Griffith on Biostars: https://www.biostars.org/p/84686/
#' @examples
#' bedfile  <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' (gr <- read_bed(bedfile, bsgenome))
#' @importFrom  data.table  :=
#' @export
read_bed <- function(
    bedfile,
    bsgenome,
    zero_based    = TRUE,
    rm_duplicates = TRUE,
    plot          = TRUE,
    verbose       = TRUE
){
    # Assert
    assert_all_are_existing_files(bedfile)
    assert_is_a_bool(zero_based)
    assert_is_a_bool(rm_duplicates)
    assert_is_a_bool(plot)
    assert_is_a_bool(verbose)

    # Comply (silence R CMD check notes on data.table column names)
    seqnames <- start <- end <- strand <- .N <- gap <- width <- NULL

    # Read columns 1-3 (chrom, chromStart, chromEnd) and 6 (strand)
    if (verbose) cmessage('\tRead %s', bedfile)
    dt <- data.table::fread(bedfile, select = c(seq_len(3), 6),
            col.names = c('seqnames', 'start', 'end', 'strand'))
    data.table::setorderv(dt, c('seqnames', 'start', 'end', 'strand'))

    # Transform coordinates: 0-based -> 1-based
    if (zero_based){
        if (verbose) cmessage('\t\tConvert 0 -> 1-based')
        dt[, start := start + 1]
    }

    if (verbose) cmessage('\t\tRanges: %d ranges on %d chromosomes',
                        nrow(dt), length(unique(dt$seqnames)))

    # Drop duplicates
    if (rm_duplicates){
        is_duplicated <- cduplicated(dt)
        if (any(is_duplicated)){
            # Report how many ranges remain, then keep the non-duplicated rows
            if (verbose) cmessage('\t\t%d after removing duplicates',
                                sum(!is_duplicated))
            dt <- dt[!is_duplicated]
        }
    }

    # Turn into GRanges
    gr <- add_seqinfo(as(dt, 'GRanges'), bsgenome)

    # Plot and return
    title <- paste0(providerVersion(bsgenome), ': ', basename(bedfile))
    if (plot) plot_karyogram(gr, title)
    gr
}
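
The duplicate-removal step can be checked in isolation. The sketch below uses base R's `duplicated()` as a stand-in for the `cduplicated()` helper, whose definition is not shown in this post, and a made-up `ranges_df` table:

```r
# Hypothetical ranges; rows 1 and 2 are identical.
ranges_df <- data.frame(
  seqnames = c("chr1", "chr1", "chr2"),
  start    = c(101L, 101L, 301L),
  end      = c(200L, 200L, 450L),
  strand   = c("+", "+", "-"),
  stringsAsFactors = FALSE
)

# Flag rows that repeat an earlier row, then keep the others.
is_duplicated <- duplicated(ranges_df)
deduplicated  <- ranges_df[!is_duplicated, , drop = FALSE]
message(sprintf("%d after removing duplicates", nrow(deduplicated)))
```

Subsetting with the logical vector `!is_duplicated` (rather than the function `duplicated` itself) is what keeps the first copy of each range.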




Re: [Bioc-devel] Extending GenomicRanges::`intra-range-methods`

2019-09-17 Thread Bhagwat, Aditya
Owkies, will file a PR in one of the coming days. And continue the discussion 
when I do so.

Cheers!

Aditya


From: Stuart Lee [le...@wehi.edu.au]
Sent: Tuesday, September 17, 2019 5:33 AM
To: Michael Lawrence
Cc: Bhagwat, Aditya; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] Extending GenomicRanges::`intra-range-methods`

Hi Aditya,

I think straddle would be a great addition to plyranges. Happy for you to put 
in a PR and add you as a contributor.

Maybe instead of specifying the start etc. we could dispatch on anchored ranges? 
So we'd follow the anchor_start(gr) %>% straddle() pattern. We could also have 
the directed version for considering strands.

https://github.com/sa-lee/plyranges

Thanks,
Stuart

---
Stuart Lee
Visiting PhD Student - Ritchie Lab



On 13 Sep 2019, at 22:38, Michael Lawrence <lawrence.mich...@gene.com> wrote:

Thanks for these suggestions; I think they're worth considering.

I've never been totally satisfied with (my function) flank(), because
it's limited and its arguments are somewhat obscure in meaning. You
can check out what we did in plyranges:
https://rdrr.io/bioc/plyranges/man/flank-ranges.html. Your functions
are more flexible, because they are two-way about the endpoint, like
promoters(). Sometimes I've solved that with resize(flank()), but
that's not ideal.  Maybe a better name is "straddle" for when ranges
straddle one of the endpoints? In keeping with the current pattern of
Ranges API, there would be a single function: straddle(x, side, left,
right, ignore.strand=FALSE). So straddle(x, "start", -100, 10) would
be like promoters(x, 100, 10) for a positive or "*" strand range. That
brings up strandedness, which needs to be considered here. For
unstranded ranges, it may be that direct start() and end()
manipulation is actually more transparent than a special verb. I
wonder what Stuart Lee thinks?
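
To make the proposed semantics concrete, here is a minimal base-R sketch on a plain data frame. `straddle_df()` is a hypothetical name, not an existing Bioconductor function, and it assumes that the right offset is exclusive, so that straddling the start with -100 and 10 covers 110 bases like promoters(x, 100, 10).

```r
# Hypothetical sketch of the proposed straddle() semantics.
# x: data.frame with `seqnames`, integer `start`/`end`, character `strand`.
straddle_df <- function(x, side = c("start", "end"), left, right,
                        ignore.strand = FALSE) {
  side  <- match.arg(side)
  minus <- (!ignore.strand) & (x$strand == "-")

  # On the minus strand the stranded "start" is the larger coordinate.
  use_start_coord <- xor(side == "start", minus)
  anchor <- ifelse(use_start_coord, x$start, x$end)

  # `left` points upstream of the anchor and `right` downstream
  # (exclusive), so both offsets are mirrored on the minus strand.
  new_start <- ifelse(minus, anchor - right + 1, anchor + left)
  new_end   <- ifelse(minus, anchor - left,      anchor + right - 1)

  data.frame(seqnames = x$seqnames, start = new_start, end = new_end,
             strand = x$strand, stringsAsFactors = FALSE)
}

x <- data.frame(seqnames = c("chr1", "chr1"),
                start    = c(1000L, 1500L),
                end      = c(1200L, 2000L),
                strand   = c("+", "-"),
                stringsAsFactors = FALSE)
res <- straddle_df(x, "start", -100, 10)
```

Whether the right offset should be exclusive or inclusive is a design choice that the thread leaves open; the sketch only picks one option.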

The functions that involve reduce() wouldn't fit into the intrarange
operations, as they are summarizing ranges, not transforming them.
They may be going too far.

Michael

On Fri, Sep 13, 2019 at 4:48 AM Bhagwat, Aditya <aditya.bhag...@mpi-bn.mpg.de> wrote:

Dear bioc-devel,

The ?GenomicRanges::`intra-range-methods` are very useful for range arithmetic.

Feedback request: would it be of general use to add the methods below to the 
GenomicRanges::`intra-range-methods` palette (after properly S4-ing them)?
Or shall I keep them in multicrispr?
Additional feedback is welcome as well (e.g. on re-implementation of already 
existing functionality).


1) Left flank

#' Left flank
#' @param gr         \code{\link[GenomicRanges]{GRanges-class}}
#' @param leftstart  number: flank start (relative to range start)
#' @param leftend    number: flank end   (relative to range start)
#' @return a \code{\link[GenomicRanges]{GRanges-class}}
#' @export
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' gr <- read_bed(bedfile, bsgenome)
#' left_flank(gr)
left_flank <- function(gr, leftstart = -200, leftend = -1){

    # Assert
    assert_is_identical_to_true(is(gr, 'GRanges'))
    assert_is_a_number(leftstart)
    assert_is_a_number(leftend)

    # Flank: move the start first, so that the range never gets a negative
    # width while it is being updated (e.g. when leftend < -1)
    newranges <- gr
    start(newranges) <- start(gr) + leftstart
    end(newranges)   <- start(gr) + leftend

    # Return
    newranges
}


2) Right flank

#' Right flank
#' @param gr          \code{\link[GenomicRanges]{GRanges-class}}
#' @param rightstart  number: flank start (relative to range end)
#' @param rightend    number: flank end   (relative to range end)
#' @param plot        logical(1): whether to plot sites and flanks
#' @param verbose     logical(1): whether to report progress
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#' @export
#' @examples
#' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr')
#' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus
#' gr <- read_bed(bedfile, bsgenome)
#' right_flank(gr)
right_flank <- function(
    gr, rightstart = 1, rightend = 200, plot = TRUE, verbose = TRUE
){

    # Assert
    assert_is_identical_to_true(is(gr, 'GRanges'))
    assert_is_a_number(rightstart)
    assert_is_a_number(rightend)
    assert_is_a_bool(plot)
    assert_is_a_bool(verbose)

    # Flank: move the end first, so that the range never gets a negative
    # width while it is being updated (e.g. when rightstart > 1)
    newranges <- gr
    end(newranges)   <- end(gr) + rightend
    start(newranges) <- end(gr) + rightstart

    # Plot
    if (plot)  plot_intervals(GRangesList(sites = gr, rightflanks = newranges))

    # Report and return
    if (verbose) cmessage('\t\t%d right flanks : [end%s%d, end%s%d]',
                        length(newranges),
                        csign(rightstart),
                        abs(rightstart),
                        csign(rightend),
                        abs(rightend))
    newranges
}


3) Slop

#' Slop (i.e. extend left/right)
#' @param gr        \code{\link[GenomicRanges]{GRanges-class}}
#' @param leftstart number: flank start (relative to range start)
#' @param rightend  number: flank end   (relative to range end)
#' @return \code{\link[GenomicRanges]{GRanges-class}}
#'