[Bioc-devel] Organizing committee for BioC2020

2019-09-18 Thread Aedin Culhane

Dear Bioconductor Developer Community,

The organizing committee for BioC2020, to be held in Boston on July 
29-31, needs organizers to help in planning the event.


Participation involves attending approximately monthly remote meetings 
plus taking on responsibilities in one or more of the areas of:


* Outreach (developing materials, promotion)
* Website
* Workshop technical organization
* Program development
* Sponsorship and funding
* Code of Conduct training and enforcement
* Local organization in Boston

Please register your interest to participate on the Bioc2020 organizing 
committee at https://forms.gle/Zs1gn2T9RtQ7Xysv5


Thanks
Aedin Culhane, Levi Waldron

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] read_bed()

2019-09-18 Thread Michael Lawrence via Bioc-devel
I'd suggest separating the operations on the data from the interface.
You can have both, one layer for programming and another for
interactive analysis.
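
A minimal sketch of this layering in base R, assuming hypothetical function names read_bed_core() and read_bed_verbose() (neither is part of multicrispr or rtracklayer) and a plain three-column, tab-separated BED file:

```r
# Programming layer: only reads the file; no printing, no plotting.
# read_bed_core() is a hypothetical helper used for illustration.
read_bed_core <- function(file) {
  bed <- utils::read.table(file, sep = "\t", header = FALSE,
                           stringsAsFactors = FALSE, na.strings = ".")
  names(bed)[1:3] <- c("chrom", "start", "end")
  bed
}

# Interactive layer: wraps the core function and adds textual/visual feedback.
read_bed_verbose <- function(file, plot = interactive()) {
  bed <- read_bed_core(file)
  message(sprintf("Read %d ranges on %d chromosome(s)",
                  nrow(bed), length(unique(bed$chrom))))
  if (plot) {
    graphics::hist(bed$end - bed$start, main = "Range widths", xlab = "width")
  }
  invisible(bed)
}

# Usage with a small temporary file (invented data)
tmp <- tempfile(fileext = ".bed")
writeLines(c("chr1\t100\t200", "chr1\t300\t450", "chr2\t10\t60"), tmp)
bed <- read_bed_verbose(tmp, plot = FALSE)
```

Other functions can then depend on read_bed_core() alone, while the verbose wrapper remains available for interactive work.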

On Wed, Sep 18, 2019 at 5:05 AM Bhagwat, Aditya
 wrote:
>
> In the end I endeavour to end up with a handful of verbs, with which I can do 
> all tasks in a project.
>
> Regarding the BED files: they're basic bed files, with additional metadata 
> columns to allow traceback. But for the purpose of multicrispr, no need to 
> restrict to those files only. Your extraCols works great for me. And for 
> multicrispr examples, I have removed the metadata cols to keep things simple. 
> You were right btw that things went wrong earlier in the column stripping 
> process.
>
> Aditya
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 1:57 PM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Hi Michael,
>
> That's a software design dilemma I've encountered a few times.
>
> One approach is to keep the "verb" functions bare. E.g. read_bed would only 
> read a bedfile, and plot_bed would somehow plot it. Advantage: if read_bed 
> doesn't depend on anything else, other functions can depend on it, which 
> makes dependency handling easier.
>
> Another intention is to make verb functions "intuitive". In that scenario, I 
> try for each operation to also output a visual image of the operation, to 
> make it easier to see at a glance what is happening. E.g. for the range 
> operations in multicrispr, the function plot_intervals visually shows what 
> operation is being performed, making it easier to both spot errors as well as 
> maintain focus.
>
> In the case of read_bed, I thought of wrapping around your excellent 
> core-level rtracklayer::import(), additionally providing the textual and 
> visual feedback which I intent to give.
>
> Interesting to hear your suggestions on this topic, though.
>
> Aditya
>
>
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Wednesday, September 18, 2019 1:33 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> I'm not sure if a function called read_bed() should be plotting or
> printing. Is your BED file a known BED variant, i.e., maybe there is a
> better name for the file type than "bed"?
>
>
> On Wed, Sep 18, 2019 at 3:17 AM Bhagwat, Aditya
>  wrote:
> >
> > Actually,
> >
> > I will keep multicrispr::read_bed(), but wrap it around 
> > rtracklayer::import.bed, and additionally plot and print range summaries.
> >
> > Aditya
> >
> > 
> > From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> > Aditya [aditya.bhag...@mpi-bn.mpg.de]
> > Sent: Wednesday, September 18, 2019 11:31 AM
> > To: Michael Lawrence
> > Cc: bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > (Typo corrected to avoid confusion)
> >
> > Michael,
> >
> > rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
> > multicrispr::read_bed().
> >
> > In order to avoid the overkill of `require(rtracklayer)` for multicrispr 
> > users, does it make sense to import/re-export import.bed() in multicrispr? 
> > What is BioC convention/best practice in such cases?
> >
> > Aditya
> >
> >
> >
> > 
> > From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> > Aditya [aditya.bhag...@mpi-bn.mpg.de]
> > Sent: Wednesday, September 18, 2019 8:35 AM
> > To: Michael Lawrence
> > Cc: bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > Thank you Michael :-)
> >
> > Aditya
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 8:49 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > I think you probably made a mistake when dropping the columns. When I
> > provide the extraCols= argument (inventing my own names for things),
> > it almost works, but it breaks due to NAs in the extra columns. The
> > "." character is the standard way to express NA in BED files. I've
> > added support for extra na.strings to version 1.45.6.
> >
> > For reference, the call is like:
> >
> > import("SRF.bed", extraCols=c(chr2="character", start2="integer",
> > end2="integer", mDux="factor", type="factor", pos1="integer",
> > pos2="integer", strand2="factor", from="factor", n="integer",
> > code="character", anno="factor", id="character", biotype="character",
> > score2="numeric" ), na.strings="NA")
> >
> >
> > On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Hi Michael,
> > >
> > > I 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Bhagwat, Aditya
In the end I endeavour to end up with a handful of verbs, with which I can do 
all tasks in a project.

Regarding the BED files: they're basic bed files, with additional metadata 
columns to allow traceback. But for the purpose of multicrispr, no need to 
restrict to those files only. Your extraCols works great for me. And for 
multicrispr examples, I have removed the metadata cols to keep things simple. 
You were right btw that things went wrong earlier in the column stripping 
process.

Aditya



From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
Aditya [aditya.bhag...@mpi-bn.mpg.de]
Sent: Wednesday, September 18, 2019 1:57 PM
To: Michael Lawrence
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

Hi Michael,

That's a software design dilemma I've encountered a few times.

One approach is to keep the "verb" functions bare. E.g. read_bed would only 
read a bedfile, and plot_bed would somehow plot it. Advantage: if read_bed 
doesn't depend on anything else, other functions can depend on it, which makes 
dependency handling easier.

Another intention is to make verb functions "intuitive". In that scenario, I 
try for each operation to also output a visual image of the operation, to make 
it easier to see at a glance what is happening. E.g. for the range operations 
in multicrispr, the function plot_intervals visually shows what operation is 
being performed, making it easier to both spot errors as well as maintain focus.

In the case of read_bed, I thought of wrapping around your excellent core-level 
rtracklayer::import(), additionally providing the textual and visual feedback 
which I intend to give.

Interesting to hear your suggestions on this topic, though.

Aditya



From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Wednesday, September 18, 2019 1:33 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I'm not sure if a function called read_bed() should be plotting or
printing. Is your BED file a known BED variant, i.e., maybe there is a
better name for the file type than "bed"?


On Wed, Sep 18, 2019 at 3:17 AM Bhagwat, Aditya
 wrote:
>
> Actually,
>
> I will keep multicrispr::read_bed(), but wrap it around 
> rtracklayer::import.bed, and additionally plot and print range summaries.
>
> Aditya
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 11:31 AM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> (Typo corrected to avoid confusion)
>
> Michael,
>
> rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
> multicrispr::read_bed().
>
> In order to avoid the overkill of `require(rtracklayer)` for multicrispr 
> users, does it make sense to import/re-export import.bed() in multicrispr? 
> What is BioC convention/best practice in such cases?
>
> Aditya
>
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 8:35 AM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Thank you Michael :-)
>
> Aditya
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 8:49 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> I think you probably made a mistake when dropping the columns. When I
> provide the extraCols= argument (inventing my own names for things),
> it almost works, but it breaks due to NAs in the extra columns. The
> "." character is the standard way to express NA in BED files. I've
> added support for extra na.strings to version 1.45.6.
>
> For reference, the call is like:
>
> import("SRF.bed", extraCols=c(chr2="character", start2="integer",
> end2="integer", mDux="factor", type="factor", pos1="integer",
> pos2="integer", strand2="factor", from="factor", n="integer",
> code="character", anno="factor", id="character", biotype="character",
> score2="numeric" ), na.strings="NA")
>
>
> On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
>  wrote:
> >
> > Hi Michael,
> >
> > I removed the additional metadata columns in SRF.bed
> > https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
> >
> > But still can't get rtracklayer::import.bed working:
> >
> > > rtracklayer::import.bed(bedfile)
> > Error in scan(file = file, what = what, sep = sep, quote = quote, dec = 
> > dec, : scan() expected 'a real', got '1.168.595'
> > > bedfile
> > [1] 
> > 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Bhagwat, Aditya
Hi Michael, 

That's a software design dilemma I've encountered a few times.

One approach is to keep the "verb" functions bare. E.g. read_bed would only 
read a bedfile, and plot_bed would somehow plot it. Advantage: if read_bed 
doesn't depend on anything else, other functions can depend on it, which makes 
dependency handling easier.

Another intention is to make verb functions "intuitive". In that scenario, I 
try for each operation to also output a visual image of the operation, to make 
it easier to see at a glance what is happening. E.g. for the range operations 
in multicrispr, the function plot_intervals visually shows what operation is 
being performed, making it easier to both spot errors as well as maintain focus.

In the case of read_bed, I thought of wrapping around your excellent core-level 
rtracklayer::import(), additionally providing the textual and visual feedback 
which I intend to give.

Interesting to hear your suggestions on this topic, though.

Aditya



From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Wednesday, September 18, 2019 1:33 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I'm not sure if a function called read_bed() should be plotting or
printing. Is your BED file a known BED variant, i.e., maybe there is a
better name for the file type than "bed"?


On Wed, Sep 18, 2019 at 3:17 AM Bhagwat, Aditya
 wrote:
>
> Actually,
>
> I will keep multicrispr::read_bed(), but wrap it around 
> rtracklayer::import.bed, and additionally plot and print range summaries.
>
> Aditya
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 11:31 AM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> (Typo corrected to avoid confusion)
>
> Michael,
>
> rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
> multicrispr::read_bed().
>
> In order to avoid the overkill of `require(rtracklayer)` for multicrispr 
> users, does it make sense to import/re-export import.bed() in multicrispr? 
> What is BioC convention/best practice in such cases?
>
> Aditya
>
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 8:35 AM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Thank you Michael :-)
>
> Aditya
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 8:49 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> I think you probably made a mistake when dropping the columns. When I
> provide the extraCols= argument (inventing my own names for things),
> it almost works, but it breaks due to NAs in the extra columns. The
> "." character is the standard way to express NA in BED files. I've
> added support for extra na.strings to version 1.45.6.
>
> For reference, the call is like:
>
> import("SRF.bed", extraCols=c(chr2="character", start2="integer",
> end2="integer", mDux="factor", type="factor", pos1="integer",
> pos2="integer", strand2="factor", from="factor", n="integer",
> code="character", anno="factor", id="character", biotype="character",
> score2="numeric" ), na.strings="NA")
>
>
> On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
>  wrote:
> >
> > Hi Michael,
> >
> > I removed the additional metadata columns in SRF.bed
> > https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
> >
> > But still can't get rtracklayer::import.bed working:
> >
> > > rtracklayer::import.bed(bedfile)
> > Error in scan(file = file, what = what, sep = sep, quote = quote, dec = 
> > dec, : scan() expected 'a real', got '1.168.595'
> > > bedfile
> > [1] 
> > "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
> >
> > Never mind, multicrispr function read_bed, based on data.table::fread is 
> > doing the job, so I will stick to that .
> >
> > Thank you for all feedback,
> >
> > Cheers,
> >
> > Aditya
> >
> >
> > 
> > From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> > Aditya [aditya.bhag...@mpi-bn.mpg.de]
> > Sent: Tuesday, September 17, 2019 2:48 PM
> > To: Michael Lawrence
> > Cc: bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > Oh :-) - Thank you for explaining!
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:40 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Michael Lawrence via Bioc-devel
I'm not sure if a function called read_bed() should be plotting or
printing. Is your BED file a known BED variant, i.e., maybe there is a
better name for the file type than "bed"?


On Wed, Sep 18, 2019 at 3:17 AM Bhagwat, Aditya
 wrote:
>
> Actually,
>
> I will keep multicrispr::read_bed(), but wrap it around 
> rtracklayer::import.bed, and additionally plot and print range summaries.
>
> Aditya
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 11:31 AM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> (Typo corrected to avoid confusion)
>
> Michael,
>
> rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
> multicrispr::read_bed().
>
> In order to avoid the overkill of `require(rtracklayer)` for multicrispr 
> users, does it make sense to import/re-export import.bed() in multicrispr? 
> What is BioC convention/best practice in such cases?
>
> Aditya
>
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Wednesday, September 18, 2019 8:35 AM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Thank you Michael :-)
>
> Aditya
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 8:49 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> I think you probably made a mistake when dropping the columns. When I
> provide the extraCols= argument (inventing my own names for things),
> it almost works, but it breaks due to NAs in the extra columns. The
> "." character is the standard way to express NA in BED files. I've
> added support for extra na.strings to version 1.45.6.
>
> For reference, the call is like:
>
> import("SRF.bed", extraCols=c(chr2="character", start2="integer",
> end2="integer", mDux="factor", type="factor", pos1="integer",
> pos2="integer", strand2="factor", from="factor", n="integer",
> code="character", anno="factor", id="character", biotype="character",
> score2="numeric" ), na.strings="NA")
>
>
> On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
>  wrote:
> >
> > Hi Michael,
> >
> > I removed the additional metadata columns in SRF.bed
> > https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
> >
> > But still can't get rtracklayer::import.bed working:
> >
> > > rtracklayer::import.bed(bedfile)
> > Error in scan(file = file, what = what, sep = sep, quote = quote, dec = 
> > dec, : scan() expected 'a real', got '1.168.595'
> > > bedfile
> > [1] 
> > "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
> >
> > Never mind, multicrispr function read_bed, based on data.table::fread is 
> > doing the job, so I will stick to that .
> >
> > Thank you for all feedback,
> >
> > Cheers,
> >
> > Aditya
> >
> >
> > 
> > From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> > Aditya [aditya.bhag...@mpi-bn.mpg.de]
> > Sent: Tuesday, September 17, 2019 2:48 PM
> > To: Michael Lawrence
> > Cc: bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > Oh :-) - Thank you for explaining!
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:40 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > Having a "." in the function name does not make something "S3".
> > There's no dispatch from import() to import.bed(). Had I not been a
> > total newb when I created rtracklayer, I would have called the
> > function importBed() or something like that. Sorry for the confusion.
> >
> > On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Oh, superb, thx!
> > >
> > > > Interesting ... here you use S3 rather than S4 - I wonder about the design 
> > > intention underlying these choices (I'm asking because I am trying to 
> > > figure out myself when to use S3 and when to use S4 and whether to mix 
> > > the two).
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 2:23 PM
> > > To: Bhagwat, Aditya
> > > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > > Subject: Re: [Bioc-devel] read_bed()
> > >
> > > The generic documentation does not mention it, but see ?import.bed.
> > > It's similar to colClasses on read.table().
> > >
> > > On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
> > >  wrote:
> > > >
> > > > 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Bhagwat, Aditya
Actually,

I will keep multicrispr::read_bed(), but wrap it around 
rtracklayer::import.bed, and additionally plot and print range summaries.

Aditya


From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
Aditya [aditya.bhag...@mpi-bn.mpg.de]
Sent: Wednesday, September 18, 2019 11:31 AM
To: Michael Lawrence
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

(Typo corrected to avoid confusion)

Michael,

rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
multicrispr::read_bed().

In order to avoid the overkill of `require(rtracklayer)` for multicrispr 
users, does it make sense to import/re-export import.bed() in multicrispr? 
What is BioC convention/best practice in such cases?

Aditya




From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
Aditya [aditya.bhag...@mpi-bn.mpg.de]
Sent: Wednesday, September 18, 2019 8:35 AM
To: Michael Lawrence
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

Thank you Michael :-)

Aditya

From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Tuesday, September 17, 2019 8:49 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I think you probably made a mistake when dropping the columns. When I
provide the extraCols= argument (inventing my own names for things),
it almost works, but it breaks due to NAs in the extra columns. The
"." character is the standard way to express NA in BED files. I've
added support for extra na.strings to version 1.45.6.

For reference, the call is like:

import("SRF.bed", extraCols=c(chr2="character", start2="integer",
end2="integer", mDux="factor", type="factor", pos1="integer",
pos2="integer", strand2="factor", from="factor", n="integer",
code="character", anno="factor", id="character", biotype="character",
score2="numeric" ), na.strings="NA")
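
The role of extraCols= and na.strings= can be illustrated with the base R analogue mentioned elsewhere in this thread, colClasses on read.table(). This is a minimal sketch on an invented two-line file with two extra columns; it does not call rtracklayer, and the column names are assumptions chosen for the example:

```r
# Invented BED-like data: three standard columns plus two extra ones,
# with "." marking missing values.
tmp <- tempfile(fileext = ".bed")
writeLines(c("chr1\t100\t200\tgeneA\t.",
             "chr2\t300\t450\t.\t7"), tmp)

# Declare the name and type of every column explicitly, and tell the
# reader that "." means NA.
bed <- read.table(tmp, sep = "\t", header = FALSE,
                  col.names  = c("chrom", "start", "end", "id", "n"),
                  colClasses = c("character", "integer", "integer",
                                 "character", "integer"),
                  na.strings = ".")
str(bed)
```

Declaring the types up front means a malformed value fails at read time rather than silently changing a column's type.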


On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
 wrote:
>
> Hi Michael,
>
> I removed the additional metadata columns in SRF.bed
> https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
>
> But still can't get rtracklayer::import.bed working:
>
> > rtracklayer::import.bed(bedfile)
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, 
> : scan() expected 'a real', got '1.168.595'
> > bedfile
> [1] "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
>
> Never mind, multicrispr function read_bed, based on data.table::fread is 
> doing the job, so I will stick to that .
>
> Thank you for all feedback,
>
> Cheers,
>
> Aditya
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Tuesday, September 17, 2019 2:48 PM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Oh :-) - Thank you for explaining!
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 2:40 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Having a "." in the function name does not make something "S3".
> There's no dispatch from import() to import.bed(). Had I not been a
> total newb when I created rtracklayer, I would have called the
> function importBed() or something like that. Sorry for the confusion.
>
> On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
>  wrote:
> >
> > Oh, superb, thx!
> >
> > > Interesting ... here you use S3 rather than S4 - I wonder about the design 
> > intention underlying these choices (I'm asking because I am trying to 
> > figure out myself when to use S3 and when to use S4 and whether to mix the 
> > two).
> >
> > Aditya
> >
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:23 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > The generic documentation does not mention it, but see ?import.bed.
> > It's similar to colClasses on read.table().
> >
> > On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Thank you Michael,
> > >
> > > How do I use the extraCols argument? The documentation does not mention 
> > > an `extraCols` argument explicitly, so it must be one of the ellipsis 
> > > arguments, but `?rtracklayer::import` does not mention it. Should I say 
> > > extraCols = 10 (ten extra columns) or so?
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 2:05 PM
> > > To: Bhagwat, 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Bhagwat, Aditya
(Typo corrected to avoid confusion)

Michael,

rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
multicrispr::read_bed().

In order to avoid the overkill of `require(rtracklayer)` for multicrispr 
users, does it make sense to import/re-export import.bed() in multicrispr? 
What is BioC convention/best practice in such cases?

Aditya
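
For illustration only: mechanically, importing and re-exporting a function comes down to two NAMESPACE directives. This sketch assumes rtracklayer is listed under Imports in the package's DESCRIPTION file. It does not settle whether re-exporting is preferable to having users load rtracklayer themselves, which is the question asked above.

```r
# NAMESPACE of the wrapping package (fragment)
importFrom(rtracklayer, import.bed)
export(import.bed)
```

With these directives, import.bed() becomes available to users once the wrapping package is attached, without a separate library(rtracklayer) call.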




From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
Aditya [aditya.bhag...@mpi-bn.mpg.de]
Sent: Wednesday, September 18, 2019 8:35 AM
To: Michael Lawrence
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

Thank you Michael :-)

Aditya

From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Tuesday, September 17, 2019 8:49 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I think you probably made a mistake when dropping the columns. When I
provide the extraCols= argument (inventing my own names for things),
it almost works, but it breaks due to NAs in the extra columns. The
"." character is the standard way to express NA in BED files. I've
added support for extra na.strings to version 1.45.6.

For reference, the call is like:

import("SRF.bed", extraCols=c(chr2="character", start2="integer",
end2="integer", mDux="factor", type="factor", pos1="integer",
pos2="integer", strand2="factor", from="factor", n="integer",
code="character", anno="factor", id="character", biotype="character",
score2="numeric" ), na.strings="NA")


On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
 wrote:
>
> Hi Michael,
>
> I removed the additional metadata columns in SRF.bed
> https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
>
> But still can't get rtracklayer::import.bed working:
>
> > rtracklayer::import.bed(bedfile)
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, 
> : scan() expected 'a real', got '1.168.595'
> > bedfile
> [1] "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
>
> Never mind, multicrispr function read_bed, based on data.table::fread is 
> doing the job, so I will stick to that .
>
> Thank you for all feedback,
>
> Cheers,
>
> Aditya
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Tuesday, September 17, 2019 2:48 PM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Oh :-) - Thank you for explaining!
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 2:40 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Having a "." in the function name does not make something "S3".
> There's no dispatch from import() to import.bed(). Had I not been a
> total newb when I created rtracklayer, I would have called the
> function importBed() or something like that. Sorry for the confusion.
>
> On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
>  wrote:
> >
> > Oh, superb, thx!
> >
> > > Interesting ... here you use S3 rather than S4 - I wonder about the design 
> > intention underlying these choices (I'm asking because I am trying to 
> > figure out myself when to use S3 and when to use S4 and whether to mix the 
> > two).
> >
> > Aditya
> >
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:23 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > The generic documentation does not mention it, but see ?import.bed.
> > It's similar to colClasses on read.table().
> >
> > On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Thank you Michael,
> > >
> > > How do I use the extraCols argument? The documentation does not mention 
> > > an `extraCols` argument explicitly, so it must be one of the ellipsis 
> > > arguments, but `?rtracklayer::import` does not mention it. Should I say 
> > > extraCols = 10 (ten extra columns) or so?
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 2:05 PM
> > > To: Bhagwat, Aditya
> > > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > > Subject: Re: [Bioc-devel] read_bed()
> > >
> > > It breaks it because it's not standard BED; however, using the
> > > extraCols= argument should work in this case. Requiring an explicit
> > > format specification is intentional, because it provides validation
> > > and type safety, and it communicates the format to a future reader.
> > > This also looks a 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Bhagwat, Aditya
Michael,

rtracklayer::import.bed() indeed works perfectly for me, so I am dropping 
multicrispr::read_bed.

In order to avoid the overkill of `require(tracklayer` for multicrispr 
 users, does it make 
sense to import/re-export read.bed in multicrispr? What is BioC convention/best 
practice in such cases?

Aditya




From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
Aditya [aditya.bhag...@mpi-bn.mpg.de]
Sent: Wednesday, September 18, 2019 8:35 AM
To: Michael Lawrence
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

Thank you Michael :-)

Aditya

From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Tuesday, September 17, 2019 8:49 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I think you probably made a mistake when dropping the columns. When I
provide the extraCols= argument (inventing my own names for things),
it almost works, but it breaks due to NAs in the extra columns. The
"." character is the standard way to express NA in BED files. I've
added support for extra na.strings to version 1.45.6.

For reference, the call is like:

import("SRF.bed", extraCols=c(chr2="character", start2="integer",
end2="integer", mDux="factor", type="factor", pos1="integer",
pos2="integer", strand2="factor", from="factor", n="integer",
code="character", anno="factor", id="character", biotype="character",
score2="numeric" ), na.strings="NA")


On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
 wrote:
>
> Hi Michael,
>
> I removed the additional metadata columns in SRF.bed
> https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
>
> But still can't get rtracklayer::import.bed working:
>
> > rtracklayer::import.bed(bedfile)
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, 
> : scan() expected 'a real', got '1.168.595'
> > bedfile
> [1] "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
>
> Never mind, multicrispr function read_bed, based on data.table::fread is 
> doing the job, so I will stick to that .
>
> Thank you for all feedback,
>
> Cheers,
>
> Aditya
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Tuesday, September 17, 2019 2:48 PM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Oh :-) - Thank you for explaining!
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 2:40 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Having a "." in the function name does not make something "S3".
> There's no dispatch from import() to import.bed(). Had I not been a
> total newb when I created rtracklayer, I would have called the
> function importBed() or something like that. Sorry for the confusion.
>
> On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
>  wrote:
> >
> > Oh, superb, thx!
> >
> > Interesting ... here you use S3 rather than S4 - I wonder about the design
> > intention underlying these choices (I'm asking because I am trying to
> > figure out myself when to use S3 and when to use S4 and whether to mix the
> > two).
> >
> > Aditya
> >
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:23 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > The generic documentation does not mention it, but see ?import.bed.
> > It's similar to colClasses on read.table().
> >
> > On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Thank you Michael,
> > >
> > > How do I use the extraCols argument? The documentation does not mention 
> > > an `extraCols` argument explicitly, so it must be one of the ellipsis 
> > > arguments, but `?rtracklayer::import` does not mention it. Should I say 
> > > extraCols = 10 (ten extra columns) or so?
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 2:05 PM
> > > To: Bhagwat, Aditya
> > > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > > Subject: Re: [Bioc-devel] read_bed()
> > >
> > > It breaks it because it's not standard BED; however, using the
> > > extraCols= argument should work in this case. Requiring an explicit
> > > format specification is intentional, because it provides validation
> > > and type safety, and it communicates the format to a future reader.
> > > This also looks a bit like a bedPE file, so you might consider 

Re: [Bioc-devel] read_bed()

2019-09-18 Thread Bhagwat, Aditya
Thank you Michael :-)

Aditya

From: Michael Lawrence [lawrence.mich...@gene.com]
Sent: Tuesday, September 17, 2019 8:49 PM
To: Bhagwat, Aditya
Cc: Michael Lawrence; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] read_bed()

I think you probably made a mistake when dropping the columns. When I
provide the extraCols= argument (inventing my own names for things),
it almost works, but it breaks due to NAs in the extra columns. The
"." character is the standard way to express NA in BED files. I've
added support for extra na.strings to version 1.45.6.

For reference, the call is like:

import("SRF.bed", extraCols=c(chr2="character", start2="integer",
end2="integer", mDux="factor", type="factor", pos1="integer",
pos2="integer", strand2="factor", from="factor", n="integer",
code="character", anno="factor", id="character", biotype="character",
score2="numeric" ), na.strings="NA")
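
Since extraCols is described elsewhere in this thread as similar to colClasses on read.table(), the same idea can be sketched in base R alone. This is only an analogy, not rtracklayer's implementation: the two-row file, the column names (gene, n) and the temporary path are invented for illustration. It shows how declaring a type per column, plus na.strings, turns a literal "NA" into a missing value instead of a parse error.

```r
# Hypothetical BED-like file: six standard columns plus two extra ones.
bed_lines <- c(
  "chr1\t100\t200\tsite1\t0\t+\tgeneA\t12",
  "chr1\t300\t400\tsite2\t0\t-\tNA\t7"
)
bed_file <- tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

# One declared type per column, in file order (names are assumptions).
col_classes <- c(chrom = "character", start = "integer", end = "integer",
                 name = "character", score = "numeric", strand = "character",
                 gene = "character", n = "integer")

# na.strings tells the parser which literal strings stand for missing values.
bed <- read.table(bed_file, sep = "\t", header = FALSE,
                  col.names = names(col_classes),
                  colClasses = unname(col_classes),
                  na.strings = "NA")

str(bed)
```

The explicit type vector doubles as documentation of the file layout, which matches the rationale given in the thread for requiring a format specification.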


On Tue, Sep 17, 2019 at 7:23 AM Bhagwat, Aditya
 wrote:
>
> Hi Michael,
>
> I removed the additional metadata columns in SRF.bed
> https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
>
> But still can't get rtracklayer::import.bed working:
>
> > rtracklayer::import.bed(bedfile)
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, 
> : scan() expected 'a real', got '1.168.595'
> > bedfile
> [1] "C:/Users/abhagwa/Documents/R/R-3.6.1/library/multicrispr/extdata/SRF.bed"
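
The error above can be reproduced in base R without the BED file. This is a minimal sketch of the mechanism only: scan() is asked for a number and meets a token with two dots. Why such a token sits in that column of SRF.bed is not established here; the character fallback shown is just one way to inspect the raw value.

```r
# Asking scan() for a numeric value fails on a token such as "1.168.595".
msg <- tryCatch(
  scan(text = "1.168.595", what = numeric(), quiet = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)

# Reading the same token as character succeeds, so the raw value can be inspected.
raw <- scan(text = "1.168.595", what = character(), quiet = TRUE)
print(raw)
```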
>
> Never mind, the multicrispr function read_bed, based on data.table::fread, is
> doing the job, so I will stick to that.
>
> Thank you for all feedback,
>
> Cheers,
>
> Aditya
>
>
> 
> From: Bioc-devel [bioc-devel-boun...@r-project.org] on behalf of Bhagwat, 
> Aditya [aditya.bhag...@mpi-bn.mpg.de]
> Sent: Tuesday, September 17, 2019 2:48 PM
> To: Michael Lawrence
> Cc: bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Oh :-) - Thank you for explaining!
> 
> From: Michael Lawrence [lawrence.mich...@gene.com]
> Sent: Tuesday, September 17, 2019 2:40 PM
> To: Bhagwat, Aditya
> Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> Subject: Re: [Bioc-devel] read_bed()
>
> Having a "." in the function name does not make something "S3".
> There's no dispatch from import() to import.bed(). Had I not been a
> total newb when I created rtracklayer, I would have called the
> function importBed() or something like that. Sorry for the confusion.
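
The point about dots and S3 can be checked in plain R. The function names below are hypothetical and unrelated to rtracklayer's actual code; the sketch only shows that a dotted name is inert unless a generic calls UseMethod().

```r
# A plain function and a dotted sibling: no UseMethod(), so no S3 dispatch.
read_any <- function(x) "read_any called"
read_any.bed <- function(x) "read_any.bed called"

x <- structure("SRF.bed", class = "bed")
plain <- read_any(x)      # the dotted function is never reached

# Dispatch on class "bed" happens only when the generic calls UseMethod().
read_s3 <- function(x) UseMethod("read_s3")
read_s3.bed <- function(x) "read_s3.bed called"
dispatched <- read_s3(x)

print(c(plain, dispatched))
```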
>
> On Tue, Sep 17, 2019 at 5:34 AM Bhagwat, Aditya
>  wrote:
> >
> > Oh, superb, thx!
> >
> > Interesting ... here you use S3 rather than S4 - I wonder about the design
> > intention underlying these choices (I'm asking because I am trying to
> > figure out myself when to use S3 and when to use S4 and whether to mix the
> > two).
> >
> > Aditya
> >
> > 
> > From: Michael Lawrence [lawrence.mich...@gene.com]
> > Sent: Tuesday, September 17, 2019 2:23 PM
> > To: Bhagwat, Aditya
> > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > Subject: Re: [Bioc-devel] read_bed()
> >
> > The generic documentation does not mention it, but see ?import.bed.
> > It's similar to colClasses on read.table().
> >
> > On Tue, Sep 17, 2019 at 5:15 AM Bhagwat, Aditya
> >  wrote:
> > >
> > > Thank you Michael,
> > >
> > > How do I use the extraCols argument? The documentation does not mention 
> > > an `extraCols` argument explicitly, so it must be one of the ellipsis 
> > > arguments, but `?rtracklayer::import` does not mention it. Should I say 
> > > extraCols = 10 (ten extra columns) or so?
> > >
> > > Aditya
> > >
> > > 
> > > From: Michael Lawrence [lawrence.mich...@gene.com]
> > > Sent: Tuesday, September 17, 2019 2:05 PM
> > > To: Bhagwat, Aditya
> > > Cc: Michael Lawrence; Shepherd, Lori; bioc-devel@r-project.org
> > > Subject: Re: [Bioc-devel] read_bed()
> > >
> > > It breaks it because it's not standard BED; however, using the
> > > extraCols= argument should work in this case. Requiring an explicit
> > > format specification is intentional, because it provides validation
> > > and type safety, and it communicates the format to a future reader.
> > > This also looks a bit like a bedPE file, so you might consider using
> > > the Pairs data structure.
> > >
> > > Michael
> > >
> > > On Tue, Sep 17, 2019 at 4:51 AM Bhagwat, Aditya
> > >  wrote:
> > > >
> > > > Hi Michael,
> > > >
> > > > Yeah, I also noticed that the attachment was eaten when it entered the
> > > > bioc-devel list.
> > > >
> > > > The file is also accessible in the extdata of the multicrispr:
> > > > https://gitlab.gwdg.de/loosolab/software/multicrispr/blob/master/inst/extdata/SRF.bed
> > > >
> > > > A bedfile to GRanges importer requires columns 1 (chrom), 2 
> > > > (chromStart), 3 (chromEnd), and column 6 (strand). All of these are 
> > > > present in SRF.bed.
> > > >
> > > > I am curious as