The dropSeqlevels approach for the autosomes is acceptable in terms of readability, but it means the code needs to know the seqname style (chrX rather than just X).
On Fri, Dec 27, 2013 at 1:49 PM, Valerie Obenchain <voben...@fhcrc.org>wrote: > I think the idea was to avoid the proliferation of small, specialized > functions. > > library(TxDb.Hsapiens.UCSC.hg19.knownGene) > gr <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene) > > ## autosomes > std <- keepStandardChromosomes(gr, "Homo sapiens", "UCSC") > dropSeqlevels(std, c("chrX", "chrY")) > > ## allosomes > keepSeqlevels(std, c("chrX", "chrY")) > > While the above code seems straight forward maybe it's not enough of an > improvement over what you had to do before. > > If we need stand alone functions for 'autosomes', 'allosomes', etc. maybe > this isn't the right approach. Eariler in this thread there was mention of > storing this information in the Seqinfo. Maybe we need to step back to take > another look at that. If this information were in the Seqinfo these become > more like getters than one-off functions with their own > overhead/maintenance etc. > > Half the team is out so a final decision will have to wait a week. Thanks > for the feedback. > > Valerie > > > > On 12/27/2013 12:53 PM, Tim Triche, Jr. wrote: > >> Hi Sonali, >> >> Thank you for this much requested addition to the GenomicRanges API! >> >> I would like to second Michael Lawrence's assertion that keepAutosomes() >> is more straightforward and harder to misunderstand or misuse than a >> special case of keep/dropSeqlevels. (If the author of a function has >> trouble using it, think what that means for the rest of us!) An ideal API >> is (IMHO) one that is easy to use well, and hard to use poorly. I propose >> that this exchange has illustrated why a very experienced programmer (ML) >> would nonetheless suggest API "bloat". >> >> The recent history of the Ranges API has (again IMHO) been a series of >> exchanges balancing convenience against elegance, often with significant >> disagreements on where the balance should lie. But if a "special case" >> turns out to be the common case, historically, a variance has been granted. >> Examples include sortSeqlevels(), as(data.frame, "GRanges"), and x$foo as >> shorthand for elementMetadata(x)[,'foo']. >> >> People seldom do what they know to be right. They do what is convenient, >> then repent. (Ask me how I know ;-)) In my experience, at least, the best >> way to get people to "do the right thing" is to make it convenient. >> >> Thank you for being bold enough to stand up for the design choices you >> believe in. I hope my request to trade a little elegance for readability >> does not come across as criticism thereof. >> >> Best, >> >> --t >> >> On Dec 27, 2013, at 12:11 PM, Michael Lawrence < >>> lawrence.mich...@gene.com> wrote: >>> >>> I guess you mean dropSeqlevels, not keepSeqlevels? That should work. But >>> your function needs some tweaking, I think. The function should just take >>> the seqnameStyle from the object. There's no need to require the user to >>> specify it, and then check for consistency. I would like to be able to >>> just >>> call: >>> >>> keepStandardChromosomes(trans) >>> >>> Also, I think you made GenomicRanges depend on AnnotationDbi. That's a >>> major change. Before, it was just in Suggests. Did you discuss this with >>> anyone? >>> >>> Michael >>> >>> >>> >>> >>> On Fri, Dec 27, 2013 at 11:37 AM, Arora, Sonali <sar...@fhcrc.org> >>>> wrote: >>>> >>>> Hi Michael, >>>> >>>> We decided to come up with one function called keepStandardChromosomes() >>>> which >>>> is a wrapper for both the functions suggested by you (i.e., >>>> keepPrimaryChromosomes() and keepAutosomes() ) >>>> >>>> Here is an example: >>>> >>>> library(AnnotationDbi) >>>> library(GenomicRanges) >>>> txdb <- loadDb(system.file("extdata", "UCSC_knownGene_sample.sqlite", >>>> package="GenomicFeatures")) >>>> >>>> trans <- transcripts(txdb) >>>> seqlevels(trans) >>>> >>>> #gets all GRanges for Human chromosomes : 1-22,X,Y,M >>>> got <- keepStandardChromosomes(trans,style="UCSC", species="Homo >>>> sapiens") >>>> seqlevels(got) >>>> >>>> #for getting only autosomes, one could then extract using >>>> keepSeqlevels() >>>> #as shown earlier: >>>> keepSeqlevels(got,c("chrX","chrY")) >>>> keepSeqlevels(got,c("chrX","chrY", "chrM")) >>>> >>>> Thanks and Regards >>>> Sonali. >>>> >>>> On 12/25/2013 4:15 AM, Michael Lawrence wrote: >>>> >>>> Awesome -- how about keepAutosomes? >>>> >>>> >>>> On Tue, Dec 24, 2013 at 1:11 PM, Arora, Sonali <sar...@fhcrc.org> >>>>> wrote: >>>>> >>>>> Hi everyone, >>>>> >>>>> We are pleased to announce that we have added a new function called >>>>> keepStandardChromosomes() to GenomicRanges(1.15.18) - devel branch. >>>>> >>>>> This function allows a user to subset a given object (containing a >>>>> seqinfo class) to >>>>> retain only the primary chromosomes and the autosomes. >>>>> >>>>> Please feel free to get back if you face any issues. >>>>> >>>>> Thanks, >>>>> Sonali Arora. >>>>> >>>>> On 12/16/2013 11:01 AM, Michael Lawrence wrote: >>>>> >>>>> Awesome, thanks, Sonali. And welcome to the team. >>>>> >>>>> Michael >>>>> >>>>> >>>>> On Mon, Dec 16, 2013 at 10:38 AM, Arora, Sonali <sar...@fhcrc.org> >>>>>> wrote: >>>>>> >>>>>> Hi Michael, >>>>>> >>>>>> That is an extremely interesting question. We have a couple of ideas >>>>>> and >>>>>> are beginning to work on it. >>>>>> We hope to come up with something soon. >>>>>> >>>>>> Thanks, >>>>>> Sonali. >>>>>> >>>>>> >>>>>> On 12/16/2013 6:14 AM, Michael Lawrence wrote: >>>>>>> >>>>>>> should be stored with the Seqinfo. It could be imputed >>>>>>> (along with the isCircular I think) via th >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Bioc-devel@r-project.org mailing list >>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel >>>>>> >>>>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioc-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/bioc-devel >>> >> >> _______________________________________________ >> Bioc-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/bioc-devel >> >> > > -- > Valerie Obenchain > > Program in Computational Biology > Division of Public Health Sciences > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N, M1-B155 > P.O. Box 19024 > Seattle, WA 98109-1024 > > E-mail: voben...@fhcrc.org > Phone: (206) 667-3158 > Fax: (206) 667-1319 > [[alternative HTML version deleted]] _______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel