On Mon, Dec 16, 2013 at 5:00 AM, Julian Gehring <julian.gehr...@embl.de> wrote:

> Hi Michael,
>
> I would second your request. In a package I'll be submitting soon, I have a
> work-around for this by defining a set of functions like 'hsAutosomes',
> 'hsAllosomes' etc. that return the respective set of human chromosome
> names. Perhaps one could incorporate this in the 'seqinfo' class, by
> additional columns similar to 'isCircular'. One would still need an
> additional data source for this, since the information about which chr is
> primary, autosome etc. is not contained in a standard reference file.
>

Yes, I think it should be stored with the Seqinfo. It could be imputed
(along with isCircular, I think) via the SeqnameStyle system, which stores
different naming conventions for different species. At the very least, the
SeqnameStyle could inform a utility like keepAutosomes(), whether we modify
Seqinfo or not.

>> We've found that analysts often need to restrict seqlevels to certain
>> pre-defined sets of chromosomes. Given the variability across organisms, it
>> would be nice to have an abstraction.
>>
>> We often see this in code:
>>
>> keepSeqlevels(seqinfo, as.character(1:22))
>> keepSeqlevels(seqinfo, c(1:22, "X", "Y"))
>>
>> Perhaps instead we could use the more abstract and arguably more readable:
>>
>> keepAutosomes(seqinfo)
>> keepPrimaryChromosomes(seqinfo)
>>
>> Not sure of the best term for the latter. It refers to the set of
>> chromosomes that are not assembly fragments but are generally in the
>> nucleus (when there is one).
>>
>
> Does the current 'sortSeqlevels' function address this? E.g.
>
> #+BEGIN_SRC R
>
> library(GenomicRanges)
> seqinfo <- Seqinfo(paste0("chr", c(10, 1, 3)), c(10000, 1000, 3000), NA,
>                    "mock1")
> seqinfo ## 'chr10', 'chr1', 'chr3'
> sortSeqlevels(seqinfo) ## now sorted 'chr1', 'chr3', 'chr10'
>
> #+END_SRC
>

Thanks, I was not aware of this one. That should do the trick.

>> It would also be nice to have a sort,Seqinfo method that sorts by the
>> natural ordering of the chromosomes, if there is one. Maybe the function
>> needs its own name, but either way, this is something that really needs to
>> be in the infrastructure.
>>
>> I think the existing SeqnameStyle infrastructure should be able to support
>> this.
>>
>
> Best wishes
> Julian
>

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel