Hi, I think I'm with Ivan and leaning towards not allowing duplicate names in a GRangesList, even though normal lists in R do allow duplicate names.
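For what it's worth, here is a minimal base-R sketch of the dictionary concern. The `samples` list is made up for illustration, and this only shows how ordinary lists behave, not GRangesList:

```r
# Base R only; `samples` is a hypothetical example list.
samples <- list(Cancer = 1:3, Cancer = 4:6, Normal = 7:9)

# Ordinary lists accept duplicate names without complaint.
dup_names <- unique(names(samples)[duplicated(names(samples))])

# Name-based lookup returns only the first match, so the second
# "Cancer" element cannot be reached by name.
first_hit <- samples[["Cancer"]]

# Reordering by index still works on an ordinary list.
reordered <- samples[3:1]
```

So an ordinary list tolerates the duplicates, but anything that treats the names as keys only ever sees the first "Cancer" element.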
As Ivan suggested, I also often use the names of an R list when I want to treat the list as something like a Python dictionary. Still, if the consensus turns out to be to allow duplicate names in *RangesList(s), perhaps it'd be nice for the validity method to fire off a warning that duplicate names exist in the list, so the user knows something might be fishy.

-steve

On Fri, Feb 25, 2011 at 9:48 AM, Ivan Gregoretti <[email protected]> wrote:
> Hello Hervé,
>
> While we wait for comments from "power users", I just wanted to say
> that non-unique names open the door for potentially more problems than
> solutions.
>
> Imagine a Python dictionary or a Perl hash with multiple values per key.
>
> I wonder how many R/Bioconductor functions exploit the vector's
> capability to hold multiple elements with the same name.
>
> Regardless, thanks for asking users' opinions.
>
> Ivan
>
>
> Ivan Gregoretti, PhD
> National Institute of Diabetes and Digestive and Kidney Diseases
> National Institutes of Health
> 5 Memorial Dr, Building 5, Room 205.
> Bethesda, MD 20892. USA.
> Phone: 1-301-496-1016 and 1-301-496-1592
> Fax: 1-301-496-9878
>
>
>
> On Fri, Feb 25, 2011 at 3:08 AM, Pages, Herve <[email protected]> wrote:
>> Hi Dario,
>>
>> A GRangesList object with duplicated names is apparently
>> considered broken:
>>
>> > grl <- GRangesList(GRanges(), GRanges())
>> > names(grl) <- c("a", "a")
>> > validObject(grl)
>> Error in `rownames<-`(`*tmp*`, value = c("a", "a")) :
>>   duplicate rownames not allowed
>>
>> If we are ok with this feature, we should fix the "names<-"
>> method (and any other code around that lets the user generate
>> broken objects).
>>
>> But if we are not ok with this feature, we should modify
>> the validity method for GRangesList objects. I tend to prefer
>> this solution for 3 reasons:
>>
>> 1. Consistency with ordinary vectors: the names of a vector
>>    in R are not required to be unique.
>>
>> 2. It's not uncommon to see the same name used for 2 different
>>    genes. One might still want to be able to stick those names
>>    on a GRangesList object where each top-level element corresponds
>>    to a gene (e.g. exons grouped by gene).
>>
>> 3. It's easier to modify the validity method than to go around
>>    trying to find and fix every piece of code in GenomicRanges
>>    (and maybe other places) that can potentially produce a
>>    GRangesList object with duplicated names.
>>
>> How do our power users feel about this?
>>
>> Thanks,
>> H.
>>
>>
>> ----- Original Message -----
>> From: "Dario Strbenac" <[email protected]>
>> To: [email protected]
>> Sent: Thursday, February 24, 2011 10:00:11 PM
>> Subject: [Bioc-sig-seq] GRangesList with duplicate names
>>
>> Hello,
>>
>> It is possible to create a GRangesList with duplicated names, but not to
>> re-order it.
>>
>> > summary(grl)
>>      Length       Class        Mode
>>           3 GRangesList          S4
>> > names(grl) <- c("Cancer", "Cancer", "Normal")
>> > grl[3:1]
>> Error in `rownames<-`(`*tmp*`, value = c("Normal", "Cancer", "Cancer")) :
>>   duplicate rownames not allowed
>> > sessionInfo()
>> R version 2.12.0 (2010-10-15)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
>>  [5] LC_MONETARY=C              LC_MESSAGES=en_AU.UTF-8
>>  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] GenomicRanges_1.2.3 IRanges_1.8.9
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
