On Tue, Oct 29, 2013 at 5:55 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
> Hi Michael,
>
> In Bioc < 2.13, subsetting was a mess. In particular, handling of
> list-like subscripts was rather unpredictable. It would work only
> if you were lucky enough to try it with one of the few supported
> types (like IntegerList, LogicalList, or IRangesList), but it didn't
> work for other very natural types like list or CharacterList.
> Or it would work for [ but not for [<-, or vice-versa:
>
>   x <- splitAsList(letters[1:6], c(2, 4, 3, 2, 2, 4))
>   x[list(1)]  # doesn't work in BioC < 2.13!
>   x[list(1)] <- "XX"  # works in BioC < 2.13!
>
> Or, if both [ and [<- worked, they could behave inconsistently: one
> would require the list-like subscript to have the same length as 'x'
> but the other wouldn't. Or one would use the names on the subscript
> and on 'x' to map the list elements between the two, but the other
> wouldn't.
>
> Hopefully in BioC 2.13, subsetting behaves more consistently (at least
> that was the intention). For example now the names on the subscript and
> on 'x' are always used to map the list elements between the two:
>
>   > x[list(`4`=2:1)]
>   CharacterList of length 1
>   [["4"]] f b
>
> Also now, it's an error if the subscript has names but 'x' has not:
>
>   > unname(x)[list(`4`=2:1)]
>   Error in subsetListByList(x, i) :
>     cannot subscript an unnamed list-like object by a named list-like
>     object
>
> (I should probably change this message for: "cannot subset an unnamed
> list-like object by a named list-like subscript".)
>
> This is to be consistent with subsetting a Vector object by name, which
> fails if 'x' has no names:
>
>   > IRanges(1:4, 5)["a"]
>   Error in normalizeSingleBracketSubscript(i, x) :
>     cannot subset by character when names are NULL
>
> If the subscript is a list-like object with names, the assumption is
> that the user intended those names to be mapped against 'x' names.

Why make this assumption? Three users here have not made it and were
surprised by the names on the index having any relevance to extraction.
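
For comparison, here is a minimal base-R sketch of the "ordinary vector"
behavior this thread refers to: names on an integer index are ignored and
extraction is purely positional. The names `x`, `i`, `res` and `by_name` are
illustrative only, and the sketch covers base R atomic vectors, not
Bioconductor List objects.

```r
# Base R only: names on an integer subscript are ignored when extracting.
x <- c(a = 10, b = 20, c = 30)
i <- c(c = 1L, a = 2L)   # named index; the names do NOT select elements

res <- x[i]              # purely positional: elements 1 and 2 of x
print(res)

# Extraction by name needs a character subscript instead.
by_name <- x[names(i)]   # elements named "c" and "a", in that order
print(by_name)
```

Whether a list-like subscript should follow this positional convention or
be matched by name is exactly the design question under discussion.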

> If 'x' doesn't have names, I think it should fail rather than silently
> fall back to position-based mapping. So at least you give a chance
> to the user to either put names on 'x' (maybe s/he just forgot) or to
> remove them from the subscript. If we really want to fall back to
> position-based mapping, at least it should issue a warning, I think.
>
> One thing I didn't change from pre-BioC-2.13 behavior is that a
> list-like subscript (when unnamed) is not recycled along 'x'. It's
> open to discussion whether this would be a good thing to have or not.
> Changing this would be pretty disruptive though...
>
> Cheers,
> H.
>
>
> On 10/29/2013 03:51 PM, Michael Lawrence wrote:
>
>> I think we should just drop the names for the user. The Bioc <2.13
>> behavior seems reasonable to me. Please elaborate on the subtle issues.
>> Most users would not expect the *names* on the index to have any effect
>> on the extraction, in accordance with the behavior of ordinary vectors.
>> The only difference with Lists is that there is a partitioning, which
>> seems unrelated to naming.
>>
>> Michael
>>
>>
>> On Tue, Oct 29, 2013 at 3:40 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>>
>>     Hi Thomas,
>>
>>     For the same reasons that you cannot subset by names a Vector object
>>     with no names:
>>
>>       > IRanges(1:4, width=10)[letters[1:4]]
>>       Error in normalizeSingleBracketSubscript(i, x) :
>>         cannot subset by character when names are NULL
>>
>>     you cannot subset an unnamed List object using a named list-like
>>     subscript. So in your case, just remove the names on 'keep_ranges'
>>     (which are probably not desired anyway) before using it as a
>>     subscript:
>>
>>       > keep_ranges
>>       CompressedIRangesList of length 18
>>       $`1`
>>       IRanges of length 1
>>           start end width
>>       [1]    20 108    89
>>
>>       $`2`
>>       IRanges of length 1
>>           start end width
>>       [1]    43 131    89
>>
>>       $`3`
>>       IRanges of length 1
>>           start end width
>>       [1]    21 105    85
>>
>>       ...
>>       <15 more elements>
>>
>>       > return_rles[ unname(keep_ranges) ]
>>       RleList of length 18
>>       [[1]]
>>       logical-Rle of length 89 with 1 run
>>         Lengths:   89
>>         Values : TRUE
>>
>>       [[2]]
>>       logical-Rle of length 89 with 1 run
>>         Lengths:   89
>>         Values : TRUE
>>
>>       [[3]]
>>       logical-Rle of length 85 with 1 run
>>         Lengths:   85
>>         Values : TRUE
>>
>>       [[4]]
>>       logical-Rle of length 85 with 1 run
>>         Lengths:   85
>>         Values : TRUE
>>
>>       [[5]]
>>       logical-Rle of length 102 with 1 run
>>         Lengths:  102
>>         Values : TRUE
>>
>>       ...
>>       <13 more elements>
>>
>>     Prior to BioC 2.13, it was possible to subset an unnamed List object
>>     by a named list-like subscript, and in that case, the names on the
>>     subscript were ignored and the subscript was treated as parallel to
>>     the object to subset. However this behavior was somehow dangerous
>>     (could lead to subtle issues) and didn't follow the spirit of what
>>     subsetting an unnamed Vector by name does. So it's not supported
>>     anymore.
>>
>>     Sorry for the inconvenience,
>>     H.
>>
>>
>>     On 10/29/2013 03:05 PM, Thomas Sandmann wrote:
>>
>>         Hi Herve,
>>
>>         I have updated to IRanges 1.20.4 now, but unfortunately, I still
>>         encounter an error when I try to subset a CompressedRleList or
>>         SimpleRleList with a CompressedIRangesList or SimpleIRangesList.
>>
>>         Would you mind having a look at where I am going wrong? (My two
>>         example objects are available in the rdata object at the url
>>         shown below).
>>
>>         con=url("http://dl.dropboxusercontent.com/u/126180/example.rdata")
>>         load( con )
>>         return_rles[ keep_ranges ]
>>
>>         Error in subsetListByList(x, i) (from List-class.R#205) :
>>           cannot subscript an unnamed list-like object by a named
>>           list-like object
>>
>>         R version 3.0.2 (2013-09-25)
>>         Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>>         locale:
>>          [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>          [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>          [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>          [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>          [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>         [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>>         attached base packages:
>>         [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>         [8] base
>>
>>         other attached packages:
>>          [1] trimPrimers_1.3.0    Rsamtools_1.14.1     Biostrings_2.30.0
>>          [4] GenomicRanges_1.14.2 XVector_0.2.0        IRanges_1.20.4
>>          [7] BiocGenerics_0.8.0   Defaults_1.1-1       BiocInstaller_1.12.0
>>         [10] roxygen2_2.2.2       digest_0.6.3         devtools_1.3
>>
>>         loaded via a namespace (and not attached):
>>          [1] bitops_1.0-6   brew_1.0-6     compiler_3.0.2 evaluate_0.5.1 httr_0.2
>>          [6] memoise_0.1    RCurl_1.95-4.1 stats4_3.0.2   stringr_0.6.2  tools_3.0.2
>>         [11] whisker_0.3-2  zlibbioc_1.8.0
>>
>>
>>     --
>>     Hervé Pagès
>>
>>     Program in Computational Biology
>>     Division of Public Health Sciences
>>     Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N, M1-B514
>>     P.O. Box 19024
>>     Seattle, WA 98109-1024
>>
>>     E-mail: hpa...@fhcrc.org
>>     Phone:  (206) 667-5791
>>     Fax:    (206) 667-1319
>>
_______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel