On Tue, Oct 29, 2013 at 5:55 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:

> Hi Michael,
>
> In Bioc < 2.13, subsetting was a mess. In particular, handling of
> list-like subscripts was rather unpredictable. It would work only
> if you were lucky enough to try it with one of the few supported
> types (like IntegerList, LogicalList, or IRangesList), but it didn't
> work for other very natural types like list or CharacterList.
> Or it would work for [ but not for [<-, or vice-versa:
>
>   x <- splitAsList(letters[1:6], c(2, 4, 3, 2, 2, 4))
>   x[list(1)]            # doesn't work in BioC < 2.13!
>   x[list(1)] <- "XX"    # works in BioC < 2.13!
>
> Or, if both [ and [<- worked, they could behave inconsistently: one
> would require the list-like subscript to have the same length as 'x'
> but the other wouldn't. Or one would use the names on the subscript
> and on 'x' to map the list elements between the two, but the other
> wouldn't.
>
> Hopefully in BioC 2.13, subsetting behaves more consistently (at least
> that was the intention). For example now the names on the subscript and
> on 'x' are always used to map the list elements between the two:
>
>   > x[list(`4`=2:1)]
>   CharacterList of length 1
>   [["4"]] f b
>
> Also now, it's an error if the subscript has names but 'x' does not:
>
>   > unname(x)[list(`4`=2:1)]
>   Error in subsetListByList(x, i) :
>     cannot subscript an unnamed list-like object by a named list-like object
>
> (I should probably change this message to: "cannot subset an unnamed
> list-like object by a named list-like subscript".)
>
> This is to be consistent with subsetting a Vector object by name, which
> fails if 'x' has no names:
>
>   > IRanges(1:4, 5)["a"]
>   Error in normalizeSingleBracketSubscript(i, x) :
>     cannot subset by character when names are NULL
>
> If the subscript is a list-like object with names, the assumption is
> that the user intended those names to be mapped against 'x' names.
>
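For reference, the name-mapping rule quoted above can be sketched in base R.
This is only an illustration, not the IRanges implementation:
subset_list_by_list() is a hypothetical helper working on plain lists, and
its handling of unnamed subscripts (positional, no recycling along 'x') is
an assumption based on the description in this thread.

```r
# Hypothetical helper (NOT the IRanges code): subset each element of a plain
# list 'x' by the matching element of a list subscript 'i'.
subset_list_by_list <- function(x, i) {
  stopifnot(is.list(x), is.list(i))
  if (is.null(names(i))) {
    # Assumption: an unnamed subscript maps by position and is not recycled.
    if (length(i) > length(x))
      stop("subscript is longer than the object to subset")
    return(Map(function(xi, ii) xi[ii], x[seq_along(i)], i))
  }
  if (is.null(names(x)))
    stop("cannot subset an unnamed list-like object by a named list-like subscript")
  if (!all(names(i) %in% names(x)))
    stop("subscript contains names not found in 'x'")
  # Named subscript: the names on 'i' pick which elements of 'x' to subset.
  Map(function(xi, ii) xi[ii], x[names(i)], i)
}

x <- split(letters[1:6], c(2, 4, 3, 2, 2, 4))   # names "2", "3", "4"
res <- subset_list_by_list(x, list(`4` = 2:1))
print(res)

# Same call on an unnamed 'x': the error message is captured, not raised.
unnamed <- tryCatch(subset_list_by_list(unname(x), list(`4` = 2:1)),
                    error = function(e) conditionMessage(e))
print(unnamed)
```

With a named 'x' the element "4" comes back reversed; with an unnamed 'x'
the call stops instead of silently falling back to positions.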


Why make this assumption? Three users here have not made it and were
surprised by the names on the index having any relevance to extraction.
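
For comparison, base R ignores names on a numeric index. A quick check with
an ordinary named vector and a plain list (the object names below are made
up for illustration):

```r
v <- c(a = 10, b = 20, c = 30)
idx <- c(c = 1, a = 2)        # index names deliberately disagree with positions
by_pos <- v[idx]              # extraction is by position; index names are unused
print(by_pos)

l <- list(a = 1:3, b = 4:6)
by_pos_list <- l[c(b = 1)]    # picks the first element, not the one named "b"
print(names(by_pos_list))
```

In both cases the result is selected by position and keeps the names of the
object being subset, not those of the index.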


> If 'x' doesn't have names, I think it should fail rather than silently
> fall back to position-based mapping. So at least you give a chance
> to the user to either put names on 'x' (maybe s/he just forgot) or to
> remove them from the subscript. If we really want to fall back to
> position-based mapping, at least it should issue a warning, I think.
>
> One thing I didn't change from pre-BioC-2.13 behavior is that a
> list-like subscript (when unnamed) is not recycled along 'x'. It's
> open to discussion whether this would be a good thing to have or not.
> Changing this would be pretty disruptive though...
>
> Cheers,
> H.
>
>
>
> On 10/29/2013 03:51 PM, Michael Lawrence wrote:
>
>> I think we should just drop the names for the user. The Bioc <2.13
>> behavior seems reasonable to me. Please elaborate on the subtle issues.
>> Most users would not expect the *names* on the index to have any effect
>> on the extraction, in accordance with the behavior of ordinary vectors.
>> The only difference with Lists is that there is a partitioning, which
>> seems unrelated to naming.
>>
>> Michael
>>
>>
>> On Tue, Oct 29, 2013 at 3:40 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>>
>>     Hi Thomas,
>>
>>     For the same reasons that you cannot subset by names a Vector object
>>     with no names:
>>
>>        > IRanges(1:4, width=10)[letters[1:4]]
>>        Error in normalizeSingleBracketSubscript(i, x) :
>>          cannot subset by character when names are NULL
>>
>>     you cannot subset an unnamed List object using a named list-like
>>     subscript. So in your case, just remove the names on 'keep_ranges'
>>     (which are probably not desired anyway) before using it as a
>>     subscript:
>>
>>
>>        > keep_ranges
>>        CompressedIRangesList of length 18
>>        $`1`
>>        IRanges of length 1
>>            start end width
>>        [1]    20 108    89
>>
>>        $`2`
>>        IRanges of length 1
>>            start end width
>>        [1]    43 131    89
>>
>>        $`3`
>>        IRanges of length 1
>>            start end width
>>        [1]    21 105    85
>>
>>        ...
>>        <15 more elements>
>>
>>        > return_rles[ unname(keep_ranges) ]
>>        RleList of length 18
>>        [[1]]
>>        logical-Rle of length 89 with 1 run
>>          Lengths:   89
>>          Values : TRUE
>>
>>        [[2]]
>>        logical-Rle of length 89 with 1 run
>>          Lengths:   89
>>          Values : TRUE
>>
>>        [[3]]
>>        logical-Rle of length 85 with 1 run
>>          Lengths:   85
>>          Values : TRUE
>>
>>        [[4]]
>>        logical-Rle of length 85 with 1 run
>>          Lengths:   85
>>          Values : TRUE
>>
>>        [[5]]
>>        logical-Rle of length 102 with 1 run
>>          Lengths:  102
>>          Values : TRUE
>>
>>        ...
>>        <13 more elements>
>>
>>     Prior to BioC 2.13, it was possible to subset an unnamed List object
>>     by a named list-like subscript, and in that case, the names on the
>>     subscript were ignored and the subscript was treated as parallel to
>>     the object to subset. However, this behavior was somewhat dangerous
>>     (could lead to subtle issues) and didn't follow the spirit of what
>>     subsetting an unnamed Vector by name does. So it's not supported
>>     anymore.
>>
>>     Sorry for the inconvenience,
>>     H.
>>
>>
>>
>>     On 10/29/2013 03:05 PM, Thomas Sandmann wrote:
>>
>>         Hi Herve,
>>
>>         I have updated to IRanges 1.20.4 now, but unfortunately, I still
>>         encounter an error when I try to subset a CompressedRleList or
>>         SimpleRleList with a CompressedIRangesList or SimpleIRangesList.
>>
>>         Would you mind having a look at where I am going wrong? (My two
>>         example objects are available in the rdata object at the url
>>         shown below).
>>
>>         con=url("http://dl.dropboxusercontent.com/u/126180/example.rdata")
>>         load( con )
>>         return_rles[ keep_ranges ]
>>
>>         Error in subsetListByList(x, i) (from List-class.R#205) :
>>             cannot subscript an unnamed list-like object by a named
>>         list-like object
>>
>>         R version 3.0.2 (2013-09-25)
>>         Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>>         locale:
>>          [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>          [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>          [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>          [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>          [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>         [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>>         attached base packages:
>>         [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>         [8] base
>>
>>         other attached packages:
>>          [1] trimPrimers_1.3.0    Rsamtools_1.14.1     Biostrings_2.30.0
>>          [4] GenomicRanges_1.14.2 XVector_0.2.0        IRanges_1.20.4
>>          [7] BiocGenerics_0.8.0   Defaults_1.1-1       BiocInstaller_1.12.0
>>         [10] roxygen2_2.2.2       digest_0.6.3         devtools_1.3
>>
>>         loaded via a namespace (and not attached):
>>          [1] bitops_1.0-6   brew_1.0-6     compiler_3.0.2 evaluate_0.5.1 httr_0.2
>>          [6] memoise_0.1    RCurl_1.95-4.1 stats4_3.0.2   stringr_0.6.2  tools_3.0.2
>>         [11] whisker_0.3-2  zlibbioc_1.8.0
>>
>>
>>     --
>>     Hervé Pagès
>>
>>     Program in Computational Biology
>>     Division of Public Health Sciences
>>     Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N, M1-B514
>>     P.O. Box 19024
>>     Seattle, WA 98109-1024
>>
>>     E-mail: hpa...@fhcrc.org
>>     Phone: (206) 667-5791
>>     Fax: (206) 667-1319
>>
>>     _______________________________________________
>>     Bioc-devel@r-project.org mailing list
>>     https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
