Re: [Bioc-sig-seq] GenomicFeatures, error in type conversion RangeData to GRanges

Patrick Aboyoun Thu, 01 Apr 2010 22:31:02 -0700

Michael,
I'm not sure I follow. Converting user-supplied input is different from 
using an argument's default value during function execution. If an 
argument's default value is questionable enough to merit a warning, then 
it is better to get rid of the default value and require the user to 
provide one. I'm fine with making strand a required argument if users 
find that the current default provides little value or, worse, causes 
confusion.
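
To make the distinction concrete, here is a minimal base-R sketch. It is not the GenomicRanges code: `make_strand` and `warned` are hypothetical helpers invented for illustration, and the only thing taken from the patch is the wording of the warning. It assumes a reasonably recent R (for `anyNA`). Omitting the argument uses the default silently, while supplying NA triggers a conversion with a warning.

```r
# Hypothetical helper for illustration only; not part of GenomicRanges.
# Case 1: the caller omits `strand`, so the default "*" is used silently.
# Case 2: the caller supplies NA values, which are converted to "*"
#         and reported with a warning.
make_strand <- function(n, strand = "*") {
  strand <- rep(as.character(strand), length.out = n)
  if (anyNA(strand)) {
    warning("missing values in strand converted to \"*\"")
    strand[is.na(strand)] <- "*"
  }
  factor(strand, levels = c("+", "-", "*"))
}

# Hypothetical helper: TRUE if evaluating `expr` raised a warning.
warned <- function(expr) {
  flag <- FALSE
  withCallingHandlers(expr, warning = function(w) {
    flag <<- TRUE
    invokeRestart("muffleWarning")
  })
  flag
}

default_used <- make_strand(2)                                # no warning
na_converted <- suppressWarnings(make_strand(2, c("+", NA)))  # NA becomes "*"
```

If the default were judged too questionable, the alternative I describe above would amount to dropping `= "*"` from the signature, so that a call without `strand` fails instead of warning.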



Patrick


On 4/1/10 1:32 PM, Michael Lawrence wrote:
> I think this is still too pedantic. For example, the GRanges 
> constructor defaults to '*'. That should also emit a warning to be 
> consistent with this.
>
>
> On Thu, Apr 1, 2010 at 12:01 PM, Patrick Aboyoun wrote:
>
>     I just checked in a patch to the GenomicRanges package in which
>     the GRanges constructor now converts NA values in strand to
>     the both/either strand indicator "*" and issues a warning that
>     informs the end user of the change. The updated GenomicRanges
>     package should be available from bioconductor.org within the
>     next 36 hours. Here is an example:
>
>
>     > RangedData(IRanges(1,2))
>     RangedData with 1 row and 0 value columns across 1 space
>            space    ranges |
>     <character> <IRanges> |
>     1           1    [1, 2] |
>
>     > as(RangedData(IRanges(1,2)), "GRanges")
>
>     GRanges with 1 range and 0 elementMetadata values
>        seqnames    ranges strand |
>     <Rle> <IRanges> <Rle> |
>     [1]        1    [1, 2]      * |
>
>     seqlengths
>      1
>     NA
>     Warning message:
>     In GRanges(seqnames = space(from), ranges = ranges, strand =
>     Rle(strand(from)),  :
>      missing values in strand converted to "*"
>
>     > sessionInfo()
>     R version 2.11.0 Under development (unstable) (2010-03-22 r51355)
>     i386-apple-darwin9.8.0
>
>     locale:
>     [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
>
>     attached base packages:
>     [1] stats     graphics  grDevices utils     datasets  methods   base
>
>     other attached packages:
>     [1] GenomicRanges_0.1.3 IRanges_1.5.74
>
>
>
>
>
>     On 4/1/10 8:04 AM, Michael Lawrence wrote:
>
>         Thinking about this some more, it's somewhat analogous to the
>         coercion to factor in R, i.e. as.factor(c("male", "female"))
>         returns something reasonable, despite missing level information.
>
>         as.factor("male") would probably not be what I wanted, but we
>         live with it, since the alternative (requiring the levels
>         argument) would probably be worse.
>
>         On Thu, Apr 1, 2010 at 7:31 AM, Michael Lawrence wrote:
>
>
>
>             On Thu, Apr 1, 2010 at 7:22 AM, Martin Morgan wrote:
>
>
>                 On 04/01/2010 07:12 AM, Michael Lawrence wrote:
>
>                     On Thu, Apr 1, 2010 at 7:09 AM, Martin Morgan wrote:
>
>                         On 03/31/2010 07:11 PM, P. Terry wrote:
>
>                             Dear bioc-sig-sequencing,
>
>                             I would like to annotate ChIP-seq peaks
>                             for the Arabidopsis genome. In trying to
>                             work through the GenomicFeatures vignette
>                             dated 03/27/10, I need to convert my
>                             ChIP-seq peaks from a RangedData object
>                             to a GRanges object. In a recent but
>                             earlier Bioconductor development version,
>                             the conversion of this particular
>                             RangedData object worked fine.
>
>                             In this more recent Bioconductor
>                             development version, I get the following
>                             error message:
>
>                                 gr_ChSeqPks <- as(rd0_chr1_s_8_trt_vs_INPctl, "GRanges")
>
>                             Error in validObject(.Object) :
>                               invalid class "GRanges" object: slot 'strand' contains missing values
>
>                                 rd0_chr1_s_8_trt_vs_INPctl
>
>                             RangedData with 57 rows and 2 value columns across 1 space
>                                       space               ranges |     ARAB8 ARAB7INPCTL
>                                 <character>            <IRanges> | <integer>   <integer>
>                             1          chr1   [ 617092,  617094] |        24           0
>                             2          chr1   [1808262, 1808262] |         8           0
>                             3          chr1   [3889445, 3889452] |        64           0
>                             4          chr1   [4404410, 4404410] |         8           0
>                             5          chr1   [7081127, 7081127] |         8           0
>                             6          chr1   [7128574, 7128581] |        64           0
>                             7          chr1   [7128592, 7128649] |       464           0
>                             8          chr1   [7530777, 7530781] |        40           0
>                             9          chr1   [7530784, 7530786] |        24           0
>                             ...         ...                  ... ...       ...         ...
>
>                         Hi,
>
>
>                             rd = RangedData(IRanges(1, 10))
>                             as(rd, "GRanges")
>
>                         Error in validObject(.Object) :
>                           invalid class "GRanges" object: slot 'strand' contains missing values
>
>                             rd[["strand"]] = "*"
>                             as(rd, "GRanges")
>
>                         GRanges with 1 range and 0 elementMetadata values
>                            seqnames    ranges strand |
>                         <Rle> <IRanges> <Rle>  |
>                         [1]        1   [1, 10]      * |
>
>                         seqlengths
>                          1
>                         NA
>
>                         Martin
>
>
>
>                     Shouldn't the coerce function just do this
>                     automatically?
>
>                 Currently GRanges thinks of strand as '+', '-', '*',
>                 whereas IRanges allows NA as well (hence the error), so
>                 coercing NA to * represents a decision on the part of
>                 the investigator that '*' (strand irrelevant) is
>                 synonymous with NA (no information about strand
>                 available). Part of the motivation for this current
>                 state of affairs is that the use cases for both NA and
>                 * were unclear, but course corrections are welcome.
>
>
>
>             Ok. I guess one could think of the coercion of a
>             RangedData missing a 'strand' column to a GRanges as an
>             equivalent statement, since GRanges requires strand
>             information. If that doesn't sound reasonable, a better
>             error message will help avoid questions like this in the
>             future.
>
>             Michael
>
>
>
>
>
>                 Martin
>
>
>
>                                 sessionInfo()
>
>                             R version 2.12.0 Under development (unstable) (2010-03-30 r51506)
>                             x86_64-unknown-linux-gnu
>
>                             locale:
>                              [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>                              [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>                              [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>                              [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>                              [9] LC_ADDRESS=C               LC_TELEPHONE=C
>                             [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>                             attached base packages:
>                             [1] stats     graphics  grDevices utils     datasets  methods   base
>
>                             other attached packages:
>                             [1] biomaRt_2.3.5         GenomicFeatures_0.5.0 GenomicRanges_0.1.0
>                             [4] IRanges_1.5.73
>
>                             loaded via a namespace (and not attached):
>                             [1] Biobase_2.7.5      Biostrings_2.15.26 BSgenome_1.15.20   DBI_0.2-5
>                             [5] RCurl_1.3-1        RSQLite_0.8-4      rtracklayer_1.7.11 tools_2.12.0
>                             [9] XML_2.8-1
>
>
>
>                             Thanks,
>                             P. Terry
>
>                             _______________________________________________
>                             Bioc-sig-sequencing mailing list
>                             https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
>                         --
>                         Martin Morgan
>                         Computational Biology / Fred Hutchinson Cancer
>                         Research Center
>                         1100 Fairview Ave. N.
>                         PO Box 19024 Seattle, WA 98109
>
>                         Location: Arnold Building M1 B861
>                         Phone: (206) 667-2793
>
>
>
>
>
>
>
>
>
>
>
>


