I think this is still too pedantic. For example, the GRanges constructor defaults to '*'. That should also emit a warning to be consistent with this.
On Thu, Apr 1, 2010 at 12:01 PM, Patrick Aboyoun <[email protected]> wrote: > I just checked in a patch to the GenomicRanges package in which the GRanges > constructor will now convert NA values in strand to the both/either strand > indicator "*" and issue a warning to the end-user that informs them of the > change. The updated GenomicRanges package should be available from > bioconductor.org with the next 36 hours. Here is an example: > > > > RangedData(IRanges(1,2)) > RangedData with 1 row and 0 value columns across 1 space > space ranges | > <character> <IRanges> | > 1 1 [1, 2] | > > > as(RangedData(IRanges(1,2)), "GRanges") > > GRanges with 1 range and 0 elementMetadata values > seqnames ranges strand | > <Rle> <IRanges> <Rle> | > [1] 1 [1, 2] * | > > seqlengths > 1 > NA > Warning message: > In GRanges(seqnames = space(from), ranges = ranges, strand = > Rle(strand(from)), : > missing values in strand converted to "*" > > > sessionInfo() > R version 2.11.0 Under development (unstable) (2010-03-22 r51355) > i386-apple-darwin9.8.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] GenomicRanges_0.1.3 IRanges_1.5.74 > > > > > > On 4/1/10 8:04 AM, Michael Lawrence wrote: > >> Thinking about this some more, it's somewhat analogous to the coercion to >> factor in R, i.e. as.factor(c("male", "female")) returns something >> reasonable, despite missing level information. >> >> as.factor("male") would probably not be what I wanted, but we live with >> it, >> since the alternative (requiring the levels argument) would probably be >> worse. >> >> On Thu, Apr 1, 2010 at 7:31 AM, Michael Lawrence<[email protected]> >> wrote: >> >> >> >>> >>> On Thu, Apr 1, 2010 at 7:22 AM, Martin Morgan<[email protected]> >>> wrote: >>> >>> >>> >>>> On 04/01/2010 07:12 AM, Michael Lawrence wrote: >>>> >>>> >>>>> On Thu, Apr 1, 2010 at 7:09 AM, Martin Morgan<[email protected]> >>>>> >>>>> >>>> wrote: >>>> >>>> >>>>> >>>>> >>>>>> On 03/31/2010 07:11 PM, [email protected] wrote: >>>>>> >>>>>> >>>>>>> Dear bioc-sig-sequencing, >>>>>>> >>>>>>> I would like to annotate chip-seq peaks for the arabidopsis genome. >>>>>>> >>>>>>> >>>>>> In >>>> >>>> >>>>> trying to work thru the GenomicFeatures vignette dated 03/27/10, I need >>>>>> >>>>>> >>>>> to >>>> >>>> >>>>> convert my ChIPSeq peaks from a RangedData object to a GRanges object. >>>>>> >>>>>> >>>>> In a >>>> >>>> >>>>> recent, but previous Bioconductor development version, the conversion >>>>>> >>>>>> >>>>> with >>>> >>>> >>>>> this particular RangedData object worked fine. >>>>>> >>>>>> >>>>>>> In this more recent Bioconductor development version, I get the >>>>>>> >>>>>>> >>>>>> following >>>> >>>> >>>>> error message: >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>>> gr_ChSeqPks<- as(rd0_chr1_s_8_trt_vs_INPctl, "GRanges") >>>>>>>> >>>>>>>> >>>>>>> Error in validObject(.Object) : >>>>>>> invalid class "GRanges" object: slot 'strand' contains missing >>>>>>> >>>>>>> >>>>>> values >>>> >>>> >>>>> rd0_chr1_s_8_trt_vs_INPctl >>>>>>>> >>>>>>>> >>>>>>> RangedData with 57 rows and 2 value columns across 1 space >>>>>>> space ranges | ARAB8 ARAB7INPCTL >>>>>>> <character> <IRanges> |<integer> <integer> >>>>>>> 1 chr1 [ 617092, 617094] | 24 0 >>>>>>> 2 chr1 [1808262, 1808262] | 8 0 >>>>>>> 3 chr1 [3889445, 3889452] | 64 0 >>>>>>> 4 chr1 [4404410, 4404410] | 8 0 >>>>>>> 5 chr1 [7081127, 7081127] | 8 0 >>>>>>> 6 chr1 [7128574, 7128581] | 64 0 >>>>>>> 7 chr1 [7128592, 7128649] | 464 0 >>>>>>> 8 chr1 [7530777, 7530781] | 40 0 >>>>>>> 9 chr1 [7530784, 7530786] | 24 0 >>>>>>> ... ... ... ... ... ... >>>>>>> >>>>>>> >>>>>> Hi, >>>>>> >>>>>> >>>>>> >>>>>>> rd = RangedData(IRanges(1, 10)) >>>>>>> as(rd, "GRanges") >>>>>>> >>>>>>> >>>>>> Error in validObject(.Object) : >>>>>> invalid class "GRanges" object: slot 'strand' contains missing values >>>>>> >>>>>> >>>>>>> rd[["strand"]] = "*" >>>>>>> as(rd, "GRanges") >>>>>>> >>>>>>> >>>>>> GRanges with 1 range and 0 elementMetadata values >>>>>> seqnames ranges strand | >>>>>> <Rle> <IRanges> <Rle> | >>>>>> [1] 1 [1, 10] * | >>>>>> >>>>>> seqlengths >>>>>> 1 >>>>>> NA >>>>>> >>>>>> Martin >>>>>> >>>>>> >>>>>> >>>>>> >>>>> Shouldn't the coerce function just do this automatically? >>>>> >>>>> >>>> Currently GRanges thinks of strand as '+', '-', '*', whereas IRanges >>>> allows NA as well (hence the error) so coercing NA to * represents a >>>> decision on the part of the investigator that '*' (strand irrelevant) is >>>> synonymous with NA (no information about strand available). Part of the >>>> motivation for this current state of affairs is that the use case for >>>> both NA and * were unclear, but course corrections welcome. >>>> >>>> >>>> >>>> >>> Ok. I guess one could think of the coercion of a RangedData missing a >>> 'strand' column to a GRanges as an equivalent statement, since GRanges >>> requires strand information. If that doesn't sound reasonable, a better >>> error message will help avoid questions like this in the future. >>> >>> Michael >>> >>> >>> >>> >>> >>> >>>> Martin >>>> >>>> >>>>> >>>>> >>>>>> >>>>>>> >>>>>>>> sessionInfo() >>>>>>>> >>>>>>>> >>>>>>> R version 2.12.0 Under development (unstable) (2010-03-30 r51506) >>>>>>> x86_64-unknown-linux-gnu >>>>>>> >>>>>>> locale: >>>>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>>>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>>>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>>>>> >>>>>>> attached base packages: >>>>>>> [1] stats graphics grDevices utils datasets methods base >>>>>>> >>>>>>> other attached packages: >>>>>>> [1] biomaRt_2.3.5 GenomicFeatures_0.5.0 GenomicRanges_0.1.0 >>>>>>> [4] IRanges_1.5.73 >>>>>>> >>>>>>> loaded via a namespace (and not attached): >>>>>>> [1] Biobase_2.7.5 Biostrings_2.15.26 BSgenome_1.15.20 >>>>>>> DBI_0.2-5 >>>>>>> [5] RCurl_1.3-1 RSQLite_0.8-4 rtracklayer_1.7.11 >>>>>>> >>>>>>> >>>>>> tools_2.12.0 >>>> >>>> >>>>> [9] XML_2.8-1 >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> P. Terry >>>>>>> [email protected] >>>>>>> >>>>>>> [[alternative HTML version deleted]] >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Bioc-sig-sequencing mailing list >>>>>>> [email protected] >>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Martin Morgan >>>>>> Computational Biology / Fred Hutchinson Cancer Research Center >>>>>> 1100 Fairview Ave. N. >>>>>> PO Box 19024 Seattle, WA 98109 >>>>>> >>>>>> Location: Arnold Building M1 B861 >>>>>> Phone: (206) 667-2793 >>>>>> >>>>>> _______________________________________________ >>>>>> Bioc-sig-sequencing mailing list >>>>>> [email protected] >>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> -- >>>> Martin Morgan >>>> Computational Biology / Fred Hutchinson Cancer Research Center >>>> 1100 Fairview Ave. N. >>>> PO Box 19024 Seattle, WA 98109 >>>> >>>> Location: Arnold Building M1 B861 >>>> Phone: (206) 667-2793 >>>> >>>> >>>> >>> >>> >>> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioc-sig-sequencing mailing list >> [email protected] >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >> >> > > [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
