Michael, I'm not sure I follow. Converting user-supplied input is different from using an argument's default value during function execution. If an argument's default value is questionable enough to merit a warning, then it is better to get rid of the default value and require the user to provide one. I'm fine with making strand a required argument if users find the current default value to provide little value, or worse by causing some confusion.
Patrick On 4/1/10 1:32 PM, Michael Lawrence wrote: > I think this is still too pedantic. For example, the GRanges > constructor defaults to '*'. That should also emit a warning to be > consistent with this. > > > On Thu, Apr 1, 2010 at 12:01 PM, Patrick Aboyoun <[email protected] > <mailto:[email protected]>> wrote: > > I just checked in a patch to the GenomicRanges package in which > the GRanges constructor will now convert NA values in strand to > the both/either strand indicator "*" and issue a warning to the > end-user that informs them of the change. The updated > GenomicRanges package should be available from bioconductor.org > <http://bioconductor.org> with the next 36 hours. Here is an example: > > > > RangedData(IRanges(1,2)) > RangedData with 1 row and 0 value columns across 1 space > space ranges | > <character> <IRanges> | > 1 1 [1, 2] | > > > as(RangedData(IRanges(1,2)), "GRanges") > > GRanges with 1 range and 0 elementMetadata values > seqnames ranges strand | > <Rle> <IRanges> <Rle> | > [1] 1 [1, 2] * | > > seqlengths > 1 > NA > Warning message: > In GRanges(seqnames = space(from), ranges = ranges, strand = > Rle(strand(from)), : > missing values in strand converted to "*" > > > sessionInfo() > R version 2.11.0 Under development (unstable) (2010-03-22 r51355) > i386-apple-darwin9.8.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] GenomicRanges_0.1.3 IRanges_1.5.74 > > > > > > On 4/1/10 8:04 AM, Michael Lawrence wrote: > > Thinking about this some more, it's somewhat analogous to the > coercion to > factor in R, i.e. as.factor(c("male", "female")) returns something > reasonable, despite missing level information. > > as.factor("male") would probably not be what I wanted, but we > live with it, > since the alternative (requiring the levels argument) would > probably be > worse. > > On Thu, Apr 1, 2010 at 7:31 AM, Michael > Lawrence<[email protected] <mailto:[email protected]>> wrote: > > > > On Thu, Apr 1, 2010 at 7:22 AM, Martin > Morgan<[email protected] <mailto:[email protected]>> wrote: > > > On 04/01/2010 07:12 AM, Michael Lawrence wrote: > > On Thu, Apr 1, 2010 at 7:09 AM, Martin > Morgan<[email protected] <mailto:[email protected]>> > > wrote: > > > On 03/31/2010 07:11 PM, [email protected] > <mailto:[email protected]> wrote: > > Dear bioc-sig-sequencing, > > I would like to annotate chip-seq peaks > for the arabidopsis genome. > > In > > trying to work thru the GenomicFeatures > vignette dated 03/27/10, I need > > to > > convert my ChIPSeq peaks from a RangedData > object to a GRanges object. > > In a > > recent, but previous Bioconductor development > version, the conversion > > with > > this particular RangedData object worked fine. > > In this more recent Bioconductor > development version, I get the > > following > > error message: > > > gr_ChSeqPks<- > as(rd0_chr1_s_8_trt_vs_INPctl, "GRanges") > > Error in validObject(.Object) : > invalid class "GRanges" object: slot > 'strand' contains missing > > values > > rd0_chr1_s_8_trt_vs_INPctl > > RangedData with 57 rows and 2 value > columns across 1 space > space ranges | > ARAB8 ARAB7INPCTL > <character> <IRanges> |<integer> <integer> > 1 chr1 [ 617092, 617094] | > 24 0 > 2 chr1 [1808262, 1808262] | > 8 0 > 3 chr1 [3889445, 3889452] | > 64 0 > 4 chr1 [4404410, 4404410] | > 8 0 > 5 chr1 [7081127, 7081127] | > 8 0 > 6 chr1 [7128574, 7128581] | > 64 0 > 7 chr1 [7128592, 7128649] | > 464 0 > 8 chr1 [7530777, 7530781] | > 40 0 > 9 chr1 [7530784, 7530786] | > 24 0 > ... ... ... ... > ... ... > > Hi, > > > rd = RangedData(IRanges(1, 10)) > as(rd, "GRanges") > > Error in validObject(.Object) : > invalid class "GRanges" object: slot 'strand' > contains missing values > > rd[["strand"]] = "*" > as(rd, "GRanges") > > GRanges with 1 range and 0 elementMetadata values > seqnames ranges strand | > <Rle> <IRanges> <Rle> | > [1] 1 [1, 10] * | > > seqlengths > 1 > NA > > Martin > > > > Shouldn't the coerce function just do this > automatically? > > Currently GRanges thinks of strand as '+', '-', '*', > whereas IRanges > allows NA as well (hence the error) so coercing NA to > * represents a > decision on the part of the investigator that '*' > (strand irrelevant) is > synonymous with NA (no information about strand > available). Part of the > motivation for this current state of affairs is that > the use case for > both NA and * were unclear, but course corrections > welcome. > > > > Ok. I guess one could think of the coercion of a > RangedData missing a > 'strand' column to a GRanges as an equivalent statement, > since GRanges > requires strand information. If that doesn't sound > reasonable, a better > error message will help avoid questions like this in the > future. > > Michael > > > > > > Martin > > > > sessionInfo() > > R version 2.12.0 Under development > (unstable) (2010-03-30 r51506) > x86_64-unknown-linux-gnu > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 > LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=C > LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 > LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils > datasets methods base > > other attached packages: > [1] biomaRt_2.3.5 > GenomicFeatures_0.5.0 GenomicRanges_0.1.0 > [4] IRanges_1.5.73 > > loaded via a namespace (and not attached): > [1] Biobase_2.7.5 Biostrings_2.15.26 > BSgenome_1.15.20 DBI_0.2-5 > [5] RCurl_1.3-1 RSQLite_0.8-4 > rtracklayer_1.7.11 > > tools_2.12.0 > > [9] XML_2.8-1 > > > > Thanks, > P. Terry > [email protected] > <mailto:[email protected]> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > <mailto:[email protected]> > > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > > -- > Martin Morgan > Computational Biology / Fred Hutchinson Cancer > Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > <mailto:[email protected]> > > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > > > > -- > Martin Morgan > Computational Biology / Fred Hutchinson Cancer > Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > <mailto:[email protected]> > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > > [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
