Re: [Bioc-sig-seq] WIG from coverage

Michael Lawrence Tue, 29 Sep 2009 19:25:33 -0700

On Tue, Sep 29, 2009 at 4:28 PM, Seidel, Christopher <[email protected]>wrote:


> Hello,
>
> In an earlier thread: BED to WIG format conversion (Thu 9/17/2009 4:50 PM)
> Michael Lawrence mentioned:
>
> Ignoring efficiency, it's pretty straight-forward. In the development
> version of IRanges, one can convert an Rle (or RleList for multiple
> chromosomes) as output to coverage() to a RangedData. In the development
> version of rtracklayer, this is done implicitly:
>
> > export(cov, "coverage.wig")
>
> as a way to create a wig file from a coverage() result. When I try this, my
> wig file looks like the following:
>
> track name="R Track" type=wiggle_0
> .       0       2       168
> .       2       4       175
> .       4       5       177
> .       5       7       178
> .       7       8       182
> .       8       12      185
> .       12      14      188
> .       14      15      191
> etc.
>
> This isn't compatible with any of the UCSC wig formats that I'm aware of
> (is it?).


No, it should be outputting the chromosome names instead of ".", following
the BED-line WIG variant.

The coercion from Rle to RangedData yields a single-chromosome RangedData
with NULL names. This becomes <NA> upon coercion to a UCSCData (which tries
to generate conforming names but fails). <NA> shows up as "." in BED.

New policy: RangedData cannot have NULL names. Made a validity check for
this. Fixed Rle->RangedData coercion to yield default of "1". This mimics
the behavior of RangedData().

Shouldn't rtracklayer be putting out a variableStep format for this?


Well, come to think of it, that should be possible. Right now, rtracklayer
only uses variableStep when the span is the same across an entire
chromosome. Should be possible to split the chromosome into multiple blocks
though.


> Am I doing something wrong?
>
>

Note that in general your code will not work, because you do not tell
rtracklayer the name of the chromosome. This makes me worry that generating
a default chromosome name will often hide user errors.



> By the way, here is my work flow for going from solexa export to coverage:
>
> # read in some IP data
> aln <- readAligned(spath, pattern="s_1_export.txt", type="SolexaExport",
> filter=filt)
> # convert to GenomeData object
> s1 <- as(aln,"GenomeData")
>
> # read in some WCE data
> aln <- readAligned(spath, pattern="s_3_export.txt", type="SolexaExport",
> filter=filt)
> # convert to GenomeData object
> s3 <- as(aln,"GenomeData")
>
> # create GenomeDataList from ip and wce
> ypd <- GenomeDataList(list(ip=s1, wce=s3))
>
> # extend reads
> ip1 <- extendReads(ypd$ip$chrI, seqLen=150)
> wce1 <- extendReads(ypd$wce$chrI, seqLen=150)
>
> # find coverage for chromosome 1
> ip1.cov <- coverage(ip1, width=230208)
> wce1.cov <- coverage(wce1, width=230208)
>
> # make a track
> export(ip1.cov, "ip1_coverage.wig")
>
> > sessionInfo()
> R version 2.10.0 Under development (unstable) (2009-09-03 r49555)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] rtracklayer_1.5.13 RCurl_1.2-0        bitops_1.0-4.1     chipseq_0.1.25
> [5] ShortRead_1.3.34   lattice_0.17-25    BSgenome_1.13.11
> Biostrings_2.13.39
> [9] IRanges_1.3.75
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.5.6 XML_2.6-0     grid_2.10.0   hwriter_1.1
>
> -Chris
>
> Chris Seidel, Ph.D.
> http://research.stowers-institute.org/cws
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Re: [Bioc-sig-seq] WIG from coverage

Reply via email to