Re: [Bioc-sig-seq] Assessing Transcriptome Coverage

Michael Dondrup Tue, 18 Aug 2009 08:22:55 -0700

Hi

it looks as if the largest proportion of the genome has a coverage of 0. This is of course to be expected, but it means that you will have to play with the ylim parameter too, because otherwise the frequency for the first bins will dominate the plot; that's why you just see one bar. See ?hist


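In such a setting I just try something like:
hist(lane1, ylim=c(0, 2000), breaks=seq(0, max(lane1, na.rm=TRUE)+100, 100))
for bins of width 100 starting from 0 (the breaks must span the full range of the data, including the zeros, and na.rm=TRUE is needed because lane1 contains NAs)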
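
A minimal sketch of the same idea on simulated data may help to see the effect. The vector `cov_sim` and its distribution are made up for illustration and are not the poster's data. Computing the histogram with plot=FALSE first shows how strongly the first bin dominates, which is why capping ylim helps:

```r
# Simulated, heavily skewed coverage values (hypothetical data, not lane1)
set.seed(1)
cov_sim <- c(rexp(20000, rate = 2),                 # most positions: coverage near 0
             runif(200, min = 100, max = 40000))    # a few highly covered positions

# Bins of width 100 that span the full range of the data
brks <- seq(0, max(cov_sim, na.rm = TRUE) + 100, by = 100)

# Inspect the counts without plotting: the first bin holds almost everything
h <- hist(cov_sim, breaks = brks, plot = FALSE)
head(h$counts)

# Cap the y-axis so that the remaining bins become visible
plot(h, ylim = c(0, 20), main = "Simulated coverage", xlab = "coverage")
```

The ylim value of 20 is only a guess that suits this simulated vector; for real data it would have to be tuned by looking at the counts outside the first bin.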

In addition, the package GenomeGraphs provides additional methods to plot the coverage over the chromosome, which may be of interest too. See the examples in the vignette.

Michael

On 17.08.2009 at 21:07, Abhishek Pratap wrote:

Hi

I don't have a lot of experience with plotting large numbers of data
points, and clearly my question reflects that. :)

summary(lane1)
    Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's
   0.000     0.040     0.180     5.186     0.620 39730.000  2264.000

Thanks for your help.

-Abhi


On Mon, Aug 17, 2009 at 3:05 PM, Sean Davis<[email protected]> wrote:

On Mon, Aug 17, 2009 at 3:01 PM, Abhishek Pratap <[email protected]> wrote:
Hi Sean
Thanks for your suggestion on both the mailing lists. I am now reading
the coverage values from a file and storing them as a data.frame, and
then creating a new numeric vector for each lane. Each vector may have
15000-45000 entries. The values are numeric and differ widely in
magnitude: some lie between 0 and 1 (e.g. 0.45, 0.89), and then I also
have values in a range like 4000 to 44000. I am just taking random
examples to explain the bias in the data.

When I plot a histogram I just see one big bar. I feel the bins are
not created effectively. I also tried a couple of different options in
the R hist function, but with the same result.

hist(lane2, freq=TRUE, breaks=10)
hist(lane2, freq=TRUE, include.lowest=TRUE)

What does summary(lane2) show? You may need to transform the data to make
it more presentable (log?).
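
As a rough sketch of such a log transformation, again on a made-up vector `cov_sim` rather than the real lane2: the pseudocount of 1 is an assumption, chosen so that zero coverage maps to 0 instead of -Inf.

```r
# Hypothetical skewed coverage values with zeros and a missing entry
set.seed(1)
cov_sim <- c(0, 0, NA, rexp(20000, rate = 2), runif(200, min = 100, max = 40000))

# log10(x + 1) keeps zeros finite; NA stays NA and is dropped by hist()
log_cov <- log10(cov_sim + 1)

summary(log_cov)
hist(log_cov, breaks = 50, xlab = "log10(coverage + 1)",
     main = "Simulated coverage, log scale")
```

Note that `breaks = 50` is only a suggestion to hist(), which may choose a nearby "pretty" number of bins.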

Sean


_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


Michael Dondrup, Ph.D.
Bergen Center for Computational Science
Computational Biology Unit
Unifob AS - Thormøhlensgate 55, N-5008 Bergen, Norway
Phone: +47 55584029 Fax: +47 55584295
