Hi

It looks as if the largest proportion of the genome has a coverage of 0. This is of course to be expected, but it means that you will have to play with the ylim parameter too, because otherwise the frequencies of the first bins will dominate the plot; that is why you see just one bar. See ?hist.

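In such a setting I just try something like:
hist(lane1, ylim=c(0, 2000), breaks=seq(0, max(lane1, na.rm=TRUE)+100, 100))
for bins of width 100 starting from 0 (the breaks have to span the whole range of the data, including the values below 1, and the NAs have to be dropped when taking the maximum).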
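
A minimal sketch of the same idea on simulated data, in case it helps to see the effect before trying it on the real lanes. The vector below is a made-up stand-in for lane1 (it is not the poster's data), and the ylim value is an arbitrary choice for this toy example. The breaks start at 0 rather than 1 so that values below 1 are covered, and plot=FALSE is used first to check how much of the data falls into the first bin.

```r
# Hypothetical stand-in for lane1: mostly near-zero coverage,
# a few very large values, and some NAs (assumed, not real data).
set.seed(1)
lane1 <- c(rexp(20000, rate = 2), runif(50, 1000, 40000), rep(NA, 100))

# Breaks must span the full range of the data, so start at 0 and
# drop NAs when taking the maximum.
brks <- seq(0, max(lane1, na.rm = TRUE) + 100, by = 100)

# Compute the bins without drawing, to see how dominant the first bin is.
h <- hist(lane1, breaks = brks, plot = FALSE)
first_bin_share <- h$counts[1] / sum(h$counts)

# Cap the y axis so that the sparsely filled bins become visible;
# the first bar is simply cut off at the top of the plot.
hist(lane1, breaks = brks, ylim = c(0, 20),
     main = "Coverage, y axis capped", xlab = "coverage")
```

Here first_bin_share shows numerically why an uncapped plot looks like a single bar: nearly all values land in the first bin of width 100.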

In addition, the package GenomeGraphs provides further methods to plot the coverage along the chromosome,
which may be of interest too. See the examples in the vignette.

Michael

On 17.08.2009 at 21:07, Abhishek Pratap wrote:

Hi

I don't have a lot of experience with plotting large numbers of data
points, and clearly my question reflects that. :)

summary(lane1)
    Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's
   0.000     0.040     0.180     5.186     0.620 39730.000  2264.000

Thanks for your help.

-Abhi


On Mon, Aug 17, 2009 at 3:05 PM, Sean Davis <[email protected]> wrote:


On Mon, Aug 17, 2009 at 3:01 PM, Abhishek Pratap <[email protected]> wrote:

Hi Sean

Thanks for your suggestion on both the mailing lists. I am now reading
the coverage values from a file, storing them as a data.frame, and
then creating a new numeric vector for each lane. Each vector may have
15000-45000 entries. The values differ widely in magnitude: some lie
between 0 and 1, e.g. (0.45, 0.89), while others fall in a range like
(4000, 44000). I am just taking random examples to explain the skew
in the data.

When I plot a histogram I just see one big bar. I feel the bins are
not being created effectively. I also tried a couple of different options in
the R hist function, but with the same result.

hist(lane2, freq=TRUE, breaks=10)
hist(lane2, freq=TRUE, include.lowest=TRUE)

What does summary(lane2) show? You may need to transform the data to make
it more presentable (log?).
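
One possible way to try the log idea, sketched on simulated data. The vector below is a hypothetical stand-in for lane2, and adding 1 before taking the log is an assumption made here so that zero coverage stays finite (log10(0) is -Inf); other offsets or transforms may suit the real data better.

```r
# Hypothetical stand-in for lane2: mostly near-zero coverage,
# a few very large values, and some NAs (assumed, not real data).
set.seed(2)
lane2 <- c(rexp(20000, rate = 2), runif(50, 1000, 40000), rep(NA, 100))

# log10(x + 1) maps 0 to 0 and compresses the large values,
# so both ends of the distribution fit on one axis.
log_cov <- log10(lane2 + 1)

# Compute the bins without drawing, to confirm the data are now
# spread over more than one bin.
h <- hist(log_cov, breaks = 50, plot = FALSE)
n_nonempty <- sum(h$counts > 0)

# Draw the histogram on the transformed scale.
hist(log_cov, breaks = 50,
     main = "Coverage on a log scale", xlab = "log10(coverage + 1)")
```

Note that breaks=50 is only a suggestion to hist, so the actual number of bins can differ slightly.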

Sean




_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Michael Dondrup, Ph.D.
Bergen Center for Computational Science
Computational Biology Unit
Unifob AS - Thormøhlensgate 55, N-5008 Bergen, Norway
Phone: +47 55584029 Fax: +47 55584295
