Thanks, Jim!

This is not what I want,  What I want is calculate the percentage of reads
bigger or equal to that reads in each position.MY output is like the
following:
for row 1, all the reads is >= 4, so the cover_per is 100,
for row 2, 99 % reads >=4, so the cover_per is 99.
> head(final)
  cover_per reads
1       100     4
2        99     8
3        98    13
4        97    14
5        96    17
6        95    20

I attached the input file with this email. This file is only 100 rows, very
small. MY original data set is 3384766 rows.

 >
matt<-read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth",
sep="\t", skip=0, header=F,fill=T) #
> dim(matt)
[1] 3384766       2

Thanks so much for your time!

> matt<-read.table("/home/cdu/operon/dimer5_0623/matt_test.txt", sep="\t",
skip=0, header=F,fill=T) #
> names(matt)<-c("id","reads")
> dim(matt)
[1] 100   2
> cover<-matt$reads
>
>
> #calculate the cumulative coverage.
> cover_per<-function (data) {
+ output<-numeric(0)
+ for (i in data) {
+           x<-(100*sum(ifelse(data >= i, 1, 0))/length(data))
+           output<-c(output, x)
+                 }
+ return(output)
+ }
>
>
> result<-cover_per(cover)
> head(result)
[1] 100  99  98  97  96  95
>
> final<-data.frame(result, cover)
>
> names(final)<-c("cover_per", "reads")
> head(final)
  cover_per reads
1       100     4
2        99     8
3        98    13
4        97    14
5        96    17
6        95    20





On Thu, Nov 4, 2010 at 9:18 AM, jim holtman <jholt...@gmail.com> wrote:

> Is this what you want:
>
> > x
>            id reads
> 1   Contig79:1     4
> 2   Contig79:2     8
> 3   Contig79:3    13
> 4   Contig79:4    14
> 5   Contig79:5    17
> 6   Contig79:6    20
> 7   Contig79:7    25
> 8   Contig79:8    27
> 9   Contig79:9    32
> 10 Contig79:10    33
> 11 Contig79:11    34
> > x$percent <- x$reads / max(x$reads) * 100
> > x
>            id reads   percent
> 1   Contig79:1     4  11.76471
> 2   Contig79:2     8  23.52941
> 3   Contig79:3    13  38.23529
> 4   Contig79:4    14  41.17647
> 5   Contig79:5    17  50.00000
> 6   Contig79:6    20  58.82353
> 7   Contig79:7    25  73.52941
> 8   Contig79:8    27  79.41176
> 9   Contig79:9    32  94.11765
> 10 Contig79:10    33  97.05882
> 11 Contig79:11    34 100.00000
>
>
> On Thu, Nov 4, 2010 at 11:46 AM, Changbin Du <changb...@gmail.com> wrote:
> > HI, Dear R community,
> >
> > I have one data set like this,  What I want to do is to calculate the
> > cumulative coverage. The following codes works for small data set (#rows
> =
> > 100), but when feed the whole data set,  it still running after 24 hours.
> > Can someone give some suggestions for long vector?
> >
> > id                reads
> > Contig79:1    4
> > Contig79:2    8
> > Contig79:3    13
> > Contig79:4    14
> > Contig79:5    17
> > Contig79:6    20
> > Contig79:7    25
> > Contig79:8    27
> > Contig79:9    32
> > Contig79:10    33
> > Contig79:11    34
> >
> >
> matt<-read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth",
> > sep="\t", skip=0, header=F,fill=T) #
> > dim(matt)
> > [1] 3384766       2
> >
> > matt_plot<-function(matt, outputfile) {
> > names(matt)<-c("id","reads")
> >
> >  cover<-matt$reads
> >
> >
> > #calculate the cumulative coverage.
> > + cover_per<-function (data) {
> > + output<-numeric(0)
> > + for (i in data) {
> > +           x<-(100*sum(ifelse(data >= i, 1, 0))/length(data))
> > +           output<-c(output, x)
> > +                 }
> > + return(output)
> > + }
> >
> >
> >  result<-cover_per(cover)
> >
> >
> > Thanks so much!
> >
> >
> > --
> > Sincerely,
> > Changbin
> > --
> >
> >        [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>



-- 
Sincerely,
Changbin
--

Changbin Du
DOE Joint Genome Institute
Bldg 400 Rm 457
2800 Mitchell Dr
Walnut Creet, CA 94598
Phone: 925-927-2856
Contig79:1      4
Contig79:2      8
Contig79:3      13
Contig79:4      14
Contig79:5      17
Contig79:6      20
Contig79:7      25
Contig79:8      27
Contig79:9      32
Contig79:10     33
Contig79:11     34
Contig79:12     36
Contig79:13     39
Contig79:14     40
Contig79:15     44
Contig79:16     49
Contig79:17     55
Contig79:18     56
Contig79:19     59
Contig79:20     60
Contig79:21     62
Contig79:22     64
Contig79:23     64
Contig79:24     68
Contig79:25     68
Contig79:26     68
Contig79:27     70
Contig79:28     73
Contig79:29     76
Contig79:30     77
Contig79:31     78
Contig79:32     78
Contig79:33     79
Contig79:34     80
Contig79:35     80
Contig79:36     84
Contig79:37     87
Contig79:38     87
Contig79:39     88
Contig79:40     88
Contig79:41     89
Contig79:42     93
Contig79:43     94
Contig79:44     98
Contig79:45     99
Contig79:46     99
Contig79:47     102
Contig79:48     103
Contig79:49     108
Contig79:50     112
Contig79:51     112
Contig79:52     113
Contig79:53     116
Contig79:54     118
Contig79:55     120
Contig79:56     124
Contig79:57     124
Contig79:58     126
Contig79:59     128
Contig79:60     130
Contig79:61     133
Contig79:62     134
Contig79:63     136
Contig79:64     139
Contig79:65     144
Contig79:66     145
Contig79:67     146
Contig79:68     148
Contig79:69     149
Contig79:70     151
Contig79:71     156
Contig79:72     157
Contig79:73     158
Contig79:74     159
Contig79:75     159
Contig79:76     159
Contig79:77     160
Contig79:78     160
Contig79:79     161
Contig79:80     163
Contig79:81     164
Contig79:82     165
Contig79:83     165
Contig79:84     165
Contig79:85     165
Contig79:86     166
Contig79:87     170
Contig79:88     170
Contig79:89     172
Contig79:90     174
Contig79:91     178
Contig79:92     180
Contig79:93     181
Contig79:94     184
Contig79:95     184
Contig79:96     187
Contig79:97     190
Contig79:98     192
Contig79:99     194
Contig79:100    199
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to