Thanks, Jim! This is not what I want, What I want is calculate the percentage of reads bigger or equal to that reads in each position.MY output is like the following: for row 1, all the reads is >= 4, so the cover_per is 100, for row 2, 99 % reads >=4, so the cover_per is 99. > head(final) cover_per reads 1 100 4 2 99 8 3 98 13 4 97 14 5 96 17 6 95 20
I attached the input file with this email. This file is only 100 rows, very small. MY original data set is 3384766 rows. > matt<-read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", sep="\t", skip=0, header=F,fill=T) # > dim(matt) [1] 3384766 2 Thanks so much for your time! > matt<-read.table("/home/cdu/operon/dimer5_0623/matt_test.txt", sep="\t", skip=0, header=F,fill=T) # > names(matt)<-c("id","reads") > dim(matt) [1] 100 2 > cover<-matt$reads > > > #calculate the cumulative coverage. > cover_per<-function (data) { + output<-numeric(0) + for (i in data) { + x<-(100*sum(ifelse(data >= i, 1, 0))/length(data)) + output<-c(output, x) + } + return(output) + } > > > result<-cover_per(cover) > head(result) [1] 100 99 98 97 96 95 > > final<-data.frame(result, cover) > > names(final)<-c("cover_per", "reads") > head(final) cover_per reads 1 100 4 2 99 8 3 98 13 4 97 14 5 96 17 6 95 20 On Thu, Nov 4, 2010 at 9:18 AM, jim holtman <jholt...@gmail.com> wrote: > Is this what you want: > > > x > id reads > 1 Contig79:1 4 > 2 Contig79:2 8 > 3 Contig79:3 13 > 4 Contig79:4 14 > 5 Contig79:5 17 > 6 Contig79:6 20 > 7 Contig79:7 25 > 8 Contig79:8 27 > 9 Contig79:9 32 > 10 Contig79:10 33 > 11 Contig79:11 34 > > x$percent <- x$reads / max(x$reads) * 100 > > x > id reads percent > 1 Contig79:1 4 11.76471 > 2 Contig79:2 8 23.52941 > 3 Contig79:3 13 38.23529 > 4 Contig79:4 14 41.17647 > 5 Contig79:5 17 50.00000 > 6 Contig79:6 20 58.82353 > 7 Contig79:7 25 73.52941 > 8 Contig79:8 27 79.41176 > 9 Contig79:9 32 94.11765 > 10 Contig79:10 33 97.05882 > 11 Contig79:11 34 100.00000 > > > On Thu, Nov 4, 2010 at 11:46 AM, Changbin Du <changb...@gmail.com> wrote: > > HI, Dear R community, > > > > I have one data set like this, What I want to do is to calculate the > > cumulative coverage. The following codes works for small data set (#rows > = > > 100), but when feed the whole data set, it still running after 24 hours. > > Can someone give some suggestions for long vector? > > > > id reads > > Contig79:1 4 > > Contig79:2 8 > > Contig79:3 13 > > Contig79:4 14 > > Contig79:5 17 > > Contig79:6 20 > > Contig79:7 25 > > Contig79:8 27 > > Contig79:9 32 > > Contig79:10 33 > > Contig79:11 34 > > > > > matt<-read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", > > sep="\t", skip=0, header=F,fill=T) # > > dim(matt) > > [1] 3384766 2 > > > > matt_plot<-function(matt, outputfile) { > > names(matt)<-c("id","reads") > > > > cover<-matt$reads > > > > > > #calculate the cumulative coverage. > > + cover_per<-function (data) { > > + output<-numeric(0) > > + for (i in data) { > > + x<-(100*sum(ifelse(data >= i, 1, 0))/length(data)) > > + output<-c(output, x) > > + } > > + return(output) > > + } > > > > > > result<-cover_per(cover) > > > > > > Thanks so much! > > > > > > -- > > Sincerely, > > Changbin > > -- > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem that you are trying to solve? > -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856
Contig79:1 4 Contig79:2 8 Contig79:3 13 Contig79:4 14 Contig79:5 17 Contig79:6 20 Contig79:7 25 Contig79:8 27 Contig79:9 32 Contig79:10 33 Contig79:11 34 Contig79:12 36 Contig79:13 39 Contig79:14 40 Contig79:15 44 Contig79:16 49 Contig79:17 55 Contig79:18 56 Contig79:19 59 Contig79:20 60 Contig79:21 62 Contig79:22 64 Contig79:23 64 Contig79:24 68 Contig79:25 68 Contig79:26 68 Contig79:27 70 Contig79:28 73 Contig79:29 76 Contig79:30 77 Contig79:31 78 Contig79:32 78 Contig79:33 79 Contig79:34 80 Contig79:35 80 Contig79:36 84 Contig79:37 87 Contig79:38 87 Contig79:39 88 Contig79:40 88 Contig79:41 89 Contig79:42 93 Contig79:43 94 Contig79:44 98 Contig79:45 99 Contig79:46 99 Contig79:47 102 Contig79:48 103 Contig79:49 108 Contig79:50 112 Contig79:51 112 Contig79:52 113 Contig79:53 116 Contig79:54 118 Contig79:55 120 Contig79:56 124 Contig79:57 124 Contig79:58 126 Contig79:59 128 Contig79:60 130 Contig79:61 133 Contig79:62 134 Contig79:63 136 Contig79:64 139 Contig79:65 144 Contig79:66 145 Contig79:67 146 Contig79:68 148 Contig79:69 149 Contig79:70 151 Contig79:71 156 Contig79:72 157 Contig79:73 158 Contig79:74 159 Contig79:75 159 Contig79:76 159 Contig79:77 160 Contig79:78 160 Contig79:79 161 Contig79:80 163 Contig79:81 164 Contig79:82 165 Contig79:83 165 Contig79:84 165 Contig79:85 165 Contig79:86 166 Contig79:87 170 Contig79:88 170 Contig79:89 172 Contig79:90 174 Contig79:91 178 Contig79:92 180 Contig79:93 181 Contig79:94 184 Contig79:95 184 Contig79:96 187 Contig79:97 190 Contig79:98 192 Contig79:99 194 Contig79:100 199
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.