On 09/07/2011 11:34 AM, wang peter wrote:
i replaced the low quality position with "N" and then trimmed N from 5' and 3' i got the reads which only contain the "N" in the middle if i want to find those reads before replacement by "N" WHAT CAN I DO?
Hi Peter -- Not sure what you mean, probably not this library(ShortRead) example(readFastq) ## defines 'rfq' variable startsWithN <- as.character(narrow(sread(rfq), 1, 1)) == "N" endsWithN <- as.character(narrow(sread(rfq), width(rfq), width(rfq))) == "N" middleN <- !startsWithN & !endsWithN & alphabetFrequency(sread(rfq))[,"N"] != 0 rfq[!middleN] I think below your code can be simplified with at <- as(quality(read), "matrix") < qualityCutoff Martin
THX such is coding library(ShortRead) reads<- readFastq(fastqfile);# ids<- id(reads); # seqs<- sread(reads); nCount<-alphabetFrequency(seqs)[,"N"] nDist<- table(nCount) qualityCutoff<- 20 qual<- PhredQuality(quality(quality(reads))) myqual_mat<- matrix(charToRaw(as.character(unlist(qual))), nrow=length(qual), byrow=TRUE) # convert quality score to matrix at<- myqual_mat< charToRaw(as.character(PhredQuality(as.integer(qualityCutoff)))) letter_subject<- DNAString(paste(rep.int("N", width(seqs)[1]), collapse="")) letter<- as(Views(letter_subject, start=1, end=rowSums(at)), "DNAStringSet") injectedseqs<- replaceLetterAt(seqs, at, letter) trimmed<-trimLRPatterns(Rpattern = letter_subject, Lpattern = letter_subject,subject = injectedseqs) nCount<-alphabetFrequency(trimmed)[,"N"] nDist<- table(nCount) thank u shan gao [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list Bioc-sig-sequencing@r-project.org https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
-- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 _______________________________________________ Bioc-sig-sequencing mailing list Bioc-sig-sequencing@r-project.org https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing