Hi Dan, thanks for your comment. Actually, I got the idea of using Biostrings from SEQanswers, where it was suggested by James W. MacDonald; see http://seqanswers.com/forums/showthread.php?t=780
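For comparison with the alignment-based approach in the original post quoted below, here is a minimal base-R sketch of 3' adapter trimming by exact matching only. It is not the method from the SEQanswers thread or from the Biostrings vignette: it tolerates no mismatches or indels, and it assumes the adapter sits at the 3' end of the read. The function name `trim_3prime_adapter`, the `min_overlap` parameter, and the three toy reads are hypothetical and made up for illustration; only the adapter sequence comes from the original post.

```r
# Hypothetical helper: trim a 3' adapter using exact string matching only.
# Assumption: no mismatches or indels are tolerated.
trim_3prime_adapter <- function(reads, adapter, min_overlap = 3L) {
  trimmed <- reads
  n <- nchar(reads)

  # Case 1: the full adapter occurs inside the read; cut at its first occurrence.
  hit <- as.integer(regexpr(adapter, reads, fixed = TRUE))
  has_full <- hit > 0L
  trimmed[has_full] <- substr(reads[has_full], 1L, hit[has_full] - 1L)

  # Case 2: only a prefix of the adapter is present at the very end of the read.
  # Longer overlaps are tried first so the longest match wins.
  todo <- !has_full
  if (nchar(adapter) - 1L >= min_overlap) {
    for (k in seq(nchar(adapter) - 1L, min_overlap)) {
      prefix <- substr(adapter, 1L, k)
      match_k <- todo & n >= k & substr(reads, n - k + 1L, n) == prefix
      trimmed[match_k] <- substr(reads[match_k], 1L, n[match_k] - k)
      todo <- todo & !match_k
    }
  }
  trimmed
}

adapter <- "ACGGATTGTTCAGT"            # adapter from the original post
reads <- c(
  "TTGACCATGAACGGATTGTTCAGTAA",        # toy read: full adapter inside the read
  "GGCATTAGCCATACGGATT",               # toy read: 7-base adapter prefix at the 3' end
  "GGCATTAGCCATTTGACA"                 # toy read: no adapter
)
trimmed <- trim_3prime_adapter(reads, adapter)
print(trimmed)
# "TTGACCATGA" "GGCATTAGCCAT" "GGCATTAGCCATTTGACA"
```

With real data, `reads` would be something like `as.character(sread(fq4))`. Raising `min_overlap` makes it less likely that a genuine read ending is trimmed by chance, at the cost of leaving short adapter remnants in place.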
Apparently Novoalign can also cope with adapters at both ends, but only the licensed version will take care of the 5' end adapter.

Any more thoughts?

Cheers,

Dave

> Date: Thu, 8 Jan 2009 16:04:35 +0000
> From: [email protected]
> To: [email protected]
> Subject: Re: [Bioc-sig-seq] adapter removal
> CC: [email protected]
>
> 2009/1/8 David A.G <[email protected]>:
> >
> > Dear list,
> >
> > I have some experience with Bioconductor but am a newbie to this list and
> > to NGS. I am trying to remove some adapters from my Solexa
> > s_N_sequence.txt file using the Biostrings and ShortRead packages and
> > their vignettes. I managed to read in the text file and save the reads as
> > follows:
> >
> > fqpattern <- "s_4_sequence.txt"
> > f4 <- file.path(analysisPath(sp), fqpattern)
> > fq4 <- readFastq(sp, fqpattern)
> > reads <- sread(fq4)  # "reads" contains more than 4 million 34-length fragments
> >
> > Having the following adapter sequence:
> >
> > adapter <- DNAString("ACGGATTGTTCAGT")
> >
> > I tried to mimic the example in the Biostrings vignette as follows:
> >
> > myAdapterAligns <- pairwiseAlignment(reads, adapter, type = "overlap")
> >
> > but after more than two hours the process is still running.
> >
> > I am running R 2.8.0 on a 64-bit Linux machine (Kubuntu 2.6.24) with 4 GB
> > RAM, and I only have some 30 MB of free RAM left. I found a thread on
> > adapter removal, but it does not clarify things much for me since, as far
> > as I understood, the option mentioned in the thread is not appropriate
> > (quote: "(though apparently this is not entirely satisfactory, see the
> > second entry!)").
> >
> > Is this just a memory issue, or am I doing something wrong? Shall I leave
> > the process to run for longer?
> >
> > TIA for your help,
> >
> > Dave
>
> Hi Dave
>
> I think a stand-alone C program may be more appropriate for the task
> you are trying to perform. I'm new to NGS myself, but I believe there
> are many programs available to do this. I think the convenience of
> using R naturally results in a performance hit on some intensive
> algorithms.
>
> Try asking your question over here:
>
> http://seqanswers.com/
>
> or is there a better mailing list?
>
> Cheers,
>
> Dan.

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
