Quick question: By looking at your CoverageDistribution.txt file, I can see that there are numerous low-coverage erroneous k-mers.
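As a rough illustration of that kind of check, here is a minimal Python sketch that estimates what share of k-mers sit at very low coverage. It assumes the file is a plain two-column histogram (coverage, then number of k-mers at that coverage) with optional `#` comment lines; that layout, the function name `low_coverage_fraction`, the threshold of 3, and the sample numbers are all assumptions made for illustration, not values taken from the file discussed here.

```python
import io


def low_coverage_fraction(lines, threshold=3):
    """Return the fraction of k-mers whose coverage is <= threshold.

    `lines` is any iterable of text lines holding two whitespace-separated
    integer columns: coverage and number of k-mers (assumed layout).
    """
    low = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        coverage, count = (int(field) for field in line.split()[:2])
        total += count
        if coverage <= threshold:
            low += count
    return low / total if total else 0.0


# Made-up histogram, for illustration only.
sample = io.StringIO(
    "#Coverage\tNumberOfVertices\n"
    "1\t900\n"
    "2\t60\n"
    "3\t40\n"
    "20\t500\n"
    "21\t500\n"
)
fraction = low_coverage_fraction(sample)
print(f"{fraction:.2f} of k-mers have coverage <= 3")
```

A large fraction at coverage 1 to 3 relative to the main peak is what one would expect from many sequencing errors, although the right threshold depends on the data set.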
I would say that these reads are neither Illumina reads nor 454 reads, although I might be wrong.

Therefore, are these color-space reads that you converted to FASTA or FASTQ using SAET from SOLiD? Just asking.

Also, if you happen to work with color-space data, do you know a way to convert color-space contigs into nucleotide-space contigs?

Sébastien

On Mon, 2011-06-20 at 06:59 -0400, Phillip San Miguel wrote:
> Hi Sébastien,
> Here it is:
>
> http://pastebin.com/Mv5ziDVk
>
> Regards,
> Phillip
>
> On 6/17/2011 12:50 PM, Sébastien Boisvert wrote:
> > What is the content of the CoverageDistribution.txt file with the
> > smoothing code?
> >
> > On Thu, 2011-06-16 at 11:32 -0400, Rick Westerman wrote:
> >> Sebastian:
> >>
> >> Now that I look at the source closer, the version number in the
> >> v.1.6.1-rc1 link is set to 1.6.0; however, the code itself looks like
> >> you have put in smoothing, therefore I am confident that we have the
> >> latest and greatest. I've recompiled a couple of times, including a
> >> time when I changed the version number just to be 100% sure that we
> >> are running the latest code.
> >> However, our overall problem of short contigs continues:
> >>
> >> The 'LibraryStatistics' file looks like:
> >>
> >> -------------------
> >>
> >> File: ../FastQ/000617_TL3360_both.fastq
> >> NumberOfSequences: 13001302
> >>
> >> Total: 13001302
> >>
> >> NumberOfPairedLibraries: 1
> >>
> >> LibraryNumber: 0
> >> InputFormat: Interleaved,Paired
> >> DetectionType: Automatic
> >> File: ../FastQ/000617_TL3360_both.fastq
> >> NumberOfSequences: 13001302
> >> AverageOuterDistance: 1018
> >> StandardDeviation: 963
> >> DetectionFailure: Yes
> >>
> >> -------------------
> >>
> >> The old file looks like:
> >>
> >> -------------------
> >>
> >> File: ../FastQ/000617_TL3360_both.fastq
> >> NumberOfSequences: 13001302
> >>
> >> Total: 13001302
> >>
> >> NumberOfPairedLibraries: 1
> >>
> >> LibraryNumber: 0
> >> InputFormat: Interleaved,Paired
> >> DetectionType: Automatic
> >> File: ../FastQ/000617_TL3360_both.fastq
> >> NumberOfSequences: 13001302
> >> AverageOuterDistance: 385
> >> StandardDeviation: 628
> >> DetectionFailure: Yes
> >>
> >> -------------------
> >>
> >> So obviously we are getting a different (and perhaps better) distance
> >> distribution.
> >>
> >> I am going to try some other parameters out, but any suggestions would
> >> be useful.
> >>
> >> Thanks,
> >> -- Rick
> >>
> >> ----- Original Message -----
> >>> Sebastian:
> >>>
> >>> The download link for v1.6.1-rc1 brings up 1.6.0 code. At least that's
> >>> what the version says. We are also still having problems with our
> >>> sequences. Could you look into this problem? Perhaps create another
> >>> download?
> >>>
> >>> Thanks,
> >>>
> >>> --
> >>> Rick Westerman
> >>> wester...@purdue.edu
> >>>
> >>> Bioinformatics specialist at the Genomics Facility.
> >>> Phone: (765) 494-0505 FAX: (765) 496-7255
> >>> Department of Horticulture and Landscape Architecture
> >>> 625 Agriculture Mall Drive
> >>> West Lafayette, IN 47907-2010
> >>> Physically located in room S049, WSLR building

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users