Hi!

Peter Huang wrote:
> Hi Ray group,
>
> I am using Ray to assemble our sequencing data. Some of the reads were
> mis-assembled onto our final scaffolds, and we have many low-coverage
> contigs hanging around, so I am curious whether there is a flag to
> eliminate contigs with low coverage, such as 5 or 10. (I know Ray has a
> flag to limit contig length.)

I just added this option:

        -use-minimum-seed-coverage minimumSeedCoverageDepth
               Sets the minimum seed coverage depth.
               Any path with a coverage depth lower than this will be discarded.
               The default is 0.


Example: -use-minimum-seed-coverage 40


You will need to install Ray (and RayPlatform) from the git repository.

The changes:

  MANUAL_PAGE.txt                         |    4 ++++
  code/application_core/Parameters.cpp    |    5 ++++-
  code/plugin_SeedingData/SeedingData.cpp |   13 ++++++++++++-
  code/plugin_SeedingData/SeedingData.h   |    6 ++++++
  4 files changed, 26 insertions(+), 2 deletions(-)


This option will also be available in Ray v2.1.0, which should ship around
mid-September 2012.


Also, Ray writes a file containing metadata for each contig. You can filter on
the column 'Mode k-mer coverage depth':


[boiseb01@ls30 RayKmerSearchDevel]$ head TestX/BiologicalAbundances/_DeNovoAssembly/Contigs.tsv
#Contig name    K-mer length    Length in k-mers        Colored k-mers  Proportion      Mode k-mer coverage depth       K-mer observations      Total   Proportion
contig-0        21      9859    0       0       30      295770  60497522        0.00488896
contig-15       21      6874    0       0       28      192472  60497522        0.00318149
contig-16       21      3353    0       0       31      103943  60497522        0.00171814
contig-14       21      8809    0       0       32      281888  60497522        0.0046595
contig-1000000  21      139     0       0       88      12232   60497522        0.00020219
contig-1000015  21      558     0       0       58      32364   60497522        0.000534964
contig-3        21      9297    0       0       29      269613  60497522        0.0044566
contig-7        21      9644    0       0       30      289320  60497522        0.00478234
contig-27       21      12701   0       0       30      381030  60497522        0.00629827
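
If you are stuck on v2.0.0, filtering on that column after the assembly could
look roughly like the sketch below. It is only an illustration: the function
name is made up, and it assumes the file is tab-delimited with the header shown
above. The header contains two 'Proportion' columns, so the code looks columns
up by position rather than building a dictionary per row.

```python
import csv
import io


def contigs_with_min_mode_coverage(tsv_text, min_depth):
    """Return names of contigs whose mode k-mer coverage depth is >= min_depth.

    Hypothetical helper, not part of Ray. Assumes tab-delimited input whose
    first line is the header written in Contigs.tsv.
    """
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = [name.strip() for name in next(rows)]
    # 'Proportion' appears twice in the header, so use positional lookup.
    depth_col = header.index("Mode k-mer coverage depth")
    kept = []
    for row in rows:
        # Skip blank lines and any further comment lines.
        if not row or row[0].startswith("#"):
            continue
        if int(row[depth_col]) >= min_depth:
            kept.append(row[0].strip())
    return kept
```

You would call it on the file contents, for example
contigs_with_min_mode_coverage(open("Contigs.tsv").read(), 10), and then keep
only the listed contigs from the FASTA output.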


> In addition, is there a cutoff for k-mers as well, so that low-coverage
> k-mers are eliminated at an early stage of the assembly?
>

The same option applies here:

        -use-minimum-seed-coverage minimumSeedCoverageDepth
               Sets the minimum seed coverage depth.
               Any path with a coverage depth lower than this will be discarded.
               The default is 0.

(not available in v2.0.0, see above).


There is also the following option, which discards seeds that have too much
coverage:

        -use-maximum-seed-coverage
               Ignores any seed with a coverage depth above this threshold.
               The default is 4294967295.
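
To make the combined effect of the two thresholds concrete, here is a small
conceptual sketch. It is not Ray's code (the real logic lives in
code/plugin_SeedingData/SeedingData.cpp); the function name and the
(name, depth) tuples are made up for illustration. Only the two default values
come from the option descriptions above.

```python
# Defaults as documented in the option descriptions.
DEFAULT_MINIMUM_SEED_COVERAGE = 0
DEFAULT_MAXIMUM_SEED_COVERAGE = 4294967295  # 2**32 - 1


def filter_seeds(seeds,
                 minimum=DEFAULT_MINIMUM_SEED_COVERAGE,
                 maximum=DEFAULT_MAXIMUM_SEED_COVERAGE):
    """Keep (name, coverage_depth) pairs with minimum <= depth <= maximum.

    Hypothetical illustration: seeds below the minimum are discarded and
    seeds above the maximum are ignored.
    """
    return [(name, depth) for name, depth in seeds
            if minimum <= depth <= maximum]
```

With the defaults nothing is filtered out, which is why you have to set the
minimum explicitly to get rid of low-coverage paths.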


If the problem is memory usage caused by erroneous k-mers, you can increase
the number of bits in the Bloom filter:

        -bloom-filter-bits bits
               Sets the number of bits for the Bloom filter
               Default is 268435456 bits, 0 bits disables the Bloom filter.

This option was added recently, so you will need to install Ray from the git
repository.
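
As a rough guide for picking a value, the sketch below converts a bit count to
memory and estimates the false-positive rate with the usual textbook
approximation for Bloom filters. The function names are made up, and the number
of hash functions is a parameter you have to supply: I am not stating here how
many Ray uses, so treat any value you plug in as an assumption.

```python
import math

DEFAULT_BLOOM_FILTER_BITS = 268435456  # 2**28, the documented default


def bloom_filter_mebibytes(bits):
    """Memory taken by a bit array of the given size, in MiB."""
    return bits / 8 / (1024 * 1024)


def bloom_false_positive_rate(bits, distinct_items, hash_functions):
    """Textbook estimate: (1 - exp(-k * n / m)) ** k.

    bits (m) is the filter size, distinct_items (n) the number of distinct
    k-mers inserted, hash_functions (k) an assumed number of hash functions.
    """
    if bits <= 0:
        raise ValueError("0 bits disables the Bloom filter; no rate to estimate")
    exponent = -hash_functions * distinct_items / bits
    return (1.0 - math.exp(exponent)) ** hash_functions
```

For the default, bloom_filter_mebibytes(DEFAULT_BLOOM_FILTER_BITS) gives 32.0,
and for a fixed number of k-mers the estimated false-positive rate drops as you
raise the number of bits.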

There are other useful new options for tuning the distributed in-memory
storage engine; see MANUAL_PAGE.txt.


Sébastien Boisvert


> Thanks.
>
> Best,
>
> Peter


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
