I think "uniform" would be a better word for balanced amounts.



We ran some in-house simulations using 3 000 000 000 100-nucleotide reads (equivalent to an Illumina flow cell) from 1000 bacterial genomes, with proportions following a power law
(as found in nature, presumably).

We ran this on 1024 processor cores and it worked quite well, with profiling and all.
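
For illustration only, here is a minimal sketch of how read counts per genome could be drawn up for a simulation like this one. It assumes a rank-based power law (proportion of genome i proportional to i^-exponent) with an exponent of 1.0; both the functional form and the exponent are assumptions, and the helper names power_law_proportions and allocate_reads are hypothetical, not part of Ray or of the actual simulation scripts.

```python
def power_law_proportions(n_genomes, exponent=1.0):
    """Return proportions proportional to rank**(-exponent), summing to 1.

    Assumption: a simple rank-based power law; the real simulation may
    have used a different form or exponent.
    """
    weights = [rank ** -exponent for rank in range(1, n_genomes + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def allocate_reads(proportions, total_reads):
    """Split total_reads across genomes using largest-remainder rounding."""
    exact = [p * total_reads for p in proportions]
    counts = [int(x) for x in exact]
    leftover = total_reads - sum(counts)
    # Hand the remaining reads to the genomes with the largest fractional parts.
    order = sorted(range(len(exact)),
                   key=lambda i: exact[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# 1000 genomes sharing 3 000 000 000 reads, as in the simulation described above.
proportions = power_law_proportions(1000, exponent=1.0)
read_counts = allocate_reads(proportions, 3_000_000_000)
```

With these assumed settings the most abundant genome receives far more reads than the least abundant one, which is why a single global coverage peak is not expected for such a sample.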



-------- Original Message --------
Subject: Re: [Denovoassembler-users] Understanding the coverage parameters
Date:    Wed, 6 Jun 2012 17:38:51 -0400
From:    Keith Robison <keith.e.robi...@gmail.com>
To:      Sébastien Boisvert <sebastien.boisver...@ulaval.ca>



Normalized would mean we put in a balanced amount of each species; not normalized means that each species in the sample is present at a different amount (which is closer to what you would find in nature).

On Wed, Jun 6, 2012 at 5:37 PM, Sébastien Boisvert <sebastien.boisver...@ulaval.ca <mailto:sebastien.boisver...@ulaval.ca>> wrote:

   What do you mean by normalized?

   Keith Robison wrote:
    Thanks!!

    Metagenome in this case was a synthetic mixture of 96 strains,
    though they were not normalized and are probably present in about
    a 100X range of concentrations.

    On Wed, Jun 6, 2012 at 4:22 PM, Sébastien Boisvert
    <sebastien.boisver...@ulaval.ca
    <mailto:sebastien.boisver...@ulaval.ca>> wrote:

        Hello,


        Just out of curiosity, what kind of metagenomes are you dealing with?


        As of Ray v2.0.0-rc8, these 3 parameters no longer exist.
        However, the values are still written to the Analysis file because
        they are useful for single-genome assemblies.

        For a single-genome assembly, you should see a peak.

        For a metagenome or transcriptome assembly, these values will not be
        informative.
        Usually, the value is not meaningful because the code that computes
        it needs a peak.


        The méta-Ray engine, which is used for all assemblies in the 2.0.0
        series, does not rely on these values.


        In this engine, everything is computed locally because you don't
        usually see any peak for metagenomes (let's say gut microbiomes) or
        transcriptomes.
        This works well for metagenomes, transcriptomes and single genomes.
        I don't know if it works for alternative splicing.


        In fact, these global values make no sense for metagenomes or
        transcriptomes because they are computed on the whole de Bruijn graph.

        In a metagenome or a transcriptome, the abundance levels usually
        follow something like a power law, which is really not uniform.

        Basically, the technology behind the 2.0.0 series of Ray computes
        coverage distributions for local discrete objects.

        I am submitting my paper tomorrow (finally!) about that, somehow.


        Keith Robison wrote:
        > Hello!
        >
        > Could you comment on when one might want to toy with the coverage
        > options to Ray, and what sort of expected effects these would have
        >
        > Assembly options (defaults work well)
        >
        >         -minimumCoverage minimumCoverage
        >                Sets manually the minimum coverage.
        >                If not provided, it is computed by Ray automatically.
        >
        >         -peakCoverage peakCoverage
        >                Sets manually the peak coverage.
        >                If not provided, it is computed by Ray automatically.
        >
        >         -repeatCoverage repeatCoverage
        >                Sets manually the repeat coverage.
        >                If not provided, it is computed by Ray automatically.
        >
        > For a single genome dataset, the CoverageDistribution file:
        >
        > k-mer length:   21
        > Lowest coverage observed:       2
        > MinimumCoverage:        14
        > PeakCoverage:   58
        > RepeatCoverage: 102
        > Number of k-mers with at least MinimumCoverage: 17633562 k-mers
        > Percentage of vertices with coverage 2: 29.714 %
        >
        >
        > For a large metagenomic dataset, the CoverageDistribution file:
        > k-mer length:   21
        > Lowest coverage observed:       2
        > MinimumCoverage:        240
        > PeakCoverage:   241
        > RepeatCoverage: 242
        > Number of k-mers with at least MinimumCoverage: 11446138 k-mers
        > Percentage of vertices with coverage 2: 17.4967 %
        >
        > Looking at these, I'm now confused by the MinimumCoverage value -- in
        > my metagenomic example it is quite high.  Is this setting the minimum
        > coverage to be included in a contig?


        
------------------------------------------------------------------------------
        Live Security Virtual Conference
        Exclusive live event will cover all the ways today's security and
        threat landscape has changed and how IT managers can respond.
        Discussions
        will include endpoint security, mobile security and the latest
        in malware
        threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
        _______________________________________________
        Denovoassembler-users mailing list
        Denovoassembler-users@lists.sourceforge.net
        <mailto:Denovoassembler-users@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/denovoassembler-users




