I think "uniform" would be a better word for balanced amounts.
We ran some in-house simulations using 3 000 000 000 100-nucleotide
reads (equivalent to an Illumina flow cell) from 1000 bacterial genomes,
with proportions following a power law (as presumably found in nature).
We ran this on 1024 processor cores and it worked quite well, with
profiling and all.
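To make the power-law setup concrete, here is a minimal sketch of how such proportions could be generated and sampled. It is not the in-house simulation itself: the function names, the exponent of 1.0, and the scaled-down read count are all assumptions chosen for illustration.

```python
import random

def power_law_proportions(n_genomes, exponent=1.0):
    """Proportion of genome i is proportional to rank**-exponent.
    The exponent is an assumption; the original message does not give one."""
    weights = [rank ** -exponent for rank in range(1, n_genomes + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def simulate_read_counts(n_reads, proportions, seed=0):
    """Assign each read to a genome according to the given proportions."""
    rng = random.Random(seed)
    counts = [0] * len(proportions)
    for idx in rng.choices(range(len(proportions)), weights=proportions, k=n_reads):
        counts[idx] += 1
    return counts

# Scale of the simulation described in the message: 3e9 reads of 100 nt.
total_bases = 3_000_000_000 * 100

# Scaled-down run (100 000 reads instead of 3 000 000 000) so it finishes quickly.
proportions = power_law_proportions(1000)
counts = simulate_read_counts(100_000, proportions)
```

With an exponent of 1.0, the most abundant genome is 1000 times more abundant than the rarest one, so the per-genome coverage is very far from uniform.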
-------- Original Message --------
Subject: Re: [Denovoassembler-users] Understanding the coverage parameters
Date: Wed, 6 Jun 2012 17:38:51 -0400
From: Keith Robison <keith.e.robi...@gmail.com>
To: Sébastien Boisvert <sebastien.boisver...@ulaval.ca>
Normalized would mean we put in a balanced amount of each species; not
normalized means that each species in the sample is present at a
different amount (which is closer to what you would find in nature).
On Wed, Jun 6, 2012 at 5:37 PM, Sébastien Boisvert
<sebastien.boisver...@ulaval.ca> wrote:
What do you mean by normalized?
Keith Robison wrote:
Thanks!!
Metagenome in this case was a synthetic mixture of 96 strains,
though they were not normalized and are probably present in about
a 100X range of concentrations.
On Wed, Jun 6, 2012 at 4:22 PM, Sébastien Boisvert
<sebastien.boisver...@ulaval.ca> wrote:
Hello,
Just out of curiosity, what kind of metagenomes are you dealing with?
As of Ray v2.0.0-rc8, these 3 parameters no longer exist.
However, the values are still written to the Analysis file because
they are useful for single-genome assemblies.
For a single-genome assembly, you should see a peak.
For a metagenome or transcriptome assembly, these values will not be
informative.
In those cases, the value is usually not meaningful because the code
that computes it needs a peak.
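The dependence on a peak can be illustrated with a generic heuristic. The sketch below is not Ray's code: `find_coverage_peak` and both example histograms are hypothetical, and real data would likely need smoothing first. It walks down the low-coverage error tail to the first valley, then looks for a maximum above it.

```python
def find_coverage_peak(histogram):
    """histogram maps k-mer coverage -> number of k-mers with that coverage.
    Returns (valley, peak) coverages, or None when no peak can be found."""
    coverages = sorted(histogram)
    # Walk down the error tail until the counts stop decreasing.
    i = 0
    while i + 1 < len(coverages) and histogram[coverages[i + 1]] < histogram[coverages[i]]:
        i += 1
    if i + 1 >= len(coverages):
        return None  # counts only ever decrease: there is no peak
    valley = coverages[i]
    peak = max(coverages[i:], key=lambda c: histogram[c])
    if histogram[peak] <= histogram[valley]:
        return None  # flat plateau, not a real peak
    return valley, peak

# Hypothetical single-genome shape: error tail, valley at 5, peak at 8.
single_genome = {2: 1000, 3: 400, 4: 100, 5: 50, 6: 80, 7: 200, 8: 500, 9: 300, 10: 100}

# Hypothetical power-law-like shape: counts decrease monotonically.
power_law_like = {c: int(1e6 / c ** 2) for c in range(2, 50)}

single_result = find_coverage_peak(single_genome)
metagenome_result = find_coverage_peak(power_law_like)
```

On the single-genome shape the heuristic finds a valley and a peak; on the monotonically decreasing shape it finds nothing, which is the situation in which a global value stops being informative.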
The méta-Ray engine, which is used for all assemblies in the 2.0.0
series, does not rely on these values.
In this engine, everything is computed locally because you usually
don't see any peak for metagenomes (say, gut microbiomes) or
transcriptomes.
This works well for metagenomes, transcriptomes and single genomes.
I don't know if it works for alternative splicing.
In fact, these global values make no sense for metagenomes or
transcriptomes because they are computed over the entire de Bruijn graph.
In a metagenome or a transcriptome, the abundance levels usually follow
something like a power law, which is far from uniform.
Basically, the technology behind the 2.0.0 series of Ray computes
coverage distributions for local discrete objects.
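As a rough illustration of the local idea, the sketch below computes one peak coverage per path instead of one for the whole graph. The message does not say what the "local discrete objects" are, so treating them as paths of k-mers is an assumption, and `local_peak_coverage` and the example data are hypothetical rather than taken from Ray.

```python
from collections import Counter

def local_peak_coverage(path_coverages):
    """Most frequent k-mer coverage along one path (ties go to the lowest value)."""
    counts = Counter(path_coverages)
    return min(counts, key=lambda c: (-counts[c], c))

# Hypothetical k-mer coverages along two paths from the same sample.
paths = {
    "abundant_strain": [98, 101, 100, 100, 99, 100, 102],
    "rare_strain": [3, 2, 3, 3, 4, 3],
}

# One coverage estimate per path rather than a single global value.
local_peaks = {name: local_peak_coverage(cov) for name, cov in paths.items()}
```

Each path gets its own estimate (100 and 3 here), so a rare strain is not judged against a threshold derived from the abundant ones.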
I am submitting my paper on this tomorrow (finally!).
Keith Robison wrote:
> Hello!
>
> Could you comment on when one might want to toy with the coverage
> options to Ray, and what sort of expected effects these would have?
>
> Assembly options (defaults work well)
>
> -minimumCoverage minimumCoverage
> Sets manually the minimum coverage.
> If not provided, it is computed by Ray automatically.
>
> -peakCoverage peakCoverage
> Sets manually the peak coverage.
> If not provided, it is computed by Ray automatically.
>
> -repeatCoverage repeatCoverage
> Sets manually the repeat coverage.
> If not provided, it is computed by Ray automatically.
>
> For a single genome dataset, the CoverageDistribution file:
>
> k-mer length: 21
> Lowest coverage observed: 2
> MinimumCoverage: 14
> PeakCoverage: 58
> RepeatCoverage: 102
> Number of k-mers with at least MinimumCoverage: 17633562 k-mers
> Percentage of vertices with coverage 2: 29.714 %
>
>
> For a large metagenomic dataset, the CoverageDistribution file:
> k-mer length: 21
> Lowest coverage observed: 2
> MinimumCoverage: 240
> PeakCoverage: 241
> RepeatCoverage: 242
> Number of k-mers with at least MinimumCoverage: 11446138 k-mers
> Percentage of vertices with coverage 2: 17.4967 %
>
> Looking at these, I'm now confused by the MinimumCoverage value -- in
> my metagenomic example it is quite high. Is this setting the minimum
> coverage to be included in a contig?
_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users