Nils,
I see I made a mistake in my countTileDups script: it reports the wrong
variable at the end! The number of dups in the same tile is in fact greater
than the number of optical duplicates reported.
So please ignore my request for now.
Thank you,
Anna
From: Salzberg, Anna
Sent: Friday, October 10, 2014 10:26 AM
To: 'Nils Homer'
Cc: samtools-help@lists.sourceforge.net
Subject: RE: [Samtools-help] Reporting Bug - Optical Duplicates of Picard
MarkDuplicates
Dear Nils,
I installed Picard 1.122, and the number of optical duplicates was reduced by
over 25% (the estimated library size was also different in the new version).
Unfortunately, I still think that there is a bug with the number of optical
duplicates, as simply counting the number of duplicates that have the same tile
results in 3 orders of magnitude less than the MarkDuplicates optical
duplicates count.
I would *greatly* appreciate it if you could look into this as this is super
important to my lab. I have provided in my previous email 2 scripts; one of
them is a very simple script (only a few lines) that simply counts duplicates
with the same tile.
Thank you very much for your help with this issue.
Anna
From: Nils Homer [mailto:nho...@broadinstitute.org]
Sent: Thursday, October 09, 2014 4:34 PM
To: Salzberg, Anna
Cc: samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] Reporting Bug - Optical Duplicates of Picard
MarkDuplicates
I am replying to the list so others can benefit from our discussion.
The latest Picard release to support updated Illumina read names is 1.120 while
your install is 1.99. You will need to update to this version or the latest
version to get the benefit of this update.
Nils
On Thu, Oct 9, 2014 at 4:08 PM, Nils Homer <nho...@broadinstitute.org> wrote:
Could you tell us what version of Picard you are using? There was an issue
earlier with parsing read names from newer Illumina analysis software.
Nils
On Thu, Oct 9, 2014 at 3:00 PM, Salzberg, Anna <asalzb...@hmc.psu.edu> wrote:
Hello,
I am convinced that the optical duplicates count of the Picard MarkDuplicates
command is incorrect. When I wrote a script to detect optical duplicates in
my dataset, I got only ~1k optical duplicates as opposed to MarkDuplicates ~3
million. I think the problem with MarkDuplicates is tile related because I
then wrote a super simple script that simply counts how many duplicates share
the same tile, and that was < 4k, that is, 3 orders of magnitude fewer than
MarkDuplicates! The overall number of duplicates (opticals or otherwise)
matched (~7 million). I'm convinced my script is right, as it's so simple.
Remove optical duplicates script:
https://gist.github.com/annasa/eef7c30152ac296bb49b
Count duplicates in same tile:
https://gist.github.com/annasa/f5633eecf012153a3ff2
Both scripts take as input a sam file sorted on chr and startPos. They also
assume that when the sequence name is parsed by ":" then the tile is the 5th
field, x the 6th and y the 7th (e.g. HWI-ST1318:119:H89A3ADXX:1:2209:1705:6933,
where tile is '2209', x is '1705' and y is '6933'). Finally, they assume that
the file is for a single lane, as I was working with such files.
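To make the assumptions above concrete, here is a minimal Python sketch of the same-tile count. It is an illustration only, not the code in the gists linked above: the function names are hypothetical, it groups reads by chromosome, start position and tile while ignoring strand and mate information, and it treats every read after the first in a group as a duplicate.

```python
from collections import defaultdict

UNMAPPED_FLAG = 0x4  # SAM flag bit: read is unmapped


def parse_read_name(name):
    """Split a read name on ':' and return (tile, x, y).

    Assumes the layout described above: tile is the 5th field,
    x the 6th and y the 7th.
    """
    fields = name.split(":")
    return fields[4], int(fields[5]), int(fields[6])


def count_same_tile_duplicates(sam_lines):
    """Count reads that share chromosome, start position and tile
    with an earlier read (hypothetical helper, single-lane input).
    """
    group_sizes = defaultdict(int)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 4:
            continue
        if int(cols[1]) & UNMAPPED_FLAG:  # no position to compare
            continue
        tile, _x, _y = parse_read_name(cols[0])
        group_sizes[(cols[2], int(cols[3]), tile)] += 1
    # In a group of n reads, n - 1 are counted as duplicates.
    return sum(n - 1 for n in group_sizes.values() if n > 1)


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.4",
        "HWI-ST1318:119:H89A3ADXX:1:2209:1705:6933\t0\tchr1\t100",
        "HWI-ST1318:119:H89A3ADXX:1:2209:1800:7000\t1024\tchr1\t100",
        "HWI-ST1318:119:H89A3ADXX:1:1101:1800:7000\t1024\tchr1\t100",
    ]
    print(count_same_tile_duplicates(example))
```

Because the counts are kept in a dictionary, this sketch does not depend on the input being sorted; a streaming version over a sorted file would only need to hold one position's reads in memory at a time.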
This is VERY important for my lab. Please advise as soon as you can.
Thank you,
Anna
------------------------------------------------------------------------------
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help