We downloaded BAM files for a public data set and would like to extract the
raw reads and remap them. So far we have had trouble doing this with Picard
for this one particular data set, and after some investigating, the root cause
seems to be that some reads have more than one primary alignment. That is, for
a given read name and mate (1/2), there are in some cases multiple SAM records
that do not have the "not primary alignment" flag (0x100) set. This seems
"wrong" to me. Am I missing something?
In any case, right or wrong, we're having trouble working with it. SamToFastq
throws an exception, as does SortSam (we were going to sort by queryname and
then use some simple code to filter out the duplicate primary records before
converting to FASTQ). Has anybody dealt with this successfully before who
could offer some advice?
FYI: we're using an old version of Picard, so we're also trying a newer
version in case it can handle these files. We're also going to try samtools
sort followed by the filtering code. But I thought that if others had dealt
with this, they could enlighten me with their wisdom and save us from
reinventing the wheel.
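For concreteness, the filtering step we have in mind might look roughly like
the sketch below. It is only a sketch under assumptions, not tested on the
real data: it assumes the input is SAM text already sorted by queryname (for
example, the output of samtools view -h on a name-sorted BAM), it keeps
whichever primary record comes first for each read name and mate, and it
drops secondary (0x100) and supplementary (0x800) records. The function name
filter_extra_primaries is made up for illustration.

```python
# SAM flag bits, as defined in the SAM specification.
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def filter_extra_primaries(lines):
    """Yield header lines plus the first primary record per (read name, mate).

    Assumes `lines` is SAM text grouped by read name (queryname sorted).
    Which of several primary records is kept is arbitrary: the first one seen.
    """
    current_name = None
    seen_mates = set()
    for line in lines:
        if line.startswith("@"):
            # Header line: pass through unchanged.
            yield line
            continue
        name, flag_text = line.split("\t", 2)[:2]
        flag = int(flag_text)
        if flag & (SECONDARY | SUPPLEMENTARY):
            # Not a primary record, so not needed to recover the raw read.
            continue
        if name != current_name:
            # New read name: forget the mates seen for the previous one.
            current_name = name
            seen_mates = set()
        # 0x40 = mate 1, 0x80 = mate 2, 0 = unpaired read.
        mate = flag & (FIRST_IN_PAIR | SECOND_IN_PAIR)
        if mate in seen_mates:
            # An extra primary record for this name and mate: drop it.
            continue
        seen_mates.add(mate)
        yield line
```

To use it as a filter in a pipe, something like
sys.stdout.writelines(filter_extra_primaries(sys.stdin)) should do, with the
result then handed to SamToFastq.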
Michael
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help