I checked some of the offending records, and none of them had the 0x800 bit 
set.  It seems that samtools sort -n is able to handle this case, so we will 
run samtools sort -n followed by a small Java class that fixes the sort-order 
declaration in the header, eliminates the dups, and strips out the alignment 
info.  Hopefully that'll do it!
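
For illustration only, the cleanup step described above might look roughly like 
the Python sketch below. The actual class was to be written in Java and is not 
shown in this thread, so every name here (fix_header, strip_alignment, clean) is 
hypothetical. The sketch assumes queryname-sorted SAM text as input, keeps the 
first primary record seen for each {read name, end} (an arbitrary choice), and 
assumes that "stripping alignment info" means resetting the mapping fields, 
restoring the original read orientation, and dropping optional tags.

```python
# Sketch under stated assumptions; not the code used by the thread's author.
SECONDARY, SUPPLEMENTARY = 0x100, 0x800
PAIRED, UNMAPPED, MATE_UNMAPPED, REVERSE = 0x1, 0x4, 0x8, 0x10
FIRST, LAST = 0x40, 0x80
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def fix_header(line):
    """Declare queryname sort order on the @HD line; pass other headers through."""
    if not line.startswith("@HD"):
        return line
    fields = [f for f in line.split("\t") if not f.startswith("SO:")]
    fields.append("SO:queryname")
    return "\t".join(fields)


def strip_alignment(fields):
    """Return the 11 mandatory SAM fields rewritten as an unaligned record."""
    flag = int(fields[1])
    seq, qual = fields[9], fields[10]
    if flag & REVERSE:
        # Reverse-strand alignments store the reverse complement of the read.
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    new_flag = (flag & (PAIRED | FIRST | LAST)) | UNMAPPED
    if flag & PAIRED:
        new_flag |= MATE_UNMAPPED
    # Optional tags are dropped here (assumption).
    return [fields[0], str(new_flag), "*", "0", "0", "*", "*", "0", "0", seq, qual]


def clean(sam_lines):
    """Yield cleaned SAM lines from queryname-sorted input."""
    current_name, seen_ends = None, set()
    for line in sam_lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            yield fix_header(line)
            continue
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary records carry the read we want to keep
        if fields[0] != current_name:
            current_name, seen_ends = fields[0], set()
        end = flag & (FIRST | LAST)
        if end in seen_ends:
            continue  # extra primary record for this {read name, end}
        seen_ends.add(end)
        yield "\t".join(strip_alignment(fields))
```

Because the input is sorted by read name, only the ends seen for the current 
name need to be remembered, so memory use stays constant regardless of file 
size.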

Thanks for the help.

Michael

From: Alec Wysoker [mailto:al...@broadinstitute.org]
Sent: Monday, June 16, 2014 12:29 PM
To: Rusch, Michael
Cc: samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] working with file with multiple primary alignments. .

Hi Michael,

Note that some aligners, e.g. BWA, can now produce split alignments if an 
alignment cannot be represented by a coordinate + cigar string.  In a group of 
SAMRecords representing such an alignment, all but one of them should have the 
0x800 supplementary flag set.  Newer versions of Picard handle this properly.

I.e. for a given {read name, end number} there should be only one SAMRecord 
that has neither 0x800 (supplementary) nor 0x100 (secondary) flag set.
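
The rule above can be checked mechanically. Below is a minimal Python sketch, 
assuming SAM text lines as input (for example the output of samtools view -h); 
the function names are hypothetical and chosen only for this illustration. It 
counts, for each {read name, end number}, the records that have neither 0x100 
nor 0x800 set, and reports the keys that have more than one.

```python
from collections import Counter

SECONDARY = 0x100       # secondary alignment
SUPPLEMENTARY = 0x800   # supplementary alignment
FIRST_IN_PAIR = 0x40
LAST_IN_PAIR = 0x80


def is_primary(flag):
    """True if neither the secondary nor the supplementary bit is set."""
    return (flag & (SECONDARY | SUPPLEMENTARY)) == 0


def end_number(flag):
    """1 or 2 for paired reads, 0 when neither end bit is set."""
    if flag & FIRST_IN_PAIR:
        return 1
    if flag & LAST_IN_PAIR:
        return 2
    return 0


def multi_primary_keys(sam_lines):
    """Map each (read name, end number) with >1 primary record to its count."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if is_primary(flag):
            counts[(fields[0], end_number(flag))] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

An empty result means the file satisfies the rule; any key that is returned 
identifies a read end with multiple primary records.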

-Alec

On Jun 16, 2014, at 1:08 PM, Rusch, Michael <michael.ru...@stjude.org> wrote:


We downloaded some BAM files for some public data.  We would like to extract 
raw reads and remap.  We've had trouble doing this so far for this one 
particular data set using Picard, and after some investigating, the root cause 
seems to be that they have some reads that have more than one primary 
alignment.  So, for a given read name and mate (1/2), there are in some cases 
multiple SAM records that do not have the non-primary flag set.  This seems 
"wrong" to me.  Am I missing something?

In any case, right or wrong, we're having trouble working with it.  SamToFastq 
throws an exception, as does SortSam (we were going to sort by queryname and 
then use some simple code to filter out the dups before converting to FASTQ).  
Has anybody successfully dealt with this before and could provide some advice?

FYI: we're using an old version of Picard, so we're also trying a newer version 
in case it can handle this.  We're also going to try samtools sort followed by 
the filtering code.  But I thought that if others had dealt with this, they 
could enlighten me with their wisdom and save us some work reinventing the 
wheel.

Michael

________________________________
Email Disclaimer: www.stjude.org/emaildisclaimer
Consultation Disclaimer: www.stjude.org/consultationdisclaimer
------------------------------------------------------------------------------
HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
Find What Matters Most in Your Big Data with HPCC Systems
Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
Leverages Graph Analysis for Fast Processing & Easy Data Exploration
http://p.sf.net/sfu/hpccsystems
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

