Hi Michael,

Note that some aligners, e.g. BWA, can now produce split alignments when an 
alignment cannot be represented by a single coordinate + CIGAR string.  In the 
group of SAMRecords representing such an alignment, all but one of them should 
have the 0x800 (supplementary) flag set.  Newer versions of Picard handle this 
properly.

I.e. for a given {read name, end number} there should be only one SAMRecord 
that has neither 0x800 (supplementary) nor 0x100 (secondary) flag set.
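
To make that rule concrete, here is a rough sketch in Python of how one might 
check a file for violations.  It assumes plain SAM text lines as input (e.g. 
from "samtools view -h"), and the helper names (end_number, is_primary, 
offending_keys) are made up for illustration; they are not part of Picard or 
samtools.

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
FIRST_OF_PAIR = 0x40
SECOND_OF_PAIR = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def end_number(flag):
    """Return 1 or 2 for paired reads, 0 otherwise (illustrative convention)."""
    if flag & FIRST_OF_PAIR:
        return 1
    if flag & SECOND_OF_PAIR:
        return 2
    return 0


def is_primary(flag):
    """True if neither the secondary nor the supplementary bit is set."""
    return (flag & (SECONDARY | SUPPLEMENTARY)) == 0


def offending_keys(sam_lines):
    """Return sorted (read name, end number) keys with more than one primary record."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if is_primary(flag):
            counts[(name, end_number(flag))] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical records: r1 is a valid split alignment, r2 has two primaries.
example = [
    "@HD\tVN:1.4\tSO:queryname",
    "r1\t65\tchr1\t100\t60\t50M50S",     # end 1, primary
    "r1\t2113\tchr2\t500\t60\t50H50M",   # end 1, supplementary (0x800)
    "r1\t129\tchr1\t300\t60\t100M",      # end 2, primary
    "r2\t65\tchr1\t700\t60\t100M",       # end 1, primary
    "r2\t65\tchr3\t900\t60\t100M",       # end 1, primary again
    "r2\t321\tchr4\t100\t0\t100M",       # end 1, secondary (0x100)
]
print(offending_keys(example))
```

Any key this reports is a {read name, end number} with multiple records that 
have neither flag set, which is the situation described in the quoted message.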

-Alec

On Jun 16, 2014, at 1:08 PM, Rusch, Michael <michael.ru...@stjude.org> wrote:

> We downloaded some BAM files for some public data.  We would like to extract 
> raw reads and remap.  We've had trouble doing this so far for this one 
> particular data set using Picard, and after some investigating, the root 
> cause seems to be that they have some reads that have more than one primary 
> alignment.  So, for a given read name and mate (1/2), there are in some cases 
> multiple SAM records that do not have the non-primary flag set.  This seems 
> "wrong" to me.  Am I missing something?
>  
> In any case, right or wrong, we're having trouble working with it.  
> SamToFastq throws an exception, as does SortSam (we were going to sort by 
> queryname and then use some simple code to filter out the dups before 
> converting to FASTQ).  Anybody successfully dealt with this before who could 
> provide some advice?
>  
> FYI: we're using an old version of Picard, so we're trying with a newer 
> version in case the newer version can handle it.  We're also going to try 
> samtools sort followed by the filtering code.  But, I thought maybe if others 
> had dealt with this, they could enlighten me with their wisdom and save us 
> some work reinventing the wheel.
>  
> Michael
> 
> Email Disclaimer: www.stjude.org/emaildisclaimer
> Consultation Disclaimer: www.stjude.org/consultationdisclaimer
> ------------------------------------------------------------------------------
> HPCC Systems Open Source Big Data Platform from LexisNexis Risk Solutions
> Find What Matters Most in Your Big Data with HPCC Systems
> Open Source. Fast. Scalable. Simple. Ideal for Dirty Data.
> Leverages Graph Analysis for Fast Processing & Easy Data Exploration
> http://p.sf.net/sfu/hpccsystems
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
