Dear Rob and John

You are right!  When I look at the file with zless I cannot see any @SQ
headers (output attached).
The headers that I saw using samtools view -H or the grep command only had 
SN: and LN: fields.

So...
Erne is not outputting @SQ headers.
It only looks as though it does when I use samtools view, because samtools 
automatically generates basic ones.
The lack of @SQ headers causes samtools sort to fail, but only when the 
alignments are big enough to need a merge (hence the singletons worked but the 
paired did not).
Samtools 1.3.1 will not have this problem, but I must somehow be running the 
biobuild samtools 1.3 on my HPC.
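
As I understand it, the basic @SQ lines that samtools view shows can be
rebuilt from the binary reference list that every BAM file carries after its
text header. A minimal sketch in Python (standard library only), assuming the
usual BAM layout of magic, text header, then reference list; the function
name synthesize_sq_lines is made up for illustration and this is not how
samtools itself is implemented:

```python
import gzip
import struct


def synthesize_sq_lines(stream):
    """Build basic @SQ lines from the binary reference list of a BAM.

    `stream` may be a path or a file object opened in binary mode.
    """
    # BGZF is a series of gzip members, so the gzip module can read it.
    with gzip.open(stream, "rb") as bam:
        if bam.read(4) != b"BAM\x01":
            raise ValueError("not a BAM stream")
        (l_text,) = struct.unpack("<I", bam.read(4))
        bam.read(l_text)  # skip the text header, which may lack @SQ lines
        (n_ref,) = struct.unpack("<I", bam.read(4))
        lines = []
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<I", bam.read(4))
            # Reference names are stored NUL-terminated.
            name = bam.read(l_name).rstrip(b"\x00").decode("ascii")
            (l_ref,) = struct.unpack("<I", bam.read(4))
            lines.append(f"@SQ\tSN:{name}\tLN:{l_ref}")
    return lines
```

Each line produced this way has only SN and LN fields, which matches what I
saw in the samtools view output.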

I have just set off samtools sort jobs specifying the direct path to samtools 
1.3.1 - they should finish by the end of today.
If they work I will contact the erne authors to ask them to output @SQ headers 
in future.
If they don't work I will be back in touch.

Thank you so much for this help. You can tell I am a newbie but I will try to 
use this mailing list wisely and sparingly.

Jo
-----Original Message-----
From: Robert Davies [mailto:r...@sanger.ac.uk] 
Sent: 04 January 2017 10:24
To: John Marshall <j...@sanger.ac.uk>
Cc: Holbrook J. <j.holbr...@soton.ac.uk>; samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] samtools sort for paired-end .bam cannot find 
chromosome name in text header

On Wed, 4 Jan 2017, John Marshall wrote:


> TL;DR i.e. what Rob said.  But note that when you output SAM with
> samtools view, samtools appends basic @SQ headers if there aren't
> already any.  So Rob's `samtools view -H ... | grep '^@SQ'` will
> display some headers whether the file contains textual headers or not.
> Removing the grep will allow some educated guesses to be made about
> whether @SQ headers seen with samtools view are synthetic or really in
> the input file: if there are any other (e.g. @RG) headers *after* the
> block of @SQ headers, or if the @SQ headers have any fields beyond SN
> and LN, then they are definitely real.
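
The two checks John describes could be scripted roughly as follows. This is
only a sketch, assuming the text passed in is the full output of
`samtools view -H` without the grep; the function name is hypothetical:

```python
def sq_headers_definitely_real(header_text):
    """Apply the two checks from the quoted message to header text.

    True  -> the @SQ lines must come from the file's own text header.
    False -> inconclusive: they may be real or generated by samtools.
    """
    seen_sq = False
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "@SQ":
            seen_sq = True
            # Collect the two-letter tags on this @SQ line.
            tags = {field.split(":", 1)[0] for field in fields[1:]}
            if tags - {"SN", "LN"}:
                return True  # a field beyond SN and LN
        elif seen_sq and fields[0].startswith("@"):
            return True  # another header line after the @SQ block
    return False
```

A False result does not prove the headers are synthetic, only that neither
sign of real headers was found.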

Yes, I've just spotted this - samtools view fixes the problem too.

Looking at the bam file with 'zless' might be the quickest way of checking for 
@SQ lines.  The text header is fairly readable at the start.
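
The same check can be done without a pager. A minimal sketch in Python
(standard library only), assuming an ordinary BGZF-compressed BAM whose text
header directly follows the 4-byte magic and a 4-byte length; the function
names are made up for illustration:

```python
import gzip
import struct


def read_bam_text_header(stream):
    """Return the plain-text header of a BAM (path or binary file object)."""
    # BGZF is a series of gzip members, so the gzip module can read it.
    with gzip.open(stream, "rb") as bam:
        if bam.read(4) != b"BAM\x01":
            raise ValueError("not a BAM stream")
        (l_text,) = struct.unpack("<I", bam.read(4))
        text = bam.read(l_text)
    # The text may be NUL-padded; strip that before decoding.
    return text.rstrip(b"\x00").decode("ascii", errors="replace")


def has_sq_lines(header_text):
    """True if the text header itself contains at least one @SQ line."""
    return any(line.startswith("@SQ\t") for line in header_text.splitlines())
```

Unlike samtools view -H, this reads only what is stored in the file, so no
generated @SQ lines can appear in the result.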

Rob Davies              r...@sanger.ac.uk
The Sanger Institute    http://www.sanger.ac.uk/
Hinxton, Cambs.,        Tel. +44 (1223) 834244
CB10 1SA, U.K.          Fax. +44 (1223) 494919


[jh1m15@cyan01 SureSelect]$ zless Sample1b_unmasked.bam 
BAM^A<EB>^C^@^@@HD      VN:1.0  SO:unsorted
@PG     ID:ERNE VN:2.1.1        CL: erne-bs5 --reference unmasked_htable.ebm --query1 Trimmed/Sample_1b-1/Sample1b_catR1.fastq --query2 Trimmed/Sample_1b-1/Sample1b_catR2.fastq --output Sample1b_unmasked.bam
@CO     1481027931: ERNE version 2.1.1
@CO     1481027931: --query1           = Trimmed/Sample_1b-1/Sample1b_catR1.fastq
@CO     1481027931: --query2           = Trimmed/Sample_1b-1/Sample1b_catR2.fastq
@CO     1481027931: --output           = Sample1b_unmasked.bam
@CO     1481027931: --sample           = no_sample_specified
@CO     1481027931: --reference        = unmasked_htable.ebm
@CO     1481027931: --contamination-reference = 
@CO     1481027931: --auto-errors      = true
@CO     1481027931: --errors-rate      = 15
@CO     1481027931: --errors           = 0
@CO     1481027931: --threads          = 1
@CO     1481027931: --min-size         = 25
@CO     1481027931: --min-phred-value-CLC = 20
@CO     1481027931: --min-mean-phread-quality = 20
@CO     1481027931: --no-quality-check = false
@RG     ID:1481027931   SM:no_sample_specified
[binary data: BAM reference list (chr1, chr10, chr11, ..., chrX, chrY) followed by the start of the alignment records]