Dear Rob and John
You are right! Looking at the file with zless, I can't see any @SQ lines in
the text header (output attached).
The @SQ headers that I saw using samtools view -H or the grep command only had
SN: and LN: fields.
So...
Erne is not outputting @SQ headers.
It only looks as if it does when I use samtools view, because samtools
automatically generates basic ones.
The missing @SQ headers cause samtools sort to fail, but only when the
alignments are big enough to need a merge (hence the singletons worked but the
paired-end files did not).
Samtools 1.3.1 will not have this problem, but I must somehow be running the
biobuild samtools 1.3 on my HPC.
I have just set off samtools sort jobs specifying the direct path to samtools
1.3.1 - they should finish by the end of today.
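
For anyone else unsure which samtools a job is picking up, here is a minimal
Python sketch of the check. It assumes the first line of `samtools --version`
looks like "samtools 1.3.1"; the function names parse_samtools_version and
report_samtools are made up for illustration.

```python
import shutil
import subprocess


def parse_samtools_version(version_output):
    """Return the version from the first line of `samtools --version` output.

    Assumes the first line looks like 'samtools 1.3.1'; returns None otherwise.
    """
    lines = version_output.splitlines()
    parts = lines[0].split() if lines else []
    if len(parts) >= 2 and parts[0] == "samtools":
        return parts[1]
    return None


def report_samtools(executable="samtools"):
    """Return (resolved path, version) for a command name or an explicit path.

    Returns (None, None) when nothing executable is found.
    """
    path = shutil.which(executable)
    if path is None:
        return None, None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, parse_samtools_version(result.stdout)
```

Calling report_samtools() inside the job script shows which binary the PATH
resolves to; passing a direct path instead checks that specific install.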
If they work I will contact the erne authors to ask them to output @SQ headers
in future.
If they don't work I will be back in touch.
Thank you so much for this help. You can tell I am a newbie, but I will try to
use this mailing list wisely and sparingly.
Jo
-----Original Message-----
From: Robert Davies [mailto:r...@sanger.ac.uk]
Sent: 04 January 2017 10:24
To: John Marshall <j...@sanger.ac.uk>
Cc: Holbrook J. <j.holbr...@soton.ac.uk>; samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] samtools sort for paired-end .bam cannot find
chromosome name in text header
On Wed, 4 Jan 2017, John Marshall wrote:
> TL;DR i.e. what Rob said. But note that when you output SAM with
> samtools view, samtools appends basic @SQ headers if there aren't
> already any. So Rob's `samtools view -H ... | grep '^@SQ'` will
> display some headers whether the file contains textual headers or not.
> Removing the grep will allow some educated guesses to be made about
> whether @SQ headers seen with samtools view are synthetic or really in
> the input file: if there are any other (e.g. @RG) headers *after* the
> block of @SQ headers, or if the @SQ headers have any fields beyond SN
> and LN, then they are definitely real.
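
The educated guess described above can be sketched in Python. This is only an
illustration of the two rules in the quoted text, applied to header text
already captured from `samtools view -H`; the function name
classify_sq_headers and its three return labels are hypothetical.

```python
def classify_sq_headers(header_text):
    """Guess whether @SQ lines seen in `samtools view -H` output are real.

    Returns:
      "absent"       - no @SQ lines at all
      "real"         - another header follows the @SQ block, or an @SQ line
                       carries a field other than SN and LN
      "undetermined" - only bare SN/LN lines at the very end, which could be
                       real or could have been generated by samtools
    """
    lines = [line for line in header_text.splitlines() if line.startswith("@")]
    sq_positions = [i for i, line in enumerate(lines) if line.startswith("@SQ")]
    if not sq_positions:
        return "absent"

    # Rule 1: any @SQ field beyond SN and LN means the line came from the file.
    for i in sq_positions:
        tags = [field.split(":", 1)[0] for field in lines[i].split("\t")[1:]]
        if any(tag not in ("SN", "LN") for tag in tags):
            return "real"

    # Rule 2: any other header line after the last @SQ line means the @SQ
    # block was not simply appended at the end.
    if sq_positions[-1] < len(lines) - 1:
        return "real"

    return "undetermined"
```

An "undetermined" result is not proof that the headers are synthetic; it only
means that this output alone cannot settle the question.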
Yes, I've just spotted this - samtools view fixes the problem too.
Looking at the bam file with 'zless' might be the quickest way of checking for
@SQ lines. The text header is fairly readable at the start.
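
A scripted equivalent of the zless check, as a minimal Python sketch. It
relies on BAM being gzip-compatible (BGZF) and beginning with the magic bytes
"BAM\1", a 4-byte little-endian length, and then the text header. The function
names read_bam_text_header and stored_sq_lines are made up for illustration.

```python
import gzip
import struct


def read_bam_text_header(path):
    """Return the text header stored in a BAM file, without any additions."""
    with gzip.open(path, "rb") as handle:
        magic = handle.read(4)
        if magic != b"BAM\x01":
            raise ValueError("not a BAM file: bad magic %r" % magic)
        # l_text: length of the text header, little-endian 32-bit integer.
        (l_text,) = struct.unpack("<i", handle.read(4))
        text = handle.read(l_text)
    # The text may be NUL-padded; strip the padding before decoding.
    return text.rstrip(b"\x00").decode("utf-8", errors="replace")


def stored_sq_lines(path):
    """Return only the @SQ lines that are really present in the text header."""
    header = read_bam_text_header(path)
    return [line for line in header.splitlines() if line.startswith("@SQ")]
```

An empty list from stored_sq_lines means the file has no textual @SQ headers,
whatever samtools view shows.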
Rob Davies r...@sanger.ac.uk
The Sanger Institute http://www.sanger.ac.uk/
Hinxton, Cambs., Tel. +44 (1223) 834244
CB10 1SA, U.K. Fax. +44 (1223) 494919
[jh1m15@cyan01 SureSelect]$ zless Sample1b_unmasked.bam
BAM^A<EB>^C^@^@@HD VN:1.0 SO:unsorted
@PG ID:ERNE VN:2.1.1 CL: erne-bs5 --reference unmasked_htable.ebm --query1 Trimmed/Sample_1b-1/Sample1b_catR1.fastq --query2 Trimmed/Sample_1b-1/Sample1b_catR2.fastq --output Sample1b_unmasked.bam
@CO 1481027931: ERNE version 2.1.1
@CO 1481027931: --query1 = Trimmed/Sample_1b-1/Sample1b_catR1.fastq
@CO 1481027931: --query2 = Trimmed/Sample_1b-1/Sample1b_catR2.fastq
@CO 1481027931: --output = Sample1b_unmasked.bam
@CO 1481027931: --sample = no_sample_specified
@CO 1481027931: --reference = unmasked_htable.ebm
@CO 1481027931: --contamination-reference =
@CO 1481027931: --auto-errors = true
@CO 1481027931: --errors-rate = 15
@CO 1481027931: --errors = 0
@CO 1481027931: --threads = 1
@CO 1481027931: --min-size = 25
@CO 1481027931: --min-phred-value-CLC = 20
@CO 1481027931: --min-mean-phread-quality = 20
@CO 1481027931: --no-quality-check = false
@RG ID:1481027931 SM:no_sample_specified
]^@^@^@^E^@^@^@chr1^@=C<DB>^N^F^@^@^@chr10^@<9B>^X^T^H^F^@^@^@chr11^@4
^L^H^V^@^@^@chr11_gl000202_random^@<A7><9C>^@^@^F^@^@^@chr12^@<F7>j<FA>^G^F^@^@^@chr13^@VZ<DD>^F^F^@^@^@chr14^@$^Ff^F^F^@^@^@chr15^@@<81>^\^F^F^@^@^@chr16^@A<B4>b^E^F^@^@^@chr17^@<CA><F0><D6>^D^P^@^@^@chr17_ctg5_hap1^@<BC><A5>^Y^@^V^@^@^@chr17_gl000203_random^@z<92>^@^@^V^@^@^@chr17_gl000204_random^@<9E>=^A^@^V^@^@^@chr17_gl000205_random^@<FC><A9>^B^@^V^@^@^@chr17_gl000206_random^@)<A0>^@^@^F^@^@^@chr18^@@]<A7>^D^V^@^@^@chr18_gl000207_random^@<A6>^P^@^@^F^@^@^@chr19^@<97><<86>^C^V^@^@^@chr19_gl000208_random^@^Qj^A^@^V
^@^@^@chr19_gl000209_random^@<C1>m^B^@^U^@^@^@chr1_gl000191_random^@<C1><9F>^A^@^U^@^@^@chr1_gl000192_random^@<A8>^@^E^@^@^@chr2^@<8D><ED>~^N^F^@^@^@chr20^@p<B1><C1>^C^F^@^@^@chr21^@gg<DE>^B^V^@^@^@chr21_gl000210_random^@"l^@^@^F^@^@^@chr22^@v<D8>^N^C^E^@^@^@chr3^@^^<95><CD>^K^E^@^@^@chr4^@d<C8>d
^K^O^@^@^@chr4_ctg9_hap1^@Z^B
^@^U^@^@^@chr4_gl000193_random^@]<E5>^B^@^U^@^@^@chr4_gl000194_random^@<ED><EB>^B^@^E^@^@^@chr5^@<<8C><C8>
^E^@^@^@chr6^@;^B3
^N^@^@^@chr6_apd_hap1^@<D2><87>F^@^N^@^@^@chr6_cox_hap2^@<EB>+I^@^N^@^@^@chr6_dbb_hap3^@\YF^@^O^@^@^@chr6_mann_hap4^@<FF>uG^@^N^@^@^@chr6_mcf_hap5^@v<C0>I^@^N^@^@^@chr6_qbl_hap6^@<90>_F^@^O^@^@^@chr6_ssto_hap7^@74K^@^E^@^@^@chr7^@gC|
^U^@^@^@chr7_gl000195_random^@p<CA>^B^@^E^@^@^@chr8^@vV
<B9>^H^U^@^@^@chr8_gl000196_random^@^B<98>^@^@^U^@^@^@chr8_gl000197_random^@7<91>^@^@^E^@^@^@chr9^@<F7><BE>^U^@^@^@chr9_gl000198_random^@<E5>_^A^@^U
^@^@^@chr9_gl000199_random^@<92><97>^B^@^U^@^@^@chr9_gl000200_random^@<9B><DA>^B^@^U^@^@^@chr9_gl000201_random^@4<8D>^@^@^E^@^@^@chrM^@<BB>@^@^@^O^@
^@^@chrUn_gl000211^@<A6><8A>^B^@^O^@^@^@chrUn_gl000212^@<EA><D9>^B^@^O^@^@^@chrUn_gl000213^@<8F><81>^B^@^O^@^@^@chrUn_gl000214^@<F6>^Y^B^@^O^@^@^@chrUn_gl000215^@^A<A2>^B^@^O^@^@^@chrUn_gl000216^@^F<A1>^B^@^O^@^@^@chrUn_gl000217^@u<A0>^B^@^O^@^@^@chrUn_gl000218^@{u^B^@^O^@^@^@chrUn_gl000219^@<FE>
<BB>^B^@^O^@^@^@chrUn_gl000220^@
x^B^@^O^@^@^@chrUn_gl000221^@^E_^B^@^O^@^@^@chrUn_gl000222^@<ED><D9>^B^@^O^@^@^@chrUn_gl000223^@<E7><C0>^B^@^O^@^@^@chrUn_gl000224^@<ED><BD>^B^@^O^@
^@^@chrUn_gl000225^@<E5>8^C^@^O^@^@^@chrUn_gl000226^@<A0>:^@^@^O^@^@^@chrUn_gl000227^@v<F5>^A^@^O^@^@^@chrUn_gl000228^@`<F8>^A^@^O^@^@^@chrUn_gl000229^@<C9>M^@^@^O^@^@^@chrUn_gl000230^@<AB><AA>^@^@^O^@^@^@chrUn_gl000231^@<FA>j^@^@^O^@^@^@chrUn_gl000232^@<CC><9E>^@^@^O^@^@^@chrUn_gl000233^@u<B3>^@
^@^O^@^@^@chrUn_gl000234^@S<9E>^@^@^O^@^@^@chrUn_gl000235^@<AA><86>^@^@^O^@^@^@chrUn_gl000236^@<CE><A3>^@^@^O^@^@^@chrUn_gl000237^@+<B3>^@^@^O^@^@^@chrUn_gl000238^@^C<9C>^@^@^O^@^@^@chrUn_gl000239^@
<84>^@^@^O^@^@^@chrUn_gl000240^@<CD><A3>^@^@^O^@^@^@chrUn_gl000241^@<A8><A4>^@^@^O^@^@^@chrUn_gl000242^@^C<AA>^@^@^O^@^@^@chrUn_gl000243^@M<A9>^@^@^O^@^@^@chrUn_gl000244^@<F9><9B>^@^@^O^@^@^@chrUn_gl000245^@+<8F>^@^@^O^@^@^@chrUn_gl000246^@
<95>^@^@^O^@^@^@chrUn_gl000247^@F<8E>^@^@^O^@^@^@chrUn_gl000248^@j<9B>^@^@^O^@^@^@chrUn_gl000249^@f<96>^@^@^E^@^@^@chrX^@<A0>=A
^E^@^@^@chrY^@<FE>
<F7><89>^C^F^A^@^@^B^@^@^@B^S^R^A%<<91>^V^C^@q^@_^@^@^@^B^@^@^@&^S^R^A<B7><FF><FF><FF>K00210:41:HFYLTBBXX:2:1101
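
The binary block after the @RG line in the dump above is the BAM reference
list: a count, then a name and a length for each sequence. If I am reading it
correctly, `]^@^@^@` is the count (93) and the four bytes after chr1 decode to
249250621. This is the information from which basic @SQ lines can be
generated even when the text header has none. A minimal Python sketch of
reading it follows; the function names are hypothetical, and basic_sq_lines
only imitates the SN/LN lines seen in samtools view output, it is not samtools'
own code.

```python
import gzip
import struct


def read_bam_references(path):
    """Return [(name, length), ...] from the binary reference list of a BAM."""
    with gzip.open(path, "rb") as handle:
        if handle.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file")
        (l_text,) = struct.unpack("<i", handle.read(4))
        handle.read(l_text)  # skip over the text header
        (n_ref,) = struct.unpack("<i", handle.read(4))
        references = []
        for _ in range(n_ref):
            # l_name counts the trailing NUL byte of the name.
            (l_name,) = struct.unpack("<i", handle.read(4))
            name = handle.read(l_name).rstrip(b"\x00").decode("ascii")
            (l_ref,) = struct.unpack("<i", handle.read(4))
            references.append((name, l_ref))
    return references


def basic_sq_lines(references):
    """Build minimal @SQ lines (SN and LN only) from (name, length) pairs."""
    return ["@SQ\tSN:%s\tLN:%d" % (name, length) for name, length in references]
```

Comparing basic_sq_lines(read_bam_references(path)) with the text header shows
which @SQ lines would have to be synthesised for a given file.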
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help