On Mon, 16 Jan 2017, Colin Hercus wrote:

> Sometimes we get reads to align that have been trimmed to zero length and
> I'm wondering how these should be represented in SAM format.
>
> Here's a pair as reported by Novoalign that had been trimmed by cutadapt
> and one read of the pair is zero length
>
> READID 77 * 0 0 * * 0 0 * PG:Z:novoalign
> READID 141 * 0 0 * * 0 0 GTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAGGGG EEDDB:=<;A9/=C=@A;:<,1:<?@.0<./;;;AC.;;5@:: PG:Z:novoalign
>
> The first read of the pair has a zero length SEQ field.
>
> This pair fails with a parse error in Samtools Version: 1.2 (using htslib
> 1.2.1) but is accepted by Samtools Version: 0.1.19-44428cd.
>
> What is a valid SAM record for a zero length read?
The sequence should be '*' rather than blank. In fact, the latest version of
samtools seems to correct your record to this instead of complaining about it:

  cat > /tmp/test.sam
  READID 77 * 0 0 * * 0 0 * PG:Z:novoalign
  READID 141 * 0 0 * * 0 0 GTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAGGGG EEDDB:=<;A9/=C=@A;:<,1:<?@.0<./;;;AC.;;5@:: PG:Z:novoalign

  samtools view /tmp/test.sam
  READID 77 * 0 0 * * 0 0 * * PG:Z:novoalign
  READID 141 * 0 0 * * 0 0 GTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAGGGG EEDDB:=<;A9/=C=@A;:<,1:<?@.0<./;;;AC.;;5@:: PG:Z:novoalign

Rob Davies              r...@sanger.ac.uk
The Sanger Institute    http://www.sanger.ac.uk/
Hinxton, Cambs.,        Tel. +44 (1223) 834244
CB10 1SA, U.K.          Fax. +44 (1223) 494919
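
For anyone who wants to fix such records before they reach samtools, the
correction described above (writing '*' where SEQ is blank) can be sketched in
Python. This is only an illustration, not what samtools does internally. The
function name normalize_sam_line is made up for this example, and it assumes
the record is tab-separated, with the blank SEQ appearing as an empty field
between two tabs. It also sets QUAL to '*' whenever SEQ is '*', because the
SAM specification does not allow quality values without a sequence.

```python
def normalize_sam_line(line):
    """Return one SAM line with a blank SEQ or QUAL field replaced by '*'.

    Hypothetical helper for illustration; assumes tab-separated fields.
    """
    line = line.rstrip("\n")
    if line.startswith("@"):
        return line  # header lines pass through unchanged

    fields = line.split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM record needs at least 11 mandatory fields")

    # Field 10 (index 9) is SEQ, field 11 (index 10) is QUAL.
    if fields[9] == "":
        fields[9] = "*"
    # QUAL cannot be stored without SEQ, and must not be blank either.
    if fields[9] == "*" or fields[10] == "":
        fields[10] = "*"

    return "\t".join(fields)


if __name__ == "__main__":
    # The zero-length read from the question, with SEQ as an empty field.
    record = "READID\t77\t*\t0\t0\t*\t*\t0\t0\t\t*\tPG:Z:novoalign"
    print(normalize_sam_line(record))
```

Records that already have a non-blank SEQ and QUAL, such as the second read of
the pair, come back unchanged.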