Thanks, I documented this in my code. It looks like I am running into a problem reading back BAM content I have written (reading SAM content isn't a problem). Attached are a pair of BAM and SAM files I generated the same way. The only difference is that I passed "wb" instead of "w" to sam_open in order to write the BAM file. When I read the BAM file back in, it fails and I see the following printed to standard output:
[W::bam_hdr_read] EOF marker is absent. The input is probably truncated

In order to write the SAM/BAM files I make the following calls:

    sam_open
    sam_hdr_write
    sam_write1
    sam_close

This is with the most recent release of htslib. When writing a BAM file I find:

    outputFile->format.compression = 2
    outputFile->format.compression_level = -1

immediately after calling sam_open (outputFile is a samFile*).

From reading the SAM/BAM specification it sounds like I'm missing a 28 byte EOF at the end of my file. Am I expected to write those 1f 8b .. 00 00 bytes myself manually, or was htslib supposed to have done that for me, or is there some htslib command I have failed to call?

Note that reading the header for this file works, but sam_read1 fails, so I cannot parse the various reads.

Will

On Wed, Nov 7, 2018 at 6:28 AM James Bonfield <j...@sanger.ac.uk> wrote:

> On Tue, Nov 06, 2018 at 10:47:36AM -0500, Will Stokes wrote:
> > Sorry I was unclear. I'm referring to data included following the name,
> > cigar, sequence, and quality, aka at the end of the bam_t.data block.
> >
> > int extra_len = 0;
> >
> > int bam_len = numNameBytes + numCigarBytes + numSeqBytes +
> > numQualityBytes + extra_len;
>
> Ah yes, this is used in CRAM's bam_construct_seq function purely for
> purposes of memory allocation.
>
> The extra data referred to here is the auxiliary tags, either verbatim
> ones stored in CRAM or auto-generated ones such as the RG:Z: tag.
>
> > Note that when encoding the name I *am* ensuring I use 1-4 null bytes
> > to ensure 32-bit alignment for CIGAR data. Perhaps I should pad the
> > entire data buffer such that
>
> I wouldn't do that. Any extra data left over after quality will be
> interpreted as auxiliary tags. If you have none then the bam record
> must end immediately after the quality values.
>
> See the BAM table in the SAM specification (section 4.2).
>
> James
>
> --
> James Bonfield (j...@sanger.ac.uk)
> The Sanger Institute, Hinxton, Cambs, CB10 1SA
>
>
> --
> The Wellcome Sanger Institute is operated by Genome Research
> Limited, a charity registered in England with number 1021457 and a
> company registered in England with number 2742969, whose registered
> office is 215 Euston Road, London, NW1 2BE.
>
>
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
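
To make the quoted point about record length concrete, here is a minimal sketch in C of how the variable-length part of a BAM record adds up according to the BAM table in the SAM specification. The helper name bam_var_len and its parameters are hypothetical and are not part of htslib. It assumes l_read_name counts the name exactly as it is stored (including its terminating NUL) and that l_aux is 0 when there are no auxiliary tags, so that the record ends immediately after the quality values.

```c
#include <stddef.h>

/* Hypothetical helper (not an htslib function): number of bytes occupied
 * by the variable-length part of one BAM record, following the field
 * order in the SAM specification's BAM table. */
static size_t bam_var_len(size_t l_read_name, /* name bytes as stored, incl. NUL */
                          size_t n_cigar_op,  /* number of CIGAR operations */
                          size_t l_seq,       /* number of bases */
                          size_t l_aux)       /* auxiliary tag bytes, 0 if none */
{
    return l_read_name        /* read name */
         + 4 * n_cigar_op     /* one 32-bit word per CIGAR operation */
         + (l_seq + 1) / 2    /* bases packed two per byte */
         + l_seq              /* one quality byte per base */
         + l_aux;             /* anything beyond this point is read as tags */
}
```

For example, a 5-character name plus NUL (6 bytes), one CIGAR operation, 10 bases, and no tags gives 6 + 4 + 5 + 10 = 25 bytes.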
simple.bam
Description: application/dna
simple.sam
Description: Binary data
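
For anyone wanting to check a file such as simple.bam by hand, the following is a minimal sketch in C that tests whether a file ends with the 28-byte empty BGZF block that the SAM specification defines as the EOF marker. It uses only the C standard library. The helper name has_bgzf_eof is hypothetical and is not an htslib function, and the sketch only inspects the last 28 bytes; it does not validate the rest of the file.

```c
#include <stdio.h>
#include <string.h>

/* The 28-byte empty BGZF block used as the end-of-file marker,
 * as listed in the SAM specification. */
static const unsigned char BGZF_EOF[28] = {
    0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
    0x06, 0x00, 0x42, 0x43, 0x02, 0x00, 0x1b, 0x00, 0x03, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper (not part of htslib).
 * Returns 1 if the file ends with the EOF marker, 0 if it does not,
 * and -1 if the file cannot be opened or read. */
static int has_bgzf_eof(const char *path)
{
    unsigned char tail[sizeof BGZF_EOF];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Find the file size so that short files are reported as "absent". */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return -1; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return -1; }
    if (size < (long)sizeof BGZF_EOF) { fclose(fp); return 0; }

    /* Read the final 28 bytes and compare them with the marker. */
    if (fseek(fp, size - (long)sizeof BGZF_EOF, SEEK_SET) != 0 ||
        fread(tail, 1, sizeof tail, fp) != sizeof tail) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return memcmp(tail, BGZF_EOF, sizeof tail) == 0;
}
```

Calling has_bgzf_eof("simple.bam") would then be expected to return 0 for a file that triggers the "EOF marker is absent" warning, assuming the warning reflects what is actually on disk.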