Hi Jens,

Since this affects only 3 of the 39 files, all generated the same way, my guess is that it is due to data loss or truncation somewhere. Maybe a cluster node ran out of temporary disk space?
It is possible the files are valid but missing the EOF marker (a special empty GZIP block). That could happen if there was a bug in the tool used to generate the file, but that seems unlikely if all 39 files are from the same pipeline. You can try double checking the EOF marker with this simple script of mine:

https://github.com/peterjc/picobio/blob/master/sambam/bgzf_check_eof.py

I would recommend you check the read counts are as you expect, and also try BAM to SAM to BAM, which could give an error if there is a data truncation part-way through a read.

Peter

On Tue, Jun 14, 2016 at 8:57 AM, Jens Christian Froslev Nielsen <jens.c.niel...@chalmers.se> wrote:
> Hi,
>
> I have been sorting tophat generated .bam files with samtools to use them as
> input for htseq-count.
>
> $ samtools sort -n file1.bam -o file1.bam_sorted_n.bam
>
> for 36 of 39 files this works fine, while for 3 files i receive the
> following error:
>
> [W::bam_hdr_read] EOF marker is absent. The input is probably truncated.
>
> These 3 files happens to be the largest files of ~ 4Gb
>
> any suggestions? Can I check if the files are corrupt or if it is a bug in
> samtools?
>
> Thanks
>
> -Jens
>
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
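
For anyone who wants to see what the EOF check discussed above amounts to, here is a minimal sketch in Python. It is not the linked bgzf_check_eof.py script, and the function name has_bgzf_eof is made up for illustration. It assumes only that a complete BAM file ends with the 28-byte empty BGZF block given in the SAM/BAM specification, and compares the tail of the file against it.

```python
import sys

# The 28-byte empty BGZF block that ends a complete BAM file
# (byte values as listed in the SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF block.

    Hypothetical helper for illustration only; it checks the marker,
    not the integrity of the rest of the file.
    """
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        if handle.tell() < len(BGZF_EOF):
            return False  # too short to contain the marker at all
        handle.seek(-len(BGZF_EOF), 2)  # back up 28 bytes from the end
        return handle.read(len(BGZF_EOF)) == BGZF_EOF


if __name__ == "__main__":
    # Usage: python check_eof.py file1.bam file2.bam ...
    for name in sys.argv[1:]:
        status = "present" if has_bgzf_eof(name) else "MISSING"
        print(f"{name}: EOF marker {status}")
```

A missing marker only says the file does not end the way a complete BAM should. A file with the marker present can still be damaged earlier on, which is why checking the read counts and doing the BAM to SAM to BAM round trip remain worthwhile.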