Hi Jens,

Since this affects only 3 of the 39 files generated the same
way, my guess is that it is due to data loss or truncation
somewhere. Maybe a cluster node ran out of temporary disk space?

It is possible the files are valid but missing the EOF marker
(a special empty BGZF block, which is itself a valid GZIP
block). That could happen if there was a bug in the tool used
to generate the files, but that seems unlikely if all 39 files
are from the same pipeline.

You can try double checking the EOF marker with this
simple script of mine:

https://github.com/peterjc/picobio/blob/master/sambam/bgzf_check_eof.py
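
For reference, the check itself is small. Below is a minimal sketch in Python, not a copy of the linked script: it compares the last 28 bytes of the file against the empty BGZF block given in the SAM/BAM specification. The function name has_bgzf_eof and the file name in the usage note are made up for illustration.

```python
import os

# The 28-byte empty BGZF block that should end every complete BAM file
# (byte values as listed in the SAM/BAM format specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    " 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration only.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to 28 bytes before the end and compare the tail.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

For example, has_bgzf_eof("file1.bam") returning False would match the samtools warning. Note that a True result only says the final block is present; it does not prove the rest of the file is intact.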

I would recommend you check that the read counts are as you
expect, and also try BAM to SAM to BAM, which could give an
error if the data is truncated part-way through a read.
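
If samtools is not to hand, a rough stand-in (not the BAM to SAM to BAM round trip itself) is to decompress the whole file with Python's gzip module, since BGZF is a series of concatenated GZIP blocks. This is only a sketch under that assumption: it catches truncation in the middle of a compressed block, but not a file cut cleanly between blocks, and it does not validate the BAM records. The function name decompresses_cleanly is made up for illustration.

```python
import gzip
import zlib


def decompresses_cleanly(path, chunk_size=1 << 20):
    """Stream-decompress a BGZF/GZIP file and report whether it ended cleanly.

    Returns (ok, n_bytes): ok is False if the compressed stream stops
    mid-block or is corrupt; n_bytes counts the decompressed bytes that
    were read before stopping. Hypothetical helper for illustration only.
    """
    total = 0
    try:
        with gzip.open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended before the end-of-stream marker;
        # OSError covers bad GZIP headers and CRC failures.
        return False, total
    return True, total
```

A result of (False, n) from decompresses_cleanly("file1.bam") would point towards truncation rather than a samtools bug.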

Peter

On Tue, Jun 14, 2016 at 8:57 AM, Jens Christian Froslev Nielsen
<jens.c.niel...@chalmers.se> wrote:
> Hi,
>
> I have been sorting tophat generated .bam files with samtools to use them as
> input for htseq-count.
>
> $ samtools sort -n file1.bam -o file1.bam_sorted_n.bam
>
> For 36 of 39 files this works fine, while for 3 files I receive the
> following error:
>
> [W::bam_hdr_read] EOF marker is absent. The input is probably truncated.
>
> These 3 files happen to be the largest, at ~4 GB.
>
> Any suggestions? Can I check whether the files are corrupt or whether it
> is a bug in samtools?
>
> Thanks
>
> -Jens
>
>
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>
