Alex  -

I strongly suggest running 'samtools view -b' on your .cram file to verify that you really can recover the sequence data it contains.

It is a samtools design feature when writing .cram format that md5 checksums in the original .bam header block are carried over intact into the .cram header block, whether or not they agree with the md5 checksum of the reference sequence being used to compress the .cram. Subsequently, when the .cram is read, it is the reference sequence named by the md5 checksum in the .cram header block that is used to uncompress the .cram. The uncompression will fail when this is not the same as the reference used for compression. I am told that this is a deliberate design feature of samtools to avoid spending the time needed to recalculate md5 checksums for the reference .fasta file.

If you encounter failures, the solution will be to rewrite the .cram header block with the md5 checksums of the reference sequence actually used for compression. Then the .cram will be readable.

                                                -  tom blackwell  -

On Wed, 28 Nov 2018, Alexander Brown wrote:

Hello,

I crammed a bam file using samtools view with the -T option, and the cram was 
produced. When I looked at the @SQ headers, I saw several chromosomes not in my 
reference fasta file. It turned out I used a build of the genome that was not 
used to produce the original bam file. It also looks like if the genome build 
following a -T option is wrong, samtools view will look for it elsewhere, as 
follows (from http://www.htslib.org/doc/samtools-1.1.html):

"

The search order to obtain a reference is:

Use any local file specified by the command line options (eg -T).

Look for MD5 via REF_CACHE environment variable.

Look for MD5 in each element of the REF_PATH environment variable.

Look for a local file listed in the UR: header tag."

I assume what happened is my "-T fasta" was wrong and so it looked elsewhere 
and found the correct file. Is there a way to get an output notifying that the -T fasta 
has been skipped? While it worked this time, it's important to know when this is 
happening.

Best regards



_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to