Hi All,
Thanks for letting me know about this.
I’ve had a (very) quick look and the samtools view -H command is working for me
with v1.2.
[alignment]:samtools
Program: samtools (Tools for alignments in the SAM format)
Version: 1.2 (using htslib 1.2.1)
[alignment]:samtools view -H SC_GMFUL5306366.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.cram
@HD VN:1.5 SO:coordinate
@SQ SN:chr1 LN:248956422 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa AS:GRCh38 M5:6aef897c3d6ff0c78aff06ac189178dd SP:Human
@SQ SN:chr2 LN:242193529 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa AS:GRCh38 M5:f98db672eb0993dcfdabafe2a882905c SP:Human
@SQ SN:chr3 LN:198295559 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa AS:GRCh38 M5:76635a41ea913a405ded820447d067b0 SP:Human
@SQ SN:chr4 LN:190214555 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa AS:GRCh38 M5:3210fecf1eb92d5489da4346b3fddc6e SP:Human
@SQ SN:chr5 LN:181538259 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa AS:GRCh38 M5:a811b3dc9fe66af729dc0dddf7fa4f13 SP:Human
@SQ SN:chr6 LN:170805979 UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa AS:GRCh38 M5:5691468a67c7e7a7b5f2a3a683792c29 SP:Human
…
Any thoughts on what is causing this and the best way to address it are welcome.
Thanks,
Susan.
> On 8 Nov 2017, at 12:02, Robert Davies <r...@sanger.ac.uk> wrote:
>
> On Wed, 8 Nov 2017, Tommy Carstensen wrote:
>
>> To samtools-help,
>>
>> 1) I am trying to convert a cram to bam with samtools view v1.5, but
>> eventually I get the error below for some of the files, whereas others are
>> successfully converted:
>> Block CRC32 failure
>> [main_samview] truncated file.
>>
>> Has anyone had this problem? I am quite sure the files are not
>> truncated/corrupted. How do people usually check that?
>>
>> 2) Furthermore, when typing the command below:
>> samtools view -H ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/gambian_genome_variation_project/data/FULA/SC_GMFUL5306338/alignment/SC_GMFUL5306338.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.cram
>>
>> Then I get this error:
>> [E::hts_hopen] Failed to open file ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/gambian_genome_variation_project/data/FULA/SC_GMFUL5306338/alignment/SC_GMFUL5306338.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.cram
>> [E::hts_open_format] Failed to open file ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/gambian_genome_variation_project/data/FULA/SC_GMFUL5306338/alignment/SC_GMFUL5306338.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.cram
>> samtools view: failed to open "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/gambian_genome_variation_project/data/FULA/SC_GMFUL5306338/alignment/SC_GMFUL5306338.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.cram" for reading: Exec format error
>
> It looks like the file is corrupt:
>
> curl 'ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/gambian_genome_variation_project/data/FULA/SC_GMFUL5306338/alignment/SC_GMFUL5306338.alt_bwamem_GRCh38DH.20151208.FULA.gambian_lowcov.cram' | hexdump -C | head
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0 9972M    0 99.3M    0     0  12.6M      0  0:13:07  0:00:07  0:13:00 9784k
> 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 066334b0 00 00 00 00 00 00 00 00 00 00 00 00 e5 7a 59 35 |.............zY5|
> 066334c0 e2 58 f8 07 bc 4e 2b 2d c7 c7 81 c5 9d 56 aa 27 |.X...N+-.....V.'|
> 066334d0 62 fb 82 89 05 77 99 23 df a9 f9 c9 8c 0f 67 c4 |b....w.#......g.|
> 066334e0 c4 2d 8e 2f 67 0a a8 1d 79 1f f5 ef a0 cd ca 71 |.-./g...y......q|
> 066334f0 a6 b9 24 99 6b b4 95 20 46 8f d5 0b c8 aa 40 bb |..$.k.. F.....@.|
> 06633500 05 05 d5 83 f4 8a 2b 86 86 4b 5b da cc 27 9c 8d |......+..K[..'..|
> 06633510 77 ca 1f 32 67 2d 14 62 99 90 21 bc 71 0a b2 5b |w..2g-.b..!.q..[|
> 06633520 40 a2 bb a9 2e a2 2c df 5f 16 b8 83 f7 c3 0c 9a |@.....,._.......|
>
> That's a lot of zeros to find at the beginning of a CRAM file.
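
A rough way to spot this kind of damage without reading a full hexdump is to
test whether a local copy starts with the CRAM magic number. CRAM files begin
with the four ASCII bytes "CRAM", so a file that starts with zeros fails
straight away. The sketch below assumes a POSIX shell and an already
downloaded file; the helper name is_cram is made up for illustration and is
not part of samtools. Passing this check says nothing about corruption
further into the file.

```shell
# Hypothetical helper (not part of samtools): succeed only if the file
# starts with the 4-byte CRAM magic "CRAM" (hex 43 52 41 4d).
is_cram() {
    # od prints the first 4 bytes as hex; tr strips spaces and newlines.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "4352414d" ]
}

# Example use on a local copy (the file name is a placeholder):
#   is_cram sample.cram && echo "magic ok" || echo "does not start with CRAM"
```
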
>
> The bizarre error message ("Exec format error") is a result of htslib abusing
> the standard Unix error codes to pass back error conditions. In this case it
> couldn't work out what sort of file it was trying to open.
>
> The "Block CRC32 failure" is likely to be another corruption of some sort. We
> really need to make the software print out the file position when this
> happens. It would make tracking down exactly where the problem is much
> easier. In this case, as long as the header is intact you may be able to
> rescue the data after the corruption by using range queries to jump to parts
> of the file after the broken bit.
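
The range-query idea could be tried region by region, roughly as below. This
is only a sketch under assumptions: samtools is on the PATH, the CRAM is a
local copy with a usable .crai index next to it, and the reference can be
found for decoding. The function name rescue_regions and the output names
rescued.<region>.bam are invented for illustration. Regions that overlap the
damaged part are expected to fail; the others may still come out.

```shell
# Hypothetical helper: try to extract each named region from an indexed CRAM
# into its own BAM file, reporting which regions could be read.
rescue_regions() {
    cram=$1
    shift
    for region in "$@"; do
        # -b writes BAM output, -o names the output file.
        if samtools view -b -o "rescued.$region.bam" "$cram" "$region"; then
            echo "ok $region"
        else
            echo "failed $region"
            # Drop the partial output left behind by a failed extraction.
            rm -f "rescued.$region.bam"
        fi
    done
}

# Example use (file and region names are placeholders):
#   rescue_regions sample.cram chr1 chr2 chr3
```
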
>
> Rob Davies r...@sanger.ac.uk
> The Sanger Institute http://www.sanger.ac.uk/
> Hinxton, Cambs., Tel. +44 (1223) 834244
> CB10 1SA, U.K. Fax. +44 (1223) 494919
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help