Hi Peter,
Yes, in fact you can start with any vcf at all, bgzip it and follow the steps 
below to reproduce the error.
You are correct about the header, that's why I cat it only once in the 
procedure below.

Thanks,
Stathis

-----Original Message-----
From: Peter Cock [mailto:p.j.a.c...@googlemail.com] 
Sent: 13 May 2015 01:52
To: Kanterakis, Efstathios
Cc: samtools-help@lists.sourceforge.net
Subject: Re: [Samtools-help] tabix bug on cat'ed vcf.gz

Hello Efstathios,

Yes, you should be able to cat together BGZF files and have a valid BGZF file. 
(Likewise you should be able to cat together any gzipped files and have a valid 
gzip file).
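
As a minimal sketch of the plain gzip case (assuming a standard gzip on the PATH; the file names are made up for illustration, and gzip stands in for bgzip here), concatenating two compressed files should decompress to the concatenation of the two inputs:

```shell
# Hypothetical file names in a scratch directory; plain gzip is used
# in place of bgzip, so this only illustrates the gzip-level behaviour.
tmpdir=$(mktemp -d)
printf 'line one\n' > "$tmpdir/a.txt"
printf 'line two\n' > "$tmpdir/b.txt"

# Compress each file separately, keeping the originals.
gzip -c "$tmpdir/a.txt" > "$tmpdir/a.txt.gz"
gzip -c "$tmpdir/b.txt" > "$tmpdir/b.txt.gz"

# Concatenate the compressed files byte for byte.
cat "$tmpdir/a.txt.gz" "$tmpdir/b.txt.gz" > "$tmpdir/ab.txt.gz"

# Decompressing the result should give both lines, in order.
combined=$(gzip -dc "$tmpdir/ab.txt.gz")
printf '%s\n' "$combined"
```

This says nothing about whether an index built on top of such a file is correct; it only shows that the compressed stream itself remains readable.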

However, there are two important provisos. First, the internal data format must 
allow concatenation (e.g. you cannot cat together two BAM files since that 
would result in an extra header in the middle of the data). Second, some tools 
fail to cope with concatenated gzip blocks (e.g. some Java libraries break).

Can you reproduce this with a public sample VCF file?

Peter

On Wed, May 13, 2015 at 12:17 PM, Kanterakis, Efstathios 
<ekantera...@illumina.com> wrote:
> Hello,
> The following routine seems to produce invalid tabix indices (samtools 1.2):
> zgrep '^chr1' some.vcf.gz > chr1.vcf
> zgrep '^chr2' some.vcf.gz > chr2.vcf
> zgrep '^#' some.vcf.gz > header.vcf
> cat header.vcf chr1.vcf > chr1_h.vcf
> bgzip chr1_h.vcf
> bgzip chr2.vcf
> cat chr1_h.vcf.gz chr2.vcf.gz > test.vcf.gz
> tabix test.vcf.gz
> tabix test.vcf.gz chr2 # blank
> tabix test.vcf.gz chr1 # works
>
> bgzip -d test.vcf.gz
> bgzip test.vcf
> tabix test.vcf.gz
> tabix test.vcf.gz chr2 # works now
>
> I was under the impression that bgzipped files are directly cat'able. Is this 
> a bug?
>
> Thank you,
> Stathis
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
