Hello,
I am trying to merge 24 different individual sample VCF files (currently
testing with only 3 samples) into one large VCF file to display on JBrowse
as a single track. However, after merging the files, it seems like all the
data stops at around 536 Mbp per pseudomolecule.
Below is the tail of one single-sample VCF for chr7A, followed by the tail
of the merged VCF for the same chromosome.
Individual File:
zgrep "chr7A" G3116_CSv1_calls_full_chr_length.vcf.gz | tail
chr7A 736692151 . A T 30.4183 . DP=2;VDB=0.32;SGB=-0.453602;MQ0F=0;AC=2;AN=2;DP4=0,0,0,2;MQ=32 GT:PL 1/1:60,6,0
chr7A 736692161 . C T 30.4183 . DP=2;VDB=0.32;SGB=-0.453602;MQ0F=0;AC=2;AN=2;DP4=0,0,0,2;MQ=32 GT:PL 1/1:60,6,0
chr7A 736692407 . G C 16.3433 . DP=5;VDB=0.22;SGB=-0.453602;RPB=1;MQB=1;BQB=1;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=1,0,2,0;MQ=40 GT:PL 0/1:49,0,32
chr7A 736692408 . C T 59 . DP=5;VDB=0.0983484;SGB=-0.511536;MQ0F=0;AC=2;AN=2;DP4=0,0,3,0;MQ=40 GT:PL 1/1:89,9,0
chr7A 736692435 . C G 106 . DP=7;VDB=0.189048;SGB=-0.556411;MQSB=1;MQ0F=0;AC=2;AN=2;DP4=0,0,3,1;MQ=37 GT:PL 1/1:136,12,0
chr7A 736692492 . G A 106 . DP=7;VDB=0.184775;SGB=-0.556411;MQSB=1;MQ0F=0;AC=2;AN=2;DP4=0,0,3,1;MQ=37 GT:PL 1/1:136,12,0
chr7A 736692503 . G A 15.355 . DP=7;SGB=-0.379885;RPB=1;MQB=1;MQSB=1;BQB=1;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=2,1,1,0;MQ=37 GT:PL 0/1:48,0,75
chr7A 736692507 . T C 42.2273 . DP=7;VDB=0.460446;SGB=-0.511536;RPB=1;MQB=1;MQSB=1;BQB=1;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=1,0,2,1;MQ=37 GT:PL 0/1:75,0,48
chr7A 736692530 . T C 15.2759 . DP=6;SGB=-0.379885;RPB=1;MQB=1;MQSB=0;BQB=1;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=1,2,1,0;MQ=39 GT:PL 0/1:48,0,82
chr7A 736692546 . A G 4.33255 . DP=2;SGB=-0.379885;RPB=1;MQB=1;BQB=1;MQ0F=0;ICB=1;HOB=0.5;AC=1;AN=2;DP4=0,1,0,1;MQ=44 GT:PL 0/1:35,0,23
Merged File:
zgrep "chr7A" test.vcf | tail
chr7A 536773061 . T C 225 . VDB=0.185719;SGB=-0.511536;MQ0F=0;MQ=60;MQSB=1;DP=13;DP4=0,0,2,9;AN=4;AC=4 GT:PL 1/1:102,9,0 1/1:255,24,0
chr7A 536773098 . T G 129 . VDB=0.0243922;SGB=-0.616816;MQ0F=0;MQ=60;DP=7;DP4=0,0,0,6;AN=2;AC=2 GT:PL ./.:. 1/1:159,18,0
chr7A 536773112 . C T 112 . VDB=0.0256605;SGB=-0.590765;MQ0F=0;MQ=60;DP=6;DP4=0,0,0,5;AN=2;AC=2 GT:PL ./.:. 1/1:142,15,0
chr7A 536773115 . G C 112 . VDB=0.0256605;SGB=-0.590765;MQ0F=0;MQ=60;DP=5;DP4=0,0,0,5;AN=2;AC=2 GT:PL ./.:. 1/1:142,15,0
chr7A 536773513 . A G 225 . VDB=0.0795616;SGB=-0.651104;MQSB=0;MQ0F=0;MQ=51;DP=23;DP4=0,0,15,4;AN=4;AC=4 GT:PL 1/1:211,24,0 1/1:255,33,0
chr7A 536773541 . C T 228 . VDB=0.0660093;SGB=-0.662043;MQSB=0;MQ0F=0;MQ=52;RPB=1;MQB=1;BQB=1;DP=30;DP4=0,1,18,4;AN=4;AC=4 GT:PL 1/1:244,27,0 1/1:255,1,0
chr7A 536773717 . A G 101 . VDB=0.0860477;SGB=-0.511536;MQSB=1;MQ0F=0;MQ=52;DP=4;DP4=0,0,2,1;AN=2;AC=2 GT:PL 1/1:131,9,0 ./.:.
chr7A 536773733 . C G 101 . VDB=0.0822145;SGB=-0.511536;MQSB=1;MQ0F=0;MQ=52;DP=4;DP4=0,0,2,1;AN=2;AC=2 GT:PL 1/1:131,9,0 ./.:.
chr7A 536773785 . A G 74 . VDB=0.490096;SGB=-0.511536;MQSB=1;MQ0F=0;MQ=43;DP=3;DP4=0,0,1,2;AN=2;AC=2 GT:PL 1/1:104,9,0 ./.:.
chr7A 536773788 . T G 74 . VDB=0.470313;SGB=-0.511536;MQSB=1;MQ0F=0;MQ=43;DP=3;DP4=0,0,1,2;AN=2;AC=2 GT:PL 1/1:104,9,0 ./.:.
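A quick arithmetic check on the two chr7A tails above, as a rough sketch. The 2^29 threshold is an assumption taken from the documented per-sequence coordinate limit of TBI-format tabix indexes; nothing here confirms that this is what truncates the merge.

```shell
# Assumption: TBI-format tabix indexes address at most 2^29 bases per sequence.
tbi_limit=$((1 << 29))   # 536870912

last_single=736692546    # last chr7A position in the single-sample file
last_merged=536773788    # last chr7A position in the merged file

echo "2^29 = $tbi_limit"
if [ "$last_single" -gt "$tbi_limit" ]; then
    echo "single-sample file extends past 2^29"
fi
if [ "$last_merged" -lt "$tbi_limit" ]; then
    echo "merged file stops just below 2^29"
fi
```

The merged file's last chr7A record falls within about 100 kbp of 2^29, which is why the index format looks like a candidate.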
Is this a limitation of BCFtools, bgzip, or tabix indexes? I've tried
vcftools too, with the same result. Are there any alternatives for
merging this data?
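One thing that may be worth trying, if the cutoff turns out to be index-related: TBI-format tabix indexes are documented as handling sequences only up to 2^29 bases (about 536.9 Mbp), whereas CSI indexes support longer ones. The sketch below assumes bgzip-compressed inputs and a reasonably recent bcftools; the function name `merge_with_csi` and any file names passed to it are hypothetical, and it is untested against this dataset.

```shell
# Hypothetical helper: CSI-index each input, merge, then CSI-index the result.
# Usage: merge_with_csi OUTPUT.vcf.gz INPUT1.vcf.gz INPUT2.vcf.gz ...
merge_with_csi() {
    out="$1"
    shift
    for f in "$@"; do
        # --csi writes a .csi index instead of .tbi; --force overwrites an old one
        bcftools index --csi --force "$f" || return 1
    done
    # --output-type z writes bgzip-compressed VCF
    bcftools merge --output-type z --output "$out" "$@" || return 1
    bcftools index --csi --force "$out"
}
```

For example, `merge_with_csi merged.vcf.gz sample1.vcf.gz sample2.vcf.gz sample3.vcf.gz` (placeholder names). Any stale .tbi files sitting next to the inputs may need to be removed so they are not picked up instead, and whether the viewer can read a .csi index would need checking separately.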
Thank you for the help.
Cheers,
-Hans
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help