Hi,
I’m using samtools/bcftools to obtain a diploid consensus genome for an
individual human. I’ve taken a BAM file of mapped reads from a high-coverage
WGS sample of the 1000 Genomes Project (for example
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/data/CHB/NA18525/high_cov_alignment/NA18525.alt_bwamem_GRCh38DH.20150917.CHB.high_coverage.cram),
then passed it through samtools mpileup | bcftools call | vcfutils.pl vcf2fq
to obtain a fastq file containing the consensus genome.
When I view that fastq file, it seems to contain only one sequence per
chromosome. How can it represent a diploid genome? Or have I fundamentally
misunderstood how the above commands work? When I pass the fastq to
fq2psmcfa, a program that detects heterozygotes along the sequence, it runs
without complaint.
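
One check I could run is to look for IUPAC ambiguity codes in the sequence
lines, on the assumption (mine, not something I have confirmed from the
vcf2fq documentation) that a heterozygous site is written as a single
two-base code such as R for A/G. Below is a minimal sketch of that check;
the function name and the toy record are made up for illustration, and it
only assumes the multi-line fastq layout (header, sequence lines, "+",
quality lines).

```python
# Two-base IUPAC ambiguity codes. ASSUMPTION: a heterozygous site in the
# consensus is written as one of these codes rather than as two sequences.
HET_CODES = set("RYSWKM")


def count_het_codes(lines):
    """Return (ambiguity-code count, sequence length) over all FASTQ records.

    Handles multi-line records: sequence lines run until a line starting
    with '+', then quality lines are skipped until they cover the same
    number of characters as the sequence (quality lines may start with '@').
    """
    het = 0
    total = 0
    it = iter(lines)
    for line in it:
        line = line.strip()
        if not line:
            continue
        if not line.startswith("@"):
            raise ValueError("expected a FASTQ header, got: " + line[:30])
        seq_len = 0
        for seq_line in it:
            seq_line = seq_line.strip()
            if seq_line.startswith("+"):
                break
            seq_len += len(seq_line)
            # Compare case-insensitively so lowercase bases are counted too.
            het += sum(1 for base in seq_line.upper() if base in HET_CODES)
        qual_len = 0
        while qual_len < seq_len:
            qual_line = next(it, None)
            if qual_line is None:
                break  # truncated record: stop rather than loop forever
            qual_len += len(qual_line.strip())
        total += seq_len
    return het, total


# Toy record (hypothetical data, not taken from NA18525).
toy = ["@chr1", "ACGTRACG", "TTYNacgk", "+", "IIIIIIII", "@IIIIIII"]
print(count_het_codes(toy))
```

On a real file the same function could be fed an open file handle, e.g.
count_het_codes(open("consensus.fq")), where consensus.fq is a placeholder
name.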
Regards,
Nik
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help