Hi Barbara,
These are placeholders for unseen alleles (alleles not observed in the pileup).
They are auxiliary structures used during calling, and they make it possible
to express genotype likelihoods at homozygous sites.
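As an illustration only (this helper is not part of bcftools), the sketch below pairs the alleles of the record quoted further down with their AD values. The function name allele_depths is made up for this example; the REF, ALT and AD values are copied from that record. AD is ordered REF first, then each ALT allele in turn.

```shell
# Fields copied from the example record in the quoted message (INFO omitted).
ref="G"
alt="A,<*>"
ad="226,1,0"

# Hypothetical helper: print each allele next to its allelic depth.
# AD is ordered REF, ALT1, ALT2, ... so the two comma-separated lists line up.
allele_depths() {
  printf '%s,%s\n%s\n' "$1" "$2" "$3" | awk -F, '
    NR == 1 { n = NF; for (i = 1; i <= NF; i++) allele[i] = $i }
    NR == 2 { for (i = 1; i <= n; i++) print allele[i] "=" $i }
  '
}

allele_depths "$ref" "$alt" "$ad"
# G=226
# A=1
# <*>=0
```

The <*> entry has depth 0 because it stands for any allele that was not seen. With three alleles there are six diploid genotypes, which is why PL in that record carries six values (ordered 0/0, 0/1, 1/1, 0/2, 1/2, 2/2 per the VCF specification).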
For your reference, the complete variant calling command is described here:
http://samtools.github.io/bcftools/howtos/variant-calling.html
The VCF specification, which describes genotype likelihoods and the
<*> allele, is here:
http://samtools.github.io/hts-specs/VCFv4.3.pdf
In short, if you use the complete command, you don't need to worry about
these.
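A minimal sketch of what such a complete command could look like, following the pattern in the how-to linked above. The wrapper name call_variants and the output file name are placeholders chosen for this example, and the exact options you need may differ.

```shell
# Hypothetical wrapper: arguments are reference FASTA, input BAM, output VCF.
call_variants() {
  # mpileup writes genotype likelihoods (including the <*> placeholder)
  # as uncompressed BCF to the pipe; -a AD adds per-allele depths.
  # call -m uses the multiallelic caller, -v keeps variant sites only,
  # and -Oz writes a compressed VCF.
  bcftools mpileup -Ou -f "$1" -a AD "$2" |
    bcftools call -mv -Oz -o "$3"
}

# Example invocation with the paths from the question below:
# call_variants ref_genome/ref.fasta results/bam/sorted.bam calls.vcf.gz
```

In the called output the <*> placeholder should normally no longer appear in the ALT column, since it is only needed as input to the calling step.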
Best wishes,
Petr
On 04/01/2022 15:57, bparre...@igc.gulbenkian.pt wrote:
Hi,
First, I wish you all a healthy and successful 2022 :)
Second, I am trying to understand the output of the mpileup tool, but I
am unsure what the "<*>" in the ALT column is. It appears on every line,
sometimes together with another base. I realize that other people have had
the same question; however, I could not find a clear answer.
Is this related to indels? I notice that I cannot find any indels. However,
the strange thing is that "<*>" appears with a depth of 0. Could you please
help?
I am using the following command:
bcftools mpileup -f ref_genome/ref.fasta results/bam/sorted.bam --annotate AD > sample.mpileup
As an example, I am copying one output line below:
NC_045512.2 102 . G A,<*> 0 . DP=285;I16=194,32,1,0,9491,424267,36,1296,13560,813600,60,3600,3928,78818,6,36;QS=0.996221,0.00377873,0;SGB=-0.379885;RPBZ=-1.54189;MQBZ=0;MQSBZ=0;BQBZ=-0.532247;SCBZ=-0.133926;FS=0;MQ0F=0 PL:AD 0,255,255,255,255,255:226,1,0
Thank you so much,
Bárbara
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
--
The Wellcome Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.