Hi Sean,

Codons with an N in any position are translated as an X in the protein
alignment file (stop codons are represented with a Z). Assemblies that
are not well sequenced, such as the Gorilla (gorGor1), will have quite a
few Ns (bases with low quality scores) and, thus, quite a few Xs inside
their protein sequences, not just at the ends. You can confirm this by
viewing the gorGor1 assembly on our test browser here:
http://genome-test.cse.ucsc.edu/cgi-bin/hgTracks?db=gorGor1. Please note 
that tracks and data on the test server have not undergone formal 
quality assurance.
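
To make the rule concrete, here is a minimal Python sketch of it. This
is not the code that generated the alignment file; the function name
and the example DNA string are made up for illustration, and mapping
any other unrecognized codon to X is an assumption on my part.

```python
# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate DNA codon by codon: N in a codon gives X, a stop gives Z."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for start in range(0, len(dna) - 2, 3):
        codon = dna[start:start + 3]
        if "N" in codon:
            # Any N in the codon yields X, regardless of position.
            protein.append("X")
        else:
            # Assumption: codons with other non-ACGT letters also become X.
            amino_acid = CODON_TABLE.get(codon, "X")
            protein.append("Z" if amino_acid == "*" else amino_acid)
    return "".join(protein)


# Hypothetical input: AAT NNN ATT ANN AAT
print(translate("AATNNNATTANNAAT"))  # NXIXN
```

Note that a single low-quality base is enough to turn a whole codon
into an X, which is why a draft assembly can show so many of them.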

More information about this file format can be found here: 
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#FASTA.
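
If it helps to see which assemblies contribute the Xs, a rough Python
sketch along these lines could tally them. It assumes the record name
has the form <accession>_<assembly>_<exon number>_<exon count>, which
is inferred only from the three headers quoted below, and the function
name and the hg19 sample record are hypothetical.

```python
from collections import Counter


def x_records_by_assembly(lines):
    """Count records per assembly and how many of them contain an X."""
    total = Counter()
    with_x = Counter()
    assembly = None
    seen_x = False
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            # First whitespace-separated field is the record name.
            name = line[1:].split()[0]
            # Assumed layout: accession_assembly_exonNumber_exonCount.
            parts = name.rsplit("_", 3)
            assembly = parts[1] if len(parts) == 4 else "unknown"
            total[assembly] += 1
            seen_x = False
        elif line and assembly is not None and not seen_x and "X" in line:
            # Count each record at most once, even if it spans several lines.
            with_x[assembly] += 1
            seen_x = True
    return total, with_x


sample = [
    ">NM_000152_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+",
    "NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK",
    ">NM_000000_hg19_1_2 4 0 0 chrZ:1-12+",  # made-up record without X
    "MKVL",
]
total, with_x = x_records_by_assembly(sample)
print(total["gorGor1"], with_x["gorGor1"], with_x["hg19"])  # 1 1 0
```

For the downloaded file, the same function should accept the open file
handle from gzip.open("refGene.exonAA.fa.gz", "rt") in place of the
sample list.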

I hope this information is helpful. Please contact us again at
[email protected] if you have any further questions.

Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group



On 7/6/11 3:21 PM, Xiang Li wrote:
> Hi, Dear Support,
>
>
>
> It would be easy to understand if they are at the end of a protein
> sequence. However, could you please help me understand why there are so
> many "X"es inside some sequences?
>
>
>
> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz46way/alignments/refGene.exonAA.fa.gz
>
>
>
> >NM_000152_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+
> NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK
>
> --
>
> >NM_001079803_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+
> NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK
>
> --
>
> >NM_001079804_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+
> NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK
>
>
>
>
>
> There are more than 30,000 sequences with X like that.   Please help.
> Thanks!
>
>
>
> Sean
>
>
>
> Sean (Xiang) Li, Ph.D
>
> Bioinformatics Scientist
>
> Ambry Genetics
>
> [email protected]
>
> Direct 949-900-5504
>
> Fax 949-900-5501
>
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
