Hi Mary,

I got it. Thanks a lot!

BTW, from the webpage you pointed me to, it seems there are multiple
alignments at the DNA level between entire genomes, i.e., not just
for CDS regions, but also for entire exons and introns.

Is my understanding correct? If so, could you please tell me how to
get those MAFs?

Thanks!

Sean

From: Mary Goldman [mailto:[email protected]] 
Sent: Wednesday, July 06, 2011 4:30 PM
To: Xiang Li
Cc: [email protected]
Subject: Re: [Genome] [help] Lots of stop codons in multiz46way protein
alignment file

 

Hi Sean,

You can also view the Gorilla browser at our preview site here:
http://genome-preview.cse.ucsc.edu/cgi-bin/hgTracks?db=gorGor1. It tends
to be more reliably available than our test site. Our preview site
carries the same warning that tracks and data on the test server have
not undergone formal quality assurance. 

Best,
Mary
---------------------
Mary Goldman
UCSC Bioinformatics Group

On 7/6/11 4:16 PM, Mary Goldman wrote: 

Hi Sean,

Codons with an N in any position are represented with an X (stop codons
are represented with a Z). Assemblies that are not well sequenced, such
as the Gorilla (gorGor1), will have quite a few Ns (which are bases with
low quality scores) and, thus, quite a few Xs in the protein alignment
file. You can confirm this by viewing the gorGor1 assembly on our test
browser here:
http://genome-test.cse.ucsc.edu/cgi-bin/hgTracks?db=gorGor1. Please note
that tracks and data on the test server have not undergone formal
quality assurance. 
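
To illustrate the convention described above, here is a minimal Python sketch. It is not the code used to build the alignment files; the function name `translate` and the use of the standard genetic code are assumptions made for illustration. It only shows how a single N in a codon turns into an X, and how a stop codon turns into a Z, in the translated sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA using the X/Z convention described in the message above.

    Hypothetical helper: any codon containing an N becomes "X",
    and a stop codon becomes "Z". A trailing partial codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if "N" in codon:
            protein.append("X")      # unknown base anywhere in the codon
        else:
            residue = CODON_TABLE[codon]
            protein.append("Z" if residue == "*" else residue)
    return "".join(protein)
```

For example, `translate("AATNNNATT")` gives `NXI`: the middle codon consists of Ns, so it is reported as X even though the flanking codons translate normally.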

More information about this file format can be found here:
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#FASTA. 

I hope this information is helpful. Please contact us again at
[email protected] if you have any further questions.

Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group



On 7/6/11 3:21 PM, Xiang Li wrote: 

Hi, Dear Support,

It would be easy to understand if they are at the end of a protein
sequence. However, could you please help me understand why there are so
many "X"es inside some sequences?

http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz46way/alignments/refGene.exonAA.fa.gz

NM_000152_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+
NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK
--
NM_001079803_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+
NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK
--
NM_001079804_gorGor1_18_19 51 0 0 Supercontig_0039638:17387-17539+
NXIXNELVXVTSEGAGLQLQKVTVLGVATAPQQVXSNGVPVSNFTYSPDTK

There are more than 30,000 sequences with Xs like that. Please help.
Thanks!
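
A count like the one above can be reproduced with a short Python sketch along these lines. It assumes the file is ordinary FASTA in which every header line begins with ">"; the function names `read_fasta`, `count_records_with`, and `count_in_gzip` are hypothetical, and the local file path is whatever the downloaded file was saved as.

```python
import gzip
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; assumes headers start with '>'."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank separator lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)           # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def count_records_with(lines: Iterable[str], symbol: str = "X") -> int:
    """Count records whose sequence contains the given symbol."""
    return sum(1 for _, seq in read_fasta(lines) if symbol in seq)


def count_in_gzip(path: str, symbol: str = "X") -> int:
    """Same count, reading directly from a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return count_records_with(handle, symbol)
```

Calling `count_in_gzip("refGene.exonAA.fa.gz")` on a local copy of the file would give the number of records containing at least one X; passing `symbol="Z"` counts records with a stop codon instead.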
 
 
 
Sean

Sean (Xiang) Li, Ph.D
Bioinformatics Scientist
Ambry Genetics
[email protected]
Direct 949-900-5504
Fax 949-900-5501
 
 
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome