Hi Matt,

The lowercase letters signify that the bases are part of a repeat in the 
sequence.

The numbers are quality scores:

  s panTro2.chrUn            9697231 26 +  58616431 TTTTTGAAAAACAAACAACAAGTTGG
  q panTro2.chrUn                                   99999999999999999999999999

The first column labels each row as either s (for sequence) or q (for 
quality score), and each character in the q row lines up with the base 
directly above it in the s row. So, in the above example, each base has 
a quality score of 9, the highest value on the 0-9 scale.
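
If it helps, here is a rough Python sketch (not UCSC code) of how the s 
and q rows can be read together. The function names parse_maf_rows and 
describe_bases are made up for illustration, and the sketch assumes 
whitespace-separated rows like the ones above, with the q row built 
from the example's 26 nines.

```python
def parse_maf_rows(lines):
    """Pair each 's' row with the 'q' row for the same source, if present."""
    seqs, quals = {}, {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "s":
            # s <src> <start> <size> <strand> <srcSize> <text>
            seqs[fields[1]] = fields[6]
        elif fields[0] == "q":
            # q <src> <one quality character per alignment column>
            quals[fields[1]] = fields[2]
    return {src: (text, quals.get(src)) for src, text in seqs.items()}


def describe_bases(text, quality):
    """Return (base, is_repeat, quality_char) for each alignment column."""
    if quality is None:
        quality = "?" * len(text)  # placeholder when there is no q row
    # Lowercase bases are the ones marked as repeats.
    return [(b, b.islower(), q) for b, q in zip(text, quality)]


block = [
    "s panTro2.chrUn 9697231 26 + 58616431 TTTTTGAAAAACAAACAACAAGTTGG",
    "q panTro2.chrUn " + "9" * 26,
]
rows = parse_maf_rows(block)
text, quality = rows["panTro2.chrUn"]
bases = describe_bases(text, quality)
for base, is_repeat, q in bases[:3]:
    print(base, "repeat" if is_repeat else "non-repeat", "quality", q)
```

Each tuple pairs one base with its repeat status (from the letter case) 
and its quality character, which is the relationship described above.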

For more detailed info, see the last couple of sections of the .maf 
file FAQ:  http://genome.ucsc.edu/FAQ/FAQformat.html#format5

Just email us again if you have any additional questions.

-
Greg Roe
UCSC Genome Browser


On 11/29/10 10:53 AM, Matt De Both wrote:
> Hello,
>
> I am involved in undergraduate research and have just been introduced to
> .maf files. I understand the structure of the files from the online
> documentation, but I have two questions.
>
> What is the significance, if any, of uppercase vs. lowercase letters in
> the sequences?
>
> What do numbers in the sequence represent?
>
> Thanks in advance for any help!
>
> Matt
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome