Hi Matt,

The lowercase letters signify that the bases are part of a repeat in the sequence.
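If it helps to see that in practice, here is a minimal Python sketch for finding the repeat (lowercase) stretches, assuming you have already pulled the sequence text out of an s line. The helper name lowercase_runs and the example string are made up for illustration and are not part of any UCSC tool. The positions it returns are column offsets into the alignment text (gaps included), not genomic coordinates.

```python
def lowercase_runs(seq):
    """Return half-open (start, end) column ranges for each run of lowercase bases."""
    runs = []
    start = None
    for i, base in enumerate(seq):
        if base.islower():
            # Open a new run at the first lowercase base we see.
            if start is None:
                start = i
        elif start is not None:
            # An uppercase base or a gap character ('-') closes the current run.
            runs.append((start, i))
            start = None
    if start is not None:
        # The sequence ended while still inside a lowercase run.
        runs.append((start, len(seq)))
    return runs


# Hypothetical alignment text, not taken from a real .maf file.
example = "ACGTacgtacGT--ttA"
print(lowercase_runs(example))  # prints [(4, 10), (14, 16)]
```

A sequence that is entirely uppercase, like the one in the example further down, gives an empty list.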
The numbers are quality scores:

s panTro2.chrUn 9697231 26 + 58616431 TTTTTGAAAAACAAACAACAAGTTGG
q panTro2.chrUn                       99999999999999999999999999

So, in the above example, each base has a quality score of 9, the highest value. The first column labels each row as either s (for sequence) or q (for quality score).

For more detailed info, see the last couple of sections of the .maf file FAQ:
http://genome.ucsc.edu/FAQ/FAQformat.html#format5

Just email us again if you have any additional questions.

- Greg Roe
UCSC Genome Browser

On 11/29/10 10:53 AM, Matt De Both wrote:
> Hello,
>
> I am involved in undergraduate research and have just been introduced to
> .maf files. I understand the structure of the files from the online
> documentation, but I have two questions.
>
> What is the significance, if any, of uppercase vs. lowercase letters in
> the sequences?
>
> What do numbers in the sequence represent?
>
> Thanks in advance for any help!
>
> Matt
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
