Hey Tychele,

Your interpretation of the frame is not quite right. The InFrame and OutFrame numbers give the position within a codon at which the exon starts and ends. So, if InFrame is 1, the first base of the exon's first codon comes from the *previous* exon (i.e. it's a split codon), which will be found somewhere upstream. If OutFrame is 1, then the first base of the first codon in the *next* exon (found somewhere downstream) is at the end of the current exon. Similarly, if InFrame is 2, the first two bases of the codon are in the upstream exon.
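To make the frame arithmetic concrete, here is a minimal Python sketch. The function names (`split_codon_bases`, `whole_codons`) are made up for illustration and are not part of any UCSC tool. It assumes the range is one-based and inclusive, that OutFrame 2 works like OutFrame 1 (two bases of the next codon at the end of the exon), and that a minus-strand exon is simply read from the end of the range downward. It lists only the codons that lie wholly inside the exon; bases belonging to split codons are left out.

```python
def split_codon_bases(in_frame, out_frame):
    """Return (leading, trailing): how many bases at the start and end of
    this exon belong to codons shared with a neighbouring exon."""
    # InFrame counts the codon bases that sit in the previous exon,
    # so the rest of that split codon (3 - InFrame) is in this exon.
    leading = (3 - in_frame) % 3
    # OutFrame counts the bases of the next codon that sit in this exon
    # (assumption for OutFrame 2, by analogy with OutFrame 1).
    trailing = out_frame
    return leading, trailing


def whole_codons(start, end, strand, in_frame, out_frame):
    """List the genomic positions (one-based) of every codon that lies
    wholly inside the exon, in translation order."""
    leading, trailing = split_codon_bases(in_frame, out_frame)
    positions = list(range(start, end + 1))
    if strand == "-":
        # On the negative strand the codons start at the end of the range.
        positions.reverse()
    coding = positions[leading:len(positions) - trailing]
    if len(coding) % 3 != 0:
        raise ValueError("exon length does not agree with InFrame/OutFrame")
    return [tuple(coding[i:i + 3]) for i in range(0, len(coding), 3)]


# The header from the question: NM_032291_hg19_1_25 3 0 1 chr1:67000042-67000051+
for codon in whole_codons(67000042, 67000051, "+", 0, 1):
    print(codon)
# (67000042, 67000043, 67000044)
# (67000045, 67000046, 67000047)
# (67000048, 67000049, 67000050)
```

The exon in that header is 10 bases long: three whole codons (M, M, E) plus one trailing base at 67000051, which is the first base of a codon completed in exon 2.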
If the gene is on the negative strand, then the codons start at the end of the range, but the InFrame and OutFrame numbers are interpreted the same way. In the exonAA files, the split codons appear with the exon where most of the bases occur. The numbers in the range are one-based so the region can be pasted directly into the browser.

I hope this answers your question. If you have further questions, please respond to this list.

Brian

On Tue, Sep 14, 2010 at 9:50 AM, Tychele <[email protected]> wrote:
> Hi,
>
> I have downloaded and am looking at the refGene.exonAA.fa file for the
> 46 multizway alignment. I am trying to match each amino acid to its
> corresponding 3 nucleotide position. I realize this is the meaning of
> the header and sequence:
>
> >Name_Assembly(Species)_Exon#_TotalExons ExonLength(AA) InFrame
> OutFrame Location Strand(+/-)
>
> AASequence
>
> example:
>
> >NM_032291_hg19_1_25 3 0 1 chr1:67000042-67000051+
>
> MME
>
>
> What I’m trying to determine is if I have the rules right for
> assigning location. Can anyone confirm this?
>
> Rules I figured based on looking at examples in the file:
>
> Assume given location chr:a-b
>
> 1. * If strand is +
>
> a. if inframe is 0 than first amino acid is equal to chr:a, chr:a
> +1, chr:a+2
>
> b. if inframe is 1 than first amino acid is equal to chr:a-1,
> chr:a, chr:a+1
>
> c. if inframe is 2 than first amino acid is equal to chr:a+1,
> chr:a+2, chr:a+3
>
> 2. *if strand is –
>
> a. if inframe is 0 than first amino acid is equal to chr:b,
> chr:b-1, chr:b-2
>
> b. if inframe is 1 than first amino acid is equal to chr:b+1,
> chr:b, chr:b-1
>
> c. if inframe is 2 than first amino acid is equal to chr:b-1,
> chr:b-2, chr:b-3
>
> Please let me know if these are correct assumptions or not. Thank you.
>
> Tychele
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
