Hi Brian, Thank you. This explanation really clarifies the InFrame and OutFrame numbers and how to identify the nucleotide positions relating to each amino acid position.
Tychele

On Sep 15, 2010, at 5:44 PM, Brian Raney wrote:

> Hey Tychele,
>
> Your interpretation of the frame is not quite right. The InFrame
> and OutFrame numbers are a representation of where in the frame the
> exon starts and ends. So, if InFrame is 1, the first base in the
> codon comes from the *previous* exon (i.e. it's a split codon) which
> will be found somewhere upstream. If OutFrame is 1, then the first
> base of the first codon in the *next* exon (found somewhere
> downstream) is at the end of the current exon. Similarly, if
> InFrame is 2, the first two bases are in the upstream exon.
>
> If the gene is on the negative strand, then the codons start at the
> end of the range, but the InFrame and OutFrame numbers are
> interpreted the same way.
>
> In the exonAA files, the split codons appear with the exon where
> most of the bases occur.
>
> The numbers in the range are one-based so the region can be pasted
> directly into the browser.
>
> I hope this answers your question. If you have further questions,
> please respond to this list.
>
> Brian
>
> On Tue, Sep 14, 2010 at 9:50 AM, Tychele <[email protected]> wrote:
> Hi,
>
> I have downloaded and am looking at the refGene.exonAA.fa file for the
> multiz 46-way alignment. I am trying to match each amino acid to its
> corresponding 3 nucleotide positions. I understand this to be the
> meaning of the header and sequence:
>
> >Name_Assembly(Species)_Exon#_TotalExons ExonLength(AA) InFrame
> OutFrame Location Strand(+/-)
>
> AASequence
>
> example:
>
> >NM_032291_hg19_1_25 3 0 1 chr1:67000042-67000051+
>
> MME
>
> What I’m trying to determine is if I have the rules right for
> assigning location. Can anyone confirm this?
>
> Rules I figured based on looking at examples in the file:
>
> Assume given location chr:a-b
>
> 1. If strand is +
>
> a. if inframe is 0 then first amino acid is equal to chr:a, chr:a+1,
> chr:a+2
>
> b. if inframe is 1 then first amino acid is equal to chr:a-1,
> chr:a, chr:a+1
>
> c. if inframe is 2 then first amino acid is equal to chr:a+1,
> chr:a+2, chr:a+3
>
> 2. If strand is -
>
> a. if inframe is 0 then first amino acid is equal to chr:b,
> chr:b-1, chr:b-2
>
> b. if inframe is 1 then first amino acid is equal to chr:b+1,
> chr:b, chr:b-1
>
> c. if inframe is 2 then first amino acid is equal to chr:b-1,
> chr:b-2, chr:b-3
>
> Please let me know if these are correct assumptions or not. Thank you.
>
> Tychele
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
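For anyone who wants to script this, here is a minimal Python sketch of Brian's explanation. It is not taken from any UCSC tool; the function names parse_header and codon_positions are made up for illustration. It assumes the header layout shown in the thread, one-based inclusive ranges, and that OutFrame 2 mirrors the OutFrame 1 case (two bases of the last codon sit at the end of this exon). Bases that belong to a neighbouring exon cannot be located from the header alone, so a split codon is returned with only the bases that fall inside the current exon, and only when most of its bases are here.

```python
import re

# Location field as in the thread's example: chr1:67000042-67000051+
LOCATION_RE = re.compile(r"^(\S+):(\d+)-(\d+)([+-])$")


def parse_header(header):
    """Split an exonAA header line into fields (layout assumed from the thread)."""
    name, aa_len, in_frame, out_frame, location = header.lstrip(">").split()
    match = LOCATION_RE.match(location)
    if match is None:
        raise ValueError("unrecognised location field: " + location)
    chrom, start, end, strand = match.groups()
    return {
        "name": name,
        "aa_len": int(aa_len),
        "in_frame": int(in_frame),
        "out_frame": int(out_frame),
        "chrom": chrom,
        "start": int(start),
        "end": int(end),
        "strand": strand,
    }


def codon_positions(start, end, strand, in_frame, out_frame):
    """Return, per amino acid listed with this exon, its bases inside the exon.

    Positions are one-based and given in reading order, so they count
    down from `end` when the gene is on the negative strand.
    """
    if strand == "+":
        bases = list(range(start, end + 1))
    else:
        bases = list(range(end, start - 1, -1))

    # in_frame bases of the first codon lie in the previous exon, so
    # (3 - in_frame) % 3 bases at the start of this exon complete it.
    lead = (3 - in_frame) % 3
    # Assumption: out_frame bases at the end of this exon begin a codon
    # that is completed in the next exon.
    trail = out_frame

    if lead + trail > len(bases) or (len(bases) - lead - trail) % 3 != 0:
        raise ValueError("range length does not fit InFrame/OutFrame")

    codons = []
    if lead == 2:  # most of the split codon is here, so it is listed here
        codons.append(bases[:2])
    body = bases[lead:len(bases) - trail]
    codons.extend(body[i:i + 3] for i in range(0, len(body), 3))
    if trail == 2:  # same rule for a split codon at the end of the exon
        codons.append(bases[-2:])
    return codons


fields = parse_header(">NM_032291_hg19_1_25 3 0 1 chr1:67000042-67000051+")
for codon in codon_positions(fields["start"], fields["end"], fields["strand"],
                             fields["in_frame"], fields["out_frame"]):
    print(fields["chrom"], codon)
```

For the example header this yields three codons, 67000042-67000044, 67000045-67000047 and 67000048-67000050, matching the three residues in MME. The last base of the range, 67000051, is the first base of a codon that would be listed with exon 2.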
