Hi,
I am trying to extract the nucleotide sequences of exons in several genomes, using the exon locations listed in refFlat.txt. In almost all cases, the exonStarts and exonEnds values do not correspond to nucleotide positions in the refSeq for that particular organism and chromosome.

For example, mouse build 37 has a 30 Mbp gap at the start of every chromosome except Y. This gap is shown in the sequence as a run of "N", but it is omitted from the refFlat table. In other words, nucleotide position 30x10^6 + 1 corresponds to position 0 in refFlat. In chicken (and other genomes), gaps shown as "N" are interspersed throughout many of the assembled chromosomes, but the refFlat locations are not offset by the gap lengths.

Can somebody please suggest how I can programmatically extract genomic features by nucleotide position when the refFlat positions do not match the nucleotide positions and the offsets are unknown?

Thank you,
Aaron
_______________________________________________
Genome maillist - [email protected]
http://www.soe.ucsc.edu/mailman/listinfo/genome
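
The extraction described in the question can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive implementation: it assumes the usual 11-column, tab-separated refFlat layout (exonStarts and exonEnds in the last two columns, comma-separated), that exonStarts are 0-based and exonEnds are end-exclusive, and that the whole chromosome sequence, including any "N" runs, has already been loaded into one string. All function names and the toy data are hypothetical and for illustration only.

```python
from typing import List, Tuple

# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_refflat_line(line: str) -> Tuple[str, str, str, List[Tuple[int, int]]]:
    """Return (geneName, chrom, strand, [(exonStart, exonEnd), ...]).

    Assumes the 11-column refFlat layout: geneName, name, chrom, strand,
    txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds.
    """
    fields = line.rstrip("\n").split("\t")
    gene, chrom, strand = fields[0], fields[2], fields[3]
    # The coordinate lists usually end with a trailing comma, so skip empties.
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    return gene, chrom, strand, list(zip(starts, ends))


def reverse_complement(seq: str) -> str:
    """Reverse-complement a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def exon_sequences(chrom_seq: str, exons: List[Tuple[int, int]], strand: str) -> List[str]:
    """Slice each exon out of the chromosome string.

    Assumption: starts are 0-based and ends are exclusive, so a plain
    Python slice chrom_seq[start:end] covers the exon.
    """
    seqs = [chrom_seq[start:end] for start, end in exons]
    if strand == "-":
        # Report minus-strand exons in transcript order, reverse-complemented.
        seqs = [reverse_complement(s) for s in reversed(seqs)]
    return seqs


def leading_gap_length(chrom_seq: str) -> int:
    """Length of the run of N at the start of the sequence (a sanity check)."""
    return len(chrom_seq) - len(chrom_seq.lstrip("Nn"))


# Toy data (made up): 5 gap bases followed by 10 real bases.
toy_chrom = "NNNNN" + "ACGTACGTAC"
toy_line = "GeneA\tNM_000000\tchr1\t+\t5\t15\t5\t15\t2\t5,11,\t9,15,"

gene, chrom, strand, exons = parse_refflat_line(toy_line)
print(gene, chrom, strand, exons)
print(exon_sequences(toy_chrom, exons, strand))
print("leading N run:", leading_gap_length(toy_chrom))
```

Under these assumptions the "N" runs count toward positions, so the slices are taken directly against the full sequence. Comparing `leading_gap_length` with the smallest exonStart on a chromosome is one quick way to check whether a given sequence file and a given refFlat table actually share a coordinate system.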
