Hi, I am a PhD student. My current project involves repeat elements (retrotransposons) in the human genome. I have checked the UCSC Genome Browser for relevant files, and in the human annotation database (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/) I found rmsk.txt.gz, which gives the chromosomal locations of repeat elements in the human genome. I have a few questions, and I would be grateful if someone could address them:
1. Under the field repeatName, most of the entries have HERV, then some number, and then "-int". Does "-int" mean intron? If so, would it not contain the gene information? Where can I get the complete sequence at that location of the chromosome? I know I can go to Repbase to get the reference sequence, but I am after the sequences in the current genome build.

2. Some of the elements have HERV in their repeatName but some do not. Because I am new to this area, my understanding was that all ERVs in human are called HERV. So why do some of them start with HERV in the repeatName field while others start with LTR or MER?

I have given a few examples from the rmsk file below.

Chromosome  start     stop      repeatName  repeatFamilyName
chrY        27387638  27395881  HERV9-int   ERV1
chrY        27695203  27699373  HERVK3-int  ERVK
chrY        27719085  27723450  HERVL-int   ERVL
chrX        90706526  90709746  MER57A-int  ERV1
chrX        93126842  93130433  LTR25-int   ERV1

I would really appreciate it if someone could address these questions.

Thank you for your cooperation.

Firoz
Centre for Vascular Research
Lowy Cancer Research Centre
University of New South Wales
Sydney, Australia
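
For illustration, here is a minimal Python sketch of how rows like the examples above could be pulled out of rmsk.txt.gz as genomic intervals. It is not an official UCSC tool. It assumes the file is the tab-separated rmsk table dump with genoName, genoStart, genoEnd, strand, repName and repFamily at the 0-based column positions noted in the comments, and that only the three repFamily values from the examples are of interest. The function names and the output layout are hypothetical.

```python
import gzip

# Assumed 0-based column positions in the UCSC rmsk table dump:
# bin, swScore, milliDiv, milliDel, milliIns, genoName, genoStart, genoEnd,
# genoLeft, strand, repName, repClass, repFamily, ...
GENO_NAME, GENO_START, GENO_END = 5, 6, 7
STRAND, REP_NAME, REP_FAMILY = 9, 10, 12


def internal_erv_records(lines, families=("ERV1", "ERVK", "ERVL")):
    """Yield (chrom, start, end, repName, repFamily, strand) for '-int' repeats."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= REP_FAMILY:
            continue  # skip blank or truncated rows
        name, family = fields[REP_NAME], fields[REP_FAMILY]
        if name.endswith("-int") and family in families:
            yield (fields[GENO_NAME], int(fields[GENO_START]),
                   int(fields[GENO_END]), name, family, fields[STRAND])


def write_bed(path_in, path_out):
    """Hypothetical helper: read rmsk.txt.gz and write a 6-column BED file."""
    with gzip.open(path_in, "rt") as src, open(path_out, "w") as dst:
        for chrom, start, end, name, family, strand in internal_erv_records(src):
            # BED name column carries repName and repFamily; score is set to 0.
            dst.write(f"{chrom}\t{start}\t{end}\t{name}|{family}\t0\t{strand}\n")


if __name__ == "__main__":
    # Assumed local file names; adjust to wherever rmsk.txt.gz was downloaded.
    write_bed("rmsk.txt.gz", "erv_internal.bed")
```

The resulting BED file could then be given, together with an hg19 FASTA file, to a sequence extraction tool such as bedtools getfasta to obtain the sequences from the genome build itself rather than the Repbase consensus.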
