Hi,
I am a PhD student at the University of New South Wales, working on
retroelements in the human genome. The procedure I have followed so far is:
1. I have downloaded rmsk.txt.gz from the UCSC genome website.
2. Using the rmsk.sql table definition, which gives each repeat's name and
genomic location, I wrote a program to extract the repeat nucleotide sequences
from the 2009 human reference sequence (GRCh37).
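
For reference, the extraction in step 2 can be sketched roughly as below. This is only a minimal illustration, not my actual program. It assumes the column order of the hg19 rmsk.sql definition (genoName, genoStart, genoEnd, strand, repName, repClass and repFamily at 0-based fields 5, 6, 7, 9, 10, 11 and 12), that coordinates are 0-based and half-open, and that the chromosome is already loaded as a plain string. The names Repeat, parse_rmsk_line and extract_sequence are made up for this example.

```python
from typing import NamedTuple


class Repeat(NamedTuple):
    chrom: str      # genoName
    start: int      # genoStart, 0-based inclusive (assumed UCSC convention)
    end: int        # genoEnd, exclusive
    strand: str     # "+" or "-"
    name: str       # repName
    rep_class: str  # repClass
    family: str     # repFamily


def parse_rmsk_line(line: str) -> Repeat:
    """Parse one tab-separated row of rmsk.txt (column order assumed from rmsk.sql)."""
    f = line.rstrip("\n").split("\t")
    return Repeat(f[5], int(f[6]), int(f[7]), f[9], f[10], f[11], f[12])


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def extract_sequence(chrom_seq: str, rep: Repeat) -> str:
    """Slice the repeat out of a chromosome string; reverse-complement minus-strand hits."""
    seq = chrom_seq[rep.start:rep.end]
    if rep.strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

With this layout, the length of an annotated repeat is simply end - start.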

After extracting rmsk.txt.gz to rmsk.txt (a large file, ~455 MB), I found that
the longest HERV sequence in this file is HERVS71-int, with a length of
8909 bp.
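
The check for the longest HERV entry can be sketched like this. It is a rough outline under assumptions: the length of an entry is taken as genoEnd - genoStart, every row is treated as a separate element, and "HERV" is matched as a substring of the repeat name (field 10, with chromosome, start and end at fields 5, 6 and 7, as in the hg19 rmsk.sql definition). The function name longest_repeat is made up for this example.

```python
import gzip
from typing import Iterable, Optional, Tuple


def longest_repeat(
    lines: Iterable[str], name_substring: str = "HERV"
) -> Optional[Tuple[str, int, str, int, int]]:
    """Return (repName, length, chrom, start, end) of the longest matching row, or None."""
    best = None
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 11:
            continue  # skip blank or malformed rows
        name = f[10]
        if name_substring not in name:
            continue
        start, end = int(f[6]), int(f[7])
        length = end - start  # genomic span of this single annotated row
        if best is None or length > best[1]:
            best = (name, length, f[5], start, end)
    return best


# Usage (reads the compressed file directly, so no need to unpack it first):
# with gzip.open("rmsk.txt.gz", "rt") as fh:
#     print(longest_repeat(fh))
```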

However, according to the literature, one of the reported HERV-K elements, with
the corresponding NCBI sequence (GenBank accession no.
M14123<http://www.ncbi.nlm.nih.gov/nuccore/182227>), is 9109 bp long.

So my question is: am I looking at the right UCSC file? The NCBI sequence above
is a human endogenous retrovirus sequence, so I assumed it would automatically
be included in the rmsk file.

Please advise me in this regard.

Firoz
______________________________
Firoz Anwar
Complex System in Biology Group
Centre for Vascular Research (CVR)
Lowy Cancer Research Building
Level 4

University of New South Wales
Email: [email protected]
Mobile # +61 0413185168

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
