Good Evening Sina:

I'm sorry, I do not understand your question at all.
The "exact" locations of all repeats are exactly genoName, genoStart, genoEnd.

If you are trying to repeat mask some other type of genome sequence, you need to run RepeatMasker on that sequence. It doesn't make any sense to try to translate locations of repeats from one genome to a different genome.

--Hiram

----- Original Message -----
From: "Sina Vivekanandan" <[email protected]>
To: "Hiram Clawson" <[email protected]>
Cc: [email protected]
Sent: Sunday, November 21, 2010 9:37:32 PM GMT -08:00 US/Canada Pacific
Subject: RE: [Genome] Regarding repeat masked tables (human) in ucsc

Hi Hiram

Yes..Thank you. I have seen the table schema before. But the doubt was how to get the absolute coordinates from this table. It shows columns like genoName, genoStart, genoEnd and genoLeft which gives the coordinates in the genome sequence and also columns like repStart, repEnd and repLeft which gives the coordinate positions in the repeat sequence. So my question is how to relate these two. What I need to do is to find the exact coordinates of all the repeats in this genome, so that I can use them to mask the genome file I have myself. But how do I use these 7 columns in unison to find these coordinates? Hope I have made my question clear this time.

Thanks in advance.
Sina K V

-----Original Message-----
From: Hiram Clawson [mailto:[email protected]]
Sent: Friday, November 19, 2010 11:22 PM
To: Vivekanandan, Sina
Cc: [email protected]
Subject: Re: [Genome] Regarding repeat masked tables (human) in ucsc

Good Morning Sina:

Please note, you can view the table schema for any table in the genome browser via the "tables" link in the blue navigation bar.
Select group: Variation and Repeats, table: rmsk and then "describe table schema"

Note the definition of genoName, genoStart, genoEnd:

genoName   chr1   varchar(255)      Genomic sequence name
genoStart  10000  int(10) unsigned  Start in genomic sequence
genoEnd    10468  int(10) unsigned  End in genomic sequence

You can select these from the genome browser with the MySQL command:

    mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select genoName,genoStart,genoEnd from rmsk;" hg19

--Hiram

Vivekanandan, Sina wrote:
> Hi
>
> I am having trouble understanding the table for repeat masked regions
> from UCSC. What I was looking for is absolute genomic coordinates so as
> to use those coordinates to remove the repeat regions in a coordinate
> based file that I have. What I find are 6 coordinate based columns
> namely: genoStart, genoEnd, genoLeft, repStart, repEnd and repLeft.
> How do I use all these to find the absolute coordinates of the repeat
> regions? Please help.
>
> Thanks and regards
> Sina K V

The information contained in this message may be confidential and legally protected under applicable law. The message is intended solely for the addressee(s). If you are not the intended recipient, you are hereby notified that any use, forwarding, dissemination, or reproduction of this message is strictly prohibited and may be unlawful. If you are not the intended recipient, please contact the sender by return e-mail and destroy all copies of the original message.

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
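
For readers following this thread: the answer above is that genoName, genoStart and genoEnd alone locate each repeat in the genome, so the rep* columns are not needed for masking. Below is a minimal sketch of how such rows might be applied to sequences from the same assembly. The function name soft_mask and the toy data are hypothetical and not part of the thread, and the sketch assumes the 0-based start, end-exclusive coordinate convention used in UCSC database tables.

```python
from collections import defaultdict


def soft_mask(sequences, repeats):
    """Lowercase every repeat interval in a {name: sequence} dict.

    `repeats` is an iterable of (genoName, genoStart, genoEnd) tuples.
    Assumption: coordinates are 0-based with an exclusive end, so they
    map directly onto Python slices.
    """
    # Group the intervals by sequence name for quick lookup.
    by_name = defaultdict(list)
    for name, start, end in repeats:
        by_name[name].append((start, end))

    masked = {}
    for name, seq in sequences.items():
        chars = list(seq)
        for start, end in by_name.get(name, []):
            chars[start:end] = [c.lower() for c in chars[start:end]]
        masked[name] = "".join(chars)
    return masked


# Toy data, invented for illustration only (not real rmsk rows).
toy_sequences = {"chrT": "ACGTACGTACGT"}
toy_repeats = [("chrT", 2, 5), ("chrT", 8, 10)]
print(soft_mask(toy_sequences, toy_repeats)["chrT"])  # ACgtaCGTacGT
```

This only makes sense when the coordinates and the sequence file come from the same assembly (hg19 in the query above); as noted in the thread, a different genome needs its own RepeatMasker run.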
