Good Evening Sina:

I'm sorry, I do not understand your question at all.

The "exact" locations of all repeats are given by genoName, genoStart, and genoEnd.

If you are trying to repeat-mask a different genome sequence, you need
to run RepeatMasker on that sequence.  It doesn't make any sense to try
to translate locations of repeats from one genome to a different genome.
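
For a sequence that really is the same assembly as the table (hg19 here),
applying the intervals could look roughly like the sketch below. The function
name mask_intervals and the choice of "N" as the mask character are
illustrative assumptions, not part of any UCSC tool. The sketch relies on
UCSC table coordinates being zero-based, half-open: genoStart is the first
masked position counting from 0, and genoEnd is one past the last.

```python
def mask_intervals(seq, intervals, mask_char="N"):
    """Return seq with every [start, end) interval replaced by mask_char.

    Assumes zero-based, half-open coordinates, as used for genoStart and
    genoEnd in UCSC tables. The name and signature are hypothetical.
    """
    masked = list(seq)
    for start, end in intervals:
        # Clip to the sequence bounds so a stray interval cannot raise.
        start = max(start, 0)
        end = min(end, len(masked))
        for i in range(start, end):
            masked[i] = mask_char
    return "".join(masked)


# Toy 10-base sequence with one repeat covering positions 2, 3 and 4.
example = mask_intervals("ACGTACGTAC", [(2, 5)])
```

Overlapping intervals are harmless here, since masking the same base twice
leaves it masked.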

--Hiram

----- Original Message -----
From: "Sina Vivekanandan" <[email protected]>
To: "Hiram Clawson" <[email protected]>
Cc: [email protected]
Sent: Sunday, November 21, 2010 9:37:32 PM GMT -08:00 US/Canada Pacific
Subject: RE: [Genome] Regarding repeat masked tables (human) in ucsc

Hi Hiram

Yes, thank you. I have seen the table schema before, but my question was how to 
get the absolute coordinates from this table. It shows columns like genoName, 
genoStart, genoEnd and genoLeft, which give the coordinates in the genome 
sequence, and also columns like repStart, repEnd and repLeft, which give the 
coordinate positions in the repeat sequence. So my question is how to relate 
these two. What I need to do is find the exact coordinates of all the 
repeats in this genome, so that I can use them to mask the genome file I have 
myself. But how do I use these 7 columns in unison to find these coordinates? 
I hope I have made my question clear this time.

Thanks in advance.
Sina K V

-----Original Message-----
From: Hiram Clawson [mailto:[email protected]]
Sent: Friday, November 19, 2010 11:22 PM
To: Vivekanandan, Sina
Cc: [email protected]
Subject: Re: [Genome] Regarding repeat masked tables (human) in ucsc

Good Morning Sina:

Please note, you can view the table schema for any table in the genome
browser via the "tables" link in the blue navigation bar.  Select
group: Variation and Repeats, table: rmsk, and then "describe table schema".

Note the definition of genoName, genoStart, genoEnd:

genoName        chr1    varchar(255)    Genomic sequence name
genoStart       10000   int(10) unsigned        Start in genomic sequence
genoEnd         10468   int(10) unsigned        End in genomic sequence

You can select these from the genome browser with the MySQL command:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select genoName,genoStart,genoEnd from rmsk;" hg19
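
If the output of that query is redirected to a file, the mysql client normally
writes tab-separated columns with a header row. Under that assumption, the
rows could be read into (chrom, start, end) tuples and written back out as
BED-style lines with something like the following sketch. The function names
parse_rmsk_rows and to_bed are hypothetical helpers for illustration only.

```python
def parse_rmsk_rows(lines):
    """Yield (chrom, start, end) from tab-separated query output.

    Assumes three columns in the order genoName, genoStart, genoEnd,
    with an optional header row naming those columns.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if fields[0] == "genoName":
            # Header row printed by the mysql client; skip it.
            continue
        yield fields[0], int(fields[1]), int(fields[2])


def to_bed(lines):
    """Format the parsed rows as three-column BED-style lines."""
    return ["%s\t%d\t%d" % row for row in parse_rmsk_rows(lines)]


# The example row from the schema listing above.
sample = ["genoName\tgenoStart\tgenoEnd\n", "chr1\t10000\t10468\n"]
bed_lines = to_bed(sample)
```

No coordinate conversion is needed for BED, because BED uses the same
zero-based start and exclusive end as the table.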

--Hiram

Vivekanandan, Sina wrote:
> Hi
>
> I am having trouble understanding the table for repeat masked regions
> from UCSC. What I was looking for is absolute genomic coordinates so as
> to use those coordinates to remove the repeat regions in a coordinate
> based file that I have. What I find are 6 coordinate based columns
> namely: genoStart, genoEnd, genoLeft, repStart, repEnd and repLeft.
> How do I use all these to find the absolute coordinates of the repeat
> regions? Please help.
>
> Thanks and regards
> Sina K V

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome