Dear Professor,

Many thanks for your reply and explanation.
I have just downloaded the desired output again in BED format from UCSC.
However, I am still a bit confused by the following cases:
Case 1:
Gene Name  Chromosome  Start Position  End Position  Remarks
PCDH15     10          55568451        55569239      3'UTR
PCDH15     10          55568451        55569306      coding exons
PCDH15     10          55568502        55568819      intron

It seems that the PCDH15 regions overlap in the UCSC output?
I can't really distinguish which regions are the 3'UTR, coding exons, or intron
of PCDH15.

Case 2:
Gene Name  Chromosome  Start Position  End Position  Remarks
ATP5L      11          118279813       118280562     3'UTR
ATP5L      11          118279714       118280562     5'UTR

It would be good if you could provide some real cases showing how to
distinguish the promoter region, 3'UTR, coding exons, 5'UTR, and intron in the
UCSC output.


Thanks for your advice regarding miRNA, rRNA, tRNA, etc.
I am able to sort that out now.

Looking forward to hearing from you.

best regards
Edge




________________________________
 From: Brooke Rhead <[email protected]>
To: Edge Edge <[email protected]> 
Cc: ucsc <[email protected]> 
Sent: Saturday, June 30, 2012 11:31 AM
Subject: Re: [Genome]  Help with extract rRNA, tRNA, 3'-UTR and 5'-UTR from 
UCSC Genome Browser
 
Hi Edge,

> perfectly fine." In biology behaviour of human genome, is it normal
> that 3'-UTR overlap with 5'-UTR?

There is a bug in the Table Browser that I think is causing a lot of confusion 
here.  When you choose to retrieve 3' UTR or 5' UTR output from a region where 
there is a non-coding gene, the Table Browser should not return any result for 
that non-coding gene.  We have recorded this bug and we hope to fix it 
eventually.  In the meantime, you will need another way to distinguish UTRs 
from genes that are entirely non-coding.
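
One possible workaround, given here only as a rough sketch and not as an 
official UCSC method: in genePred-style tables such as knownGene and refGene, a 
transcript with no coding region is normally stored with cdsStart equal to 
cdsEnd.  Assuming you have downloaded "all fields from selected table" and 
loaded each row into a dictionary, a filter along these lines would keep only 
the protein-coding transcripts.  The function names and the two example rows 
are made up for illustration.

```python
def is_noncoding(cds_start: int, cds_end: int) -> bool:
    """Assumption: a non-coding transcript has an empty CDS (cdsStart == cdsEnd)."""
    return cds_start >= cds_end


def coding_only(rows):
    """Keep only rows whose coding region is non-empty."""
    return [r for r in rows if not is_noncoding(r["cdsStart"], r["cdsEnd"])]


# Hypothetical rows, not real annotations.
rows = [
    {"name": "txA", "txStart": 1000, "txEnd": 5000, "cdsStart": 1200, "cdsEnd": 4800},
    {"name": "txB", "txStart": 7000, "txEnd": 9000, "cdsStart": 9000, "cdsEnd": 9000},
]

print([r["name"] for r in coding_only(rows)])
```

Any UTR reported for a transcript that this filter drops (txB in the example) 
can then be disregarded.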

> Can I know that whether UCSC got provide the coordinates of human
> miRNA? If yes, can I know how to extract it out?

There are several tracks that include miRNA annotations.  To decide which track 
you want to use, you can click on the track name on the main Genome Browser 
page (http://genome.ucsc.edu/cgi-bin/hgTracks) and read about where the data 
comes from and how recent it is.  Some tracks you might want to consider on 
hg19 are:  sno/miRNA, tRNA Genes, RefSeq Genes, UCSC Genes (which contains 
annotations based on data from RefSeq, Genbank, CCDS, UniProt, Rfam, and the 
tRNA Genes track), and GENCODE.

To get coordinate positions, there is no need to retrieve sequence from the 
Table Browser.  There are two different ways to get coordinate positions.  The 
first is to choose either "all fields from selected table" or "selected fields 
from primary and related tables" as the output format.  The start and stop 
coordinates will be included in the output.  Also, if the table includes UTRs 
(as in the refGene and knownGene tables, for instance), you can calculate the 
positions of the UTRs from the transcript and coding-region fields: one UTR 
spans txStart to cdsStart and the other spans cdsEnd to txEnd (which of the two 
is the 5' UTR depends on the strand).  (To see a description of the fields of 
any table, select it in the 
Table Browser and hit the "describe table schema" button.)  To read more about 
the way genome coordinates are stored in our tables, see: 
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms.
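
As a minimal sketch of that calculation (the function name and example numbers 
are hypothetical): the region between txStart and cdsStart is the 5' UTR on the 
+ strand but the 3' UTR on the - strand, and the reverse holds for the region 
between cdsEnd and txEnd.  The sketch assumes the table's own coordinates 
(zero-based start, one-based end) and returns whole spans, which may include 
introns; exon-level UTR pieces would also need exonStarts and exonEnds.

```python
def utr_spans(strand, tx_start, tx_end, cds_start, cds_end):
    """Return (five_prime, three_prime) spans as (start, end) tuples, or None."""
    if cds_start >= cds_end:
        # Empty coding region: treat as non-coding, so no UTRs are defined.
        return None, None
    left = (tx_start, cds_start) if cds_start > tx_start else None
    right = (cds_end, tx_end) if tx_end > cds_end else None
    # On the minus strand the transcript reads right to left.
    return (left, right) if strand == "+" else (right, left)


# Hypothetical transcript on each strand.
print(utr_spans("+", 1000, 5000, 1200, 4800))
print(utr_spans("-", 1000, 5000, 1200, 4800))
```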

The other way to get coordinate positions is to select the "BED - browser 
extensible data" as the output format.  BED format is described here:  
http://genome.ucsc.edu/FAQ/FAQformat.html#format1.  If you use this method, you 
will be able to limit the output to just the UTRs. However, you will run into 
the bug described above of completely non-coding genes showing up in the output.
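
If you go the BED route, one way to work around the bug is to post-filter the 
output yourself.  The following is only a sketch under two assumptions: that 
you have separately collected the IDs of the non-coding transcripts, and that 
the BED name field (column 4) begins with the transcript ID.  Please check both 
against your own output; the example lines and IDs are invented.

```python
def parse_bed(lines):
    """Parse BED lines into dicts, skipping blank, comment, track and browser lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        records.append({
            "chrom": fields[0],
            "start": int(fields[1]),   # zero-based start
            "end": int(fields[2]),     # end, not included in the interval
            "name": fields[3] if len(fields) > 3 else "",
        })
    return records


def drop_noncoding(records, noncoding_ids):
    """Remove records whose name starts with a known non-coding transcript ID."""
    ids = tuple(noncoding_ids)
    return [r for r in records if not (ids and r["name"].startswith(ids))]


# Hypothetical BED lines, not real Table Browser output.
bed_lines = [
    "track name=example",
    "chr1\t100\t200\ttxA_utr3\t0\t+",
    "chr1\t300\t400\ttxB_utr3\t0\t-",
]
kept = drop_noncoding(parse_bed(bed_lines), {"txB"})
print([r["name"] for r in kept])
```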

I hope this helps explain what you are seeing in the Genome Browser and helps 
you get a start on using the Table Browser.  If you haven't seen the Open Helix 
tutorials on the Browser, especially the Table Browser tutorial, you might find 
these helpful:  http://www.openhelix.com/ucsc.

If you have further questions, please contact us again at [email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 6/28/12 6:08 PM, Edge Edge wrote:
> Dear Professor,
> 
> 
> According your previous reply, "This is why you see overlap between
> the 5’ UTR of uc010nxq.1 and the 3’ UTR of uc001aaa.3 and this is
> perfectly fine." In biology behaviour of human genome, is it normal
> that 3'-UTR overlap with 5'-UTR?
> 
> Can I know that whether UCSC got provide the coordinates of human
> miRNA? If yes, can I know how to extract it out?
> 
> I got try to extract out the human exon coordinates by using the
> following method: 1. Set the following options: Group: Genes and Gene
> Prediction Tracks Track: RefSeq Genes Table: knownGene Output format:
> sequence
> 
> 2. Click the "get output" button
> 
> 3. Select "genomic" and click the "submit" button
> 
> 4. Check the " CDS Exons "
> 
> 5. Click the "get sequence" button
> 
> From the result, it shown that
>> hg19_refGene_NM_032291 range=chr1:67000042-67208778 5'pad=0 3'pad=0
>> strand=+ repeatMasking=none
> However, when I check the coordinates through UCSC Genome Browser, it
> shown that chr1:67,000,042-67,208,778 is fall in Gene region but not
> exons of that particular gene.
> 
> Can I know how to just extract the coordinate of each exon in every
> gene? My main purpose just hope can extract the coordinates of
> promoter, 3'-UTR, 5'-UTR, miRNA, rRNA, tRNA, exon position of human
> by using UCSC.
> 
> Really thanks for your advice. Many thanks for your explanation in
> detail previously. Very helpful and learn a lot from your
> explanation. I'm very fresh at bioinformatics.
> 
> Thanks.
> 
> best regards Edge Research Student
> _______________________________________________ Genome maillist  -
> [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
