Dear Dr. Jackson,

Thank you very much for your detailed answer! It is very helpful to me.

Do you think I need to download the knownGene data and write a program to
identify non-coding genes by selecting those genes where cdsStart ==
cdsEnd? I ask because I could not do this using the Table browser.
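
For reference, a minimal sketch of such a program in Python. It assumes a
local copy of the knownGene table dump (tab-separated, gzipped) and assumes
the column order listed in KNOWN_GENE_COLUMNS; the order should be checked
against the knownGene.sql schema for the assembly in use. The function names
and the file path are hypothetical.

```python
import gzip

# Assumed column order of the knownGene dump; verify against knownGene.sql.
KNOWN_GENE_COLUMNS = [
    "name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd",
    "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID",
]


def iter_noncoding(lines):
    """Yield each row (as a dict) whose cdsStart equals its cdsEnd."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        row = dict(zip(KNOWN_GENE_COLUMNS, line.split("\t")))
        if int(row["cdsStart"]) == int(row["cdsEnd"]):
            yield row


def noncoding_from_file(path):
    """Return the non-coding rows from a gzipped table dump at `path`."""
    with gzip.open(path, "rt") as handle:
        return list(iter_noncoding(handle))
```

Calling noncoding_from_file("knownGene.txt.gz") would then return only the
rows designated as non-coding by the cdsStart == cdsEnd rule.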

Thank you again!

Best regards,

Chuangye
2009-11-25


2009/11/25, Jennifer Jackson <[email protected]>:
> Hello Chuangye,
>
> The browser tools option for EST data is to extract the associated genomic
> sequence for the intron regions (the genomic region covered by the EST
> alignment). There is no automated way to extract the actual EST sequence. To
> download these genomic regions, use the Table browser with the following steps:
>
> 1) go to http://genome.ucsc.edu/cgi-bin/hgTables
> 2) set controls to the assembly of interest and track group to "mRna and Est
> tracks"
> 3) choose either intronEST or just EST
> 4) then select "sequence" as the output format, name the output file, and submit.
> 5) at the next output details page, choose the option "Regions between
> blocks". A block is a technical name we use for an exon - basically any
> contiguous alignment section versus genomic sequence. Whether or not a block
> is actually an exon will depend on the quality of the data being aligned.
> For Est data, this can vary.
> 6) download regions. The set will be a mix of regions bounded by coding or
> non-coding exons.
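
The "Regions between blocks" output in step 5 can also be approximated
locally from the block coordinates stored in the EST tables. The sketch
below is only an illustration: it assumes PSL-style comma-separated block
starts and block sizes in 0-based, half-open forward-strand genomic
coordinates, and the function names are hypothetical.

```python
def parse_int_list(text):
    """Parse a comma-separated list such as '100,300,600,' into integers."""
    return [int(item) for item in text.split(",") if item]


def regions_between_blocks(block_starts, block_sizes):
    """Return (start, end) pairs for the gaps between consecutive blocks."""
    starts = parse_int_list(block_starts)
    sizes = parse_int_list(block_sizes)
    if len(starts) != len(sizes):
        raise ValueError("block starts and sizes differ in length")
    gaps = []
    for i in range(len(starts) - 1):
        gap_start = starts[i] + sizes[i]   # end of the current block
        gap_end = starts[i + 1]            # start of the next block
        if gap_end > gap_start:            # ignore adjacent blocks
            gaps.append((gap_start, gap_end))
    return gaps
```

As the text notes, whether such a gap is a real intron depends on the
quality of the aligned data.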
>
> Ests are not annotated as coding or non-coding, but you can use a gene track
> (UCSC genes or RefSeq genes or other) to extract genomic intron regions.
> Follow the same method above, starting with the gene track, select genomic
> sequence, then Introns.
>
> Non-coding genes can be identified by selecting those genes where the
> cdsStart == cdsEnd. This is how we designate non-coding genes. An example of
> this is the data from the UCSC Genes track (knownGene table) where name ==
> uc001aaa.2. Use the Assembly browser or the Table browser and search using
> this identifier to view the example.
>
> To locate ESTs associated with non-coding genes, create a custom track that
> contains only the non-coding genes, then start a Table browser query
> starting with the EST track. Set an intersection (overlap) against this
> custom track and output the data. Only ESTs that align to the same genomic
> region as a non-coding gene will be returned in the result.
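
The same kind of intersection can be sketched locally once both sets of
coordinates are available. This is a simplified "any overlap" test, not the
Table browser's own implementation; it assumes 0-based, half-open
coordinates, and the tuple layouts and function name are hypothetical.

```python
from collections import defaultdict


def ests_overlapping_genes(ests, genes):
    """Return names of ESTs that overlap at least one gene interval.

    ests:  iterable of (name, chrom, start, end)
    genes: iterable of (chrom, start, end)
    """
    genes_by_chrom = defaultdict(list)
    for chrom, start, end in genes:
        genes_by_chrom[chrom].append((start, end))

    hits = []
    for name, chrom, start, end in ests:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < g_end and g_start < end
               for g_start, g_end in genes_by_chrom.get(chrom, ())):
            hits.append(name)
    return hits
```

A linear scan per EST is slow for whole-genome tables; for large inputs a
sorted or binned index per chromosome would be a better choice.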
>
> Spliced ESTs are those that contain verified splice sites at the alignment
> block boundaries; this data is in the Spliced ESTs track (table =
> intronEst). ESTs that do not have splice sites are combined with the spliced
> set in the EST track (table = all_est). These tables can be quite
> large, so for assemblies with a large number of ESTs either extract the data
> per region or chromosome, or consider using the files from Downloads and your
> own tools to parse out the data from the fasta files based on the block
> coordinates in the tables.
>
> Read the track descriptions for more details about how the data is
> classified. To do this, go into the Assembly browser and click on the track
> name.
>
> More help:
> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
> (general help and example queries)
> http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#Download
> (this link has the actual EST sequences - which you could potentially parse
> using your own tools and the coordinate data from the tables in the
> mySQL database).
>
> To find the files associated with any mySQL table, go into Downloads, use
> the links to locate the assembly, then go into the Annotation Database
> directory. All tables are here - the files are named the same as the tables,
> with .txt.gz at the end for the data and .sql for the schema.
> Download them via FTP to use locally.
>
> We hope this helps,
> Jennifer Jackson
>
> ------------------------------------------------
> Jennifer Jackson
> UCSC Genome Bioinformatics Group
>
> ----- "Chuangye" <[email protected]> wrote:
>
>> From: "Chuangye" <[email protected]>
>> To: [email protected]
>> Sent: Tuesday, November 24, 2009 6:04:59 PM GMT -08:00 US/Canada Pacific
>> Subject: [Genome] HOW to get noncoding EST and the intron part of
>> protein-coding EST
>>
>> Hello, Sir/Miss,
>>
>> How can I get noncoding ESTs (ncRNA) and the intron part of
>> protein-coding ESTs from the UCSC Genome Browser? What are the differences
>> in the "all_est" table between the "Spliced Ests" track and the "Human ESTs"
>> track? And are the "intronEST" records of the "Spliced Ests" track introns
>> of genes?
>>
>> Thanks!
>>
>> Chuangye
>>
>> 2009-11-24
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>