Hi Ann,

(1) Using our public MySQL server, genome-mysql, as suggested by Brent 
Pedersen is a good option, and it was kind of him to provide the 
commands. (See the "Direct MySQL access to data" FAQ for more details 
about using genome-mysql: 
http://genome-test.cse.ucsc.edu/FAQ/FAQdownloads.html#download29.) 
However, please be aware that we did not vet these commands; you'll 
have to review them and verify that they suit your purposes.
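
For orientation only, here is a minimal Python sketch of the underlying 
logic, assuming you have already pulled rows from a gene table such as 
knownGene with the columns strand, cdsStart, cdsEnd, exonStarts and 
exonEnds (0-based, half-open coordinates, with the exon lists stored as 
comma-separated strings). The function names and the sample values are 
made up for illustration, and this is not a vetted tool:

```python
def parse_list(text):
    """Turn a comma-separated string such as '100,300,500,' into ints."""
    return [int(x) for x in text.strip(",").split(",") if x]


def first_intron_and_cds_exon(strand, cds_start, cds_end,
                              exon_starts, exon_ends):
    """Return (first_intron, first_coding_exon) as (start, end) tuples.

    Either element is None when the gene has no intron or no coding
    region. "First" is relative to the direction of transcription.
    """
    # Exons are assumed to be listed in genomic (left-to-right) order.
    exons = list(zip(parse_list(exon_starts), parse_list(exon_ends)))

    # Introns are the gaps between consecutive exons.
    introns = [(exons[i][1], exons[i + 1][0])
               for i in range(len(exons) - 1)]

    # Clip each exon to the coding region and keep non-empty pieces.
    coding = []
    for start, end in exons:
        c_start, c_end = max(start, cds_start), min(end, cds_end)
        if c_start < c_end:
            coding.append((c_start, c_end))

    # On the minus strand the rightmost region is transcribed first.
    idx = -1 if strand == "-" else 0
    first_intron = introns[idx] if introns else None
    first_coding = coding[idx] if coding else None
    return first_intron, first_coding


if __name__ == "__main__":
    # Hypothetical three-exon gene on the minus strand.
    print(first_intron_and_cds_exon("-", 150, 550,
                                    "100,300,500,", "200,400,600,"))
```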

Another option is to use the Table Browser to get a set of *all* 
introns in one query and a set of *all* coding exons in a separate 
query. You'll then have to write a script to parse out the first 
intron and the first coding exon for each gene.

From the Table Browser, select a gene track and set the output format 
to "sequence". On the "Select sequence type for UCSC Genes" page, select 
"genomic". On the "UCSC Genes Genomic Sequence" page, to get all 
introns you'll want to select "Introns" and "One FASTA record per 
region...". This will provide you a list of *all* introns. To get all 
coding exons, you'll do nearly the same thing, except that on the 
"UCSC Genes Genomic Sequence" page you'll select "CDS Exons" instead of 
"Introns". This will provide you a list of *all* coding exons.

You'll then have to write a script to extract the first intron/exon of 
each gene from each of your lists of results. Each region, intron or 
exon, in your results will start with ">database_table_geneId_#" and the 
number at the end will indicate the intron/exon number. For genes on the 
plus strand, the region numbered 0 is the first intron or exon of the 
gene (1 is the second, etc.). Here is an example:

 >hg19_knownGene_uc001aaa.3_0

For genes on the minus strand, your script will have to determine which 
of the regions has the highest intron/exon number and pull that one as 
the first intron or exon.
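
As a rough starting point, such a script might look like the Python 
sketch below. It assumes each FASTA header begins with 
"database_table_geneId_#" as described above and also carries a 
"strand=+" or "strand=-" field; please check that against your own 
output. The function names and the tiny sample input are made up for 
illustration:

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def first_regions(lines):
    """Map each gene name to (region_number, sequence) of its first region."""
    by_gene = defaultdict(list)
    for header, seq in read_fasta(lines):
        fields = header.split()
        # Split "hg19_knownGene_uc001aaa.3_0" into the name and the number.
        name, number = fields[0].rsplit("_", 1)
        strand = "+"  # assumption: treat a missing strand field as plus
        for field in fields[1:]:
            if field.startswith("strand="):
                strand = field[len("strand="):]
        by_gene[name].append((int(number), strand, seq))

    result = {}
    for name, regions in by_gene.items():
        strand = regions[0][1]
        # Plus strand: lowest number is first. Minus strand: highest.
        pick = max if strand == "-" else min
        number, _, seq = pick(regions, key=lambda region: region[0])
        result[name] = (number, seq)
    return result


if __name__ == "__main__":
    sample = [
        ">hg19_knownGene_geneA_0 strand=+", "AAAA",
        ">hg19_knownGene_geneA_1 strand=+", "CCCC",
        ">hg19_knownGene_geneB_0 strand=-", "GGGG",
        ">hg19_knownGene_geneB_1 strand=-", "TTTT",
    ]
    for gene, (number, seq) in first_regions(sample).items():
        print(gene, number, seq)
```

You would run it once on the intron file and once on the coding exon 
file, passing an open file handle in place of the sample list.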

(2) We don't have any public programs specifically for finding tandem 
repeats within microsatellite loci, but you could do an intersection 
between the Simple Repeat track and the Microsatellite track via the 
Table Browser.
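
If you would rather do the intersection offline, the idea can be 
sketched in a few lines of Python. This assumes you have exported both 
tracks as simple (chrom, start, end) records with 0-based, half-open 
coordinates; the function name and sample coordinates are made up, and 
the nested loop is only practical for small inputs:

```python
from collections import defaultdict


def intersect(regions_a, regions_b):
    """Return the overlapping pieces between two lists of intervals.

    Each interval is a (chrom, start, end) tuple. Every overlapping
    pair contributes one (chrom, start, end) piece to the result.
    """
    # Index the second list by chromosome to avoid useless comparisons.
    b_by_chrom = defaultdict(list)
    for chrom, start, end in regions_b:
        b_by_chrom[chrom].append((start, end))

    hits = []
    for chrom, a_start, a_end in regions_a:
        for b_start, b_end in b_by_chrom.get(chrom, ()):
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # touching intervals do not overlap
                hits.append((chrom, start, end))
    return hits


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    simple_repeats = [("chr1", 100, 200), ("chr2", 50, 60)]
    microsatellites = [("chr1", 150, 250), ("chr2", 60, 70)]
    print(intersect(simple_repeats, microsatellites))
```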

Please contact the mail list ([email protected]) again if you have any 
further questions.

Katrina Learned
UCSC Genome Bioinformatics Group


On 8/25/11 9:07 PM, Ann Eileen Miller Baker wrote:
> Brooke and others,
> (1) Is there an alternative way to learn the first intron, first coding
> exon?
> (2) Does the Bioinformatics group have public programs for finding tandem
> repeats within microsatellite loci?
> Thanks for your help Brooke,
> A
>
> On Thu, Aug 25, 2011 at 1:04 PM, Brooke Rhead<[email protected]>  wrote:
>
>> Hello Ann,
>>
>> The Table Browser does not have an option to limit output to only the first
>> intron or first coding exon.
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>>
>> On 08/25/11 10:00, Ann Eileen Miller Baker wrote:
>>
>>> 25Au11
>>> Please answer below. I am aware that the table browser delivers
>>> for mouse DMIT microsatellite loci overlapping introns, exons, coding
>>> exons,
>>> and UTR, but I am asking if there is any way to customize this listing to
>>> include  FIRST INTRON; FIRST CODING EXON.
>>> Thanks,
>>> Ann
>>>
>>> ---------- Forwarded message ----------
>>> From: Ann Eileen Miller Baker<[email protected]>
>>> Date: Sun, Aug 21, 2011 at 3:45 PM
>>> Subject: identifying "first intron", "first coding exon"
>>> To:[email protected]
>>>
>>>
>>> 21Au11
>>> Dear UCSC genomics team,
>>> When determining genomic elements (UTR, exons, coding exons, introns)
>>> co-occuring with DMIT microsatellite loci,
>>> <<Is there a way to specify requesting first intron, first coding exon>>?
>>> Thanks,
>>> Ann
>>> _______________________________________________
>>> Genome maillist  [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>
