Re: [Genome] help: Exonic position map to Protein position

Greg Roe Mon, 11 Apr 2011 15:00:36 -0700

Hi Janeela,

You can use the Table Browser again. Select (Clade/Genome/Assembly) 
Mammal/Pig/susScr2  and:


group: Gene and Gene prediction tracks
track: Ensembl Genes (or N-Scan Genes)
table: ensGene (or nscanGene)
region: genome
identifiers (names/accessions): click on "paste list" and paste in the 
identifiers following the instructions.
output format: sequence
Click get output

Select sequence type: genomic
Click Submit


On the sequence retrieval options page, make sure to uncheck the Introns 
box. Other than that the defaults should work for you - just read down 
the list to make sure, then click Get Sequence.
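
If you would rather script this step, here is a minimal Python sketch of what
unchecking the Introns box amounts to: keep only the exon intervals and
concatenate them. It assumes genePred-style coordinates (0-based starts,
exclusive ends, as in the exonStarts/exonEnds columns of ensGene). The function
name and the toy sequence are made up for illustration, and minus-strand
reverse-complementing is left out.

```python
def splice_exons(genomic_seq, seq_start, exon_starts, exon_ends):
    """Concatenate exon sequences, dropping the introns between them.

    genomic_seq : sequence covering the gene, read on the + strand
    seq_start   : chromosome coordinate of genomic_seq[0] (0-based)
    exon_starts : 0-based exon start coordinates
    exon_ends   : exclusive exon end coordinates
    """
    pieces = []
    for start, end in zip(exon_starts, exon_ends):
        # Shift chromosome coordinates into string indexes.
        pieces.append(genomic_seq[start - seq_start:end - seq_start])
    return "".join(pieces)


# Toy example (invented data): two exons in upper case, one intron in lower case.
toy = "ATGGCCgtaagtcagTTTTAA"
exonic = splice_exons(toy, 1000, [1000, 1015], [1006, 1021])
print(exonic)
```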

If you have further questions, please contact us at [email protected]

-
Greg Roe
UCSC Genome Bioinformatics Group

On 4/4/11 3:41 PM, janeela khan wrote:
>
> Thank you so very much. It was very useful information for me. I wonder if I 
> can also retrieve the exonic sequence for the pig genome?
>
>> From: [email protected]
>> Subject: Genome Digest, Vol 99, Issue 3
>> To: [email protected]
>> Date: Fri, 1 Apr 2011 12:00:12 -0700
>>
>> Send Genome mailing list submissions to
>>      [email protected]
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>      https://lists.soe.ucsc.edu/mailman/listinfo/genome
>> or, via email, send a message with subject or body 'help' to
>>      [email protected]
>>
>> You can reach the person managing the list at
>>      [email protected]
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Genome digest..."
>>
>>
>> Today's Topics:
>>
>>     1. Re: help: Exonic position map to Protein position
>>        (Vanessa Kirkup Swing)
>>     2. Re: How to generate mapping between Ensembl and refseq
>>        transcript IDs (Hiram Clawson)
>>     3. Re: bedgraph data will not display points (Hiram Clawson)
>>     4. Re: when is a query excessive. (Galt Barber)
>>     5. Re: bedgraph data will not display points
>>        (Lionel (Lee) Brooks 3rd)
>>     6. Re: bedgraph data will not display points (Hiram Clawson)
>>     7. protein families (Tom Traut)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 1 Apr 2011 09:38:36 -0700 (PDT)
>> From: Vanessa Kirkup Swing<[email protected]>
>> Subject: Re: [Genome] help: Exonic position map to Protein position
>> To: janeela khan<[email protected]>
>> Cc: UCSC genome<[email protected]>
>> Message-ID:
>>      <[email protected]>
>> Content-Type: text/plain; charset=utf-8
>>
>> Hi Janeela,
>>
>> To figure out where in the exon the protein is translated from, you will 
>> need to use the table browser. To get to the table browser, click on 
>> "Tables" from the blue navigation bar.
>>
>> Set the clade, genome, and assembly.
>>
>> Then you will need to set the following:
>>
>> group: Gene and Gene prediction tracks
>> track: UCSC Genes
>> table: knownGene
>> region: genome
>> identifiers (names/accessions): click on "paste list" and paste in the 
>> identifiers following the instructions.
>> output format: selected fields from primary and related tables
>>
>> click "get output"
>>
>> select the fields you want displayed.
>>
>> click "get output"
>>
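Once the knownGene fields are in hand, the mapping itself can be sketched in
Python as below. This is only an illustration under stated assumptions: the
inputs are the strand, exonStarts, exonEnds, cdsStart and cdsEnd fields (0-based
starts, exclusive ends), exons are numbered in transcript order, and the offset
is counted from the 5' end of the exon. The function name and the toy
coordinates are invented.

```python
def exon_pos_to_protein_pos(strand, exon_starts, exon_ends,
                            cds_start, cds_end, exon_number, offset):
    """Return the 1-based amino-acid number for a base inside an exon.

    exon_number : 1-based, counted in transcript order
    offset      : 0-based distance from the exon's 5' end
    Returns None when the base lies outside the coding region (UTR).
    """
    exons = sorted(zip(exon_starts, exon_ends))
    if strand == "-":
        exons.reverse()  # transcript order runs right-to-left on the minus strand
    start, end = exons[exon_number - 1]
    if not 0 <= offset < end - start:
        raise ValueError("offset falls outside the exon")
    # 0-based genomic coordinate of the requested base.
    genomic = start + offset if strand == "+" else end - 1 - offset
    if not cds_start <= genomic < cds_end:
        return None
    # Count the coding bases that precede this base in transcript order.
    coding_before = 0
    for s, e in exons:
        lo, hi = max(s, cds_start), min(e, cds_end)  # coding part of this exon
        if lo >= hi:
            continue  # exon is entirely UTR
        if s <= genomic < e:
            coding_before += genomic - lo if strand == "+" else hi - 1 - genomic
            break
        coding_before += hi - lo
    return coding_before // 3 + 1


# Toy gene (invented): exons [100,200) and [300,400), CDS [150,350).
print(exon_pos_to_protein_pos("+", [100, 300], [200, 400], 150, 350, 2, 10))
print(exon_pos_to_protein_pos("-", [100, 300], [200, 400], 150, 350, 1, 60))
```

The integer division by three turns the count of preceding coding bases into a
codon index. Bases in a UTR give None rather than a residue number.
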
>> Hope this helps lead you in the right direction. If you have further 
>> questions, please contact us at [email protected]
>>
>> Vanessa Kirkup Swing
>> UCSC Genome Bioinformatics Group
>>
>> ----- Original Message -----
>> From: "janeela khan"<[email protected]>
>> To: "UCSC genome"<[email protected]>
>> Sent: Thursday, March 31, 2011 8:21:24 AM
>> Subject: [Genome] help: Exonic position map to Protein position
>>
>>
>>
>> Dear All,
>> Could you guide me on how I can map certain positions in an exon to protein 
>> positions? I do not have the exact genomic positions, but I have the gene 
>> name and the relative position in an exon. Is there a way to map this 
>> position to the protein?
>> Thanks for the help in advance
>> MvH/janeela
>>
>>                                      
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 01 Apr 2011 09:41:00 -0700
>> From: Hiram Clawson<[email protected]>
>> Subject: Re: [Genome] How to generate mapping between Ensembl and
>>      refseq  transcript IDs
>> To: "Cook, Malcolm"<[email protected]>
>> Cc: "'Rajasimha, Harsha \(NIH/NEI\) \[C\]'"<[email protected]>,
>>      "'[email protected]'"<[email protected]>
>> Message-ID:<[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Sorry Malcolm, there isn't a generic method for all genomes at UCSC.
>> This is a most interesting example you have here.  Usually chrM at
>> Ensembl is: "Mt"
>>
>> Newer genome assemblies at UCSC are including two tables:
>> ensemblLift
>> ucscToEnsembl
>>
>> Which allow translation of UCSC names to Ensembl names and
>> coordinate conversions for haplotypes and other random bits that
>> might be located in a different coordinate system.  For example:
>>
>> $ hgsql -e "select * from ensemblLift;" hg19
>> +-----------------+----------+
>> | chrom           | offset   |
>> +-----------------+----------+
>> | HSCHR4_1        | 69170076 |
>> | HSCHR17_1       | 43384863 |
>> | HSCHR6_MHC_APD  | 28696603 |
>> | HSCHR6_MHC_COX  | 28477796 |
>> | HSCHR6_MHC_DBB  | 28696603 |
>> | HSCHR6_MHC_MANN | 28696603 |
>> | HSCHR6_MHC_MCF  | 28696603 |
>> | HSCHR6_MHC_QBL  | 28696603 |
>> | HSCHR6_MHC_SSTO | 28659142 |
>> +-----------------+----------+
>>
>> $ hgsql -e "select * from ucscToEnsembl;" hg19 | grep MHC
>> chr6_ssto_hap7  HSCHR6_MHC_SSTO
>> chr6_qbl_hap6   HSCHR6_MHC_QBL
>> chr6_mcf_hap5   HSCHR6_MHC_MCF
>> chr6_mann_hap4  HSCHR6_MHC_MANN
>> chr6_cox_hap2   HSCHR6_MHC_COX
>> chr6_dbb_hap3   HSCHR6_MHC_DBB
>> chr6_apd_hap1   HSCHR6_MHC_APD
>>
>> It would be a useful process to go back over some of the older popular
>> genomes to add these conversion tables.
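
As a rough Python sketch of how the two tables could be combined: the rows
below are copied from the hgsql output above, while the function name is made
up. The direction of the offset is an assumption here (Ensembl coordinate =
UCSC coordinate + offset) and should be checked against a known feature before
relying on it.

```python
# Rows copied from the hg19 ucscToEnsembl output shown above.
UCSC_TO_ENSEMBL = {
    "chr6_ssto_hap7": "HSCHR6_MHC_SSTO",
    "chr6_qbl_hap6": "HSCHR6_MHC_QBL",
    "chr6_mcf_hap5": "HSCHR6_MHC_MCF",
    "chr6_mann_hap4": "HSCHR6_MHC_MANN",
    "chr6_cox_hap2": "HSCHR6_MHC_COX",
    "chr6_dbb_hap3": "HSCHR6_MHC_DBB",
    "chr6_apd_hap1": "HSCHR6_MHC_APD",
}

# Matching rows copied from the hg19 ensemblLift output shown above.
ENSEMBL_LIFT = {
    "HSCHR6_MHC_APD": 28696603,
    "HSCHR6_MHC_COX": 28477796,
    "HSCHR6_MHC_DBB": 28696603,
    "HSCHR6_MHC_MANN": 28696603,
    "HSCHR6_MHC_MCF": 28696603,
    "HSCHR6_MHC_QBL": 28696603,
    "HSCHR6_MHC_SSTO": 28659142,
}


def ucsc_to_ensembl(chrom, pos):
    """Translate a UCSC (chrom, pos) pair into an Ensembl (name, pos) pair."""
    name = UCSC_TO_ENSEMBL[chrom]  # KeyError if the sequence is not listed
    # ASSUMPTION: the offset is added to the UCSC coordinate.
    return name, pos + ENSEMBL_LIFT.get(name, 0)


print(ucsc_to_ensembl("chr6_cox_hap2", 1000))
```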
>>
>> --Hiram
>>
>> Cook, Malcolm wrote:
>>> Hiram,
>>>
>>> Is there a similar approach for chromosomal identifiers?  (i.e. chrM in dm3 
>>> is dmel_mitochondrion_genome at Ensembl)
>>>
>>> Or better, an SQL query for same?
>>>
>>> Thx
>>>
>>> Malcolm Cook
>>> Stowers Institute for Medical Research -  Bioinformatics
>>> Kansas City, Missouri  USA
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Fri, 01 Apr 2011 09:49:16 -0700
>> From: Hiram Clawson<[email protected]>
>> Subject: Re: [Genome] bedgraph data will not display points
>> To: Lionel Brooks<[email protected]>
>> Cc: [email protected]
>> Message-ID:<[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Good Morning Lionel:
>>
>> The bedGraph drawing mechanism can construct bar graphs at your
>> specified intervals, or when you select graphType=points it will
>> draw only the top of the bar graph at your specified intervals.
>> There is no line drawing except by the trick of "smoothing" points
>> such that they appear to be in a line graph.  This only works if
>> the data points are continuous when seen in the genome browser.
>> Smoothing will not smear points into areas where there is no
>> data value specified.
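
One way to check whether a bedGraph file has such holes is sketched below in
Python. It is only a diagnostic under stated assumptions: the data lines are
the usual four columns (chrom, start, end, value), the file is sorted by
chromosome and start, and the function name and sample rows are invented.

```python
def find_gaps(lines):
    """Yield (chrom, gap_start, gap_end) for every stretch with no data
    between consecutive intervals on the same chromosome."""
    prev_chrom, prev_end = None, None
    for line in lines:
        fields = line.split()
        # Skip track/browser lines and anything that is not a 4-column row.
        if len(fields) != 4 or fields[0] in ("track", "browser"):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom == prev_chrom and start > prev_end:
            yield chrom, prev_end, start
        prev_chrom, prev_end = chrom, end


# Invented sample data: one 30-base hole on chr1.
sample = [
    "track type=bedGraph autoScale=on graphType=points",
    "chr1 0 10 5",
    "chr1 10 20 7",
    "chr1 50 60 9",
    "chr2 5 10 1",
]
for gap in find_gaps(sample):
    print(gap)
```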
>>
>> The Genome Graphs function of the genome browser:
>>      http://genome.ucsc.edu/cgi-bin/hgGenome
>> will only draw lines between your specified points.
>>
>> See also:
>>
>> http://genomewiki.ucsc.edu/index.php/Selecting_a_graphing_track_data_format
>>
>> --Hiram
>>
>> Lionel Brooks wrote:
>>> Hello all,
>>>
>>> I have a bedgraph file.  In the past I have used such files to attain
>>> graphic output in the form of a smoothed line but I uploaded my most
>>> recent data set and now I cannot get a line graph.  In fact, I'm not
>>> sure what I am looking at because the values that are displayed along
>>> the y-axis are not described with a label.
>>>
>>> Here is my track line:
>>> track type=bedGraph autoScale=on graphType=points windowingFunction=mean
>>> smoothingWindow=16
>>>
>>> My data format is
>>>
>>> chr   coordA   coordB   value
>>>
>>> The approximate range of data values is: 5 <= value <= 500.
>>> Is it possible that your plotting function cannot compute this line
>>> because my coordinate intervals are too small?
>>> Another possibly relevant issue may be that the coordinate intervals are
>>> not fixed length.
>>>
>>> Any suggestions for course of action would be great.
>>>
>>> Sincerely,
>>> Lionel
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Fri, 01 Apr 2011 10:25:14 -0700
>> From: Galt Barber<[email protected]>
>> Subject: Re: [Genome] when is a query excessive.
>> To: John Hayward<[email protected]>
>> Cc: "[email protected]"<[email protected]>
>> Message-ID:<[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>>
>> Hi, John!
>>
>> Queries that take more than a few minutes to run are
>> probably inappropriate for the shared public mysql server.
>>
>> I found this query formulation for you that takes less than one minute:
>>
>> select name, observed, count(*) from(
>> (select name, observed, 'CEU' from hapmapSnpsCEU where chrom = 'Chr16')
>> union
>> (select name, observed, 'YRI' from hapmapSnpsYRI where chrom = 'Chr16')
>> union
>> (select name, observed, 'CHB' from hapmapSnpsCHB where chrom = 'Chr16')
>> union
>> (select name, observed, 'JPT' from hapmapSnpsJPT where chrom = 'Chr16')
>> ) resultAlias group by name, observed having count(*) = 4 limit 30;
>>
>> +-----------+----------+----------+
>> | name      | observed | count(*) |
>> +-----------+----------+----------+
>> | rs1000014 | A/G      |        4 |
>> | rs1000047 | C/T      |        4 |
>> | rs1000077 | C/G      |        4 |
>> | rs1000078 | A/G      |        4 |
>> | rs1000100 | A/T      |        4 |
>> | rs1000174 | A/G      |        4 |
>> | rs1000178 | C/T      |        4 |
>> | rs1000192 | A/G      |        4 |
>> | rs1000193 | A/C      |        4 |
>> | rs1000454 | C/G      |        4 |
>> | rs1000455 | A/T      |        4 |
>> | rs1000710 | G/T      |        4 |
>> | rs1000711 | C/G      |        4 |
>> | rs1000720 | A/G      |        4 |
>> | rs1000742 | C/T      |        4 |
>> | rs1001157 | A/G      |        4 |
>> | rs1001170 | G/T      |        4 |
>> | rs1001171 | A/T      |        4 |
>> | rs1001302 | A/G      |        4 |
>> | rs1001362 | C/T      |        4 |
>> | rs1001366 | C/T      |        4 |
>> | rs1001493 | C/T      |        4 |
>> | rs1001554 | A/G      |        4 |
>> | rs1001608 | C/T      |        4 |
>> | rs1001631 | C/G      |        4 |
>> | rs1001655 | A/G      |        4 |
>> | rs1001722 | G/T      |        4 |
>> | rs1001776 | C/T      |        4 |
>> | rs1001861 | A/G      |        4 |
>> | rs1001871 | C/G      |        4 |
>> +-----------+----------+----------+
>> 30 rows in set (46.17 sec)
>>
>> Of course for your own full output,
>> you would remove the "limit" clause.
>>
>> In case you are curious how many there are:
>>
>> select count(*) from (
>> select name, observed, count(*) from(
>> (select name, observed, 'CEU' from hapmapSnpsCEU where chrom = 'Chr16')
>> union
>> (select name, observed, 'YRI' from hapmapSnpsYRI where chrom = 'Chr16')
>> union
>> (select name, observed, 'CHB' from hapmapSnpsCHB where chrom = 'Chr16')
>> union
>> (select name, observed, 'JPT' from hapmapSnpsJPT where chrom = 'Chr16')
>> ) resultAlias group by name, observed having count(*) = 4) resultAlias2;
>> +----------+
>> |   105841 |
>> +----------+
>> 1 row in set (47.60 sec)
>>
>>
>> Another alternative would be to capture the output from each like this:
>>
>> select name, observed, 'CEU' from hapmapSnpsCEU where chrom = 'Chr16'
>>
>> for each of your 4 files.
>> You could sort them by name (rsId) either with an order by clause in
>> sql, or with the unix sort command.
>>
>> You can even use the unix join command to join them up on the name and
>> observed fields.
>>
>> Once the contents of each of the 4 sets are sorted by name and observed,
>> joining them can be very fast.
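
The same idea can also be sketched in Python with set intersection instead of
sort and join. This assumes each per-population query result has already been
read into (name, observed) pairs; the function name and the sample rows are
invented for illustration.

```python
def shared_snps(*tables):
    """Return the sorted (name, observed) pairs present in every table.

    Each table is an iterable of (name, observed) tuples, for example the
    rows captured from one per-population query.
    """
    common = None
    for rows in tables:
        pairs = set(rows)
        common = pairs if common is None else common & pairs
    return sorted(common or [])


# Invented rows standing in for the CEU, YRI, CHB and JPT outputs.
ceu = [("rsA", "A/G"), ("rsB", "C/T"), ("rsC", "C/G")]
yri = [("rsA", "A/G"), ("rsB", "C/T")]
chb = [("rsA", "A/G"), ("rsB", "A/T")]
jpt = [("rsA", "A/G"), ("rsB", "C/T")]
print(shared_snps(ceu, yri, chb, jpt))
```

Only pairs that match on both name and observed in all four inputs survive,
which mirrors the "having count(*) = 4" condition in the query above.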
>>
>> -Galt
>>
>> 4/1/2011 8:25 AM, John Hayward:
>>>    I would like to run queries against the genome-mysql.cse.ucsc.edu 
>>> database which may be excessive and don't want to cause problems for others.
>>>
>>> I want to find matches for a particular chromosome which have the same name 
>>> and observation for tables hapmapSnpsCEU, hapmapSnpsYRI, hapmapSnpsCHB, 
>>> hapmapSnpsJPT.
>>>
>>> Doing a query to pick up the count of hapmapSnpsCEU for one chromosome took 
>>> 0.14 seconds.
>>> A query to pick up the count joining hapmapSnpsCEU and hapmapSnpsCHB 
>>> took 8.40 seconds.
>>>
>>> If I join all tables would that constitute an excessive load?
>>>
>>> Below is the query joining two tables.
>>> ======
>>> select count(*) from hapmapSnpsCEU, hapmapSnpsCHB where hapmapSnpsCEU.chrom 
>>> = 'Chr16' and hapmapSnpsCHB.chrom = 'Chr16' and hapmapSnpsCEU.name = 
>>> hapmapSnpsCHB.name and hapmapSnpsCEU.observed = hapmapSnpsCHB.observed;
>>> ======
>>> johnh...
>>>
>>>
>>
>>
>> ------------------------------
>>
>> Message: 5
>> Date: Fri, 01 Apr 2011 14:16:42 -0400
>> From: "Lionel (Lee) Brooks 3rd"<[email protected]>
>> Subject: Re: [Genome] bedgraph data will not display points
>> To: Hiram Clawson<[email protected]>
>> Cc: [email protected]
>> Message-ID:<[email protected]>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Hi Hiram,
>>
>> From
>> http://genomewiki.ucsc.edu/index.php/Selecting_a_graphing_track_data_format
>>
>>     1. Pseudo /line graphs/ can be drawn with the wiggle tracks by
>>        setting optional drawing parameters in the display of the track to
>>        draw /points/ instead of bars with smoothing on to smear the
>>        points together into a line.
>>
>> The pseudo line graph functionality is what I desire.
>> Previously, it had been possible to do this with bedgraph format files.
>> I don't know what "smearing" means.  I'm just looking for a quick way to
>> draw the moving average as I had been able to do before.
>> As I mentioned, my track line is:
>> track type=bedGraph autoScale=on graphType=points windowingFunction=mean
>> smoothingWindow=16
>>
>> I suppose my solution is to modify my scripts to use the wiggle variable
>> step format?
>>
>>
>> thanks,
>> -Lionel
>>
>>
>>
>> ------------------------------
>>
>> Message: 6
>> Date: Fri, 1 Apr 2011 11:18:07 -0700 (PDT)
>> From: Hiram Clawson<[email protected]>
>> Subject: Re: [Genome] bedgraph data will not display points
>> To: "Lionel (Lee) Brooks 3rd"<[email protected]>
>> Cc: [email protected]
>> Message-ID:
>>      <[email protected]>
>> Content-Type: text/plain; charset=utf-8
>>
>> It won't make any difference what type of wiggle format you choose.
>> They all draw the same way.
>>
>> You are going to have to provide me with a URL to your data file
>> so I can see what it looks like.
>>
>> --Hiram
>>
>>
>> ------------------------------
>>
>> Message: 7
>> Date: Fri, 1 Apr 2011 14:34:07 -0400
>> From: "Tom Traut"<[email protected]>
>> Subject: [Genome] protein families
>> To: [email protected]
>> Message-ID:<p06240804c9bbcac80186@[152.19.36.114]>
>> Content-Type: text/plain; charset=us-ascii; format=flowed
>>
>> Can I use your site (or any other) to find a listing of major protein 
>> families?
>>
>> how many kinases
>> how many proteases
>> how many G proteins
>>
>> etc
>> -- 
>> Tom Traut
>>
>> Professor of Biochemistry & Biophysics
>>
>> Phone:       919 966-5044
>> FAX: 919 966-2852
>> URL: www.unc.edu/~traut
>>
>>
>>
>> ------------------------------
>>
>>
>>
>> End of Genome Digest, Vol 99, Issue 3
>> *************************************
>                                       
