Re: [Genome] About NCBI gene coordinate from Gene ID or refseqID

Greg Roe Tue, 08 Feb 2011 12:04:45 -0800

Hi Shibu,

 >> check the boxes next to "name", "txStart", and "txStop"


Of course I meant "txEnd".  ;-)

-
Greg



On 2/8/11 11:56 AM, Greg Roe wrote:
> Hi Shibu,
>
> The easiest way to do this would be to use the Table Browser
> (http://genome.ucsc.edu/cgi-bin/hgTables).  Select the assembly of
> interest, mm9 I assume. Then select:
>
> Group: Genes and Gene Prediction Tracks
> Track: RefSeq Genes
> Table: refGene
> Region: genome
>
> Then under identifiers click upload list.  You'll need to make a list of
> all the gene ids, without all the extra data (ex: NM_020501).  You'll
> need to remove the version numbering as well. So NM_020501.1 should be
> shown as NM_020501, without the .1. The easiest low-tech way to do this
> would be to load your data in a spreadsheet using the pipe as a column
> delimiter, removing the extra columns, then doing a find and replace to
> remove all the version designations, ".1", etc. Then save the gene ids
> to a text file and upload that.
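
If a spreadsheet is inconvenient, the same cleanup can be scripted. The following is only a minimal sketch, assuming every input line looks like the "gi|...|ref|NM_020501.1|" examples in this thread; the function name clean_ids and the regular expression are illustrative, not part of any UCSC or NCBI tool.

```python
import re

# Matches a RefSeq-style accession (two capital letters, underscore, digits)
# with an optional ".N" version suffix, e.g. NM_020501.1.
# Assumption: each line contains at most one accession of interest.
ACCESSION_RE = re.compile(r"\b([A-Z]{2}_\d+)(?:\.\d+)?\b")

def clean_ids(lines):
    """Return unversioned accessions in input order, skipping lines without one."""
    ids = []
    for line in lines:
        match = ACCESSION_RE.search(line)
        if match:
            ids.append(match.group(1))
    return ids

sample = ["gi|10048421|ref|NM_020488.1|", "gi|10048425|ref|NM_020501.1|"]
# Prints one cleaned accession per line, ready to save and upload.
print("\n".join(clean_ids(sample)))
```

Writing the printed lines to a text file gives a list suitable for the "upload list" step.
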
>
> Then set the output format to "selected fields from primary and related
> tables", choose the desired file type returned, and click "get output".
>
> On the subsequent screen, check the boxes next to "name", "txStart", and
> "txStop".  Click "get output" and you should have the data you need.
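
For anyone who prefers the downloaded refGene.txt.gz dump over the Table Browser, a lookup along the following lines should give the same three fields. This is a sketch under an assumption: that the dump is tab-separated with the columns ordered bin, name, chrom, strand, txStart, txEnd, and so on. The helper name load_coords is made up for illustration, and the sample row reuses the coordinates quoted in this thread with a placeholder bin value.

```python
import csv
import io

def load_coords(handle):
    """Map accession -> list of (chrom, txStart, txEnd).

    Assumed column order: bin, name, chrom, strand, txStart, txEnd, ...
    An accession can align to more than one location, hence the list.
    """
    coords = {}
    for row in csv.reader(handle, delimiter="\t"):
        name, chrom = row[1], row[2]
        tx_start, tx_end = int(row[4]), int(row[5])
        coords.setdefault(name, []).append((chrom, tx_start, tx_end))
    return coords

# Illustrative row only: the leading bin value is a placeholder, and the
# columns after txEnd are omitted.
sample = "585\tNM_020501\tchr6\t-\t131636578\t131637481\n"
coords = load_coords(io.StringIO(sample))
print(coords["NM_020501"])
```

To run it on the real download, pass gzip.open("refGene.txt.gz", "rt") from the standard gzip module in place of the io.StringIO sample.
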
>
> Now, there are about 28,150 rows in that data set.  You may have more
> refSeq ids because UCSC only displays Accession types that start with
> NM_ and NR_.  There are several other types, see:
> http://www.ncbi.nlm.nih.gov/projects/RefSeq/key.html#accessions. So we
> only display a subset of the total.
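
To see how much of a list falls outside the NM_ and NR_ subset, the unmatched IDs can be tallied by accession prefix. A small sketch, assuming the IDs have already had their version suffixes removed; the function name unmatched_by_prefix and the tiny sample sets are hypothetical.

```python
from collections import Counter

def unmatched_by_prefix(ids, known):
    """Count IDs absent from `known`, grouped by prefix such as NM or XM."""
    return Counter(acc.split("_", 1)[0] for acc in ids if acc not in known)

# Hypothetical sample: one ID present in the table, two that are not.
ids = ["NM_020501", "XM_915912", "XM_912174"]
known = {"NM_020501"}
print(unmatched_by_prefix(ids, known))
```
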
>
> Hope that helps!
>
> Just email the genome list if you have any additional questions.
>
> -
> Greg Roe
> UCSC Genome Browser Group
>
>
>
>
> On 2/6/11 1:31 PM, John, Shibu wrote:
>> Hi,
>>
>> I have a list of 36692 NCBI RefSeq IDs in the following format (mouse,
>> downloaded in May 2009):
>> *****
>> gi|10048421|ref|NM_020488.1|
>> gi|10048425|ref|NM_020501.1|
>> ****
>> Is there any way to get the chromosome start and end positions of these
>> gene IDs?
>> (e.g. chr6   131636578       131637481  gi|10048425|ref|NM_020501.1|)
>>
>> I tried to intersect the "NM_" IDs with the UCSC file
>> "http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/refGene.txt.gz":
>> ********
>> gi|10048425|ref|NM_020501.1|     Tas2r105        NM_020501       chr6    -       131636578       131637481
>> ********
>>
>> But this "refGene.txt" contains only 28108 IDs.
>>
>> I also tried to find the genes with the Entrez batch finder:
>> *******
>> Id=XM_915912:   This record was removed as a result of standard genome 
>> annotation processing. See the genome build documentation at 
>> http://www.ncbi.nlm.nih.gov/genome/guide/build.html for further information, 
>> or contact [email protected].
>> Id=XM_912174:   This record was replaced or removed.
>> ........
>> .......
>> Received lines: 36692
>> Rejected lines: 17
>> Removed duplicates: 0
>> Passed to Entrez: 36675
>> *********
>>
>> It would be a great help if you could help me get the chromosome start
>> positions of these genes or the corresponding Ensembl gene IDs.
>>
>> Thanks,
>> Shibu
>>
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
