Hi Jennifer,

We already established that these are GenBank IDs. The key question is
how to convert GenBank IDs to Entrez IDs. Cross-joining the related
tables has returned nothing so far, but I do not believe that there is not
even a single match to my GenBank IDs in the UCSC tables. Any suggestions?
Anyone else?

BTW, how does the UCSC join function work?

Thanks,
YB
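
A minimal sketch of why an accession-keyed join can come back as "n/a",
assuming the join behaves like a SQL left outer join on exact key equality
(an assumption about the Table Browser, not a description of its actual
implementation). The tables are tiny in-memory stand-ins, not the real mm9
tables; the NM_000001 row and its Entrez ID 99999 are made up for
illustration, while AB004856 and AB011407 come from the output quoted below.

```python
import sqlite3


def join_accessions():
    """Left-join toy gbCdnaInfo rows to geneName and refLink."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE gbCdnaInfo (acc TEXT, geneName INTEGER);
        CREATE TABLE geneName (id INTEGER, name TEXT);
        CREATE TABLE refLink (mrnaAcc TEXT, name TEXT, locusLinkId INTEGER);

        -- Two GenBank accessions from the thread plus one made-up RefSeq row.
        INSERT INTO gbCdnaInfo VALUES
            ('AB004856', 1), ('AB011407', 0), ('NM_000001', 2);
        INSERT INTO geneName VALUES (0, 'n/a'), (1, 'thrB'), (2, 'toyGene');
        -- In this toy setup refLink only holds the RefSeq accession.
        INSERT INTO refLink VALUES ('NM_000001', 'toyGene', 99999);
        """
    )
    rows = con.execute(
        """
        SELECT g.acc, n.name, r.locusLinkId
        FROM gbCdnaInfo AS g
        LEFT JOIN geneName AS n ON n.id = g.geneName
        LEFT JOIN refLink AS r ON r.mrnaAcc = g.acc
        ORDER BY g.acc
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for acc, gene, entrez in join_accessions():
        # A missing match comes back as None; print it as "n/a".
        print(acc, gene, "n/a" if entrez is None else entrez, sep="\t")
```

In this toy data only the accession that is also present in refLink.mrnaAcc
picks up a locusLinkId; the other accessions keep their geneName value but
get no Entrez ID.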


On Wed, 12 May 2010 12:51:27 -0700, Jennifer Jackson <[email protected]>
wrote:
> Hello,
> 
> If the sequences are not in the RefSeq track, then the RefSeq track 
> tables will return no data. It appears that your IDs are not RefSeqs.
> 
> Thanks,
> Jennifer
> 
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
> 
> On 5/12/10 6:36 AM, Yongsheng Bai wrote:
>> Hello,
>>
>> I have not been able to get the Entrez ID from the GenBank ID, no matter
>> which tables I tried to combine (mm9.kgXref.refseq (via gbCdnaInfo.acc);
>> mm9.refGene.name (via gbCdnaInfo.acc); mm9.refLink.mrnaAcc (via
>> gbCdnaInfo.acc)). The remaining columns are always "n/a".
>>
>> Do you know why? Thanks!!
>>
>> ===============
>>
>> #mm9.gbCdnaInfo.acc  mm9.gbCdnaInfo.geneName mm9.geneName.id mm9.geneName.name    mm9.refLink.geneName    mm9.refLink.locusLinkId
>> AB004856     1       1       thrB    n/a     n/a
>> AB005263     2       2       argA    n/a     n/a
>> AB011407     0       0       n/a     n/a     n/a
>> AB012144     3       3       sam-pr  n/a     n/a
>> AB012145     0       0       n/a     n/a     n/a
>> AB017109     5       5       hacA    n/a     n/a
>> AB019621     0       0       n/a     n/a     n/a
>> AB026157     0       0       n/a     n/a     n/a
>> AB027742     6       6       nifH    n/a     n/a
>> AB027743     6       6       nifH    n/a     n/a
>> AB027744     6       6       nifH    n/a     n/a
>> AB027745     6       6       nifH    n/a     n/a
>> AB027746     6       6       nifH    n/a     n/a
>> AB027747     6       6       nifH    n/a     n/a
>> AB027748     6       6       nifH    n/a     n/a
>> AB027749     6       6       nifH    n/a     n/a
>> AB027750     6       6       nifH    n/a     n/a
>> AB045977     0       0       n/a     n/a     n/a
>> AB062062     0       0       n/a     n/a     n/a
>> AB062753     0       0       n/a     n/a     n/a
>> AB071366     7       7       azr     n/a     n/a
>> AB071367     7       7       azr     n/a     n/a
>> AB071368     7       7       azr     n/a     n/a
>> AB088633     8       8       enolA   n/a     n/a
>> AB094433     9       9       LipL32  n/a     n/a
>> AB094434     9       9       LipL32  n/a     n/a
>> AB094435     9       9       LipL32  n/a     n/a
>> AB094436     9       9       LipL32  n/a     n/a
>> AB094437     9       9       LipL32  n/a     n/a
>> AB099701     10      10      comS    n/a     n/a
>> AB109116     11      11      SsgB    n/a     n/a
>> AB166870     12      12      SshEstI n/a     n/a
>> AB176840     14      14      gyrB    n/a     n/a
>> AB240674     19      19      lipL41  n/a     n/a
>> AB240675     19      19      lipL41  n/a     n/a
>> AB240676     19      19      lipL41  n/a     n/a
>> AB240677     19      19      lipL41  n/a     n/a
>> AB240678     19      19      lipL41  n/a     n/a
>> AB240679     20      20      lipL45  n/a     n/a
>> AB240680     20      20      lipL45  n/a     n/a
>> AB240681     20      20      lipL45  n/a     n/a
>> AB240682     20      20      lipL45  n/a     n/a
>> AB240683     20      20      lipL45  n/a     n/a
>>
>>
>>
>> -----Original Message-----
>> From: Jennifer Jackson [mailto:[email protected]]
>> Sent: Tuesday, May 11, 2010 4:32 PM
>> To: [email protected]
>> Cc: 'Antonio Coelho'; [email protected]
>> Subject: Re: [Genome] download hg18 genomic sequences
>>
>> Hello -
>>
>> Locus Link IDs have been retired at NCBI. They have been replaced with
>> Entrez Genes. We retain the older table names/labels as LocusLink in our
>> database for convenience reasons, but the content is Entrez.
>>
>> Thanks,
>> Jennifer
>>
>> ---------------------------------
>> Jennifer Jackson
>> UCSC Genome Informatics Group
>> http://genome.ucsc.edu/
>>
>> On 5/11/10 1:08 PM, Yongsheng Bai wrote:
>>> Does locusLinkId in the refFlat table refer to the Entrez ID?
>>>
>>> -----Original Message-----
>>> From: Jennifer Jackson [mailto:[email protected]]
>>> Sent: Tuesday, May 11, 2010 4:02 PM
>>> To: [email protected]
>>> Cc: 'Antonio Coelho'; [email protected]
>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>
>>> Hello -
>>>
>>> The Table browser can link files if you do not just wish to download
>>> and process them yourself.
>>>
>>> Start with a base track (like mRNA), use output = "selected fields from
>>> primary and related tables", name file, and click on "get output". The
>>> next form will allow you to then link in the other tables and select
>>> fields.
>>>
>>> A previous answer today has the path for a similar query that you can
>>> use as a template combined with the table information we have already
>>> discussed:
>>> https://lists.soe.ucsc.edu/pipermail/genome/2010-May/022216.html
>>>
>>> The Table browser user's guide has instructions for all functions:
>>> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#SelectedFields
>>>
>>> Best wishes with your project,
>>> Jennifer
>>>
>>> ---------------------------------
>>> Jennifer Jackson
>>> UCSC Genome Informatics Group
>>> http://genome.ucsc.edu/
>>>
>>> On 5/11/10 12:51 PM, Yongsheng Bai wrote:
>>>> Hello Jennifer,
>>>>
>>>> What I really want is a joined table with Entrez ID, Gene symbol,
>>>> GenBank ID...
>>>>
>>>> Thanks,
>>>> YB
>>>>
>>>> -----Original Message-----
>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>> Sent: Tuesday, May 11, 2010 3:35 PM
>>>> To: [email protected]
>>>> Cc: 'Antonio Coelho'; [email protected]
>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>
>>>> Hi Yongsheng,
>>>>
>>>> geneName is a GenBank table, so it is associated with all GenBank tracks.
>>>> Whenever gbCdnaInfo is the selected table (the "top" defined table),
>>>> geneName will appear as a linked table below.
>>>>
>>>> One way to find it is:
>>>>
>>>> mRna track ->
>>>> describe table schema ->
>>>> click on gbCdnaInfo in associated tables ->
>>>> geneName will now appear in the list of linked tables
>>>>
>>>> Thanks!
>>>> Jennifer
>>>>
>>>> On 5/11/10 12:26 PM, Yongsheng Bai wrote:
>>>>> Under which group/track is the table "geneName" located?
>>>>>
>>>>> -----Original Message-----
>>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>>> Sent: Tuesday, May 11, 2010 2:46 PM
>>>>> To: [email protected]
>>>>> Cc: 'Antonio Coelho'; [email protected]
>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>
>>>>> Hello Yongsheng,
>>>>>
>>>>> The table refGene has two fields of interest:
>>>>> name = GenBank accession (nuc)
>>>>> name2 = gene name
>>>>>
>>>>> The table gbCdnaInfo has two fields of interest:
>>>>> acc = GenBank accession (nuc)
>>>>> gi = GenBank identifier
>>>>>
>>>>> The table geneName has two fields of interest:
>>>>> id = linking number into gbCdnaInfo.geneName
>>>>> name = gene name as listed in the GenBank data sheet (note: RefSeq
>>>>> sequences will have this, as will many mRNA sequences. It just depends
>>>>> on what the data submission included.)
>>>>>
>>>>> To link in more identifiers based on UCSC Genes, which includes RefSeq
>>>>> as an input, use the table kgXref. Please note that any RefSeq sequences
>>>>> that have been added since the last UCSC Genes track update (2009-10-08
>>>>> for hg19) will not be included in the UCSC Genes track's tables
>>>>> (including kgXref). You will not find mRNA or other GenBank accessions
>>>>> in this table, but you will find data from the other sources used to
>>>>> build the UCSC Genes track. See the UCSC Genes track description for
>>>>> details about sources and the last update.
>>>>>
>>>>> To find/explore the schema yourself, open up the Table browser to the
>>>>> target genome, select a track/table, then use the "describe table
>>>>> schema" button. The selected table will be defined, followed by its
>>>>> linked tables (including the keys on which they are linked to the
>>>>> selected table above). Any of these can be clicked on to "promote" it
>>>>> to become the selected table. More or fewer linked tables may then come
>>>>> up. This is an excellent way to navigate the schema and find out what
>>>>> data is available in which tables.
>>>>>
>>>>> Hopefully this is helpful!
>>>>> Jennifer
>>>>>
>>>>> ---------------------------------
>>>>> Jennifer Jackson
>>>>> UCSC Genome Informatics Group
>>>>> http://genome.ucsc.edu/
>>>>>
>>>>> On 5/11/10 7:19 AM, Yongsheng Bai wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Thanks! What's the UCSC table name for converting "GenBank ID" to
>>>>>> "Entrez ID/Gene symbol"?
>>>>>>
>>>>>> Thanks,
>>>>>> YB
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Antonio Coelho [mailto:[email protected]]
>>>>>> Sent: Thursday, May 06, 2010 3:09 PM
>>>>>> To: [email protected]
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>>
>>>>>> Hello Yongsheng,
>>>>>> There is no limit to the size of the files, but as you have noticed,
>>>>>> larger files tend to load very slowly.
>>>>>> One alternative is to convert your files to the bigBed format. You can
>>>>>> read about it here:
>>>>>>
>>>>>> http://genome.ucsc.edu/FAQ/FAQformat.html#format1.5
>>>>>> http://genome.ucsc.edu/goldenPath/help/bigBed.html
>>>>>>
>>>>>> You could also try breaking up your file into a group of smaller bed
>>>>>> files.
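
One way to break a large BED file into smaller ones is to split it by
chromosome. A minimal Python sketch, assuming whitespace-separated BED
lines whose first field is the chromosome; the function names and the
choice to drop "track"/"browser" header lines are assumptions made for
illustration, not a UCSC tool.

```python
from collections import defaultdict


def split_bed_by_chrom(lines):
    """Group BED lines by chromosome (first field), keeping input order."""
    groups = defaultdict(list)
    for line in lines:
        text = line.strip()
        # Skip blank lines, comments, and custom-track header lines.
        if not text or text.startswith(("#", "track", "browser")):
            continue
        chrom = text.split()[0]
        groups[chrom].append(text)
    return dict(groups)


def write_groups(groups, prefix):
    """Write one file per chromosome, named <prefix>.<chrom>.bed."""
    paths = []
    for chrom, rows in groups.items():
        path = f"{prefix}.{chrom}.bed"
        with open(path, "w") as out:
            out.write("\n".join(rows) + "\n")
        paths.append(path)
    return paths
```

Each resulting file could then be loaded as its own custom track; whether
that loads faster in practice would need to be checked.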
>>>>>>
>>>>>> I hope this answers your question. Please feel welcome to contact us
>>>>>> again.
>>>>>>
>>>>>> Antonio Coelho
>>>>>> UCSC Genome Bioinformatics Group
>>>>>>
>>>>>> Yongsheng Bai wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Is there any file size limit for loading a bed file into UCSC's
>>>>>>> custom track? I am loading a ~200MB file, and it takes forever...
>>>>>>>
>>>>>>> Thanks.
>>>>>>> YB
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>>>>> Sent: Monday, May 03, 2010 2:17 PM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> The track GC Percent may be helpful. It is based on the
>>>>>>> reference genome.
>>>>>>>
>>>>>>>         quote from GC Percent track description page: Description
>>>>>>>
>>>>>>>         The GC percent track shows the percentage of G (guanine)
>>>>>>>         and C (cytosine) bases in 5-base windows. High GC content
>>>>>>>         is typically associated with gene-rich areas.
>>>>>>>
>>>>>>>         You could BLAT your sequence against the reference genome
>>>>>>>         and view the region or note the coordinates, then export
>>>>>>>         data from this track using the Table browser.
>>>>>>>
>>>>>>>         Or for batch use, if you know the coordinates, upload them
>>>>>>>         into the region filter in the Table browser and export data
>>>>>>>         from the GC Percent track or use the file from downloads
>>>>>>>         (below).
>>>>>>>
>>>>>>> To do this in batch on the command line to generate GC content for
>>>>>>> your own sequences, use the utility hgGcPercent from the kent source
>>>>>>> tree. It would take input data that you format. Please ask if you
>>>>>>> need help to convert your files to the required formats.
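
For a rough idea of what a windowed GC calculation looks like on your own
sequence, here is a minimal Python sketch. It is not hgGcPercent and makes
no claim to match its output; the function name, the non-overlapping
windows, and the handling of a short final window are assumptions made for
illustration. The 5-base window size follows the track description quoted
above.

```python
def gc_percent_windows(seq, window=5):
    """Return (start, percent GC) for consecutive non-overlapping windows."""
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        # A short final window is scored against its own length.
        results.append((start, 100.0 * gc / len(chunk)))
    return results
```

For example, gc_percent_windows("GGCCAATATT") scores the first window
(GGCCA) at 80.0 and the second (ATATT) at 0.0.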
>>>>>>>
>>>>>>>         Table browser help:
>>>>>>>         http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
>>>>>>>
>>>>>>>         Link to description of utilities (scroll to hgGcPercent):
>>>>>>>         http://hgwdev.cse.ucsc.edu/~larrym/utilities.html
>>>>>>>
>>>>>>>         Download source:
>>>>>>>         http://genome.ucsc.edu/FAQ/FAQdownloads.html#download27
>>>>>>>
>>>>>>>         File formats:
>>>>>>>         http://genome.ucsc.edu/FAQ/FAQformat.html
>>>>>>>
>>>>>>>         Complete GC content file for latest human reference genome:
>>>>>>>         http://hgdownload.cse.ucsc.edu/goldenPath/hg19/gc5Base/
>>>>>>>
>>>>>>> Best wishes with your project,
>>>>>>> Jennifer
>>>>>>>
>>>>>>> ps: It would be best if you would send new questions directly to the
>>>>>>> mailing list at [email protected]. This helps us to get you the
>>>>>>> quickest reply.
>>>>>>>
>>>>>>> ---------------------------------
>>>>>>> Jennifer Jackson
>>>>>>> UCSC Genome Informatics Group
>>>>>>> http://genome.ucsc.edu/
>>>>>>>
>>>>>>> On 5/3/10 10:30 AM, Yongsheng Bai wrote:
>>>>>>>
>>>>>>>> Hi Jennifer,
>>>>>>>>
>>>>>>>> Does UCSC have the functionality to view/calculate GC content
>>>>>>>> given an input sequence?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> YB
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Genome maillist  -  [email protected]
>>>>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
