Hello YB,

I believe that the primary problem may be that you are searching against 
an assembly that is not the same species as the query. For example, your 
first Genbank accession is listed as AB004856. This is not a mouse (or 
human) sequence. It is from a species not included in the UCSC Browser 
(Buchnera aphidicola - Bacterial).

http://www.ncbi.nlm.nih.gov/nuccore/3036932

You may want to check the microbial browser.
http://microbes.ucsc.edu/

I did not examine all of your data to determine the source. Yesterday 
some of the sequences you shared were named with mouse MGC identifiers. 
It seems that you may have a mixed dataset or more than one dataset.
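
One quick way to check for a mixed list is to sort the identifiers by format before querying any tables. The Python sketch below is only an illustration: it assumes the usual accession shapes (RefSeq as two letters, an underscore, then digits; GenBank nucleotide as one letter plus five digits, or two letters plus six or eight digits). The function name and the last two example IDs are made-up placeholders, not values from your data.

```python
import re
from collections import Counter

# Assumed accession shapes; an optional ".version" suffix is allowed.
REFSEQ = re.compile(r"^[A-Z]{2}_\d+(\.\d+)?$")
GENBANK = re.compile(r"^([A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})(\.\d+)?$")

def classify(identifier):
    """Label an identifier as 'refseq', 'genbank', or 'other' by its format."""
    ident = identifier.strip().upper()
    if REFSEQ.match(ident):
        return "refseq"
    if GENBANK.match(ident):
        return "genbank"
    return "other"

# The first two IDs come from this thread; the last two are placeholders.
ids = ["AB004856", "AB240683", "NM_000000", "MGC:0000"]
counts = Counter(classify(i) for i in ids)
for kind, n in sorted(counts.items()):
    print(kind, n)
```

A format check like this says nothing about the species. It only separates RefSeq-style IDs from other GenBank accessions, which is what decides whether the RefSeq tables can match at all.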

If you want to share more about your project we may be able to help more 
with technical questions concerning the browser, but determining what 
species any particular Genbank accession is from is something that you 
will need to do, probably at NCBI.
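
If it helps, here is a minimal Python sketch for reading the source organism out of a record that you have saved from NCBI in GenBank flat-file format. It assumes only that the record contains the standard ORGANISM line. The function name and the abbreviated sample record are illustrative, and are not something provided by UCSC or NCBI.

```python
def organism_from_genbank(record_text):
    """Return the name on the ORGANISM line of a GenBank flat-file record, or None."""
    for line in record_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ORGANISM"):
            # Everything after the keyword is the organism name.
            return stripped[len("ORGANISM"):].strip() or None
    return None

# Abbreviated, hand-written record for illustration only (not downloaded from NCBI).
sample = """LOCUS       AB004856
SOURCE      Buchnera aphidicola
  ORGANISM  Buchnera aphidicola
//
"""

print(organism_from_genbank(sample))
```

Running this over each saved record would give you one organism name per accession, which you could then compare against the assembly you are querying.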

Hopefully this helps to clear up some of the confusion,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 5/12/10 1:14 PM, Yongsheng Bai wrote:
>
> Hi Jennifer,
>
> We already established that these are GenBank IDs. The key question here is
> how to convert GenBank IDs to Entrez IDs. Cross-joining the related tables
> has returned nothing so far, but I find it hard to believe that there is not
> even a single match to my GenBank IDs in your UCSC tables. Any suggestions?
> Anyone else?
>
> BTW, how does the UCSC join function work?
>
> Thanks,
> YB
>
>
> On Wed, 12 May 2010 12:51:27 -0700, Jennifer Jackson<[email protected]>
> wrote:
>> Hello,
>>
>> If the sequences are not in the RefSeq track, then the RefSeq track
>> tables will return no data. It appears that your IDs are not RefSeqs.
>>
>> Thanks,
>> Jennifer
>>
>> ---------------------------------
>> Jennifer Jackson
>> UCSC Genome Informatics Group
>> http://genome.ucsc.edu/
>>
>> On 5/12/10 6:36 AM, Yongsheng Bai wrote:
>>> Hello,
>>>
>>> I am not able to get the Entrez ID from the GenBank ID, no matter which
>>> table I try to combine: mm9.kgXref.refseq (via gbCdnaInfo.acc),
>>> mm9.refGene.name (via gbCdnaInfo.acc), or mm9.refLink.mrnaAcc (via
>>> gbCdnaInfo.acc). The rest of the columns are always "n/a".
>>>
>>> Do you know why? Thanks!!
>>>
>>> ===============
>>>
>>> #mm9.gbCdnaInfo.acc  mm9.gbCdnaInfo.geneName  mm9.geneName.id  mm9.geneName.name  mm9.refLink.geneName  mm9.refLink.locusLinkId
>>> AB004856    1       1       thrB    n/a     n/a
>>> AB005263    2       2       argA    n/a     n/a
>>> AB011407    0       0       n/a     n/a     n/a
>>> AB012144    3       3       sam-pr  n/a     n/a
>>> AB012145    0       0       n/a     n/a     n/a
>>> AB017109    5       5       hacA    n/a     n/a
>>> AB019621    0       0       n/a     n/a     n/a
>>> AB026157    0       0       n/a     n/a     n/a
>>> AB027742    6       6       nifH    n/a     n/a
>>> AB027743    6       6       nifH    n/a     n/a
>>> AB027744    6       6       nifH    n/a     n/a
>>> AB027745    6       6       nifH    n/a     n/a
>>> AB027746    6       6       nifH    n/a     n/a
>>> AB027747    6       6       nifH    n/a     n/a
>>> AB027748    6       6       nifH    n/a     n/a
>>> AB027749    6       6       nifH    n/a     n/a
>>> AB027750    6       6       nifH    n/a     n/a
>>> AB045977    0       0       n/a     n/a     n/a
>>> AB062062    0       0       n/a     n/a     n/a
>>> AB062753    0       0       n/a     n/a     n/a
>>> AB071366    7       7       azr     n/a     n/a
>>> AB071367    7       7       azr     n/a     n/a
>>> AB071368    7       7       azr     n/a     n/a
>>> AB088633    8       8       enolA   n/a     n/a
>>> AB094433    9       9       LipL32  n/a     n/a
>>> AB094434    9       9       LipL32  n/a     n/a
>>> AB094435    9       9       LipL32  n/a     n/a
>>> AB094436    9       9       LipL32  n/a     n/a
>>> AB094437    9       9       LipL32  n/a     n/a
>>> AB099701    10      10      comS    n/a     n/a
>>> AB109116    11      11      SsgB    n/a     n/a
>>> AB166870    12      12      SshEstI n/a     n/a
>>> AB176840    14      14      gyrB    n/a     n/a
>>> AB240674    19      19      lipL41  n/a     n/a
>>> AB240675    19      19      lipL41  n/a     n/a
>>> AB240676    19      19      lipL41  n/a     n/a
>>> AB240677    19      19      lipL41  n/a     n/a
>>> AB240678    19      19      lipL41  n/a     n/a
>>> AB240679    20      20      lipL45  n/a     n/a
>>> AB240680    20      20      lipL45  n/a     n/a
>>> AB240681    20      20      lipL45  n/a     n/a
>>> AB240682    20      20      lipL45  n/a     n/a
>>> AB240683    20      20      lipL45  n/a     n/a
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Jennifer Jackson [mailto:[email protected]]
>>> Sent: Tuesday, May 11, 2010 4:32 PM
>>> To: [email protected]
>>> Cc: 'Antonio Coelho'; [email protected]
>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>
>>> Hello -
>>>
>>> Locus Link IDs have been retired at NCBI. They have been replaced with
>>> Entrez Gene IDs. We retain the older table names/labels as LocusLink in
>>> our database for convenience, but the content is Entrez.
>>>
>>> Thanks,
>>> Jennifer
>>>
>>> ---------------------------------
>>> Jennifer Jackson
>>> UCSC Genome Informatics Group
>>> http://genome.ucsc.edu/
>>>
>>> On 5/11/10 1:08 PM, Yongsheng Bai wrote:
>>>> Does locusLinkId in the refFlat table refer to the Entrez ID?
>>>>
>>>> -----Original Message-----
>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>> Sent: Tuesday, May 11, 2010 4:02 PM
>>>> To: [email protected]
>>>> Cc: 'Antonio Coelho'; [email protected]
>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>
>>>> Hello -
>>>>
>>>> The Table browser can link tables for you if you do not wish to simply
>>>> download and process them yourself.
>>>>
>>>> Start with a base track (like mRNA), use output = "selected fields from
>>>> primary and related tables", name the file, and click on "get output". The
>>>> next form will then allow you to link in the other tables and select
>>>> fields.
>>>>
>>>> A previous answer today has the path for a similar query that you can
>>>> use as a template combined with the table information we have already
>>>> discussed:
>>>> https://lists.soe.ucsc.edu/pipermail/genome/2010-May/022216.html
>>>>
>>>> The Table browser user's guide has instructions for all functions:
>>>> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#SelectedFields
>>>>
>>>> Best wishes with your project,
>>>> Jennifer
>>>>
>>>> ---------------------------------
>>>> Jennifer Jackson
>>>> UCSC Genome Informatics Group
>>>> http://genome.ucsc.edu/
>>>>
>>>> On 5/11/10 12:51 PM, Yongsheng Bai wrote:
>>>>> Hello Jennifer,
>>>>>
>>>>> What I really want is a joined table with Entrez ID, Gene symbol, and
>>>>> GenBank ID...
>>>>>
>>>>> Thanks,
>>>>> YB
>>>>>
>>>>> -----Original Message-----
>>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>>> Sent: Tuesday, May 11, 2010 3:35 PM
>>>>> To: [email protected]
>>>>> Cc: 'Antonio Coelho'; [email protected]
>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>
>>>>> Hi Yongsheng,
>>>>>
>>>>> geneName is a Genbank table, so it is associated with all GenBank tracks.
>>>>> Whenever gbCdnaInfo is the selected table (the "top" defined table),
>>>>> geneName will appear as a linked table below.
>>>>>
>>>>> One way to find it is:
>>>>>
>>>>> mRna track ->
>>>>> describe table schema ->
>>>>> click on gbCdnaInfo in associated tables ->
>>>>> geneName will now appear in the list of linked tables
>>>>>
>>>>> Thanks!
>>>>> Jennifer
>>>>>
>>>>> On 5/11/10 12:26 PM, Yongsheng Bai wrote:
>>>>>> Under which group/track is the table "geneName" located?
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>>>> Sent: Tuesday, May 11, 2010 2:46 PM
>>>>>> To: [email protected]
>>>>>> Cc: 'Antonio Coelho'; [email protected]
>>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>>
>>>>>> Hello Yongsheng,
>>>>>>
>>>>>> The table refGene has two fields of interest:
>>>>>> name = Genbank accession (nuc)
>>>>>> name2 = gene name
>>>>>>
>>>>>> The table gbCdnaInfo has two fields of interest:
>>>>>> acc = Genbank accession (nuc)
>>>>>> gi = Genbank identifier
>>>>>>
>>>>>> The table geneName has two fields of interest:
>>>>>> id = linking number into gbCdnaInfo.geneName
>>>>>> name = gene name as listed in the Genbank data sheet (note: RefSeq
>>>>>> sequences will have this, as will many mRNA sequences. It just depends
>>>>>> on what the data submission included.)
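
The linking described above can be sketched in a few lines of Python. This is only an illustration of a left join on the keys named in this thread (gbCdnaInfo.geneName to geneName.id, and gbCdnaInfo.acc to refLink.mrnaAcc); it is not how the Table browser is implemented. The dictionaries are small stand-ins for the real tables, filled with rows reported earlier in this thread plus one hypothetical RefSeq entry.

```python
def left_join(accessions, gb_cdna_info, gene_name, ref_link):
    """Return (acc, gene name, Entrez ID) rows, using "n/a" where a link finds nothing."""
    rows = []
    for acc in accessions:
        name_id = gb_cdna_info.get(acc)          # gbCdnaInfo.geneName
        symbol = gene_name.get(name_id, "n/a")   # geneName.name via geneName.id
        entrez = ref_link.get(acc, "n/a")        # refLink.locusLinkId via refLink.mrnaAcc
        rows.append((acc, symbol, entrez))
    return rows

# Stand-in tables: acc -> geneName id, and id -> name (values taken from this thread).
gb_cdna_info = {"AB004856": 1, "AB005263": 2, "AB011407": 0}
gene_name = {0: "n/a", 1: "thrB", 2: "argA"}
# refLink holds RefSeq accessions only; this single entry is a hypothetical placeholder.
ref_link = {"NM_000000": "0"}

rows = left_join(["AB004856", "AB005263", "AB011407"], gb_cdna_info, gene_name, ref_link)
for row in rows:
    print("\t".join(row))
```

Because none of the example accessions is a key in the RefSeq-based stand-in, the last column is "n/a" for every row, while the gene names still come through from geneName.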
>>>>>>
>>>>>> To link in more identifiers based on UCSC Genes, which includes RefSeq
>>>>>> as an input, use the table kgXref. Please note that any RefSeq sequences
>>>>>> that have been added since the last UCSC Genes track update (2009-10-08
>>>>>> for hg19) will not be included in the UCSC Genes track's tables
>>>>>> (including kgXref). You will not find mRna or other Genbank accessions in
>>>>>> this table, but you will find data from the other sources used to build
>>>>>> the UCSC Genes track. See the UCSC Genes track description for details
>>>>>> about sources/last update.
>>>>>>
>>>>>> To find/explore the schema yourself, open up the Table browser to the
>>>>>> target genome, select a track/table, then use the "describe table
>>>>>> schema" button. The selected table will be defined, followed by linked
>>>>>> tables (including the keys on which they are linked to the selected
>>>>>> table). Any of these can be clicked to "promote" it to become the
>>>>>> selected table. More or fewer linked tables may then come up. This is an
>>>>>> excellent way to navigate the schema and find out what data is available
>>>>>> in which tables.
>>>>>>
>>>>>> Hopefully this is helpful!
>>>>>> Jennifer
>>>>>>
>>>>>> ---------------------------------
>>>>>> Jennifer Jackson
>>>>>> UCSC Genome Informatics Group
>>>>>> http://genome.ucsc.edu/
>>>>>>
>>>>>> On 5/11/10 7:19 AM, Yongsheng Bai wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thanks! What's the UCSC table name for converting "GenBank ID" to
>>>>>>> "Entrez ID/Gene symbol"?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> YB
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Antonio Coelho [mailto:[email protected]]
>>>>>>> Sent: Thursday, May 06, 2010 3:09 PM
>>>>>>> To: [email protected]
>>>>>>> Cc: [email protected]
>>>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>>>
>>>>>>> Hello Yongsheng,
>>>>>>> There is no limit to the size of the files, but as you have noticed,
>>>>>>> larger files do tend to load very slowly.
>>>>>>> One alternative is to convert your files to the bigBed format. You can
>>>>>>> read about it here:
>>>>>>>
>>>>>>> http://genome.ucsc.edu/FAQ/FAQformat.html#format1.5
>>>>>>> http://genome.ucsc.edu/goldenPath/help/bigBed.html
>>>>>>>
>>>>>>> You could also try breaking up your file into a group of smaller bed
>>>>>>> files.
>>>>>>>
>>>>>>> I hope this answers your question. Please feel welcome to contact us
>>>>>>> again.
>>>>>>>
>>>>>>> Antonio Coelho
>>>>>>> UCSC Genome Bioinformatics Group
>>>>>>>
>>>>>>> Yongsheng Bai wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Is there any file size limit for loading a bed file into UCSC's custom
>>>>>>>> track? I am loading a ~200MB file, and it takes forever...
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> YB
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>>>>>> Sent: Monday, May 03, 2010 2:17 PM
>>>>>>>> To: [email protected]
>>>>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> The track GC Percent may be helpful. It is based on the
>>>>>>>> reference genome.
>>>>>>>>
>>>>>>>>          quote from GC Percent track description page: Description
>>>>>>>>
>>>>>>>>          The GC percent track shows the percentage of G (guanine)
>>>>>>>>          and C (cytosine) bases in 5-base windows. High GC content
>>>>>>>>          is typically associated with gene-rich areas.
>>>>>>>>
>>>>>>>>          You could BLAT your sequence against the reference genome
>>>>>>>>          and view the region or note the coordinates, then export
>>>>>>>>          data from this track using the Table browser.
>>>>>>>>
>>>>>>>>          Or for batch use, if you know the coordinates, upload them
>>>>>>>>          into the region filter in the Table browser and export data
>>>>>>>>          from the GC Percent track or use the file from downloads
>>>>>>>>          (below).
>>>>>>>>
>>>>>>>> To do this in batch on the command line to generate GC content for your
>>>>>>>> own sequences, use the utility hgGcPercent from the kent source tree. It
>>>>>>>> would take input data that you format. Please ask if you need help to
>>>>>>>> convert your files to the required formats.
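
For a small number of sequences, the same kind of number can be computed directly. The Python sketch below is a stand-in for illustration, not the kent utility mentioned above. It reports GC percent over non-overlapping 5-base windows, matching the window size in the track description. The function names are made up here, and it assumes that every character in a window (including N) counts toward the window length and that a trailing partial window is dropped.

```python
def gc_percent(seq):
    """Percentage of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)

def windowed_gc(seq, window=5):
    """GC percent for each full, non-overlapping window of the given size."""
    return [
        gc_percent(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]

# Made-up example sequence: one GC-only window followed by one AT-only window.
print(gc_percent("ATGC"))
print(windowed_gc("GGGCCATATA"))
```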
>>>>>>>>
>>>>>>>>          Table browser help:
>>>>>>>>          http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
>>>>>>>>
>>>>>>>>          Link to description of utilities (scroll to hgGcPercent):
>>>>>>>>          http://hgwdev.cse.ucsc.edu/~larrym/utilities.html
>>>>>>>>
>>>>>>>>          Download source:
>>>>>>>>          http://genome.ucsc.edu/FAQ/FAQdownloads.html#download27
>>>>>>>>
>>>>>>>>          File formats:
>>>>>>>>          http://genome.ucsc.edu/FAQ/FAQformat.html
>>>>>>>>
>>>>>>>>          Complete GC content file for latest human reference genome:
>>>>>>>>          http://hgdownload.cse.ucsc.edu/goldenPath/hg19/gc5Base/
>>>>>>>>
>>>>>>>> Best wishes with your project,
>>>>>>>> Jennifer
>>>>>>>>
>>>>>>>> ps: It would be best if you would send new questions directly to the
>>>>>>>> mailing list at [email protected]. This helps us to get you the
>>>>>>>> quickest reply.
>>>>>>>>
>>>>>>>> ---------------------------------
>>>>>>>> Jennifer Jackson
>>>>>>>> UCSC Genome Informatics Group
>>>>>>>> http://genome.ucsc.edu/
>>>>>>>>
>>>>>>>> On 5/3/10 10:30 AM, Yongsheng Bai wrote:
>>>>>>>>
>>>>>>>>> Hi Jennifer,
>>>>>>>>>
>>>>>>>>> Does UCSC have the functionality to view/calculate GC content
>>>>>>>>> given an input sequence?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> YB
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Genome maillist  -  [email protected]
>>>>>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
