Hello, I am not able to get the Entrez ID from the GenBank ID, no matter which table I try to combine (mm9.kgXref.refseq (via gbCdnaInfo.acc); mm9.refGene.name (via gbCdnaInfo.acc); mm9.refLink.mrnaAcc (via gbCdnaInfo.acc)). The remaining columns are always "n/a".
Do you know why? Thanks!!

===============
#mm9.gbCdnaInfo.acc  mm9.gbCdnaInfo.geneName  mm9.geneName.id  mm9.geneName.name  mm9.refLink.geneName  mm9.refLink.locusLinkId
AB004856             1                        1                thrB               n/a                   n/a
AB005263             2                        2                argA               n/a                   n/a
AB011407             0                        0                n/a                n/a                   n/a
AB012144             3                        3                sam-pr             n/a                   n/a
AB012145             0                        0                n/a                n/a                   n/a
AB017109             5                        5                hacA               n/a                   n/a
AB019621             0                        0                n/a                n/a                   n/a
AB026157             0                        0                n/a                n/a                   n/a
AB027742             6                        6                nifH               n/a                   n/a
AB027743             6                        6                nifH               n/a                   n/a
AB027744             6                        6                nifH               n/a                   n/a
AB027745             6                        6                nifH               n/a                   n/a
AB027746             6                        6                nifH               n/a                   n/a
AB027747             6                        6                nifH               n/a                   n/a
AB027748             6                        6                nifH               n/a                   n/a
AB027749             6                        6                nifH               n/a                   n/a
AB027750             6                        6                nifH               n/a                   n/a
AB045977             0                        0                n/a                n/a                   n/a
AB062062             0                        0                n/a                n/a                   n/a
AB062753             0                        0                n/a                n/a                   n/a
AB071366             7                        7                azr                n/a                   n/a
AB071367             7                        7                azr                n/a                   n/a
AB071368             7                        7                azr                n/a                   n/a
AB088633             8                        8                enolA              n/a                   n/a
AB094433             9                        9                LipL32             n/a                   n/a
AB094434             9                        9                LipL32             n/a                   n/a
AB094435             9                        9                LipL32             n/a                   n/a
AB094436             9                        9                LipL32             n/a                   n/a
AB094437             9                        9                LipL32             n/a                   n/a
AB099701             10                       10               comS               n/a                   n/a
AB109116             11                       11               SsgB               n/a                   n/a
AB166870             12                       12               SshEstI            n/a                   n/a
AB176840             14                       14               gyrB               n/a                   n/a
AB240674             19                       19               lipL41             n/a                   n/a
AB240675             19                       19               lipL41             n/a                   n/a
AB240676             19                       19               lipL41             n/a                   n/a
AB240677             19                       19               lipL41             n/a                   n/a
AB240678             19                       19               lipL41             n/a                   n/a
AB240679             20                       20               lipL45             n/a                   n/a
AB240680             20                       20               lipL45             n/a                   n/a
AB240681             20                       20               lipL45             n/a                   n/a
AB240682             20                       20               lipL45             n/a                   n/a
AB240683             20                       20               lipL45             n/a                   n/a

-----Original Message-----
From: Jennifer Jackson [mailto:[email protected]]
Sent: Tuesday, May 11, 2010 4:32 PM
To: [email protected]
Cc: 'Antonio Coelho'; [email protected]
Subject: Re: [Genome] download hg18 genomic sequences

Hello -

Locus Link IDs have been retired at NCBI. They have been replaced with
Entrez Genes. We retain the older table names/labels as LocusLink in our
database for convenience reasons, but the content is Entrez.

Thanks,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 5/11/10 1:08 PM, Yongsheng Bai wrote:
> Is locusLinkId in refFlat table referred to Entrez ID?
>
> -----Original Message-----
> From: Jennifer Jackson [mailto:[email protected]]
> Sent: Tuesday, May 11, 2010 4:02 PM
> To: [email protected]
> Cc: 'Antonio Coelho'; [email protected]
> Subject: Re: [Genome] download hg18 genomic sequences
>
> Hello -
>
> The Table browser can link files if you do not just wish to download and
> process them yourself.
>
> Start with a base track (like mRNA), use output = "selected fields from
> primary and related tables", name file, and click on "get output". The
> next form will allow you to then link in the other tables and select fields.
>
> A previous answer today has the path for a similar query that you can
> use as a template combined with the table information we have already
> discussed:
> https://lists.soe.ucsc.edu/pipermail/genome/2010-May/022216.html
>
> The Table browser user's guide has instructions for all functions:
> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#SelectedFields
>
> Best wishes with your project,
> Jennifer
>
> ---------------------------------
> Jennifer Jackson
> UCSC Genome Informatics Group
> http://genome.ucsc.edu/
>
> On 5/11/10 12:51 PM, Yongsheng Bai wrote:
>> Hello Jennifer,
>>
>> What I really want is a joined table with Entrez ID, Gene symbol, GenBank
>> ID...
>>
>> Thanks,
>> YB
>>
>> -----Original Message-----
>> From: Jennifer Jackson [mailto:[email protected]]
>> Sent: Tuesday, May 11, 2010 3:35 PM
>> To: [email protected]
>> Cc: 'Antonio Coelho'; [email protected]
>> Subject: Re: [Genome] download hg18 genomic sequences
>>
>> Hi Yongsheng,
>>
>> geneName is a Genbank table, so is associated with all GenBank tracks.
>> Whenever gbCdnaInfo is the selected table (the "top" defined table),
>> geneName will appear as a linked table below.
>>
>> One way to find it is:
>>
>> mRna track ->
>> describe table schema ->
>> click on gbCdnaInfo in associated tables ->
>> geneName will now appear in the list of linked tables
>>
>> Thanks!
>> Jennifer
>>
>> On 5/11/10 12:26 PM, Yongsheng Bai wrote:
>>> Under which group/track that table "geneName" is located?
>>>
>>> -----Original Message-----
>>> From: Jennifer Jackson [mailto:[email protected]]
>>> Sent: Tuesday, May 11, 2010 2:46 PM
>>> To: [email protected]
>>> Cc: 'Antonio Coelho'; [email protected]
>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>
>>> Hello Yongsheng,
>>>
>>> The table refGene has two fields:
>>> name = Genbank accession (nuc)
>>> name2 = gene name
>>>
>>> The table gbCdnaInfo has two fields:
>>> acc = Genbank accession (nuc)
>>> gi = Genbank identifier
>>>
>>> The table geneName has two fields:
>>> id = linking number into gbCdnaInfo.geneName
>>> name = gene name as listed in the Genbank data sheet (note: RefSeq
>>> sequences will have this as well as many mRNA sequences. It just depends
>>> on what the data submission included.)
>>>
>>> To link in more identifiers based on UCSC Genes, which includes RefSeq
>>> as an input, use the table kgXref. Please note that any RefSeq sequences
>>> that have been added since the last UCSC Gene track update (2009-10-08
>>> for hg19) will not be included in the UCSC Gene track's tables (includes
>>> kgXref). You will not find mRna or other Genbank accessions in this
>>> table, but you will find data from the other sources used to build the
>>> UCSC Genes track. See the UCSC Genes track description for details about
>>> sources/last update.
>>>
>>> To find/explore the schema yourself, open up the Table browser to the
>>> target genome, select a track/table, then use the "describe table
>>> schema" button. The table selected will be defined, followed by linked
>>> tables (including what keys they are linked on to the selected table
>>> above). Any of this can be clicked on to "promote" them to become the
>>> select table. Fewer or less linked tables may come up. This is an
>>> excellent way to navigate the schema and find out what data is available
>>> in which tables.
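The linking described above (gbCdnaInfo.geneName pointing at geneName.id, with refLink joined on the accession) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the tables are toy in-memory stand-ins holding just the columns named in this thread, the refLink row (NM_000000, 12345) is a made-up placeholder, and the function names build_toy_db and join_accessions are hypothetical. It is not a query against the real UCSC database.

```python
import sqlite3


def build_toy_db() -> sqlite3.Connection:
    """Create toy stand-ins for gbCdnaInfo, geneName and refLink (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gbCdnaInfo (acc TEXT PRIMARY KEY, geneName INTEGER);
        CREATE TABLE geneName   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE refLink    (mrnaAcc TEXT, locusLinkId INTEGER);

        -- Two rows copied from the output posted in this thread.
        INSERT INTO gbCdnaInfo VALUES ('AB004856', 1), ('AB011407', 0);
        INSERT INTO geneName   VALUES (0, 'n/a'), (1, 'thrB');

        -- Made-up placeholder row, for illustration only.
        INSERT INTO refLink    VALUES ('NM_000000', 12345);
        """
    )
    return conn


def join_accessions(conn: sqlite3.Connection) -> list:
    """Join accession -> gene name, and LEFT JOIN the Entrez (locusLinkId) column."""
    query = """
        SELECT g.acc, n.name, r.locusLinkId
        FROM gbCdnaInfo AS g
        JOIN geneName AS n ON n.id = g.geneName
        LEFT JOIN refLink AS r ON r.mrnaAcc = g.acc
        ORDER BY g.acc
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in join_accessions(build_toy_db()):
        print(row)
```

With a LEFT JOIN, accessions that have no matching refLink.mrnaAcc row still appear, with an empty locusLinkId. Whether that is what produces the "n/a" columns reported at the top of this thread is not established by this sketch.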
>>>
>>> Hopefully this is helpful!
>>> Jennifer
>>>
>>> ---------------------------------
>>> Jennifer Jackson
>>> UCSC Genome Informatics Group
>>> http://genome.ucsc.edu/
>>>
>>> On 5/11/10 7:19 AM, Yongsheng Bai wrote:
>>>> Hello,
>>>>
>>>> Thanks! What's UCSC table name for converting "GenBank ID" to "Entrez
>>>> ID/Gene symbol"?
>>>>
>>>> Thanks,
>>>> YB
>>>>
>>>> -----Original Message-----
>>>> From: Antonio Coelho [mailto:[email protected]]
>>>> Sent: Thursday, May 06, 2010 3:09 PM
>>>> To: [email protected]
>>>> Cc: [email protected]
>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>
>>>> Hello Yongsheng,
>>>> There is no limit to the size of the files, but as you have noticed,
>>>> larger files do tend to load very slowly.
>>>> One alternative is to convert your files to the bigBed format. You can
>>>> read about it here:
>>>>
>>>> http://genome.ucsc.edu/FAQ/FAQformat.html#format1.5
>>>> http://genome.ucsc.edu/goldenPath/help/bigBed.html
>>>>
>>>> You could also try breaking up your file into a group of smaller bed
>>>> files.
>>>>
>>>> I hope this answers your question. Please feel welcome to contact us
>>>> again.
>>>>
>>>> Antonio Coelho
>>>> UCSC Genome Bioinformatics Group
>>>>
>>>> Yongsheng Bai wrote:
>>>>> Hi,
>>>>>
>>>>> Is there any file size limit for loading a bed file into UCSC's custom
>>>>> track? I am loading a ~200MB file, it takes forever...
>>>>>
>>>>> Thanks.
>>>>> YB
>>>>>
>>>>> -----Original Message-----
>>>>> From: Jennifer Jackson [mailto:[email protected]]
>>>>> Sent: Monday, May 03, 2010 2:17 PM
>>>>> To: [email protected]
>>>>> Subject: Re: [Genome] download hg18 genomic sequences
>>>>>
>>>>> Hello,
>>>>>
>>>>> The track GC Percent may be helpful. It is based on the
>>>>> reference genome.
>>>>>
>>>>> quote from GC Percent track description page: Description
>>>>>
>>>>> The GC percent track shows the percentage of G (guanine)
>>>>> and C (cytosine) bases in 5-base windows.
>>>>> High GC content is typically associated with gene-rich areas.
>>>>>
>>>>> You could BLAT your sequence against the reference genome
>>>>> and view the region or note the coordinates, then export
>>>>> data from this track using the Table browser.
>>>>>
>>>>> Or for batch use, if you know the coordinates, upload them
>>>>> into the region filter in the Table browser and export data
>>>>> from the GC Percent track or use the file from downloads (below).
>>>>>
>>>>> To do this in batch on the command line to generate GC content for your own
>>>>> sequences, use the utility percentage from the kent source tree. It
>>>>> would take input data that you format. Please ask if you need help to
>>>>> convert your files to the required formats.
>>>>>
>>>>> Table browser help:
>>>>> http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
>>>>>
>>>>> Link to description of utilities (scroll to hgGcPercent):
>>>>> http://hgwdev.cse.ucsc.edu/~larrym/utilities.html
>>>>>
>>>>> Download source:
>>>>> http://genome.ucsc.edu/FAQ/FAQdownloads.html#download27
>>>>>
>>>>> File formats:
>>>>> http://genome.ucsc.edu/FAQ/FAQformat.html
>>>>>
>>>>> Complete GC content file for latest human reference genome:
>>>>> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/gc5Base/
>>>>>
>>>>> Best wishes with your project,
>>>>> Jennifer
>>>>>
>>>>> ps: It would be best if you would send new questions directly to the
>>>>> mailing list at [email protected]. This helps us to get you the
>>>>> quickest reply.
>>>>>
>>>>> ---------------------------------
>>>>> Jennifer Jackson
>>>>> UCSC Genome Informatics Group
>>>>> http://genome.ucsc.edu/
>>>>>
>>>>> On 5/3/10 10:30 AM, Yongsheng Bai wrote:
>>>>>
>>>>>> Hi Jennifer,
>>>>>>
>>>>>> Does the UCSC allow or have the functionality to view/calculate GC contents
>>>>>> given an input sequence?
>>>>>>
>>>>>> Thanks,
>>>>>> YB
>>>>>
>>>>> _______________________________________________
>>>>> Genome maillist - [email protected]
>>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
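
For the original question in this thread (GC content of your own input sequence), the "percentage of G and C bases in 5-base windows" idea from the GC Percent track description can be sketched in a few lines of Python. This is a minimal illustration, not the UCSC hgGcPercent utility: the function name gc_percent_windows is hypothetical, and the choices to use consecutive non-overlapping windows and to drop a trailing partial window are assumptions.

```python
def gc_percent_windows(seq: str, window: int = 5) -> list:
    """Return the GC percentage of each consecutive, non-overlapping window.

    Assumptions (not taken from UCSC code): windows do not overlap, and a
    trailing window shorter than `window` bases is dropped.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    seq = seq.upper()
    percents = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc_count = sum(1 for base in chunk if base in "GC")
        percents.append(100.0 * gc_count / window)
    return percents


if __name__ == "__main__":
    # Two 5-base windows: "GGCCA" and "ATATA".
    print(gc_percent_windows("GGCCAATATA"))
```

For genome-scale input the precomputed gc5Base files or the Table browser export mentioned earlier in the thread are likely the more practical route; the sketch is only meant to make the window calculation concrete.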
