Re: [Genome] Databases When Performing Local Installation

Hiram Clawson Fri, 09 Apr 2010 11:30:00 -0700

Good Morning Christophe:

The databases for genome assemblies can be seen from our public
MySQL server with the command:


$ mysql -N -A -hgenome-mysql.cse.ucsc.edu -ugenomep -ppassword \
         -e "select name from dbDb where active=1;" hgcentral | sort
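
As a rough sketch of how that list might be used on a mirror, the helper
below compares the active assembly names against the databases already
present locally and prints the ones still missing. The function name
missing_dbs and the file names in the usage comment are hypothetical, not
part of the UCSC scripts, and each file is assumed to hold one database
name per line.

```shell
# missing_dbs ACTIVE_FILE LOCAL_FILE
# Print every name in ACTIVE_FILE that does not appear in LOCAL_FILE.
# Both files are assumed to contain one database name per line.
missing_dbs() {
    active_sorted=$(mktemp) || return 1
    local_sorted=$(mktemp) || return 1
    sort -u "$1" > "$active_sorted"
    sort -u "$2" > "$local_sorted"
    # comm -23 keeps only the lines unique to the first (sorted) input
    comm -23 "$active_sorted" "$local_sorted"
    rm -f "$active_sorted" "$local_sorted"
}

# Example usage (file names are placeholders):
#   mysql -N -A -hgenome-mysql.cse.ucsc.edu -ugenomep -ppassword \
#       -e "select name from dbDb where active=1;" hgcentral > active.txt
#   mysql -N -e "show databases;" > local.txt
#   missing_dbs active.txt local.txt
```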

See also the scripts in the source tree, which can help with your
mirror jobs:
http://hgwdev.cse.ucsc.edu/~kent/src/unzipped/product/scripts/

Each of those databases holds the primary data for one genome assembly,
together with all the annotations built on that assembly.

Some annotations require extra tables and databases that are common
across a number of genome assemblies.  Hence they live outside any
particular genome database.  Some of these external databases have
specific usage rights; see:
http://hgdownload.cse.ucsc.edu/goldenPath/swissProt/database/README.txt

The extra databases are:
hgcentral - primary database the browser uses to find everything else
        also contains dynamic user/session "cart" data
visiGene - virtual microscope for mice sections
sp090821, etc ... - "Swiss-Prot" aka UniProt database
        obtained from files at ftp.expasy.org/databases/uniprot/
        used in UCSC genes track on various databases
uniprot - the newest version of the Swiss-Prot databases, can simply
        be a symlink to the newest sp* database directory
        used in UCSC genes track on various databases
go - The Gene Ontology database, obtained from:
        http://www.godatabase.org/dev/database/
        Used in the UCSC genes track
proteins090821, etc. - a combination of the UniProt data mentioned above
        and data from HGNC http://www.genenames.org/
        Used in the UCSC genes track and proteome browser
proteome - should merely be a symlink to the most recent
        proteins090821 database.
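
A minimal sketch of maintaining those two symlinks, assuming the databases
live as directories under a MySQL data directory (the usual layout for
MyISAM tables) and that the YYMMDD suffixes all fall in the same century,
so a plain name sort finds the newest. The function name link_newest and
the /var/lib/mysql path in the usage comment are assumptions, not
something shipped with the browser.

```shell
# link_newest PREFIX LINK DATADIR
# Point DATADIR/LINK at the newest PREFIX+YYMMDD directory in DATADIR.
# Assumes all dates are in the same century, so name order is date order.
link_newest() {
    prefix=$1
    link=$2
    datadir=$3
    newest=$(ls "$datadir" | grep -E "^${prefix}[0-9]{6}\$" | sort | tail -n 1)
    if [ -z "$newest" ]; then
        echo "no ${prefix}* database found in $datadir" >&2
        return 1
    fi
    # -f replaces an existing link, -n avoids descending into the old target
    ln -sfn "$newest" "$datadir/$link"
    echo "$link -> $newest"
}

# Example usage (the data directory path is an assumption):
#   link_newest sp       uniprot  /var/lib/mysql
#   link_newest proteins proteome /var/lib/mysql
```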

Yes, the numbers in sp090821 and proteins090821 are dates: 090821 means
2009-08-21.  The newest versions of these databases are used in newer
annotation tracks.  It is possible some of the oldest ones are still used
in older genome databases.  To see the correspondence:
$ mysql -N -A -hgenome-mysql.cse.ucsc.edu -ugenomep -ppassword \
         -e "select * from gdbPdb;" hgcentral
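
To decide which dated databases a mirror still needs, something along
these lines may help. It is only a sketch: it assumes the gdbPdb rows come
out as two tab-separated columns with the genome database first and the
protein database second, and that all dates fall in 2000-2099. The
function names decode_date and needed_protein_dbs are made up for this
example.

```shell
# decode_date NAME
# Turn the YYMMDD suffix of a name such as sp090821 into 20YY-MM-DD.
# Assumes the year is in 2000-2099.
decode_date() {
    printf '%s\n' "$1" |
        sed -E 's/^[A-Za-z]+([0-9]{2})([0-9]{2})([0-9]{2})$/20\1-\2-\3/'
}

# needed_protein_dbs
# Read "genomeDb<TAB>proteinDb" rows on stdin (assumed column order)
# and print each distinct protein database name once.
needed_protein_dbs() {
    awk -F '\t' 'NF >= 2 { print $2 }' | sort -u
}

# Example usage:
#   decode_date sp090821
#   mysql -N -A -hgenome-mysql.cse.ucsc.edu -ugenomep -ppassword \
#       -e "select * from gdbPdb;" hgcentral | needed_protein_dbs
```

Any proteinsYYMMDD database that does not show up in that output is a
candidate for dropping from the mirror, subject to checking that no track
you use still refers to it.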

--Hiram

[email protected] wrote:
> Hello UCSC's Team,
> 
>  
> 
> We are performing a local installation of the Genome Browser. 
> 
> During this installation, the downloading of numerous data and databases
> occurs. Most of these databases represent the builds of genomes such as
> mm8, hg18, rn4, tetNig1, panTro2, etc.
> 
>  
> 
> Among this list of databases (115 as of this date), there are some
> databases which do not represent a genome build but rather information
> about molecules or other types of information, such as:
> 
> VisiGene, uniprot, proteome, go080130, go, hgfixed, hgcentral, mysql,
> proteins040315, proteins050415, proteins051015, proteins060115, etc.,
> ... , sp040315, sp050415, etc., ..., sp090821.
> 
>  
> 
> We have guessed the meaning of some of them based on their names, but we
> still have no clue about the meaning and purpose of some of them;
> 
>  
> 
> QUESTIONS:
> 
> 1.       Could you let us know what the databases named sp090821,
> spXXXXXX (where each X is a digit), and proteinsXXXXXX represent?
> 
> 2.       Are the numbers in their names related to the version date?
> 
> 3.       If so, do we have to keep all the versions to get a fully
> functional Genome Browser, or can we just keep the latest version?
> 
> 4.       Do you have a list which gives details about these 115
> databases we have retrieved?
> 
>  
> 
>  
> 
> Thank you in advance for your reply,
> 
>  
> 
> Best Regards,
> 
>  
> Christophe LEGENDRE, PhD
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
