Hello, I am interested in exploring ENSEMBL / Biomart datasets that are made available on Amazon EC2.
I'm wondering what is available and how to use it. If I search for "ENSEMBL" in "public datasets", I see three results:

http://aws.amazon.com/search?searchQuery=ensembl&searchPath=datasets&x=0&y=0

  Ensembl Annotated Human Genome Data (FASTA Release 65)
  Ensembl Annotated Human Genome Data (MySQL Release 65)
  Ensembl - FASTA Database Files

However, none of the snapshot IDs associated with these three datasets show up in the list of datasets available when I try to create a new volume in the EC2 web console. Instead, I see the following datasets:

  Ensembl BioMart (Linux)
  Main Ensembl (Linux)
  Ensembl-53 (Linux)
  Ensembl - FASTA Database Files (Linux)
  Ensembl-54 (Linux)
  Ensembl-54b (Linux)
  Ensembl-55-FASTA-DB (Linux)
  Ensembl-55 (Linux)
  Ensembl-56 (Linux)
  Ensembl-56-FASTA-DB (Linux)
  Ensembl 57 for MySQL (Linux)
  Ensembl 57 for FASTA (Linux)
  Ensembl 59 FASTA dump
  Ensembl 59 MySQL flat file dumps
  Ensembl 60 MySQL flat file dumps
  Ensembl 60 fasta dumps
  Ensembl Release 61 FASTA dumps
  Ensembl Release 61 MySQL...at file dumps
  Ensembl 62 Fasta Data
  Ensembl 62 MySQL Data
  Ensembl Release 63 MySQL...t file dumps
  Ensembl Release 63 FASTA Dumps
  Ensembl 64 MySQL flat file dumps
  Ensembl 64 FASTA dumps
  Ensembl 64 FASTA dumps
  Ensembl 64 MySQL flat file dumps
  Ensembl Release 65 FASTA dumps
  Ensembl Release 65 MySQL dumps
  ensembl release 65 binary MySQL

I mounted snap-c48360ad, referred to above as "Ensembl BioMart". Inspecting the contents of the disk, I see what look like MySQL database files (.MYI, .MYD and .frm files). I would like to create the corresponding databases, but I can't find any documentation about doing so.
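To make the disk inspection concrete, here is a minimal Python sketch of how the mounted volume could be inventoried. It assumes the usual MyISAM layout (one directory per database, with a .frm, .MYD and .MYI file per table); the function name find_myisam_tables and the mount point /mnt/biomart are placeholders of mine, not anything taken from the snapshot or its documentation.

```python
import os
from collections import defaultdict

# The three files a complete MyISAM table is normally made of.
MYISAM_EXTS = {".frm", ".MYD", ".MYI"}


def find_myisam_tables(root):
    """Map each directory under root (relative path) to the sorted list of
    table names for which all three MyISAM files are present."""
    # found[directory][table_name] -> set of extensions seen for that table
    found = defaultdict(lambda: defaultdict(set))
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext in MYISAM_EXTS:
                found[dirpath][stem].add(ext)

    result = {}
    for dirpath, tables in found.items():
        complete = sorted(t for t, exts in tables.items() if exts == MYISAM_EXTS)
        if complete:
            # Report directories relative to the mount point (hypothetical path).
            result[os.path.relpath(dirpath, root)] = complete
    return result
```

Calling find_myisam_tables("/mnt/biomart") would return a dictionary of candidate database directories and their complete tables. It only reads file names; it does not start or modify any MySQL server.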
The Amazon page for one of the datasets (http://aws.amazon.com/datasets/2315?_encoding=UTF8&queryArg=searchQuery&x=0&fromSearch=1&y=0&searchPath=datasets&searchQuery=ensembl) tells me to look here for documentation:

http://www.ensembl.org/info/docs/webcode/install/ensembl-data.html

But, as far as I can tell, the instructions on that page assume that you have gzipped .sql and .txt files.

Where can I find documentation for creating MySQL databases from the .MYI and .MYD files? Also, is there any further or more accurate documentation about which ENSEMBL datasets are available on EC2/AWS and how to use them?

Thanks!
Dan

_______________________________________________
Users mailing list
https://lists.biomart.org/mailman/listinfo/users
