On Mon, Feb 27, 2012 at 3:54 AM, Rhoda Kinsella <[email protected]> wrote:
> Hi Dan
> I have consulted our resident Cloud expert. Please find his comments below
> in between your text. I hope this helps, but please don't hesitate to get in
> contact should you require more information.

Thank you. Please see my further questions below.


> Regards
> Rhoda
>
> On 23 Feb 2012, at 19:52, Dan Tenenbaum wrote:
>
> Hello,
>
> I am interested in exploring ENSEMBL / Biomart datasets that are made
> available on Amazon EC2.
>
> I'm wondering what is available and how to use it.
>
> If I search for "ENSEMBL" in "public datasets", I see three results:
> http://aws.amazon.com/search?searchQuery=ensembl&searchPath=datasets&x=0&y=0
>
> Ensembl Annotated Human Genome Data (FASTA Release 65)
> Ensembl Annotated Human Genome Data (MySQL Release 65)
> Ensembl - FASTA Database Files
>
> However, none of the snapshot IDs associated with these three data
> sets show up in the list of datasets available when I try and create a
> new volume in the EC2 web console. Instead, I see the following
> datasets:
>
>
> What Region are you trying to use these in? Amazon public datasets are
> available only in the us-east region.
>
>

I want to use them in the us-east region.



>
> Ensembl BioMart (Linux)
> Main Ensembl (Linux)
> Ensembl-53 (Linux)
> Ensembl - FASTA Database Files (Linux)
> Ensembl-54 (Linux)
> Ensembl-54b (Linux)
> Ensembl-55-FASTA-DB (Linux)
> Ensembl-55 (Linux)
> Ensembl-56 (Linux)
> Ensembl-56-FASTA-DB (Linux)
> Ensembl 57 for MySQL (Linux)
> Ensembl 57 for FASTA (Linux)
> Ensembl 59 FASTA dump
> Ensembl 59 MySQL flat file dumps
> Ensembl 60 MySQL flat file dumps
> Ensembl 60 fasta dumps
> Ensembl Release 61 FASTA dumps
> Ensembl Release 61 MySQL...at file dumps
> Ensembl 62 Fasta Data
> Ensembl 62 MySQL Data
> Ensembl Release 63 MySQL...t file dumps
> Ensembl Release 63 FASTA Dumps
> Ensembl 64 MySQL flat file dumps
> Ensembl 64 FASTA dumps
> Ensembl 64 FASTA dumps
> Ensembl 64 MySQL flat file dumps
> Ensembl Release 65 FASTA dumps
> Ensembl Release 65 MySQL dumps
> ensembl release 65 binary MySQL
>
> I mounted snap-c48360ad, referred to above as "Ensembl BioMart".
> Inspecting the contents of the disk, I see what look like MySQL
> database files (.MYI, .MYD, and .frm files).
>
> I would like to create the corresponding databases but I can't find
> any documentation about doing so.
> The page on amazon for one of the datasets
> (http://aws.amazon.com/datasets/2315?_encoding=UTF8&queryArg=searchQuery&x=0&fromSearch=1&y=0&searchPath=datasets&searchQuery=ensembl)
> tells me to look here for documentation:
> http://www.ensembl.org/info/docs/webcode/install/ensembl-data.html
>
>
>
>
> These are quite old versions of our data in the Public Datasets program, and
> some represent early experiments of ours with Amazon public datasets. The
> sets you have listed, whilst originating from us at some point in the past,
> are currently owned and controlled by the public dataset program.
>
> We do not currently submit separate biomart dumps to the public dataset
> program (although we tried it out once in 2008). What we do currently submit
> are the FASTA dumps and MySQL text dumps for all of our databases. After
> some early tweaking, we settled on this format after discussion, agreement
> and arrangement with the public dataset program. In the future there may
> be an opportunity for further changes, but not at the moment.
>
> In this dataset you will find the MySQL text dumps for biomart (amongst all
> of our databases)
>
> http://aws.amazon.com/datasets/2315?_encoding=UTF8&jiveRedirect=1
>
> And the instructions on how to turn this into a MySQL database are here:
>
> http://www.ensembl.org/info/docs/webcode/install/ensembl-data.html
>
> You will obviously need to filter for the mart databases first.
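
For reference, the filter-then-load steps described above might be sketched
as follows. This is only a rough sketch under several assumptions that are
not stated in this thread: that each database sits in its own directory of
gzipped dumps named after the database, that the schema file is called
<database>.sql.gz, that mart databases can be recognised by "_mart_" in
their name, and that the MySQL user is "root". The function names and the
/data mount point are hypothetical. It only builds the shell commands
(gunzip, mysql, mysqlimport) so they can be reviewed before anything is run.

```python
import re
import shlex
from typing import Iterable, List


def select_mart_databases(names: Iterable[str]) -> List[str]:
    """Keep only names that look like mart databases.

    Assumption: mart databases contain '_mart_' in their name.
    """
    return sorted(n for n in names if "_mart_" in n)


def load_commands(dump_dir: str, db_name: str, user: str = "root") -> List[str]:
    """Build (but do not run) shell commands to load one text dump.

    Assumption: dump_dir holds <db_name>.sql.gz plus one .txt.gz per table.
    """
    # Refuse names that would need quoting inside SQL or the shell.
    if not re.fullmatch(r"[A-Za-z0-9_]+", db_name):
        raise ValueError(f"unexpected database name: {db_name!r}")
    d = shlex.quote(dump_dir)
    u = shlex.quote(user)
    return [
        # 1. Unpack the schema and table data files in place.
        f"gunzip {d}/*.gz",
        # 2. Create an empty database to load into.
        f'mysql -u {u} -e "CREATE DATABASE {db_name}"',
        # 3. Create the tables from the schema file.
        f"mysql -u {u} {db_name} < {d}/{db_name}.sql",
        # 4. Bulk-load each table from its tab-delimited .txt file.
        f"mysqlimport -u {u} --fields-escaped-by=\\\\ {db_name} -L {d}/*.txt",
    ]


if __name__ == "__main__":
    # Hypothetical directory names, for illustration only.
    available = ["homo_sapiens_core_65_37", "ensembl_mart_65", "snp_mart_65"]
    for db in select_mart_databases(available):
        for cmd in load_commands(f"/data/{db}", db):
            print(cmd)
```

Printing the commands rather than executing them keeps the sketch safe to
try on a machine that has no MySQL server or mounted volume yet.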
>
>
>
> But that page has instructions which assume that you have gzipped .sql
> and .txt files (as far as I can tell).
> Where can I find documentation for creating MySQL databases from the
> MYI and MYD files?
>
> Also, is there any further/more accurate documentation about which
> ENSEMBL datasets are available on EC2/AWS and how to use them?
>
>
>
> Yes.
>
> We have pre-baked AMIs that will boot into MySQL database servers here:-
> (however there are no biomart databases in this set.)
>
> http://www.ensembl.org/info/data/amazon_aws.html
> http://www.ensembl.info/blog/2011/07/12/run-a-private-ensembl-mysql-in-the-cloud/
>

Thank you. Are there plans to make the biomart data available in this
way? That's what I am really looking for.

Thanks,
Dan

>
>
>
> Thanks!
> Dan
> _______________________________________________
> Users mailing list
> [email protected]
> https://lists.biomart.org/mailman/listinfo/users
>
>
> Rhoda Kinsella Ph.D.
> Ensembl Production Project Leader,
> European Bioinformatics Institute (EMBL-EBI),
> Wellcome Trust Genome Campus,
> Hinxton
> Cambridge CB10 1SD,
> UK.
>