Hi Ryan,
What you're suggesting is still somewhat experimental, but we're
continuing to work on making it more robust and better integrated into the
Galaxy ecosystem. There are really three general approaches:
1. Run Galaxy via CloudMan 100% on AWS. This is the most robust option and
is basically ready for use, but over time, and especially if you decide to
modify various pieces of the puzzle, it will require a deeper
understanding of how CloudMan works. It's also the most expensive option.
2. Run the Galaxy UI locally and create a CloudMan cluster on demand with the
Pulsar <http://pulsar.readthedocs.org/en/latest/index.html> service enabled
to accept jobs from the local Galaxy. This paper describes that approach:
http://onlinelibrary.wiley.com/doi/10.1002/cpe.3536/abstract
3. Run your Galaxy UI locally and create Ansible roles/tasks to dynamically
acquire cloud instances and assemble them into a cluster. You will
probably want to use Pulsar for job management again. This option gives you
the most control but also means you'll need to build the system yourself.
Nate may also have more comments about this.

I've also put some comments about your specific questions inline.

Hope this helps clarify the situation at least. We're actively working on
this scenario so things should get easier in the future. Let us know if you
have more questions and what you decide.

Cheers,
Enis



On Wed, Aug 19, 2015 at 8:53 AM, Ryan G <ngsbioinformat...@gmail.com> wrote:

> Hi all - We are running a local instance of Galaxy on our internal
> infrastructure.  It seems to be going well.
>
> We've gotten to the point where we are ready to migrate our NGS data to
> Amazon for storage in S3.  We are also looking at how Galaxy can be used in
> Amazon.  Specifically, we are interested in understanding:
>
> 1)  Should we run an instance of Galaxy in Amazon, or continue to run it
> locally (to minimize costs) but have it run analyses in Amazon?
>
The three options at the top of this message cover this scenario.

>
> 2)  Regardless of how we run it, data will be stored in S3.  How will
> Galaxy interact with S3 for its Data Libraries?
>
Galaxy implements an Object Store interface that can use S3 as a
back-end data store. It has been around for a number of years and has been
demonstrated to work, but it hasn't been used in production, so I'd
suggest testing it first. The configuration options for the object
store are in Galaxy's config file:
https://github.com/galaxyproject/galaxy/blob/dev/config/galaxy.ini.sample#L289
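
As a rough illustration of what a test setup might look like, here is a
minimal Python sketch that generates an S3 object store definition. The
element and attribute names (object_store type="s3", auth, bucket, cache)
are modeled on Galaxy's config/object_store_conf.xml.sample and should be
treated as assumptions to verify against the copy shipped with your Galaxy
release. The bucket name, keys, cache path, and cache size are hypothetical
placeholders.

```python
import xml.etree.ElementTree as ET


def build_s3_object_store(bucket, access_key, secret_key, cache_path, cache_size):
    """Build an <object_store type="s3"> element.

    The layout is an assumption based on Galaxy's sample object store
    config; check it against the sample file in your Galaxy version.
    """
    root = ET.Element("object_store", {"type": "s3"})
    # Credentials used to reach the bucket (placeholders in this sketch).
    ET.SubElement(root, "auth", {"access_key": access_key, "secret_key": secret_key})
    # The S3 bucket that would hold the datasets.
    ET.SubElement(root, "bucket", {"name": bucket})
    # Local cache directory for datasets pulled down from S3.
    ET.SubElement(root, "cache", {"path": cache_path, "size": str(cache_size)})
    return root


# Hypothetical values; replace with your own before testing.
store = build_s3_object_store(
    bucket="my-galaxy-datasets",
    access_key="YOUR_ACCESS_KEY",
    secret_key="YOUR_SECRET_KEY",
    cache_path="database/object_store_cache",
    cache_size=100,
)
xml_text = ET.tostring(store, encoding="unicode")
print(xml_text)
```

The printed XML would go into the file that the object store option in
galaxy.ini points at. Try it on a scratch Galaxy instance first rather than
on your production one.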

>
> 3)  Is it even possible to separate the Galaxy web interface from the HPC
> cluster?
>
Yes; you need either a shared file system between the two resources or
Pulsar.

>
> 4)  We understand Galaxy in Amazon uses CloudMan.  Can we run this in our
> VPC with our own AMI?
>
Yes; you can build your own version of the system, with your own tools and
whatever else you would like to configure. Docs on how to do this are
available here: https://wiki.galaxyproject.org/CloudMan/Building


> If anyone can provide insights into how they are using Galaxy in Amazon, I
> am very interested to hear your thoughts.
>
> Ryan
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
>   https://lists.galaxyproject.org/
>
> To search Galaxy mailing lists use the unified search at:
>   http://galaxyproject.org/search/mailinglists/
>