Hi Alistair,

One major goal of the Galaxy cloud offering is that it works out of the box
with a very large selection of tools and indices installed and ready to go,
with no configuration necessary.  My workshop heuristic is two students per
node using regular m1.xlarge nodes (though input from others on this is
absolutely welcome!).  For costs, see http://aws.amazon.com/ec2/#pricing
for detailed pricing information, but for the most part what you need to
know is:

You'll need a head node up for the entire lifespan of your cluster.  You
can terminate this and start it back up (and avoid paying for the node use
in the interim), but any time you want to interact with this cluster it
will need to be alive.  We generally use a high-memory quadruple extra
large instance at ~$1.65/hr.  I'd start with this, and start with *more*
space on the galaxy volume than you think you'll need -- space is not
expensive, especially if you're not keeping it for more than a few days to
cover a workshop.

Worker nodes will be the bulk of your costs, and fortunately these are
*trivial* to add and remove and can be configured just prior to and after
the workshop.  If you use standard m1.xlarge instances, they're ~$0.48/hour.
So, for 10 of them to cover your 20 students, you're looking at about
$5/hour for the workshop -- not too bad.
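
To make that arithmetic easy to redo for a different class size, here is a rough sketch in Python.  The rates are the approximate figures quoted above (check the pricing page for current numbers), and the function names are just made up for illustration:

```python
import math

# Approximate figures from this email; check AWS pricing for current rates.
STUDENTS_PER_NODE = 2   # workshop heuristic: two students per worker node
WORKER_RATE = 0.48      # approx. USD/hour for a standard m1.xlarge worker
HEAD_RATE = 1.65        # approx. USD/hour for the high-memory head node


def workers_needed(students, per_node=STUDENTS_PER_NODE):
    """Number of worker nodes, rounding up so nobody is left without a slot."""
    return math.ceil(students / per_node)


def hourly_cost(students, include_head=False):
    """Estimated USD/hour for the workers, optionally adding the head node."""
    cost = workers_needed(students) * WORKER_RATE
    if include_head:
        cost += HEAD_RATE
    return round(cost, 2)


if __name__ == "__main__":
    print(workers_needed(20))                  # worker nodes for 20 students
    print(hourly_cost(20))                     # workers only
    print(hourly_cost(20, include_head=True))  # workers plus head node
```

The "about $5/hour" figure above is the workers-only number; with the head node included it comes out a bit higher.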

Amazon will bill you for every hour you have an instance, regardless of
whether you're using it or not.  Do make sure to terminate instances when
you're done with them.  You may want to look into
Cloudman's auto-scaling to handle this for you -- this allows you to say
something like "Keep 2 worker instances up all the time, but under heavy
use scale up to at most 10".  I wouldn't recommend auto-scaling for a
workshop, however; I'd have the instances ready ahead of time since you're
fairly certain you'll need them.  Regarding setting up ahead of time, what
I'd recommend is to set up just the master instance (which you will pay
for) ahead of time, test it out, and only add the workers when it's time to
kick off the workshop.
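
As a rough illustration of why staging the workers matters, the sketch below compares two schedules.  The durations (head node up for 48 hours, an 8-hour workshop) are made-up example values, the rates are the approximate ones from this email, and rounding partial hours up to whole hours is my assumption about how the billing works:

```python
import math

HEAD_RATE = 1.65    # approx. USD/hour, head node (figure from this email)
WORKER_RATE = 0.48  # approx. USD/hour, m1.xlarge worker (figure from this email)


def billed_hours(hours_up):
    """Assumption: any partial hour an instance is up is billed as a full hour."""
    return math.ceil(hours_up)


def cluster_cost(head_hours, worker_hours, n_workers):
    """Estimated total USD for one head node plus n_workers identical workers."""
    head = billed_hours(head_hours) * HEAD_RATE
    workers = n_workers * billed_hours(worker_hours) * WORKER_RATE
    return round(head + workers, 2)


if __name__ == "__main__":
    # Hypothetical schedule: head node up 48 h for setup and testing,
    # workers added only for an 8-hour workshop.
    staged = cluster_cost(head_hours=48, worker_hours=8, n_workers=10)
    # Same head node, but workers left running the whole 48 h.
    all_up = cluster_cost(head_hours=48, worker_hours=48, n_workers=10)
    print(staged, all_up)
```

With these example numbers, leaving the workers up the whole time costs well over twice as much as adding them just for the workshop.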

One more performance tip is that for workshops, once you've added extra
worker nodes, you'll want to go into the Cloudman admin panel and disable
job running on the master node (just a single button click) for maximum
performance of the Galaxy application.

We're in the middle of an update that will be released early next week
(already accessible using the cloudman-dev bucket), which fixes several
known issues with the current tool offering, so if you're able, I'd wait
until that happens before starting a new workshop cluster.
Additionally, we have a supported 'workshop-ready' offering
that has preloaded data for our Galaxy101 and RNA-seq exercises, among
other things.  This will be updated with our forthcoming release, but see
http://wiki.galaxyproject.org/Teach/WorkshopAMI for more details.

Good luck, and let me know if there's anything else I can do to help out!

Dannon


On Thu, Aug 22, 2013 at 9:45 PM, Alistair Chilcott <
alistair.chilc...@utas.edu.au> wrote:

>  Hi,
>
> Firstly this is my first post to the list so be gentle :D
>
> We have been looking at the Galaxy in the Cloud option for our researchers
>
> We have been doing a fair bit of reading on the various source but haven’t
> found any solid answers to the following questions:
>
> -When it is started (AWS account setup and launched from the “new cloud
> cluster link”) how much config is required to get it running all the tools
> such as megablast, Bowtie, Tophat etc?
>
> (We have also been setting up a local install of Galaxy but are struggling
> with the setup of these tools that don’t come bundled with the base Galaxy
> install)
>
> -What size AWS cluster would be required to support a class of 20 or so
> students running a range of relatively short tasks with Megablast, Bowtie,
> Tophat, Fastq Groomer, SAM tools?
>
> -how is the AWS charge calculated does it run while the cluster is
> available or just for the actual compute time used? Ie Could setup it up
> and have it ready but while it isn’t processing data it’s not costing
> anything?
>
> Regards,
>
> Alistair
>
> *Alistair Chilcott*
>
> Systems Administrator, (Domain)
>
> Information Technology Services
>
> Email: alistair.chilc...@utas.edu.au | P: +61 3 6226 7743
>
> University of Tasmania, Locked Bag 23, Hobart  Tas.  7000
>
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
>   http://lists.bx.psu.edu/
>
> To search Galaxy mailing lists use the unified search at:
>   http://galaxyproject.org/search/mailinglists/
>
