Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Enis,

Thanks, your instructions below worked fine and I have reduced my 700 GB to 1 GB. No doubt a pile of genome tools don't work now, but I don't need them. I see this is described in the wiki as well, but it was more concise below, thanks. cp -r was fine for copying the remnants of /mnt/galaxyIndices.

Cheers,
Greg E

> 2. The vast majority of the storage costs are from the Genome databases in the 700 GB /mnt/galaxyIndices, which I don't need. Can this be reduced to the bare essentials?

You can do this manually:

1. Start a new Galaxy cluster (i.e., one you can easily delete later).
2. ssh into the master instance and delete whatever genomes you don't need/want (these are all located under /mnt/galaxyIndices).
3. Create a new EBS volume of a size that will fit whatever is left on the original volume, attach it, and mount it.
4. Copy the data from the original volume to the new one while keeping the directory structure the same (rsync is probably the best tool for this).
5. Unmount and detach the new volume; create a snapshot from it.
6. For the cluster you want to keep around (while it is terminated), edit persistent_data.yaml in its bucket on S3 and replace the existing snapshot ID for galaxyIndices with the snapshot ID you got in the previous step.
7. Start that cluster and you should have a file system from the new snapshot mounted.
8. Terminate and delete the cluster you created in step 1.

If you don't want to do this the first time around on your custom cluster, you can first try it with another temporary cluster, make sure it all works as expected, and then move on to the real cluster.

Best,
Enis

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Awesome! Also glad to hear your rate of success can be quantified as a 700% improvement. Amazing! :)

Best,
Enis

On Wed, Mar 28, 2012 at 2:57 PM, Greg Edwards gedwar...@gmail.com wrote:

> Enis, Thanks, your instructions below worked fine and I have reduced my 700 GB to 1 GB. No doubt a pile of genome tools don't work now, but I don't need them. [snip]
Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Greg,

Regarding the performance of different instance types, I came across this and thought you might find it useful: http://cloudharmony.com/benchmarks

Enis

On Mon, Mar 19, 2012 at 7:49 PM, Greg Edwards gedwar...@gmail.com wrote:

> Enis, Thanks. Will try that re the storage. Greg E [snip]
Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Hi Enis, Greg,

I've taken material from this email and previous conversations with Enis and put it in the wiki: http://wiki.g2.bx.psu.edu/Admin/Cloud/CapacityPlanning

Please feel free to update/correct/enhance.

Dave C.

On Mon, Mar 19, 2012 at 2:58 PM, Enis Afgan eaf...@emory.edu wrote:

> Greg, Regarding the performance of different instance types, I came across this and thought you might find it useful: http://cloudharmony.com/benchmarks [snip]
Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Just one extra thought on this: if you leave your instance up all the time, it may be worth looking into running a reserved micro instance as the front end (cheap, or free, with your intro tier) with SGE job submission disabled, and then enabling autoscaling (max 1) of m1.large/xlarge worker instances.

-Dannon

On Mar 19, 2012, at 7:20 PM, Dave Clements wrote:

> Hi Enis, Greg, I've taken material from this email and previous conversations with Enis and put it in the wiki: http://wiki.g2.bx.psu.edu/Admin/Cloud/CapacityPlanning [snip]
[galaxy-dev] Reducing costs in Cloud Galaxy
Hi,

I've got an implementation of some proteomics tools going well in Galaxy on AWS EC2 under Cloudman. Thanks for the help along the way.

I need to drive the costs down a bit. I'm using an m1.large AMI and it's costing about $180-$200/month. This is about 55% storage and 45% instance costs. That's peanuts in some senses, but for now we need to get it down so that it comes out of petty cash for the department while the case is proven for its use. I have a few questions and would appreciate any insights.

1. AWS has just released m1.medium and m1.small instance types, which are 1/2 and 1/4 the cost of m1.large. http://aws.amazon.com/ec2/instance-types/ http://aws.amazon.com/ec2/pricing/ I tried m1.small and m1.medium with the latest Cloudman AMI, galaxy-cloudman-2011-03-22 (ami-da58aab3). All seemed to install OK, but the tools took up to 30 minutes to start execution on m1.medium, and never started on m1.small. m1.medium only added about 15% to run times compared with m1.large; can't say for m1.small. t1.micro does run (and for free in my Free Tier first year) but blows execution times out by a factor of about 3, which is too much. Has anyone tried these new instance types (m1.small/medium)?

2. The vast majority of the storage costs are from the Genome databases in the 700 GB /mnt/galaxyIndices, which I don't need. Can this be reduced to the bare essentials?

Using m1.small/medium and getting rid of the 700 GB would bring my costs down to say $50/month, which is OK.

Thanks!
Greg E

--
Greg Edwards, Port Jackson Bioinformatics
gedwar...@gmail.com
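[Editor's note: the storage-vs-instance split above can be sanity-checked with back-of-the-envelope arithmetic. The per-hour and per-GB rates below are illustrative placeholders, not actual AWS prices (which change and vary by region); integer cents keep this in portable POSIX shell.]

```shell
# Rough monthly cost estimate: instance hours plus EBS storage.
# Both rates are illustrative assumptions, not real AWS pricing.
instance_cents_per_hour=25   # placeholder m1.large-class rate ($0.25/hr)
ebs_cents_per_gb_month=10    # placeholder EBS rate ($0.10/GB-month)
hours=720                    # roughly one month of always-on uptime
gb=700                       # size of /mnt/galaxyIndices

instance_cost=$(( instance_cents_per_hour * hours / 100 ))
storage_cost=$(( ebs_cents_per_gb_month * gb / 100 ))
echo "instance: \$${instance_cost}/month  storage: \$${storage_cost}/month"

# Trimming the indices volume to ~20 GB makes storage nearly free:
echo "trimmed storage: \$$(( ebs_cents_per_gb_month * 20 / 100 ))/month"
```

With these placeholder rates the always-on instance and the 700 GB volume are the same order of magnitude, which matches the roughly 55/45 split reported above; shrinking the volume attacks the larger fixed cost, and a smaller instance type attacks the rest.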
Re: [galaxy-dev] Reducing costs in Cloud Galaxy
Hi Greg,

On Mon, Mar 19, 2012 at 11:01 AM, Greg Edwards gedwar...@gmail.com wrote:

> [snip] Has anyone tried these new instance types (m1.small/medium)?

I have no real experience with these instance types yet either, so maybe someone else can chime in on this?

> 2. The vast majority of the storage costs are from the Genome databases in the 700 GB /mnt/galaxyIndices, which I don't need. Can this be reduced to the bare essentials?

You can do this manually:

1. Start a new Galaxy cluster (i.e., one you can easily delete later).
2. ssh into the master instance and delete whatever genomes you don't need/want (these are all located under /mnt/galaxyIndices).
3. Create a new EBS volume of a size that will fit whatever is left on the original volume, attach it, and mount it.
4. Copy the data from the original volume to the new one while keeping the directory structure the same (rsync is probably the best tool for this).
5. Unmount and detach the new volume; create a snapshot from it.
6. For the cluster you want to keep around (while it is terminated), edit persistent_data.yaml in its bucket on S3 and replace the existing snapshot ID for galaxyIndices with the snapshot ID you got in the previous step.
7. Start that cluster and you should have a file system from the new snapshot mounted.
8. Terminate and delete the cluster you created in step 1.

If you don't want to do this the first time around on your custom cluster, you can first try it with another temporary cluster, make sure it all works as expected, and then move on to the real cluster.

Best,
Enis
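[Editor's note: the persistent_data.yaml edit described in step 6 of the instructions can be sketched as below. The bucket name, both snapshot IDs, and the exact YAML key (`snap_id`) are illustrative assumptions; check your own cluster's bucket and the file's actual contents before editing. The S3 transfer commands are commented out (s3cmd is one option; any S3 client works), and a minimal stand-in file is created locally so the edit itself can be shown.]

```shell
# Sketch of step 6: swap the galaxyIndices snapshot ID in persistent_data.yaml.
bucket=cm-0123456789abcdef   # your cluster's CloudMan bucket (hypothetical)
old_snap=snap-aaaaaaaa       # snapshot ID currently in the file (hypothetical)
new_snap=snap-bbbbbbbb       # snapshot created in step 5 (hypothetical)

# Fetch the real file from the cluster's bucket:
#   s3cmd get s3://$bucket/persistent_data.yaml

# Stand-in for the real file, so the edit below is demonstrable; the
# actual file has more entries and the key name may differ:
printf 'galaxyIndices:\n  snap_id: %s\n' "$old_snap" > persistent_data.yaml

# Replace the old snapshot ID in place (keeps a .bak copy):
sed -i.bak "s/$old_snap/$new_snap/" persistent_data.yaml

# Upload the edited file back before restarting the cluster:
#   s3cmd put persistent_data.yaml s3://$bucket/persistent_data.yaml
```

Doing the substitution by ID rather than by key name avoids touching any other file systems listed in the same file, as long as the old snapshot ID appears only once.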