Hi Jeremy,

Also, if you do remember what kind of Amazon node you used,
particularly for the cluster's master node (e.g. an 'xlarge' 4-core
15GB or perhaps one of the 'high-memory' nodes?), that would be a
reassuring sanity check for me!
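
As a sanity check on the sizing discussed in the thread below, here is a minimal Python sketch of the arithmetic. It assumes the scheduler gives each node one slot per core and packs jobs onto a node until the slots run out; whether the cloud SGE setup actually behaves this way is exactly what I'm asking, so treat that as an assumption. The function names are just illustrative.

```python
import math


def cores_needed(concurrent_jobs: int, threads_per_job: int) -> int:
    """Total cores required so every job can run at the same time."""
    return concurrent_jobs * threads_per_job


def nodes_needed(total_cores: int, cores_per_node: int) -> int:
    """Number of nodes required to supply total_cores (rounded up)."""
    return math.ceil(total_cores / cores_per_node)


def jobs_per_node(cores_per_node: int, threads_per_job: int) -> int:
    """Jobs that fit on one node, ASSUMING one scheduler slot per core."""
    return cores_per_node // threads_per_job


# Figures from the thread: 50 simultaneous BWA jobs at 4 threads each,
# on a cluster of 4-core nodes.
total = cores_needed(concurrent_jobs=50, threads_per_job=4)
print("cores needed:", total)
print("4-core nodes needed:", nodes_needed(total, cores_per_node=4))

# Under the slot-per-core assumption, single-threaded jobs
# (e.g. samtools) would share a node rather than waste cores.
print("single-threaded jobs per 4-core node:", jobs_per_node(4, 1))
print("4-thread jobs per 4-core node:", jobs_per_node(4, 4))
```

If the scheduler instead runs exactly one job per node, jobs_per_node would always be 1 and a single-threaded job would leave the other cores idle.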


On Mon, Nov 21, 2011 at 10:37 AM, Clare Sloggett <s...@unimelb.edu.au> wrote:
> Hi Jeremy, Enis,
> That makes sense. I know I can configure how many threads BWA uses in
> its wrapper, with bwa -t. But is there somewhere that I need to tell
> Galaxy the corresponding information, i.e. that this command-line task
> will make use of up to 4 cores?
> Or, does this imply that there is always exactly one job per node? So
> if I have (for instance) a cluster made of 4-core nodes, and a
> single-threaded task (e.g. samtools), are the other 3 cores just going
> to waste or will the scheduler allocate multiple single-threaded jobs
> to one node?
> I've cc'd galaxy-dev instead of galaxy-user as I think the
> conversation has gone that way!
> Thanks again,
> Clare
> On Fri, Nov 18, 2011 at 2:36 PM, Jeremy Goecks <jeremy.goe...@emory.edu> 
> wrote:
>>> On Fri, Nov 18, 2011 at 12:56 AM, Jeremy Goecks <jeremy.goe...@emory.edu> 
>>> wrote:
>>>> Scalability issues are more likely to arise on the back end than the front 
>>>> end, so you'll want to ensure that you have enough compute nodes. BWA uses 
>>>> four nodes by default--Enis, does the cloud config change this 
>>>> parameter?--so you'll want 4x50 or 200 total nodes if you want everyone to 
>>>> be able to run a BWA job simultaneously.
>>> Actually, one other question - this paragraph makes me realise that I
>>> don't really understand how Galaxy is distributing jobs. I had thought
>>> that each job would only use one node, and in some cases take
>>> advantage of multiple cores within that node. I'm taking a "node" to
>>> be a set of cores with their own shared memory, so in this case a VM
>>> instance, is this right? If some types of jobs can be distributed over
>>> multiple nodes, can I configure, in Galaxy, how many nodes they should
>>> use?
>> You're right -- my word choices were poor. Replace 'node' with 'core' in my 
>> paragraph to get an accurate suggestion for resources.
>> Galaxy uses a job scheduler--SGE on the cloud--to distribute jobs to 
>> different cluster nodes. Jobs that require multiple cores typically run on a 
>> single node. Enis can chime in on whether CloudMan supports job submission 
>> over multiple nodes; this would require setup of an appropriate parallel 
>> environment and a tool that can make use of this environment.
>> Good luck,
>> J.
> --
> E: s...@unimelb.edu.au
> P: 03 903 53357
> M: 0414 854 759

E: s...@unimelb.edu.au
P: 03 903 53357
M: 0414 854 759
