Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Andreas Kuntzagk

Hi,

the four processes I saw were all called lastz and ran in parallel and 
consumed 100% of a core each.
My guess is that the lastz_wrapper.py is responsible for this.
Looking at it I see some code regarding queuing and in the very beginning 
this line:

WORKERS = 4

and further on the class BaseQueue which starts threads. BTW, there seems to be no way to adjust 
this number other than editing the source file - bad.
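One way such a hard-coded value could be made adjustable is sketched below. This is not the actual lastz_wrapper.py code: the environment variable name LASTZ_WORKERS, the function names and the job function passed in are assumptions made for illustration. Only the default of 4 comes from the wrapper.

```python
import os
import queue
import threading

DEFAULT_WORKERS = 4  # the value hard-coded in the wrapper discussed above


def get_worker_count(environ=None):
    """Return the worker count, preferring a (hypothetical) LASTZ_WORKERS variable."""
    environ = os.environ if environ is None else environ
    try:
        count = int(environ.get("LASTZ_WORKERS", DEFAULT_WORKERS))
    except ValueError:
        # Unparsable setting: fall back to the built-in default.
        return DEFAULT_WORKERS
    return max(1, count)


def run_jobs(jobs, func, workers):
    """Run func(job) for every job on `workers` threads; results keep job order."""
    todo = queue.Queue()
    for index, job in enumerate(jobs):
        todo.put((index, job))
    results = [None] * len(jobs)

    def worker():
        # Each thread pulls jobs until the queue is drained.
        while True:
            try:
                index, job = todo.get_nowait()
            except queue.Empty:
                return
            results[index] = func(job)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With something like this, an administrator could lower or raise the worker count from the job environment instead of editing the source file.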


And this gets me wondering if there are other such surprises hidden in galaxy.

regards, Andreas

On 26.11.2012 15:55, Bob Harris wrote:

Howdy, Andreas,

The four processes started for a galaxy lastz job must involve post-processing 
the lastz output through some other shell tool.  Lastz by itself doesn't 
support multiple threads or processes.

Bob H


On Nov 26, 2012, at 3:58 AM, Andreas Kuntzagk wrote:


Hi,

I'm wondering how galaxy supports tools that are multithreaded or multi-process.
When working with lastz I noticed that it starts 4 parallel processes.
Is that always so? Can this be adjusted? What other tools are also 
multi-process?

regards, Andreas

--
Andreas Kuntzagk

SystemAdministrator

Berlin Institute for Medical Systems Biology at the
Max-Delbrueck-Center for Molecular Medicine
Robert-Roessle-Str. 10, 13125 Berlin, Germany

http://www.mdc-berlin.de/en/bimsb/BIMSB_groups/Dieterich
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

http://lists.bx.psu.edu/






Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Bossers, Alex
Andreas,
I am not sure if you can call these surprises.
Some of Peter's tools (which I highly appreciate) have been parallelised to get 
the job done more quickly. I mentioned the ncbi blast+ wrappers earlier, but 
there the tool itself handles the multithreading.
Other tools I am aware of that use a python script/wrapper to chunk up the initial 
query and rejoin the results later are signalp, TMHMM and such. Usually this also 
involves some parsing of the output into data that galaxy can subsequently handle.
In the latter examples it's done using python scripts, but for some of our 
custom tools we did it in perl, some using bash parallel, or using R.
I don't have a way of finding this out without going through the 
wrappers themselves...
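The chunk-and-rejoin pattern described above can be sketched roughly as follows. This is a minimal illustration and not code from the signalp or TMHMM wrappers: the function names are made up, and the `analyse` callable stands in for whatever single-threaded tool a real wrapper would launch on each chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def split_records(records, chunks):
    """Split a list of records into at most `chunks` contiguous parts."""
    if not records:
        return []
    size = -(-len(records) // chunks)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def run_chunked(records, analyse, threads):
    """Run analyse(part) on each part in parallel and rejoin the outputs in order."""
    parts = split_records(records, threads)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves the order of the parts, so the rejoin is a simple concat.
        outputs = list(pool.map(analyse, parts))
    return [item for output in outputs for item in output]
```

In a real wrapper each part would typically be written to a temporary file and handed to the external program, and the per-chunk outputs would be parsed before being merged.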

Alex




Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Peter Cock
On Tue, Nov 27, 2012 at 8:58 AM, Andreas Kuntzagk
andreas.kuntz...@mdc-berlin.de wrote:
 Hi,

 the four processes I saw were all called lastz and ran in parallel and
 consumed 100% of a core each.
 My guess is that the lastz_wrapper.py is responsible for this.
 Looking at it I see some code regarding queuing and in the very beginning
 this line:

 WORKERS = 4

 and further on the class BaseQueue which starts threads. BTW, there seems
 to be no way to adjust this number other than editing the source file - bad.

 And this gets me wondering if there are other such surprises hidden in
 galaxy.

 regards, Andreas

As the author of several tool wrappers, I've been asking for a Galaxy
wide mechanism for Galaxy to tell the tool how many threads it can
use, for example via an environment variable. The value could then
be set with a general default, per runner default, or even per tool
using the existing runner configuration under [galaxy:tool_runners]
in universe_wsgi.ini

See: http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-March/009037.html
and: http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-June/010153.html
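A wrapper that honours such an environment variable might look roughly like this. It is only a sketch of the proposal: no such variable exists according to this thread, so the name GALAXY_THREADS and both function names are hypothetical. The `-num_threads` switch is the BLAST+ command line option.

```python
import os


def thread_count(default=1, environ=None):
    """Read the (hypothetical) GALAXY_THREADS variable, falling back to `default`."""
    environ = os.environ if environ is None else environ
    value = environ.get("GALAXY_THREADS", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default


def blastp_command(query, database, environ=None):
    """Build a blastp command line using the thread count Galaxy would provide."""
    threads = thread_count(default=1, environ=environ)
    return ["blastp", "-query", query, "-db", database,
            "-num_threads", str(threads)]
```

The tool XML would then no longer need a hard-coded thread number, and the default of one thread keeps the wrapper safe on installs that do not set the variable.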

In your example, and others like the BWA and BLAST+ wrappers where
the tool XML is hard coded to 8 threads, you would probably want to
use a custom runner in universe_wsgi.ini setting the cluster submission
to request that many slots/CPUs.

For our local cluster, I modify the BLAST+ wrapper XML to use 4 threads,
and have something like this in my universe_wsgi.ini file:

[galaxy:tool_runners]
ncbi_blastp_wrapper  = drmaa://-V -pe smp 4/
ncbi_blastn_wrapper  = drmaa://-V -pe smp 4/
ncbi_blastx_wrapper  = drmaa://-V -pe smp 4/
ncbi_tblastn_wrapper = drmaa://-V -pe smp 4/
ncbi_tblastx_wrapper = drmaa://-V -pe smp 4/
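Since the thread count in the tool XML and the slot count in the runner URL have to be kept in sync by hand, a small check script can help. The sketch below reads a [galaxy:tool_runners] section like the one above and extracts the number of slots requested per tool. The function name, the parallel environment name passed as `pe_name`, and the fallback of one slot are assumptions for illustration.

```python
import configparser
import re

EXAMPLE_INI = """
[galaxy:tool_runners]
ncbi_blastp_wrapper  = drmaa://-V -pe smp 4/
ncbi_blastn_wrapper  = drmaa://-V -pe smp 4/
"""


def slots_per_tool(ini_text, pe_name="smp"):
    """Map each tool id to the slot count requested via '-pe <pe_name> N'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    pattern = re.compile(r"-pe\s+" + re.escape(pe_name) + r"\s+(\d+)")
    slots = {}
    for tool_id, url in parser["galaxy:tool_runners"].items():
        match = pattern.search(url)
        # Tools without a parallel environment request are assumed to use one slot.
        slots[tool_id] = int(match.group(1)) if match else 1
    return slots
```

The resulting mapping could then be compared against the thread numbers in the corresponding tool XML files.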

Peter


Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Andreas Kuntzagk

Dear Peter,


As the author of several tool wrappers, I've been asking for a Galaxy
wide mechanism for Galaxy to tell the tool how many threads it can
use, for example via an environment variable. The value could then
be set with a general default, per runner default, or even per tool
using the existing runner configuration under [galaxy:tool_runners]
in universe_wsgi.ini


This would be a possibility. Another would be to communicate the number of 
threads the other way.
So the tool tells the runner how many threads. And the runner knows how to 
handle this.
I can imagine universe_wsgi.ini having such lines:

ncbi_blastp_wrapper  = drmaa://-V -pe smp $GALAXY_THREADS

and then $GALAXY_THREADS is changed for the value given by the wrapper.
Thinking again, this is probably not going to work because the runner comes first and the wrapper 
after. My idea was that the wrapper could decide what resources to request. So I could use lower 
memory settings for small mapping jobs ...




In your example, and others like the BWA and BLAST+ wrappers where
the tool XML is hard coded to 8 threads, you would probably want to
use a custom runner in universe_wsgi.ini setting the cluster submission
to request that many slots/CPUs.


A list of all these wrappers on the Wiki would be nice.

regards, Andreas



Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Andreas Kuntzagk

Hi Alex,


I am not sure if you can call these surprises.


Well, at least it surprised me :-)
Didn't want to sound too negative.


Some of Peter's tools (which I highly appreciate) have been parallelised to get 
the job done more quickly. I mentioned the ncbi blast+ wrappers earlier, but 
there the tool itself handles the multithreading.
Other tools I am aware of that use a python script/wrapper to chunk up the initial 
query and rejoin the results later are signalp, TMHMM and such. Usually this also 
involves some parsing of the output into data that galaxy can subsequently handle.
In the latter examples it's done using python scripts, but for some of our 
custom tools we did it in perl, some using bash parallel, or using R.
I don't have a way of finding this out without going through the 
wrappers themselves...



While I can read Python and bash fine, it becomes more complicated with perl and R. I don't know if I 
could easily spot from the code what the number of threads is.

So maybe somebody could set up a list of these tools?

regards, Andreas.





Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Peter Cock
On Tue, Nov 27, 2012 at 10:38 AM, Andreas Kuntzagk
andreas.kuntz...@mdc-berlin.de wrote:
 Dear Peter,

 As the author of several tool wrappers, I've been asking for a Galaxy
 wide mechanism for Galaxy to tell the tool how many threads it can
 use, for example via an environment variable. The value could then
 be set with a general default, per runner default, or even per tool
 using the existing runner configuration under [galaxy:tool_runners]
 in universe_wsgi.ini

 This would be a possibility. Another would be to communicate the number
 of threads the other way. So the tool tells the runner how many threads.
 And the runner knows how to handle this.

 I can imagine universe_wsgi.ini having such lines:

 ncbi_blastp_wrapper  = drmaa://-V -pe smp $GALAXY_THREADS

 and then $GALAXY_THREADS is changed for the value given by the wrapper.
 Thinking again, this is probably not going to work because the runner comes
 first and the wrapper after. My idea was that the wrapper could decide what
 resources to request. So I could use lower memory settings for small mapping
 jobs ...

There is some work on dynamic job allocation you might be interested
in - have you seen this thread?
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-November/011759.html

 In your example, and others like the BWA and BLAST+ wrappers where
 the tool XML is hard coded to 8 threads, you would probably want to
 use a custom runner in universe_wsgi.ini setting the cluster submission
 to request that many slots/CPUs.


 A list of all these wrappers on the Wiki would be nice.

With many tools on the Tool Shed, I'm not sure how easy that would
be to co-ordinate. Doing it for the core tools would be more realistic.

Peter


Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Peter Cock
On Tue, Nov 27, 2012 at 10:44 AM, Andreas Kuntzagk
andreas.kuntz...@mdc-berlin.de wrote:
 Hi Alex,

 I am not sure if you can call these surprises.

 Well, at least it surprised me :-)
 Didn't want to sound too negative.

 Some tools (which I highly appreciate) of Peter have been parallelised
 to get the job done more quickly. I earlier mentioned the ncbi blast+
 wrappers but there the tool by itself handles the multithreading. ...

 While I can read Python and bash fine, it becomes more complicated with perl
 and R. I don't know if I could easily spot from the code what the number of
 threads is.
 So maybe somebody could set up a list of these tools?

 regards, Andreas.

The short answer is *every* tool used in Galaxy may be multi-threaded.

Sometimes this is done in the binary (e.g. BLAST), others do it in the
wrapper when the underlying tool is single threaded (e.g. my SignalP
and TMHMM wrappers which Alex mentioned).

Sometimes the default is clearly defined in the XML (as a command
line switch, e.g. BLAST), sometimes it is defined in a wrapper script,
and sometimes it is defined in the tool binary itself (e.g. use all
available CPUs).

Peter


Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Andreas Kuntzagk

Hi Peter,

thanks for your replies.

On 27.11.2012 11:44, Peter Cock wrote:



There is some work on dynamic job allocation you might be interested
in - have you seen this thread?
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-November/011759.html


This looks very promising. What I did not get from these messages is whether it is already in 
galaxy-dist, and where to put the dynamic job runner.





In your example, and others like the BWA and BLAST+ wrappers where
the tool XML is hard coded to 8 threads, you would probably want to
use a custom runner in universe_wsgi.ini setting the cluster submission
to request that many slots/CPUs.



A list of all these wrappers on the Wiki would be nice.


With many tools on the Tool Shed, I'm not sure how easy that would
be to co-ordinate. Doing it for the core tools would be more realistic.


I see the problem here. Especially since more and more tools are going into 
Tool Sheds.
I was just looking for some way to reduce my workload ;-)



Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Oleksandr Moskalenko

The Right Way (TM), I believe, would be to have a universal resource request 
selector that could be plugged into any wrapper simply by including an 
appropriate element like, say, <resources proc=x pmem=y walltime=z />. Those 
variables could be exported, so the corresponding DRMAA call could be made in 
the dynamic runner and the data could be used in the wrapper to run the 
underlying tool as needed.
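To make the idea concrete, here is a rough sketch of how such an element could be turned into both a scheduler request and a value exported to the wrapper. Everything here is hypothetical: Galaxy has no such element according to this thread, the attribute values must be quoted to be valid XML, the PBS/Torque-style "-l" string is just one possible native specification, and GALAXY_THREADS is an illustrative variable name.

```python
import xml.etree.ElementTree as ET


def resource_request(resources_xml):
    """Turn a <resources .../> element into (native_spec, exported_environment)."""
    element = ET.fromstring(resources_xml)
    proc = int(element.get("proc", "1"))
    # PBS/Torque-style resource list; other schedulers would need another format.
    parts = ["nodes=1:ppn=%d" % proc]
    for key in ("pmem", "walltime"):
        value = element.get(key)
        if value:
            parts.append("%s=%s" % (key, value))
    native_spec = "-l " + ",".join(parts)
    # The same number is exported so the wrapper can pass it on to the tool.
    environment = {"GALAXY_THREADS": str(proc)}
    return native_spec, environment
```

The point of the design is that one declaration feeds both sides: the runner uses the native specification for the DRMAA call, and the wrapper reads the exported value when it starts the underlying tool.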

Regards,

Alex


Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Peter Cock
On Tue, Nov 27, 2012 at 2:20 PM, Oleksandr Moskalenko o...@hpc.ufl.edu wrote:

 The Right Way (TM), I believe, would be to have a universal resource request
 selector that could be plugged into any wrapper simply by including an
 appropriate element like, say, <resources proc=x pmem=y walltime=z />.
 Those variables could be exported, so the corresponding DRMAA call
 could be made in the dynamic runner and the data could be used in the
 wrapper to run the underlying tool as needed.

I am not convinced about that. For a simple non-dynamic setup I
think the resources like the number of threads should be dictated
by the local configuration (e.g. universe_wsgi.ini) and customised
to the local compute resources, rather than in the tool wrappers
which must be sufficiently general to run on any Galaxy install.

In general we need dynamic negotiation between the tool (e.g. this
tool can use as many threads as you like, suggest 8) and the local
configuration (we want to limit this tool to just 4 threads to make
maximum use of our cluster), and ideally the input data (e.g. this
job will need lots of RAM and must go on the big memory queue).
Right now the dynamic runner  which John Chilton and others
are working on seems capable of this (although quite complex).
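The kind of negotiation described above can be illustrated with a very small sketch. It is not the dynamic runner's actual interface: the function name, the queue names and the 10 GiB threshold are assumptions chosen only to show the three inputs (tool suggestion, local limit, input data) being combined.

```python
def negotiate(tool_suggested, local_limit, input_bytes,
              big_memory_threshold=10 * 1024 ** 3):
    """Combine the tool's suggestion, the site limit and the input size."""
    # The site configuration caps whatever the tool would like to use.
    threads = max(1, min(tool_suggested, local_limit))
    # Large inputs are routed to a (hypothetical) big memory queue.
    queue_name = "bigmem" if input_bytes >= big_memory_threshold else "default"
    return {"threads": threads, "queue": queue_name}
```

For example, a tool suggesting 8 threads on a site limited to 4 would be given 4, and a job with a very large input would additionally be sent to the big memory queue.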

Regards,

Peter


Re: [galaxy-dev] multithreaded tools

2012-11-27 Thread Oleksandr Moskalenko

On Nov 27, 2012, at 9:37 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 
 I am not convinced about that. For a simple non-dynamic setup I
 think the resources like the number of threads should be dictated
 by the local configuration (e.g. universe_wsgi.ini) and customised
 to the local compute resources, rather than in the tool wrappers
 which must be sufficiently general to run on any Galaxy install.
 
 In general we need dynamic negotiation between the tool (e.g. this
 tool can use as many threads as you like, suggest 8) and the local
 configuration (we want to limit this tool to just 4 threads to make
 maximum use of our cluster), and ideally the input data (e.g. this
 job will need lots of RAM and must go on the big memory queue).
 Right now the dynamic runner  which John Chilton and others
 are working on seems capable of this (although quite complex).
 
 Regards,
 
 Peter

The dynamic wrapper is capable of building a DRMAA call based on the external 
data and is what I am using for our local production instance. I cannot praise 
it highly enough. John Chilton has made a wonderful addition to Galaxy. 
However, not being able to give users some manual control over the resource 
requests places the burden of figuring them out on the administrator, and the 
dataset-based heuristics are often much worse than the knowledge of the person 
running the analysis. In addition, different tools use different options for 
setting thread numbers and cannot communicate realistic or even reasonable 
limits, as those are based on data from actually running the tool under 
different conditions and finding out how well it scales. Dynamic negotiation is 
infeasible at this time, I think. The simple non-dynamic setup does not really 
work for any real-world multi-user instance anymore.

Regards,

Alex


Re: [galaxy-dev] multithreaded tools

2012-11-26 Thread Bossers, Alex
Andreas,

Yes, this is possible. You can also have a look at the ncbi blast+ tools written 
by Peter; the same is true there.
Usually the tool wrappers (xml) have a preconfigured option for how many threads 
can be used. You can adjust this directly in the xml or, as we did, make 
it adjustable through a parameter directly in galaxy: a low number of cores 
by default, but advanced users (bound to their login email) can select higher 
thread numbers.
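The per-user limit described above could be enforced with logic along these lines. This is a sketch only, not the code used in our wrappers: the function name, the default limits of 2 and 8, and the way the advanced users are listed are all assumptions.

```python
def allowed_threads(requested, user_email, advanced_users,
                    default_max=2, advanced_max=8):
    """Cap the requested thread count depending on who is running the job."""
    is_advanced = user_email.lower() in advanced_users
    limit = advanced_max if is_advanced else default_max
    return max(1, min(requested, limit))
```

A wrapper would call this with the value selected in the tool form and the login email of the user, and pass the result on to the tool's thread option.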

I have no experience with cluster/grid tools.

Hope this helps,
Alex




Re: [galaxy-dev] multithreaded tools

2012-11-26 Thread Bob Harris

Howdy, Andreas,

The four processes started for a galaxy lastz job must involve 
post-processing the lastz output through some other shell tool.  Lastz by 
itself doesn't support multiple threads or processes.


Bob H


On Nov 26, 2012, at 3:58 AM, Andreas Kuntzagk wrote:


Hi,

I'm wondering how galaxy supports tools that are multithreaded or 
multi-process.

When working with lastz I noticed that it starts 4 parallel processes.
Is that always so? Can this be adjusted? What other tools are also 
multi-process?


regards, Andreas
