Re: [galaxy-dev] Galaxy+Slurm (with elastic cluster) error: "Job output not returned from cluster"

2016-11-02 Thread Enis Afgan
Hi Marco,
You could do something similar to what we're doing with CloudMan: add a
placeholder node to Slurm. This will make Slurm accept a job but not run it
until a node is added that is actually capable of executing it. The
placeholder node is just a node definition in slurm.conf with State=FUTURE.
Check out the slurm.conf we're using:
https://github.com/galaxyproject/cloudman/blob/master/cm/conftemplates/slurm.conf.default#L37
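
As a rough sketch of what such a definition might look like (the node and
partition names below are made up for illustration and are not copied from
the CloudMan template; the placeholder's resources would need to match
whatever your real workers provide):

```
# Hypothetical slurm.conf fragment; names are illustrative assumptions.
# Placeholder node: defined for future use, so it need not exist when the
# Slurm daemons start.
NodeName=placeholder State=FUTURE

# Include the placeholder in the partition that Galaxy submits to.
PartitionName=main Nodes=placeholder Default=YES MaxTime=INFINITE State=UP
```

The Slurm documentation describes State=FUTURE as a node that is defined for
future use and need not exist when the daemons are started, which is what
lets the definition sit in the config before any worker is up.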

Cheers,
Enis

On Tue, Nov 1, 2016 at 1:04 PM, Marco Tangaro wrote:

> Dear all,
> I have an issue using Galaxy with elastic cluster support.
> Elasticity is provided through SLURM integration: a worker node is added as
> soon as jobs are submitted through the Galaxy portal.
>
> When a job is submitted, the node takes some minutes to be configured.
> After ~7 minutes, Galaxy gives me the failure message "Job output not
> returned from cluster".
> By contrast, if the node is already up, everything works.
> I have only tried a very simple job, getting the UCSC human genome.
> I'm using the Galaxy master branch with postgresql+nginx+uwsgi+proftpd.
>
> Here is my job_conf.xml configuration:
> https://gist.github.com/mtangaro/c0528c3d9a7b44b3bab35dbd947f2c81
>
> I'm not a Slurm expert. I went through the mailing list archive, but I
> could not solve the issue.
> Thanks a lot for your help.
> Marco
___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

[galaxy-dev] Galaxy+Slurm (with elastic cluster) error: "Job output not returned from cluster"

2016-11-01 Thread Marco Tangaro
Dear all,
I have an issue using Galaxy with elastic cluster support.
Elasticity is provided through SLURM integration: a worker node is added as
soon as jobs are submitted through the Galaxy portal.

When a job is submitted, the node takes some minutes to be configured.
After ~7 minutes, Galaxy gives me the failure message "Job output not returned
from cluster".
By contrast, if the node is already up, everything works.
I have only tried a very simple job, getting the UCSC human genome.
I'm using the Galaxy master branch with postgresql+nginx+uwsgi+proftpd.

Here is my job_conf.xml configuration:
https://gist.github.com/mtangaro/c0528c3d9a7b44b3bab35dbd947f2c81

I'm not a Slurm expert. I went through the mailing list archive, but I
could not solve the issue.
Thanks a lot for your help.
Marco