Hi Reuti,

Thanks for your prompt answer.
Regarding your questions:

> How does your application read the list of granted machines?
> Did you compile MPI on your own (which implementation in detail)?

I have no control over this app and no documentation for it. It was designed
by an Electrical Research Center for our purposes.

> PS: I assume that with $round_robin simply all (or at least: many) nodes
> were access allowed to.

Yes, that's correct.

> As now hosts are first filled before access to another one is granted, you
> might see the effect of the former (possibly wrong) distribution of slave
> tasks to the nodes

So, do I understand correctly that the app would need to be recompiled to
take advantage of the $fill_up option?
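
For context on the "list of granted machines" question: as far as I know, Grid
Engine writes the hosts granted to a parallel job into the file named by the
$PE_HOSTFILE environment variable, one line per host (hostname, slot count,
queue, processor range). Below is a minimal sketch, assuming that format, of
how a wrapper could read it instead of addressing every node. The function
names are hypothetical and this is not how our app works today.

```python
import os


def parse_pe_hostfile(text):
    """Return [(hostname, slots), ...] from PE_HOSTFILE-style text.

    Assumes each line looks like: "<host> <slots> <queue> <processor range>".
    """
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        hosts.append((fields[0], int(fields[1])))
    return hosts


def granted_hosts():
    """Read the granted hosts for the current job, or [] outside a PE job."""
    path = os.environ.get("PE_HOSTFILE")  # set by Grid Engine for PE jobs
    if path is None:
        return []
    with open(path) as fh:
        return parse_pe_hostfile(fh.read())
```

A launcher that only starts tasks on the hosts returned by granted_hosts()
(hypothetical name) should not hit nodes that were never granted to the job.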

All the best,

Sergio


On Fri, Jul 26, 2013 at 10:06 AM, Reuti <[email protected]> wrote:

> Hi,
>
> On 26.07.2013 at 14:22, Sergio Mafra wrote:
>
> > I'm using MIT StarCluster with mpich2 and OGE. Everything's ok.
> > But when I tried to change the work distribution strategy from
> > Round Robin (default) to Fill Up, my problems began.
> > OGE keeps telling me that some nodes cannot receive tasks...
>
> On the one hand this is a good sign, as it confirms that your PE is
> defined to control slave tasks on the nodes.
>
>
> > "Error: executing task of job 9 failed: execution daemon on host
> > "node002" didn't accept task"
> > It seems that my MPI app always tries to run on all nodes of the
> > cluster, regardless of whether OGE allows it.
> > Does anybody know of a workaround?
>
> This indicates that your application tries to use a node in the cluster
> which wasn't granted to this job by SGE.
>
> How does your application read the list of granted machines?
>
> Did you compile MPI on your own (which implementation in detail)?
>
> -- Reuti
>
> PS: I assume that with $round_robin simply all (or at least: many) nodes
> were access allowed to. As now hosts are first filled before access to
> another one is granted, you might see the effect of the former (possibly
> wrong) distribution of slave tasks to the nodes.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users