Thanks Matt - that does indeed resolve the "how" question :-)

We'll talk internally about how best to resolve the issue. We could, of course, 
add a flag to indicate "we are using a shell-script version of srun" so we know 
to quote things, but it would mean another thing that the user would have to do 
(as opposed to just running out-of-the-box).

If we quote everything by default, then we have to modify our parser to strip 
the quotes when someone isn't using a script wrapper or else the system gets in 
trouble - but Jeff is concerned about us stripping things by default in case a 
user specifies an MCA param value that actually begins/ends with quotes. I'm 
not sure that's a valid use-case, but we'll debate it.
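
To make the trade-off concrete, here is a minimal sketch in Python. It is for illustration only and is not Open MPI's actual parser; the helper strip_outer_quotes and the parameter name hypothetical_param are made up for this example, and shlex merely stands in for the shell re-parsing that a script wrapper would cause.

```python
import shlex


def strip_outer_quotes(value: str) -> str:
    """Remove one matching pair of surrounding quotes, if present.

    Hypothetical helper: stands in for "strip the quotes by default".
    """
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


# A made-up MCA param value that contains a space.
value = "foo bar"
quoted = shlex.quote(value)  # "'foo bar'"

# Case 1: the launcher is a shell-script wrapper, so the command line is
# re-parsed by a shell. The shell consumes the quotes and the value arrives
# as one intact token.
via_shell = shlex.split(f"--mca hypothetical_param {quoted}")

# Case 2: the launcher fork/execs the tokens directly. No shell runs, so the
# quote characters arrive as part of the token and the parser must strip them.
via_exec = ["--mca", "hypothetical_param", quoted]
recovered = strip_outer_quotes(via_exec[2])  # "foo bar"

# The concern: a user-supplied value that really does begin and end with
# quotes is indistinguishable from case 2, so stripping silently changes it.
literal = '"quoted"'
mangled = strip_outer_quotes(literal)  # "quoted"
```

Under these assumptions, quoting by default is safe for the wrapper case and recoverable for the direct-exec case, but only at the cost of mangling values whose quotes were intentional.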

Either way, we'll give you a solution.
Ralph


On Sep 3, 2014, at 6:27 AM, Matt Thompson <fort...@gmail.com> wrote:

> On Tue, Sep 2, 2014 at 8:38 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> Matt: Random thought -- is your "srun" a shell script, perchance?  (it 
> shouldn't be, but perhaps there's some kind of local override...?)
> 
> Ralph's point on the call today is that it doesn't matter *how* this problem 
> is happening.  It *is* happening to real users, and so we need to account for 
> it.
> 
> But it really bothers me that we don't understand *how/why* this is happening 
> (e.g., is this OMPI's fault somehow?  I don't think so, but then again, we 
> don't understand how it's happening).  *Somewhere* in there, a shell is 
> getting invoked.  But "srun" shouldn't be invoking a shell on the remote side 
> -- it should be directly fork/exec'ing the tokens with no shell 
> interpretation at all.
> 
> Jeff,
> 
> Just saw this, sorry. Our srun is indeed a shell script. It seems to be a 
> wrapper around the regular srun that runs a --task-prolog. What it 
> does...that's beyond my ken, but I could ask. My guess is that it probably 
> does something that helps keep our old PBS scripts running (sets 
> $PBS_NODEFILE, say). We used to run PBS but switched to SLURM recently. The 
> admins would, of course, prefer all future scripts be SLURM-native scripts, 
> but there are a lot of production runs that use many, many PBS scripts. 
> Converting them would need slow, careful QC to make sure any "pure SLURM" 
> versions act as expected.
> 
> Matt
> 
> 
> -- 
> "And, isn't sanity really just a one-trick pony anyway? I mean all you
>  get is one trick: rational thinking. But when you're good and crazy, 
>  oooh, oooh, oooh, the sky is the limit!" -- The Tick
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/09/25248.php
