Thanks Matt - that does indeed resolve the "how" question :-) We'll talk internally about how best to resolve the issue. We could, of course, add a flag to indicate "we are using a shell-script version of srun" so we know to quote things, but that would be one more thing the user has to do (as opposed to just running out-of-the-box).
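For anyone following along, here is a rough sketch of why a shell-script srun changes the picture. It is not OMPI or SLURM code: the `--param` option, the `two words` value, and the `sh -c` call standing in for the site wrapper are all made up for illustration, and a real wrapper may re-parse its arguments differently. It assumes a POSIX `sh` is available.

```python
import shlex
import subprocess
import sys

# Hypothetical parameter value containing a space (not taken from the thread).
value = "two words"

# A tiny child program that just reports the arguments it received.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]


def run_direct(args):
    """Hand the tokens straight to exec: no shell ever interprets them."""
    result = subprocess.run(child + args, capture_output=True, text=True)
    return result.stdout.strip()


def run_via_shell(args):
    """Simulate a wrapper that rebuilds a command line and lets a shell parse it.

    The child command itself is always quoted so that only the handling of
    `args` differs between the cases below.
    """
    prefix = " ".join(shlex.quote(tok) for tok in child)
    command = prefix + " " + " ".join(args)
    result = subprocess.run(["sh", "-c", command], capture_output=True, text=True)
    return result.stdout.strip()


# Direct exec: the value arrives as a single token.
print("direct:        ", run_direct(["--param", value]))

# Through a shell, unquoted: the shell splits the value on whitespace.
print("shell, raw:    ", run_via_shell(["--param", value]))

# Through a shell, quoted by the caller: the value survives as one token.
print("shell, quoted: ", run_via_shell(["--param", shlex.quote(value)]))
```

Note that the quoted form is only right when a shell really sits in the path; handed to a direct exec, the quote characters would arrive as part of the value, which is why the launcher needs to know which case it is in.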
If we quote everything by default, then we have to modify our parser to strip the quotes when someone isn't using a script wrapper or else the system gets in trouble - but Jeff is concerned about us stripping things by default in case a user specifies an MCA param value that actually begins/ends with quotes. I'm not sure that's a valid use-case, but we'll debate it.

Either way, we'll give you a solution.

Ralph

On Sep 3, 2014, at 6:27 AM, Matt Thompson <fort...@gmail.com> wrote:

> On Tue, Sep 2, 2014 at 8:38 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
> wrote:
> Matt: Random thought -- is your "srun" a shell script, perchance? (it
> shouldn't be, but perhaps there's some kind of local override...?)
>
> Ralph's point on the call today is that it doesn't matter *how* this problem
> is happening. It *is* happening to real users, and so we need to account for
> it.
>
> But it really bothers me that we don't understand *how/why* this is happening
> (e.g., is this OMPI's fault somehow? I don't think so, but then again, we
> don't understand how it's happening). *Somewhere* in there, a shell is
> getting invoked. But "srun" shouldn't be invoking a shell on the remote side
> -- it should be directly fork/exec'ing the tokens with no shell
> interpretation at all.
>
> Jeff,
>
> Just saw this, sorry. Our srun is indeed a shell script. It seems to be a
> wrapper around the regular srun that runs a --task-prolog. What it
> does...that's beyond my ken, but I could ask. My guess is that it probably
> does something that helps keep our old PBS scripts running (sets
> $PBS_NODEFILE, say). We used to run PBS but switched to SLURM recently. The
> admins would, of course, prefer all future scripts be SLURM-native scripts,
> but there are a lot of production runs that uses many, many PBS scripts.
> Converting that would need slow, careful QC to make sure any "pure SLURM"
> versions act as expected.
>
> Matt
>
>
> --
> "And, isn't sanity really just a one-trick pony anyway? I mean all you
> get is one trick: rational thinking. But when you're good and crazy,
> oooh, oooh, oooh, the sky is the limit!" -- The Tick
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/09/25248.php