I can guarantee bproc support isn't broken in 1.2 - we use it on several
production machines every day, and it works fine. I heard of only one
potential problem having to do with specifying multiple app_contexts on a
cmd line, but we are still trying to confirm that it wasn't operator error.

In the 1.2 series, we don't pass mca params back to the orteds. We did this
because so many mca params can be set that we would frequently overrun the
system limit on cmd line length. Remember, those params can be in a
system-level file, a user-level file, the environment, and/or on the cmd
line!
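
To make the length problem concrete, here is a rough Python sketch (not OMPI code) that builds a pretend orted cmd line carrying every param and checks it against a limit. The param names, the `--mca name value` form on the orted, and the helper names are all made up for illustration, and the real limit may come from rsh or the shell rather than ARG_MAX.

```python
import os


def build_orted_cmd(params):
    """Build a hypothetical orted cmd line that forwards every mca param."""
    argv = ["orted"]
    for name, value in sorted(params.items()):
        # Assumed form: one "--mca <name> <value>" triple per param.
        argv += ["--mca", name, str(value)]
    return argv


def cmdline_bytes(argv):
    """Approximate size: each argument plus its terminating NUL byte."""
    return sum(len(arg.encode()) + 1 for arg in argv)


def fits(argv, limit=None):
    """Return True if the cmd line stays within the given byte limit."""
    if limit is None:
        # ARG_MAX is only one possible limit; rsh or a shell may be stricter.
        limit = os.sysconf("SC_ARG_MAX")
    return cmdline_bytes(argv) <= limit
```

With a few hundred params collected from the files, the environment, and the cmd line, a small limit is exceeded very quickly, which is the situation described above.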

This restriction has been lifted in 1.3, but we didn't back-port it to the
1.2 series. So I'm afraid that in 1.2 the orted is going to act on whatever
environment it senses (bproc, in your case), regardless of the mca params
given to mpirun.

Of more interest would be understanding why your build isn't working in
bproc. Could you send me the error you are getting? I'm betting that the
problem lies in determining the node allocation as that is the usual place
we hit problems - not much is "standard" about how allocations are
communicated in the bproc world, though we did try to support a few of the
more common methods.

Ralph



On 6/23/08 2:12 PM, "Joshua Bernstein" <jbernst...@penguincomputing.com>
wrote:

> 
> 
> Ralph Castain wrote:
>> Hi Joshua
>> 
>> Again, forwarded by the friendly elf - so include me directly in any reply.
>> 
>> I gather from Jeff that you are attempting to do something with bproc -
>> true? If so, I will echo what Jeff said: bproc support in OMPI is being
>> dropped with the 1.3 release due to lack of interest/support. Just a "heads
>> up".
> 
> Understood.
> 
>> If you are operating in a bproc environment, then I'm not sure why you are
>> specifying that the system use the rsh launcher. Bproc requires some very
>> special handling which is only present in the bproc launcher. You can run
>> both MPI and non-MPI apps with it, but bproc is weird and so OMPI has some
>> -very- different logic in it to make it all work.
> 
> Well, I'm trying to determine how broken, if at all, the bproc support
> is in OpenMPI. So considering out of the gate it wasn't working, I
> thought I'd try to disable the built in BProc stuff and fall back to RSH.
> 
>> I suspect the problem you are having is that all of the frameworks are
>> detecting bproc and trying to run accordingly. This means that the orted is
>> executing process startup procedures for bproc - which are totally different
>> than for any other environment (e.g., rsh). If mpirun is attempting to
>> execute an rsh launch, and the orted is expecting a bproc launch, then I can
>> guarantee that no processes will be launched and you will hang.
> 
> Exactly what I'm seeing now...
> 
>> I'm not sure there is a way in 1.2 to tell the orteds to ignore the fact
>> that they see bproc and do something else. I can look, but would rather wait
>> to hear if that is truly what you are trying to do, and why.
> 
> I would really appreciate it if you wouldn't mind looking. From reading
> the documentation I didn't realize that mpirun and the orted were doing
> two different things. I thought the --mca parameter applied to both.
> 
> -Joshua Bernstein
> Software Engineer
> Penguin Computing

