Aha! This is a problem that continues to bite us - it relates to the pty
problem in Mac OS X. Your backtrace shows mpirun stuck in ioctl() under
grantpt() and openpty(). There has been a ton of chatter about this, but
Apple doesn't seem inclined to fix it.

Try configuring with --disable-pty-support and see if that helps. FWIW, you
will find a platform file for Mac OS X in the trunk - I always build with it,
and have spent considerable time fine-tuning it. You configure with:

./configure --prefix=whatever \
    --with-platform=contrib/platform/lanl/macosx-dynamic

In that directory, you will also find platform files for static builds under
both Tiger and Leopard (slight differences).
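
As a rough sketch of the first suggestion, a rebuild with pty support
disabled might look like the following. The prefix and the test program are
simply the ones from Greg's steps quoted below; treat them as placeholders.
Whether the flag also needs to be passed when building with the platform file
is not something this sketch settles.

```shell
# Rebuild from a trunk checkout with pty support disabled.
# The prefix is the one used in the quoted steps; adjust as needed.
./autogen.sh
./configure --prefix=/usr/local/openmpi-1.3-devel --disable-pty-support
make
make install

# Retry the multi-process run that was hanging.
mpicc -g -o xxx xxx.c
mpirun -np 2 ./xxx
```

If the two-process job now completes, that points at the pty allocation seen
in the backtrace rather than at a recent trunk change.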

ralph


On 5/27/08 8:01 PM, "Greg Watson" <g.wat...@computer.org> wrote:

> Ralph,
> 
> I tried rolling back to 18513 but no luck. Steps:
> 
> $ ./autogen.sh
> $ ./configure --prefix=/usr/local/openmpi-1.3-devel
> $ make
> $ make install
> $ mpicc -g -o xxx xxx.c
> $ mpirun -np 2 ./xxx
> $ ps x
> 44832 s001  R+     0:50.00 mpirun -np 2 ./xxx
> 44833 s001  S+     0:00.03 ./xxx
> $ gdb /usr/local/openmpi-1.3-devel/bin/mpirun
> ...
> (gdb) attach 44832
> Attaching to program: `/usr/local/openmpi-1.3-devel/bin/mpirun',
> process 44832.
> Reading symbols for shared libraries ++++
> +.......................................... done
> 0x9371b3dd in ioctl ()
> (gdb) where
> #0  0x9371b3dd in ioctl ()
> #1  0x93754812 in grantpt ()
> #2  0x9375470b in openpty ()
> #3  0x001446d9 in opal_openpty ()
> #4  0x000bf3bf in orte_iof_base_setup_prefork ()
> #5  0x003da62f in odls_default_fork_local_proc (context=0x216a60,
> child=0x216dd0, environ_copy=0x217930) at odls_default_module.c:191
> #6  0x000c3e76 in orte_odls_base_default_launch_local ()
> #7  0x003daace in orte_odls_default_launch_local_procs (data=0x216780)
> at odls_default_module.c:360
> #8  0x000ad2f6 in process_commands (sender=0x216768, buffer=0x216780,
> tag=1) at orted/orted_comm.c:441
> #9  0x000acd52 in orte_daemon_cmd_processor (fd=-1, opal_event=1,
> data=0x216750) at orted/orted_comm.c:346
> #10 0x0012bd21 in event_process_active () at opal_object.h:498
> #11 0x0012c3c5 in opal_event_base_loop () at opal_object.h:498
> #12 0x0012bf8c in opal_event_loop () at opal_object.h:498
> #13 0x0011b334 in opal_progress () at runtime/opal_progress.c:169
> #14 0x000cd9b4 in orte_plm_base_report_launched () at opal_object.h:498
> #15 0x000cc2b7 in orte_plm_base_launch_apps () at opal_object.h:498
> #16 0x0003d626 in orte_plm_rsh_launch (jdata=0x200ae0) at
> plm_rsh_module.c:1126
> #17 0x00002604 in orterun (argc=4, argv=0xbffff880) at orterun.c:549
> #18 0x00001bd6 in main (argc=4, argv=0xbffff880) at main.c:13
> 
> On May 27, 2008, at 9:11 PM, Ralph Castain wrote:
> 
>> Yo Greg
>> 
>> I'm not seeing any problem on my Mac OSX - I'm running Leopard. Can you
>> tell me how you configured, and the precise command you executed?
>> 
>> Thanks
>> Ralph
>> 
>> 
>> 
>> On 5/27/08 5:15 PM, "Ralph Castain" <r...@lanl.gov> wrote:
>> 
>>> Hmmm...well, it was working about 3 hours ago! I'll try to take a look
>>> tonight, but it may be tomorrow.
>>> 
>>> Try rolling it back just a little to r18513 - that's the last rev I
>>> tested on my Mac.
>>> 
>>> 
>>> On 5/27/08 5:00 PM, "Greg Watson" <g.wat...@computer.org> wrote:
>>> 
>>>> Something seems to be broken in the trunk for MacOS X. I can run a 1
>>>> process job, but a >1 process job hangs. It was working a few days ago.
>>>> 
>>>> Greg
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>>> 

