Re: [OMPI devel] C++ errhandler

2008-02-15 Thread Tim Prins

Done: https://svn.open-mpi.org/trac/ompi/ticket/1216

Tim

Jeff Squyres wrote:
Blah; it is not a known issue.  I swear I tested this, but I must have  
goofed.  :-(


Can you file a bug and assign it to me?  I can't look at it this  
second, but perhaps I can later today.  Thanks...




On Feb 15, 2008, at 9:19 AM, Tim Prins wrote:


Hi,

We are running into a problem with the IBM test cxx_call_errhandler
since the merge of the c++ bindings changes. Not sure if this is a known
problem, but I did not see a bug or any traffic about this one.

MTT link: http://www.open-mpi.org/mtt/index.php?do_redir=532

Thanks,

Tim
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel








Re: [OMPI devel] C++ errhandler

2008-02-15 Thread Jeff Squyres
Blah; it is not a known issue.  I swear I tested this, but I must have  
goofed.  :-(


Can you file a bug and assign it to me?  I can't look at it this  
second, but perhaps I can later today.  Thanks...




On Feb 15, 2008, at 9:19 AM, Tim Prins wrote:


Hi,

We are running into a problem with the IBM test cxx_call_errhandler
since the merge of the c++ bindings changes. Not sure if this is a known
problem, but I did not see a bug or any traffic about this one.

MTT link: http://www.open-mpi.org/mtt/index.php?do_redir=532

Thanks,

Tim



--
Jeff Squyres
Cisco Systems



[OMPI devel] C++ errhandler

2008-02-15 Thread Tim Prins

Hi,

We are running into a problem with the IBM test cxx_call_errhandler 
since the merge of the c++ bindings changes. Not sure if this is a known 
problem, but I did not see a bug or any traffic about this one.


MTT link: http://www.open-mpi.org/mtt/index.php?do_redir=532

Thanks,

Tim


Re: [OMPI devel] New address selection for btl-tcp (was Re: [OMPI svn] svn:open-mpi r17307)

2008-02-15 Thread Tim Prins

Adrian Knoth wrote:

On Fri, Feb 01, 2008 at 11:40:20AM -0500, Tim Prins wrote:


Adrian,


Hi!

Sorry for the late reply and thanks for your testing.


1. There are some warnings when compiling:


I've fixed these issues.

Thanks.


2. If I exclude all my tcp interfaces, the connection fails properly, 
but I do get a malloc request for 0 bytes:
[tprins@odin examples]$ mpirun -mca btl tcp,self -mca btl_tcp_if_exclude eth0,ib0,lo -np 2 ./ring_c

malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)



Not my fault, but I guess we could fix it anyway. Should we?

It probably should be fixed. But I've noticed that other BTLs (such as 
MX) do not properly handle the case where there are no available 
interfaces either...
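
The "Request for 0 bytes" messages in item 2 appear when no TCP interface is left after exclusion. A minimal sketch of one way to guard such an allocation is shown here. The names alloc_if_slots and IF_ALLOC_* are hypothetical and chosen only for illustration; this is not the code at btl_tcp_component.c line 844, and it assumes the array is sized by the number of usable interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch; not Open MPI constants. */
enum { IF_ALLOC_OK = 0, IF_ALLOC_NONE = -1, IF_ALLOC_NOMEM = -2 };

/*
 * Allocate a zeroed array of `count` slots of `slot_size` bytes each.
 * When there is nothing to allocate, report "no usable interfaces"
 * instead of asking the allocator for 0 bytes.
 */
static int alloc_if_slots(size_t count, size_t slot_size, void **out)
{
    assert(out != NULL);
    *out = NULL;

    if (count == 0 || slot_size == 0) {
        return IF_ALLOC_NONE;   /* skip the allocator entirely */
    }

    *out = calloc(count, slot_size);
    return (*out != NULL) ? IF_ALLOC_OK : IF_ALLOC_NOMEM;
}
```

A caller could treat IF_ALLOC_NONE as "this BTL has no interfaces" and disable the component cleanly, rather than continuing with an empty array.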




3. If the exclude list does not contain 'lo', or the include list 
contains 'lo', the job hangs when using multiple nodes:


That's weird. Loopback interfaces should automatically be excluded right
from the beginning. See opal/util/if.c.

I neither know nor have checked where things go wrong. Do you want to
investigate? As already mentioned, this should not happen.
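
For reference, a flag-based loopback check (as opposed to matching the name 'lo') can be sketched with the POSIX-style getifaddrs() call. This is only an illustration under the assumption of a Linux/glibc system; it is not the logic in opal/util/if.c, and the helper names is_usable_flags and count_usable_entries are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <stdbool.h>
#include <stddef.h>

/* An entry is usable for this sketch if it is up and not a loopback device. */
static bool is_usable_flags(unsigned int flags)
{
    return (flags & IFF_UP) != 0 && (flags & IFF_LOOPBACK) == 0;
}

/*
 * Count usable entries reported by getifaddrs().  Note that getifaddrs()
 * returns one entry per address, so an interface with several addresses
 * is counted several times.  Returns -1 if the query itself fails.
 */
static int count_usable_entries(void)
{
    struct ifaddrs *head = NULL;
    if (getifaddrs(&head) != 0) {
        return -1;
    }

    int count = 0;
    for (const struct ifaddrs *ifa = head; ifa != NULL; ifa = ifa->ifa_next) {
        if (is_usable_flags(ifa->ifa_flags)) {
            count++;
        }
    }

    freeifaddrs(head);
    return count;
}
```

Checking IFF_LOOPBACK does not depend on the interface being named 'lo', so it keeps working if the user replaces the default exclude list.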
I took a quick glance at this file, and I'd be lying if I said I 
understood what was going on in it. One thing I did notice is that the 
parameter btl_tcp_if_exclude defaults to 'lo', but the user can of 
course override it.


It might be worth looking into this further. If the user got an error, or 
the job aborted, when they did something wrong with 'lo', I would not worry 
about it at all. But the fact that it causes a hang is worrisome to me.
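
To make the concern concrete: once the user supplies their own btl_tcp_if_exclude value, the default 'lo' entry is gone unless they repeat it. A minimal sketch of a whole-entry lookup in a comma-separated interface list follows. The function name if_list_contains is hypothetical and this is not Open MPI's parser; a check like this could be used to emit a warning or error instead of letting the job hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `name` appears as a complete entry in the
 * comma-separated `list` (so "lo" does not match "lo0").
 */
static bool if_list_contains(const char *list, const char *name)
{
    assert(list != NULL && name != NULL);

    size_t name_len = strlen(name);
    const char *entry = list;

    while (*entry != '\0') {
        const char *comma = strchr(entry, ',');
        size_t entry_len = comma ? (size_t)(comma - entry) : strlen(entry);

        if (entry_len == name_len && strncmp(entry, name, name_len) == 0) {
            return true;
        }
        if (comma == NULL) {
            break;              /* last entry examined */
        }
        entry = comma + 1;      /* move past the comma */
    }
    return false;
}
```

With the default value "lo" the lookup succeeds, while a user-supplied "eth0,ib0" no longer contains 'lo', which matches the situation in item 3.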




Can you post the output of "ip a s" or "ifconfig -a"?

It is at the end of the email.



However, the great news about this patch is that it appears to fix 
https://svn.open-mpi.org/trac/ompi/ticket/1027 for me.


It also fixes my #1206. I'd like to merge tmp-public/btl-tcp into the
trunk, especially before the 1.3 code freeze. Any objections?

Not from me, especially now that it is already in the trunk :).

Tim


--
ifconfig -a:
eth0  Link encap:Ethernet  HWaddr 00:E0:81:2D:0B:08
  inet addr:129.79.240.101  Bcast:129.79.240.255  Mask:255.255.255.0
  inet6 addr: 2001:18e8:2:240:2e0:81ff:fe2d:b08/64 Scope:Global
  inet6 addr: fe80::2e0:81ff:fe2d:b08/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:555918407 errors:0 dropped:2122 overruns:0 frame:0
  TX packets:569928551 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:448936694980 (418.1 GiB)  TX bytes:486030858441 (452.6 GiB)
  Interrupt:193

eth1  Link encap:Ethernet  HWaddr 00:E0:81:2D:0B:09
  BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
  Interrupt:201

ib0   Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
  inet addr:192.168.0.101  Bcast:192.168.0.255  Mask:255.255.255.0
  inet6 addr: fe80::202:c902:0:5d71/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
  RX packets:6304819 errors:0 dropped:0 overruns:0 frame:0
  TX packets:6355094 errors:0 dropped:2 overruns:0 carrier:0
  collisions:0 txqueuelen:128
  RX bytes:26794850321 (24.9 GiB)  TX bytes:35448899645 (33.0 GiB)

ib1   Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
  BROADCAST MULTICAST  MTU:2044  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:128
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

lo    Link encap:Local Loopback
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
  RX packets:182055033 errors:0 dropped:0 overruns:0 frame:0
  TX packets:182055033 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:997605665018 (929.0 GiB)  TX bytes:997605665018 (929.0 GiB)

sit0  Link encap:IPv6-in-IPv4
  NOARP  MTU:1480  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

ip a s:
1: lo:  mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
inet6 ::1/128 scope host
   valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:e0:81:2d:0b:08 brd ff:ff:ff:ff:ff:ff
inet