ClusterTools 7.x does not support multiple IB HCAs with uDAPL.
ClusterTools 8.0 will; it is currently in early access (EA):
http://www.sun.com/software/products/clustertools/early_access.xml

Providing the additional data Bill suggested would be helpful. You say
you saw this with dapltest as well. It would be good to know which version
you used and where you got it from, since I have seen different versions around.
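
A minimal sketch for gathering the two outputs Bill lists below into one file to attach to a reply. It assumes a POSIX-compatible shell; the output file name udapl-diag.txt and the run helper are placeholders chosen for illustration, not part of any Solaris tool.

```shell
#!/bin/sh
# Sketch: collect the diagnostics requested in this thread into one file.
# The default output file name is an arbitrary, hypothetical choice.
out="${1:-udapl-diag.txt}"

run() {
    # Write a header naming the command, then its output.
    # If the command is not installed, note that instead of failing.
    echo "=== $* ===" >> "$out"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >> "$out" 2>&1
    else
        echo "($1 not found on this host)" >> "$out"
    fi
}

: > "$out"        # start with an empty file
run ifconfig -a   # interface configuration, including any ibd* devices
run datadm -v     # registered DAT providers
echo "wrote $out"
```

Running it on each host and attaching the resulting files keeps the output from both nodes side by side.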

-DON

Bill Taylor wrote:
> The information provided is not enough for an investigation
> to be useful.  Additional data of interest would minimally be:
>
>       output of "ifconfig -a"
>       output of "datadm -v"   (multiple entries may be a problem)
>
> If the user is running s10u4, I'd recommend trying IB
> Update 1.0.  There may be a bug fix in it that is not in
> OpenSolaris.  The one I remember is in datadm, so I think
> it should not cause this problem.
>
> I run with 2 ports connected all the time with only 1 IPoIB
> device configured, and do not have the problem described by
> this user.  I am running s10u4 plus the download (SDLC) of
> IB Updates 1.0.  The config on both my systems looks like:
>
>    $ ls -l /dev/ibd?
>    lrwxrwxrwx   1 root     root          71 Mar 31 14:54 /dev/ibd1 ->
>      ../devices/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]/[EMAIL PROTECTED],8001,ipib:ibd1
>    lrwxrwxrwx   1 root     root          71 Mar 31 14:54 /dev/ibd3 ->
>      ../devices/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci15b3,[EMAIL PROTECTED]/[EMAIL PROTECTED],8001,ipib:ibd3
>    $ ifconfig -a
>    lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
>         inet 127.0.0.1 netmask ff000000
>    bge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
>         inet 10.1.49.193 netmask ffffff00 broadcast 10.1.49.255
>    ibd1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 2044 index 3
>         inet 18.0.0.193 netmask ff000000 broadcast 18.255.255.255
>    $ datadm -v
>    ibd1  u1.2  nonthreadsafe  default  udapl_tavor.so.1  SUNW.1.0  " "  "driver_name=tavor"
>    $ mpirun --mca pls_rsh_agent rsh --host 18.0.0.193,18.0.0.194 \
>        --mca mpi_preconnect_all 1 \
>        --mca btl self,sm,udapl --mca mpi_leave_pinned 1 osu_latency
>    # OSU MPI Latency Test (Version 2.2)
>    # Size              Latency (us)
>    0                   4.32
>    1                   4.57
>    2                   4.57
>    4                   4.55
>    8                   4.61
>    16                  4.63
>    32                  4.72
>    64                  4.86
>    128                 5.06
>    256                 5.99
>    512                 6.66
>    1024                8.04
>    2048               10.73
>    4096               13.63
>    8192               31.09
>    16384              39.89
>    32768              57.02
>    65536              91.47
>    131072            160.03
>    262144            298.00
>    524288            572.74
>    1048576          1122.57
>    2097152          2221.51
>    4194304          5962.94
>    $
>
> - BT
>
>
>   
>> ------------------------------------------------------------------------
>>
>> Subject: [networking-discuss] Solaris/OpenSolaris uDAPL doesn't work when both IB HCA ports are connected
>> From: Denis Golubev <[EMAIL PROTECTED]>
>> Date: Thu, 03 Apr 2008 02:23:12 -0700 (PDT)
>> To: [email protected]
>>
>> Hello.
>>
>> I discovered that Solaris/OpenSolaris uDAPL works only if a single HCA port is
>> connected to the IB fabric. When I connect two ports of one HCA, or two
>> ports from different HCAs, to the IB fabric, uDAPL doesn't work.
>>
>> I discovered this problem with HPC ClusterTools 7.x and verified it with
>> dapltest. When two or more IB ports from a single host are connected to the
>> IB fabric, the 'dat_ep_connect' routine always returns DAT_INTERNAL_ERROR.
>>
>> The uDAPL configuration via datadm doesn't matter: more than one connection to
>> the IB fabric breaks uDAPL regardless of the number of IPoIB interfaces
>> configured for DAT usage.
>>
>> Please advise whether it is possible to avoid this error and use more than one IB
>> connection per host for uDAPL. Thanks in advance.
>>
>> Regards,
>>
>> Denis
>>  
>>  
>> This message posted from opensolaris.org
>> _______________________________________________
>> networking-discuss mailing list
>> [email protected]
>>     
