Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-04 Thread Or Gerlitz
Sean Hefty wrote:
 If I look at what's there today, we're trying to find some way to match the
 net_device src_dev_addr with some sort of address associated with an 
 ib_device.
 In the case of actual IB, the net_device src_dev_addr contains the SGID, which
 provides the mapping.

 
 Steve, can you please clarify the iWarp case for me?  For iWarp, doesn't the
 src_dev_addr contain the MAC?  So the 'GID' reported for an iWarp device is
 really just the MAC.  Is this correct?


 If this is the case, then couldn't rocee (I hate that name) report its MAC as
 one of its GIDs?  This would ensure that the mapping between net_device and
 ib_device was correct.

Sean, AFAIK, reporting the MAC as one of the GIDs was part of the IBoE (feel
free not to use names which you don't like) design presented a couple of
times, isn't it, Eli, Liran?

Or.
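
For readers following the MAC-as-GID idea: one common way to embed a 48-bit MAC in a 128-bit GID is the IPv6 link-local form (fe80::/64 followed by the modified EUI-64 of the MAC). The sketch below is a user-space illustration only. mac_to_ll_gid() is a hypothetical name, and whether the rocee code in OFED-1.5.1 uses exactly this layout is an assumption.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the OFED tree: embed a 6-byte
 * Ethernet MAC in a 16-byte GID using the IPv6 link-local prefix
 * fe80::/64 followed by the modified EUI-64 form of the MAC.
 */
static void mac_to_ll_gid(const uint8_t mac[6], uint8_t gid[16])
{
	memset(gid, 0, 16);
	gid[0] = 0xfe;			/* fe80::/64 link-local prefix */
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 0x02;		/* flip the universal/local bit */
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;			/* ff:fe filler of EUI-64 */
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}
```

With the eth0 HWaddr quoted later in this thread, 00:04:23:AF:8E:CE, this yields fe80::204:23ff:feaf:8ece, which matches that interface's inet6 link-local address in the ifconfig output.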

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-04 Thread Steve Wise
Never mind.  I see you already committed the change.  I just pulled the 
latest and rping works over iwarp.

Thanks,

Steve.


Steve Wise wrote:
 Hey Eli,

 This patch doesn't apply.

 If you give me one that applies and builds against RH5.3, I'll test it.

 Thanks,

 Steve.


 Eli Cohen wrote:
   
 Oops, you're right.

 Please try this one:

 commit 483fe703b03b1db99fa4a968fc3a918aa43f856f
 Author: Eli Cohen e...@mellanox.co.il
 Date:   Wed Feb 3 13:10:14 2010 +0200

 CMA: Fix iWarp failures to bind to a device
 
 rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct
 GID based on the hardware address. However, when called from
 cma_acquire_dev(), the transport field is not yet valid. The solution is to
 avoid calling rdma_addr_get_sgid() from cma_acquire_dev() and find the device
 based on its GID: for ethernet, assume first it is rocee and search the GID
 table; if not found, generate the GID by copying it from the hardware address.
 
 Signed-off-by: Eli Cohen e...@mellanox.co.il

 diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
 index a2d5aad..3c5c59f 100644
 --- a/drivers/infiniband/core/cma.c
 +++ b/drivers/infiniband/core/cma.c
 @@ -348,15 +348,29 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
  	union ib_gid gid;
  	int ret = -ENODEV;
  
 -	rdma_addr_get_sgid(dev_addr, &gid);
 +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
 +		rocee_addr_get_sgid(dev_addr, &gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +			ret = ib_find_cached_gid(cma_dev->device, &gid,
 +						 &id_priv->id.port_num, NULL);
 +			if (!ret)
 +				goto out;
 +		}
 +	}
 +
 +	memcpy(&gid, dev_addr->src_dev_addr +
 +	       rdma_addr_gid_offset(dev_addr), sizeof gid);
  	list_for_each_entry(cma_dev, &dev_list, list) {
  		ret = ib_find_cached_gid(cma_dev->device, &gid,
  					 &id_priv->id.port_num, NULL);
 -		if (!ret) {
 -			cma_attach_to_dev(id_priv, cma_dev);
 +		if (!ret)
  			break;
 -		}
  	}
 +
 +out:
 +	if (!ret)
 +		cma_attach_to_dev(id_priv, cma_dev);
 +
  	return ret;
  }
  

   
 
   		memcpy(&gid, dev_addr->src_dev_addr +
   		       rdma_addr_gid_offset(dev_addr), sizeof gid);
   		list_for_each_entry(cma_dev, &dev_list, list) {
   			ret = ib_find_cached_gid(cma_dev->device, &gid,
   						 &id_priv->id.port_num, NULL);
   			if (!ret)
   				break;
   		}
   	}

   	if (!ret)
   		cma_attach_to_dev(id_priv, cma_dev);

   	return ret;
   }
 



 Eli Cohen wrote:
 
   
 On Wed, Feb 03, 2010 at 09:20:05AM -0600, Steve Wise wrote:
   
 
 diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
 index a2d5aad..76dce2b 100644
 --- a/drivers/infiniband/core/cma.c
 +++ b/drivers/infiniband/core/cma.c
 @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
  	union ib_gid gid;
  	int ret = -ENODEV;
 -	rdma_addr_get_sgid(dev_addr, &gid);
 -	list_for_each_entry(cma_dev, &dev_list, list) {
 -		ret = ib_find_cached_gid(cma_dev->device, &gid,
 -					 &id_priv->id.port_num, NULL);
 -		if (!ret) {
 -			cma_attach_to_dev(id_priv, cma_dev);
 -			break;
 +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
 +		rocee_addr_get_sgid(dev_addr, &gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +			ret = ib_find_cached_gid(cma_dev->device, &gid,
 +						 &id_priv->id.port_num, NULL);
 +			if (!ret)
 +				break;
 +		}
   
 
 The above if statement is true for iwarp devices, so this patch is
 just wrong.  rocee_addr_get_sgid() should only be used for ROCEE
 interfaces, correct?
 
   
 No, the idea is this: for non-ARPHRD_INFINIBAND devices (e.g. rocee or
 iwarp), I first assume this is rocee, get the rocee GID, and check if this
 GID appears in any device's GID table. If the MAC address belongs to a
 rocee device then it will be found; if it belongs to an iwarp device
 then it won't be found. In the latter case I build the GID in the
 pre-rocee-patches fashion and search again.
   
 
 +	} else {
 +		memcpy(&gid, dev_addr->src_dev_addr +
 +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +
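
The two-pass lookup Eli describes in this thread (for a non-InfiniBand netdev, first assume rocee and search every device's GID table; if nothing matches, fall back to the GID built the pre-rocee way) can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel code: struct toy_dev, find_by_gid() and acquire_dev() are hypothetical stand-ins for the cma device list and ib_find_cached_gid(), and the two candidate GIDs are passed in rather than derived from a net_device.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16

/* Toy stand-in for an RDMA device and its cached GID table. */
struct toy_dev {
	const char *name;
	const uint8_t (*gids)[GID_LEN];
	size_t ngids;
};

/* Return the first device whose GID table contains gid, or NULL. */
static const struct toy_dev *find_by_gid(const struct toy_dev *devs, size_t n,
					 const uint8_t gid[GID_LEN])
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < devs[i].ngids; j++)
			if (memcmp(devs[i].gids[j], gid, GID_LEN) == 0)
				return &devs[i];
	return NULL;
}

/*
 * Two-pass match: for a non-InfiniBand netdev, try the rocee-style GID
 * first; if no device reports it (or the netdev is InfiniBand), fall
 * back to the GID copied from the hardware address.
 */
static const struct toy_dev *acquire_dev(const struct toy_dev *devs, size_t n,
					 int is_infiniband,
					 const uint8_t rocee_gid[GID_LEN],
					 const uint8_t raw_gid[GID_LEN])
{
	const struct toy_dev *dev = NULL;

	if (!is_infiniband)
		dev = find_by_gid(devs, n, rocee_gid);
	if (!dev)
		dev = find_by_gid(devs, n, raw_gid);
	return dev;
}
```

The fallback runs whenever the first pass finds nothing, which is the behaviour of the later version of the patch (the one using "goto out"); an if/else split between the two passes would never reach the fallback for an iwarp netdev.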

Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-04 Thread Eli Cohen
On Thu, Feb 04, 2010 at 09:46:58AM -0600, Steve Wise wrote:
 Never mind.  I see you already committed the change.  I just pulled
 the latest and rping works over iwarp.
 

Thanks for checking this.


Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-04 Thread Sean Hefty
We should work to get this 'correct' when merging upstream.

Following the spirit of the current code, it is probably cma_acquire_dev()'s
job to fill in the missing ibdev type information after matching the netdev to
an ibdev.

This makes sense to me.

P.S. - I really wish that we had a cleaner way to match an ibdev to a netdev
without overloading the gid table entries.
Basically, it should be the job of the entity that created the netdev to make
this association, and stuff a pointer in the netdev.

Do you have a specific idea here?  So far, we've tried to keep the mapping the
responsibility of the rdma_cm module.  With rocee, we may need to re-architect
the solution and have the ib_device driver make this association.  Even if it's
unlikely, we need to make sure that we don't make the wrong match.





Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-03 Thread Steve Wise
This patch didn't work.  I still get an address resolution error event 
with status -2.

Steve.



Eli Cohen wrote:
 On Tue, Jan 19, 2010 at 04:42:16PM -0800, Woodruff, Robert J wrote:
   
 I am getting the following error when trying to run Intel MPI 
 over nes iwarp cards on today's daily build of OFED-1.5.1.
 OFED-1.5 does not show this problem. 

 mpdtrace
 det-17-eth2
 det-16-eth2
 [0] dapl fabric is not available and fallback fabric is not enabled
 det-17:cd2:  open_hca: rdma_bind ERR No such file or directory. Is eth2 
 configured?
 

 All,

 Since I do not have iwarp cards, I can't check the following patch.
 Please try it and let me know if it solved your problem. If it does,
 I'll push it to tomorrow's build.


 commit 7490e1cce1a295219e23e90d09f78bcdba0977dd
 Author: Eli Cohen e...@mellanox.co.il
 Date:   Wed Feb 3 13:10:14 2010 +0200

 CMA: Fix iWarp failures to bind to a device
 
 rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct
 GID based on the hardware address. However, when called from
 cma_acquire_dev(), the transport field is not yet valid. The solution is to
 avoid calling rdma_addr_get_sgid() from cma_acquire_dev() and find the device
 based on its GID: for ethernet, assume first it is rocee and search the GID
 table; if not found, generate the GID by copying it from the hardware address.
 
 Signed-off-by: Eli Cohen e...@mellanox.co.il

 diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
 index a2d5aad..76dce2b 100644
 --- a/drivers/infiniband/core/cma.c
 +++ b/drivers/infiniband/core/cma.c
 @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
  	union ib_gid gid;
  	int ret = -ENODEV;
  
 -	rdma_addr_get_sgid(dev_addr, &gid);
 -	list_for_each_entry(cma_dev, &dev_list, list) {
 -		ret = ib_find_cached_gid(cma_dev->device, &gid,
 -					 &id_priv->id.port_num, NULL);
 -		if (!ret) {
 -			cma_attach_to_dev(id_priv, cma_dev);
 -			break;
 +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
 +		rocee_addr_get_sgid(dev_addr, &gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +			ret = ib_find_cached_gid(cma_dev->device, &gid,
 +						 &id_priv->id.port_num, NULL);
 +			if (!ret)
 +				break;
 +		}
 +	} else {
 +		memcpy(&gid, dev_addr->src_dev_addr +
 +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +			ret = ib_find_cached_gid(cma_dev->device, &gid,
 +						 &id_priv->id.port_num, NULL);
 +			if (!ret)
 +				break;
  		}
  	}
 +
 +	if (!ret)
 +		cma_attach_to_dev(id_priv, cma_dev);
 +
  	return ret;
  }
  
   



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-03 Thread Steve Wise

 diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
 index a2d5aad..76dce2b 100644
 --- a/drivers/infiniband/core/cma.c
 +++ b/drivers/infiniband/core/cma.c
 @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
  	union ib_gid gid;
  	int ret = -ENODEV;
  
 -	rdma_addr_get_sgid(dev_addr, &gid);
 -	list_for_each_entry(cma_dev, &dev_list, list) {
 -		ret = ib_find_cached_gid(cma_dev->device, &gid,
 -					 &id_priv->id.port_num, NULL);
 -		if (!ret) {
 -			cma_attach_to_dev(id_priv, cma_dev);
 -			break;
 +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
 +		rocee_addr_get_sgid(dev_addr, &gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +			ret = ib_find_cached_gid(cma_dev->device, &gid,
 +						 &id_priv->id.port_num, NULL);
 +			if (!ret)
 +				break;
 +		}
   

The above if statement is true for iwarp devices, so this patch is just
wrong.  rocee_addr_get_sgid() should only be used for ROCEE
interfaces, correct?



 +	} else {
 +		memcpy(&gid, dev_addr->src_dev_addr +
 +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
 +		list_for_each_entry(cma_dev, &dev_list, list) {
 +			ret = ib_find_cached_gid(cma_dev->device, &gid,
 +						 &id_priv->id.port_num, NULL);
 +			if (!ret)
 +				break;
  		}
  	}
 +
 +	if (!ret)
 +		cma_attach_to_dev(id_priv, cma_dev);
 +
  	return ret;
  }
  
   



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-02-03 Thread Eli Cohen
On Wed, Feb 03, 2010 at 09:17:57AM -0600, Steve Wise wrote:
 This patch didn't work.  I still get an address resolution error
 event with status -2.

Can you tell whether the bind succeeded? Can you give some more details
about the failure?


Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-29 Thread Hefty, Sean
I'm just now getting around to this, and iwarp seems to be 100% dead on
ofed-1.5.1.  I see the same error running mvapich2 and rping.

Has anybody looked into why?

I will be looking at this today.  Has anyone tested iwarp against the upstream 
kernel lately?  2.6.32 or 2.6.33?  Woody is suspecting that the IPv6 patches 
may be a contributing factor, which were pulled into 1.5.1.


Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-29 Thread Steve Wise
Hefty, Sean wrote:
 Are these changes upstream?  I didn't see a problem on 2.6.33-rc4...
 

 Yes - the ipv6 fixes should be in 2.6.33-rc4.  I'm not sure if what's 
 upstream is the same as what's in OFED though, or if that's even the issue.
   
I'm debugging this too.

So far, it looks like addr_resolve() is failing...



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-21 Thread Woodruff, Robert J


On the server side, if I do not specify an address,
the server starts

[wo...@det-17 src]$ 
[wo...@det-17 src]$ /sbin/ifconfig 
eth0  Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE  
  inet addr:192.168.0.17  Bcast:192.168.0.255  Mask:255.255.255.0
  inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:1544 errors:0 dropped:0 overruns:0 frame:0
  TX packets:991 errors:0 dropped:0 overruns:0 carrier:0
  collisions:41 txqueuelen:100 
  RX bytes:360766 (352.3 KiB)  TX bytes:131310 (128.2 KiB)
  Base address:0xdc00 Memory:f9fa-f9fc 

eth1  Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CF  
  inet addr:192.168.1.17  Bcast:192.168.1.255  Mask:255.255.255.0
  UP BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
  Base address:0xdc80 Memory:f9fe-fa00 

eth2  Link encap:Ethernet  HWaddr 00:12:55:02:AE:CC  
  inet addr:192.168.2.17  Bcast:192.168.2.255  Mask:255.255.255.0
  inet6 addr: fe80::212:55ff:fe02:aecc/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:34136 errors:0 dropped:0 overruns:0 frame:0
  TX packets:35012 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:2863188 (2.7 MiB)  TX bytes:2907831 (2.7 MiB)
  Interrupt:169 

lo        Link encap:Local Loopback  
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
  RX packets:930 errors:0 dropped:0 overruns:0 frame:0
  TX packets:930 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0 
  RX bytes:70925 (69.2 KiB)  TX bytes:70925 (69.2 KiB)

[wo...@det-17 src]$ 
[wo...@det-17 src]$ 
[wo...@det-17 src]$ ucmatose 
cmatose: starting server


But on the client, it fails on any of the ip addresses,
but differently on 192.168.1.17.

 ucmatose -s  192.168.0.17
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_ADDR_ERROR, error: -2
receiving data transfers
sending replies
data transfers complete
test complete
return status 0

 ucmatose -s  192.168.1.17
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_ADDR_ERROR, error: -110
receiving data transfers
sending replies
data transfers complete
test complete
return status 0

 ucmatose -s  192.168.2.17
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_ADDR_ERROR, error: -2
receiving data transfers
sending replies
data transfers complete
test complete
return status 0
[wo...@det-16 ~]$
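
The "error: -2" and "error: -110" values reported with RDMA_CM_EVENT_ADDR_ERROR above read as negated errno values. A minimal sketch for decoding them, assuming Linux errno numbering (where ENOENT is 2 and ETIMEDOUT is 110); status_str() is a hypothetical helper name, not part of librdmacm:

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: turn an errno-style status (negative or
 * positive) into the C library's description of that errno value.
 */
static const char *status_str(int status)
{
	return strerror(status < 0 ? -status : status);
}
```

On Linux with glibc, status_str(-2) gives "No such file or directory", the same text ucmatose prints for the failed bind on the server, and status_str(-110) gives "Connection timed out".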

 

-Original Message-
From: Hefty, Sean 
Sent: Wednesday, January 20, 2010 9:30 PM
To: Woodruff, Robert J; tzipo...@dev.mellanox.co.il; Tung, Chien Tin; Davis, 
Arlin R
Cc: OpenFabrics EWG; Steve Wise; Eli Cohen; Vladimir Sokolovsky
Subject: RE: [ewg] OFED-1.5.1 failure over iWarp

[wo...@det-16 ~]$ ucmatose -s  192.168.2.17
cmatose: starting client

Btw - this should be 'ucmatose -s 192.168.0.17', and needs to start
after the server is running.  But, this isn't going to work since...

[wo...@det-17 src]$ ifconfig eth0
eth0  Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE
  inet addr:192.168.0.17  Bcast:192.168.0.255
Mask:255.255.255.0
  inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:1272 errors:0 dropped:0 overruns:0 frame:0
  TX packets:909 errors:0 dropped:0 overruns:0 carrier:0
  collisions:41 txqueuelen:100
  RX bytes:309724 (302.4 KiB)  TX bytes:123706 (120.8 KiB)
  Base address:0xdc00 Memory:f9fa-f9fc

[wo...@det-17 src]$ ucmatose -b 192.168.0.17
cmatose: starting server
cmatose: bind address failed: No such file or directory

the server is failing.

Can anyone test ucmatose over iwarp on the latest kernel.org kernel?

- Sean



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-20 Thread Tung, Chien Tin
I am getting the following error when trying to run Intel MPI
over nes iwarp cards on today's daily build of OFED-1.5.1.
OFED-1.5 does not show this problem.

I will pull down 1-19 build and see if I get the same issue.

Chien


Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-20 Thread Tung, Chien Tin
I am getting the following error when trying to run Intel MPI
over nes iwarp cards on today's daily build of OFED-1.5.1.
OFED-1.5 does not show this problem.

I will pull down 1-19 build and see if I get the same issue.

I got a similar issue using OpenMPI and the 1/19 daily build (non-RDMAoE).

--
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:   sw30
  Local device: nes0
  Local port:   1
  CPCs attempted:   oob, xoob
--

Chien





Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-20 Thread Tziporet Koren
On 1/20/2010 6:50 PM, Tung, Chien Tin wrote:
 I am getting the following error when trying to run Intel MPI
 over nes iwarp cards on today's daily build of OFED-1.5.1.
 OFED-1.5 does not show this problem.

 I will pull down 1-19 build and see if I get the same issue.
  
 I got a similar issue using OpenMPI and the 1/19 daily build (non-RDMAoE).

Please work with Vlad and Eli to resolve this.

Thanks
Tziporet



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-20 Thread Sean Hefty
Sean, could this be a result of the new IPv6 rdma_cm patches that were
added to OFED-1.5.1?

I guess it's possible.  I thought DAPL only used ipv4 addresses though.

Can you try running some simpler tests, like ucmatose or rping?

on server side run: ucmatose [-b optional_local_ip_address]
on client side run: ucmatose -s server_ip_address

If you can, run on the server side with and without the -b option.

- Sean



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-20 Thread Woodruff, Robert J
[wo...@det-16 ~]$ ucmatose -s  192.168.2.17
cmatose: starting client
[wo...@det-17 src]$ ifconfig eth0
eth0  Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE  
  inet addr:192.168.0.17  Bcast:192.168.0.255  Mask:255.255.255.0
  inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:1272 errors:0 dropped:0 overruns:0 frame:0
  TX packets:909 errors:0 dropped:0 overruns:0 carrier:0
  collisions:41 txqueuelen:100 
  RX bytes:309724 (302.4 KiB)  TX bytes:123706 (120.8 KiB)
  Base address:0xdc00 Memory:f9fa-f9fc 

[wo...@det-17 src]$ ucmatose -b 192.168.0.17
cmatose: starting server
cmatose: bind address failed: No such file or directory
test complete
return status -1

-Original Message-
From: Hefty, Sean 
Sent: Wednesday, January 20, 2010 4:19 PM
To: Woodruff, Robert J; tzipo...@dev.mellanox.co.il; Tung, Chien Tin; Davis, 
Arlin R
Cc: OpenFabrics EWG; Steve Wise; Eli Cohen; Vladimir Sokolovsky
Subject: RE: [ewg] OFED-1.5.1 failure over iWarp

Sean, could this be a result of the new IPv6 rdma_cm patches that were
added to OFED-1.5.1?

I guess it's possible.  I thought DAPL only used ipv4 addresses though.

Can you try running some simpler tests, like ucmatose or rping?

on server side run: ucmatose [-b optional_local_ip_address]
on client side run: ucmatose -s server_ip_address

If you can, run on the server side with and without the -b option.

- Sean



Re: [ewg] OFED-1.5.1 failure over iWarp

2010-01-20 Thread Sean Hefty
[wo...@det-16 ~]$ ucmatose -s  192.168.2.17
cmatose: starting client

Btw - this should be 'ucmatose -s 192.168.0.17', and needs to start
after the server is running.  But, this isn't going to work since...

[wo...@det-17 src]$ ifconfig eth0
eth0  Link encap:Ethernet  HWaddr 00:04:23:AF:8E:CE
  inet addr:192.168.0.17  Bcast:192.168.0.255
Mask:255.255.255.0
  inet6 addr: fe80::204:23ff:feaf:8ece/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:1272 errors:0 dropped:0 overruns:0 frame:0
  TX packets:909 errors:0 dropped:0 overruns:0 carrier:0
  collisions:41 txqueuelen:100
  RX bytes:309724 (302.4 KiB)  TX bytes:123706 (120.8 KiB)
  Base address:0xdc00 Memory:f9fa-f9fc

[wo...@det-17 src]$ ucmatose -b 192.168.0.17
cmatose: starting server
cmatose: bind address failed: No such file or directory

the server is failing.

Can anyone test ucmatose over iwarp on the latest kernel.org kernel?

- Sean



[ewg] OFED-1.5.1 failure over iWarp

2010-01-19 Thread Woodruff, Robert J
I am getting the following error when trying to run Intel MPI 
over nes iwarp cards on today's daily build of OFED-1.5.1.
OFED-1.5 does not show this problem. 

mpdtrace
det-17-eth2
det-16-eth2
[0] dapl fabric is not available and fallback fabric is not enabled
det-17:cd2:  open_hca: rdma_bind ERR No such file or directory. Is eth2 
configured?