[ewg] ofa_1_5_kernel 20100218-0200 daily build status

2010-02-18 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters: 

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.27.19-5-smp
Passed on x86_64 with linux-2.6.9-89.ELsmp
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
Build failed on x86_64 with linux-2.6.18-164.el5
Log:
/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check/drivers/scsi/scsi_transport_iscsi.c:1832: warning: assignment from incompatible pointer type
/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check/drivers/scsi/scsi_transport_iscsi.c: In function 'iscsi_transport_init':
/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check/drivers/scsi/scsi_transport_iscsi.c:1935: warning: passing argument 3 of 'netlink_kernel_create' from incompatible pointer type
/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check/drivers/scsi/scsi_transport_iscsi.c:1949: error: implicit declaration of function 'netlink_kernel_release'
make[3]: *** [/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check/drivers/scsi/scsi_transport_iscsi.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check/drivers/scsi] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_5_kernel-20100218-0200_linux-2.6.18-164.el5_x86_64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.18-164.el5'
make: *** [kernel] Error 2
--
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] [PATCH OFED-151] ehca forward ports

2010-02-18 Thread Alexander Schmidt
Hi Vlad,

please apply for OFED-151.

Forward ports for ehca driver to enable compilation
on 2.6.32 and 2.6.31.

Signed-off-by: Alexander Schmidt al...@linux.vnet.ibm.com
---
 kernel_patches/backport/2.6.32/ehca-010-remove_driver_data.patch |   60 ++
 kernel_patches/backport/2.6.32/ehca-020-fix_buswalk.patch        |   17 ++
 2 files changed, 77 insertions(+)

--- /dev/null
+++ ofed_kernel-1.5/kernel_patches/backport/2.6.32/ehca-010-remove_driver_data.patch
@@ -0,0 +1,60 @@
+commit f899c2ddd45f2515deb446e2b143e4a686a49aee
+Author: Greg Kroah-Hartman gre...@suse.de
+Date:   Mon May 4 12:40:54 2009 -0700
+
+infiniband: ehca: remove driver_data direct access of struct device
+
+In the near future, the driver core is going to not allow direct access
+to the driver_data pointer in struct device.  Instead, the functions
+dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
+have been around since the beginning, so are backwards compatible with
+all older kernel versions.
+
+Cc: Sean Hefty sean.he...@intel.com
+Cc: Roland Dreier rola...@cisco.com
+Cc: Hal Rosenstock hal.rosenst...@gmail.com
+Cc: gene...@lists.openfabrics.org
+Cc: Christoph Raisch rai...@de.ibm.com
+Acked-by: Hoang-Nam Nguyen hngu...@de.ibm.com
+Signed-off-by: Greg Kroah-Hartman gre...@suse.de
+
+diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
+index 85905ab..ce4e6ef 100644
+--- a/drivers/infiniband/hw/ehca/ehca_main.c
++++ b/drivers/infiniband/hw/ehca/ehca_main.c
+@@ -636,7 +636,7 @@ static ssize_t  ehca_show_##name(struct device *dev,   \
+   struct hipz_query_hca *rblock; \
+   int data;  \
+  \
+-  shca = dev->driver_data;   \
++  shca = dev_get_drvdata(dev);   \
+  \
+   rblock = ehca_alloc_fw_ctrlblock(GFP_KERNEL);  \
+   if (!rblock) { \
+@@ -680,7 +680,7 @@ static ssize_t ehca_show_adapter_handle(struct device *dev,
+   struct device_attribute *attr,
+   char *buf)
+ {
+-  struct ehca_shca *shca = dev->driver_data;
++  struct ehca_shca *shca = dev_get_drvdata(dev);
+ 
+   return sprintf(buf, "%llx\n", shca->ipz_hca_handle.handle);
+ 
+@@ -749,7 +749,7 @@ static int __devinit ehca_probe(struct of_device *dev,
+ 
+   shca->ofdev = dev;
+   shca->ipz_hca_handle.handle = *handle;
+-  dev->dev.driver_data = shca;
++  dev_set_drvdata(&dev->dev, shca);
+ 
+   ret = ehca_sense_attributes(shca);
+   if (ret < 0) {
+@@ -878,7 +878,7 @@ probe1:
+ 
+ static int __devexit ehca_remove(struct of_device *dev)
+ {
+-  struct ehca_shca *shca = dev->dev.driver_data;
++  struct ehca_shca *shca = dev_get_drvdata(&dev->dev);
+   unsigned long flags;
+   int ret;
+ 
--- /dev/null
+++ ofed_kernel-1.5/kernel_patches/backport/2.6.32/ehca-020-fix_buswalk.patch
@@ -0,0 +1,17 @@
+---
+ drivers/infiniband/hw/ehca/ehca_mrmw.c |2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+Index: ofa_kernel-1.5.1/drivers/infiniband/hw/ehca/ehca_mrmw.c
+===
+--- ofa_kernel-1.5.1.orig/drivers/infiniband/hw/ehca/ehca_mrmw.c
++++ ofa_kernel-1.5.1/drivers/infiniband/hw/ehca/ehca_mrmw.c
+@@ -2463,7 +2463,7 @@ int ehca_create_busmap(void)
+   int ret;
+ 
+   ehca_mr_len = 0;
+-  ret = walk_memory_resource(0, 1ULL << MAX_PHYSMEM_BITS, NULL,
++  ret = walk_system_ram_range(0, 1ULL << MAX_PHYSMEM_BITS, NULL,
+  ehca_create_busmap_callback);
+   return ret;
+ }


[ewg] [PATCH OFED-151] ehca in install.pl

2010-02-18 Thread Alexander Schmidt
Hi Vlad,

another patch for OFED-1.5.1...

Signed-off-by: Alexander Schmidt al...@linux.vnet.ibm.com
---
 install.pl |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- OFED-1.5.1-20100217-0757.orig/install.pl
+++ OFED-1.5.1-20100217-0757/install.pl
@@ -1658,7 +1658,7 @@ sub set_availability
 
 # Ehca
 if ($arch =~ m/ppc64|powerpc/ and
-$kernel =~ m/2.6.1[6-9]|2.6.2[0-9]|2.6.30/) {
+$kernel =~ m/2.6.1[6-9]|2.6.2[0-9]|2.6.3[0-2]/) {
 $kernel_modules_info{'ehca'}{'available'} = 1;
 $packages_info{'libehca'}{'available'} = 1;
 $packages_info{'libehca-devel-static'}{'available'} = 1;


[ewg] [PATCHv8 0/11] IBoE support to Infiniband

2010-02-18 Thread Eli Cohen
IBoE allows running the IB transport protocol using Ethernet frames, enabling
the deployment of IB semantics on lossless Ethernet fabrics.

IBoE packets are standard Ethernet frames with an IEEE assigned Ethertype, a
GRH, unmodified IB transport headers and payload.  IB subnet management and SA
services are not required for IBoE operation; Ethernet management practices are
used instead. IBoE encodes IP addresses into its GIDs and resolves MAC
addresses using the host IP stack. For multicast GIDs, standard IP to MAC
mappings apply.

The OFA RDMA Verbs API is syntactically unmodified. The CMA is adapted to
support IBoE ports allowing existing RDMA applications to run over IBoE with no
changes.

Address handles for IBoE are required to contain valid L3 addresses (GIDs) and
the IB L2 address fields become reserved. The complementary Ethernet L2 address
information is subsequently resolved below the API.

As there is no SA in IBoE, the CMA code is adapted to locally fill-in
corresponding path record attributes for IBoE address handles. Also, the CMA
provides the required address handle attributes for SIDR requests and joining
of multicast groups.

With this patch set, each IBoE port is assigned a GID equal to the link local
address of its corresponding net device, and one more GID for each one of the
VLAN devices which are derived from it. IBoE packets are tagged with the VLAN
ID of the netdevice through which they are sent.
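
To make the GID assignment above concrete, here is a minimal sketch of how a
link-local GID could be derived from a port's MAC address. It assumes the usual
IPv6 construction (fe80::/64 prefix plus a modified EUI-64 interface
identifier); mac_to_link_local_gid() is a hypothetical helper name, not a
function from this patch set, and the additional per-VLAN GIDs are not covered.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch set): build a 16-byte GID
 * from a 6-byte MAC, assuming the IPv6 link-local form fe80::/64 with a
 * modified EUI-64 interface identifier.
 */
static void mac_to_link_local_gid(const uint8_t mac[6], uint8_t gid[16])
{
    assert(mac != NULL && gid != NULL);

    memset(gid, 0, 16);
    gid[0] = 0xfe;              /* fe80::/64 link-local prefix */
    gid[1] = 0x80;

    gid[8]  = mac[0] ^ 0x02;    /* flip the universal/local bit */
    gid[9]  = mac[1];
    gid[10] = mac[2];
    gid[11] = 0xff;             /* ff:fe is inserted in the middle */
    gid[12] = 0xfe;
    gid[13] = mac[3];
    gid[14] = mac[4];
    gid[15] = mac[5];
}
```

Under these assumptions, MAC 00:02:c9:01:02:03 maps to
fe80:0000:0000:0000:0202:c9ff:fe01:0203.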

The priority field in the 802.1q header of IBoE packets is derived from the SL
field in the address vector. rdma_cm applications can set the TOS value of the
rdma_cm_id object through the rdma_set_option() API which then maps to SL.
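
As a rough illustration of the SL to 802.1q priority relationship described
above, the sketch below composes a VLAN Tag Control Information word from an SL
and a VLAN ID. The 802.1Q layout (3-bit priority, 1-bit DEI, 12-bit VLAN ID) is
standard; using the low three bits of the SL as the priority is an assumption
made here for illustration, and iboe_vlan_tci() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compose an 802.1Q Tag Control Information word in
 * host byte order. Assumption: the low three bits of the SL become the
 * priority (PCP) field; the DEI bit is left clear.
 */
static uint16_t iboe_vlan_tci(uint8_t sl, uint16_t vlan_id)
{
    assert(vlan_id < 4096);             /* the VLAN ID is a 12-bit field */

    uint16_t pcp = sl & 0x7;            /* PCP is 3 bits wide */

    return (uint16_t)((pcp << 13) | (vlan_id & 0x0fff));
}
```

The caller would still need to convert the result to network byte order before
placing it in a frame.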

With these patches, IBoE multicast frames may be broadcast as there is
currently no use of a L2 multicast group membership protocol.

To enable IBoE with the mlx4 driver stack, both the mlx4_en and mlx4_ib drivers
must be loaded, and the netdevice for the corresponding IBoE port must be
running. Individual ports of a multi port HCA can be independently configured
as Ethernet (with support for IBoE) or as IB, as was already the case.

We have successfully tested MPI, SDP, RDS, and native Verbs applications over
IBoE.

Following is a series of 11 patches based on Roland's for-next branch.  This
new series reflects changes based on feedback from the community on the
previous patch set.

Changes from v7
1. Rebase on 2.6.33-rc3
2. Add VLAN support
3. Bug fixes and improvements (see the changelog in each patch).

Signed-off-by: Eli Cohen e...@mellanox.co.il
---


 drivers/infiniband/core/agent.c   |   37 +
 drivers/infiniband/core/cm.c  |5 
 drivers/infiniband/core/cma.c |  287 ++-
 drivers/infiniband/core/mad.c |   27 +
 drivers/infiniband/core/multicast.c   |   25 +
 drivers/infiniband/core/sa_query.c|   46 +-
 drivers/infiniband/core/ucma.c|   54 ++
 drivers/infiniband/core/ud_header.c   |  129 +-
 drivers/infiniband/core/user_mad.c|   11 
 drivers/infiniband/core/uverbs.h  |1 
 drivers/infiniband/core/uverbs_cmd.c  |   33 +
 drivers/infiniband/core/uverbs_main.c |1 
 drivers/infiniband/core/verbs.c   |   26 +
 drivers/infiniband/hw/mlx4/ah.c   |  196 --
 drivers/infiniband/hw/mlx4/mad.c  |   32 +
 drivers/infiniband/hw/mlx4/main.c |  557 --
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |   35 +
 drivers/infiniband/hw/mlx4/qp.c   |  180 +++--
 drivers/infiniband/hw/mthca/mthca_qp.c|2 
 drivers/infiniband/ulp/ipoib/ipoib_main.c |7 
 drivers/net/mlx4/en_main.c|   15 
 drivers/net/mlx4/en_netdev.c  |   10 
 drivers/net/mlx4/en_port.c|4 
 drivers/net/mlx4/en_port.h|3 
 drivers/net/mlx4/fw.c |3 
 drivers/net/mlx4/intf.c   |   20 +
 drivers/net/mlx4/main.c   |6 
 drivers/net/mlx4/mlx4.h   |1 
 drivers/net/mlx4/mlx4_en.h|1 
 drivers/net/mlx4/port.c   |   19 +
 include/linux/mlx4/cmd.h  |1 
 include/linux/mlx4/device.h   |   32 +
 include/linux/mlx4/driver.h   |   16 
 include/linux/mlx4/qp.h   |9 
 include/rdma/ib_addr.h|  139 +++
 include/rdma/ib_pack.h|   28 +
 include/rdma/ib_sa.h  |3 
 include/rdma/ib_user_verbs.h  |   22 +
 include/rdma/ib_verbs.h   |   29 +
 39 files changed, 1813 insertions(+), 239 deletions(-)


[ewg] [PATCHv8 01/11] ib core: Add link layer property to ports

2010-02-18 Thread Eli Cohen
This patch adds the infrastructure for querying the link layer of a port, which
can be either IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET. This is
required for adding IBoE support to Infiniband drivers so that branching
decisions can be made according to the value of this property. For devices that
do not provide an implementation for querying the link layer property of a
port, the returned value depends on the node transport such that
RDMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND and
RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET.
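
The fallback rule can be tried outside the kernel with a small mock. This is
only a sketch: struct mock_device, the enum names and mixed_ports() are
stand-ins invented for illustration, not the kernel's struct ib_device or its
enums.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only; these are not the kernel's. */
enum transport { TRANSPORT_IB, TRANSPORT_IWARP, TRANSPORT_OTHER };
enum link_layer {
    LINK_LAYER_UNSPECIFIED,
    LINK_LAYER_INFINIBAND,
    LINK_LAYER_ETHERNET,
};

struct mock_device {
    enum transport transport;
    /* Optional per-port query; NULL when the driver does not provide one. */
    enum link_layer (*get_link_layer)(const struct mock_device *dev,
                                      unsigned int port);
};

/* Prefer the driver callback, otherwise derive the answer from the transport. */
static enum link_layer port_link_layer(const struct mock_device *dev,
                                       unsigned int port)
{
    assert(dev != NULL);

    if (dev->get_link_layer)
        return dev->get_link_layer(dev, port);

    switch (dev->transport) {
    case TRANSPORT_IB:
        return LINK_LAYER_INFINIBAND;
    case TRANSPORT_IWARP:
        return LINK_LAYER_ETHERNET;
    default:
        return LINK_LAYER_UNSPECIFIED;
    }
}

/* Example driver callback: port 1 is IB, every other port is Ethernet. */
static enum link_layer mixed_ports(const struct mock_device *dev,
                                   unsigned int port)
{
    (void)dev;
    return port == 1 ? LINK_LAYER_INFINIBAND : LINK_LAYER_ETHERNET;
}
```

A device that sets the callback can report a different link layer per port,
which is what a multi-port HCA with mixed IB and Ethernet ports needs.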

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/core/verbs.c |   16 
 include/rdma/ib_verbs.h |   12 
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index a7da9be..f9cbdb6 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -94,6 +94,22 @@ rdma_node_get_transport(enum rdma_node_type node_type)
 }
 EXPORT_SYMBOL(rdma_node_get_transport);
 
+enum rdma_link_layer rdma_port_link_layer(struct ib_device *device, u8 port_num)
+{
+   if (device->get_link_layer)
+   return device->get_link_layer(device, port_num);
+
+   switch (rdma_node_get_transport(device->node_type)) {
+   case RDMA_TRANSPORT_IB:
+   return IB_LINK_LAYER_INFINIBAND;
+   case RDMA_TRANSPORT_IWARP:
+   return IB_LINK_LAYER_ETHERNET;
+   default:
+   return IB_LINK_LAYER_UNSPECIFIED;
+   }
+}
+EXPORT_SYMBOL(rdma_port_link_layer);
+
 /* Protection domains */
 
 struct ib_pd *ib_alloc_pd(struct ib_device *device)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 09509ed..bbfe315 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -75,6 +75,12 @@ enum rdma_transport_type {
 enum rdma_transport_type
 rdma_node_get_transport(enum rdma_node_type node_type) __attribute_const__;
 
+enum rdma_link_layer {
+   IB_LINK_LAYER_UNSPECIFIED,
+   IB_LINK_LAYER_INFINIBAND,
+   IB_LINK_LAYER_ETHERNET,
+};
+
 enum ib_device_cap_flags {
IB_DEVICE_RESIZE_MAX_WR = 1,
IB_DEVICE_BAD_PKEY_CNTR = (1<<1),
@@ -298,6 +304,7 @@ struct ib_port_attr {
u8  active_width;
u8  active_speed;
u8  phys_state;
+   enum rdma_link_layerlink_layer;
 };
 
 enum ib_device_modify_flags {
@@ -1003,6 +1010,8 @@ struct ib_device {
int(*query_port)(struct ib_device *device,
 u8 port_num,
 struct ib_port_attr *port_attr);
+   enum rdma_link_layer   (*get_link_layer)(struct ib_device *device,
+u8 port_num);
int(*query_gid)(struct ib_device *device,
u8 port_num, int index,
union ib_gid *gid);
@@ -1213,6 +1222,9 @@ int ib_query_device(struct ib_device *device,
 int ib_query_port(struct ib_device *device,
  u8 port_num, struct ib_port_attr *port_attr);
 
+enum rdma_link_layer rdma_port_link_layer(struct ib_device *device,
+ u8 port_num);
+
 int ib_query_gid(struct ib_device *device,
 u8 port_num, int index, union ib_gid *gid);
 
-- 
1.7.0



[ewg] [PATCHv8 02/11] ib_core: IBoE support only QP1

2010-02-18 Thread Eli Cohen
Since IBoE uses Ethernet as its link layer, there is no central management
entity, so there is no need for QP0.  QP1 is still needed since it handles
communications between CM agents. This patch will create only QP1 for IBoE
ports.
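
The per-port decision can be summarized as a tiny sketch. The enum and the
special_qps_for_port() helper below are illustrative names only; the real
change is spread across agent.c, mad.c, multicast.c and sa_query.c in the diff
that follows.

```c
#include <assert.h>

/* Illustrative names only; not the kernel's definitions. */
enum port_link_layer { PORT_LL_INFINIBAND, PORT_LL_ETHERNET };

#define SPECIAL_QP0_SMI (1u << 0)   /* subnet management interface */
#define SPECIAL_QP1_GSI (1u << 1)   /* general services, used by the CM */

/* Return a bitmask of the special QPs to create for a port. */
static unsigned int special_qps_for_port(enum port_link_layer ll)
{
    unsigned int mask = SPECIAL_QP1_GSI;   /* QP1 is needed on every port */

    assert(ll == PORT_LL_INFINIBAND || ll == PORT_LL_ETHERNET);

    if (ll == PORT_LL_INFINIBAND)
        mask |= SPECIAL_QP0_SMI;           /* QP0 only where an SM exists */

    return mask;
}
```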

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
Changes from v7:

1. Remove always true code
2. Fix failure to initialize port ah_lock in ib_sa_add_one


 drivers/infiniband/core/agent.c |   37 +++
 drivers/infiniband/core/mad.c   |   27 +++---
 drivers/infiniband/core/multicast.c |   25 ++---
 drivers/infiniband/core/sa_query.c  |   41 ++
 4 files changed, 93 insertions(+), 37 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index ae7c288..964f4fb 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -48,6 +48,8 @@
 struct ib_agent_port_private {
struct list_head port_list;
struct ib_mad_agent *agent[2];
+   struct ib_device*device;
+   u8   port_num;
 };
 
 static DEFINE_SPINLOCK(ib_agent_port_list_lock);
@@ -58,11 +60,10 @@ __ib_get_agent_port(struct ib_device *device, int port_num)
 {
struct ib_agent_port_private *entry;
 
-   list_for_each_entry(entry, &ib_agent_port_list, port_list) {
-   if (entry->agent[0]->device == device &&
-   entry->agent[0]->port_num == port_num)
+   list_for_each_entry(entry, &ib_agent_port_list, port_list)
+   if (entry->device == device && entry->port_num == port_num)
return entry;
-   }
+
return NULL;
 }
 
@@ -155,14 +156,16 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
goto error1;
}
 
-   /* Obtain send only MAD agent for SMI QP */
-   port_priv->agent[0] = ib_register_mad_agent(device, port_num,
-   IB_QPT_SMI, NULL, 0,
-   &agent_send_handler,
-   NULL, NULL);
-   if (IS_ERR(port_priv->agent[0])) {
-   ret = PTR_ERR(port_priv->agent[0]);
-   goto error2;
+   if (rdma_port_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
+   /* Obtain send only MAD agent for SMI QP */
+   port_priv->agent[0] = ib_register_mad_agent(device, port_num,
+   IB_QPT_SMI, NULL, 0,
+   &agent_send_handler,
+   NULL, NULL);
+   if (IS_ERR(port_priv->agent[0])) {
+   ret = PTR_ERR(port_priv->agent[0]);
+   goto error2;
+   }
}
 
/* Obtain send only MAD agent for GSI QP */
@@ -175,6 +178,9 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
goto error3;
}
 
+   port_priv->device = device;
+   port_priv->port_num = port_num;
+
spin_lock_irqsave(&ib_agent_port_list_lock, flags);
list_add_tail(&port_priv->port_list, &ib_agent_port_list);
spin_unlock_irqrestore(&ib_agent_port_list_lock, flags);
@@ -182,7 +188,8 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
return 0;
 
 error3:
-   ib_unregister_mad_agent(port_priv->agent[0]);
+   if (rdma_port_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND)
+   ib_unregister_mad_agent(port_priv->agent[0]);
 error2:
kfree(port_priv);
 error1:
@@ -205,7 +212,9 @@ int ib_agent_port_close(struct ib_device *device, int port_num)
spin_unlock_irqrestore(&ib_agent_port_list_lock, flags);
 
ib_unregister_mad_agent(port_priv->agent[1]);
-   ib_unregister_mad_agent(port_priv->agent[0]);
+   if (rdma_port_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND)
+   ib_unregister_mad_agent(port_priv->agent[0]);
+
kfree(port_priv);
return 0;
 }
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 7522008..f546ab7 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2610,6 +2610,9 @@ static void cleanup_recv_queue(struct ib_mad_qp_info *qp_info)
struct ib_mad_private *recv;
struct ib_mad_list_head *mad_list;
 
+   if (!qp_info->qp)
+   return;
+
while (!list_empty(&qp_info->recv_queue.list)) {
 
mad_list = list_entry(qp_info->recv_queue.list.next,
@@ -2651,6 +2654,9 @@ static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
 
for (i = 0; i < IB_MAD_QPS_CORE; i++) {
qp = port_priv->qp_info[i].qp;
+   if (!qp)
+   continue;
+
/*
 * PKey index for QP1 is irrelevant but
   

[ewg] [PATCHv8 03/11] IB/umad: Enable support only for IB ports

2010-02-18 Thread Eli Cohen
Initialize umad context only for Infiniband (as opposed to Ethernet) ports.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/core/user_mad.c |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 7de0296..e962c5a 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1138,8 +1138,9 @@ static void ib_umad_add_one(struct ib_device *device)
for (i = s; i <= e; ++i) {
umad_dev->port[i - s].umad_dev = umad_dev;
 
-   if (ib_umad_init_port(device, i, &umad_dev->port[i - s]))
-   goto err;
+   if (rdma_port_link_layer(device, i) == IB_LINK_LAYER_INFINIBAND)
+   if (ib_umad_init_port(device, i, &umad_dev->port[i - s]))
+   goto err;
}
 
ib_set_client_data(device, &umad_client, umad_dev);
@@ -1148,7 +1149,8 @@ static void ib_umad_add_one(struct ib_device *device)
 
 err:
while (--i >= s)
-   ib_umad_kill_port(&umad_dev->port[i - s]);
+   if (rdma_port_link_layer(device, i) == IB_LINK_LAYER_INFINIBAND)
+   ib_umad_kill_port(&umad_dev->port[i - s]);
 
kref_put(&umad_dev->ref, ib_umad_release_dev);
 }
@@ -1162,7 +1164,8 @@ static void ib_umad_remove_one(struct ib_device *device)
return;
 
for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i)
-   ib_umad_kill_port(&umad_dev->port[i]);
+   if (rdma_port_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+   ib_umad_kill_port(&umad_dev->port[i]);
 
kref_put(&umad_dev->ref, ib_umad_release_dev);
 }
-- 
1.7.0



[ewg] [PATCHv8 04/11] ib_core: IBoE CMA device binding

2010-02-18 Thread Eli Cohen
Add support for IBoE device binding and IP to GID resolution. Path resolving
and multicast joining are implemented within cma.c by filling the responses and
pushing the callbacks to the cma work queue. IP to GID resolution always yields
IPv6 link-local addresses; remote GIDs are derived from the destination MAC
address of the remote port. Multicast GIDs are always mapped to multicast MACs
as is done in IPv6. Some helper functions are added to ib_addr.h.  IPv4
multicast is enabled by translating IPv4 multicast addresses to IPv6 multicast
as described in
http://www.mail-archive.com/i...@sunroof.eng.sun.com/msg02134.html.
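
For reference, the IPv6-style multicast mapping mentioned above can be sketched
as follows: the Ethernet address is 33:33 followed by the low 32 bits of the
128-bit group address. mcast_gid_to_mac() is a hypothetical helper name, and
the sketch covers only the GID to MAC step, not the IPv4 to IPv6 translation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: map a multicast GID to an Ethernet multicast MAC
 * the way IPv6 maps multicast addresses (33:33 + low 32 bits).
 */
static void mcast_gid_to_mac(const uint8_t gid[16], uint8_t mac[6])
{
    assert(gid != NULL && mac != NULL);
    assert(gid[0] == 0xff);         /* multicast GIDs start with 0xff */

    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &gid[12], 4);   /* last four bytes of the GID */
}
```

Because only 32 bits of the group address survive, distinct multicast GIDs can
share one MAC, so receivers still have to filter on the GID itself.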

Signed-off-by: Eli Cohen e...@mellanox.co.il
---

Changes from v7:

1. Add force_grh flag to ib_init_ah_from_path() to request IB_AH_GRH for
   IB_LINK_LAYER_ETHERNET ports, thus allowing the use of hop limit 1 in path
   records.
2. cma_acquire_dev() finds the cma_dev by first assuming an IBoE type device
   for non-ARPHRD_INFINIBAND device types. If that fails, it falls back to the
   old method.


 drivers/infiniband/core/cm.c  |5 +-
 drivers/infiniband/core/cma.c |  283 +++--
 drivers/infiniband/core/sa_query.c|5 +-
 drivers/infiniband/core/ucma.c|   45 -
 drivers/infiniband/ulp/ipoib/ipoib_main.c |2 +-
 include/rdma/ib_addr.h|   98 ++-
 include/rdma/ib_sa.h  |3 +-
 7 files changed, 412 insertions(+), 29 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 5130fc5..6513b1c 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -351,6 +351,7 @@ static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
unsigned long flags;
int ret;
u8 p;
+   int force_grh;
 
read_lock_irqsave(&cm.device_lock, flags);
list_for_each_entry(cm_dev, &cm.device_list, list) {
@@ -371,8 +372,10 @@ static int cm_init_av_by_path(struct ib_sa_path_rec *path, struct cm_av *av)
return ret;
 
av->port = port;
+   force_grh = rdma_port_link_layer(cm_dev->ib_device, port->port_num) ==
+   IB_LINK_LAYER_ETHERNET ? 1 : 0;
ib_init_ah_from_path(cm_dev->ib_device, port->port_num, path,
-&av->ah_attr);
+&av->ah_attr, force_grh);
av->timeout = path->packet_life_time + 1;
return 0;
 }
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc9b594..df5f636 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -58,6 +58,7 @@ MODULE_LICENSE("Dual BSD/GPL");
 #define CMA_CM_RESPONSE_TIMEOUT 20
 #define CMA_MAX_CM_RETRIES 15
 #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
+#define IBOE_PACKET_LIFETIME 18
 
 static void cma_add_one(struct ib_device *device);
 static void cma_remove_one(struct ib_device *device);
@@ -157,6 +158,7 @@ struct cma_multicast {
struct list_headlist;
void*context;
struct sockaddr_storage addr;
+   struct kref mcref;
 };
 
 struct cma_work {
@@ -173,6 +175,12 @@ struct cma_ndev_work {
struct rdma_cm_eventevent;
 };
 
+struct iboe_mcast_work {
+   struct work_struct   work;
+   struct rdma_id_private  *id;
+   struct cma_multicast*mc;
+};
+
 union cma_ip_addr {
struct in6_addr ip6;
struct {
@@ -281,6 +289,8 @@ static void cma_attach_to_dev(struct rdma_id_private *id_priv,
atomic_inc(&cma_dev->refcount);
id_priv->cma_dev = cma_dev;
id_priv->id.device = cma_dev->device;
+   id_priv->id.route.addr.dev_addr.transport =
+   rdma_node_get_transport(cma_dev->device->node_type);
list_add_tail(&id_priv->list, &cma_dev->id_list);
 }
 
@@ -290,6 +300,14 @@ static inline void cma_deref_dev(struct cma_device *cma_dev)
complete(&cma_dev->comp);
 }
 
+static inline void release_mc(struct kref *kref)
+{
+   struct cma_multicast *mc = container_of(kref, struct cma_multicast, mcref);
+
+   kfree(mc->multicast.ib);
+   kfree(mc);
+}
+
 static void cma_detach_from_dev(struct rdma_id_private *id_priv)
 {
list_del(&id_priv->list);
@@ -330,15 +348,29 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
union ib_gid gid;
int ret = -ENODEV;
 
-   rdma_addr_get_sgid(dev_addr, &gid);
+   if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+   iboe_addr_get_sgid(dev_addr, &gid);
+   list_for_each_entry(cma_dev, &dev_list, list) {
+   ret = ib_find_cached_gid(cma_dev->device, &gid,
+&id_priv->id.port_num, NULL);
+   if (!ret)
+   goto out;
+   }
+   }
+
+   memcpy(&gid, dev_addr->src_dev_addr +
+  rdma_addr_gid_offset(dev_addr), sizeof gid);

[ewg] [PATCHv8 05/11] ib_core: IBoE UD packet packing support

2010-02-18 Thread Eli Cohen
Add support functions to aid in packing IBoE packets.
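
To show what the eth_table added by this patch describes, here is a standalone
sketch that serializes the five Ethernet fields (dmac_h/dmac_l, smac_h/smac_l,
type) into the 14-byte wire layout. struct eth_fields, put_be16(), put_be32()
and eth_fields_pack() are hypothetical names; the values are held in host byte
order here for simplicity, which is an assumption and not how the kernel
structure is declared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the fields named in eth_table (host byte order). */
struct eth_fields {
    uint32_t dmac_h;    /* destination MAC, first 4 bytes */
    uint16_t dmac_l;    /* destination MAC, last 2 bytes */
    uint16_t smac_h;    /* source MAC, first 2 bytes */
    uint32_t smac_l;    /* source MAC, last 4 bytes */
    uint16_t type;      /* Ethertype */
};

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Word 0: dmac_h; word 1: dmac_l | smac_h; word 2: smac_l; word 3: type. */
static void eth_fields_pack(const struct eth_fields *f, uint8_t out[14])
{
    assert(f != NULL && out != NULL);

    put_be32(out + 0, f->dmac_h);
    put_be16(out + 4, f->dmac_l);
    put_be16(out + 6, f->smac_h);
    put_be32(out + 8, f->smac_l);
    put_be16(out + 12, f->type);
}
```

The split into 32-bit and 16-bit halves follows from the table's word-based
offsets: a 6-byte MAC cannot sit inside a single 32-bit word, so the
destination straddles words 0 and 1 and the source straddles words 1 and 2.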

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/core/ud_header.c |  100 ++
 include/rdma/ib_pack.h  |   29 +-
 2 files changed, 92 insertions(+), 37 deletions(-)

Changes from v7:

1. Re-work the changes so they extend the original idea behind these functions.
2. Fix wrong implementation of ib_ud_header_init(). A different patch was sent
   to Roland.


diff --git a/drivers/infiniband/core/ud_header.c b/drivers/infiniband/core/ud_header.c
index 8ec7876..7650313 100644
--- a/drivers/infiniband/core/ud_header.c
+++ b/drivers/infiniband/core/ud_header.c
@@ -80,6 +80,29 @@ static const struct ib_field lrh_table[]  = {
  .size_bits= 16 }
 };
 
+static const struct ib_field eth_table[]  = {
+   { STRUCT_FIELD(eth, dmac_h),
+ .offset_words = 0,
+ .offset_bits  = 0,
+ .size_bits= 32 },
+   { STRUCT_FIELD(eth, dmac_l),
+ .offset_words = 1,
+ .offset_bits  = 0,
+ .size_bits= 16 },
+   { STRUCT_FIELD(eth, smac_h),
+ .offset_words = 1,
+ .offset_bits  = 16,
+ .size_bits= 16 },
+   { STRUCT_FIELD(eth, smac_l),
+ .offset_words = 2,
+ .offset_bits  = 0,
+ .size_bits= 32 },
+   { STRUCT_FIELD(eth, type),
+ .offset_words = 3,
+ .offset_bits  = 0,
+ .size_bits= 16 }
+};
+
 static const struct ib_field grh_table[]  = {
{ STRUCT_FIELD(grh, ip_version),
  .offset_words = 0,
@@ -180,56 +203,51 @@ static const struct ib_field deth_table[] = {
 /**
  * ib_ud_header_init - Initialize UD header structure
  * @payload_bytes:Length of packet payload
+ * @lrh_present: specify if LRH is present
+ * @eth_present: specify if Eth header is present
  * @grh_present:GRH flag (if non-zero, GRH will be included)
+ * @immediate_present: specify if immediate data is present
  * @header:Structure to initialize
- *
- * ib_ud_header_init() initializes the lrh.link_version, lrh.link_next_header,
- * lrh.packet_length, grh.ip_version, grh.payload_length,
- * grh.next_header, bth.opcode, bth.pad_count and
- * bth.transport_header_version fields of a struct ib_ud_header given
- * the payload length and whether a GRH will be included.
  */
 void ib_ud_header_init(int payload_bytes,
+  int  lrh_present,
+  int  eth_present,
   int  grh_present,
+  int  immediate_present,
   struct ib_ud_header *header)
 {
-   int header_len;
u16 packet_length;
 
memset(header, 0, sizeof *header);
 
-   header_len =
-   IB_LRH_BYTES  +
-   IB_BTH_BYTES  +
-   IB_DETH_BYTES;
-   if (grh_present) {
-   header_len += IB_GRH_BYTES;
+   if (lrh_present) {
+   header->lrh.link_version = 0;
+   header->lrh.link_next_header =
+   grh_present ? IB_LNH_IBA_GLOBAL : IB_LNH_IBA_LOCAL;
+   packet_length = IB_LRH_BYTES;
}
 
-   header->lrh.link_version = 0;
-   header->lrh.link_next_header =
-   grh_present ? IB_LNH_IBA_GLOBAL : IB_LNH_IBA_LOCAL;
-   packet_length= (IB_LRH_BYTES +
-   IB_BTH_BYTES +
-   IB_DETH_BYTES+
-   payload_bytes+
-   4+ /* ICRC */
-   3) / 4;/* round up */
-
-   header->grh_present  = grh_present;
+   if (eth_present)
+   packet_length = IB_ETH_BYTES;
+
+   packet_length += IB_BTH_BYTES + IB_DETH_BYTES + payload_bytes +
+   4   + /* ICRC */
+   3;/* round up */
+   packet_length /= 4;
if (grh_present) {
-   packet_length  += IB_GRH_BYTES / 4;
-   header->grh.ip_version  = 6;
-   header->grh.payload_length  =
-   cpu_to_be16((IB_BTH_BYTES +
-IB_DETH_BYTES+
-payload_bytes+
-4+ /* ICRC */
-3) & ~3);  /* round up */
+   packet_length += IB_GRH_BYTES / 4;
+   header->grh.ip_version = 6;
+   header->grh.payload_length =
+   cpu_to_be16((IB_BTH_BYTES  +
+IB_DETH_BYTES +
+payload_bytes +
+4 + /* ICRC */
+3) & ~3);   /* 

[ewg] [PATCHv8 06/11] ipoib: avoid ipoib over IBoE

2010-02-18 Thread Eli Cohen
IPoIB is an implementation of IP over Infiniband transport. In the case of
IBoE, the link layer is Ethernet and IP can run directly over it, so disable
IPoIB on ports whose link layer is not IB_LINK_LAYER_INFINIBAND.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 06014d2..5e6c2de 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1362,6 +1362,8 @@ static void ipoib_add_one(struct ib_device *device)
}
 
for (p = s; p <= e; ++p) {
+   if (rdma_port_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
+   continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
priv = netdev_priv(dev);
@@ -1383,6 +1385,9 @@ static void ipoib_remove_one(struct ib_device *device)
dev_list = ib_get_client_data(device, &ipoib_client);
 
list_for_each_entry_safe(priv, tmp, dev_list, list) {
+   if (rdma_port_link_layer(device, priv->port) != IB_LINK_LAYER_INFINIBAND)
+   continue;
+
ib_unregister_event_handler(&priv->event_handler);
 
rtnl_lock();
-- 
1.7.0



[ewg] [PATCHv8 07/11] ib_core: Add API to support IBoE from userspace

2010-02-18 Thread Eli Cohen
Add ib_uverbs_get_eth_l2_addr() to allow ibv_create_ah() to resolve an (sgid,
dgid) pair to (vlan, dmac) for any gid type.  Although user-space might bypass
this call for link-local gids, it is better not to replicate the kernel
resolution policy.  The port link layer is also returned by ibv_query_port().

Signed-off-by: Eli Cohen e...@mellanox.co.il
---

Changes from v7:

1. ib_uverbs_get_mac() was renamed to ib_uverbs_get_eth_l2_addr() and it now
   returns the MAC, the VLAN ID, and a flag indicating whether packets should
   go out tagged.

 drivers/infiniband/core/uverbs.h  |1 +
 drivers/infiniband/core/uverbs_cmd.c  |   33 +
 drivers/infiniband/core/uverbs_main.c |1 +
 drivers/infiniband/core/verbs.c   |   10 ++
 include/rdma/ib_user_verbs.h  |   22 --
 include/rdma/ib_verbs.h   |   17 +
 6 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index b3ea958..79359f6 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -194,5 +194,6 @@ IB_UVERBS_DECLARE_CMD(create_srq);
 IB_UVERBS_DECLARE_CMD(modify_srq);
 IB_UVERBS_DECLARE_CMD(query_srq);
 IB_UVERBS_DECLARE_CMD(destroy_srq);
+IB_UVERBS_DECLARE_CMD(get_eth_l2_addr);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 112d397..19b4827 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -452,6 +452,7 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.active_width= attr.active_width;
resp.active_speed= attr.active_speed;
resp.phys_state  = attr.phys_state;
+   resp.link_layer  = attr.link_layer;
 
if (copy_to_user((void __user *) (unsigned long) cmd.response,
 &resp, sizeof resp))
@@ -1824,6 +1825,38 @@ err:
return ret;
 }
 
+ssize_t ib_uverbs_get_eth_l2_addr(struct ib_uverbs_file *file, const char __user *buf,
+ int in_len, int out_len)
+{
+   struct ib_uverbs_get_eth_l2_addr   cmd;
+   struct ib_uverbs_get_eth_l2_addr_resp  resp;
+   int  ret;
+   struct ib_pd*pd;
+
+   if (out_len < sizeof resp)
+   return -ENOSPC;
+
+   if (copy_from_user(&cmd, buf, sizeof cmd))
+   return -EFAULT;
+
+   pd = idr_read_pd(cmd.pd_handle, file->ucontext);
+   if (!pd)
+   return -EINVAL;
+
+   ret = ib_get_eth_l2_addr(pd->device, cmd.port, (union ib_gid *)cmd.gid,
+cmd.sgid_idx, resp.mac, &resp.vlan_id, &resp.tagged);
+   put_pd_read(pd);
+   if (!ret) {
+   if (copy_to_user((void __user *) (unsigned long) cmd.response,
+&resp, sizeof resp))
+   return -EFAULT;
+
+   return in_len;
+   }
+
+   return ret;
+}
+
 ssize_t ib_uverbs_destroy_ah(struct ib_uverbs_file *file,
 const char __user *buf, int in_len, int out_len)
 {
diff --git a/drivers/infiniband/core/uverbs_main.c 
b/drivers/infiniband/core/uverbs_main.c
index 5f284ff..ef9eaa5 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -109,6 +109,7 @@ static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file 
*file,
[IB_USER_VERBS_CMD_MODIFY_SRQ]  = ib_uverbs_modify_srq,
[IB_USER_VERBS_CMD_QUERY_SRQ]   = ib_uverbs_query_srq,
[IB_USER_VERBS_CMD_DESTROY_SRQ] = ib_uverbs_destroy_srq,
+   [IB_USER_VERBS_CMD_GET_ETH_L2_ADDR] = ib_uverbs_get_eth_l2_addr,
 };
 
 static struct vfsmount *uverbs_event_mnt;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f9cbdb6..f586702 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -920,3 +920,13 @@ int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
 	return qp->device->detach_mcast(qp, gid, lid);
 }
 EXPORT_SYMBOL(ib_detach_mcast);
+
+int ib_get_eth_l2_addr(struct ib_device *device, u8 port, union ib_gid *gid,
+		       int sgid_idx, u8 *mac, __u16 *vlan_id, u8 *tagged)
+{
+	if (!device->get_eth_l2_addr)
+		return -ENOSYS;
+
+	return device->get_eth_l2_addr(device, port, gid, sgid_idx, mac, vlan_id, tagged);
+}
+EXPORT_SYMBOL(ib_get_eth_l2_addr);
diff --git a/include/rdma/ib_user_verbs.h b/include/rdma/ib_user_verbs.h
index a17f771..09f38df 100644
--- a/include/rdma/ib_user_verbs.h
+++ b/include/rdma/ib_user_verbs.h
@@ -81,7 +81,8 @@ enum {
IB_USER_VERBS_CMD_MODIFY_SRQ,
IB_USER_VERBS_CMD_QUERY_SRQ,
IB_USER_VERBS_CMD_DESTROY_SRQ,
-   IB_USER_VERBS_CMD_POST_SRQ_RECV
+   IB_USER_VERBS_CMD_POST_SRQ_RECV,
+   IB_USER_VERBS_CMD_GET_ETH_L2_ADDR
 };
 
 /*

[ewg] [PATCHv8 09/11] mlx4: Add support for IBoE - address resolution

2010-02-18 Thread Eli Cohen
The following patch handles address vector creation for IBoE ports. mlx4 needs
the MAC address of the remote node to include it in the WQE of a UD QP or in
the QP context of connected QPs. Address resolution is done atomically in the
case of a link-local address or a multicast GID; otherwise -EINVAL is
returned. mlx4 transport packets were changed too to accommodate IBoE.
Multicast group attach/detach calls dev_mc_add/remove to update the NIC's
multicast filters. Attaching a QP to a multicast group does not require the QP
to be in a state other than INIT, which is fine for IB. For IBoE, however, we
need the port assigned to the QP in order to call dev_mc_add() for the correct
netdevice, and the port is assigned only when moving from INIT to RTR.
Hence, we must keep track of all the multicast groups attached to a QP and call
dev_mc_add() when the port becomes available.
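
The resolution rule described above can be sketched in plain C. This is only an
illustration, not the mlx4 code: the name sketch_resolve_dmac() is hypothetical,
the link-local case assumes the MAC sits in GID bytes 8-10 and 13-15 with the
universal/local bit flipped (modified EUI-64 style), and the multicast case uses
the standard IPv6-over-Ethernet mapping of 33:33 followed by the last four
address bytes.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: derive the destination MAC from a destination GID.
 * Returns 0 on success and -EINVAL for GIDs that cannot be resolved locally.
 */
static int sketch_resolve_dmac(const uint8_t dgid[16], uint8_t mac[6],
			       int *is_mcast)
{
	*is_mcast = 0;

	if (dgid[0] == 0xfe && (dgid[1] & 0xc0) == 0x80) {
		/* Link-local (fe80::/10): assumed layout, see lead-in. */
		memcpy(mac, dgid + 8, 3);
		memcpy(mac + 3, dgid + 13, 3);
		mac[0] ^= 2;	/* undo the universal/local bit flip */
		return 0;
	}

	if (dgid[0] == 0xff) {
		/* Multicast: 33:33 plus the low 32 bits of the address. */
		mac[0] = 0x33;
		mac[1] = 0x33;
		memcpy(mac + 2, dgid + 12, 4);
		*is_mcast = 1;
		return 0;
	}

	return -EINVAL;
}
```

Any other GID would need a real neighbour lookup, which is why the sketch, like
the description above, simply rejects it.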

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
Changes from v7:

1. Fix failure to initialize gid_index in create_iboe_ah()
2. Move call register_netdevice_notifier() after call to ib_register_device()
3. Call flush_workqueue() after unregistering the notifier to flush any pending
   work requests.
4. Change build_mlx_header to match changes to ud_header.c


 drivers/infiniband/hw/mlx4/ah.c|  182 ++---
 drivers/infiniband/hw/mlx4/mad.c   |   32 ++-
 drivers/infiniband/hw/mlx4/main.c  |  497 +---
 drivers/infiniband/hw/mlx4/mlx4_ib.h   |   35 +++-
 drivers/infiniband/hw/mlx4/qp.c|  139 +++--
 drivers/infiniband/hw/mthca/mthca_qp.c |2 +
 drivers/net/mlx4/en_port.c |4 +-
 drivers/net/mlx4/en_port.h |3 +-
 drivers/net/mlx4/fw.c  |3 +-
 include/linux/mlx4/cmd.h   |1 +
 include/linux/mlx4/device.h|   31 ++-
 include/linux/mlx4/qp.h|7 +-
 12 files changed, 819 insertions(+), 117 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c
index c75ac94..0a2f1fb 100644
--- a/drivers/infiniband/hw/mlx4/ah.c
+++ b/drivers/infiniband/hw/mlx4/ah.c
@@ -31,63 +31,158 @@
  */
 
 #include "mlx4_ib.h"
+#include <rdma/ib_addr.h>
+#include <linux/inet.h>
+#include <linux/string.h>
 
-struct ib_ah *mlx4_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr)
+int mlx4_ib_resolve_grh(struct mlx4_ib_dev *dev, const struct ib_ah_attr *ah_attr,
+			u8 *mac, int *is_mcast, u8 port)
 {
-	struct mlx4_dev *dev = to_mdev(pd->device)->dev;
-	struct mlx4_ib_ah *ah;
+	struct mlx4_ib_iboe *iboe = &dev->iboe;
+	struct in6_addr in6;
 
-	ah = kmalloc(sizeof *ah, GFP_ATOMIC);
-	if (!ah)
-		return ERR_PTR(-ENOMEM);
+	*is_mcast = 0;
+	spin_lock(&iboe->lock);
+	if (!iboe->netdevs[port - 1]) {
+		spin_unlock(&iboe->lock);
+		return -EINVAL;
+	}
+	spin_unlock(&iboe->lock);
 
-	memset(&ah->av, 0, sizeof ah->av);
+	memcpy(&in6, ah_attr->grh.dgid.raw, sizeof in6);
+	if (rdma_link_local_addr(&in6))
+		rdma_get_ll_mac(&in6, mac);
+	else if (rdma_is_multicast_addr(&in6)) {
+		rdma_get_mcast_mac(&in6, mac);
+		*is_mcast = 1;
+	} else
+		return -EINVAL;
 
-	ah->av.port_pd = cpu_to_be32(to_mpd(pd)->pdn | (ah_attr->port_num << 24));
-	ah->av.g_slid  = ah_attr->src_path_bits;
-	ah->av.dlid    = cpu_to_be16(ah_attr->dlid);
-	if (ah_attr->static_rate) {
-		ah->av.stat_rate = ah_attr->static_rate + MLX4_STAT_RATE_OFFSET;
-		while (ah->av.stat_rate > IB_RATE_2_5_GBPS + MLX4_STAT_RATE_OFFSET &&
-		       !(1 << ah->av.stat_rate & dev->caps.stat_rate_support))
-			--ah->av.stat_rate;
-	}
-	ah->av.sl_tclass_flowlabel = cpu_to_be32(ah_attr->sl << 28);
+	return 0;
+}
+
+static struct ib_ah *create_ib_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr,
+				  struct mlx4_ib_ah *ah)
+{
+	struct mlx4_dev *dev = to_mdev(pd->device)->dev;
+
+	ah->av.ib.port_pd = cpu_to_be32(to_mpd(pd)->pdn | (ah_attr->port_num << 24));
+	ah->av.ib.g_slid  = ah_attr->src_path_bits;
 	if (ah_attr->ah_flags & IB_AH_GRH) {
-		ah->av.g_slid   |= 0x80;
-		ah->av.gid_index = ah_attr->grh.sgid_index;
-		ah->av.hop_limit = ah_attr->grh.hop_limit;
-		ah->av.sl_tclass_flowlabel |=
+		ah->av.ib.g_slid   |= 0x80;
+		ah->av.ib.gid_index = ah_attr->grh.sgid_index;
+		ah->av.ib.hop_limit = ah_attr->grh.hop_limit;
+		ah->av.ib.sl_tclass_flowlabel |=
 			cpu_to_be32((ah_attr->grh.traffic_class << 20) |
 				    ah_attr->grh.flow_label);
-		memcpy(ah->av.dgid, ah_attr->grh.dgid.raw, 16);
+		memcpy(ah->av.ib.dgid, ah_attr->grh.dgid.raw, 16);
+	}
+
+   ah-av.ib.dlid= 

[ewg] [PATCHv8 10/11] ib_core: Add VLAN support to IBoE

2010-02-18 Thread Eli Cohen
Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID
derived from a link-local address in the following way:

GID[11] and GID[12] contain the VLAN ID.
The 3-bit user priority field is identical to the 3 bits of the SL.

In the case of rdma_cm apps, the TOS field is used to generate the SL field by
shifting it right by 5 bits, effectively taking the 3 MS bits of the TOS field.
In order to support userspace verbs consumers, ib_uverbs_get_mac has changed
into ib_uverbs_get_eth_l2_addr and now returns both MAC and VLAN information.
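
As a rough illustration of the encoding described above, the following C sketch
places a VLAN ID in bytes 11 and 12 of a link-local GID and derives the SL from
a TOS value. All function names are hypothetical. The byte order of the VLAN ID
(high byte in GID[11]) and the placement of the MAC around it (bytes 8-10 and
13-15, with the universal/local bit flipped) are assumptions, not taken from the
patch text.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: build a link-local GID from a MAC and a VLAN ID. */
static void sketch_mac_vlan_to_gid(uint8_t gid[16], const uint8_t mac[6],
				   uint16_t vid)
{
	memset(gid, 0, 16);
	gid[0] = 0xfe;			/* link-local prefix fe80::/64 */
	gid[1] = 0x80;
	memcpy(gid + 8, mac, 3);	/* assumed: upper half of the MAC */
	gid[8] ^= 2;			/* assumed: flip universal/local bit */
	gid[11] = (uint8_t)(vid >> 8);	/* assumed: VLAN ID high byte */
	gid[12] = (uint8_t)(vid & 0xff);	/* assumed: VLAN ID low byte */
	memcpy(gid + 13, mac + 3, 3);	/* assumed: lower half of the MAC */
}

/* Hypothetical: recover the VLAN ID from GID[11] and GID[12]. */
static uint16_t sketch_gid_to_vlan(const uint8_t gid[16])
{
	return (uint16_t)((gid[11] << 8) | gid[12]);
}

/* SL is the three most significant bits of the 8-bit TOS field. */
static uint8_t sketch_tos_to_sl(uint8_t tos)
{
	return (uint8_t)(tos >> 5);
}
```

With this layout a TOS of 0xe0 maps to SL 7, and a VLAN ID written into the GID
reads back unchanged.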

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/core/cma.c   |   20 -
 drivers/infiniband/core/ucma.c  |   13 -
 drivers/infiniband/core/ud_header.c |   31 -
 include/rdma/ib_addr.h  |   49 ---
 include/rdma/ib_pack.h  |   19 ++---
 5 files changed, 106 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index df5f636..108d1bb 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1763,6 +1763,7 @@ static int cma_resolve_iboe_route(struct rdma_id_private 
*id_priv)
struct sockaddr_in *src_addr = (struct sockaddr_in 
*)route-addr.src_addr;
struct sockaddr_in *dst_addr = (struct sockaddr_in 
*)route-addr.dst_addr;
struct net_device *ndev = NULL;
+   u16 vid;
 
if (src_addr-sin_family != dst_addr-sin_family)
return -EINVAL;
@@ -1782,14 +1783,6 @@ static int cma_resolve_iboe_route(struct rdma_id_private 
*id_priv)
 
route-num_paths = 1;
 
-   iboe_mac_to_ll(route-path_rec-sgid, addr-dev_addr.src_dev_addr);
-   iboe_mac_to_ll(route-path_rec-dgid, addr-dev_addr.dst_dev_addr);
-
-   route-path_rec-hop_limit = 1;
-   route-path_rec-reversible = 1;
-   route-path_rec-pkey = cpu_to_be16(0x);
-   route-path_rec-mtu_selector = IB_SA_EQ;
-
if (addr-dev_addr.bound_dev_if)
ndev = dev_get_by_index(init_net, addr-dev_addr.bound_dev_if);
if (!ndev) {
@@ -1797,6 +1790,17 @@ static int cma_resolve_iboe_route(struct rdma_id_private 
*id_priv)
goto err2;
}
 
+	vid = rdma_vlan_dev_vlan_id(ndev);
+
+	iboe_mac_vlan_to_ll(&route->path_rec->sgid, addr->dev_addr.src_dev_addr, vid);
+	iboe_mac_vlan_to_ll(&route->path_rec->dgid, addr->dev_addr.dst_dev_addr, vid);
+
+	route->path_rec->hop_limit = 1;
+	route->path_rec->reversible = 1;
+	route->path_rec->pkey = cpu_to_be16(0xffff);
+	route->path_rec->mtu_selector = IB_SA_EQ;
+	route->path_rec->sl = id_priv->tos >> 5;
+
route-path_rec-mtu = iboe_get_mtu(ndev-mtu);
route-path_rec-rate_selector = IB_SA_EQ;
route-path_rec-rate = iboe_get_rate(ndev);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index fcc27bc..ed670f5 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -586,13 +586,22 @@ static void ucma_copy_iboe_route(struct 
rdma_ucm_query_route_resp *resp,
 struct rdma_route *route)
 {
struct rdma_dev_addr *dev_addr;
+   struct net_device *dev;
+   u16 vid = 0;
 
resp-num_paths = route-num_paths;
switch (route-num_paths) {
case 0:
dev_addr = route-addr.dev_addr;
-   iboe_mac_to_ll((union ib_gid *) resp-ib_route[0].dgid,
-  dev_addr-dst_dev_addr);
+   dev = dev_get_by_index(init_net, dev_addr-bound_dev_if);
+   if (dev) {
+   vid = rdma_vlan_dev_vlan_id(dev);
+   dev_put(dev);
+   }
+
+
+   iboe_mac_vlan_to_ll((union ib_gid *) resp-ib_route[0].dgid,
+   dev_addr-dst_dev_addr, vid);
iboe_addr_get_sgid(dev_addr,
   (union ib_gid *) resp-ib_route[0].sgid);
resp-ib_route[0].pkey = cpu_to_be16(0x);
diff --git a/drivers/infiniband/core/ud_header.c 
b/drivers/infiniband/core/ud_header.c
index 7650313..7d03cf1 100644
--- a/drivers/infiniband/core/ud_header.c
+++ b/drivers/infiniband/core/ud_header.c
@@ -33,6 +33,7 @@
 
 #include <linux/errno.h>
 #include <linux/string.h>
+#include <linux/if_ether.h>
 
 #include <rdma/ib_pack.h>
 
@@ -103,6 +104,17 @@ static const struct ib_field eth_table[]  = {
  .size_bits= 16 }
 };
 
+static const struct ib_field vlan_table[]  = {
+   { STRUCT_FIELD(vlan, tag),
+ .offset_words = 0,
+ .offset_bits  = 0,
+ .size_bits= 16 },
+   { STRUCT_FIELD(vlan, type),
+ .offset_words = 0,
+ .offset_bits  = 16,
+ .size_bits= 16 }
+};
+
 static const struct ib_field grh_table[]  = {
{ STRUCT_FIELD(grh, ip_version),
  .offset_words = 0,
@@ -205,6 +217,7 @@ 

[ewg] [PATCHv8 11/11] mlx4: Add vlan support to IBoE

2010-02-18 Thread Eli Cohen
This patch allows IBoE traffic to be encapsulated in 802.1q tagged VLAN
frames. The VLAN tag is encoded in the GID and derived from it by a simple
computation. The netdev notifier callback is modified to catch addition and
removal of VLAN devices, and the port's GID table is updated to reflect the
change so that for each netdevice there is an entry in the GID table. When the
port's GID table is exhausted, GID entries will not be added. Only children of
the main interface can add to the GID table. If a VLAN interface is added on
another VLAN interface (e.g. vconfig add eth2.6 8), then that interface will
not add an entry to the GID table.
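
The tag that ends up in the 802.1q header combines the VLAN ID taken from the
GID with the SL as the 3-bit priority. A minimal C sketch of that computation
follows; the function name is hypothetical, and masking the VLAN ID to 12 bits
is an added safety assumption rather than something the patch is known to do.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build the 16-bit 802.1q tag control field from a
 * VLAN ID (low 12 bits) and a service level used as priority (top 3 bits).
 */
static uint16_t sketch_build_vlan_tci(uint16_t vlan_id, uint8_t sl)
{
	return (uint16_t)((vlan_id & 0x0fff) | ((sl & 7) << 13));
}
```

For example, VLAN ID 6 with SL 5 yields 0xa006. The value would still need
converting to network byte order before being written to a frame or WQE.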

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 drivers/infiniband/hw/mlx4/ah.c|   22 +++--
 drivers/infiniband/hw/mlx4/main.c  |   84 +++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h   |4 +-
 drivers/infiniband/hw/mlx4/qp.c|   49 +++---
 drivers/infiniband/hw/mthca/mthca_qp.c |2 +-
 drivers/net/mlx4/en_netdev.c   |   10 
 drivers/net/mlx4/mlx4_en.h |1 +
 drivers/net/mlx4/port.c|   19 +++
 include/linux/mlx4/device.h|1 +
 include/linux/mlx4/qp.h|2 +-
 10 files changed, 166 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c
index 0a2f1fb..32911c0 100644
--- a/drivers/infiniband/hw/mlx4/ah.c
+++ b/drivers/infiniband/hw/mlx4/ah.c
@@ -34,6 +34,7 @@
 #include <rdma/ib_addr.h>
 #include <linux/inet.h>
 #include <linux/string.h>
+#include <rdma/ib_cache.h>
 
 int mlx4_ib_resolve_grh(struct mlx4_ib_dev *dev, const struct ib_ah_attr 
*ah_attr,
u8 *mac, int *is_mcast, u8 port)
@@ -98,6 +99,8 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd, struct 
ib_ah_attr *ah_attr
u8 mac[6];
int err;
int is_mcast;
+   u16 vlan_tag;
+   union ib_gid sgid;
 
err = mlx4_ib_resolve_grh(ibdev, ah_attr, mac, is_mcast, 
ah_attr-port_num);
if (err)
@@ -105,8 +108,14 @@ static struct ib_ah *create_iboe_ah(struct ib_pd *pd, 
struct ib_ah_attr *ah_attr
 
 	memcpy(ah->av.eth.mac_0_1, mac, 2);
 	memcpy(ah->av.eth.mac_2_5, mac + 2, 4);
+	err = ib_get_cached_gid(pd->device, ah_attr->port_num, ah_attr->grh.sgid_index, &sgid);
+	if (err)
+		return ERR_PTR(err);
+	vlan_tag = rdma_get_vlan_id(&sgid);
+	vlan_tag |= (ah_attr->sl & 7) << 13;
 	ah->av.eth.port_pd = cpu_to_be32(to_mpd(pd)->pdn | (ah_attr->port_num << 24));
 	ah->av.eth.gid_index = ah_attr->grh.sgid_index;
+	ah->av.eth.vlan = cpu_to_be16(vlan_tag);
if (ah_attr-static_rate) {
ah-av.eth.stat_rate = ah_attr-static_rate + 
MLX4_STAT_RATE_OFFSET;
while (ah-av.eth.stat_rate  IB_RATE_2_5_GBPS + 
MLX4_STAT_RATE_OFFSET 
@@ -194,8 +203,8 @@ int mlx4_ib_destroy_ah(struct ib_ah *ah)
return 0;
 }
 
-int mlx4_ib_get_eth_l2_addr(struct ib_device *device, u8 port, union ib_gid 
*gid,
-   int sgid_idx, u8 *mac, u16 *vlan_id)
+int mlx4_ib_get_eth_l2_addr(struct ib_device *device, u8 port, union ib_gid 
*dgid,
+   int sgid_idx, u8 *mac, u16 *vlan_id, u8 *tagged)
 {
int err;
struct mlx4_ib_dev *ibdev = to_mdev(device);
@@ -203,13 +212,18 @@ int mlx4_ib_get_eth_l2_addr(struct ib_device *device, u8 
port, union ib_gid *gid
.port_num = port,
};
int is_mcast;
+   union ib_gid sgid;
 
-   memcpy(ah_attr.grh.dgid.raw, gid, 16);
+   memcpy(ah_attr.grh.dgid.raw, dgid, 16);
err = mlx4_ib_resolve_grh(ibdev, ah_attr, mac, is_mcast, port);
if (err)
ERR_PTR(err);
 
-   *vlan_id = 0;
+   err = ib_get_cached_gid(device, port, sgid_idx, sgid);
+   if (err)
+   return err;
+   *vlan_id = rdma_get_vlan_id(sgid);
+   *tagged = !!(*vlan_id);
 
return 0;
 }
diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 3b8ab83..f02897d 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -37,6 +37,7 @@
 #include <linux/netdevice.h>
 #include <linux/inetdevice.h>
 #include <linux/rtnetlink.h>
+#include <linux/if_vlan.h>
 
 #include <rdma/ib_smi.h>
 #include <rdma/ib_user_verbs.h>
@@ -78,6 +79,8 @@ static void init_query_mad(struct ib_smp *mad)
mad-method= IB_MGMT_METHOD_GET;
 }
 
+static union ib_gid zgid;
+
 static int mlx4_ib_query_device(struct ib_device *ibdev,
struct ib_device_attr *props)
 {
@@ -800,12 +803,17 @@ static struct device_attribute *mlx4_class_attributes[] = 
{
dev_attr_board_id
 };
 
-static void mlx4_addrconf_ifid_eui48(u8 *eui, struct net_device *dev)
+static void mlx4_addrconf_ifid_eui48(u8 *eui, int is_vlan, u16 vlan_id, struct 
net_device *dev)
 {
memcpy(eui, dev-dev_addr, 3);
memcpy(eui + 5, 

[ewg] [PATCHv8 1/4] libibverbs: Add link layer field to ibv_port_attr

2010-02-18 Thread Eli Cohen
This field can have one of the values IBV_LINK_LAYER_UNSPECIFIED,
IBV_LINK_LAYER_INFINIBAND or IBV_LINK_LAYER_ETHERNET. It can be used by
applications to know the link layer used by the port, which can be either
InfiniBand or Ethernet. The addition of the new field does not change the size
of struct ibv_port_attr due to alignment of the preceding field. Binary
compatibility is not compromised either, since new apps with old libraries will
determine the link layer as IB, while old applications with a new library do
not read this field.
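
The size argument can be checked with a small stand-alone C sketch. The two
structs below are simplified, hypothetical stand-ins for the tail of struct
ibv_port_attr, not the real definition. They assume a typical ABI where
uint32_t is 4-byte aligned, so the old layout ends with two bytes of padding
that the new fields simply occupy.

```c
#include <stdint.h>

/* Hypothetical stand-in for the old layout: 18 bytes of members. */
struct sketch_port_attr_old {
	uint32_t port_cap_flags;
	uint16_t pkey_tbl_len;
	uint16_t lid;
	uint16_t sm_lid;
	uint8_t  lmc;
	uint8_t  max_vl_num;
	uint8_t  sm_sl;
	uint8_t  subnet_timeout;
	uint8_t  init_type_reply;
	uint8_t  active_width;
	uint8_t  active_speed;
	uint8_t  phys_state;
};

/* Same layout with the two new one-byte fields appended. */
struct sketch_port_attr_new {
	uint32_t port_cap_flags;
	uint16_t pkey_tbl_len;
	uint16_t lid;
	uint16_t sm_lid;
	uint8_t  lmc;
	uint8_t  max_vl_num;
	uint8_t  sm_sl;
	uint8_t  subnet_timeout;
	uint8_t  init_type_reply;
	uint8_t  active_width;
	uint8_t  active_speed;
	uint8_t  phys_state;
	uint8_t  link_layer;	/* fills former trailing padding */
	uint8_t  pad;		/* fills former trailing padding */
};

/* Returns 1 when appending the two fields left the struct size unchanged. */
static int sketch_sizes_match(void)
{
	return sizeof(struct sketch_port_attr_old) ==
	       sizeof(struct sketch_port_attr_new);
}
```

On an ABI with different alignment rules the two sizes could differ, so the
check is worth running on each target of interest.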

Solution suggested by:
   Roland Dreier rola...@cisco.com
   Jason Gunthorpe jguntho...@obsidianresearch.com
Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 include/infiniband/verbs.h |   21 +
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index 0f1cb2e..17df3ff 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -161,6 +161,12 @@ enum ibv_port_state {
IBV_PORT_ACTIVE_DEFER   = 5
 };
 
+enum {
+   IBV_LINK_LAYER_UNSPECIFIED,
+   IBV_LINK_LAYER_INFINIBAND,
+   IBV_LINK_LAYER_ETHERNET,
+};
+
 struct ibv_port_attr {
enum ibv_port_state state;
enum ibv_mtumax_mtu;
@@ -181,6 +187,8 @@ struct ibv_port_attr {
uint8_t active_width;
uint8_t active_speed;
uint8_t phys_state;
+   uint8_t link_layer;
+   uint8_t pad;
 };
 
 enum ibv_event_type {
@@ -693,6 +701,16 @@ struct ibv_context {
void   *abi_compat;
 };
 
+static inline int ___ibv_query_port(struct ibv_context *context,
+				    uint8_t port_num,
+				    struct ibv_port_attr *port_attr)
+{
+	port_attr->link_layer = IBV_LINK_LAYER_UNSPECIFIED;
+	port_attr->pad = 0;
+
+	return context->ops.query_port(context, port_num, port_attr);
+}
+
 /**
  * ibv_get_device_list - Get list of IB devices currently available
  * @num_devices: optional.  if non-NULL, set to the number of devices
@@ -1097,4 +1115,7 @@ END_C_DECLS
 
 #  undef __attribute_const
 
+#define ibv_query_port(context, port_num, port_attr) \
+   ___ibv_query_port(context, port_num, port_attr)
+
 #endif /* INFINIBAND_VERBS_H */
-- 
1.7.0

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] [PATCHv8 2/4] libibverbs: change kernel API to accept link layer

2010-02-18 Thread Eli Cohen
Modify the code to allow passing the link layer of a port from kernel to user.
Update ibv_query_port.3 man page with the change.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 include/infiniband/kern-abi.h |3 ++-
 man/ibv_query_port.3  |1 +
 src/cmd.c |1 +
 3 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h
index 0db083a..619ea7e 100644
--- a/include/infiniband/kern-abi.h
+++ b/include/infiniband/kern-abi.h
@@ -223,7 +223,8 @@ struct ibv_query_port_resp {
__u8  active_width;
__u8  active_speed;
__u8  phys_state;
-   __u8  reserved[3];
+   __u8  link_layer;
+   __u8  reserved[2];
 };
 
 struct ibv_alloc_pd {
diff --git a/man/ibv_query_port.3 b/man/ibv_query_port.3
index 882470d..6d8b873 100644
--- a/man/ibv_query_port.3
+++ b/man/ibv_query_port.3
@@ -44,6 +44,7 @@ uint8_t init_type_reply;/* Type of 
initialization performed by S
 uint8_t active_width;   /* Currently active link width */
 uint8_t active_speed;   /* Currently active link speed */
 uint8_t phys_state; /* Physical port state */
+uint8_t link_layer; /* link layer protocol of the port */  
 
 .in -8
 };
 .sp
diff --git a/src/cmd.c b/src/cmd.c
index cbd5288..39af833 100644
--- a/src/cmd.c
+++ b/src/cmd.c
@@ -196,6 +196,7 @@ int ibv_cmd_query_port(struct ibv_context *context, uint8_t 
port_num,
 	port_attr->active_width    = resp.active_width;
 	port_attr->active_speed    = resp.active_speed;
 	port_attr->phys_state      = resp.phys_state;
+	port_attr->link_layer      = resp.link_layer;
 
return 0;
 }
-- 
1.7.0



[ewg] [PATCHv8 3/4] libibverbs: Add API to retrieve eth link layer address

2010-02-18 Thread Eli Cohen
Add a command to retrieve the layer 2 address of an Ethernet port. The layer 2
address comprises the port's MAC address and the VLAN ID. This is required
by libraries to build work requests when the port's link layer is Ethernet.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 include/infiniband/driver.h   |2 ++
 include/infiniband/kern-abi.h |   23 ++-
 src/cmd.c |   24 
 src/libibverbs.map|1 +
 4 files changed, 49 insertions(+), 1 deletions(-)

diff --git a/include/infiniband/driver.h b/include/infiniband/driver.h
index 9a81416..3e09548 100644
--- a/include/infiniband/driver.h
+++ b/include/infiniband/driver.h
@@ -131,6 +131,8 @@ int ibv_cmd_create_ah(struct ibv_pd *pd, struct ibv_ah *ah,
 int ibv_cmd_destroy_ah(struct ibv_ah *ah);
 int ibv_cmd_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t 
lid);
 int ibv_cmd_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t 
lid);
+int ibv_cmd_get_eth_l2_addr(struct ibv_pd *pd, uint8_t port, const union 
ibv_gid *gid,
+   int sgid_idx, uint8_t *mac, uint16_t *vlan_id, 
uint8_t *tagged);
 
 int ibv_dontfork_range(void *base, size_t size);
 int ibv_dofork_range(void *base, size_t size);
diff --git a/include/infiniband/kern-abi.h b/include/infiniband/kern-abi.h
index 619ea7e..642c7db 100644
--- a/include/infiniband/kern-abi.h
+++ b/include/infiniband/kern-abi.h
@@ -85,7 +85,8 @@ enum {
IB_USER_VERBS_CMD_MODIFY_SRQ,
IB_USER_VERBS_CMD_QUERY_SRQ,
IB_USER_VERBS_CMD_DESTROY_SRQ,
-   IB_USER_VERBS_CMD_POST_SRQ_RECV
+   IB_USER_VERBS_CMD_POST_SRQ_RECV,
+   IB_USER_VERBS_CMD_GET_ETH_L2_ADDR,
 };
 
 /*
@@ -804,6 +805,7 @@ enum {
 * trick opcodes in IBV_INIT_CMD() doesn't break.
 */
IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL_V2 = -1,
+   IB_USER_VERBS_CMD_GET_ETH_L2_ADDR_V2 = -1,
 };
 
 struct ibv_destroy_cq_v1 {
@@ -879,4 +881,23 @@ struct ibv_create_srq_resp_v5 {
__u32 srq_handle;
 };
 
+struct ibv_get_eth_l2_addr {
+   __u32 command;
+   __u16 in_words;
+   __u16 out_words;
+   __u64 response;
+   __u32 pd_handle;
+   __u8  port;
+   __u8  sgid_idx;
+   __u8  reserved[2];
+   __u8  dgid[16];
+};
+
+struct ibv_get_eth_l2_addr_resp {
+   __u8mac[6];
+   __u16   vlan_id;
+   __u8tagged;
+   __u8reserved[3];
+};
+
 #endif /* KERN_ABI_H */
diff --git a/src/cmd.c b/src/cmd.c
index 39af833..6a3c101 100644
--- a/src/cmd.c
+++ b/src/cmd.c
@@ -1123,3 +1123,27 @@ int ibv_cmd_detach_mcast(struct ibv_qp *qp, const union 
ibv_gid *gid, uint16_t l
 
return 0;
 }
+
+int ibv_cmd_get_eth_l2_addr(struct ibv_pd *pd, uint8_t port, const union ibv_gid *gid,
+			    int sgid_idx, uint8_t *mac, uint16_t *vlan_id, uint8_t *tagged)
+
+{
+	struct ibv_get_eth_l2_addr cmd;
+	struct ibv_get_eth_l2_addr_resp resp;
+
+	IBV_INIT_CMD_RESP(&cmd, sizeof cmd, GET_ETH_L2_ADDR, &resp, sizeof resp);
+	memcpy(cmd.dgid, gid, sizeof cmd.dgid);
+	cmd.pd_handle = pd->handle;
+	cmd.port = port;
+	cmd.sgid_idx = sgid_idx;
+
+	if (write(pd->context->cmd_fd, &cmd, sizeof cmd) != sizeof cmd)
+		return errno;
+
+	memcpy(mac, resp.mac, 6);
+	*vlan_id = resp.vlan_id;
+	*tagged = resp.tagged;
+
+	return 0;
+}
+
diff --git a/src/libibverbs.map b/src/libibverbs.map
index 1827da0..af52d0d 100644
--- a/src/libibverbs.map
+++ b/src/libibverbs.map
@@ -64,6 +64,7 @@ IBVERBS_1.0 {
ibv_cmd_destroy_ah;
ibv_cmd_attach_mcast;
ibv_cmd_detach_mcast;
+   ibv_cmd_get_eth_l2_addr;
ibv_copy_qp_attr_from_kern;
ibv_copy_path_rec_from_kern;
ibv_copy_path_rec_to_kern;
-- 
1.7.0



[ewg] [PATCHv8 4/4] libibverbs: Update examples for IBoE

2010-02-18 Thread Eli Cohen
Since IBoE requires usage of GRH, update the ibv_*_pingpong examples to accept
GIDs. GIDs are given as an index into the local port's table and are exchanged
between the client and the server through the socket connection. The examples
are also modified to pass the GID index to the code that creates the address
vector, in preparation for using GIDs other than the one at index 0.
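
Exchanging a GID over a text socket needs a printable form. The sketch below
shows one way to do that, as a 32-character hex string converted byte by byte.
The function names are hypothetical, and this byte-wise form is a simplification
for illustration: the helpers added by this patch work on 32-bit words instead.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: write a 16-byte GID as 32 hex digits plus a NUL. */
static void sketch_gid_to_wire(const uint8_t gid[16], char wire[33])
{
	int i;

	for (i = 0; i < 16; ++i)
		snprintf(wire + i * 2, 3, "%02x", gid[i]);
}

/* Hypothetical: parse 32 hex digits back into a GID; -1 on bad input. */
static int sketch_wire_to_gid(const char *wire, uint8_t gid[16])
{
	unsigned int byte;
	int i;

	if (strlen(wire) < 32)
		return -1;

	for (i = 0; i < 16; ++i) {
		if (sscanf(wire + i * 2, "%2x", &byte) != 1)
			return -1;
		gid[i] = (uint8_t)byte;
	}

	return 0;
}
```

Because each byte is formatted on its own, the string is independent of host
endianness, at the cost of sixteen small conversions instead of four.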

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 examples/devinfo.c  |   14 +++
 examples/pingpong.c |   31 
 examples/pingpong.h |4 ++
 examples/rc_pingpong.c  |   91 ++
 examples/srq_pingpong.c |   84 ---
 examples/uc_pingpong.c  |   82 +++---
 examples/ud_pingpong.c  |   81 ++
 7 files changed, 297 insertions(+), 90 deletions(-)

diff --git a/examples/devinfo.c b/examples/devinfo.c
index 84f95c7..393ec04 100644
--- a/examples/devinfo.c
+++ b/examples/devinfo.c
@@ -184,6 +184,19 @@ static int print_all_port_gids(struct ibv_context *ctx, 
uint8_t port_num, int tb
return rc;
 }
 
+static const char *link_layer_str(uint8_t link_layer)
+{
+	switch (link_layer) {
+	case IBV_LINK_LAYER_UNSPECIFIED:
+	case IBV_LINK_LAYER_INFINIBAND:
+		return "IB";
+	case IBV_LINK_LAYER_ETHERNET:
+		return "Ethernet";
+	default:
+		return "Unknown";
+	}
+}
+
 static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port)
 {
struct ibv_context *ctx;
@@ -284,6 +297,7 @@ static int print_hca_cap(struct ibv_device *ib_dev, uint8_t 
ib_port)
 		printf("\t\t\tsm_lid:\t\t\t%d\n", port_attr.sm_lid);
 		printf("\t\t\tport_lid:\t\t%d\n", port_attr.lid);
 		printf("\t\t\tport_lmc:\t\t0x%02x\n", port_attr.lmc);
+		printf("\t\t\tlink_layer:\t\t%s\n", link_layer_str(port_attr.link_layer));
 
 		if (verbose) {
 			printf("\t\t\tmax_msg_sz:\t\t0x%x\n", port_attr.max_msg_sz);
diff --git a/examples/pingpong.c b/examples/pingpong.c
index b916f59..806f446 100644
--- a/examples/pingpong.c
+++ b/examples/pingpong.c
@@ -31,6 +31,10 @@
  */
 
 #include "pingpong.h"
+#include <arpa/inet.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
 
 enum ibv_mtu pp_mtu_to_enum(int mtu)
 {
@@ -53,3 +57,30 @@ uint16_t pp_get_local_lid(struct ibv_context *context, int 
port)
 
return attr.lid;
 }
+
+int pp_get_port_info(struct ibv_context *context, int port,
+struct ibv_port_attr *attr)
+{
+   return ibv_query_port(context, port, attr);
+}
+
+void wire_gid_to_gid(const char *wgid, union ibv_gid *gid)
+{
+	char tmp[9];
+	uint32_t v32;
+	int i;
+
+	for (tmp[8] = 0, i = 0; i < 4; ++i) {
+		memcpy(tmp, wgid + i * 8, 8);
+		sscanf(tmp, "%x", &v32);
+		*(uint32_t *)(&gid->raw[i * 4]) = ntohl(v32);
+	}
+}
+
+void gid_to_wire_gid(const union ibv_gid *gid, char wgid[])
+{
+	int i;
+
+	for (i = 0; i < 4; ++i)
+		sprintf(&wgid[i * 8], "%08x", htonl(*(uint32_t *)(gid->raw + i * 4)));
+}
diff --git a/examples/pingpong.h b/examples/pingpong.h
index 71d7c3f..9cdc03e 100644
--- a/examples/pingpong.h
+++ b/examples/pingpong.h
@@ -37,5 +37,9 @@
 
 enum ibv_mtu pp_mtu_to_enum(int mtu);
 uint16_t pp_get_local_lid(struct ibv_context *context, int port);
+int pp_get_port_info(struct ibv_context *context, int port,
+struct ibv_port_attr *attr);
+void wire_gid_to_gid(const char *wgid, union ibv_gid *gid);
+void gid_to_wire_gid(const union ibv_gid *gid, char wgid[]);
 
 #endif /* IBV_PINGPONG_H */
diff --git a/examples/rc_pingpong.c b/examples/rc_pingpong.c
index fa969e0..a63905d 100644
--- a/examples/rc_pingpong.c
+++ b/examples/rc_pingpong.c
@@ -67,17 +67,19 @@ struct pingpong_context {
int  size;
int  rx_depth;
int  pending;
+   struct ibv_port_attr portinfo;
 };
 
 struct pingpong_dest {
int lid;
int qpn;
int psn;
+   union ibv_gid gid;
 };
 
 static int pp_connect_ctx(struct pingpong_context *ctx, int port, int my_psn,
  enum ibv_mtu mtu, int sl,
- struct pingpong_dest *dest)
+ struct pingpong_dest *dest, int sgid_idx)
 {
struct ibv_qp_attr attr = {
.qp_state   = IBV_QPS_RTR,
@@ -94,6 +96,13 @@ static int pp_connect_ctx(struct pingpong_context *ctx, int 
port, int my_psn,
.port_num   = port
}
};
+
+	if (dest->gid.global.interface_id) {
+		attr.ah_attr.is_global = 1;
+		attr.ah_attr.grh.hop_limit = 1;
+		attr.ah_attr.grh.dgid = dest->gid;
+		attr.ah_attr.grh.sgid_index = sgid_idx;
+	}
   

[ewg] [PATCHv8 2/2] libmlx4: Add Eth devices to ib devices list

2010-02-18 Thread Eli Cohen
With the new IBoE implementation, Ethernet devices also expose IB devices.
Update the list of supported devices to match that of the kernel.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 src/mlx4.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/src/mlx4.c b/src/mlx4.c
index 1295c53..6068208 100644
--- a/src/mlx4.c
+++ b/src/mlx4.c
@@ -66,6 +66,13 @@ struct {
HCA(MELLANOX, 0x6354),  /* MT25408 Hermon QDR */
HCA(MELLANOX, 0x6732),  /* MT25408 Hermon DDR PCIe gen2 */
HCA(MELLANOX, 0x673c),  /* MT25408 Hermon QDR PCIe gen2 */
+	HCA(MELLANOX, 0x6368), /* MT25448 [ConnectX EN 10GigE, PCIe 2.0 2.5GT/s] */
+	HCA(MELLANOX, 0x6750), /* MT26448 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] */
+	HCA(MELLANOX, 0x6372), /* MT25408 [ConnectX EN 10GigE 10GBaseT, PCIe 2.0 2.5GT/s] */
+	HCA(MELLANOX, 0x675a), /* MT25408 [ConnectX EN 10GigE 10GBaseT, PCIe Gen2 5GT/s] */
+	HCA(MELLANOX, 0x6764), /* MT26468 [ConnectX EN 10GigE, PCIe 2.0 5GT/s] */
+	HCA(MELLANOX, 0x6746), /* MT26438 ConnectX EN 40GigE PCIe gen2 5GT/s */
+	HCA(MELLANOX, 0x676e), /* MT26478 ConnectX2 40GigE PCIe gen2 */
 };
 
 static struct ibv_context_ops mlx4_ctx_ops = {
-- 
1.7.0



[ewg] [PATCHv8 1/2] libmlx4: Add IBoE support

2010-02-18 Thread Eli Cohen
Modify libmlx4 to support IBoE. The change involves retrieving the Ethernet
layer 2 address of a port based on its GID and source index through a new
command, ibv_cmd_get_eth_l2_addr(), and embedding the layer 2 information in
the address vector representation of mlx4.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
 src/mlx4.h  |4 
 src/qp.c|8 +++-
 src/verbs.c |   34 ++
 src/wqe.h   |6 --
 4 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/mlx4.h b/src/mlx4.h
index 4445998..4b12456 100644
--- a/src/mlx4.h
+++ b/src/mlx4.h
@@ -236,11 +236,15 @@ struct mlx4_av {
uint8_t hop_limit;
uint32_tsl_tclass_flowlabel;
uint8_t dgid[16];
+   uint8_t mac[8];
 };
 
 struct mlx4_ah {
struct ibv_ah   ibv_ah;
struct mlx4_av  av;
+   uint16_tvlan;
+   uint8_t mac[6];
+   uint8_t tagged;
 };
 
 static inline unsigned long align(unsigned long val, unsigned long align)
diff --git a/src/qp.c b/src/qp.c
index d194ae3..fa70889 100644
--- a/src/qp.c
+++ b/src/qp.c
@@ -143,6 +143,8 @@ static void set_datagram_seg(struct mlx4_wqe_datagram_seg 
*dseg,
 	memcpy(dseg->av, &to_mah(wr->wr.ud.ah)->av, sizeof (struct mlx4_av));
 	dseg->dqpn = htonl(wr->wr.ud.remote_qpn);
 	dseg->qkey = htonl(wr->wr.ud.remote_qkey);
+	dseg->vlan = htons(to_mah(wr->wr.ud.ah)->vlan);
+	memcpy(dseg->mac, to_mah(wr->wr.ud.ah)->mac, 6);
 }
 
 static void __set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg)
@@ -281,6 +283,10 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr 
*wr,
 			set_datagram_seg(wqe, wr);
 			wqe  += sizeof (struct mlx4_wqe_datagram_seg);
 			size += sizeof (struct mlx4_wqe_datagram_seg) / 16;
+			if (to_mah(wr->wr.ud.ah)->tagged) {
+				ctrl->ins_vlan = 1 << 6;
+				ctrl->vlan_tag = htons(to_mah(wr->wr.ud.ah)->vlan);
+			}
break;
 
default:
@@ -393,7 +399,7 @@ out:
 
 	if (nreq == 1 && inl && size > 1 && size < ctx->bf_buf_size / 16) {
 		ctrl->owner_opcode |= htonl((qp->sq.head & 0xffff) << 8);
-		*(uint32_t *) ctrl->reserved |= qp->doorbell_qpn;
+		*(uint32_t *) (&ctrl->vlan_tag) |= qp->doorbell_qpn;
/*
 * Make sure that descriptor is written to memory
 * before writing to BlueFlame page.
diff --git a/src/verbs.c b/src/verbs.c
index 1ac1362..48731a7 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -614,9 +614,21 @@ int mlx4_destroy_qp(struct ibv_qp *ibqp)
return 0;
 }
 
+static int mcast_mac(uint8_t *mac)
+{
+	int i;
+	uint8_t val = 0xff;
+
+	for (i = 0; i < 6; ++i)
+		val &= mac[i];
+
+	return val == 0xff;
+}
+
 struct ibv_ah *mlx4_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr)
 {
struct mlx4_ah *ah;
+   struct ibv_port_attr port_attr;
 
ah = malloc(sizeof *ah);
if (!ah)
@@ -642,7 +654,29 @@ struct ibv_ah *mlx4_create_ah(struct ibv_pd *pd, struct 
ibv_ah_attr *attr)
 		memcpy(ah->av.dgid, attr->grh.dgid.raw, 16);
 	}
 
+	if (ibv_query_port(pd->context, attr->port_num, &port_attr))
+		goto err;
+
+	if (port_attr.link_layer == IBV_LINK_LAYER_ETHERNET) {
+		if (ibv_cmd_get_eth_l2_addr(pd, attr->port_num,
+					    (const union ibv_gid *)ah->av.dgid,
+					    attr->grh.sgid_index,
+					    ah->mac, &ah->vlan, &ah->tagged))
+			goto err;
+
+		if (mcast_mac(ah->mac))
+			ah->av.dlid = htons(0xc000);
+		if (ah->tagged) {
+			ah->av.port_pd |= htonl(1 << 29);
+			ah->vlan |= (attr->sl & 7) << 13;
+		}
+	}
+
+
 	return &ah->ibv_ah;
+err:
+	free(ah);
+	return NULL;
 }
 
 int mlx4_destroy_ah(struct ibv_ah *ah)
diff --git a/src/wqe.h b/src/wqe.h
index 6f7f309..1e6159c 100644
--- a/src/wqe.h
+++ b/src/wqe.h
@@ -54,7 +54,8 @@ enum {
 
 struct mlx4_wqe_ctrl_seg {
    uint32_t        owner_opcode;
-   uint8_t         reserved[3];
+   uint16_t        vlan_tag;
+   uint8_t         ins_vlan;
    uint8_t         fence_size;
/*
 * High 24 bits are SRC remote buffer; low 8 bits are flags:
@@ -78,7 +79,8 @@ struct mlx4_wqe_datagram_seg {
    uint32_t        av[8];
    uint32_t        dqpn;
    uint32_t        qkey;
-   uint32_t        reserved[2];
+   uint16_t 
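
The patch above folds the service level into the top three bits of the VLAN field (`(attr->sl & 7) << 13`) and treats an all-ones MAC specially. A minimal sketch of those two bit manipulations follows, assuming the standard 802.1Q tag layout (3 priority bits, 1 DEI/CFI bit, 12 VLAN ID bits). The helper names `vlan_tci_with_priority` and `all_ones_mac` are hypothetical and not part of libmlx4, and unlike the patch the sketch masks the VLAN ID to 12 bits.

```c
#include <stdint.h>

/*
 * Build a 16-bit 802.1Q tag control field in host byte order.
 * Bits 15..13 carry the priority, bits 11..0 the VLAN ID.
 * Hypothetical helper; masking the VLAN ID is an assumption of this sketch.
 */
static uint16_t vlan_tci_with_priority(uint16_t vlan_id, uint8_t sl)
{
    return (uint16_t)((vlan_id & 0x0fff) | ((uint16_t)(sl & 7) << 13));
}

/*
 * Same test as the patch's mcast_mac(), assuming the accumulation is
 * "val &= mac[i]": true only when all six bytes are 0xff.
 */
static int all_ones_mac(const uint8_t *mac)
{
    uint8_t val = 0xff;
    int i;

    for (i = 0; i < 6; ++i)
        val &= mac[i];

    return val == 0xff;
}
```

With these definitions, VLAN ID 100 at service level 5 yields 0xa064. Note that the all-ones test matches only ff:ff:ff:ff:ff:ff, which is the Ethernet broadcast address.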

Re: [ewg] [PATCH OFED-151] ehca forward ports

2010-02-18 Thread Vladimir Sokolovsky
Alexander Schmidt wrote:
 Hi Vlad,
 
 please apply for OFED-151.
 
 Forward ports for ehca driver to enable compilation
 on 2.6.32 and 2.6.31.
 
 Signed-off-by: Alexander Schmidt al...@linux.vnet.ibm.com
 ---
  kernel_patches/backport/2.6.32/ehca-010-remove_driver_data.patch |   60 ++
  kernel_patches/backport/2.6.32/ehca-020-fix_buswalk.patch        |   17 ++
  2 files changed, 77 insertions(+)
 

Hi Alex,
I don't see patches for 2.6.31. Should they be here?

Regards,
Vladimir

 --- /dev/null
 +++ ofed_kernel-1.5/kernel_patches/backport/2.6.32/ehca-010-remove_driver_data.patch
 @@ -0,0 +1,60 @@
 +commit f899c2ddd45f2515deb446e2b143e4a686a49aee
 +Author: Greg Kroah-Hartman gre...@suse.de
 +Date:   Mon May 4 12:40:54 2009 -0700
 +
 +infiniband: ehca: remove driver_data direct access of struct device
 +
 +In the near future, the driver core is going to not allow direct access
 +to the driver_data pointer in struct device.  Instead, the functions
 +dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
 +have been around since the beginning, so are backwards compatible with
 +all older kernel versions.
 +
 +Cc: Sean Hefty sean.he...@intel.com
 +Cc: Roland Dreier rola...@cisco.com
 +Cc: Hal Rosenstock hal.rosenst...@gmail.com
 +Cc: gene...@lists.openfabrics.org
 +Cc: Christoph Raisch rai...@de.ibm.com
 +Acked-by: Hoang-Nam Nguyen hngu...@de.ibm.com
 +Signed-off-by: Greg Kroah-Hartman gre...@suse.de
 +
 +diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
 +index 85905ab..ce4e6ef 100644
 +--- a/drivers/infiniband/hw/ehca/ehca_main.c
 ++++ b/drivers/infiniband/hw/ehca/ehca_main.c
 +@@ -636,7 +636,7 @@ static ssize_t  ehca_show_##name(struct device *dev, \
 + struct hipz_query_hca *rblock; \
 + int data;  \
 +\
 +-shca = dev->driver_data;   \
 ++shca = dev_get_drvdata(dev);   \
 +\
 + rblock = ehca_alloc_fw_ctrlblock(GFP_KERNEL);  \
 + if (!rblock) { \
 +@@ -680,7 +680,7 @@ static ssize_t ehca_show_adapter_handle(struct device *dev,
 + struct device_attribute *attr,
 + char *buf)
 + {
 +-struct ehca_shca *shca = dev->driver_data;
 ++struct ehca_shca *shca = dev_get_drvdata(dev);
 + 
 + return sprintf(buf, "%llx\n", shca->ipz_hca_handle.handle);
 + 
 +@@ -749,7 +749,7 @@ static int __devinit ehca_probe(struct of_device *dev,
 + 
 + shca->ofdev = dev;
 + shca->ipz_hca_handle.handle = *handle;
 +-dev->dev.driver_data = shca;
 ++dev_set_drvdata(&dev->dev, shca);
 + 
 + ret = ehca_sense_attributes(shca);
 + if (ret < 0) {
 +@@ -878,7 +878,7 @@ probe1:
 + 
 + static int __devexit ehca_remove(struct of_device *dev)
 + {
 +-struct ehca_shca *shca = dev->dev.driver_data;
 ++struct ehca_shca *shca = dev_get_drvdata(&dev->dev);
 + unsigned long flags;
 + int ret;
 + 
 --- /dev/null
 +++ ofed_kernel-1.5/kernel_patches/backport/2.6.32/ehca-020-fix_buswalk.patch
 @@ -0,0 +1,17 @@
 +---
 + drivers/infiniband/hw/ehca/ehca_mrmw.c |2 +-
 + 1 file changed, 1 insertion(+), 1 deletion(-)
 +
 +Index: ofa_kernel-1.5.1/drivers/infiniband/hw/ehca/ehca_mrmw.c
 +===================================================================
 +--- ofa_kernel-1.5.1.orig/drivers/infiniband/hw/ehca/ehca_mrmw.c
 ++++ ofa_kernel-1.5.1/drivers/infiniband/hw/ehca/ehca_mrmw.c
 +@@ -2463,7 +2463,7 @@ int ehca_create_busmap(void)
 + int ret;
 + 
 + ehca_mr_len = 0;
 +-ret = walk_memory_resource(0, 1ULL << MAX_PHYSMEM_BITS, NULL,
 ++ret = walk_system_ram_range(0, 1ULL << MAX_PHYSMEM_BITS, NULL,
 +ehca_create_busmap_callback);
 + return ret;
 + }
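
The first backport above replaces direct `driver_data` access with `dev_get_drvdata()` and `dev_set_drvdata()`. The accessor pattern can be sketched in userspace as follows. All `mock_*` names are hypothetical stand-ins invented for this illustration, not the kernel's definitions; the point is only that callers go through the accessors and stop depending on where the pointer is stored.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device; the real kernel struct differs. */
struct mock_device {
    void *private_data;   /* storage detail hidden behind the accessors */
};

/* Hypothetical counterpart of dev_set_drvdata(). */
static void mock_set_drvdata(struct mock_device *dev, void *data)
{
    dev->private_data = data;
}

/* Hypothetical counterpart of dev_get_drvdata(). */
static void *mock_get_drvdata(const struct mock_device *dev)
{
    return dev->private_data;
}

/* Hypothetical driver state, standing in for struct ehca_shca. */
struct mock_shca {
    unsigned long long handle;
};

/* A caller in the style of ehca_show_adapter_handle(): accessor only. */
static unsigned long long mock_show_handle(const struct mock_device *dev)
{
    const struct mock_shca *shca = mock_get_drvdata(dev);

    return shca->handle;
}
```

If the storage field is later moved or renamed, only the two accessors need to change, which is the motivation given in the quoted commit message.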
 

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] [PATCH] IB/qib: update driver for OFED 1.5.1

2010-02-18 Thread Ralph Campbell
Vlad,

Please pull from:

git://git.openfabrics.org/~ralphc/linux-2.6 ofed_kernel_1_5

commit bbf2471eac44a9cf2db05803a212162da3898ca4
Author: Ralph Campbell (QLogic) ral...@lists.openfabrics.org
Date:   Thu Feb 18 13:53:24 2010 -0800

IB/qib: update driver for OFED 1.5.1

This patch rolls up several fixes for the QIB driver to improve
serdes settings, minor bug fixes, and copyright updates.
It also adds a vendor specific performance MAD for returning some
congestion statistics.

Signed-off-by: Ralph Campbell ralph.campb...@qlogic.com




Re: [ewg] MLX4 Strangeness

2010-02-18 Thread Vu Pham
Hi Tom,

Status 12 = IB_WC_RETRY_EXC_ERR
Vendor_err = 129 -- Timeout and transport error counter exceeded

This indicates that we lost the connection to the client, i.e. something
went wrong on the client side (for example, a bad operation causing a QP
error). Please try to catch any error on the client (QP async event, CQ
error status, vendor_err, and so on).
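
To make the status numbers in the trace below easier to read, here is a minimal sketch that maps completion status codes to names and picks out the first completion that is neither a success nor a flush. The table is a local copy following the order of the kernel's `enum ib_wc_status`, limited to the first 14 values. The function names `wc_status_name` and `first_root_cause` are hypothetical, not part of any verbs API.

```c
#include <stddef.h>

/* Local copy of the first completion status names, indexed by status value. */
static const char *const wc_status_names[] = {
    "IB_WC_SUCCESS",          /* 0 */
    "IB_WC_LOC_LEN_ERR",      /* 1 */
    "IB_WC_LOC_QP_OP_ERR",    /* 2 */
    "IB_WC_LOC_EEC_OP_ERR",   /* 3 */
    "IB_WC_LOC_PROT_ERR",     /* 4 */
    "IB_WC_WR_FLUSH_ERR",     /* 5 */
    "IB_WC_MW_BIND_ERR",      /* 6 */
    "IB_WC_BAD_RESP_ERR",     /* 7 */
    "IB_WC_LOC_ACCESS_ERR",   /* 8 */
    "IB_WC_REM_INV_REQ_ERR",  /* 9 */
    "IB_WC_REM_ACCESS_ERR",   /* 10 */
    "IB_WC_REM_OP_ERR",       /* 11 */
    "IB_WC_RETRY_EXC_ERR",    /* 12 */
    "IB_WC_RNR_RETRY_EXC_ERR" /* 13 */
};

/* Return the name for a status value, or "UNKNOWN" if it is out of range. */
static const char *wc_status_name(unsigned int status)
{
    size_t n = sizeof(wc_status_names) / sizeof(wc_status_names[0]);

    return status < n ? wc_status_names[status] : "UNKNOWN";
}

/*
 * Return the index of the first completion that is neither a success (0)
 * nor a flush (5), or -1 if there is none. Flushed work requests usually
 * follow the error that moved the QP to the error state.
 */
static int first_root_cause(const unsigned int *statuses, size_t count)
{
    size_t i;

    for (i = 0; i < count; ++i)
        if (statuses[i] != 0 && statuses[i] != 5)
            return (int)i;

    return -1;
}
```

Applied to the three completions in the trace (12, 5, 5), the first entry is the one to investigate; the two status 5 entries are work requests flushed afterwards.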

Today I ran vdbench on a big file and got the error right away (the
connection was lost and nfsrdma could not recover from there).

Thanks,
-vu

-Original Message-
From: ewg-boun...@lists.openfabrics.org
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tom Tucker
Sent: Wednesday, February 17, 2010 10:07 AM
To: Tziporet Koren
Cc: linux-r...@vger.kernel.org; ewg@lists.openfabrics.org
Subject: Re: [ewg] MLX4 Strangeness

Hi Tziporet:

Here is a trace with the data for WR failing with status 12. The vendor 
error is 129.

Feb 17 12:27:33 vic10 kernel: rpcrdma_event_process:154 wr_id 
 status 12 opcode 0 vendor_err 129 byte_len 0 qp 
81002a13ec00 ex  src_qp  wc_flags, 0 pkey_index
Feb 17 12:27:33 vic10 kernel: rpcrdma_event_process:154 wr_id 
81002878d800 status 5 opcode 0 vendor_err 244 byte_len 0 qp 
81002a13ec00 ex  src_qp  wc_flags, 0 pkey_index
Feb 17 12:27:33 vic10 kernel: rpcrdma_event_process:167 wr_id 
81002878d800 status 5 opcode 0 vendor_err 244 byte_len 0 qp 
81002a13ec00 ex  src_qp  wc_flags, 0 pkey_index

Any thoughts?
Tom

Tom Tucker wrote:
 Tom Tucker wrote:
 Tziporet Koren wrote:
 On 2/15/2010 10:24 PM, Tom Tucker wrote:
  
 Hello,

 I am seeing some very strange behavior on my MLX4 adapters running 2.7
 firmware and the latest OFED 1.5.1. Two systems are involved and each
 has a dual-ported MTHCA DDR adapter and MLX4 adapters.

 The scenario starts with NFSRDMA stress testing between the two systems
 running bonnie++ and iozone concurrently. The test completes and there
 is no issue. Then 6 minutes pass and the server times out the
 connection and shuts down the RC connection to the client.

 From this point on, using the RDMA CM, a new RC QP can be brought up
 and moved to RTS; however, the first RDMA_SEND to the NFS SERVER system
 fails with IB_WC_RETRY_EXC_ERR. I have confirmed:

 - that arp completed successfully and the neighbor entries are
   populated on both the client and server
 - that the QPs are in the RTS state on both the client and server
 - that there are RECV WRs posted to the RQ on the server and they did
   not error out
 - that no RECV WR completed successfully or in error on the server
 - that there are SEND WRs posted to the QP on the client
 - that the client-side SEND_WR fails with error 12 as mentioned above

 I have also confirmed the following with a different application
 (i.e. rping):

 server# rping -s
 client# rping -c -a 192.168.80.129

 fails with the exact same error, i.e.
 client# rping -c -a 192.168.80.129
 cq completion failed status 12
 wait for RDMA_WRITE_ADV state 10
 client DISCONNECT EVENT...

 However, if I run rping the other way, it works fine, that is,

 client# rping -s
 server# rping -c -a 192.168.80.135

 It runs without error until I stop it.

 Does anyone have any ideas on how I might debug this?



 Tom
 What is the vendor syndrome error when you get a completion with error?

   
 Feb 16 15:08:29 vic10 kernel: rpcrdma: connection to 
 192.168.80.129:20049 closed (-103)
 Feb 16 15:51:27 vic10 kernel: rpcrdma: connection to 
 192.168.80.129:20049 on mlx4_0, memreg 5 slots 32 ird 16
 Feb 16 15:52:01 vic10 kernel: rpcrdma_event_process:160 wr_id 
 81002879a000 status 5 opcode 0 vendor_err 244 byte_len 0 qp 
 81003c9e3200 ex  src_qp  wc_flags, 0 pkey_index
 Feb 16 15:52:06 vic10 kernel: rpcrdma: connection to 
 192.168.80.129:20049 closed (-103)
 Feb 16 15:52:06 vic10 kernel: rpcrdma: connection to 
 192.168.80.129:20049 on mlx4_0, memreg 5 slots 32 ird 16
 Feb 16 15:52:40 vic10 kernel: rpcrdma_event_process:160 wr_id 
 81002879a000 status 5 opcode 0 vendor_err 244 byte_len 0 qp 
 81002f2d8400 ex  src_qp  wc_flags, 0 pkey_index

 Repeat forever

 So the vendor err is 244.


 Please ignore this. This log skips the failing WR (:-\). I need to do 
 another trace.



 Does the issue occur only on the ConnectX cards (mlx4) or also on
 the InfiniHost cards (mthca)?

 Tziporet
