[ewg] RE: [PATCH] ISER: fix compilation issues on Lustre kernels based on RHEL4.0U[4-6].
We'll try to find a computer and test it.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED]
Sent: Monday, January 21, 2008 6:08 PM
To: Moshe Kazir; Erez Zilber
Cc: OpenFabricsEWG; Nir Gal; Yair Ifergan
Subject: [PATCH] ISER: fix compilation issues on Lustre kernels based on RHEL4.0U[4-6].

Hi Moshe,
The following patch fixes an OFED-1.3 compilation issue on Lustre kernels:

ISER: fix compilation issues on Lustre kernels based on RHEL4.0U[4-6].

Signed-off-by: Vladimir Sokolovsky [EMAIL PROTECTED]
---
diff --git a/kernel_patches/backport/2.6.9_U4/iscsi_06_scsi_addons.patch b/kernel_patches/backport/2.6.9_U4/iscsi_06_scsi_addons.patch
index 1b5af7b..2c6d71f 100644
--- a/kernel_patches/backport/2.6.9_U4/iscsi_06_scsi_addons.patch
+++ b/kernel_patches/backport/2.6.9_U4/iscsi_06_scsi_addons.patch
@@ -69,7 +69,7 @@ index e212608..3bf2015 100644
  obj-$(CONFIG_SCSI_ISCSI_ATTRS) += scsi_transport_iscsi.o
  obj-$(CONFIG_ISCSI_TCP)+= libiscsi.o iscsi_tcp.o
 +
-+CFLAGS_attribute_container.o = -I$(PWD)/kernel_addons/backport/2.6.9_U4/include/src/
++CFLAGS_attribute_container.o = $(BACKPORT_INCLUDES)/src/
 +
 +scsi_transport_iscsi-y := scsi_transport_iscsi_f.o scsi.o scsi_lib.o init.o klist.o attribute_container.o transport_class.o
 +libiscsi-y := libiscsi_f.o scsi_scan.o
diff --git a/kernel_patches/backport/2.6.9_U5/iscsi_06_scsi_addons.patch b/kernel_patches/backport/2.6.9_U5/iscsi_06_scsi_addons.patch
index 1b5af7b..2c6d71f 100644
--- a/kernel_patches/backport/2.6.9_U5/iscsi_06_scsi_addons.patch
+++ b/kernel_patches/backport/2.6.9_U5/iscsi_06_scsi_addons.patch
@@ -69,7 +69,7 @@ index e212608..3bf2015 100644
  obj-$(CONFIG_SCSI_ISCSI_ATTRS) += scsi_transport_iscsi.o
  obj-$(CONFIG_ISCSI_TCP)+= libiscsi.o iscsi_tcp.o
 +
-+CFLAGS_attribute_container.o = -I$(PWD)/kernel_addons/backport/2.6.9_U4/include/src/
++CFLAGS_attribute_container.o = $(BACKPORT_INCLUDES)/src/
 +
 +scsi_transport_iscsi-y := scsi_transport_iscsi_f.o scsi.o scsi_lib.o init.o klist.o attribute_container.o transport_class.o
 +libiscsi-y := libiscsi_f.o scsi_scan.o
diff --git a/kernel_patches/backport/2.6.9_U6/iscsi_06_scsi_addons.patch b/kernel_patches/backport/2.6.9_U6/iscsi_06_scsi_addons.patch
index 1b5af7b..2c6d71f 100644
--- a/kernel_patches/backport/2.6.9_U6/iscsi_06_scsi_addons.patch
+++ b/kernel_patches/backport/2.6.9_U6/iscsi_06_scsi_addons.patch
@@ -69,7 +69,7 @@ index e212608..3bf2015 100644
  obj-$(CONFIG_SCSI_ISCSI_ATTRS) += scsi_transport_iscsi.o
  obj-$(CONFIG_ISCSI_TCP)+= libiscsi.o iscsi_tcp.o
 +
-+CFLAGS_attribute_container.o = -I$(PWD)/kernel_addons/backport/2.6.9_U4/include/src/
++CFLAGS_attribute_container.o = $(BACKPORT_INCLUDES)/src/
 +
 +scsi_transport_iscsi-y := scsi_transport_iscsi_f.o scsi.o scsi_lib.o init.o klist.o attribute_container.o transport_class.o
 +libiscsi-y := libiscsi_f.o scsi_scan.o
diff --git a/ofed_scripts/makefile b/ofed_scripts/makefile
index bcc55fe..cb89d00 100644
--- a/ofed_scripts/makefile
+++ b/ofed_scripts/makefile
@@ -67,7 +67,7 @@ kernel:
 	@echo Kernel version: $(KVERSION)
 	@echo Modules directory: $(DESTDIR)/$(MODULES_DIR)
 	@echo Kernel sources: $(KSRC)
-	env CWD=$(CWD) \
+	env CWD=$(CWD) BACKPORT_INCLUDES=$(BACKPORT_INCLUDES) \
 		$(MAKE) -C $(KSRC) SUBDIRS=$(CWD) \
 		V=1 $(WITH_MAKE_PARAMS) \
 		CONFIG_MEMTRACK=$(CONFIG_MEMTRACK) \
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] [PATCH] IB/iser: Add iSER fix that will be merged into 2.6.25
Erez Zilber wrote:
> This fix adds a printk before initiating a BUG() when receiving an
> unhandled RDMA-CM event.
>
> Signed-off-by: Erez Zilber [EMAIL PROTECTED]
> ---
>  ...nformation_about_unhandled_RDMA_CM_events.patch |   33
>  1 files changed, 33 insertions(+), 0 deletions(-)
>  create mode 100644 kernel_patches/fixes/iser_01_Print_information_about_unhandled_RDMA_CM_events.patch
>
> diff --git a/kernel_patches/fixes/iser_01_Print_information_about_unhandled_RDMA_CM_events.patch b/kernel_patches/fixes/iser_01_Print_information_about_unhandled_RDMA_CM_events.patch
> new file mode 100644
> index 000..4b96c8f
> --- /dev/null
> +++ b/kernel_patches/fixes/iser_01_Print_information_about_unhandled_RDMA_CM_events.patch
> @@ -0,0 +1,33 @@
> +Print information about unhandled RDMA CM events
> +
> +Some RDMA CM events are not supported or not handled in iSER.
> +This patch adds some info (printk) for the user about them.
> +
> +Signed-off-by: Erez Zilber [EMAIL PROTECTED]
> +---
> + drivers/infiniband/ulp/iser/iser_verbs.c |6 ++
> + 1 files changed, 2 insertions(+), 4 deletions(-)
> +
> +diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
> +index 654a4dc..675d00b 100644
> +--- a/drivers/infiniband/ulp/iser/iser_verbs.c
> ++++ b/drivers/infiniband/ulp/iser/iser_verbs.c
> +@@ -475,13 +475,11 @@ static int iser_cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *eve
> + 		iser_disconnected_handler(cma_id);
> + 		break;
> + 	case RDMA_CM_EVENT_DEVICE_REMOVAL:
> ++		iser_err("Device removal is currently unsupported\n");
> + 		BUG();
> + 		break;
> +-	case RDMA_CM_EVENT_CONNECT_RESPONSE:
> +-		BUG();
> +-		break;
> +-	case RDMA_CM_EVENT_CONNECT_REQUEST:
> + 	default:
> ++		iser_err("Unexpected RDMA CM event (%d)\n", event->event);
> + 		break;
> + 	}
> + 	return ret;
> +--
> +1.5.3.7
> +

Applied to the ofed_1_3/linux-2.6.git ofed_kernel.

Regards,
Vladimir
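For readers who want to see the resulting control flow outside the kernel, the following is a minimal userspace sketch of the pattern the patch describes: log before the fatal path for device removal, and log and carry on for any event the handler does not expect. Every name here (mock_cm_event, mock_action, mock_cma_handler) is a hypothetical stand-in and not part of iSER or the RDMA CM API; the fatal case returns a marker value instead of calling BUG() so that the sketch can run in a normal process.

```c
#include <stdio.h>

/* Hypothetical stand-ins for RDMA CM event codes; the real definitions
 * live in the kernel's RDMA CM headers and are not reproduced here. */
enum mock_cm_event {
	MOCK_EVENT_DISCONNECTED,
	MOCK_EVENT_DEVICE_REMOVAL,
	MOCK_EVENT_CONNECT_REQUEST,
	MOCK_EVENT_CONNECT_RESPONSE
};

/* What the handler decided to do with an event (assumption for the sketch). */
enum mock_action {
	MOCK_HANDLED,	/* event has a dedicated handler */
	MOCK_FATAL,	/* the kernel patch logs, then calls BUG() here */
	MOCK_IGNORED	/* the kernel patch logs and continues */
};

static enum mock_action mock_cma_handler(enum mock_cm_event event)
{
	switch (event) {
	case MOCK_EVENT_DISCONNECTED:
		return MOCK_HANDLED;
	case MOCK_EVENT_DEVICE_REMOVAL:
		/* Say why we are about to stop before stopping. */
		fprintf(stderr, "Device removal is currently unsupported\n");
		return MOCK_FATAL;
	default:
		/* Anything else is reported with its numeric code, then skipped. */
		fprintf(stderr, "Unexpected RDMA CM event (%d)\n", (int)event);
		return MOCK_IGNORED;
	}
}
```

In this sketch the connect request and connect response events both fall through to the default branch, mirroring the removal of their explicit cases in the patch.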
[ewg] OFED Jan 21 meeting summary on RC2 status
OFED Jan-21 meeting summary on OFED 1.3-rc2 status

Meeting summary:
----------------
* RC2 is in good shape, apart from some bugs that should be fixed for RC3
* RC3 is planned for next week

Meeting details:
----------------
1. Review RC2 status
   * Qlogic - status is good with their vnic and general tests; see issues
     with qperf on RDS
   * Intel - RC2 is OK, mvapich is fixed for ia64
   * Mellanox - regression is good and stable; Cleanup SDP bugs.
   * IBM - Status is good; PPC issues resolved
   * Neteffect - Status is OK
   * Chelsio - Testing is progressing well
   * Cisco - no update
   * Voltaire - have issues with bonding and IPoIB performance
   * MPI - all MPI packages are in good shape

2. Update on tasks that should be completed for RC2:
   * XRC - enhanced API - should be submitted today
   * IPoIB - need to resolve the new issue reported by Voltaire

Tziporet
Re: [ewg] RDS - Recovering from RDMA errors
On Sunday 20 January 2008 20:57, Roland Dreier wrote:
> If you could send me some code and a recipe to get the bogus CQ
> message, that might be helpful. Because as far as I can see, there
> shouldn't be any way for a consumer to get that message without a bug
> in the low-level driver. It's fine if it's a whole big RDS test case,
> I just want to be able to run the test and instrument the low-level
> driver to get a better handle on what's happening.

Okay, I put my current patch queue into a git tree. It's in the testing
branch of

    git://www.openfabrics.org/~okir/ofed_1_3/linux-2.6.git
    git://www.openfabrics.org/~okir/ofed_1_3/rds-tools.git

In order to reproduce the problem, I usually run

    while sleep 1; do
        rds-stress -R -r locip -s remip -p 4000 -c -d2 -t8 -T5 -D1m
    done

Within minutes, I get syslog messages saying

    Timed out waiting for CQs to be drained - recv: 0 entries, send: 4 entries left

This message originates from net/rds_ib_cm.c - as a workaround, I added
a timeout of 1 second when waiting for the WQs to be drained.

I usually get those stalls after a WQE completes with status 10 (or
sometimes 4).

> BTW, what kind of HCA are you using for this testing?

A pair of fairly new Mellanox cards.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax
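The workaround mentioned above (give up after one second instead of waiting indefinitely for outstanding work requests) can be sketched in userspace as a bounded polling loop. This is an illustration only, not the RDS code: wait_for_drain, pending_fn, count_down and never_drains are hypothetical names, and the 1 ms poll interval is an arbitrary choice for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: reports how many work requests are still pending. */
typedef unsigned int (*pending_fn)(void *ctx);

/* Current time in milliseconds on the monotonic clock. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until pending() reports zero or timeout_ms has elapsed.
 * Returns true if the queue drained, false if the wait timed out.
 */
static bool wait_for_drain(pending_fn pending, void *ctx, long timeout_ms)
{
	const struct timespec pause = { 0, 1000000 };	/* 1 ms between polls */
	long long deadline = now_ms() + timeout_ms;

	for (;;) {
		if (pending(ctx) == 0)
			return true;
		if (now_ms() >= deadline)
			return false;
		nanosleep(&pause, NULL);
	}
}

/* Example callback: one entry completes on every poll. */
static unsigned int count_down(void *ctx)
{
	unsigned int *left = ctx;

	if (*left > 0)
		(*left)--;
	return *left;
}

/* Example callback: entries that never complete, as in the stall above. */
static unsigned int never_drains(void *ctx)
{
	return *(unsigned int *)ctx;
}
```

Calling wait_for_drain(never_drains, &stuck, 1000) with stuck set to 4 would model the reported case of four send entries that are never completed: the call returns false after roughly one second, at which point the caller can log the leftover count and proceed with teardown.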
[ewg] Question regarding non srq patch for OFED 1.3
Some HCAs like ehca do not natively support srq. In order to enable
IPoIB CM for such HCAs, I have developed a nonsrq patch. This patch has
been accepted into Roland's for-2.6.25 git tree for about 3 months now.

I am working on porting that to OFED 1.3 and it will take me at least
several days to finish the port and test it. Is there a date by which I
need to complete it for its inclusion into OFED 1.3?

Pradeep
Re: [ewg] RDS - Recovering from RDMA errors
> > BTW, what kind of HCA are you using for this testing?
>
> A pair of fairly new Mellanox cards.

How new? Is it ConnectX or something older -- ie do you use the
ib_mthca or mlx4_ib driver?

If you're using mlx4, then I could believe there is a firmware bug
that leads to lost completions.

 - R.
Re: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness
> > I guess you mean just implement XRC without allowing multiple
> > processes to share an XRC domain? That actually seems like a
> > sensible thing to implement as well...
>
> This is part of the current XRC implementation -- just give -1 as the
> fd value in ibv_open_xrc_domain().

I *think* Gleb's point was that the XRC implementation could be much
simpler if this were the *only* case supported -- you wouldn't need
all the complexity of kernel receive QPs etc I guess.

Gleb, is that what you meant?
[ewg] Re: Question regarding non srq patch for OFED 1.3
Pradeep Satyanarayana wrote:
> Some HCAs like ehca do not natively support srq. In order to enable
> IPoIB CM for such HCAs, I have developed a nonsrq patch. This patch has
> been accepted into Roland's for-2.6.25 git tree for about 3 months now.
>
> I am working on porting that to OFED 1.3 and it will take me at least
> several days to finish the port and test it. Is there a date by which I
> need to complete it for its inclusion into OFED 1.3?
>
> Pradeep

If I remember correctly, this is not a small patch, so I don't know
whether it is already too late for OFED 1.3, since it may delay the Feb
release. However, we can discuss this in the next OFED meeting on Monday
and see what other people think.

Tziporet