[ewg] Re: [ofa-general] Re: dapl attribute bug
Bug 1613 opened to track this. I think we need this for ofed-1.4.1.

Steve.

Steve Wise wrote:

Hey Arlin,

Did this ever get fixed? I think UNH is seeing this issue still.

Steve Wise wrote:

Davis, Arlin R wrote:

The DAPL dat_ia_attr->max_lmr_block_size is a u32, yet the dapl code maps this to the linux ib_device_attr->max_mr_size which is u64. This causes dapltest to fail in some cases when running over chelsio which sets max_mr_size to 0x1 (4GB). The dapl code truncates the value to 0. See dapl/openib_cma/dapl_ib_util.c.

I'm not sure what the fix should be, but maybe the dapl code should set anything over 32 bits to 0x?

This attribute changed with DAT 2.0 to match the 32-bit ibv_sge length field. Since there are no direct max lmr segments mappings I will need to add some checks when setting max_lmr_block_size from max_mr_size.

Thanks.

-arlin

I'll test your fix when it's ready. Lemme know.

Steve.

___
general mailing list
gene...@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
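The truncation described in the thread above can be illustrated with a minimal C sketch. It assumes the reported Chelsio value is 2^32 (4 GB) and that saturating at the largest 32-bit value is an acceptable policy. Both helper names are hypothetical, and this is not the actual change made in dapl/openib_cma/dapl_ib_util.c.

```c
#include <stdint.h>

/* Plain narrowing conversion, as the thread describes: the upper
 * 32 bits are discarded, so a 64-bit size of 2^32 becomes 0. */
static uint32_t truncate_mr_size(uint64_t max_mr_size)
{
    return (uint32_t)max_mr_size;
}

/* Hypothetical alternative (an assumption, not the real dapl fix):
 * saturate at UINT32_MAX when the 64-bit value does not fit. */
static uint32_t clamp_mr_size_to_u32(uint64_t max_mr_size)
{
    return max_mr_size > UINT32_MAX ? UINT32_MAX : (uint32_t)max_mr_size;
}
```

With a saturating conversion, a device that reports 4 GB or more would advertise the largest 32-bit value instead of 0, while smaller values pass through unchanged.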
[ewg] [GIT PULL ofed-1.4.1] iw_cxgb3/nfsrdma fixes
Vlad,

Please pull from:

git://v...@sofa.openfabrics.org/~swise/scm/ofed-1.4.git ofed_1_4

You'll get these fixes from Jon and I:

Author: Jon Mason j...@opengridcomputing.com
Date: Wed Apr 29 16:03:12 2009 -0500

    NFS-RDMA: DMA direction error on NFS server

    This patch fixes an issue I am seeing on ppc64 when running on that platform as a NFS Server. The incorrect DMA direction causes an EEH event. This patch has already been sent upstream for inclusion into 2.6.30.

    Signed-Off-By: Jon Mason j...@opengridcomputing.com

commit 1f3248b3942427c437db26fec8297c754f085494
Author: Steve Wise sw...@opengridcomputing.com
Date: Wed Apr 29 16:00:43 2009 -0500

    RDMA/cxgb3: Pull in sq flush fix.

    Signed-off-by: Steve Wise sw...@opengridcomputing.com

commit fde3500748351e0b431ebd667f03a6d95c045333
Author: Steve Wise sw...@opengridcomputing.com
Date: Wed Apr 29 16:00:38 2009 -0500

    NFSRDMA: Pull in error paths fix.

    Signed-off-by: Steve Wise sw...@opengridcomputing.com

commit f3a84550b84aa8262821b2114ad353a2b144668c
Author: Steve Wise sw...@opengridcomputing.com
Date: Sun Apr 26 13:44:59 2009 -0500

    NFSRDMA: pull in frmr iova_start truncation fix.

    Signed-off-by: Steve Wise sw...@opengridcomputing.com
Re: [ewg] OFED 1.4.1 RC4 is delayed to Thursday
Hi Tziporet.

My update is that I believe I know what is causing bug 1607, and I'm working on a fix.

Thanks.

-jeff

Steve Wise wrote:

Status update:

I cleaned up some NFSRDMA server crashes that happen when there are asynchronous WR failures. That might help Vu figure out 1571. I think there is a FW issue causing the async failure. But the code shouldn't crash anymore with my latest fix.

But I'd also like 1613, 1616 into ofed-1.4.1:

1613: dapl regression that UNH uncovered. Arlin has a fix.
1616: nfsrdma ppc64 issue uncovered today. Hopefully we can nail this one by EOB Friday.

Should we crank RC4 tomorrow and plan an RC5? Or hold off for a few more days with RC4?

Steve.

Tziporet Koren wrote:

Jon Mason wrote:

On Mon, Apr 27, 2009 at 05:43:05PM +0300, Tziporet Koren wrote:

Hi All

Since there are still a few open critical bugs we delay OFED 1.4.1-RC4 build to Thursday.
Note that we are on vacation on Wed this week (Israel Independence Day)

The bugs that must be fixed:

1607  blo  SLES  jeffrey.c.bec...@nasa.gov    kernel oops during login on sles10 sp2 with OFED-1.4.1-20...
1609  cri  RHEL  sw...@opengridcomputing.com  kernel panic running iozone on x86 system

This was fixed by the patch Steve pushed on Friday. I'll close the bug for him.

Well - it's too late now for us to build and test it

What about bug 1571?

Tziporet
RE: [ewg] OFED 1.4.1 RC4 is delayed to Thursday
I am running unit tests with the dapl fix (#1613) now and can have a new package later tonight. No need to delay for this bug.

-arlin

-----Original Message-----
From: Steve Wise [mailto:sw...@opengridcomputing.com]
Sent: Wednesday, April 29, 2009 2:46 PM
To: Tziporet Koren
Cc: Jon Mason; Vu Pham; ewg@lists.openfabrics.org; Davis, Arlin R; Vladimir Sokolovsky
Subject: Re: [ewg] OFED 1.4.1 RC4 is delayed to Thursday

Status update:

I cleaned up some NFSRDMA server crashes that happen when there are asynchronous WR failures. That might help Vu figure out 1571. I think there is a FW issue causing the async failure. But the code shouldn't crash anymore with my latest fix.

But I'd also like 1613, 1616 into ofed-1.4.1:

1613: dapl regression that UNH uncovered. Arlin has a fix.
1616: nfsrdma ppc64 issue uncovered today. Hopefully we can nail this one by EOB Friday.

Should we crank RC4 tomorrow and plan an RC5? Or hold off for a few more days with RC4?

Steve.

Tziporet Koren wrote:

Jon Mason wrote:

On Mon, Apr 27, 2009 at 05:43:05PM +0300, Tziporet Koren wrote:

Hi All

Since there are still a few open critical bugs we delay OFED 1.4.1-RC4 build to Thursday.
Note that we are on vacation on Wed this week (Israel Independence Day)

The bugs that must be fixed:

1607  blo  SLES  jeffrey.c.bec...@nasa.gov    kernel oops during login on sles10 sp2 with OFED-1.4.1-20...
1609  cri  RHEL  sw...@opengridcomputing.com  kernel panic running iozone on x86 system

This was fixed by the patch Steve pushed on Friday. I'll close the bug for him.

Well - it's too late now for us to build and test it

What about bug 1571?

Tziporet
[ewg] RE: Do you still need the local SA in OFED 1.5?
Subject: Do you still need the local SA in OFED 1.5?

The RDMA/IB CMs do not scale without path record (PR) caching or hard-coding PR parameters.

I'm personally fine removing it from OFED. MPI and other applications are working around SA scaling issues by connecting over sockets anyway.