[ewg] [GIT PULL 00/10] ofed_1_2 - Chelsio Bug Fixes
Vlad,

The following patches are bug fixes to the rdma and low level chelsio drivers for ofed-1.2. All of these patches are upstream in either 2.6.22 or pending for 2.6.23 and need to be pulled into ofed-1.2. I plan to make these available to chelsio customers either through a series of patches, or a full ofa_kernel tarball.

Please pull these from:

http://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2

Thanks,

Steve.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] [PATCH 01/10] iw_cxgb3: ctrl-qp init/clear shouldn't set the gen bit.
iw_cxgb3: ctrl-qp init/clear shouldn't set the gen bit.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/core/cxio_hal.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/core/cxio_hal.c b/drivers/infiniband/hw/cxgb3/core/cxio_hal.c
index 62998d3..9746635 100644
--- a/drivers/infiniband/hw/cxgb3/core/cxio_hal.c
+++ b/drivers/infiniband/hw/cxgb3/core/cxio_hal.c
@@ -162,7 +162,7 @@ int cxio_hal_clear_qp_ctx(struct cxio_rd
 	}
 	wqe = (struct t3_modify_qp_wr *) skb_put(skb, sizeof(*wqe));
 	memset(wqe, 0, sizeof(*wqe));
-	build_fw_riwrh((struct fw_riwrh *) wqe, T3_WR_QP_MOD, 3, 1, qpid, 7);
+	build_fw_riwrh((struct fw_riwrh *) wqe, T3_WR_QP_MOD, 3, 0, qpid, 7);
 	wqe->flags = cpu_to_be32(MODQP_WRITE_EC);
 	sge_cmd = qpid << 8 | 3;
 	wqe->sge_cmd = cpu_to_be64(sge_cmd);
@@ -566,7 +566,7 @@ static int cxio_hal_init_ctrl_qp(struct
 		  V_EC_UP_TOKEN(T3_CTL_QP_TID) | F_EC_VALID)) << 32;
 	wqe = (struct t3_modify_qp_wr *) skb_put(skb, sizeof(*wqe));
 	memset(wqe, 0, sizeof(*wqe));
-	build_fw_riwrh((struct fw_riwrh *) wqe, T3_WR_QP_MOD, 0, 1,
+	build_fw_riwrh((struct fw_riwrh *) wqe, T3_WR_QP_MOD, 0, 0,
 		       T3_CTL_QP_TID, 7);
 	wqe->flags = cpu_to_be32(MODQP_WRITE_EC);
 	sge_cmd = (3ULL << 56) | FW_RI_SGEEC_START << 8 | 3;
[ewg] [PATCH 07/10] cxgb3 - Fix direct XAUI support
cxgb3 - Fix direct XAUI support

Check all lanes for link status on direct XAUI cards.
Don't assume that direct XAUI always uses XGMAC 1.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
---

 drivers/net/cxgb3/ael1002.c |   10 ++++++++--
 drivers/net/cxgb3/regs.h    |    2 ++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/ael1002.c b/drivers/net/cxgb3/ael1002.c
old mode 100755
new mode 100644
index 73a41e6..ee140e6
--- a/drivers/net/cxgb3/ael1002.c
+++ b/drivers/net/cxgb3/ael1002.c
@@ -219,7 +219,13 @@ static int xaui_direct_get_link_status(s
 		unsigned int status;

 		status = t3_read_reg(phy->adapter,
-				     XGM_REG(A_XGM_SERDES_STAT0, phy->addr));
+				     XGM_REG(A_XGM_SERDES_STAT0, phy->addr)) |
+			 t3_read_reg(phy->adapter,
+				     XGM_REG(A_XGM_SERDES_STAT1, phy->addr)) |
+			 t3_read_reg(phy->adapter,
+				     XGM_REG(A_XGM_SERDES_STAT2, phy->addr)) |
+			 t3_read_reg(phy->adapter,
+				     XGM_REG(A_XGM_SERDES_STAT3, phy->addr));
 		*link_ok = !(status & F_LOWSIG0);
 	}
 	if (speed)
@@ -247,5 +253,5 @@ static struct cphy_ops xaui_direct_ops =
 void t3_xaui_direct_phy_prep(struct cphy *phy, struct adapter *adapter,
 			     int phy_addr, const struct mdio_ops *mdio_ops)
 {
-	cphy_init(phy, adapter, 1, &xaui_direct_ops, mdio_ops);
+	cphy_init(phy, adapter, phy_addr, &xaui_direct_ops, mdio_ops);
 }
diff --git a/drivers/net/cxgb3/regs.h b/drivers/net/cxgb3/regs.h
index e5a5534..bf9d6be 100644
--- a/drivers/net/cxgb3/regs.h
+++ b/drivers/net/cxgb3/regs.h
@@ -2128,6 +2128,8 @@ #define V_RESETPLL01(x) ((x) << S_RESETP
 #define F_RESETPLL01    V_RESETPLL01(1U)

 #define A_XGM_SERDES_STAT0 0x8f0
+#define A_XGM_SERDES_STAT1 0x8f4
+#define A_XGM_SERDES_STAT2 0x8f8

 #define S_LOWSIG0    0
 #define V_LOWSIG0(x) ((x) << S_LOWSIG0)
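The lane check this patch introduces can be sketched outside the kernel roughly as follows. This is a minimal user-space illustration, not the driver code: xaui_link_ok(), NUM_LANES and the array of pre-read status words are hypothetical stand-ins for the four t3_read_reg() calls, and F_LOWSIG0 is assumed to be bit 0, following the S_LOWSIG0 definition visible in the regs.h context of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_LANES 4      /* hypothetical: one SERDES status word per XAUI lane */
#define F_LOWSIG0 0x1u   /* assumed: low-signal flag in bit 0 (S_LOWSIG0 is 0) */

/*
 * Hypothetical helper: lane_stat[] holds the already-read contents of the
 * four per-lane status registers. The words are OR-ed together, so a
 * low-signal flag on any single lane marks the whole link as down.
 * Returns 1 when the link is considered up, 0 otherwise.
 */
static int xaui_link_ok(const uint32_t lane_stat[NUM_LANES])
{
	uint32_t status = 0;
	size_t i;

	assert(lane_stat != NULL);
	for (i = 0; i < NUM_LANES; i++)
		status |= lane_stat[i];

	return !(status & F_LOWSIG0);
}
```

Before the patch only the first status word was consulted, so in terms of this sketch a low-signal flag on lanes 1 to 3 would have gone unnoticed.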
[ewg] [PATCH 09/10] cxgb3 - MAC watchdog update
cxgb3 - MAC watchdog update

Fix variables initialization and usage in the MAC watchdog.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
---

 drivers/net/cxgb3/xgmac.c |   31 +++++++++++++++++++++----------
 1 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/net/cxgb3/xgmac.c b/drivers/net/cxgb3/xgmac.c
index 16cadba..b261be1 100644
--- a/drivers/net/cxgb3/xgmac.c
+++ b/drivers/net/cxgb3/xgmac.c
@@ -501,6 +501,10 @@ int t3b2_mac_watchdog_task(struct cmac *
 	unsigned int rx_xcnt;
 	int status;

+	status = 0;
+	tx_xcnt = 1;		/* By default tx_xcnt is making progress */
+	tx_tcnt = mac->tx_tcnt;	/* If tx_mcnt is progressing ignore tx_tcnt */
+	rx_xcnt = 1;		/* By default rx_xcnt is making progress */
 	if (tx_mcnt == mac->tx_mcnt) {
 		tx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap,
 						A_XGM_TX_SPI4_SOP_EOP_CNT +
@@ -511,37 +515,44 @@ int t3b2_mac_watchdog_task(struct cmac *
 			tx_tcnt = (G_TXDROPCNTCH0RCVD(t3_read_reg(adap,
 				   A_TP_PIO_DATA)));
 		} else {
-			mac->toggle_cnt = 0;
-			return 0;
+			goto rxcheck;
 		}
 	} else {
 		mac->toggle_cnt = 0;
-		return 0;
+		goto rxcheck;
 	}

 	if (((tx_tcnt != mac->tx_tcnt) &&
 	     (tx_xcnt == 0) && (mac->tx_xcnt == 0)) ||
 	    ((mac->tx_mcnt == tx_mcnt) &&
 	     (tx_xcnt != 0) && (mac->tx_xcnt != 0))) {
-		if (mac->toggle_cnt > 4)
+		if (mac->toggle_cnt > 4) {
 			status = 2;
-		else
+			goto out;
+		} else {
 			status = 1;
+			goto out;
+		}
 	} else {
 		mac->toggle_cnt = 0;
-		return 0;
+		goto rxcheck;
 	}

+rxcheck:
 	if (rx_mcnt != mac->rx_mcnt)
 		rx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap,
 						A_XGM_RX_SPI4_SOP_EOP_CNT +
 						mac->offset)));
-	else
-		return 0;
+	else
+		goto out;

-	if (mac->rx_mcnt != s->rx_frames && rx_xcnt == 0 && mac->rx_xcnt == 0)
+	if (mac->rx_mcnt != s->rx_frames && rx_xcnt == 0 &&
+	    mac->rx_xcnt == 0) {
 		status = 2;
-
+		goto out;
+	}
+
+out:
 	mac->tx_tcnt = tx_tcnt;
 	mac->tx_xcnt = tx_xcnt;
 	mac->tx_mcnt = s->tx_frames;
[ewg] Re: Why isn't kfifo get built with ib-core for RH4?
Michael S. Tsirkin wrote:
> Quoting Erez Zilber [EMAIL PROTECTED]:
> Subject: Why isn't kfifo get built with ib-core for RH4?
>
>> Michael,
>>
>> I saw that kfifo that was built with ib-core for RH4 was removed:
>> http://www2.openfabrics.org/git/?p=~vlad/ofed_kernel.git;a=commit;h=afe4186a2b383e58d9937d0b2fe2ddfb03cd7268
>
> I can't think of a reason. Likely just an oversight.
>
>> Why was it removed? open-iscsi cannot be loaded without it. If nobody else is using it,
>
> This was added here: ac758ec6bff062844a5a42141aa5da492b2cb02b
> so I think it's needed by Chelsio. Steve?

Yes the chelsio rdma driver uses kfifos.

>> I can move it to iscsi_scsi_addons.patch and build it with libiscsi.
>
> I think we should just re-add it in core. Patch?
>
> While we are at it: as a separate cleanup, can you please remove the file from kernel_addons/./backport and just check out the file from kernel/kfifo.c. Just like we do with e.g. klist.c. OK?
[ewg] Re: [ofa-general] reminder: OFED meeting today at 9am PST
Am I missing the call info? I tried an older conf id, and it didn't work. Can you please post the conf call info along with the meeting notification?

Thanks,

Steve.

Tziporet Koren wrote:
> Hi All,
>
> We will have our bi-weekly OFED meeting today at 9am PST
>
> Agenda:
> - Status update
> - Bugzilla cleanup
>
> If you have more agenda items please send them
>
> Tziporet
>
> ___
> general mailing list
> [EMAIL PROTECTED]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [ewg] Re: [ofa-general] reminder: OFED meeting today at 9am PST
Jeff Squyres wrote:
> Yes, you missed it; the call was over about half an hour ago. I [re-]posted the dial-in info about 3 hours before the call this morning on the ewg list.

I see. That's why I missed it. I'm not on the ewg list. Are all attendees expected to be on the ewg list?

Steve.

> On Jul 30, 2007, at 12:55 PM, Steve Wise wrote:
>
>> Am I missing the call info? I tried an older conf id, and it didn't work. Can you please post the conf call info along with the meeting notification?
>>
>> Thanks,
>>
>> Steve.
>>
>> Tziporet Koren wrote:
>>> Hi All,
>>>
>>> We will have our bi-weekly OFED meeting today at 9am PST
>>>
>>> Agenda:
>>> - Status update
>>> - Bugzilla cleanup
>>>
>>> If you have more agenda items please send them
>>>
>>> Tziporet
[ewg] patches for 1.2.c
Guys,

I have 2 more patches to go in ofed_1_2/ofed_1_2_c. Is there some grand scheme to the naming of kernel_patches/fixes/* for 1.2.c? I noticed a slew of new files for the post-2.6.22 fixes, and wondered if there is a naming scheme?

Or should I just post a patch for the ofed_1_2 branch and let you all create the ofed_1_2_c kernel_patches/fixes/ patch file?

Thanks,

Steve.
[ewg] Re: [ofa-general] Re: ofa_1_2_c_kernel 20070802-0201 daily build status
Also, is something broken in the ofed_1_2 branch? I cannot even build against the local kernel on the ofa server using the ~vlad/ofed_1_2/linux-2.6 repository.

[EMAIL PROTECTED]:~/git/ofabuild$ env git_url=~vlad/ofed_1_2/linux-2.6 git_branch=ofed_1_2 CHECK_LOCAL=yes CHECK_KERNEL_ORG=no CHECK_CROSS=no ~swise/git/ofabuild/build_ofa_kernel.sh
mkdir -p /home/swise/tmp/ofa_1_2_c_kernel-20070802-0912
~/tmp/ofa_1_2_c_kernel-20070802-0912 ~/git/ofabuild
git clone -s --bare --reference /home/vlad/scm/ofed_1_2 /home/vlad/ofed_1_2/linux-2.6 .git
git checkout ofed_1_2 ofed_scripts/ofed_checkout.sh
ofed_scripts/ofed_checkout.sh ofed_1_2
git update-ref HEAD ofed_1_2
Git: /home/vlad/ofed_1_2/linux-2.6 ofed_1_2 commit 020bfb400c759ba89ffb0b13c41f2ca50181aebe
~/git/ofabuild
cp -a /home/swise/tmp/ofa_1_2_c_kernel-20070802-0912 /home/swise/builds/ofa_1_2_c_kernel/ofa_1_2_c_kernel-20070802-0912
rm -rf /home/swise/tmp/ofa_1_2_c_kernel-20070802-0912
Build failed on i686 with 2.6.15-23-server
Re: [ewg] Re: [ofa-general] Re: ofa_1_2_c_kernel 20070802-0201 daily build status
I've narrowed my problem down to this: I'm cloning vlad's ofabuild repos, but it only gets me the master branch. So if I clone via:

git clone git://git.openfabrics.org/~vlad/ofabuild.git

or

git clone -s /home/vlad/scm/ofabuild.git

My clone repos only has the master branch. There should be master, ofed_1_2, ofed_1_2_c, and origin.

If I clone from my local system over the net, I _get_ all the branches!

Anybody know why local clones on the ofa build server are not pulling all the branches? Maybe I'm abusing git?

Thanks,

Steve.

Michael S. Tsirkin wrote:
> Look here: /home/vlad/scripts/ofed_1_2
>
> Quoting Steve Wise [EMAIL PROTECTED]:
> Subject: Re: [ewg] Re: [ofa-general] Re: ofa_1_2_c_kernel 20070802-0201 daily build status
>
>> I'm havin' a bad day. Can you all help me?
>>
>> My normal process is to use the build_ofa_kernel.sh script from the ofabuild repository to build against all ofed kernels. But that script in the master branch of the ofabuild repository now assumes 1.2.c because it tries to configure in the connectx device. There aren't ofed_1_2 and ofed_1_2_c branches in that repos for tree-specific build scripts.
>>
>> S: What exactly should I be using to do cross-compile builds of my patched trees before submitting patches for inclusion into ofed?
>>
>> Thanks and sorry for the pain. And if there is an RTFM somewhere that I should be reading, feel free to say RTFM. :)
>>
>> Steve.
Re: [ewg] Re: [ofa-general] Re: ofa_1_2_c_kernel 20070802-0201 daily build status
Sean Hefty wrote:
>> If I clone from my local system over the net, I _get_ all the branches!
>>
>> Anybody know why local clones on the ofa build server are not pulling all the branches? Maybe I'm abusing git?
>
> It sounds like a difference between git versions. Older git versions brought in remote branches such that 'git branch' would show them. (This causes problems if a remote branch conflicts with a local branch.) With the newer git version (what's on the ofqmanqa server), you need to use 'git branch -r' to see all of the actual branches.
>
> - Sean

Yea, that's it. But how do I checkout the remote branch? The man page and 'git help' don't even show the -r option...

Steve.
[ewg] [GIT PULL ofed-1.2/ofed-1.2.c] Chelsio Bug Fixes
Vlad,

Please include these two patches into ofed 1.2 and 1.2.c. The patches are direct changes to the ofed_1_2 branch and patch files in kernel_patches/fixes for ofed_1_2_c. These have been accepted upstream, and address OFED bugs 696 and 698.

Please pull these from:

http://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2

and

http://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2_c

Thanks,

Steve.

Shortlog:

iw_cxgb3: Make the iw_cxgb3 module parameters writable.
iw_cxgb3: Always call low level send function via cxgb3_ofld_send().
Re: [ewg] OFED 1.2.c status plans
Tziporet,

Can we make the change to 1.2.5? This should probably include:

- change builds/connectx to builds/ofed-1.2.5 (or just add a link)
- change build names from 1.2.c to 1.2.5

Scott Weitzenkamp (sweitzen) wrote:
> Cisco has been testing 1.2.c-10 IPoIB/SDP/MPI successfully on a 32-node cluster. We are still working on tvflash, though.
>
> Scott
>
> *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Tziporet Koren
> *Sent:* Tuesday, August 07, 2007 12:06 PM
> *To:* EWG
> *Cc:* OpenFabrics General
> *Subject:* [ewg] OFED 1.2.c status plans
>
> Hi All,
>
> I wish to update on OFED 1.2.c status and plans to synch everybody:
>
> * OFED 1.2.c-11 is going out tomorrow
> * This release should be the base for the GA release
> * Need an approval from Steve (Chelsio) & Nam (IBM) that everything is in place from their perspective. Also please send me the release notes for ehca and cxgb3
> * Need an approval from the companies that are testing this release that it can go to GA
>
> From Mellanox perspective (mlx4 readiness) we are ready for GA.
>
> I have one question: do we prefer to stay with the name 1.2.c or 1.2.5?
>
> Thanks,
> Tziporet
[ewg] Re: [ofa-general] RE: OFA website edits
Hey Jeff,

Can I get ownership of /var/www/openfabrics.org/downloads/cxgb3? That way I can publish libcxgb3 releases, which I maintain. Right now ralfc owns this instead of swise.

Thanks,

Steve.

Jeff Becker wrote:
> Hi. I created most of the requested directory/owner pairs in /var/www/openfabrics.org/downloads. I left out the various MPI directories, figuring the appropriate web pages will be linked from somewhere (possibly the downloads web page). I gave Stan Smith an account. Stan, please contact me to get the account info. I'm still working out how to do the dynamic web page stuff, but at least people can start populating their directories. Thanks.
>
> -jeff
>
> On 7/25/07, Arlin Davis [EMAIL PROTECTED] wrote:
>> I would like to propose adding project directories under http://www.openfabrics.org/downloads/ where appropriate and give maintainers access. For example:
>>
>> Jeff, please add the following directories with maintainer access as follow (or grant access at a maintainer group level):
>>
>> http://www.openfabrics.org/downloads/verbs (rdreier)
>> http://www.openfabrics.org/downloads/rdmacm (shefty)
>> http://www.openfabrics.org/downloads/dapl (ardavis)
>> http://www.openfabrics.org/downloads/sdp (eitan)
>> http://www.openfabrics.org/downloads/utils (eitan)
>> http://www.openfabrics.org/downloads/management (sashak)
>> http://www.openfabrics.org/downloads/OFED (vlad)
>> http://www.openfabrics.org/downloads/archives (vlad)
>> http://www.openfabrics.org/downloads/WinOF (ssmith) (Stan Smith will need an account)
>> http://www.openfabrics.org/downloads/hw/mthca (rdreir)
>> http://www.openfabrics.org/downloads/hw/mlx4 (rdreir)
>> http://www.openfabrics.org/downloads/hw/ehca (raisch)
>> http://www.openfabrics.org/downloads/hw/ipath (ralphc)
>> http://www.openfabrics.org/downloads/hw/cxgb3 (ralphc)
>> http://www.openfabrics.org/downloads/mpi/mvapich (pasha)
>> http://www.openfabrics.org/downloads/mpi/mvapich2 (rowland)
>> http://www.openfabrics.org/downloads/mpi/openmpi (jsquyres)
>>
>> Let us know when these directories are created and the maintainers, who want to expose their packages via the webpage, will create a README that details the contents of the directory along with WEB_README that provides a short description for the webpage.
>>
>> Will this format allow you to auto configure the download webpage sufficiently? The idea is to only add links/descriptions to those project sub-directories with WEB_README files present.
>>
>> Please advise if something on the list is wrong or we missed a project.
>>
>> Thanks,
>>
>> -arlin
[ewg] [Fwd: [GIT PULL] ofed-1.3 - libcxgb3 refresh]
resending to lists...

-------- Original Message --------
Subject: [GIT PULL] ofed-1.3 - libcxgb3 refresh
Date: Tue, 21 Aug 2007 09:17:23 -0500
From: Steve Wise [EMAIL PROTECTED]
To: Vladimir Sokolovsky [EMAIL PROTECTED]
CC: OpenFabricsEWG ewg@lists.openfabrics.org, OpenFabrics General [EMAIL PROTECTED]

Vlad,

I have a new release of libcxgb3 that I want included in ofed-1.3. Changes since ofed-1.2 are 1 bug fix (bug 703), some cleanup on the spec file as well as adding a ChangeLog file (bug 707). This is release 1.0.1 of libcxgb3...

Pull from git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3

Thanks,

Steve.
[ewg] [PATCH RFC] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts with the host stack.
Roland/All,

Here is the first swipe at keeping iwarp connections on their own ip addresses to avoid conflicts with the host stack.

- this is a request for comments
- it is not yet tested fully (tested a prototype of the initial concept)
- still needs serialization/locking
- stays in our RDMA sandbox ;-)

For background reading (if you dare), see:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg05162.html

and

http://www.mail-archive.com/[EMAIL PROTECTED]/msg44312.html

Also: I'm on vacation starting tomorrow until Tuesday 9/4. I'll address comments when I return...

Steve.

---

iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts with the host stack.

Design:

The sysadmin creates for iwarp use only alias interfaces of the form devname:iw* where devname is the native interface name (eg eth0) for the iwarp netdev device. The alias label can be anything starting with iw. The iw immediately after the ':' is the key used by the iwarp driver.

EG:

	ifconfig eth0 192.168.70.123 up
	ifconfig eth0:iw1 192.168.71.123 up
	ifconfig eth0:iw2 192.168.72.123 up

In the above example, 192.168.70/24 is for TCP traffic, while 192.168.71/24 and 192.168.72/24 are for iWARP/RDMA use.

The rdma-only interface must be on its own subnet. This allows routing all rdma traffic onto this interface.

The iWARP driver must translate all listens on address 0.0.0.0 to the set of rdma-only ip addresses. This prevents incoming connects to the TCP ipaddresses from going up the rdma stack.

Implementation Details:

- The iwarp driver registers for inetaddr events via register_inetaddr_notifier(). This allows tracking the iwarp-only addresses/subnets as they get added and deleted. The iwarp driver maintains a list of the current iwarp-only addresses.

- The iwarp driver builds the list of iwarp-only addresses for its devices at module insert time. This is needed because the inetaddr notifier callbacks don't replay address-add events when someone registers. So the driver must build the initial list at module load time.

- When a listen is done on address 0.0.0.0, then the iwarp driver must translate that into a set of listens on the iwarp-only addresses.

- When a new iwarp-only address is added or removed, the iwarp driver must traverse the set of listening endpoints and update them accordingly. This allows an application to bind to 0.0.0.0 prior to the iwarp-only interfaces being configured. It also allows changing the iwarp-only set of addresses and getting the expected behavior for apps already bound to 0.0.0.0.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/iwch.c    |  116 +
 drivers/infiniband/hw/cxgb3/iwch.h    |   10 +
 drivers/infiniband/hw/cxgb3/iwch_cm.c |  229 ++---
 drivers/infiniband/hw/cxgb3/iwch_cm.h |   11 +-
 4 files changed, 318 insertions(+), 48 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch.c b/drivers/infiniband/hw/cxgb3/iwch.c
index 0315c9d..da57b77 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.c
+++ b/drivers/infiniband/hw/cxgb3/iwch.c
@@ -63,6 +63,115 @@ struct cxgb3_client t3c_client = {
 static LIST_HEAD(dev_list);
 static DEFINE_MUTEX(dev_mutex);

+static void insert_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
+{
+	struct iwch_addrlist *addr;
+
+	addr = kmalloc(sizeof *addr, GFP_KERNEL);
+	if (!addr) {
+		printk(KERN_ERR MOD "%s - failed to alloc memory!\n",
+		       __FUNCTION__);
+		return;
+	}
+	addr->ifa = ifa;
+	list_add_tail(&addr->entry, &rnicp->addrlist);
+}
+
+static void remove_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
+{
+	struct iwch_addrlist *addr, *tmp;
+
+	list_for_each_entry_safe(addr, tmp, &rnicp->addrlist, entry) {
+		if (addr->ifa == ifa) {
+			list_del_init(&addr->entry);
+			kfree(addr);
+			return;
+		}
+	}
+}
+
+static int netdev_is_ours(struct iwch_dev *rnicp, struct net_device *netdev)
+{
+	int i;
+
+	for (i = 0; i < rnicp->rdev.port_info.nports; i++)
+		if (netdev == rnicp->rdev.port_info.lldevs[i])
+			return 1;
+	return 0;
+}
+
+static inline int is_iwarp_label(char *label)
+{
+	char *colon;
+
+	colon = strchr(label, ':');
+	if (colon && !strncmp(colon+1, "iw", 2))
+		return 1;
+	return 0;
+}
+
+static int nb_callback(struct notifier_block *self, unsigned long event,
+		       void *ctx)
+{
+	struct in_ifaddr *ifa = ctx;
+	struct iwch_dev *rnicp = container_of(self, struct iwch_dev, nb);
+
+	printk(KERN_INFO "%s rnicp %p event %lx\n", __FUNCTION__, rnicp, event);
+
+	switch (event) {
+	case NETDEV_UP:
+		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
+		    is_iwarp_label(ifa->ifa_label)) {
+			printk
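The label test and the 0.0.0.0 translation described in the Design section of this RFC can be tried out in user space with something like the sketch below. It is only an illustration under stated assumptions: is_iwarp_label() mirrors the check in the patch, while expand_listen(), ADDR_ANY and the flat array of addresses are hypothetical names standing in for the driver's address list, and passing a specific (non-wildcard) address through unchanged is an assumption the RFC text does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_ANY 0u	/* 0.0.0.0 in host byte order (hypothetical name) */

/* Returns 1 if the alias label has "iw" right after the ':', e.g. "eth0:iw1". */
static int is_iwarp_label(const char *label)
{
	const char *colon;

	assert(label != NULL);
	colon = strchr(label, ':');
	return colon != NULL && strncmp(colon + 1, "iw", 2) == 0;
}

/*
 * Hypothetical helper: expand one listen request into the addresses to
 * listen on. A wildcard listen becomes one listen per iWARP-only address;
 * any other address is passed through unchanged (an assumption).
 * Returns the number of entries written to out, at most out_cap.
 */
static size_t expand_listen(uint32_t addr, const uint32_t *iw_addrs,
			    size_t n_iw, uint32_t *out, size_t out_cap)
{
	size_t i, n = 0;

	assert(out != NULL);
	if (addr != ADDR_ANY) {
		if (out_cap > 0)
			out[n++] = addr;
		return n;
	}
	for (i = 0; i < n_iw && n < out_cap; i++)
		out[n++] = iw_addrs[i];
	return n;
}
```

With the ifconfig example from the Design section, a listen on 0.0.0.0 would expand to 192.168.71.123 and 192.168.72.123 and never to 192.168.70.123, which stays with the host TCP stack. If no iWARP-only alias is configured yet, the wildcard expands to nothing until an address-add event arrives.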
[ewg] Re: [PATCH RFC] iw_cxgb3: Support iwarp-only interfaces to avoid 4-tuple conflicts with the host stack.
Steve Wise wrote:
> Roland Dreier wrote:
>>>> What's wrong with my suggestion of having the iwarp driver create an iwX interface to go with the normal ethX interface? It seems simpler to me, and there's a somewhat similar precedent with how mac80211 devices create both wlan0 and wmaster0 interfaces.
>>>>
>>>> - R.
>>>
>>> It seemed much more painful for me to implement. :-)
>>>
>>> I'll look into this, but I think for this to be done, the changes must be in the cxgb3 driver, not the rdma driver, because the guts of the netdev struct are all private to cxgb3. Remember that this interface needs to still do non TCP traffic (like ARP and UDP)... Maybe you have something in mind here that I'm not thinking about?
>>
>> No, I was just spouting off.
>
> At least someone is looking at my patch. ;-)
>
>> But the whole create a magic alias seems kind of unfriendly to the user. Maybe as you said, the cxgb3 net driver could create the alias for the iw_cxgb3 driver?
>
> I agree that it is not very user friendly.
>
> My current patch just utilizes the IP address alias logic in the IP stack. So when you do 'ifconfig ethxx:blah ipaddr up' it creates a struct in_ifaddr which contains a ptr to the real struct net_device that services this alias. However, from what I can tell, I cannot just create one of these without binding an address. So the driver cannot create the alias interface until it knows the ipaddr/netmask/etc. IE: if you say 'ifconfig ethxx:blah up' it fails... You must supply an address to get one of these created.
>
> To have the cxgb3 driver create something like 'iw0', I think it would need to create a full net_device struct. This makes the change much more complex. But perhaps it's the right thing to do...
>
> Steve.

Also, I could defer registering the device with the rdma core until the alias interface is created by the user. Thus the T3 device wouldn't be available for use until the ethxx:iw interface is created. And I could log a WARN or INFO message if the iw_cxgb3 module is loaded and no ethxx:iw alias exists. This would help clue in the user...

Steve.
Re: [ewg] Agenda for the OFED meeting today
Hey Tziporet,

I cannot attend today's call.

For the chelsio drivers, there will be a series of patches to be pulled into ofed-1.3 for the chelsio cxgb3 driver. They have been submitted upstream and ACKed by Garzik but he hasn't applied all of them yet. Once they are in his upstream branch, I'll pull them in for ofed-1.3 and ask Vlad (or you/michael in his absence) to pull these in. In addition, I want to pull these same patches into ofed-1.2.5 so that tree has the latest chelsio fixes as well.

For the chelsio rdma driver iw_cxgb3, there will be a big patch to fix our port space issue. It is still under development and review, however.

Steve.

Tziporet Koren wrote:
> Agenda for the OFED meeting today:
>
> 1. Review OFED 1.3 features status
>    main features that need update:
>    NetEffect - done
>    QoS: OSM - done QoS - need to merge Sean patches to the kernel
>    XRC - 90%
>    IPoIB: stateless offloads - 90%
>    IPoIB: enable IGMP - ??
>    RDS - RDMA API - done
>    QLVNIC update - done
>    SDP: Keepalive - done; Asynch IO - done, Zero Copy - 80%
>    Bonding -- ??
>    Management - ??
> 2. Decide on feature freeze date (based on the status)
> 3. Close supported OS:
>    Suggestion:
>    * kernel.org: kernel 2.6.23
>    * Novell: SLES 10; SLES 10 SP1
>    * Redhat: RHEL 4 (up4 and up5); RHEL 5 - Do we want up1 too?
>    * Free distros (Fedora, OpenSuSE, Ubuntu) - basic testing only
>
> Tziporet Koren
> Software Director
> Mellanox Technologies
> mailto: [EMAIL PROTECTED]
> Tel +972-4-9097200, ext 380
[ewg] [GIT PULL ofed-1.3] cxgb3 bug fixes
For ofed-1.3, please pull from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

The 1.3 patch series is identical to the ofed_1_2_c series except that the first patch, 0029-*, isn't needed since it's already in ofed-1.3 from 2.6.23.

Thanks,

Steve.

Steve Wise wrote:
> Vlad (Michael/Tziporet in Vlad's absence),
>
> Please integrate the following cxgb3 bug fixes into ofed-1.2.5. All of these patches are either in 2.6.23 or merged into Jeff Garzik's upstream branch of netdev-2.6 and will go into 2.6.24. Chelsio recommends we update ofed-1.2.5 and ofed-1.3 with all of these fixes.
>
> I'll send another email with the ofed-1.3 changes as they will be slightly different.
>
> Please pull the ofed_1_2_c changes from:
>
> git://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2_c
>
> The patch files added to kernel_patches/fixes include:
>
> [EMAIL PROTECTED]:~/git/ofed-1.2.5 stg series
> + 0029-cxgb3-engine-microcode-load
> + 0030-cxgb3-MAC-workaround-update
> + 0031-cxgb3-Update-rx-coalescing-length
> + 0032-cxgb3-SGE-doorbell-overflow-warning
> + 0033-cxgb3-use-immediate-data-for-offload-Tx
> + 0034-cxgb3-Expose-HW-memory-page-info
> + 0035-cxgb3-tighten-checks-on-TID-values
> + 0036-cxgb3-Fatal-error-update
> + 0037-cxgb3-log-adapter-serial-number
> + 0038-cxgb3-Update-internal-memory-management
> + 0039-cxgb3-update-firmware-version
> + 0040-cxgb3-log-and-clear-PEX-errors
> + 0041-cxgb3-remove-false-positive-in-xgmac-workaround
> + 0042-cxgb3-Set-the-CQ_ERR-bit-in-CQ-contexts
> + 0043-cxgb3-CQ-context-operations-time-out-too-soon
> + 0044-cxgb3-Add-T3C-rev
> + 0045-cxgb3-Update-engine-microcode-version
> > 0046-cxgb3-driver-version
>
> Steve.
[ewg] Re: [PATCH] RDMA/CMA: Use neigh_event_send() to initiate neighbour discovery.
Michael, can you pull this patch into ofed-1.2.5 and ofed-1.3? Or would you want me to push it into my git tree for you to pull from?

Thanks,

Steve.

Roland Dreier wrote:
>> Roland - can you please queue this up for 2.6.24?
>
> Done, thanks.
[ewg] Re: [GIT PULL] ofed-1.2.5 / ofed-1.3 - new libcxgb3 release v1.0.2
Michael S. Tsirkin wrote:
> Quoting Steve Wise [EMAIL PROTECTED]:
> Subject: [GIT PULL] ofed-1.2.5 / ofed-1.3 - new libcxgb3 release v1.0.2
>
>> Please pull the latest from my libcxgb3 git repos to update the ofed-1.2.5 and ofed-1.3 libcxgb3 release. This will update to version 1.0.2 of libcxgb3 which fixes a doorbell issue on big-endian platforms.
>>
>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5

Go look at http://www.openfabrics.org/git/?p=ofed_1_2_5/libcxgb3.git;a=summary

It has a ofed_1_2_5 branch. I believe Vlad setup the build scripts to handle this. Yes?

> This looks wrong. 1.2.X releases are done from ofed_1_2 branch. 1.2.5 is just a tag.

What do you want me to do?

>> and
>>
>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3
>
> OK for that one.
Re: [ofa-general] Re: [ewg] OFED teleconference today
I cannot make the meeting today. I vote for 2.6.24 base. There is still the outstanding iwarp port space issue that will need to be pulled into ofed-1.3 when it finalizes. But it's a bug fix really, so not a new feature I guess.

Tziporet Koren wrote:
> Jeff Squyres wrote:
>> Friendly reminder: the OFED teleconference is several hours from now (Monday, September 24, 2007).
>>
>> Noon US eastern / 9am US Pacific / -=6pm Israel=-
>>
>> 1. Monday, Sep 24, code 210062024 (***TODAY***)
>
> Agenda:
>
> 1. Agree on the new OFED 1.3 schedule:
>    * Feature freeze - Sep 25
>    * Alpha release - Oct 1
>    * Beta release - Oct 17 (may change according to 2.6.24 rc1 availability)
>    * RC1 - Oct 24
>    * RC2 - Nov 7
>    * RC3 - Nov 20
>    * RC4 - Dec 4
>    * GA release - Dec 18
>
> 2. Agree to move to kernel base 2.6.24
>    Start with what we have now (2.6.23) and move to 2.6.24 when RC1 is available. This will reduce many patches and with the new timeline seems more appropriate.
>
> Please send if you have any other agenda items
>
> Tziporet
Re: [ewg] Re: [PATCH] RDMA/CMA: Use neigh_event_send() to initiate neighbour discovery.
Michael,

Have you pulled this in yet? I want to close out the bug I have open...

Thanks,

Steve.

Steve Wise wrote:
> Michael S. Tsirkin wrote:
>> Yes, please push this into your git tree (and please verify that cross-build to all OS-es passes).
>
> done!
>
> git://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2_c
>
>> Further, please do it this way: add the patch in ofed-1.2.5 and then merge 1.2.5 into 1.3.
>
> done!
>
> git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel
>
> Steve.
Re: [ewg] libcxgb3-1.0.3 available for ofed-1.2.5 and ofed-1.3
Thanks Vlad,

Can you crank an ofed-1.2.5 development build too?

Thanks,
Steve.

Vladimir Sokolovsky wrote:
> Steve Wise wrote:
>> Vlad/Tziporet,
>>
>> Can you please pull version 1.0.3 of libcxgb3 for inclusion in ofed-1.2.5 and ofed-1.3? It contains a bug fix for older kernels like RHEL4U4.
>>
>> You can use the master branch for both releases:
>>
>> git://git.openfabrics.org/~swise/libcxgb3.git master
>>
>> Also, please update the spec file you're using to reflect the release (1.0.3). The spec file in the libcxgb3 git tree should be correct.
>>
>> Thanks,
>> Steve.
>
> Done,
>
> Regards,
> Vladimir
Re: [ewg] libcxgb3-1.0.3 available for ofed-1.2.5 and ofed-1.3
Hey Vlad,

The libcxgb3 rpms built by this ofed-1.2.5 release are still named libcxgb3*-1.0.1 instead of 1.0.3. Can you update your spec files to indicate that the library is release 1.0.3? You'll also need to update the ofed-1.3 spec file, I guess.

Thanks,
Steve.

Vladimir Sokolovsky wrote:
> Steve Wise wrote:
>> Thanks Vlad,
>>
>> Can you crank an ofed-1.2.5 development build too?
>>
>> Thanks,
>> Steve.
>
> Done:
> http://www.openfabrics.org/builds/connectx/OFED-1.2.5-20071009-0955.tgz
>
> Regards,
> Vladimir
Re: [ofa-general] Re: [ewg] libcxgb3-1.0.3 available for ofed-1.2.5 and ofed-1.3
Ok, can you re-pull to get the configure.in change? Sorry for the pain.

Steve.

Steve Wise wrote:
> oops. Lemme fix this up then we'll re-pull.
>
> Thanks,
> Steve.
>
> Vladimir Sokolovsky wrote:
>> Steve Wise wrote:
>>> Hey Vlad,
>>>
>>> The libcxgb3 rpms built by this ofed-1.2.5 release are still named libcxgb3*-1.0.1 instead of 1.0.3. Can you update your spec files to indicate that the library is release 1.0.3? You'll need to also update the ofed-1.3 spec file I guess.
>>>
>>> Thanks,
>>> Steve.
>>
>> Hi Steve,
>> You should update libcxgb3 version in the configure.in file:
>>
>> Update version to 1.0.3
>>
>> Signed-off-by: Vladimir Sokolovsky [EMAIL PROTECTED]
>> ---
>> diff --git a/configure.in b/configure.in
>> index 6f916d3..15406b7 100644
>> --- a/configure.in
>> +++ b/configure.in
>> @@ -1,11 +1,11 @@
>>  dnl Process this file with autoconf to produce a configure script.
>>  AC_PREREQ(2.57)
>> -AC_INIT(libcxgb3, 1.0.1, [EMAIL PROTECTED])
>> +AC_INIT(libcxgb3, 1.0.3, [EMAIL PROTECTED])
>>  AC_CONFIG_SRCDIR([src/iwch.h])
>>  AC_CONFIG_AUX_DIR(config)
>>  AM_CONFIG_HEADER(config.h)
>> -AM_INIT_AUTOMAKE(libcxgb3, 1.0.1)
>> +AM_INIT_AUTOMAKE(libcxgb3, 1.0.3)
>>  AM_PROG_LIBTOOL
>>  AC_ARG_ENABLE(libcheck, [ --disable-libcheck do not test for presence of ib libraries],
>>
>> Regards,
>> Vladimir
[ewg] iw_cxgb3 genalloc memory allocator dependency
The iw_cxgb3 module depends on the Linux kernel genalloc service. This service gets compiled into the kernel _only_ if another subsystem has a config dependency on the genalloc module (CONFIG_GENERIC_ALLOCATOR). In addition, there are only two users of this service: iw_cxgb3 and some IA64 subsystem. So on a kernel.org kernel that has iw_cxgb3, genalloc gets built into the kernel when you enable the iw_cxgb3 module. But on non-IA64 platforms that do not have iw_cxgb3 configured in, the genalloc code is not pulled into the kernel.

The side effect of this is that if one tries to compile OFED on a kernel.org kernel that doesn't have iw_cxgb3 configured, the genalloc service is not available and ofed doesn't compile.

Now, ofed has a backport of genalloc to support older kernels that do not even have the genalloc service. But we don't pull in that backport for kernels that do have genalloc. Thus the problem...

I'm looking for suggestions on how, and if, we should do something about this. Here are some ideas:

1) Always build in our own genalloc service as a backport. This solves the problem, but duplicates the code if it is indeed built into the kernel.

2) Detect at ofed config time whether we need the genalloc service or not, then pull in the backport as needed. This one is nice in that it won't replicate the genalloc code when not needed, but at the expense of adding complexity to the configure script for ofed. I'm not really sure how to do it at all. But maybe vlad knows how?

Thoughts?

BTW: bug 767 opened to track this.

Thanks,
Steve.
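[Editorial sketch] As a rough illustration of idea 2 in the message above, the configure-time decision comes down to checking whether the target kernel's .config already enables CONFIG_GENERIC_ALLOCATOR. The following is a minimal userspace C sketch of that check, not the actual OFED configure logic. The function names `kconfig_enabled` and `need_genalloc_backport` are hypothetical, and the sketch assumes the option shows up as a `CONFIG_GENERIC_ALLOCATOR=y` line in the .config text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if config_text (the contents of a
 * kernel .config file) has a line that is exactly "<option>=y".
 * Lines such as "# <option> is not set" or "<option>_FOO=y" do not match.
 */
bool kconfig_enabled(const char *config_text, const char *option)
{
    assert(config_text != NULL && option != NULL);

    size_t optlen = strlen(option);
    const char *line = config_text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

        /* The whole line must be "<option>=y", nothing more. */
        if (linelen == optlen + 2 &&
            strncmp(line, option, optlen) == 0 &&
            line[optlen] == '=' && line[optlen + 1] == 'y')
            return true;

        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return false;
}

/*
 * Hypothetical policy for idea 2: pull in the genalloc backport only
 * when the kernel being built against does not provide genalloc itself.
 */
bool need_genalloc_backport(const char *config_text)
{
    return !kconfig_enabled(config_text, "CONFIG_GENERIC_ALLOCATOR");
}
```

A configure script could feed the kernel's .config into a check like this and add the backport to the build only when `need_genalloc_backport()` is true, which avoids the code duplication of idea 1 at the cost of the extra configure step.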
[ewg] Re: ofed_kernel merged with 2.6.24-rc1 patches update required
Vladimir Sokolovsky wrote:
> Hello,
> There is a new branch ofed_kernel_2_6_24_rc1 under git://git.openfabrics.org/ofed_1_3/linux-2.6.git
>
> All patches from kernel_patches/fixes that were applied in 2.6.24-rc1 were removed from kernel_patches/fixes directory.
> The problematic patches from kernel_patches/fixes were moved to the kernel_patches/attic directory.
>
> Backport patches and fixes should be updated according to the new kernel tree.
> The easy way to do so is using ofed_scripts/ofed_makedist.sh utility which creates tgz file for every supported kernel with all relevant patches applied.

Vlad, have you done any builds against the various kernels?

What exactly should I, as cxgb3 owner, do with this branch other than verify the patches are correct?

Steve.
Re: [ewg] OFED Nov 05 meeting agenda on OFED 1.3 beta readiness
Tziporet Koren wrote:
> This is the agenda of OFED 1.3 meeting today:
>
> 1. Rebase for kernel 2.6.24-rc1:
>    The backport was more complicated (mainly in IPoIB).
>    The following kernel modules have now backports to all kernels: mthca, mlx4, ehca, core, IPoIB, RDS
>    Kernel modules that need update: Chelsio driver (cxgb3), ipath driver, iSER, SDP, SRP, VNIC

I'll get any backport fixes for cxgb3 by EOB tomorrow. I cannot make the call today. That's the only status I have.

Steve.
[ewg] printk in ofed-1.3 ib_destroy_qp()
Should this printk be here?

int ib_destroy_qp(struct ib_qp *qp)
{
	struct ib_pd *pd;
	struct ib_cq *scq, *rcq;
	struct ib_srq *srq;
	struct ib_xrcd *xrcd;
	enum ib_qp_type qp_type = qp->qp_type;
	int ret;

	pd = qp->pd;
	scq = qp->send_cq;
	rcq = qp->recv_cq;
	srq = qp->srq;
	xrcd = qp->xrcd;

	ret = qp->device->destroy_qp(qp);
	if (!ret) {
		atomic_dec(&pd->usecnt);
		atomic_dec(&scq->usecnt);
		atomic_dec(&rcq->usecnt);
		if (srq)
			atomic_dec(&srq->usecnt);
		if (qp_type == IB_QPT_XRC)
			atomic_dec(&xrcd->usecnt);
		else
			printk("ib_destroy_qp: type = %d, xrcd = %p\n", qp_type, xrcd);
	}

	return ret;
}
Re: [ewg] OFED Nov 05 meeting summary on OFED 1.3 beta readiness
I get this failure trying to configure the 2.6.24 tree against sles9sp3/x86_64. Is this a known issue with umad?

patching file drivers/infiniband/core/sysfs.c
Hunk #1 FAILED at 442.
1 out of 1 hunk FAILED -- saving rejects to file drivers/infiniband/core/sysfs.c.rej
patching file drivers/infiniband/core/user_mad.c
Hunk #1 FAILED at 45.
Hunk #2 succeeded at 736 (offset 150 lines).
Hunk #3 FAILED at 830.
Hunk #4 succeeded at 1194 (offset 179 lines).
Hunk #5 succeeded at 1227 (offset 179 lines).
2 out of 5 hunks FAILED -- saving rejects to file drivers/infiniband/core/user_mad.c.rej
patching file drivers/infiniband/core/uverbs_main.c
Hunk #2 succeeded at 122 (offset 7 lines).
patching file drivers/infiniband/core/umem.c
Hunk #1 succeeded at 182 (offset 85 lines).

Failed to apply patch: /usr/local/src/ofa_1_3_kernel-20071107-0842/kernel_patches/backport/2.6.5_sles9_sp3/core_4807_to_2_6_9.patch
Failed executing /usr/local/src/ofa_1_3_kernel-20071107-0842/ofed_scripts/ofed_patch.sh
vic11:/usr/local/src/ofa_1_3_kernel-20071107-0842 #

Tziporet Koren wrote:
> Note: there will be no meeting next week - CU all in SC07
>
> Tziporet
>
> OFED Nov 05 meeting summary on OFED 1.3 beta readiness
>
> 1. Rebase for kernel 2.6.24-rc1:
>    The backport was more complicated (mainly in IPoIB).
>    The following kernel modules have now backports to all kernels: mthca, mlx4, ehca, ipath, core, IPoIB, RDS
>    Kernel modules that need update: Chelsio driver (cxgb3), iSER, SDP, SRP, VNIC
>    Note: Please work on this git branch: git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel_2_6_24_rc1
>    Schedule: All new backport patches should be sent to Vlad by Tuesday Nov 6. On Wed (Nov 7) we will start to publish the new package based on kernel 2.6.24. Kernel modules that will not pass compilation will be disabled
>
> 2. Other Beta tasks status:
>    1. Fix compilation problems on PPC SLES10 with 32 bits - Vlad (Mellanox) - on work
>    2. SPEC files should be part of each user space package - each owner should take the spec file
>    3. Fix all compilation and install issues - All
>    4. management readiness and open a branch for 1.3 - Sasha
>
> 3. Beta schedule:
>    Target: do the beta release by the end of this week
>    (Note: Since in Israel we are not working on Friday it will be done either on Thursday or Sunday)
>
> 4. GA schedule:
>    Tziporet to publish the GA schedule - after the beta release will be done
>    The schedule we had is published on the Wiki at https://wiki.openfabrics.org/tiki-index.php?page=OFED+1.3+release+plan+and+features
>
> 5. Integration of OFED 1.3 with Redhat: Tziporet to talk to Doug in SC07
>
> Done tasks for the beta:
> o Multiple uDAPL libs (1.0 & 2.0) - Vlad and Arlin (Intel)
> o ibutils on SLES10 PPC64 (64 bits) - Vlad
> o Add qperf test from Qlogic - Johann (Qlogic)
> o Support RHEL 5 up1 - Woody & Vlad
> o Apply patches that fix warning of backport patches - Vlad
> o New MVAPICH package - Pasha & DK (OSU)
> o Complete RDS work - Vlad (Mellanox)
> o Integrate all SDP features - Jim (Mellanox)
> o nes - updated backport patches - Glenn (NetEffect)
[ewg] open fabrics server slow
Seems the ofa server slows to a crawl when folks are building on it.

Can we get more memory added perhaps? And a GIANT 32 disk raid array? :)

Steve.
Re: [ewg] open fabrics server slow
because vlad has every kernel tree known to mankind on that server. ;-)

Seriously, if we need to do 27 backports for each driver for each new ofed kernel rebase, then it makes more sense to have a common system to build this rather than have each vendor/maintainer try to duplicate the whole setup.

my 2 cents...

Sasha Khapyorsky wrote:
> On 09:32 Wed 07 Nov , Steve Wise wrote:
>> Seems the ofa server slows to a crawl when folks are building on it.
>
> OTOH why it is necessary to build on the server?
>
> Sasha
Re: [ewg] printk in ofed-1.3 ib_destroy_qp()
Perhaps I'm missing something, but I don't see this in the kernel tree. Can you give me a git commit id?

Thanks,
Steve.

Jack Morgenstein wrote:
> On Tuesday 06 November 2007 22:02, Tziporet Koren wrote:
>> Steve Wise wrote:
>>> Should this printk be here?
>>>
>>> 		else
>>> 			printk("ib_destroy_qp: type = %d, xrcd = %p\n", qp_type, xrcd);
>>> 	}
>>>
>>> 	return ret;
>>> }
>>
>> I think Jack already fixed this (there was also a bug about it)
>> Jack?
>>
>> Tziporet
>
> Fixed on Oct 31.
[ewg] Re: [ofa-general] ofed--1.3 rdma_connect problems
Looks like the following commit exposed a bug in the chelsio driver. iw_cxgb3 was _not_ setting the max_qp_init_rd_atom attribute.

commit 487a52078fe1ba322273a6b893d31e0caaa69a57
Author: Sean Hefty [EMAIL PROTECTED]
Date: Tue Oct 16 14:59:21 2007 -0700

    librdmacm/cma: provide sanity checks for max outstanding rdma ops

    Ensure that the responder_resources and initiator_depth values provided by the user are supported by the local hardware. This traps errors sooner during connection establishment (when calling rdma_connect), rather than waiting until the modify QP fails (after calling rdma_accept).

    Signed-off-by: Sean Hefty [EMAIL PROTECTED]

I've opened bug 777 to fix this.

Tziporet/Vlad, can we get this in beta? I will provide a patch shortly.

Steve.

Steve Wise wrote:
> Sean,
>
> I'm testing iwarp usermode on ofed-1.3 and I always get a -22 error from rdma_connect(). I tried rping and a home brew unit test program and both get this error.
>
> I'm diving in now to see who's returning it, but wanted to give you a heads up...
>
> Stay tuned...
>
> Steve.
Re: [ewg] Re: [ofa-general] [PATCH 2.6.24] RDMA/cxgb3: Set the max_qp_init_rd_atom attribute.
I haven't submitted a rhel4u5 backport yet for cxgb3. I'll do this today. Stay tuned.

Steve.

Vladimir Sokolovsky wrote:
> Steve Wise wrote:
>> Hey Vlad,
>>
>> Can you pull this in for 1.3 beta? Roland has merged it for 2.6.24, so it can be removed if we rebase and get it that way, but rping and most other rdma/iwarp apps are dead over chelsio without this fix.
>>
>> Please pull from:
>>
>> git://git.openfabrics.org/~swise/ofed-1.3 stevo
>>
>> Thanks,
>> Steve.
>
> Hi Steve,
> Merged into ofed_1_3/linux-2.6.git ofed_kernel_2_6_24_rc1.
>
> Please check (ofed_kernel_2_6_24_rc1 branch):
> Build failed on x86_64 with linux-2.6.9-55.ELsmp
> Log:
> /home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.c:921: error: (Each undeclared identifier is reported only once
> /home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.c:921: error: for each function it appears in.)
> /home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.c:921: error: too many arguments to function 'dev_get_by_name'
> make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.o] Error 1
> make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3] Error 2
> make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2
> make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2
> make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp'
> make: *** [kernel] Error 2
>
> Regards,
> Vladimir
Re: [ewg] Re: [ofa-general] [PATCH 2.6.24] RDMA/cxgb3: Set the max_qp_init_rd_atom attribute.
Vlad,

I added an rhel4u5 backport for cxgb3. Please pull from:

git://git.openfabrics.org/~swise/ofed-1.3 stevo

Thanks,
Steve.

Vladimir Sokolovsky wrote:
> Steve Wise wrote:
>> Hey Vlad,
>>
>> Can you pull this in for 1.3 beta? Roland has merged it for 2.6.24, so it can be removed if we rebase and get it that way, but rping and most other rdma/iwarp apps are dead over chelsio without this fix.
>>
>> Please pull from:
>>
>> git://git.openfabrics.org/~swise/ofed-1.3 stevo
>>
>> Thanks,
>> Steve.
>
> Hi Steve,
> Merged into ofed_1_3/linux-2.6.git ofed_kernel_2_6_24_rc1.
>
> Please check (ofed_kernel_2_6_24_rc1 branch):
> Build failed on x86_64 with linux-2.6.9-55.ELsmp
> Log:
> /home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.c:921: error: (Each undeclared identifier is reported only once
> /home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.c:921: error: for each function it appears in.)
> /home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.c:921: error: too many arguments to function 'dev_get_by_name'
> make[4]: *** [/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3/cxio_hal.o] Error 1
> make[3]: *** [/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband/hw/cxgb3] Error 2
> make[2]: *** [/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check/drivers/infiniband] Error 2
> make[1]: *** [_module_/home/vlad/tmp/ofa_1_3_kernel-2007-0200_linux-2.6.9-55.ELsmp_x86_64_check] Error 2
> make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-55.ELsmp'
> make: *** [kernel] Error 2
>
> Regards,
> Vladimir
[ewg] [GIT PULL] ofed-1.3 - cxgb3 rh5.1 backport
Vlad,

I've added a RH5.1 backport for cxgb3. Please pull from:

git://git.openfabrics.org/~swise/ofed-1.3 stevo

Thanks,
Steve.
Re: [ewg] OFED teleconference today
I cannot attend the OFED call today.

cxgb3 status:

- A 9-patch series was posted by Chelsio to netdev for some bug fixes and ppc64 support we'd like to pull into ofed-1.2.5 and ofed-1.3. I'll be creating the 1.2.5 and 1.3 patches once these get ACKed by Garzik/Miller. The original series can be found at http://lkml.org/lkml/2007/11/16/224

- There will be an additional patch for iw_cxgb3 to support 5.0 firmware that is pushed out in the above 9-patch series.

Steve.

Jeff Squyres wrote:
> Friendly reminder: the OFED teleconference is today (Tuesday, 20 November, 2007).
>
> Next few teleconferences:
>
> - All are at noon US eastern / 9am US Pacific / 7pm Israel
>
> 1. Tuesday, Nov 20, code 210020028 (***TODAY***)
> 2. Tuesday, Nov 27, code 210020028 (***NOTE: TUESDAY!***)
> 3. Monday, Dec 3, code 210020028
>
> Dial-in information:
> US/Canada: +1.866.432.9903
> India: +91.80.4103.3979
> Israel: +972.9.892.7026
> Others: http://cisco.com/en/US/about/doing_business/conferencing/
[ewg] [PATCH 2.6.25 1/2] RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call.
RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call.

Currently the call into cxgb3 to get the driver info is not serialized. The iw_cxgb3 module needs to hold the rtnl_lock around the ethtool ops call like dev_ioctl() does.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/iwch_provider.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index b5436ca..69b1204 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -39,6 +39,7 @@
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/ethtool.h>
+#include <linux/rtnetlink.h>
 #include <asm/io.h>
 #include <asm/irq.h>
@@ -1053,7 +1054,9 @@ static ssize_t show_fw_ver(struct class_device *cdev, char *buf)
 	struct net_device *lldev = dev->rdev.t3cdev_p->lldev;
 	PDBG("%s class dev 0x%p\n", __FUNCTION__, cdev);
+	rtnl_lock();
 	lldev->ethtool_ops->get_drvinfo(lldev, &info);
+	rtnl_unlock();
 	return sprintf(buf, "%s\n", info.fw_version);
 }
@@ -1065,7 +1068,9 @@ static ssize_t show_hca(struct class_device *cdev, char *buf)
 	struct net_device *lldev = dev->rdev.t3cdev_p->lldev;
 	PDBG("%s class dev 0x%p\n", __FUNCTION__, cdev);
+	rtnl_lock();
 	lldev->ethtool_ops->get_drvinfo(lldev, &info);
+	rtnl_unlock();
 	return sprintf(buf, "%s\n", info.driver);
 }
[ewg] Re: [PATCH 2.6.25 2/2] RDMA/cxgb3: Support 5.0 firmware.
Yes.

Roland Dreier wrote:
> OK, applied 1 and 2...
>
>> Note: this change requires 5.0 firmware.
>
> I assume the change to the cxgb3 FW versions is pending in a net driver change for 2.6.25?
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
[ewg] [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0 firmware support
Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:

git://git.openfabrics.org/~swise/ofed-1.2.5 stevo

These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.2.5 (stay tuned for ofed-1.3 patches soon). The patches are all accepted upstream and were posted here:

http://www.spinics.net/lists/netdev/msg47492.html

and here:

http://www.spinics.net/lists/netdev/msg48240.html

Also, please pull version 1.1.0 of libcxgb3 from:

git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5

The library and drivers need to be included together as they are both needed to support the chelsio 5.0 firmware.

Alsoalso: After you integrate these, can you crank a daily OFED-1.2.5.3 build including all this?

Thanks,
Steve.
[ewg] [GIT PULL ofed-1.3] - RDMA/cxgb3 - fixes and 5.0 firmware support
Vlad, please pull cxgb3 fixes for ofed-1.3 from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.3. The patches are all accepted upstream and were posted here:

http://www.spinics.net/lists/netdev/msg47492.html

and here:

http://www.spinics.net/lists/netdev/msg48240.html

Also, please pull version 1.1.0 of libcxgb3 from:

git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3

The library and drivers need to be included together as they are both needed to support the chelsio 5.0 firmware.

Thanks,
Steve.
[ewg] [GIT PULL ofed-1.3] cxgb3: backports remove 'ethtool -S' support.
Vlad, The patch below fixes broken chelsio backports for ofed-1.3. Please pull from: git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel Thanks, Steve. - From: Steve Wise [EMAIL PROTECTED] cxgb3: backports remove 'ethtool -S' support. I mistakenly removed the get_stats_count ethtool op for cxgb3. The real backport is to change its signature... Signed-off-by: Steve Wise [EMAIL PROTECTED] --- .../backport/2.6.12/cxgb3_0200_sset.patch | 17 + .../backport/2.6.13/cxgb3_0200_sset.patch | 17 + .../backport/2.6.14/cxgb3_0200_sset.patch | 17 + .../backport/2.6.15/cxgb3_0200_sset.patch | 17 + .../2.6.15_ubuntu606/cxgb3_0200_sset.patch | 17 + .../backport/2.6.16/cxgb3_0200_sset.patch | 17 + .../backport/2.6.16_sles10/cxgb3_0200_sset.patch | 17 + .../2.6.16_sles10_sp1/cxgb3_0200_sset.patch| 17 + .../backport/2.6.17/cxgb3_0200_sset.patch | 17 + .../backport/2.6.18-EL5.1/cxgb3_0200_sset.patch| 17 + .../backport/2.6.18/cxgb3_0200_sset.patch | 17 + .../backport/2.6.18_FC6/cxgb3_0200_sset.patch | 17 + .../backport/2.6.19/cxgb3_0200_sset.patch | 17 + .../backport/2.6.20/cxgb3_0200_sset.patch | 17 + .../backport/2.6.21/cxgb3_0200_sset.patch | 17 + .../backport/2.6.22/cxgb3_0200_sset.patch | 17 + .../backport/2.6.23/cxgb3_0200_sset.patch | 17 + .../backport/2.6.9_U4/cxgb3_0200_sset.patch| 17 + .../backport/2.6.9_U5/cxgb3_0200_sset.patch| 17 + 19 files changed, 171 insertions(+), 152 deletions(-) diff --git a/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch index e331411..dde776e 100644 --- a/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch +++ b/kernel_patches/backport/2.6.12/cxgb3_0200_sset.patch @@ -1,29 +1,30 @@ diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c -index 61ffc92..676df2f 100644 +index 61ffc92..57ffa8e 100644 --- a/drivers/net/cxgb3/cxgb3_main.c +++ b/drivers/net/cxgb3/cxgb3_main.c -@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = { +@@ -1131,14 +1131,9 @@ 
static char stats_strings[][ETH_GSTRING_LEN] = { }; -static int get_sset_count(struct net_device *dev, int sset) --{ ++static int get_stats_count(struct net_device *dev) + { - switch (sset) { - case ETH_SS_STATS: - return ARRAY_SIZE(stats_strings); - default: - return -EOPNOTSUPP; - } --} -- - #define T3_REGMAP_SIZE (3 * 1024) ++ return ARRAY_SIZE(stats_strings); + } - static int get_regs_len(struct net_device *dev) -@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = { + #define T3_REGMAP_SIZE (3 * 1024) +@@ -1645,7 +1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = { .get_strings = get_strings, .phys_id = cxgb3_phys_id, .nway_reset = restart_autoneg, - .get_sset_count = get_sset_count, ++ .get_stats_count = get_stats_count, .get_ethtool_stats = get_stats, .get_regs_len = get_regs_len, .get_regs = get_regs, diff --git a/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch b/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch index e331411..dde776e 100644 --- a/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch +++ b/kernel_patches/backport/2.6.13/cxgb3_0200_sset.patch @@ -1,29 +1,30 @@ diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c -index 61ffc92..676df2f 100644 +index 61ffc92..57ffa8e 100644 --- a/drivers/net/cxgb3/cxgb3_main.c +++ b/drivers/net/cxgb3/cxgb3_main.c -@@ -1131,16 +1131,6 @@ static char stats_strings[][ETH_GSTRING_LEN] = { +@@ -1131,14 +1131,9 @@ static char stats_strings[][ETH_GSTRING_LEN] = { }; -static int get_sset_count(struct net_device *dev, int sset) --{ ++static int get_stats_count(struct net_device *dev) + { - switch (sset) { - case ETH_SS_STATS: - return ARRAY_SIZE(stats_strings); - default: - return -EOPNOTSUPP; - } --} -- - #define T3_REGMAP_SIZE (3 * 1024) ++ return ARRAY_SIZE(stats_strings); + } - static int get_regs_len(struct net_device *dev) -@@ -1645,7 +1635,6 @@ static const struct ethtool_ops cxgb_ethtool_ops = { + #define T3_REGMAP_SIZE (3 * 1024) +@@ -1645,7 
+1640,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = { .get_strings = get_strings, .phys_id = cxgb3_phys_id, .nway_reset = restart_autoneg, - .get_sset_count = get_sset_count, ++ .get_stats_count = get_stats_count
[ewg] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0 firmware support
Vlad, it looks like you didn't pull in version 1.1.0 of libcxgb3 for ofed-1.2.5?

Right now the ofed-1.2.5.4 is broken from chelsio's perspective because the kernel drivers require 5.0 firmware, but the library doesn't have 5.0 firmware support.

Can you please pull in 1.1.0 of libcxgb3 and crank a new ofed-1.2.5.4 release?

Pull from:

git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5

Thanks,
Steve.

Steve Wise wrote:
> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
>
> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
>
> These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.2.5 (stay tuned for ofed-1.3 patches soon). The patches are all accepted upstream and were posted here:
>
> http://www.spinics.net/lists/netdev/msg47492.html
>
> and here:
>
> http://www.spinics.net/lists/netdev/msg48240.html
>
> Also, please pull version 1.1.0 of libcxgb3 from:
>
> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>
> The library and drivers need to be included together as they are both needed to support the chelsio 5.0 firmware.
>
> Alsoalso: After you integrate these, can you crank a daily OFED-1.2.5.3 build including all this?
>
> Thanks,
> Steve.
[ewg] Re: [GIT PULL ofed-1.2.5] - RDMA/cxgb3 - fixes and 5.0 firmware support
Great, thanks!

Steve.

Vladimir Sokolovsky wrote:
> Hi Steve,
> Sorry, I missed your libcxgb3 updates for ofed-1.2.5. It is updated now.
> There is OFED-1.2.5.4-20071210-0614 build which includes updated libcxgb3 library.
> In any case we are going to release OFED-1.2.5.5 in a few days.
>
> Regards,
> Vladimir
>
> Steve Wise wrote:
>> Vlad, it looks like you didn't pull in version 1.1.0 of libcxgb3 for ofed-1.2.5?
>>
>> Right now the ofed-1.2.5.4 is broken from chelsio's perspective because the kernel drivers require 5.0 firmware, but the library doesn't have 5.0 firmware support.
>>
>> Can you please pull in 1.1.0 of libcxgb3 and crank a new ofed-1.2.5.4 release?
>>
>> Pull from:
>>
>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>>
>> Thanks,
>> Steve.
>>
>> Steve Wise wrote:
>>> Vlad, please pull cxgb3 fixes for ofed-1.2.5 from:
>>>
>>> git://git.openfabrics.org/~swise/ofed-1.2.5 stevo
>>>
>>> These are cxgb3 bug fixes and PPC64 additions that we need for ofed-1.2.5 (stay tuned for ofed-1.3 patches soon). The patches are all accepted upstream and were posted here:
>>>
>>> http://www.spinics.net/lists/netdev/msg47492.html
>>>
>>> and here:
>>>
>>> http://www.spinics.net/lists/netdev/msg48240.html
>>>
>>> Also, please pull version 1.1.0 of libcxgb3 from:
>>>
>>> git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5
>>>
>>> The library and drivers need to be included together as they are both needed to support the chelsio 5.0 firmware.
>>>
>>> Alsoalso: After you integrate these, can you crank a daily OFED-1.2.5.3 build including all this?
>>>
>>> Thanks,
>>> Steve.
[ewg] ofed-1.3-rc1 problem
Linking with libibumad fails on ofed-1.3-rc1. I get a 'cannot find -libumad' from ld. I looked in /usr/lib64 and there wasn't a link from libibumad.so to libibumad.so.1.0.2. I added the link and the ld works now.

This was on PPC64.

I think this is some install problem with libibumad.

Steve.
Re: [ewg] [PATCH] OFED/kernel_addons - fix compiler warnings in 2.6.9_U5
Hey Arthur, did you compile all the modules with this fix?

Arthur Jones wrote:
> hi vlad, here is a patch which fixes a couple compiler warnings for me on 2.6.9_U5...
>
> arthur
Re: [ewg] ofed-1.3-rc1 problem
Vladimir Sokolovsky wrote:
> On Sunday 16 December 2007 18:11:07 Steve Wise wrote:
>> You're right! It is not installed, but it was built by install.pl. However I didn't explicitly request to build/install ibumad. It's a prerequisite of mvapich2, which I did ask to have built/installed. So I think install.pl needs to be fixed to prereq this maybe?
>
> install.pl sets requirements for the OFED packages selected to be installed. As I understand, correct me if I am wrong, the installation passed successfully, so there are no issues in the install.
> If you want to compile your application (not from OFED) over libibumad, then you have to select libibumad-devel to be installed during OFED installation.

I'm not sure where the fix needs to go, but if I install mvapich2-1.0.0 via install.pl, it should also build/install libibumad-devel. Otherwise, mpi programs cannot link correctly. I don't know if this dependency should be defined in the mvapich2 srpm somehow or in the ofed tools.

But in install.pl I see this:

    'mvapich2_gcc' =>
        { name => "mvapich2_gcc", parent => "mvapich2",
          selected => 0, installed => 0, rpm_exist => 0, rpm_exist32 => 0,
          available => 0, mode => "user",
          dist_req_build => [], dist_req_inst => [],
          ofa_req_build => ["libibumad-devel", "libibverbs-devel", "librdmacm-devel"],
          ofa_req_inst => ["mpi-selector", "librdmacm", "libibumad"],
          install32 => 0, exception => 0 },

And I see that libibumad-devel is a build requirement. I claim it is also an install requirement.

This all worked on 1.2.5 by the way...

Steve.

> Regards,
> Vladimir
>
>> Vladimir Sokolovsky wrote:
>>> Steve Wise wrote:
>>>> linking with libibumad fails on ofed-1.3-rc1. I get a 'cannot find -libumad' from ld. I looked in /usr/lib64 and there wasn't a link from libibumad.so to libibumad.so.1.0.2. I added the link and the ld works now.
>>>>
>>>> This was on PPC64.
>>>>
>>>> I think this is some install problem with libibumad.
>>>>
>>>> Steve.
>>>
>>> Hi Steve,
>>> Check that libibumad-devel is installed.
>>>
>>> Regards,
>>> Vladimir
[ewg] Re: [PATCH 3 of 5] libcxgb3: zero context struct at allocation time (prep for additional context ops)
Applied. Thanks. I've released version 1.1.1 of the library, and updated the ofed_1_3 branch. Vlad, can you pull version 1.1.1 for ofed-1.3?

git://git.openfabrics.org/~swise/libcxgb3.git ofed_1_3

Thanks, Steve.

Jack Morgenstein wrote:

The ibv_context structure will be getting additional ops, to be added at the end of the structure (and not as part of the existing ibv_context_ops structure). Reason: ibv_context_ops is declared directly as a member of ibv_context, and not as a pointer. Binaries compiled with previous libibverbs versions will not be backwards compatible if we add new operations to ibv_context_ops, since fields following the ops structure will move.

To enable adding new operations at the end of the existing ibv_context struct, all driver libraries MUST zero their context structure at allocation time, so that new ops will be NULL by default.

Signed-off-by: Jack Morgenstein [EMAIL PROTECTED]

diff --git a/src/iwch.c b/src/iwch.c
index 2747518..517ff00 100644
--- a/src/iwch.c
+++ b/src/iwch.c
@@ -114,6 +114,7 @@ static struct ibv_context *iwch_alloc_context(struct ibv_device *ibdev,
 	if (!context)
 		return NULL;
+	memset(context, 0, sizeof *context);
 	context->ibv_ctx.cmd_fd = cmd_fd;
 	if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd,
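The reasoning in Jack's description can be sketched in a few lines of C. This is only an illustration under assumptions: `demo_ops`, `demo_context`, `new_op` and both helper functions are hypothetical stand-ins, not the libibverbs or libcxgb3 definitions. It shows a driver zeroing its context at allocation so that an op appended to the end of the struct reads as NULL, and a caller checking for that before dispatching.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the libibverbs types. */
struct demo_ops {
	int (*poll)(void);
};

struct demo_context {
	struct demo_ops ops;     /* embedded by value, so it cannot grow safely */
	int cmd_fd;
	int (*new_op)(void);     /* op appended at the end by a newer library */
};

/* Driver side: zero the whole struct so ops the driver does not know
 * about read as NULL (all-bits-zero is a null pointer on the platforms
 * discussed in this thread). */
struct demo_context *demo_alloc_context(int cmd_fd)
{
	struct demo_context *ctx = malloc(sizeof *ctx);

	if (!ctx)
		return NULL;
	memset(ctx, 0, sizeof *ctx);
	ctx->cmd_fd = cmd_fd;
	return ctx;
}

/* Library side: only dispatch when the driver actually provided the op. */
int demo_call_new_op(struct demo_context *ctx)
{
	assert(ctx != NULL);
	if (!ctx->new_op)
		return -1;       /* op not implemented by this driver */
	return ctx->new_op();
}
```

Without the memset, `new_op` would hold whatever malloc returned, and the NULL check in the caller would be meaningless.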
[ewg] [GIT PULL ofed-1.2.5 and ofed-1.3] cxgb3 fixes
Vlad, Please pull 3 new cxgb3 driver fixes + backport support for ofed-1.2.5 and ofed-1.3. The 3 patches have been submitted and merged upstream. First two patches are submitted here: http://www.spinics.net/lists/kernel/msg659899.html And the third here: http://www.spinics.net/lists/kernel/msg660541.html For ofed-1.2.5, please pull from: git://git.openfabrics.org/~swise/ofed-1.2.5.git ofed_1_2_c For ofed-1.3, please pull from: git://git.openfabrics.org/~swise/ofed-1.3.git ofed_kernel Thanks, Steve.
[ewg] 32b rping
Has anyone run ofed-1.3 and used rping successfully on 32b distro/platforms? Thanks, Steve.
[ewg] [GIT PULL ofed-1.2.5 / ofed-1.3] - libcxgb3-1.1.2 release
Vlad, Please pull version 1.1.2 of libcxgb3 for ofed-1.2.5 and ofed-1.3. This release fixes a segfault that can happen when running rdma apps over chelsio's device on 32b platforms and distros (bug 680). Pull from: git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5 and git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3 Thanks, Steve.
[ewg] iommu issues with PPC64 and 2.6.23 and beyond
From today's conf call, I was asked to post info on the iommu issue I'm hitting on ppc64 systems. This bug shows up as data corruption when I run stressful mvapich2 tests that force lots of dma mappings in the ppc iommu code. The problem happens on kernels where the iommu page size != host page size. This is the default config for rh5.1 at least. The workaround is to make the host page size 4KB. Here is my original thread on this: http://lkml.org/lkml/2007/12/20/368 Included in that thread is a proposed kernel fix from Ben that I tested ok. Email if you have questions. Steve.
[ewg] Re: [ofa-general] OFED Jan-07, 2008 meeting summary on readiness toward RC2
Tziporet Koren wrote:

OFED Jan-07, 2008 meeting summary on readiness toward RC2

1. Release status:
* In general there are no major issues - testing continues at all companies
* There is wide coverage of platforms and OSes

2. Tasks that should be completed for RC2:
* XRC - enhanced API - will be ready by next week
* IPoIB performance improvements for small messages - at least some of the changes will be integrated
* Open MPI 1.2.5-rc2 - will be ready by next week
* Qlogic new driver - done

3. Agree on new schedule for the release:
* RC2: Jan 15, 2008
* RC3: Jan 29, 2008
* RC4: Feb 12, 2008
* Release: Feb 19, 2008
If we see that RC3 is stable enough we will try to pull-in. And in any case we do not want to delay the release any more.

4. Review critical and major bugs:
750 critical [EMAIL PROTECTED] Problem with modprobe ib_ehca with older kernel versions - probably fixed
760 major [EMAIL PROTECTED] UDP performance on Rx is lower than Tx - related to IPoIB above
761 major [EMAIL PROTECTED] Poor and jittery UDP performance at small messages - related to IPoIB above
820 major [EMAIL PROTECTED] rpm 4.4.2.2, Binary file matches Binary file - patch was sent by OSU, will be incorporated by Pasha
800 major [EMAIL PROTECTED] MVAPICH2 compile error on PPC64 - fixed
736 major [EMAIL PROTECTED] IBV_WC_RETRY_EXC_ERR errors with local rdma_reads - Need Arlin to retest with new FW
767 major [EMAIL PROTECTED] Non backport Kernels that don't build in genalloc compile errors for cxgb3 - not a major issue (will be in RN)

I'm beginning to think I should really fix this. I had another customer hit this issue today. The fix, however, is to _always_ build the genpool backport into the ib_core module. Is that a reasonable fix? I'd basically move the genpool backport patch into kernel_patches/fixes so it always gets applied... Thoughts?
[ewg] uninstall.sh bug
Looks like uninstall.sh needs perftest-debuginfo added... Steve.
[ewg] [GIT PULL ofed-1.3] - Tag the ofed cxgb3 driver version.
Vlad, Please pull the following patch from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

This patch must have gotten lost from 1.2.5 -> 1.3.

Thanks, Steve.

-

Tag -ofed for cxgb3 driver version. This keeps kernel.org vs ofed driver versions unique.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---
 .../fixes/cxgb3_00300_add_ofed_version_tag.patch | 13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel_patches/fixes/cxgb3_00300_add_ofed_version_tag.patch b/kernel_patches/fixes/cxgb3_00300_add_ofed_version_tag.patch
new file mode 100644
index 000..ffee40a
--- /dev/null
+++ b/kernel_patches/fixes/cxgb3_00300_add_ofed_version_tag.patch
@@ -0,0 +1,13 @@
+diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
+index ef1c633..ef2405a 100644
+--- a/drivers/net/cxgb3/version.h
++++ b/drivers/net/cxgb3/version.h
+@@ -35,7 +35,7 @@
+ #define DRV_DESC "Chelsio T3 Network Driver"
+ #define DRV_NAME "cxgb3"
+ /* Driver version */
+-#define DRV_VERSION "1.0-ko"
++#define DRV_VERSION "1.0-ofed"
+
+ /* Firmware version */
+ #define FW_VERSION_MAJOR 4
[ewg] [GIT PULL ofed-1.2.5 / ofed-1.3] - libcxgb3-1.1.3 release
Vlad, Please pull version 1.1.3 of libcxgb3 for ofed-1.2.5 and ofed-1.3. This release fixes problems with running libcxgb3 on RH4U5 and other distros. Pull from: git://git.openfabrics.org/~swise/libcxgb3 ofed_1_2_5 and git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3 Also, the release can be downloaded from: http://www.openfabrics.org/downloads/cxgb3/libcxgb3-1.1.3.tar.gz Thanks, Steve.
Re: [ewg] Is it possible to upgrade memory in the OFA hosting server
Johann George wrote:

I can arrange for the disk space and memory to be upgraded. I'm not sure how disruptive it is and am wondering if we should wait until after this OFED release is out? Johann

On Mon, Jan 28, 2008 at 01:28:27PM +, Sasha Khapyorsky wrote:

On 10:30 Mon 28 Jan , Tziporet Koren wrote:

Many people suffer from the performance of the server, especially since all use it for cross compilation. Is it possible to add more memory to this server?

It is not only a memory problem. This weekend (and it is not the first time) the server was almost not functional (mailman, bugzilla, git) due to lack of free space on the root fs (and I removed some old files in /tmp). I can see that Vlad cleaned some temporary files, but nobody else cared. Probably it could be useful:
1. to run a cleanup script daily or weekly (at least over /tmp and ~user/tmp directories).
2. to publish a top-10 list of the users who consume the most disk space on the server and send them a notification by email.
3. to not use the OFA server (whose primary goal was to host git, mailman, bugzilla, wiki, etc.) for builds.

A new server for builds might be the ticket. Maybe NetApp can donate a filer? :)
[ewg] [PATCH ofed-1.3] Load iw_cxgb3 as part of ofed init.
Vlad, can you please review this? Is there anything else needed to get iw_cxgb3 to be loaded at init time? I tested the patched openibd and it seems to work fine. If this looks good to you, please pull this patch for ofed-1.3 from: git://www.openfabrics.org/~swise/ofed-1.3 ofed_kernel This change is long overdue... Thanks, Steve. -- Load iw_cxgb3 as part of ofed init. Signed-off-by: Steve Wise [EMAIL PROTECTED] --- ofed_scripts/ofa_kernel.spec |6 ++ ofed_scripts/openibd | 19 +++ 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/ofed_scripts/ofa_kernel.spec b/ofed_scripts/ofa_kernel.spec index 6184cfc..954dd0c 100755 --- a/ofed_scripts/ofa_kernel.spec +++ b/ofed_scripts/ofa_kernel.spec @@ -488,6 +488,12 @@ fi echo MLX4_LOAD=yes %{IB_CONF_DIR}/openib.conf %endif +%if %{build_cxgb3} + echo %{IB_CONF_DIR}/openib.conf + echo # Load CXGB3 modules %{IB_CONF_DIR}/openib.conf + echo CXGB3_LOAD=yes %{IB_CONF_DIR}/openib.conf +%endif + %if %{build_ipoib} echo %{IB_CONF_DIR}/openib.conf echo # Load IPoIB %{IB_CONF_DIR}/openib.conf diff --git a/ofed_scripts/openibd b/ofed_scripts/openibd index 2553881..700c8ef 100755 --- a/ofed_scripts/openibd +++ b/ofed_scripts/openibd @@ -273,13 +273,13 @@ fi GEN1_UNLOAD_MODULES=ib_srp_target scsi_target ib_srp kdapltest_module ib_kdapl ib_sdp ib_useraccess ib_useraccess_cm ib_cm ib_dapl_srv ib_ip2pr ib_ipoib ib_tavor mod_thh mod_rhh ib_dm_client ib_sa_client ib_client_query ib_poll ib_mad ib_core ib_services -UNLOAD_MODULES=ib_mthca mlx4_enet mlx4_ib mlx4_core ib_ipath ipath_core ib_ehca +UNLOAD_MODULES=ib_mthca mlx4_enet mlx4_ib mlx4_core ib_ipath ipath_core ib_ehca iw_cxgb3 UNLOAD_MODULES=$UNLOAD_MODULES ib_ipoib ib_madeye ib_rds UNLOAD_MODULES=$UNLOAD_MODULES rds ib_ucm kdapl ib_srp_target scsi_target ib_srpt ib_srp qlgc_vnic ib_iser ib_sdp UNLOAD_MODULES=$UNLOAD_MODULES rdma_ucm rdma_cm ib_addr ib_cm ib_local_sa findex UNLOAD_MODULES=$UNLOAD_MODULES ib_sa ib_uverbs ib_umad ib_mad ib_core -STATUS_MODULES=rdma_ucm ib_rds rds 
ib_srpt ib_srp qlgc_vnic ib_sdp rdma_cm ib_addr ib_local_sa findex ib_ipoib ib_ehca ib_ipath ipath_core mlx4_core mlx4_ib ib_mthca ib_uverbs ib_umad ib_ucm ib_sa ib_cm ib_mad ib_core +STATUS_MODULES=rdma_ucm ib_rds rds ib_srpt ib_srp qlgc_vnic ib_sdp rdma_cm ib_addr ib_local_sa findex ib_ipoib ib_ehca ib_ipath ipath_core mlx4_core mlx4_ib ib_mthca ib_uverbs ib_umad ib_ucm ib_sa ib_cm ib_mad ib_core iw_cxgb3 ipoib_ha_pidfile=/var/run/ipoib_ha.pid srp_daemon_pidfile=/var/run/srp_daemon.pid @@ -800,6 +800,17 @@ start() RC=$[ $RC + $my_rc ] fi +# Load iw_cxgb3 driver +if [ X${CXGB3_LOAD} == Xyes ]; then +fix_location_codes +/sbin/modprobe iw_cxgb3 /dev/null 21 +my_rc=$? +if [ $my_rc -ne 0 ]; then +echo_failure $Loading cxgb3 driver: +fi +RC=$[ $RC + $my_rc ] +fi + # Add node description to sysfs IBSYSDIR=/sys/class/infiniband if [ -d ${IBSYSDIR} ]; then @@ -1101,7 +1112,7 @@ unload() if is_module $mod; then case $mod in - ib_mthca | mlx4_ib | ib_ipath | ib_ehca) + ib_mthca | mlx4_ib | ib_ipath | ib_ehca | iw_cxgb3) rm_mod $mod sleep 2 ;; @@ -1273,7 +1284,7 @@ status() { local RC=0 - if is_module ib_mthca || is_module mlx4_core || is_module ib_ipath || is_module ib_ehca; then + if is_module ib_mthca || is_module mlx4_core || is_module ib_ipath || is_module ib_ehca || is_module iw_cxgb3; then echo echo HCA driver loaded echo ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[Fwd: [ewg] [PATCH ofed-1.3] Load iw_cxgb3 as part of ofed init.]
Can this fix make ofed-1.3? Its a trivial change and allows iw_cxgb3 to get loaded like the other rdma modules... Steve. Original Message Subject: [ewg] [PATCH ofed-1.3] Load iw_cxgb3 as part of ofed init. Date: Thu, 31 Jan 2008 09:21:18 -0600 From: Steve Wise [EMAIL PROTECTED] To: [EMAIL PROTECTED] CC: ewg@lists.openfabrics.org Vlad, can you please review this? Is there anything else needed to get iw_cxgb3 to be loaded at init time? I tested the patched openibd and it seems to work fine. If this looks good to you, please pull this patch for ofed-1.3 from: git://www.openfabrics.org/~swise/ofed-1.3 ofed_kernel This change is long overdue... Thanks, Steve. -- Load iw_cxgb3 as part of ofed init. Signed-off-by: Steve Wise [EMAIL PROTECTED] --- ofed_scripts/ofa_kernel.spec |6 ++ ofed_scripts/openibd | 19 +++ 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/ofed_scripts/ofa_kernel.spec b/ofed_scripts/ofa_kernel.spec index 6184cfc..954dd0c 100755 --- a/ofed_scripts/ofa_kernel.spec +++ b/ofed_scripts/ofa_kernel.spec @@ -488,6 +488,12 @@ fi echo MLX4_LOAD=yes %{IB_CONF_DIR}/openib.conf %endif +%if %{build_cxgb3} + echo %{IB_CONF_DIR}/openib.conf + echo # Load CXGB3 modules %{IB_CONF_DIR}/openib.conf + echo CXGB3_LOAD=yes %{IB_CONF_DIR}/openib.conf +%endif + %if %{build_ipoib} echo %{IB_CONF_DIR}/openib.conf echo # Load IPoIB %{IB_CONF_DIR}/openib.conf diff --git a/ofed_scripts/openibd b/ofed_scripts/openibd index 2553881..700c8ef 100755 --- a/ofed_scripts/openibd +++ b/ofed_scripts/openibd @@ -273,13 +273,13 @@ fi GEN1_UNLOAD_MODULES=ib_srp_target scsi_target ib_srp kdapltest_module ib_kdapl ib_sdp ib_useraccess ib_useraccess_cm ib_cm ib_dapl_srv ib_ip2pr ib_ipoib ib_tavor mod_thh mod_rhh ib_dm_client ib_sa_client ib_client_query ib_poll ib_mad ib_core ib_services -UNLOAD_MODULES=ib_mthca mlx4_enet mlx4_ib mlx4_core ib_ipath ipath_core ib_ehca +UNLOAD_MODULES=ib_mthca mlx4_enet mlx4_ib mlx4_core ib_ipath ipath_core ib_ehca iw_cxgb3 
UNLOAD_MODULES=$UNLOAD_MODULES ib_ipoib ib_madeye ib_rds UNLOAD_MODULES=$UNLOAD_MODULES rds ib_ucm kdapl ib_srp_target scsi_target ib_srpt ib_srp qlgc_vnic ib_iser ib_sdp UNLOAD_MODULES=$UNLOAD_MODULES rdma_ucm rdma_cm ib_addr ib_cm ib_local_sa findex UNLOAD_MODULES=$UNLOAD_MODULES ib_sa ib_uverbs ib_umad ib_mad ib_core -STATUS_MODULES=rdma_ucm ib_rds rds ib_srpt ib_srp qlgc_vnic ib_sdp rdma_cm ib_addr ib_local_sa findex ib_ipoib ib_ehca ib_ipath ipath_core mlx4_core mlx4_ib ib_mthca ib_uverbs ib_umad ib_ucm ib_sa ib_cm ib_mad ib_core +STATUS_MODULES=rdma_ucm ib_rds rds ib_srpt ib_srp qlgc_vnic ib_sdp rdma_cm ib_addr ib_local_sa findex ib_ipoib ib_ehca ib_ipath ipath_core mlx4_core mlx4_ib ib_mthca ib_uverbs ib_umad ib_ucm ib_sa ib_cm ib_mad ib_core iw_cxgb3 ipoib_ha_pidfile=/var/run/ipoib_ha.pid srp_daemon_pidfile=/var/run/srp_daemon.pid @@ -800,6 +800,17 @@ start() RC=$[ $RC + $my_rc ] fi +# Load iw_cxgb3 driver +if [ X${CXGB3_LOAD} == Xyes ]; then +fix_location_codes +/sbin/modprobe iw_cxgb3 /dev/null 21 +my_rc=$? +if [ $my_rc -ne 0 ]; then +echo_failure $Loading cxgb3 driver: +fi +RC=$[ $RC + $my_rc ] +fi + # Add node description to sysfs IBSYSDIR=/sys/class/infiniband if [ -d ${IBSYSDIR} ]; then @@ -1101,7 +1112,7 @@ unload() if is_module $mod; then case $mod in - ib_mthca | mlx4_ib | ib_ipath | ib_ehca) + ib_mthca | mlx4_ib | ib_ipath | ib_ehca | iw_cxgb3) rm_mod $mod sleep 2 ;; @@ -1273,7 +1284,7 @@ status() { local RC=0 - if is_module ib_mthca || is_module mlx4_core || is_module ib_ipath || is_module ib_ehca; then + if is_module ib_mthca || is_module mlx4_core || is_module ib_ipath || is_module ib_ehca || is_module iw_cxgb3; then echo echo HCA driver loaded echo ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] OFED meeting agenda on 1.3-rc3 status and rc4 readiness
Tziporet, I cannot attend today's call. I have status below:

Tziporet Koren wrote:

This is the agenda for the OFED meeting today on 1.3-rc3 status and rc4 readiness.

Reminder for the release schedule:
* RC3 - done (30-Jan)
* RC4 - Feb 6 or 7
* RC5 - Feb 18 == Gold (is this a vacation day in US?)
* GA - Feb 25

Agenda:

1. Status update - all

Uncovered a cxgb3 bug that we need fixed for ofed-1.3. I just opened 890 to track this. I hope to have a fix today or tomorrow... Also, I posted a trivial change to rdma_lat to enable it on chelsio devices. This was an oversight that should have been fixed a while ago.

2. Agree on the above schedule

Agree on the schedule, but I need bug 890 in. Most likely it'll have to go in RC5.

steve.
[ewg] perftest tools
Who owns perftest these days? I have a small change to rdma_lat.c I'd like to consider for ofed-1.3. It adds a new option to allow specifying the max inline size, which is currently hard-coded to 400. 400 is too big for chelsio devices. Adding a command line option will allow the user to specify a different value. I have a test patch ready. Steve.
Re: [ewg] [PATCH v2] rdma_lat: Add -m --max-inline option to support devices with different
Ignore this patch. Its bad. Stay tuned for v3... :( Steve Wise wrote: From: Steve Wise [EMAIL PROTECTED] inline max values. Currently the max inline value is hard-coded and too big for the chelsio device. This patch allows specifying the max inline as a command line param. Signed-off-by: Steve Wise [EMAIL PROTECTED] --- rdma_lat.c | 13 ++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/rdma_lat.c b/rdma_lat.c index 68c9120..f2ef8e2 100755 --- a/rdma_lat.c +++ b/rdma_lat.c @@ -60,6 +60,7 @@ #define PINGPONG_RDMA_WRID 3 #define MAX_INLINE 400 +static int inline_size = MAX_INLINE; static int page_size; static pid_t pid; @@ -603,7 +604,7 @@ static struct pingpong_context *pp_init_ctx(void *ptr, struct pp_data *data) .max_recv_wr = 1, .max_send_sge = 1, .max_recv_sge = 1, - .max_inline_data = MAX_INLINE + .max_inline_data = inline_size, }, .qp_type = IBV_QPT_RC }; @@ -915,6 +916,7 @@ static void usage(const char *argv0) printf( -s, --size=size size of message to exchange (default 1)\n); printf( -t, --tx-depth=dep size of tx queue (default 50)\n); printf( -n, --iters=itersnumber of exchanges (at least 2, default 1000)\n); + printf( -I, --inline_size=size max size of message to be sent in inline mode (default 400)\n); printf( -C, --report-cyclesreport times in cpu cycle units (default microseconds)\n); printf( -H, --report-histogram print out all results (default print summary only)\n); printf( -U, --report-unsorted (implies -H) print out unsorted results (default sorted)\n); @@ -1036,6 +1038,7 @@ int main(int argc, char *argv[]) { .name = size, .has_arg = 1, .val = 's' }, { .name = iters, .has_arg = 1, .val = 'n' }, { .name = tx-depth, .has_arg = 1, .val = 't' }, + { .name = max-inline, .has_arg = 1, .val = 'm' }, { .name = report-cycles, .has_arg = 0, .val = 'C' }, { .name = report-histogram,.has_arg = 0, .val = 'H' }, { .name = report-unsorted,.has_arg = 0, .val = 'U' }, @@ -1043,7 +1046,7 @@ int main(int argc, char *argv[]) { 0 } }; - c = 
getopt_long(argc, argv, p:d:i:s:n:t:CHUc, long_options, NULL); + c = getopt_long(argc, argv, p:d:i:s:n:t:I:CHUc, long_options, NULL); if (c == -1) break; @@ -1087,6 +1090,10 @@ int main(int argc, char *argv[]) break; + case 'I': + inline_size = strtol(optarg, NULL, 0); + break; + case 'C': report.cycles = 1; break; @@ -1192,7 +1199,7 @@ int main(int argc, char *argv[]) ctx-wr.sg_list= ctx-list; ctx-wr.num_sge= 1; ctx-wr.opcode = IBV_WR_RDMA_WRITE; - if (ctx-size MAX_INLINE || ctx-size == 0) { + if (ctx-size inline_size || ctx-size == 0) { ctx-wr.send_flags = IBV_SEND_SIGNALED; } else { ctx-wr.send_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE; ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] [PATCH v2] rdma_lat: Add -m --max-inline option to support devices with different
From: Steve Wise [EMAIL PROTECTED] inline max values. Currently the max inline value is hard-coded and too big for the chelsio device. This patch allows specifying the max inline as a command line param. Signed-off-by: Steve Wise [EMAIL PROTECTED] --- rdma_lat.c | 13 ++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/rdma_lat.c b/rdma_lat.c index 68c9120..f2ef8e2 100755 --- a/rdma_lat.c +++ b/rdma_lat.c @@ -60,6 +60,7 @@ #define PINGPONG_RDMA_WRID 3 #define MAX_INLINE 400 +static int inline_size = MAX_INLINE; static int page_size; static pid_t pid; @@ -603,7 +604,7 @@ static struct pingpong_context *pp_init_ctx(void *ptr, struct pp_data *data) .max_recv_wr = 1, .max_send_sge = 1, .max_recv_sge = 1, - .max_inline_data = MAX_INLINE + .max_inline_data = inline_size, }, .qp_type = IBV_QPT_RC }; @@ -915,6 +916,7 @@ static void usage(const char *argv0) printf( -s, --size=size size of message to exchange (default 1)\n); printf( -t, --tx-depth=dep size of tx queue (default 50)\n); printf( -n, --iters=itersnumber of exchanges (at least 2, default 1000)\n); + printf( -I, --inline_size=size max size of message to be sent in inline mode (default 400)\n); printf( -C, --report-cyclesreport times in cpu cycle units (default microseconds)\n); printf( -H, --report-histogram print out all results (default print summary only)\n); printf( -U, --report-unsorted (implies -H) print out unsorted results (default sorted)\n); @@ -1036,6 +1038,7 @@ int main(int argc, char *argv[]) { .name = size, .has_arg = 1, .val = 's' }, { .name = iters, .has_arg = 1, .val = 'n' }, { .name = tx-depth, .has_arg = 1, .val = 't' }, + { .name = max-inline, .has_arg = 1, .val = 'm' }, { .name = report-cycles, .has_arg = 0, .val = 'C' }, { .name = report-histogram,.has_arg = 0, .val = 'H' }, { .name = report-unsorted,.has_arg = 0, .val = 'U' }, @@ -1043,7 +1046,7 @@ int main(int argc, char *argv[]) { 0 } }; - c = getopt_long(argc, argv, p:d:i:s:n:t:CHUc, long_options, NULL); + c = 
getopt_long(argc, argv, p:d:i:s:n:t:I:CHUc, long_options, NULL); if (c == -1) break; @@ -1087,6 +1090,10 @@ int main(int argc, char *argv[]) break; + case 'I': + inline_size = strtol(optarg, NULL, 0); + break; + case 'C': report.cycles = 1; break; @@ -1192,7 +1199,7 @@ int main(int argc, char *argv[]) ctx-wr.sg_list= ctx-list; ctx-wr.num_sge= 1; ctx-wr.opcode = IBV_WR_RDMA_WRITE; - if (ctx-size MAX_INLINE || ctx-size == 0) { + if (ctx-size inline_size || ctx-size == 0) { ctx-wr.send_flags = IBV_SEND_SIGNALED; } else { ctx-wr.send_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE; ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] [PATCH v3 ofed-1.3] rdma_lat: Add option to support devices with different inline max values.
rdma_lat: Add option to support devices with different inline max values.

Currently the max inline value is hard-coded and too big for the chelsio device. This patch allows specifying the max inline as a command line param.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---
 rdma_lat.c | 13 ++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/rdma_lat.c b/rdma_lat.c
index 68c9120..30cb4a3 100755
--- a/rdma_lat.c
+++ b/rdma_lat.c
@@ -60,6 +60,7 @@
 #define PINGPONG_RDMA_WRID 3
 #define MAX_INLINE 400
+static int inline_size = MAX_INLINE;
 static int page_size;
 static pid_t pid;
@@ -603,7 +604,7 @@ static struct pingpong_context *pp_init_ctx(void *ptr, struct pp_data *data)
 			.max_recv_wr = 1,
 			.max_send_sge = 1,
 			.max_recv_sge = 1,
-			.max_inline_data = MAX_INLINE
+			.max_inline_data = inline_size,
 		},
 		.qp_type = IBV_QPT_RC
 	};
@@ -915,6 +916,7 @@ static void usage(const char *argv0)
 	printf(" -s, --size=<size> size of message to exchange (default 1)\n");
 	printf(" -t, --tx-depth=<dep> size of tx queue (default 50)\n");
 	printf(" -n, --iters=<iters> number of exchanges (at least 2, default 1000)\n");
+	printf(" -I, --inline_size=<size> max size of message to be sent in inline mode (default 400)\n");
 	printf(" -C, --report-cycles report times in cpu cycle units (default microseconds)\n");
 	printf(" -H, --report-histogram print out all results (default print summary only)\n");
 	printf(" -U, --report-unsorted (implies -H) print out unsorted results (default sorted)\n");
@@ -1036,6 +1038,7 @@ int main(int argc, char *argv[])
 			{ .name = "size", .has_arg = 1, .val = 's' },
 			{ .name = "iters", .has_arg = 1, .val = 'n' },
 			{ .name = "tx-depth", .has_arg = 1, .val = 't' },
+			{ .name = "inline_size", .has_arg = 1, .val = 'I' },
 			{ .name = "report-cycles", .has_arg = 0, .val = 'C' },
 			{ .name = "report-histogram", .has_arg = 0, .val = 'H' },
 			{ .name = "report-unsorted", .has_arg = 0, .val = 'U' },
@@ -1043,7 +1046,7 @@ int main(int argc, char *argv[])
 			{ 0 }
 		};
-		c = getopt_long(argc, argv, "p:d:i:s:n:t:CHUc", long_options, NULL);
+		c = getopt_long(argc, argv, "p:d:i:s:n:t:I:CHUc", long_options, NULL);
 		if (c == -1)
 			break;
@@ -1087,6 +1090,10 @@ int main(int argc, char *argv[])
 			break;
+		case 'I':
+			inline_size = strtol(optarg, NULL, 0);
+			break;
+
 		case 'C':
 			report.cycles = 1;
 			break;
@@ -1192,7 +1199,7 @@ int main(int argc, char *argv[])
 	ctx->wr.sg_list = &ctx->list;
 	ctx->wr.num_sge = 1;
 	ctx->wr.opcode = IBV_WR_RDMA_WRITE;
-	if (ctx->size > MAX_INLINE || ctx->size == 0) {
+	if (ctx->size > inline_size || ctx->size == 0) {
 		ctx->wr.send_flags = IBV_SEND_SIGNALED;
 	} else {
 		ctx->wr.send_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE;
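For readers who want to see the two pieces of logic from this patch in isolation, here is a minimal standalone sketch. It is not the perftest code: the `demo_` names and flag values are hypothetical (real code would use `IBV_SEND_SIGNALED` and `IBV_SEND_INLINE` from `<infiniband/verbs.h>`), and the input validation around `strtol` is an addition that the patch itself does not have.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical flag values for this sketch only. */
enum {
	DEMO_SEND_SIGNALED = 1 << 0,
	DEMO_SEND_INLINE   = 1 << 1,
};

#define DEMO_DEFAULT_INLINE 400   /* mirrors the old hard-coded limit */

/* Parse an inline-size argument. Returns 0 on success, -1 on bad input.
 * Base 0 lets the user pass decimal, octal or hex, as in the patch. */
int demo_parse_inline_size(const char *arg, int *out)
{
	char *end;
	long val;

	assert(arg != NULL && out != NULL);
	errno = 0;
	val = strtol(arg, &end, 0);
	if (errno || end == arg || *end != '\0' || val < 0 || val > INT_MAX)
		return -1;
	*out = (int)val;
	return 0;
}

/* Same decision as the patched test: only non-empty messages that fit
 * within the configured limit are sent inline. */
int demo_send_flags(int msg_size, int inline_size)
{
	if (msg_size > inline_size || msg_size == 0)
		return DEMO_SEND_SIGNALED;
	return DEMO_SEND_SIGNALED | DEMO_SEND_INLINE;
}
```

With a limit of 64, for example, a 1-byte message would be posted inline while a 65-byte message would not.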
[Fwd: Re: [ewg] [PATCH ofed-1.3] rdma_lat: Add -m --max-inline option to support devices with different]
Doran/Oren, Can you please push this into ofed-1.3? Thanks, Steve.

Original Message
Subject: Re: [ewg] [PATCH ofed-1.3] rdma_lat: Add -m --max-inline option to support devices with different
Date: Thu, 07 Feb 2008 09:17:30 -0600
From: Steve Wise [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
CC: Oren Meron [EMAIL PROTECTED], Sagi Rotem [EMAIL PROTECTED], [EMAIL PROTECTED]
References: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]

Dotan Barak wrote:

Steve Wise wrote:

Oren Meron wrote:

Hi Steve, rdma_lat is one of our older tests. We now use the newer read_lat and write_lat instead, which support inline specification on the command line. The only disadvantage of the new tests is that they do not yet support CMA. Will that satisfy?

No, because the RDMA CMA is required for iwarp devices.

Steve: I remember that you added the CMA to the rdma_* tests. I don't want to be rude, but can you find some time to add CMA support to the other tests? (Then we won't have a good reason to continue supporting the rdma_* tests.) Thanks, Dotan

I guess I can do this. But not in time for ofed-1.3. If you will get the rdma_lat change in for 1.3, then I'll get the other apps enabled for iwarp for 1.4 or 1.3.x. Sound like a plan?

Steve.
[ewg] [Fwd: [PATCH] libcxbg3: zeroing of wc_flags]
Tziporet, I've released libcxgb3-1.1.4 with this single change. Can we please merge this into ofed-1.3-rc5? It's a small change, it enables the OMPI support we're working on, and is limited to libcxgb3 only. If that's ok, then vlad please pull from:

git://git.openfabrics.org/~swise/libcxgb3.git ofed_1_3

and for ofed-1.2.5:

git://git.openfabrics.org/~swise/libcxgb3.git ofed_1_2_5

Thanks, Steve.

Original Message
Subject: [PATCH] libcxbg3: zeroing of wc_flags
Date: Tue, 12 Feb 2008 14:14:48 -0600
From: Jon Mason [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]

From 666b9d67dda0fd01e90ceb93b189a773d14916d5 Mon Sep 17 00:00:00 2001
From: Jon Mason [EMAIL PROTECTED]
Date: Tue, 12 Feb 2008 14:08:02 -0600
Subject: [PATCH]

The wc_flags field in struct ibv_wc is left uninitialized in iwch_poll_cq_one. User space applications may check this field and deterministically perform actions based on the garbage in the field. Zeroing this out will prevent this unintended behavior.

Signed-off-by: Jon Mason [EMAIL PROTECTED]
---
 src/cq.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/cq.c b/src/cq.c
index d27c6b7..fcf91c8 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -277,6 +277,7 @@ int iwch_poll_cq_one(struct iwch_device *rhp, struct iwch_cq *chp,
 	wc->wr_id = cookie;
 	wc->qp_num = qhp->wq.qpid;
 	wc->vendor_err = CQE_STATUS(cqe);
+	wc->wc_flags = 0;
 	PDBG("%s qpid 0x%x type %d opcode %d status 0x%x wrid hi 0x%x "
 	     "lo 0x%x cookie 0x%" PRIx64 "\n",
--
1.5.3.3
Re: [ewg] OFED 1.3 fix releases
Hal Rosenstock wrote:

Hi Tziporet, I know OFED 1.3 GA is just out but I'm trying to look ahead. Will there be fix releases for OFED 1.3? If so, are these going to be time based (every n weeks) or on a critical need basis? Thanks.

I hope we can put out point releases like we did for ofed-1.2.5.x. At least for bug fixes in device drivers and libs.

Steve.
Re: [ewg] OFED March 24 meeting summary on OFED 1.4 plans
Tziporet Koren wrote:

OFED March 24 meeting summary about OFED 1.4 and 1.3.1 plans:

1.3.1 Release: As we decided, we should do a release 2-3 months after 1.3. In addition, if there are any special fixes as an outcome of the interop we can do a release earlier. All - please send me your requests for fixed issues and the needed time frame and I will publish the 1.3.1 schedule based on this.

There will be some cxgb3 driver changes for 1.3.1.

OFED 1.4:

1. Kernel base: since we target the 1.4 release to Sep, we target the kernel base to be 2.6.27. This is a good target, but we may need to stay with 2.6.26 if the kernel progress is not aligned.

2. Suggestions for new features:
* NFS-RDMA
* Verbs: Reliable Multicast (to be presented at Sonoma)
* SDP - Zero copy (There was a question on IPv6 support - seems no one is interested for now)
* IPoIB - continue with performance enhancements
* Xsigo new virtual NIC
* New vendor HW support - none was reported so far (IBM and Chelsio - do you have something?)
* OpenSM: Incremental routing; Temporary SA DB - to answer queries and a heavy sweep is done; APM - disjoint paths (?); MKey manager (?); Sasha to send more management features
* MPI: Open MPI 1.3; APM support in MPI; mvapich ???
* uDAPL: Extensions for new APIs (like XRC) - ?; uDAPL provider for interop between Windows & Linux; 1.2 and 2.0 will stay

Sorry I missed these meetings. For iWARP, here is my plan:

New iWARP Verbs:
- stag_alloc/dealloc
- nsmr_fastreg
- read-with-inv-local-stag
- inv-local-stag

Note the above verbs might be transport-independent. I believe the IBTA has defined a fastreg verb too?

- peer-2-peer support in IWCM/Drivers

Steve.

3. Supported OSes & kernels:
* RHEL 4 up 5,6,7
* RHEL 5 up 1,2 - all vendors please see if we can drop RHEL5 (base)
* SLES10 SP1 SP2
* Kernel.org: start from 2.6.18 - need to decide here based on customers' requests
* Fedora C8
* OpenSuSE 10.3

Open (for discussion at Sonoma):
- Do we want to move to the user level part being based on tarballs and not git
- Kernel.org and OFED kernel modules

Tziporet
Re: [ewg] RE: [ofa-general] how do I use uDAPL with iWARP?
Scott Weitzenkamp (sweitzen) wrote:

I tried that, and it didn't work:

[EMAIL PROTECTED] ~]# grep eth /etc/dat.conf
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 eth2 0
[EMAIL PROTECTED] ~]# dtest
10194 Running as server - OpenIB-cma
10194 Error dat_ep_create: DAT_INVALID_HANDLE
10194 Error freeing EP: DAT_INVALID_HANDLE DAT_INVALID_HANDLE_EP

Try setting DAPL_MAX_INLINE=64
Re: [ewg] RE: [ofa-general] how do I use uDAPL with iWARP?
What does your network interface config look like? Does rping work?

Scott Weitzenkamp (sweitzen) wrote:

Steve,

Thanks, that gets further, but dtest still fails.

Client side:

[EMAIL PROTECTED] ~]$ DAPL_MAX_INLINE=64 dtest -h 192.168.0.198
13926 Running as client - OpenIB-cma
13926 Server Name: 192.168.0.198
13926 Server Net Address: 192.168.0.198
13926 Waiting for connect response
13926 Error unexpected conn event : DAT_CONNECTION_EVENT_UNREACHABLE
13926 Error connect_ep: DAT_ABORT
13926: DAPL Test Complete.
13926: Message RTT: Total= 0.00 usec, 10 bursts, itime= 0.00 usec, pc= 0
13926: RDMA write: Total= 0.00 usec, 10 bursts, itime= 0.00 usec, pc= 0
13926: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
13926: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
13926: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
13926: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
13926: open:36619.19 usec
13926: close: 32500.98 usec
13926: PZ create: 7.87 usec
13926: PZ free: 4.05 usec
13926: LMR create: 58.89 usec
13926: LMR free: 11.92 usec
13926: EVD create: 9.78 usec
13926: EVD free: 14.07 usec
13926: EP create: 78.92 usec
13926: EP free:26.23 usec
13926: TOTAL: 199.79 usec

Server side:

[EMAIL PROTECTED] ~]$ DAPL_MAX_INLINE=64 dtest
11461 Running as server - OpenIB-cma
11461 Server waiting for connect request..
11461 Waiting for connect response
11461 CONNECTED!
11461 Send RMR to remote: snd_msg: r_key_ctx=bff,pad=0,va=146db580,len=0x40
11461 Waiting for remote to send RMR data
11461 Error waiting on h_dto_rcv_evd: DAT_TIMEOUT_EXPIRED
11461 Error connect_ep: DAT_TIMEOUT_EXPIRED
11461: DAPL Test Complete.
11461: Message RTT: Total= 0.00 usec, 10 bursts, itime= 0.00 usec, pc= 0
11461: RDMA write: Total= 0.00 usec, 10 bursts, itime= 0.00 usec, pc= 0
11461: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
11461: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
11461: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
11461: RDMA read: Total= 0.00 usec, 4 bursts, itime= 0.00 usec, pc =0
11461: open: 900676.01 usec
11461: close: 31543.97 usec
11461: PZ create: 7.87 usec
11461: PZ free: 5.01 usec
11461: LMR create: 51.98 usec
11461: LMR free: 12.16 usec
11461: EVD create: 10.97 usec
11461: EVD free: 12.87 usec
11461: EP create: 77.01 usec
11461: EP free:30.04 usec
11461: TOTAL: 195.03 usec

Scott
Re: [ewg] RE: [ofa-general] how do I use uDAPL with iWARP?
I can reproduce this. Lemme dig into it...

Steve.
Re: [ewg] RE: [ofa-general] how do I use uDAPL with iWARP?
Guys,

I think this is the same iWARP issue that has been biting me for a while: The client must send the first RDMA message. The dtest app is a peer-2-peer (p2p) application where both sides send immediately after setting up the connection. So dtest doesn't adhere to the iWARP specification (I know: the iWARP spec is broken :).

News: I have some prototype FW from Chelsio that supports p2p setup, and with that FW and my associated iw_cxgb3 driver/library changes, dtest seems to work fine. These changes will be published upstream soon in order to support Open MPI and other p2p applications for Chelsio.

For this initial release of p2p support over Chelsio, the functionality will be 100% handled in the iw_cxgb3 driver and FW. This is similar to what iw_nes does today with its send_first module option to send a 0B write from the client and defer connection establishment on the server until the 0B write is received. Chelsio will have a similar module option called peer2peer (or I could make it the same option name: send_first) that will use a 0B read to force the client to send first (Chelsio cannot use a 0B write for this). The Chelsio FW will defer the ESTABLISHED event until the 0B read is received and responded to.

The final proper device-independent solution to this will be done in the rdma-cma, the iwarp core and iwarp devices for upstream inclusion as well as for ofed-1.4. It's a much bigger change and will probably affect the ABI for the rdma_cm (app can request p2p behavior). There was a thread a while back driven by Arkady at NetApp with details on how we will implement this (using a small protocol in the MPA start req/rep to negotiate this p2p mode).

Stay tuned for more on this.

Steve.
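The peer2peer behaviour described in this message can be pictured with a small toy model. This is only a sketch under stated assumptions: the function names and event strings are hypothetical and are not taken from iw_cxgb3 or the Chelsio firmware. It shows the one property the message relies on, namely that with the option on, the server has already seen a client message (the 0B read) by the time it reports ESTABLISHED.

```python
# Toy model of the peer2peer connection setup described above.
# All names and event strings are hypothetical; this is not driver or
# firmware code.

def server_side_events(peer2peer):
    """Return the order of events seen for one connection setup.

    With peer2peer enabled, the client issues a 0B RDMA read right after
    the MPA exchange, and the server side reports ESTABLISHED only after
    that read has been received and answered.
    """
    events = ["client: MPA start request", "server: MPA start reply"]
    if peer2peer:
        events.append("client: 0B read request")   # forces the client to send first
        events.append("server: 0B read response")
    events.append("server: ESTABLISHED")
    return events


def client_sent_before_established(peer2peer):
    """True if a client RDMA message precedes the server's ESTABLISHED event."""
    events = server_side_events(peer2peer)
    cut = events.index("server: ESTABLISHED")
    return any(e.startswith("client: 0B") for e in events[:cut])
```

With the option off, a p2p application such as dtest can post a server-side send before any client message has arrived, which is the situation the iWARP rule forbids.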
[ewg] Re: ofed_kernel git tree for OFED-1.4 (based on 2.6.25-rc7)
Vladimir Sokolovsky wrote:

Hi Steve,
I prepared ofed_kernel git tree: git://git.openfabrics.org/ofed_1_4/linux-2.6.git branch ofed_kernel. This tree merged with 2.6.25-rc7.
Currently ofed_scripts/ofed_makedist.sh fails on cxgb3_0040_Add_EEH_support.patch:

/usr/bin/quilt --quiltrc /tmp/build-ofed_kernel-g19281/ofed_kernel-2.6.11/patches/quiltrc push patches/cxgb3_0040_Add_EEH_support.patch
Applying patch cxgb3_0040_Add_EEH_support.patch
patching file drivers/net/cxgb3/cxgb3_main.c
Hunk #1 succeeded at 2498 with fuzz 2 (offset 182 lines).
Hunk #2 FAILED at 2904.
1 out of 2 hunks FAILED -- rejects in file drivers/net/cxgb3/cxgb3_main.c
Patch cxgb3_0040_Add_EEH_support.patch does not apply (enforce with -f)

Should this patch be removed from the git tree?

Regards,
Vladimir

Yes. In fact: All of the cxgb_* patches can be removed except:

cxgb3_00300_add_ofed_version_tag.patch

And all of the iw_cxgb_* patches can be removed.

Steve.
[ewg] [ANNOUNCE] libcxgb3-1.1.5 released
All, I've released version 1.1.5 of libcxgb3. The changes include 2 minor fixes, and some house-keeping to make the release easily integrate into distros. Thanks Roland for helping me see the light. :) Steve. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] Agenda for the OFED meeting today
Hey Tziporet,

Sorry I missed today's call. If possible, I'd like a few weeks to get the cxgb3 fixes tested and ready to go. That puts me around mid-May. I'll try to pull that in to make an RC1 of May 6, but I'm thinking I might need another week or so.

Steve.

Tziporet Koren wrote:

Hi,
This is the agenda for the OFED meeting today:

1. OFED 1.3.1:
   1.1 Planned changes:
       ULPs changes:
         IB-bonding - done
         SRP failover - on work
         SDP crashes - on work
         RDS fixes for RDMA API - already applied but not clear if these are all the changes
         librdmacm 1.0.7 - done
         Open MPI 1.2.6 - done
       Low level drivers: - each HW vendor should reply when the changes will be ready
         nes
         mlx4
         cxgb3
         Ipath
         ehca
   1.2 Schedule:
       GA is planned for May-29
       I suggest to have only two release candidates:
       - RC1 - May 6
       - RC2 - May 20
   Note: daily builds of 1.3.1 are already available at: http://www.openfabrics.org/builds/ofed-1.3.1

2. OFED 1.4:
   Release features were presented at Sonoma (presentation available at http://www.openfabrics.org/archives/april2008sonoma.htm)
   Kernel tree is under work at: git://git.openfabrics.org/ofed_1_4/linux-2.6.git branch ofed_kernel
   Now failing on ipath drivers - waiting for an update.
   We should try to get the kernel code to compile as soon as possible so everybody will be able to contribute code

3. Follow up from Sonoma - open discussion

Tziporet
[ewg] [GIT PULL ofed-1.3.1] - chelsio changes for ofed-1.3.1
Vlad,

Please pull from:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

This will sync up ofed-1.3.1 with all the important upstream fixes since ofed-1.3. The patch files added are:

kernel_patches/fixes/iw_cxgb3_0080_Fail_Loopback_Connections.patch
kernel_patches/fixes/iw_cxgb3_0090_Fix_shift_calc_in_build_phys_page_list_for_1-entry_page_lists.patch
kernel_patches/fixes/iw_cxgb3_0100_Return_correct_max_inline_data_when_creating_a_QP.patch
kernel_patches/fixes/iw_cxgb3_0110_Fix_iwch_create_cq_off-by-one_error.patch
kernel_patches/fixes/iw_cxgb3_0120_Dont_access_a_cm_id_after_dropping_reference.patch
kernel_patches/fixes/iw_cxgb3_0130_Correctly_set_the_max_mr_size_device_attribute.patch
kernel_patches/fixes/iw_cxgb3_0140_Correctly_serialize_peer_abort_path.patch
kernel_patches/fixes/iw_cxgb3_0150_Support_peer-2-peer_connection_setup.patch

Thanks,
Steve.
[ewg] [GIT PULL ofed-1.3.1] libcxgb3 version 1.2.0
Vlad, Please pull in version 1.2.0 of libcxgb3. This is needed for the ofed-1.3.1 kernel drivers. Pull from: git://git.openfabrics.org/~swise/libcxgb3 ofed_1_3_1 Thanks, Steve. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Re: [ofa-general] [GIT PULL ofed-1.3.1] - chelsio changes for ofed-1.3.1
Roland Dreier wrote:

Steve -- did the IRD/ORD mixup fix get included? (It's 1f71f503, RDMA/cxgb3: Program hardware IRD with correct value, in the upstream kernel.)

Oops. Good catch. No worries though, I've got another series to post (including the qp flush bug NFSRDMA found) for ofed-1.3.1, so I'll add this one.

Thanks,
Steve.
Re: [ewg] [GIT PULL ofed-1.3.1] libcxgb3 version 1.2.0
Roland Dreier wrote: Steve -- If you put a tarball (from make dist ;) on openfabrics.org, I'll update the Debian packages. I plan to do this soon. Steve. ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Re: [PATCH] Request For Comments:
Roland Dreier wrote:

- always do peer2peer and don't let the app choose. This forces the overhead of p2p mode on all apps, but preserves the API.

How bad is the overhead? - R.

The client side must send a Ready To Receive (RTR) message. This will be negotiated via the MPA exchange, and the resulting RTR message may be a 0B read + read response, a 0B write, or a 0B send. For chelsio, the 0B write couldn't be used, and the 0B read had the least impact on the driver code, so we used that. For nes, they currently use a 0B write.

Also, there are some caveats if you turn this on:

1) private data is used to negotiate the type of RTR message and whether it's needed. This is more of a global module option I think, since it will break interoperability with iwarp. Prolly will bump the MPA version number if this option is on too.
2) if the RTR message fails, it can generate a CQE that is unexpected.
3) if using SEND, then a recv completion is always generated.

Steve.
[ewg] Re: [ofa-general] Re: [PATCH] Request For Comments:
Caitlin Bestler wrote:

Keep in mind that even if it is a zero byte RDMA Write, it is still a distinct packet that needs TCP handling, will occupy a buffer in various switch queues, etc. So while it can be about as innocuous as any TCP segment can be, it is still an excess packet if it did not need to be sent.

The overwhelming majority of applications use a client/server model rather than peer2peer. For them this is an excess wire packet, so I think that would make it excessive overhead.

Secondly, the applications that need this feature will generally know that they need it. Developers of MPI and other peer-2-peer applications tend to know advanced networking a bit more than typical app developers. So keeping the default to match the client/server model makes sense.

What are the overwhelming majority of user mode rdma applications that don't assume a peer2peer model?

Steve.
[ewg] Re: [ofa-general] Re: [PATCH] Request For Comments:
Sean Hefty wrote:

nes - requires 0b write
cxgb3 - requires 0b read
amso1100 - won't work in p2p mode

I'm assuming by requires that you, uhm, mean requires, and nes couldn't do 0b reads, or cxgb3 0b writes.

Well, I'm not sure about nes. But cxgb3 cannot deal with receiving a 0B write for the RTR because the FW doesn't see incoming writes, nor does the driver. nes may be able to request a 0b read, but what I meant was they currently use a 0B write and not a read. So it's possible to reduce the complexity if we just mandate 0B read for RTR. But it makes sense in my mind to allow the other message types...

It is painful. But without anything, you cannot run OMPI, IMPI or HPMPI on an iwarp cluster with mixed vendor rnics...

Is there any requirement at the receiving side, versus the initiating side? That is, just because nes issues a 0b write, does the receiving HW care if a read or write shows up? Or is this restriction on both sides?

The requirement is mostly driven from the receiving side. For cxgb3 it is anyway... The receiving side, i.e. the side that issues the rdma_accept, will tell the sending side what RTR message to send, if any. So the MPA exchange will look like this:

- client sends MPA Start request with private data saying "I can send an RTR if you want it".
- server moves connection into RDMA mode
- server sends MPA Start response with "let's do RTR and send me X", where X could be 0B write, 0B read request or 0B send.
- client moves connection into RDMA mode
- client sends X and then enables SQ processing (or indicates ESTABLISHED)
- Once server gets X it can enable SQ processing (or indicate ESTABLISHED)
- If X was a 0B read request, server sends 0B read response.

Steve
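The exchange outlined in this message can be written down as a short sequence generator. This is a minimal sketch, assuming the step order given above; the function name, parameter names, and message strings are illustrative only and do not reflect the MPA wire format or any rdma_cm API.

```python
# Sketch of the RTR negotiation outlined above. Names and message strings
# are illustrative assumptions, not the MPA wire format or rdma_cm API.

RTR_TYPES = ("0B write", "0B read", "0B send")


def negotiate_rtr(client_offers_rtr, server_wants):
    """Return the message sequence for one connection setup.

    client_offers_rtr: True if the MPA Start request's private data says
        "I can send an RTR if you want it".
    server_wants: one of RTR_TYPES, or None if the server needs no RTR.
    """
    if server_wants is not None and server_wants not in RTR_TYPES:
        raise ValueError("unknown RTR type: %r" % (server_wants,))

    request = "client -> server: MPA Start request"
    if client_offers_rtr:
        request += " (RTR offered)"
    steps = [request, "server: enter RDMA mode"]

    # RTR is used only when the client offered it and the server asked for it.
    use_rtr = client_offers_rtr and server_wants is not None
    reply = "server -> client: MPA Start response"
    if use_rtr:
        reply += " (send me %s)" % server_wants
    steps.append(reply)
    steps.append("client: enter RDMA mode")

    if use_rtr:
        steps.append("client -> server: %s" % server_wants)
        steps.append("client: enable SQ processing")
        steps.append("server: enable SQ processing")
        if server_wants == "0B read":
            steps.append("server -> client: 0B read response")
    return steps
```

Calling `negotiate_rtr(True, "0B read")` models the cxgb3 case and `negotiate_rtr(True, "0B write")` the current nes case; when no RTR is negotiated, the sketch stops after both sides enter RDMA mode and leaves the first FPDU to the ULP.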
[ewg] Re: [ofa-general] Re: [PATCH] Request For Comments:
From RFC 5044, section 7.1.2 Connection Startup Rules, Page 29:

4. MPA Responder mode implementations MUST receive and validate at least one FPDU before sending any FPDUs or Markers.

Note: This requirement is present to allow the Initiator time to get its receiver into Full Operation before an FPDU arrives, avoiding potential race conditions at the Initiator. This was also subject to some debate in the work group before rough consensus was reached. Eliminating this requirement would allow faster startup in some types of applications. However, that would also make certain implementations (particularly dual stack) much harder.

Steve Wise wrote:

Sean Hefty wrote:

The requirement is mostly driven from the receiving side. For cxgb3 it is anyway...

Maybe you can help me understand the spec here. If we ignore this feature for a minute, then the side that calls rdma_connect() must instead issue the first 'send' request to the server. Can the first 'send' be a 0B rdma write or read?

According to the MPA IETF RFC, the initiator must send the first FPDU. That could be anything. The spec leaves it up to the ULP.

Why wouldn't the target of that request not have to transition to connected?

I don't understand this question? What does 'transition to connected' mean? The requirement is that the responder (the side that issues the rdma_accept in rdma-cma terms) _cannot_ send an FPDU until it first receives one from the initiator. How that is enforced is an implementation detail. The responder driver could hold off on the ESTABLISHED event until it receives the first FPDU. Or it could stall SQ processing until the first FPDU is received yet still indicate that the connection is ESTABLISHED.

Is the issue that there's no way for the receiving FW/driver to know that this has occurred so that it can signal that the connection has been established? I.e. a client that does this must signal the server that things are ready through some out of band means.

I don't understand what you're getting at exactly. The issue is that the server doesn't know when the client receives the MPA Start Response and has successfully transitioned the connection into RDMA mode. If the server sends an FPDU immediately following the MPA Start Response (which is in streaming mode), then it's possible for that first FPDU to get passed up to the driver/ULP as streaming mode data. Which breaks everything. So, the spec says the server cannot send an FPDU until it first receives one and thus _knows_ the client is in RDMA mode (by virtue of the fact that the client sent an FPDU).

server sends MPA Start response with "let's do RTR and send me X", where X could be 0B write, 0B read request or 0B send.

Are there any restrictions where a client may not be able to issue what the server requests? E.g. the hardware doesn't issue 0B writes.

Well I guess there could be. The consensus among the iWARP vendors at Reno was that a 0B read would be OK. During the previous discussion on this list shortly after Reno, issues were raised that we should allow other types. We could make the MPA start request have more info than "I can do RTR". It could have "Here are the RTR msgs I can send." Does that help?

Steve.
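The responder-side rule discussed here (no FPDU may be sent until one has been received) can be sketched as follows. This is a hedged illustration of the "stall SQ processing" option mentioned above, not how any driver actually implements it; the `Responder` class and its method names are hypothetical.

```python
# Sketch of the RFC 5044 section 7.1.2 rule quoted above: an MPA responder
# must receive one FPDU before it sends any. Class and method names are
# hypothetical; this models the "stall SQ processing" option only.

class Responder:
    def __init__(self):
        self.seen_fpdu = False   # set once the initiator's first FPDU arrives
        self.pending = []        # sends posted before that point
        self.wire = []           # FPDUs actually transmitted, in order

    def post_send(self, payload):
        """ULP posts a send; it is stalled until the first FPDU is seen."""
        if self.seen_fpdu:
            self.wire.append(payload)
        else:
            self.pending.append(payload)

    def receive_fpdu(self, payload):
        """The first inbound FPDU shows the initiator is in RDMA mode."""
        if not self.seen_fpdu:
            self.seen_fpdu = True
            self.wire.extend(self.pending)  # flush stalled sends in order
            self.pending.clear()
```

The other option described above, deferring the ESTABLISHED event itself, would move the same check up a layer so that the ULP cannot post a send at all until the first FPDU has arrived.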
[ewg] Re: [ofa-general] Re: [PATCH] Request For Comments:
Here is the thread where we discussed how to implement peer-to-peer for iWARP in Nov/2007:

http://lists.openfabrics.org/pipermail/general/2007-November/043252.html
[ewg] [GIT PULL] ofed-1.3.1 - cxgb3 fixes
Vlad,

Please pull from

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

All the changes are upstream in 2.6.26. Patches added:

kernel_patches/fixes/cxgb3_0240_fix_port_up_down_error_path.patch
kernel_patches/fixes/cxgb3_0250_fix_EEH.patch
kernel_patches/fixes/iw_cxgb3_0200_Dont_add_PBL_memory_to_gen_pool_in_chunks.patch
kernel_patches/fixes/iw_cxgb3_0210_Fix_severe_limit_on_userspace_memory_registration_size.patch
kernel_patches/fixes/iw_cxgb3_0220_Wrap_the_software_sq_ptr_as_needed_on_flush.patch

Steve.
[ewg] Re: [GIT PULL] ofed-1.3.1 - cxgb3 fixes
Strange... I ran the kernel build script on all the kernel versions and platforms and they passed. I must not be doing something correctly when running that script? I'll address this today.

Vladimir Sokolovsky wrote:

Hi Steve,
ofed_makedist.sh script fails to apply cxgb3_remove_eeh.patch. Please fix:

./ofed_scripts/ofed_makedist.sh
Importing patch /tmp/build-ofed_kernel-Q23205/ofed_kernel-2.6.11/kernel_patches/backport/2.6.11/cxgb3_remove_eeh.patch (stored as cxgb3_remove_eeh.patch)
/usr/bin/quilt --quiltrc /tmp/build-ofed_kernel-Q23205/ofed_kernel-2.6.11/patches/quiltrc push patches/cxgb3_remove_eeh.patch
Applying patch cxgb3_remove_eeh.patch
patching file drivers/net/cxgb3/cxgb3_main.c
Hunk #1 FAILED at 2426.
Hunk #2 succeeded at 2699 (offset -34 lines).
1 out of 2 hunks FAILED -- rejects in file drivers/net/cxgb3/cxgb3_main.c
Patch cxgb3_remove_eeh.patch does not apply (enforce with -f)

Regards,
Vladimir
[ewg] Re: [GIT PULL] ofed-1.3.1 - cxgb3 fixes
Vlad,

Please pull from here to get the fix for this:

git://git.openfabrics.org/~swise/ofed-1.3 ofed_kernel

BTW: The build_ofa_kernel.sh script doesn't build all the backports. That is how these slipped through the cracks. I'll always run ofed_makedist.sh from now on as part of my submission process, but it seems like we're missing some backport build checks. These backports aren't getting built by build_ofa_kernel.sh:

2.6.11
2.6.11_FC4
2.6.13_suse10_0_u

Steve.
[ewg] RC2 open-iscsi build problem
Has anyone seen this with rc2?

thanks,
Steve.

Original Message
Subject: RE: [Fwd: [ofa-general] OFED 1.3.1 RC2 release is available]
Date: Fri, 23 May 2008 13:34:19 -0700
From: Johnny Wang [EMAIL PROTECTED]
To: Steve Wise [EMAIL PROTECTED], Scott Bardone [EMAIL PROTECTED], Arvind Chadda [EMAIL PROTECTED]
CC: Jon Mason [EMAIL PROTECTED]

There may be a typo in the installer for rc2. During the installation I got:

Running rpmbuild --rebuild --define '_topdir /var/tmp/OFED_topdir' --define 'dist ' --target x86_64 /root/rdma/OFED-1.3.1-rc2/SRPMS/open-iscsi-generic-2.0-865.15.src.rpm
Failed to build open-iscsi-generic RPM
See /tmp/OFED.3373.logs/open-iscsi-generic.rpmbuild.log

The log file says:

error: cannot open /root/rdma/OFED-1.3.1-rc2/SRPMS/open-iscsi-generic-2.0-865.15.src.rpm: No such file or directory

And doing an ‘ls’ on the SRPM/ there seems to be a “open-iscsi-generic-2.0-865.15.1.src.rpm” but not “open-iscsi-generic-2.0-865.15.src.rpm”

Johnny
[ewg] [PATCH ofed-1.3.1] cxgb3: Release note updates for ofed-1.3.1.
From: Steve Wise [EMAIL PROTECTED]

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---
 cxgb3_release_notes.txt | 135 ---
 1 files changed, 125 insertions(+), 10 deletions(-)

diff --git a/cxgb3_release_notes.txt b/cxgb3_release_notes.txt
index 6d9ee57..7173887 100644
--- a/cxgb3_release_notes.txt
+++ b/cxgb3_release_notes.txt
@@ -1,14 +1,119 @@
 Open Fabrics Enterprise Distribution (OFED)
 CHELSIO T3 RNIC RELEASE NOTES
- February 2008
+ May 2008
-
-Author: Steve Wise
-
 The iw_cxgb3 and cxgb3 modules provide RDMA and NIC support for the
 Chelsio S310/320 and R310/320 series adapters. Make sure you choose the
-'cxgb3' and 'libcxgb3' options when generating your ofed-1.3 rpms.
+'cxgb3' and 'libcxgb3' options when generating your ofed-1.3.1 rpms.
+
+
+
+New enhancements for OFED-1.3.1:
+
+
+- Various MPI libraries are enabled via a new iw_cxgb3 module option
+called peer2peer. When loading iw_cxgb3, set peer2peer=1 to enable Intel
+MPI version 3.1.038, HP MPI version 2.02.05.01, OpenMPI (will be released
+with OMPI-3.1), and Scali MPI (will be available in version 3.13.7).
+This option must be set on all systems in your cluster. See more info
+below on running these MPIs. NOTE: None of these MPIs are included in
+the ofed-1.3.1 release. Contact the specific vendors for obtaining the
+MPI code. Open MPI can be pulled from www.open-mpi.org.
+
+- New 6.0 firmware required. Download this firmware from
+service.chelsio.com and put the t3fw-6.0.0.bin file in /lib/firmware.
+Then reload the cxgb3 module and reconfigure the interface to flash this
+new firmware.
+
+- Large memory registration. User applications can now register 30MB
+memory regions.
+
+
+Enabling Various MPIs
+
+
+For OpenMPI, Intel MPI, HP MPI, and Scali MPI: you must set the iw_cxgb3
+module option peer2peer=1 on all systems. This can be done by writing
+to the /sys/module file system during boot. EG:
+
+# echo 1 > /sys/module/iw_cxgb3/parameters/peer2peer
+
+Or you can add the following line to /etc/modprobe.conf to set the option
+at module load time:
+
+options iw_cxgb3 peer2peer=1
+
+For Intel MPI, HP MPI, and Scali MPI: Enable the chelsio device by adding
+an entry to /etc/dat.conf for the chelsio interface. For instance,
+if your chelsio interface name is eth2, then the following line adds a
+DAT device named chelsio for that interface:
+
+chelsio u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 eth2 0
+
+=
+Intel MPI:
+=
+
+The following env vars enable Intel MPI version 3.1.038. Place these
+in your user env after installing and setting up Intel MPI:
+
+export RSH=ssh
+export DAPL_MAX_INLINE=64
+export I_MPI_DEVICE=rdssm:chelsio
+export MPIEXEC_TIMEOUT=180
+export MPI_BIT_MODE=64
+
+Note: I_MPI_DEVICE=rdssm:chelsio assumes you have an entry in
+/etc/dat.conf named chelsio.
+
+Contact Intel for obtaining their MPI with DAPL support.
+
+=
+HP MPI:
+=
+
+To run HP MPI applications, use these mpirun options:
+
+-prot -e DAPL_MAX_INLINE=64 -UDAPL
+
+EG:
+
+$ mpirun -prot -e DAPL_MAX_INLINE=64 -UDAPL -hostlist r1-iw,r2-iw ~/tests/presta-1.4.0/glob
+
+Where r1-iw and r2-iw are hostnames mapping to the chelsio interfaces.
+
+Also this assumes your first entry in /etc/dat.conf is for the chelsio
+device.
+
+Contact HP for obtaining their MPI with DAPL support.
+
+=
+Scali MPI:
+=
+
+The following env vars enable Scali MPI. Place these in your user env
+after installing and setting up Scali MPI for running over Infiniband:
+
+export DAPL_MAX_INLINE=64
+export SCAMPI_NETWORKS=chelsio
+export SCAMPI_CHANNEL_ENTRY_COUNT=chelsio:128
+
+Note: SCAMPI_NETWORKS=chelsio assumes you have an entry in /etc/dat.conf
+named chelsio.
+
+Contact Scali for obtaining their MPI with DAPL support.
+
+=
+OpenMPI:
+=
+
+OpenMPI iWARP support is only available in version 1.3 or greater.
+
+Open MPI will work without any specific configuration via the openib btl.
+Users wishing to performance tune the configurable options may wish to
+inspect the receive queue values. Those can be found in the Chelsio T3
+section of mca-btl-openib-hca-params.ini.
 
 Loadable Module options:
@@ -17,15 +122,15 @@ Loadable Module options:
 The following options can be used when loading the iw_cxgb3 module to tune
 the iWARP driver:
 
-cong_flavor - set the congestion congtrol algorithm. Default is 1.
+cong_flavor - set the congestion control algorithm. Default is 1.
 0 == Reno
 1 == Tahoe
 2 == NewReno
 3 == HighSpeed
 
-snd_win - set the TCP send window
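Since the release notes require peer2peer=1 on every system in the cluster, it can help to verify the setting before starting an MPI job. The following is a minimal sketch that reads the sysfs parameter file named in the notes. The function name peer2peer_status and the optional path argument are hypothetical, and the sketch assumes the parameter reads back as 1 when enabled.

```shell
# Hypothetical helper: report whether the iw_cxgb3 peer2peer option is set.
# The default path is the sysfs file named in the release notes; the
# optional first argument overrides it (added here only for testing).
peer2peer_status() {
    param_file="${1:-/sys/module/iw_cxgb3/parameters/peer2peer}"
    if [ ! -r "$param_file" ]; then
        echo "unknown"      # module not loaded, or parameter not readable
        return 1
    fi
    # Assumption: the parameter reads back as "1" when enabled.
    if [ "$(cat "$param_file")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Calling peer2peer_status with no argument on each node prints enabled, disabled, or unknown, so a node that was missed when editing /etc/modprobe.conf stands out.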
[ewg] OpenSM from ofed-1.2 and ofed-1.3 clients
Hello opensm gurus: Sandia is seeing problems after migrating up to ofed-1.3. They are still using an ofed-1.2 opensm but with ofed-1.3 clients, updated from ofed-1.2.5. They are getting the errors below. Q: should this work? Or are the backwards compat issues? Thanks, Steve. log: May 23 08:29:22 408613 [45007960] - __osm_sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT) May 23 08:29:22 408622 [45007960] - __osm_sm_mad_ctrl_send_err_cb: ERR 3119: Set method failed May 23 08:29:22 408652 [45007960] - SMP dump: base_ver0x1 mgmt_class..0x81 class_ver...0x1 method..0x2 (SubnSet) D bit...0x0 status..0x0 hop_ptr.0x0 hop_count...0x3 trans_id0x1694a4 attr_id.0x1B (MulticastForwardingTable) resv0x0 attr_mod0x1000 m_key...0x dr_slid.0x dr_dlid.0x Initial path: 0,1,14,9 Return path: 0,0,0,0 Reserved: [0][0][0][0][0][0][0] 00 40 00 40 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 May 23 08:29:22 408689 [45007960] - umad_receiver: ERR 5409: send completed with error (method=0x2 attr=0x1B trans_id=0x14001694a5) -- dropping May 23 08:29:22 408699 [45007960] - umad_receiver: ERR 5411: DR SMP Hop Ptr: 0x0 May 23 08:29:22 408711 [45007960] - Received SMP on a 3 hop path: Initial path = 0,0,0,0 Return path = 0,0,0,0 May 23 08:29:22 408721 [45007960] - __osm_sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT) May 23 08:29:22 408729 [45007960] - __osm_sm_mad_ctrl_send_err_cb: ERR 3119: Set method failed May 23 08:29:22 408759 [45007960] - SMP dump: base_ver0x1 mgmt_class..0x81 class_ver...0x1 method..0x2 (SubnSet) D bit...0x0 status..0x0 hop_ptr.0x0 hop_count...0x3 trans_id0x1694a5 attr_id.0x1B (MulticastForwardingTable) resv0x0 attr_mod0x1 m_key...0x dr_slid.0x dr_dlid.0x Initial path: 0,1,14,9 Return path: 0,0,0,0 Reserved: [0][0][0][0][0][0][0] 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 May 23 08:29:22 412432 [42803960] - Errors during initialization May 23 08:29:22 412508 [42803960] - __osm_state_mgr_init_errors_msg: ___ ewg mailing list ewg@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
Re: [ewg] Re: [Fwd: [ofa-general] ofa_1_4_kernel 20080612-0200 daily build status]
Hey Tziporet, When is ofed-1.4 going to go up to 2.6.26? Steve.
Re: [ewg] Re: [Fwd: [ofa-general] ofa_1_4_kernel 20080612-0200 daily build status]
By the way, I'm adding some new kernel_addons backports to support the network namespace stuff that affects things like ip_dev_find(), ip_route_output_*(), and friends. This will remove several patches from kernel_patches/backports and simplify each backport from the perspective of the rdma drivers, core and ulps. I'm doing these changes for all as I hit them. Is this ok with folks? Steve.
Re: [ewg] Re: [Fwd: [ofa-general] ofa_1_4_kernel 20080612-0200 daily build status]
Steve Wise wrote:
Hey Tziporet, When is ofed-1.4 going to go up to 2.6.26? Steve.

Never mind. I see it is at 2.6.26-rc1 at least. But when I do a git tag -l it doesn't show the 2.6.26 tag?

Steve.
[ewg] [GIT PULL ofed-1.4 ofed_kernel] - cxgb3 backports for ofed-1.4
Vlad,

Please pull from:

git://git.openfabrics.org/~swise/ofed-1.4.git ofed_kernel

There are 20 commits that add the backports for cxgb3, one for each kernel. They also create new include files in kernel_addons/backports to provide backports for net_namespace and a few other needed functions. This results in removing 4 kernel_patches/backports/* files for the core, iw_cxgb3, and rds modules. Namely, these patches are no longer needed and are removed:

cma_to_2_6_24.patch
addr_to_2_6_24.patch
iwch_cm_to_2_6_24.patch
rds_to_2_6_24.patch

New kernel_addons/backports/* changes:

- add net_namespace.h so modules that include it won't break
- add backport macro to remove the net namespace parameter from:
  - ip_dev_find()
  - ip_route_output_flow() and ip_route_output_key()
  - dev_get_by_name()
- add is_vmalloc_addr() backport function
- add vlan_dev_info() backport function

I built these against a handful of the backport kernels. I didn't do the cross-platform builds, so please lemme know if anything still breaks in the cxgb3 code.

Thanks, Steve.
Re: [ewg] Can you work with Vlad to integrate NFSoRDMA to OFED 1.4
Tziporet Koren wrote:
We already have a kernel tree that you can work with. I understand you had something for OFED 1.3, and most important is to get it now for OFED 1.4.
Thanks, Tziporet

All,

Since the NFSRDMA client and server are in 2.6.26, I suspect the ofed-1.4 NFSRDMA effort need only include:

- any latest and greatest patches from Tucker/Talpey
- user mode cmds/libs needed to do mounts
- NFSRDMA fastreg usage for client and server

Steve.
Re: [ewg] Can you work with Vlad to integrate NFSoRDMA to OFED 1.4
Tziporet Koren wrote:
Steve Wise wrote:
All, Since NFSRDMA client and server are in 2.6.26, I suspect ofed-1.4 NFSRDMA effort need only include:
- any latest and greatest patches from Tucker/Talpey
- user mode cmds/libs needed to do mounts.
- NFSRDMA fastreg usage for client and server.
And
- backports to SLES10 (+ SP1 SP2), RHEL5 (+ up1 and up2)
Tziporet

How could I forget! :)
Re: [ewg] root fs full on hosting server
2GB is enough ram?

Tziporet Koren wrote:
Johann George wrote:
For $239/month:
* quad 2.13 GHz Xeon
* 2 GB RAM
* hardware RAID1
* 500 GB disk (2 250 GB SATA II drives)
* 250 GB/month bandwidth
* 10 GB of backup space

My current thought is that the $239/month package would meet our needs. We will need to upgrade it to at least 20GB of backup; or even more. Also, we have been running Ubuntu 6.06 Dapper Drake with LTS. The latest LTS version of Ubuntu recently came out, Hardy Heron 8.04, and it probably makes sense to start out with that. Note that we should also get a 10% discount on any of the above quoted prices.

Comments? As soon as we agree on the configuration, we can put it into place.

I agree - the 239$ is suitable for us
Lets go for it
Tziporet
[ewg] perfquery error
Hal,

perfquery barfs when an iwarp device is in the mix. I think it needs to skip over devices that are not IB.

[EMAIL PROTECTED] ~]# perfquery
ibpanic: [5790] madrpc_init: can't open UMAD port ((null):0): (No such file or directory)
[EMAIL PROTECTED] ~]#
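Until the tool itself skips non-IB devices, the IB devices can be picked out by hand from sysfs. The sketch below is not a fix for perfquery. It assumes each device directory under /sys/class/infiniband has a node_type attribute whose text contains RNIC for iWARP adapters; the function name list_ib_devices and the optional directory argument are hypothetical.

```shell
# Hypothetical helper: print the names of RDMA devices that are not iWARP
# RNICs, based on the sysfs node_type attribute.
list_ib_devices() {
    class_dir="${1:-/sys/class/infiniband}"   # argument added for testing
    for dev in "$class_dir"/*; do
        [ -r "$dev/node_type" ] || continue
        case "$(cat "$dev/node_type")" in
            *RNIC*) ;;                  # iWARP adapter: skip it
            *) basename "$dev" ;;       # anything else is treated as IB
        esac
    done
    return 0
}
```

A device name printed by list_ib_devices could then be passed to the diagnostic tools explicitly rather than relying on their default device selection.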