[openib-general] ofa_1_2_kernel 20070227-0200 daily build status
This email was generated automatically, please do not reply

Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod --with-addr_trans-mod --with-cxgb3-mod

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on powerpc with linux-2.6.19
Passed on powerpc with linux-2.6.17
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.18
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.16
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.17
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.5-7.244-smp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:
Build failed on x86_64 with linux-2.6.9-22.ELsmp
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:167: error: 'ADVERTISE_PAUSE_CAP' undeclared (first use in this function)
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:167: error: (Each undeclared identifier is reported only once
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:167: error: for each function it appears in.)
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:170: error: 'ADVERTISE_PAUSE_ASYM' undeclared (first use in this function)
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-22.ELsmp'
make: *** [kernel] Error 2
--
Build failed on x86_64 with linux-2.6.9-34.ELsmp
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c: In function 'add_adapter':
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c:1061: error: 'adapter_list_lock' undeclared (first use in this function)
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c: In function 'remove_adapter':
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c:1068: error: 'adapter_list_lock' undeclared (first use in this function)
make[3]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.o] Error 1
make[2]: *** [/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3] Error 2
make[1]: *** [_module_/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check] Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-34.ELsmp'
make: *** [kernel] Error 2
--
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Fwd: Address List Change Now Scheduled for Wednesday, 2/28/2007
On Feb 27, 2007, at 2:10 AM, Diego Guella wrote:

> Should I do something to get subscribed to the new mailing list or I will be automatically subscribed?

There is nothing that you need to do; the list is simply being migrated from one server to another and changing names in the process.

> The only change is that I have to write messages to [EMAIL PROTECTED], correct?

Correct. There will be aliases in place to redirect messages from the old name to the new name, too. So the warning is more about updating e-mail client filters, etc.

> - Original Message -
> From: Jeff Squyres [EMAIL PROTECTED]
> To: OpenFabrics General openib-general@openib.org
> Sent: Monday, February 26, 2007 6:05 PM
> Subject: [openib-general] Fwd: Address List Change Now Scheduled for Wednesday, 2/28/2007
>
>> FYI. In case you missed it the Nth time: THIS LIST IS CHANGING ON WEDNESDAY 2/28/2007 (2 days from now). Really. For sure this time. Trust me. Honest. Please update your addressbooks!
>>
>> Begin forwarded message:
>>
>>> From: Lee, Michael Paichi [EMAIL PROTECTED]
>>> Date: February 22, 2007 11:44:25 AM EST
>>> To: Jeff Squyres [EMAIL PROTECTED], Michael S. Tsirkin [EMAIL PROTECTED]
>>> Cc: OpenFabrics General openib-general@openib.org
>>> Subject: Address List Change Now Scheduled for Wednesday, 2/28/2007
>>>
>>> The list will now be migrated on Wednesday, 2/28/2007.
>>> List address: [EMAIL PROTECTED]
>>> Updated change-date: Wednesday, 2/28/2007
>>>
>>> Michael

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
[openib-general] ib0 shows MAC address as 00-00-00.... is it normal??
Hi All,

We have built and installed OFED-1.1 on a RHEL-4 machine. Using IPoIB we set the IPs for the interfaces and are able to ping each other, but ifconfig shows the ib0 MAC address as 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00, as shown below:

--
ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:271465 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1444336 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:15664386 (14.9 MiB)  TX bytes:2718736764 (2.5 GiB)
---

Please let me know whether this is normal, and whether there is any way to get the real hw/MAC address.

Regards,
Bala.
[openib-general] mpi over IB
Hi All,

We have built and installed OFED-1.1 on RHEL-4 machines and selected MPI support while compiling. Please shed some light on how to use MPI over the IB interface: which modules are needed, or do we need to install separate MPI software?

Thanks in advance,
-bala-
Re: [openib-general] mpi over IB
During the installation process, the OFED installer should have asked you if you wanted to install Open MPI and/or MVAPICH. Both of these MPI implementations are capable of communicating natively over the IB interface.

MPI applications run with Open MPI should natively choose the IB interface at run time if your IB network is up and running properly (e.g., try running ibv_devinfo to ensure that ports are listed in the PORT_ACTIVE state, etc.). I assume that the same is true with MVAPICH as well.

On Feb 27, 2007, at 6:35 AM, Bala wrote:

> Hi All,
>
> We have built and installed OFED-1.1 on RHEL-4 machines and selected MPI support while compiling. Please shed some light on how to use MPI over the IB interface: which modules are needed, or do we need to install separate MPI software?
>
> Thanks in advance,
> -bala-

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] ib0 shows MAC address as 00-00-00.... is it normal??
On Tue, 2007-02-27 at 06:30, Bala wrote:

> Hi All,
>
> We have built and installed OFED-1.1 on a RHEL-4 machine. Using IPoIB we set the IPs for the interfaces and are able to ping each other, but ifconfig shows the ib0 MAC address as 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00, as shown below:
>
> --
> ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
>           inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
>           RX packets:271465 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:1444336 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:128
>           RX bytes:15664386 (14.9 MiB)  TX bytes:2718736764 (2.5 GiB)
> ---
>
> Please let me know whether this is normal,

Depends on the (truncated) guid for the HCA port.

> and whether there is any way to get the real hw/MAC address.

ip addr show ib0

-- Hal

> Regards,
> Bala.
Re: [openib-general] [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB
Thanks for the comments

>> To fix it, this patch adds a dev field to struct ipoib_neigh which is used instead of the struct neighbour dev one.
>
> It seems that in this design, if multiple ipoib interfaces are present, we might get an skb such that skb->dev will be different from the new dev field in struct ipoib_neigh. It seems that the result will be that the packet will be sent on a wrong interface. Right?

I don't see how. The field dev in ipoib_neigh doesn't take part in interface selection. As I see it, the skb travels this path:
1. It is passed to bond_dev->hard_start_xmit.
2. bond_dev->hard_start_xmit chooses the current active interface, changes skb->dev and enqueues it back for transmission.

>> In addition, if an IPoIB device is removed before bonding is unloaded it may cause bond0 neighbours (neighbours that point to bond0) to exist after the IPoIB device no longer exists. This is why a neighbour cleanup is required during device cleanup. This cleanup scans the arp cache and the ndisc cache to find the neighbours of bond0 which also refer to the relevant ibX. Also, when the ib_ipoib module is unloaded, the neighbour destructor must be set to NULL because the neighbour function is in ib_ipoib. For this neigh table cleanup, it is required to export the symbol nd_tbl just like the symbol arp_tbl is.
>
> I wonder about this: is it really true that any allocated neighbour is always in either arp_tbl or nd_tbl? For example, could some code have called neigh_hold and retained a neighbour that is not in either one of these tables?

I got the assumption about neighbours living in one of these 2 tables from observation and code reading. I preferred that over keeping track of all ipoib_neighs and putting them in a list. However, I could do that instead of neigh_table scanning. Do you think it's better? As for the example, I didn't understand it. Could you please explain?

>> During my tests I found that when running
>> 1. modprobe -r ib_mthca (to delete IPoIB interfaces)
>> 2. ping somewhere on the subnet of bond0
>> I get this stack dump (which ends with kernel death)
>>
>> [8037ff32] skb_under_panic+0x5c/0x60
>> [882e00c2] :ib_ipoib:ipoib_hard_header+0xa6/0xc0
>> [803c3c98] arp_create+0x120/0x226
>> [803c3dc3] arp_send+0x25/0x3b
>> [803c466a] arp_solicit+0x186/0x195
>> [8038c0ac] neigh_timer_handler+0x2b5/0x309
>> [8038bdf7] neigh_timer_handler+0x0/0x309
>> [80239599] run_timer_softirq+0x130/0x19e
>> [80235fcc] __do_softirq+0x55/0xc3
>> [8020acac] call_softirq+0x1c/0x28
>> [8020c02b] do_softirq+0x2c/0x7d
>> [8021864a] smp_apic_timer_interrupt+0x57/0x6a
>> [80208e19] mwait_idle+0x0/0x45
>> [8020a756] apic_timer_interrupt+0x66/0x70
>> EOI
>> [80208e5b] mwait_idle+0x42/0x45
>> [80208db1] cpu_idle+0x8b/0xae
>> [80217d60] start_secondary+0x47f/0x48f
>>
>> The only way I found to avoid this (for now) is to check skb headroom in ipoib_hard_header. I guess that this safety check doesn't harm regular IPoIB operation and it seems to solve my problem. However, I would be happy to hear what others think of this last issue.
>
> As I said, this seems to indicate a problem in the bonding code. But what will happen after you error out in ipoib_hard_header? Is the packet dropped? What might break as a result?

I will check the hard_header_len issue in the bonding code more carefully. At first look it seems that bonding does borrow the hard_header_len. Also, my checks show that it is safe to return with an error from hard_header(). For example, in neigh_connected_output:

	err = dev->hard_header(skb, dev, ntohs(skb->protocol),
			       neigh->ha, NULL, skb->len);
	read_unlock_bh(&neigh->lock);
	if (err >= 0)
		err = neigh->ops->queue_xmit(skb);
	else {
		err = -EINVAL;
		kfree_skb(skb);
	}

I would really appreciate comments.

thanks
-MoniS
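To make the last point concrete, here is a minimal sketch in plain C of a headroom check in a hard_header-style routine, together with a caller that drops the packet on error in the spirit of the neigh_connected_output() fragment quoted above. Everything prefixed toy_ or TOY_ is a hypothetical stand-in and not kernel API, and the 4-byte header length is an assumption chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct toy_skb {
	unsigned char *head;	/* start of the allocated buffer */
	unsigned char *data;	/* start of the current payload */
	size_t len;		/* payload length in bytes */
};

#define TOY_HARD_HEADER_LEN 4	/* assumed header size, illustration only */

/* Bytes available in front of the payload. */
static size_t toy_headroom(const struct toy_skb *skb)
{
	return (size_t)(skb->data - skb->head);
}

/*
 * Prepend a link-layer header. Returns the header length on success,
 * or a negative value when the headroom is too small, instead of
 * moving the data pointer before the start of the buffer.
 */
static int toy_hard_header(struct toy_skb *skb)
{
	if (toy_headroom(skb) < TOY_HARD_HEADER_LEN)
		return -1;
	skb->data -= TOY_HARD_HEADER_LEN;
	skb->len += TOY_HARD_HEADER_LEN;
	return TOY_HARD_HEADER_LEN;
}

/*
 * Caller modelled on the quoted fragment: transmit on success, drop
 * on error. Returns 1 if the packet would be sent, 0 if dropped.
 */
static int toy_output(struct toy_skb *skb)
{
	int err = toy_hard_header(skb);

	if (err >= 0)
		return 1;	/* the real code would call queue_xmit() */
	return 0;		/* the real code would call kfree_skb() */
}
```

With 16 bytes of headroom toy_output() reports the packet as sent and the data pointer moves back by 4 bytes; with only 2 bytes of headroom the packet is dropped and the buffer is left untouched, which is the behaviour the check is meant to guarantee.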
[openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
Hello,

I did a short code review of the ipoib code, concentrating on partitioning support, and I noticed that the asynchronous event handler in the ipoib code does not take the port number reported in the event record into consideration. The effect is that all of the ib# devices related to that specific HCA are flushed, when it seems to me that only the one for the relevant port should be. Is that done on purpose, or am I missing something?

Thanks,
Moni

p.s. I'm working on a patch that should solve another issue caused by ipoib's handling of PKEY reordering, and the above issue further complicates things for me.
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
Quoting Moni Levy [EMAIL PROTECTED]:
Subject: [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

> Hello,
>
> I did a short code review of the ipoib code, concentrating on partitioning support, and I noticed that the asynchronous event handler in the ipoib code does not take the port number reported in the event record into consideration. The effect is that all of the ib# devices related to that specific HCA are flushed, when it seems to me that only the one for the relevant port should be. Is that done on purpose, or am I missing something?
>
> Thanks,
> Moni
>
> p.s. I'm working on a patch that should solve another issue caused by ipoib's handling of PKEY reordering, and the above issue further complicates things for me.

If true, why is this a problem?

--
MST
Re: [openib-general] HOWTO check ofa_kernel build from your git tree
Where are all the kernel src trees on ssh.openfabrics.org? I would like to build against the specific trees that are failing with cxgb3...

Also: which RH distro ships linux-2.6.9-22.ELsmp and linux-2.6.9-34.ELsmp?

Thanks,
Steve.

On Mon, 2007-02-26 at 17:07 +0200, Vladimir Sokolovsky wrote:

> On ssh.openfabrics.org:
> Run
> env git_url=/home/mst/scm/ofed_1_2_devel.git git_branch=ofed_1_2 \
>     CHECK_LOCAL=yes \
>     CHECK_KERNEL_ORG=yes \
>     CHECK_CROSS=yes /home/vlad/scripts/build_ofa_kernel.sh
[openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
This issue was found during partitioning SM fail over testing. The fix was tested over the weekend with pkey reshuffling, removal and addition every few seconds concurrent with OFED restart. The patch applies on Roland's git tree.

Changes from v1:
* added flush flag to ipoib_ib_dev_stop(), ipoib_ib_dev_down() alike
* fixed a bug in device extraction from the work struct
* removed some warnings in case they are caused due to missing PKEY as this seems like a valid flow now.

SM reconfiguration or failover possibly causes a shuffling of the values in the port pkey table. The current implementation only queries for the index of the pkey once, when it creates the device QP and after that moves it into working state, and hence does not address this scenario. Fix this by using the PKEY_CHANGE event as a trigger to reconfigure the device QP.

Signed-off-by: Moni Levy [EMAIL PROTECTED]
---
 ipoib.h           |  4 +++-
 ipoib_ib.c        | 51 +--
 ipoib_main.c      |  5 +++--
 ipoib_multicast.c | 11 ++-
 ipoib_verbs.c     |  8 +++-
 5 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 2594db2..d08ecca 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -205,6 +205,7 @@ struct ipoib_dev_priv {
 	struct delayed_work pkey_task;
 	struct delayed_work mcast_task;
 	struct work_struct flush_task;
+	struct work_struct flush_restart_qp_task;
 	struct work_struct restart_task;
 	struct delayed_work ah_reap_task;
@@ -334,12 +335,13 @@ struct ipoib_dev_priv *ipoib_intf_alloc(
 int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
 void ipoib_ib_dev_flush(struct work_struct *work);
+void ipoib_ib_dev_flush_restart_qp(struct work_struct *work);
 void ipoib_ib_dev_cleanup(struct net_device *dev);
 int ipoib_ib_dev_open(struct net_device *dev);
 int ipoib_ib_dev_up(struct net_device *dev);
 int ipoib_ib_dev_down(struct net_device *dev, int flush);
-int ipoib_ib_dev_stop(struct net_device *dev);
+int ipoib_ib_dev_stop(struct net_device *dev, int flush);
 int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
 void ipoib_dev_cleanup(struct net_device *dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index f2aa923..b0287c1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -415,21 +415,22 @@ int ipoib_ib_dev_open(struct net_device
 	ret = ipoib_init_qp(dev);
 	if (ret) {
-		ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
+		if (ret != -ENOENT)
+			ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
 		return -1;
 	}
 	ret = ipoib_ib_post_receives(dev);
 	if (ret) {
 		ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret);
-		ipoib_ib_dev_stop(dev);
+		ipoib_ib_dev_stop(dev, 1);
 		return -1;
 	}
 	ret = ipoib_cm_dev_open(dev);
 	if (ret) {
 		ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret);
-		ipoib_ib_dev_stop(dev);
+		ipoib_ib_dev_stop(dev, 1);
 		return -1;
 	}
@@ -508,7 +509,7 @@ static int recvs_pending(struct net_devi
 	return pending;
 }
-int ipoib_ib_dev_stop(struct net_device *dev)
+int ipoib_ib_dev_stop(struct net_device *dev, int flush)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ib_qp_attr qp_attr;
@@ -581,7 +582,8 @@ timeout:
 	/* Wait for all AHs to be reaped */
 	set_bit(IPOIB_STOP_REAPER, &priv->flags);
 	cancel_delayed_work(&priv->ah_reap_task);
-	flush_workqueue(ipoib_workqueue);
+	if (flush)
+		flush_workqueue(ipoib_workqueue);
 	begin = jiffies;
@@ -622,13 +624,17 @@ int ipoib_ib_dev_init(struct net_device
 	return 0;
 }
-void ipoib_ib_dev_flush(struct work_struct *work)
+static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv, int restart_qp)
 {
-	struct ipoib_dev_priv *cpriv, *priv =
-		container_of(work, struct ipoib_dev_priv, flush_task);
+	struct ipoib_dev_priv *cpriv;
 	struct net_device *dev = priv->dev;
-	if (!test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags) ) {
+	/*
+	 * ipoib_ib_dev_stop() below may not find the PKey and leave the
+	 * IPOIB_FLAG_INITIALIZED flag off so flush in that case with restart_qp
+	 * flag on is Ok.
+	 */
+	if (!test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags) && !restart_qp) {
 		ipoib_dbg(priv, "Not flushing - IPOIB_FLAG_INITIALIZED not set.\n");
 		return;
 	}
@@ -641,6 +647,13 @@ void
Re: [openib-general] [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB
Quoting Moni Shoua [EMAIL PROTECTED]:
Subject: Re: [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB

> Thanks for the comments
>
>>> To fix it, this patch adds a dev field to struct ipoib_neigh which is used instead of the struct neighbour dev one.
>>
>> It seems that in this design, if multiple ipoib interfaces are present, we might get an skb such that skb->dev will be different from the new dev field in struct ipoib_neigh. It seems that the result will be that the packet will be sent on a wrong interface. Right?
>
> I don't see how. The field dev in ipoib_neigh doesn't take part in interface selection. As I see it, the skb travels this path:
> 1. It is passed to bond_dev->hard_start_xmit.
> 2. bond_dev->hard_start_xmit chooses the current active interface, changes skb->dev and enqueues it back for transmission.

The ipoib_neigh ah field includes struct ib_ah *. This selects important parameters which depend on both the packet source and destination interfaces. I think the right thing might be to compare the ipoib_neigh dev and skb->dev, and destroy the ipoib_neigh if these do not match.

>>> In addition, if an IPoIB device is removed before bonding is unloaded it may cause bond0 neighbours (neighbours that point to bond0) to exist after the IPoIB device no longer exists. This is why a neighbour cleanup is required during device cleanup. This cleanup scans the arp cache and the ndisc cache to find the neighbours of bond0 which also refer to the relevant ibX. Also, when the ib_ipoib module is unloaded, the neighbour destructor must be set to NULL because the neighbour function is in ib_ipoib. For this neigh table cleanup, it is required to export the symbol nd_tbl just like the symbol arp_tbl is.
>>
>> I wonder about this: is it really true that any allocated neighbour is always in either arp_tbl or nd_tbl? For example, could some code have called neigh_hold and retained a neighbour that is not in either one of these tables?
>
> I got the assumption about neighbours living in one of these 2 tables from observation and code reading. I preferred that over keeping track of all ipoib_neighs and putting them in a list. However, I could do that instead of neigh_table scanning. Do you think it's better?

If some neighbours are not on any tables, it seems using our own lists (e.g. lists we have in ipoib_path) is the only option, no?

> As for the example, I didn't understand it. Could you please explain?

grep for neigh_hold. A neighbour is only destroyed when its ref count goes to 0. If some code does neigh_hold, it seems the neighbour could be removed from the table while the destructor has not yet been called.

>>> During my tests I found that when running
>>> 1. modprobe -r ib_mthca (to delete IPoIB interfaces)
>>> 2. ping somewhere on the subnet of bond0
>>> I get this stack dump (which ends with kernel death)
>>>
>>> [8037ff32] skb_under_panic+0x5c/0x60
>>> [882e00c2] :ib_ipoib:ipoib_hard_header+0xa6/0xc0
>>> [803c3c98] arp_create+0x120/0x226
>>> [803c3dc3] arp_send+0x25/0x3b
>>> [803c466a] arp_solicit+0x186/0x195
>>> [8038c0ac] neigh_timer_handler+0x2b5/0x309
>>> [8038bdf7] neigh_timer_handler+0x0/0x309
>>> [80239599] run_timer_softirq+0x130/0x19e
>>> [80235fcc] __do_softirq+0x55/0xc3
>>> [8020acac] call_softirq+0x1c/0x28
>>> [8020c02b] do_softirq+0x2c/0x7d
>>> [8021864a] smp_apic_timer_interrupt+0x57/0x6a
>>> [80208e19] mwait_idle+0x0/0x45
>>> [8020a756] apic_timer_interrupt+0x66/0x70
>>> EOI
>>> [80208e5b] mwait_idle+0x42/0x45
>>> [80208db1] cpu_idle+0x8b/0xae
>>> [80217d60] start_secondary+0x47f/0x48f
>>>
>>> The only way I found to avoid this (for now) is to check skb headroom in ipoib_hard_header. I guess that this safety check doesn't harm regular IPoIB operation and it seems to solve my problem. However, I would be happy to hear what others think of this last issue.
>>
>> As I said, this seems to indicate a problem in the bonding code. But what will happen after you error out in ipoib_hard_header? Is the packet dropped? What might break as a result?
>
> I will check the hard_header_len issue in the bonding code more carefully. At first look it seems that bonding does borrow the hard_header_len.

So where does a shorter message come from?

> Also, my checks show that it is safe to return with an error from hard_header(). For example, in neigh_connected_output:
>
> 	err = dev->hard_header(skb, dev, ntohs(skb->protocol),
> 			       neigh->ha, NULL, skb->len);
> 	read_unlock_bh(&neigh->lock);
> 	if (err >= 0)
> 		err = neigh->ops->queue_xmit(skb);
> 	else {
> 		err = -EINVAL;
> 		kfree_skb(skb);
> 	}
>
> I would really appreciate comments.
>
> thanks
> -MoniS

--
MST
Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
I just gave this a cursory glance. A suggestion: would it not be much simpler to modify the QP from RTS to RTS on pkey change?

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> index f2aa923..b0287c1 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -415,21 +415,22 @@ int ipoib_ib_dev_open(struct net_device
>  	ret = ipoib_init_qp(dev);
>  	if (ret) {
> -		ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
> +		if (ret != -ENOENT)
> +			ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
>  		return -1;
>  	}

What's the reason for this?

> @@ -993,6 +993,7 @@ static void ipoib_setup(struct net_devic
>  	INIT_DELAYED_WORK(&priv->pkey_task, ipoib_pkey_poll);
>  	INIT_DELAYED_WORK(&priv->mcast_task, ipoib_mcast_join_task);
>  	INIT_WORK(&priv->flush_task, ipoib_ib_dev_flush);
> +	INIT_WORK(&priv->flush_restart_qp_task, ipoib_ib_dev_flush_restart_qp);
>  	INIT_WORK(&priv->restart_task, ipoib_mcast_restart_task);
>  	INIT_DELAYED_WORK(&priv->ah_reap_task, ipoib_reap_ah);
>  }

Shorter name?

> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index b303ce6..27d6fd4 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -232,9 +232,10 @@ static int ipoib_mcast_join_finish(struc
>  	ret = ipoib_mcast_attach(dev, be16_to_cpu(mcast->mcmember.mlid),
>  				 &mcast->mcmember.mgid);
>  	if (ret < 0) {
> -		ipoib_warn(priv, "couldn't attach QP to multicast group "
> -			   IPOIB_GID_FMT "\n",
> -			   IPOIB_GID_ARG(mcast->mcmember.mgid));
> +		if (ret != -ENXIO) /* No PKEY found */
> +			ipoib_warn(priv, "couldn't attach QP to multicast group "
> +				   IPOIB_GID_FMT "\n",
> +				   IPOIB_GID_ARG(mcast->mcmember.mgid));
>  		clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags);
>  		return ret;
> @@ -312,7 +313,7 @@ ipoib_mcast_sendonly_join_complete(int s
>  		status = ipoib_mcast_join_finish(mcast, &multicast->rec);
>  	if (status) {
> -		if (mcast->logcount++ < 20)
> +		if (mcast->logcount++ < 20 && status != -ENXIO)
>  			ipoib_dbg_mcast(netdev_priv(dev), "multicast join failed for "
>  					IPOIB_GID_FMT ", status %d\n",
>  					IPOIB_GID_ARG(mcast->mcmember.mgid), status);
> @@ -416,7 +417,7 @@ static int ipoib_mcast_join_complete(int
>  				", status %d\n",
>  				IPOIB_GID_ARG(mcast->mcmember.mgid),
>  				status);
> -		} else {
> +		} else if (status != -ENXIO) {
>  			ipoib_warn(priv, "multicast join failed for "
>  				   IPOIB_GID_FMT ", status %d\n",
>  				   IPOIB_GID_ARG(mcast->mcmember.mgid),
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
> index 3cb551b..d0384ea 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
> @@ -52,8 +52,10 @@ int ipoib_mcast_attach(struct net_device
>  	if (ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index)) {
>  		clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
>  		ret = -ENXIO;
> +		ipoib_dbg(priv, "PKEY %X not found\n", priv->pkey);
>  		goto out;
>  	}
> +	ipoib_dbg(priv, "PKEY %X found at index %d\n", priv->pkey, pkey_index);
>  	set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
>  	/* set correct QKey for QP */

Make it PKey or pkey: no text in uppercase in log messages please.

> @@ -105,9 +107,11 @@ int ipoib_init_qp(struct net_device *dev
>  	 */
>  	ret = ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index);
>  	if (ret) {
> +		ipoib_dbg(priv, "PKEY %X not found.\n", priv->pkey);
>  		clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
>  		return ret;
>  	}
> +	ipoib_dbg(priv, "PKEY %X found at index %d.\n", priv->pkey, pkey_index);
>  	set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
>  	qp_attr.qp_state = IB_QPS_INIT;

going a bit overboard on the number of debug messages here.

> @@ -260,12 +264,14 @@ void ipoib_event(struct ib_event_handler
>  		container_of(handler, struct ipoib_dev_priv, event_handler);
>  	if (record->event == IB_EVENT_PORT_ERR ||
> -	    record->event == IB_EVENT_PKEY_CHANGE ||
>  	    record->event == IB_EVENT_PORT_ACTIVE ||
>  	    record->event == IB_EVENT_LID_CHANGE ||
Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
> I just gave this a cursory glance.

I haven't really read it except to think "why is this so complicated?"

> A suggestion: would it not be much simpler to modify the QP from RTS to RTS on pkey change?

Changing the P_Key index is not allowed for RTS->RTS. You would have to modify the QP RTS->SQD, wait for the SQ to drain, then modify the P_Key index with SQD->SQD, and finally go SQD->RTS.

 - R.
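The sequence described in this reply can be sketched as a small state machine in plain C. This is a toy model under stated assumptions, not the verbs API: every toy_ name is hypothetical, only the RTS and SQD states are modelled, and the rule that the send queue must be empty before the index changes is this sketch's reading of "wait for the SQ to drain".

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not the verbs API. */
enum toy_qp_state { TOY_QPS_RTS, TOY_QPS_SQD };

struct toy_qp {
	enum toy_qp_state state;
	int sq_outstanding;	/* send work requests not yet completed */
	int pkey_index;
};

/*
 * Apply one state transition. Pass new_pkey_index < 0 to leave the
 * index unchanged. Returns false if the request is rejected.
 */
static bool toy_modify_qp(struct toy_qp *qp, enum toy_qp_state next,
			  int new_pkey_index)
{
	if (new_pkey_index >= 0) {
		/* Only SQD->SQD may change the index in this model. */
		if (qp->state != TOY_QPS_SQD || next != TOY_QPS_SQD)
			return false;
		/* Assumed: the send queue must have drained first. */
		if (qp->sq_outstanding != 0)
			return false;
		qp->pkey_index = new_pkey_index;
	}
	qp->state = next;
	return true;
}

/* Stand-in for waiting until the send queue has drained. */
static void toy_wait_for_sq_drain(struct toy_qp *qp)
{
	qp->sq_outstanding = 0;
}

/* RTS->SQD, wait for drain, SQD->SQD with the new index, SQD->RTS. */
static bool toy_change_pkey_index(struct toy_qp *qp, int new_index)
{
	if (!toy_modify_qp(qp, TOY_QPS_SQD, -1))
		return false;
	toy_wait_for_sq_drain(qp);
	if (!toy_modify_qp(qp, TOY_QPS_SQD, new_index))
		return false;
	return toy_modify_qp(qp, TOY_QPS_RTS, -1);
}
```

In this model a direct toy_modify_qp(&qp, TOY_QPS_RTS, 5) on a QP in RTS is rejected, while toy_change_pkey_index(&qp, 5) succeeds and leaves the QP back in RTS with the new index.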
Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
Quoting Roland Dreier [EMAIL PROTECTED]:
Subject: Re: [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

>> I just gave this a cursory glance.
>
> I haven't really read it except to think "why is this so complicated?"
>
>> A suggestion: would it not be much simpler to modify the QP from RTS to RTS on pkey change?
>
> Changing the P_Key index is not allowed for RTS->RTS. You would have to modify the QP RTS->SQD, wait for the SQ to drain, then modify the P_Key index with SQD->SQD, and finally go SQD->RTS.

True, I misread the spec.

--
MST
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
I did a short code review of the ipoib code concentrating on partitioning support and I mentioned that the asynchronous events handler in the ipoib code does not take the port number reported in the event record into consideration. The effect of that is that all of the ib# devices related to that specific HCA are flushed when it seems to me that only the relevant port one should be. Is that done on purpose, or am I missing something ? I don't think there's any particular reason the code is that way except for the oversight never being corrected. But it looks trivial to fix, like the patch below. Does that look right to you? p.s. I'm working on a patch that should solve another issue caused by PKEY reordering ipoib behavior and the above issue further complicates things for me. Why not fix the issue first then? commit a27cbe878203076247c1b5287f5ab59ed143b560 Author: Roland Dreier [EMAIL PROTECTED] Date: Tue Feb 27 07:37:49 2007 -0800 IPoIB: Only handle async events for one port An asynchronous event carries the port number that the event occurred on, so there's no reason for an IPoIB interface to process an event associated with a different local HCA port. 
Signed-off-by: Roland Dreier [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 3cb551b..7f3ec20 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
 	struct ipoib_dev_priv *priv =
 		container_of(handler, struct ipoib_dev_priv, event_handler);

-	if (record->event == IB_EVENT_PORT_ERR    ||
-	    record->event == IB_EVENT_PKEY_CHANGE ||
-	    record->event == IB_EVENT_PORT_ACTIVE ||
-	    record->event == IB_EVENT_LID_CHANGE  ||
-	    record->event == IB_EVENT_SM_CHANGE   ||
-	    record->event == IB_EVENT_CLIENT_REREGISTER) {
+	if ((record->event == IB_EVENT_PORT_ERR    ||
+	     record->event == IB_EVENT_PKEY_CHANGE ||
+	     record->event == IB_EVENT_PORT_ACTIVE ||
+	     record->event == IB_EVENT_LID_CHANGE  ||
+	     record->event == IB_EVENT_SM_CHANGE   ||
+	     record->event == IB_EVENT_CLIENT_REREGISTER) &&
+	    record->element.port_num == priv->port) {
 		ipoib_dbg(priv, "Port state change event\n");
 		queue_work(ipoib_workqueue, &priv->flush_task);
 	}
Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:

I just gave this a cursory glance. I haven't really read it except to think why is this so complicated?

Do you refer to the complication of the patch or of the issue?

A suggestion: would it not be much simpler to modify the QP from RTS to RTS on pkey change?

Changing the P_Key index is not allowed for RTS->RTS. You would have to modify the QP RTS->SQD, wait for the SQ to drain, then modify the P_Key index with SQD->SQD, and finally go SQD->RTS.

Do you think that solving it that way would be a significant simplification? We'll still have to reuse the handling for missed completions that is currently implemented in ipoib_ib_dev_stop, and we'll still have an additional work element.

-- Moni

- R.
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
Quoting Roland Dreier [EMAIL PROTECTED]: Subject: Re: [RFC] IB/ipoib: Asynchronous events delivered without port parameter. I did a short code review of the ipoib code concentrating on partitioning support and I mentioned that the asynchronous events handler in the ipoib code does not take the port number reported in the event record into consideration. The effect of that is that all of the ib# devices related to that specific HCA are flushed when it seems to me that only the relevant port one should be. Is that done on purpose, or am I missing something ? I don't think there's any particular reason the code is that way except for the oversight never being corrected. But it looks trivial to fix, like the patch below. Does that look right to you? p.s. I'm working on a patch that should solve another issue caused by PKEY reordering ipoib behavior and the above issue further complicates things for me. Why not fix the issue first then? commit a27cbe878203076247c1b5287f5ab59ed143b560 Author: Roland Dreier [EMAIL PROTECTED] Date: Tue Feb 27 07:37:49 2007 -0800 IPoIB: Only handle async events for one port An asynchronous event carries the port number that the event occurred on, so there's no reason for an IPoIB interface to process an event associated with a different local HCA port. 
Signed-off-by: Roland Dreier [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 3cb551b..7f3ec20 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
 	struct ipoib_dev_priv *priv =
 		container_of(handler, struct ipoib_dev_priv, event_handler);

-	if (record->event == IB_EVENT_PORT_ERR    ||
-	    record->event == IB_EVENT_PKEY_CHANGE ||
-	    record->event == IB_EVENT_PORT_ACTIVE ||
-	    record->event == IB_EVENT_LID_CHANGE  ||
-	    record->event == IB_EVENT_SM_CHANGE   ||
-	    record->event == IB_EVENT_CLIENT_REREGISTER) {
+	if ((record->event == IB_EVENT_PORT_ERR    ||
+	     record->event == IB_EVENT_PKEY_CHANGE ||
+	     record->event == IB_EVENT_PORT_ACTIVE ||
+	     record->event == IB_EVENT_LID_CHANGE  ||
+	     record->event == IB_EVENT_SM_CHANGE   ||
+	     record->event == IB_EVENT_CLIENT_REREGISTER) &&
+	    record->element.port_num == priv->port) {
 		ipoib_dbg(priv, "Port state change event\n");
 		queue_work(ipoib_workqueue, &priv->flush_task);
 	}

Looks good.

-- MST
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote: I did a short code review of the ipoib code concentrating on partitioning support and I mentioned that the asynchronous events handler in the ipoib code does not take the port number reported in the event record into consideration. The effect of that is that all of the ib# devices related to that specific HCA are flushed when it seems to me that only the relevant port one should be. Is that done on purpose, or am I missing something ? I don't think there's any particular reason the code is that way except for the oversight never being corrected. But it looks trivial to fix, like the patch below. Does that look right to you? p.s. I'm working on a patch that should solve another issue caused by PKEY reordering ipoib behavior and the above issue further complicates things for me. Why not fix the issue first then? commit a27cbe878203076247c1b5287f5ab59ed143b560 Author: Roland Dreier [EMAIL PROTECTED] Date: Tue Feb 27 07:37:49 2007 -0800 IPoIB: Only handle async events for one port An asynchronous event carries the port number that the event occurred on, so there's no reason for an IPoIB interface to process an event associated with a different local HCA port. 
Signed-off-by: Roland Dreier [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 3cb551b..7f3ec20 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
 	struct ipoib_dev_priv *priv =
 		container_of(handler, struct ipoib_dev_priv, event_handler);

-	if (record->event == IB_EVENT_PORT_ERR    ||
-	    record->event == IB_EVENT_PKEY_CHANGE ||
-	    record->event == IB_EVENT_PORT_ACTIVE ||
-	    record->event == IB_EVENT_LID_CHANGE  ||
-	    record->event == IB_EVENT_SM_CHANGE   ||
-	    record->event == IB_EVENT_CLIENT_REREGISTER) {
+	if ((record->event == IB_EVENT_PORT_ERR    ||
+	     record->event == IB_EVENT_PKEY_CHANGE ||
+	     record->event == IB_EVENT_PORT_ACTIVE ||
+	     record->event == IB_EVENT_LID_CHANGE  ||
+	     record->event == IB_EVENT_SM_CHANGE   ||
+	     record->event == IB_EVENT_CLIENT_REREGISTER) &&
+	    record->element.port_num == priv->port) {
 		ipoib_dbg(priv, "Port state change event\n");
 		queue_work(ipoib_workqueue, &priv->flush_task);
 	}

That's exactly what I intended to post.

--Moni
Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
I haven't really read it except to think why is this so complicated?

Do you refer to the complication of the patch or of the issue?

the patch.

Changing the P_Key index is not allowed for RTS->RTS. You would have to modify the QP RTS->SQD, wait for the SQ to drain, then modify the P_Key index with SQD->SQD, and finally go SQD->RTS.

Do you think that solving it that way would be a significant simplification? We'll still have to reuse the handling for missed completions that is currently implemented in ipoib_ib_dev_stop, and we'll still have an additional work element.

no, I don't think SQD is really useful in practice.
Re: [openib-general] HOWTO check ofa_kernel build from your git tree
On Tue, 2007-02-27 at 08:23 -0600, Steve Wise wrote:

Where are all the kernel src trees on ssh.openfabrics.org? I would like to build against specific trees that are failing with cxgb3...

/home/vlad/kernel.org/arch/kernel

Also: what RH distro ships:

linux-2.6.9-22.ELsmp

RHEL4.0U2

and linux-2.6.9-34.ELsmp

RHEL4.0U3

Thanks, Steve.

On Mon, 2007-02-26 at 17:07 +0200, Vladimir Sokolovsky wrote:

On ssh.openfabrics.org: Run

env git_url=/home/mst/scm/ofed_1_2_devel.git git_branch=ofed_1_2 \
    CHECK_LOCAL=yes \
    CHECK_KERNEL_ORG=yes \
    CHECK_CROSS=yes /home/vlad/scripts/build_ofa_kernel.sh
Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:

I haven't really read it except to think why is this so complicated?

Do you refer to the complication of the patch or of the issue?

the patch.

Please advise and I'll change it.

Changing the P_Key index is not allowed for RTS->RTS. You would have to modify the QP RTS->SQD, wait for the SQ to drain, then modify the P_Key index with SQD->SQD, and finally go SQD->RTS.

Do you think that solving it that way would be a significant simplification? We'll still have to reuse the handling for missed completions that is currently implemented in ipoib_ib_dev_stop, and we'll still have an additional work element.

no, I don't think SQD is really useful in practice.
[openib-general] [PATCH 0/6] ofed_1_2: cxgb3 bug fixes
Hey Vlad, These fixes need to be pulled into ofed_1_2 for the Chelsio Ethernet driver. You can pull them directly from my ofa git tree: git://staging.openfabrics.org/~swise/ofed_1_2 cxgb3_fixes Thanks, Steve.
[openib-general] [PATCH 1/6] sysfs attributes are now managed per port, no longer per adapter.
sysfs attributes are now managed per port, no longer per adapter.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/cxgb3_main.c | 21 -
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index dfa035a..638b0ab 100755
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -435,26 +435,24 @@ static int setup_sge_qsets(struct adapte
 }

 static ssize_t attr_show(struct class_device *cd, char *buf,
-			 ssize_t(*format) (struct adapter *, char *))
+			 ssize_t(*format) (struct net_device *, char *))
 {
 	ssize_t len;
-	struct adapter *adap = to_net_dev(cd)->priv;

 	/* Synchronize with ioctls that may shut down the device */
 	rtnl_lock();
-	len = (*format) (adap, buf);
+	len = (*format) (to_net_dev(cd), buf);
 	rtnl_unlock();
 	return len;
 }

 static ssize_t attr_store(struct class_device *cd, const char *buf, size_t len,
-			  ssize_t(*set) (struct adapter *, unsigned int),
+			  ssize_t(*set) (struct net_device *, unsigned int),
 			  unsigned int min_val, unsigned int max_val)
 {
 	char *endp;
 	ssize_t ret;
 	unsigned int val;
-	struct adapter *adap = to_net_dev(cd)->priv;

 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
@@ -464,7 +462,7 @@ static ssize_t attr_store(struct class_d
 		return -EINVAL;

 	rtnl_lock();
-	ret = (*set) (adap, val);
+	ret = (*set) (to_net_dev(cd), val);
 	if (!ret)
 		ret = len;
 	rtnl_unlock();
@@ -472,8 +470,9 @@ static ssize_t attr_store(struct class_d
 }

 #define CXGB3_SHOW(name, val_expr) \
-static ssize_t format_##name(struct adapter *adap, char *buf) \
+static ssize_t format_##name(struct net_device *dev, char *buf) \
 { \
+	struct adapter *adap = dev->priv; \
 	return sprintf(buf, "%u\n", val_expr); \
 } \
 static ssize_t show_##name(struct class_device *cd, char *buf) \
@@ -481,8 +480,10 @@ static ssize_t show_##name(struct class_
 	return attr_show(cd, buf, format_##name); \
 }

-static ssize_t set_nfilters(struct adapter *adap, unsigned int val)
+static ssize_t set_nfilters(struct net_device *dev, unsigned int val)
 {
+	struct adapter *adap = dev->priv;
+
 	if (adap->flags & FULL_INIT_DONE)
 		return -EBUSY;
 	if (val && adap->params.rev == 0)
@@ -499,8 +500,10 @@ static ssize_t store_nfilters(struct cla
 	return attr_store(cd, buf, len, set_nfilters, 0, ~0);
 }

-static ssize_t set_nservers(struct adapter *adap, unsigned int val)
+static ssize_t set_nservers(struct net_device *dev, unsigned int val)
 {
+	struct adapter *adap = dev->priv;
+
 	if (adap->flags & FULL_INIT_DONE)
 		return -EBUSY;
 	if (val > t3_mc5_size(&adap->mc5) - adap->params.mc5.nfilters)
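The attr_store() pattern in the patch above (parse the value, range-check it, then hand it to a per-attribute setter that now receives the device) can be tried out in user space. The following is a minimal sketch under stated assumptions: `struct device` and its `nfilters` field are hypothetical stand-ins for the net_device and adapter, `strtoul` replaces the kernel parser, and the `capable()` check and `rtnl_lock()` are left out.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-port device the setter receives. */
struct device {
	unsigned int nfilters;
};

/* Per-attribute setter: returns 0 on success or a negative errno. */
typedef long (*set_fn)(struct device *, unsigned int);

/*
 * Parse buf as an unsigned number, reject it if nothing was parsed or it
 * falls outside [min_val, max_val], then call the setter. On success the
 * number of consumed bytes (len) is returned, as a sysfs store would.
 */
static long attr_store(struct device *dev, const char *buf, size_t len,
		       set_fn set, unsigned int min_val, unsigned int max_val)
{
	char *endp;
	unsigned long val = strtoul(buf, &endp, 0);
	long ret;

	if (endp == buf || val < min_val || val > max_val)
		return -EINVAL;

	ret = set(dev, (unsigned int)val);
	return ret ? ret : (long)len;
}

/* Example setter; the real driver also checks adapter state first. */
static long set_nfilters(struct device *dev, unsigned int val)
{
	dev->nfilters = val;
	return 0;
}
```

Passing the device rather than the adapter to the setter is what lets each port have its own attributes; a setter that still needs adapter-wide state can look it up from the device, as the patch does with dev->priv.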
[openib-general] [PATCH 3/6] Update FW version to 3.2
Update FW version to 3.2

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/net/cxgb3/t3_hw.c |6 --
 drivers/net/cxgb3/version.h |2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
old mode 100755
new mode 100644
index 365a7f5..eaa7a2e
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -884,11 +884,13 @@ int t3_check_fw_version(struct adapter *
 	major = G_FW_VERSION_MAJOR(vers);
 	minor = G_FW_VERSION_MINOR(vers);

-	if (type == FW_VERSION_T3 && major == 3 && minor == 1)
+	if (type == FW_VERSION_T3 && major == FW_VERSION_MAJOR &&
+	    minor == FW_VERSION_MINOR)
 		return 0;

 	CH_ERR(adapter, "found wrong FW version(%u.%u), "
-	       "driver needs version 3.1\n", major, minor);
+	       "driver needs version %u.%u\n", major, minor,
+	       FW_VERSION_MAJOR, FW_VERSION_MINOR);
 	return -EINVAL;
 }

diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
old mode 100755
new mode 100644
index 2b67dd5..782a6cf
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -36,4 +36,6 @@ #define DRV_DESC "Chelsio T3 Network Dri
 #define DRV_NAME "cxgb3"
 /* Driver version */
 #define DRV_VERSION "1.0"
+#define FW_VERSION_MAJOR 3
+#define FW_VERSION_MINOR 2
 #endif /* __CHELSIO_VERSION_H */
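The point of the patch above is that the required firmware version lives in one pair of macros, so the comparison and the error message cannot drift apart. A stand-alone sketch of that idea follows; `check_fw_version` and its buffer-based reporting are hypothetical simplifications (the driver's check also tests the firmware type and logs through CH_ERR).

```c
#include <stdio.h>

#define FW_VERSION_MAJOR 3
#define FW_VERSION_MINOR 2

/*
 * Returns 0 when the firmware version matches the one the driver was
 * built for; otherwise formats a diagnostic into msg and returns -1.
 * Both the test and the message read the same two macros.
 */
static int check_fw_version(unsigned int major, unsigned int minor,
			    char *msg, size_t len)
{
	if (major == FW_VERSION_MAJOR && minor == FW_VERSION_MINOR)
		return 0;

	snprintf(msg, len,
		 "found wrong FW version(%u.%u), driver needs version %u.%u",
		 major, minor,
		 (unsigned int)FW_VERSION_MAJOR,
		 (unsigned int)FW_VERSION_MINOR);
	return -1;
}
```

With the macros in place, moving to a later firmware release only means editing version.h.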
[openib-general] [PATCH 4/6] Offload packets may be DMAed long after their SGE Tx descriptors are done
Offload packets may be DMAed long after their SGE Tx descriptors are done so they must remain mapped until they are freed rather than until their descriptors are freed. Unmap such packets through an skb destructor.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/sge.c | 63 ++-
 1 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
old mode 100755
new mode 100644
index 3f2cf8a..822a598
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -105,6 +105,15 @@ struct unmap_info {	/* packet unmapping
 };

 /*
+ * Holds unmapping information for Tx packets that need deferred unmapping.
+ * This structure lives at skb->head and must be allocated by callers.
+ */
+struct deferred_unmap_info {
+	struct pci_dev *pdev;
+	dma_addr_t addr[MAX_SKB_FRAGS + 1];
+};
+
+/*
  * Maps a number of flits to the number of Tx descriptors that can hold them.
  * The formula is
  *
@@ -252,10 +261,13 @@ static void free_tx_desc(struct adapter
 	struct pci_dev *pdev = adapter->pdev;
 	unsigned int cidx = q->cidx;

+	const int need_unmap = need_skb_unmap() &&
+			       q->cntxt_id >= FW_TUNNEL_SGEEC_START;
+
 	d = &q->sdesc[cidx];
 	while (n--) {
 		if (d->skb) {	/* an SGL is present */
-			if (need_skb_unmap())
+			if (need_unmap)
 				unmap_skb(d->skb, q, cidx, pdev);
 			if (d->skb->priority == cidx)
 				kfree_skb(d->skb);
@@ -1227,6 +1239,50 @@ int t3_mgmt_tx(struct adapter *adap, str
 }

 /**
+ *	deferred_unmap_destructor - unmap a packet when it is freed
+ *	@skb: the packet
+ *
+ *	This is the packet destructor used for Tx packets that need to remain
+ *	mapped until they are freed rather than until their Tx descriptors are
+ *	freed.
+ */
+static void deferred_unmap_destructor(struct sk_buff *skb)
+{
+	int i;
+	const dma_addr_t *p;
+	const struct skb_shared_info *si;
+	const struct deferred_unmap_info *dui;
+	const struct unmap_info *ui = (struct unmap_info *)skb->cb;
+
+	dui = (struct deferred_unmap_info *)skb->head;
+	p = dui->addr;
+
+	if (ui->len)
+		pci_unmap_single(dui->pdev, *p++, ui->len, PCI_DMA_TODEVICE);
+
+	si = skb_shinfo(skb);
+	for (i = 0; i < si->nr_frags; i++)
+		pci_unmap_page(dui->pdev, *p++, si->frags[i].size,
+			       PCI_DMA_TODEVICE);
+}
+
+static void setup_deferred_unmapping(struct sk_buff *skb, struct pci_dev *pdev,
+				     const struct sg_ent *sgl, int sgl_flits)
+{
+	dma_addr_t *p;
+	struct deferred_unmap_info *dui;
+
+	dui = (struct deferred_unmap_info *)skb->head;
+	dui->pdev = pdev;
+	for (p = dui->addr; sgl_flits >= 3; sgl++, sgl_flits -= 3) {
+		*p++ = be64_to_cpu(sgl->addr[0]);
+		*p++ = be64_to_cpu(sgl->addr[1]);
+	}
+	if (sgl_flits)
+		*p = be64_to_cpu(sgl->addr[0]);
+}
+
+/**
  *	write_ofld_wr - write an offload work request
  *	@adap: the adapter
  *	@skb: the packet to send
@@ -1262,8 +1318,11 @@ static void write_ofld_wr(struct adapter
 	sgp = ndesc == 1 ? (struct sg_ent *)&d->flit[flits] : sgl;
 	sgl_flits = make_sgl(skb, sgp, skb->h.raw, skb->tail - skb->h.raw,
 			     adap->pdev);
-	if (need_skb_unmap())
+	if (need_skb_unmap()) {
+		setup_deferred_unmapping(skb, adap->pdev, sgp, sgl_flits);
+		skb->destructor = deferred_unmap_destructor;
 		((struct unmap_info *)skb->cb)->len = skb->tail - skb->h.raw;
+	}

 	write_wr_hdr_sgl(ndesc, skb, d, pidx, q, sgl, flits, sgl_flits,
 			 gen, from->wr_hi, from->wr_lo);
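The idea in the patch above, tying the unmap to the lifetime of the packet rather than to the lifetime of its Tx descriptor, can be shown with a small user-space model. This is a sketch under loud assumptions: `struct packet` and every function here are hypothetical, a counter stands in for real DMA mappings, and the decision to defer is keyed on whether a destructor is installed, whereas the patch decides it from the queue's context id.

```c
#include <stddef.h>

/* Hypothetical packet: a mapping count plus an optional free-time hook. */
struct packet {
	size_t nr_mapped;			/* mappings still live */
	void (*destructor)(struct packet *);	/* run when the packet is freed */
};

/* Release every mapping the packet still holds. */
static void unmap_packet(struct packet *p)
{
	p->nr_mapped = 0;
}

/* Opt a packet into deferred unmapping: unmap at free time instead. */
static void setup_deferred_unmap(struct packet *p)
{
	p->destructor = unmap_packet;
}

/* Tx descriptor reclaimed: unmap now unless unmapping was deferred. */
static void tx_desc_done(struct packet *p)
{
	if (!p->destructor)
		unmap_packet(p);
}

/* Final free of the packet: the destructor, if any, runs here. */
static void packet_free(struct packet *p)
{
	if (p->destructor)
		p->destructor(p);
}
```

A packet set up with setup_deferred_unmap() keeps its mappings across tx_desc_done() and only loses them in packet_free(), which is the ordering the commit message asks for when the hardware may still read the data after the descriptor completes.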
[openib-general] [PATCH 5/6] Improve the traffic recovery after the HW ran out of response queue entries.
Improve the traffic recovery after the HW ran out of response queue entries.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/adapter.h |2 ++
 drivers/net/cxgb3/sge.c | 15 ++-
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
old mode 100755
new mode 100644
index 5c97a64..01b99b9
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -121,6 +121,8 @@ struct sge_rspq {		/* state for an SGE r
 	unsigned long empty;	/* # of times queue ran out of credits */
 	unsigned long nomem;	/* # of responses deferred due to no mem */
 	unsigned long unhandled_irqs;	/* # of spurious intrs */
+	unsigned long starved;
+	unsigned long restarted;
 };

 struct tx_desc;
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 822a598..4ff0ab6 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2376,13 +2376,26 @@ static void sge_timer_cb(unsigned long d
 		spin_unlock(&qs->txq[TXQ_OFLD].lock);
 	}
 	lock = (adap->flags & USING_MSIX) ? &qs->rspq.lock :
-	    &adap->sge.qs[0].rspq.lock;
+					    &adap->sge.qs[0].rspq.lock;
 	if (spin_trylock_irq(lock)) {
 		if (!napi_is_scheduled(qs->netdev)) {
+			u32 status = t3_read_reg(adap, A_SG_RSPQ_FL_STATUS);
+
 			if (qs->fl[0].credits < qs->fl[0].size)
 				__refill_fl(adap, &qs->fl[0]);
 			if (qs->fl[1].credits < qs->fl[1].size)
 				__refill_fl(adap, &qs->fl[1]);
+
+			if (status & (1 << qs->rspq.cntxt_id)) {
+				qs->rspq.starved++;
+				if (qs->rspq.credits) {
+					refill_rspq(adap, &qs->rspq, 1);
+					qs->rspq.credits--;
+					qs->rspq.restarted++;
+					t3_write_reg(adap, A_SG_RSPQ_FL_STATUS,
+						     1 << qs->rspq.cntxt_id);
+				}
+			}
 		}
 		spin_unlock_irq(lock);
 	}
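The recovery step added to the timer callback above can be read as: test this queue's bit in a status word, count the starvation, and if a credit is available give one entry back, count the restart and acknowledge the bit. Below is a user-space sketch of that bookkeeping. It rests on assumptions: `fake_status_reg` is a plain variable standing in for the A_SG_RSPQ_FL_STATUS register, writing the bit back is modelled as write-one-to-clear (the patch does not say how the hardware treats the write), and the actual refill of the hardware queue is not modelled.

```c
#include <stdint.h>

/* Hypothetical subset of the response-queue state used by the recovery. */
struct rspq {
	unsigned int cntxt_id;		/* selects this queue's status bit */
	unsigned int credits;		/* entries that can be handed back */
	unsigned long starved;		/* times the queue was seen starved */
	unsigned long restarted;	/* times it was successfully restarted */
};

/* Stand-in for the status register; one bit per response queue. */
static uint32_t fake_status_reg;

/* One pass of the recovery logic for a single queue. */
static void recover_rspq(struct rspq *q)
{
	uint32_t bit = (uint32_t)1 << q->cntxt_id;

	if (!(fake_status_reg & bit))
		return;			/* queue is not starved */

	q->starved++;
	if (q->credits) {
		q->credits--;		/* one entry goes back to the hardware */
		q->restarted++;
		fake_status_reg &= ~bit; /* assumed write-one-to-clear ack */
	}
}
```

When no credit is available the bit stays set, so the next timer tick sees the queue as starved again and retries; the starved and restarted counters make the two cases distinguishable afterwards.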
[openib-general] [PATCH 2/6] Clean up some private ioctls.
Clean up some private ioctls.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/cxgb3_ioctl.h | 33 +--
 drivers/net/cxgb3/cxgb3_main.c | 48 +++
 2 files changed, 15 insertions(+), 66 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_ioctl.h b/drivers/net/cxgb3/cxgb3_ioctl.h
old mode 100755
new mode 100644
index a942818..0a82fcd
--- a/drivers/net/cxgb3/cxgb3_ioctl.h
+++ b/drivers/net/cxgb3/cxgb3_ioctl.h
@@ -36,28 +36,17 @@ #define __CHIOCTL_H__
  * Ioctl commands specific to this driver.
  */
 enum {
-	CHELSIO_SETREG = 1024,
-	CHELSIO_GETREG,
-	CHELSIO_SETTPI,
-	CHELSIO_GETTPI,
-	CHELSIO_GETMTUTAB,
-	CHELSIO_SETMTUTAB,
-	CHELSIO_GETMTU,
-	CHELSIO_SET_PM,
-	CHELSIO_GET_PM,
-	CHELSIO_GET_TCAM,
-	CHELSIO_SET_TCAM,
-	CHELSIO_GET_TCB,
-	CHELSIO_GET_MEM,
-	CHELSIO_LOAD_FW,
-	CHELSIO_GET_PROTO,
-	CHELSIO_SET_PROTO,
-	CHELSIO_SET_TRACE_FILTER,
-	CHELSIO_SET_QSET_PARAMS,
-	CHELSIO_GET_QSET_PARAMS,
-	CHELSIO_SET_QSET_NUM,
-	CHELSIO_GET_QSET_NUM,
-	CHELSIO_SET_PKTSCHED,
+	CHELSIO_GETMTUTAB = 1029,
+	CHELSIO_SETMTUTAB = 1030,
+	CHELSIO_SET_PM = 1032,
+	CHELSIO_GET_PM = 1033,
+	CHELSIO_GET_MEM = 1038,
+	CHELSIO_LOAD_FW = 1041,
+	CHELSIO_SET_TRACE_FILTER = 1044,
+	CHELSIO_SET_QSET_PARAMS = 1045,
+	CHELSIO_GET_QSET_PARAMS = 1046,
+	CHELSIO_SET_QSET_NUM = 1047,
+	CHELSIO_GET_QSET_NUM = 1048,
 };

 struct ch_reg {
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
old mode 100755
new mode 100644
index 638b0ab..0e84c4e
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -1547,32 +1547,6 @@ static int cxgb_extension_ioctl(struct n
 		return -EFAULT;

 	switch (cmd) {
-	case CHELSIO_SETREG:{
-		struct ch_reg edata;
-
-		if (!capable(CAP_NET_ADMIN))
-			return -EPERM;
-		if (copy_from_user(&edata, useraddr, sizeof(edata)))
-			return -EFAULT;
-		if ((edata.addr & 3) != 0
-		    || edata.addr >= adapter->mmio_len)
-			return -EINVAL;
-		writel(edata.val, adapter->regs + edata.addr);
-		break;
-	}
-	case CHELSIO_GETREG:{
-		struct ch_reg edata;
-
-		if (copy_from_user(&edata, useraddr, sizeof(edata)))
-			return -EFAULT;
-		if ((edata.addr & 3) != 0
-		    || edata.addr >= adapter->mmio_len)
-			return -EINVAL;
-		edata.val = readl(adapter->regs + edata.addr);
-		if (copy_to_user(useraddr, &edata, sizeof(edata)))
-			return -EFAULT;
-		break;
-	}
 	case CHELSIO_SET_QSET_PARAMS:{
 		int i;
 		struct qset_params *q;
@@ -1836,10 +1810,10 @@ static int cxgb_extension_ioctl(struct n
 			return -EINVAL;

 		/*
-		 * Version scheme:
-		 * bits 0..9: chip version
-		 * bits 10..15: chip revision
-		 */
+		* Version scheme:
+		* bits 0..9: chip version
+		* bits 10..15: chip revision
+		*/
 		t.version = 3 | (adapter->params.rev << 10);
 		if (copy_to_user(useraddr, &t, sizeof(t)))
 			return -EFAULT;
@@ -1888,20 +1862,6 @@ static int cxgb_extension_ioctl(struct n
 						t.trace_rx);
 		break;
 	}
-	case CHELSIO_SET_PKTSCHED:{
-		struct ch_pktsched_params p;
-
-		if (!capable(CAP_NET_ADMIN))
-			return -EPERM;
-		if (!adapter->open_device_map)
-			return -EAGAIN;	/* uP and SGE must be running */
-		if (copy_from_user(&p, useraddr, sizeof(p)))
-			return -EFAULT;
-		send_pktsched_cmd(adapter, p.sched, p.idx, p.min, p.max,
-				  p.binding);
-		break;
-
-	}
 	default:
 		return -EOPNOTSUPP;
 	}
[openib-general] [PATCH 6/6] Populate Rx free list with pages.
Populate Rx free list with pages.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/adapter.h |9 +
 drivers/net/cxgb3/sge.c | 318 +++
 2 files changed, 235 insertions(+), 92 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
index 01b99b9..80c3d8f 100644
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -74,6 +74,11 @@ enum {				/* adapter flags */
 struct rx_desc;
 struct rx_sw_desc;

+struct sge_fl_page {
+	struct skb_frag_struct frag;
+	unsigned char *va;
+};
+
 struct sge_fl {			/* SGE per free-buffer list state */
 	unsigned int buf_size;	/* size of each Rx buffer */
 	unsigned int credits;	/* # of available Rx buffers */
@@ -81,11 +86,13 @@ struct sge_fl {			/* SGE per free-buffer
 	unsigned int cidx;	/* consumer index */
 	unsigned int pidx;	/* producer index */
 	unsigned int gen;	/* free list generation */
+	unsigned int cntxt_id;	/* SGE context id for the free list */
+	struct sge_fl_page page;
 	struct rx_desc *desc;	/* address of HW Rx descriptor ring */
 	struct rx_sw_desc *sdesc;	/* address of SW Rx descriptor ring */
 	dma_addr_t phys_addr;	/* physical address of HW ring start */
-	unsigned int cntxt_id;	/* SGE context id for the free list */
 	unsigned long empty;	/* # of times queue ran out of buffers */
+	unsigned long alloc_failed;	/* # of times buffer allocation failed */
 };

 /*
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 4ff0ab6..c237834 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -45,9 +45,25 @@ #include "firmware_exports.h"
 #define USE_GTS 0

 #define SGE_RX_SM_BUF_SIZE 1536
+
+/*
+ * If USE_RX_PAGE is defined, the small freelist populated with (partial)
+ * pages instead of skbs. Pages are carved up into RX_PAGE_SIZE chunks (must
+ * be a multiple of the host page size).
+ */
+#define USE_RX_PAGE
+#define RX_PAGE_SIZE 2048
+
+/*
+ * skb freelist packets are copied into a new skb (and the freelist one is
+ * reused) if their len is <=
+ */
 #define SGE_RX_COPY_THRES 256

-# define SGE_RX_DROP_THRES 16
+/*
+ * Minimum number of freelist entries before we start dropping TUNNEL frames.
+ */
+#define SGE_RX_DROP_THRES 16

 /*
  * Period of the Tx buffer reclaim timer. This timer does not need to run
@@ -85,7 +101,10 @@ struct tx_sw_desc {		/* SW state per Tx
 };

 struct rx_sw_desc {		/* SW state per Rx descriptor */
-	struct sk_buff *skb;
+	union {
+		struct sk_buff *skb;
+		struct sge_fl_page page;
+	} t;
 	DECLARE_PCI_UNMAP_ADDR(dma_addr);
 };

@@ -332,16 +351,27 @@ static void free_rx_bufs(struct pci_dev

 		pci_unmap_single(pdev, pci_unmap_addr(d, dma_addr),
 				 q->buf_size, PCI_DMA_FROMDEVICE);
-		kfree_skb(d->skb);
-		d->skb = NULL;
+
+		if (q->buf_size != RX_PAGE_SIZE) {
+			kfree_skb(d->t.skb);
+			d->t.skb = NULL;
+		} else {
+			if (d->t.page.frag.page)
+				put_page(d->t.page.frag.page);
+			d->t.page.frag.page = NULL;
+		}
 		if (++cidx == q->size)
 			cidx = 0;
 	}
+
+	if (q->page.frag.page)
+		put_page(q->page.frag.page);
+	q->page.frag.page = NULL;
 }

 /**
  *	add_one_rx_buf - add a packet buffer to a free-buffer list
- *	@skb: the buffer to add
+ *	@va: va of the buffer to add
  *	@len: the buffer length
  *	@d: the HW Rx descriptor to write
  *	@sd: the SW Rx descriptor to write
@@ -351,14 +381,13 @@ static void free_rx_bufs(struct pci_dev
  *	Add a buffer of the given length to the supplied HW and SW Rx
  *	descriptors.
  */
-static inline void add_one_rx_buf(struct sk_buff *skb, unsigned int len,
+static inline void add_one_rx_buf(unsigned char *va, unsigned int len,
 				  struct rx_desc *d, struct rx_sw_desc *sd,
 				  unsigned int gen, struct pci_dev *pdev)
 {
 	dma_addr_t mapping;

-	sd->skb = skb;
-	mapping = pci_map_single(pdev, skb->data, len, PCI_DMA_FROMDEVICE);
+	mapping = pci_map_single(pdev, va, len, PCI_DMA_FROMDEVICE);
 	pci_unmap_addr_set(sd, dma_addr, mapping);

 	d->addr_lo = cpu_to_be32(mapping);
@@ -383,14 +412,47 @@ static void refill_fl(struct adapter *ad
 {
 	struct rx_sw_desc *sd = &q->sdesc[q->pidx];
 	struct rx_desc *d = &q->desc[q->pidx];
+	struct sge_fl_page *p = &q->page;

 	while (n--) {
-		struct sk_buff *skb = alloc_skb(q->buf_size, gfp);
+		unsigned char *va;

-
Re: [openib-general] [PATCH] for OFED 1.2
On Mon, 2007-02-26 at 09:46 -0800, Sean Hefty wrote: Vladimir Sokolovsky wrote: On Fri, 2007-02-23 at 12:15 -0800, Sean Hefty wrote: I would like these fixes in OFED 1.2 as well. What git tree / branch do I generate a patch against? - Sean git://git.openfabrics.org/~vlad/ofed_1_2/.git branch: ofed_1_2 Can you try pulling from: git://git.openfabrics.org/~shefty/ofed_1_2.git ofed_1_2 - Sean Sean, Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2 -- Vladimir Sokolovsky [EMAIL PROTECTED] Mellanox Technologies Ltd.
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
On 2/27/07, Moni Levy [EMAIL PROTECTED] wrote: On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote: I did a short code review of the ipoib code concentrating on partitioning support and I mentioned that the asynchronous events handler in the ipoib code does not take the port number reported in the event record into consideration. The effect of that is that all of the ib# devices related to that specific HCA are flushed when it seems to me that only the relevant port one should be. Is that done on purpose, or am I missing something ? I don't think there's any particular reason the code is that way except for the oversight never being corrected. But it looks trivial to fix, like the patch below. Does that look right to you? p.s. I'm working on a patch that should solve another issue caused by PKEY reordering ipoib behavior and the above issue further complicates things for me. Why not fix the issue first then? commit a27cbe878203076247c1b5287f5ab59ed143b560 Author: Roland Dreier [EMAIL PROTECTED] Date: Tue Feb 27 07:37:49 2007 -0800 IPoIB: Only handle async events for one port An asynchronous event carries the port number that the event occurred on, so there's no reason for an IPoIB interface to process an event associated with a different local HCA port. 
Signed-off-by: Roland Dreier [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 3cb551b..7f3ec20 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
 	struct ipoib_dev_priv *priv =
 		container_of(handler, struct ipoib_dev_priv, event_handler);

-	if (record->event == IB_EVENT_PORT_ERR    ||
-	    record->event == IB_EVENT_PKEY_CHANGE ||
-	    record->event == IB_EVENT_PORT_ACTIVE ||
-	    record->event == IB_EVENT_LID_CHANGE  ||
-	    record->event == IB_EVENT_SM_CHANGE   ||
-	    record->event == IB_EVENT_CLIENT_REREGISTER) {
+	if ((record->event == IB_EVENT_PORT_ERR    ||
+	     record->event == IB_EVENT_PKEY_CHANGE ||
+	     record->event == IB_EVENT_PORT_ACTIVE ||
+	     record->event == IB_EVENT_LID_CHANGE  ||
+	     record->event == IB_EVENT_SM_CHANGE   ||
+	     record->event == IB_EVENT_CLIENT_REREGISTER) &&
+	    record->element.port_num == priv->port) {
 		ipoib_dbg(priv, "Port state change event\n");
 		queue_work(ipoib_workqueue, &priv->flush_task);
 	}

That's exactly what I intended to post.

On a second thought based on the fact that on a two port HCA we'll have a 50% miss on the events being delivered, I would move the new condition to be evaluated first. I apologize if this is too much of micro optimization. What do you think ?

--Moni

--Moni
Re: [openib-general] [PATCH] for OFED 1.2
> Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2

You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
> On a second thought based on the fact that on a two port HCA we'll have a 50% miss on the events being delivered, I would move the new condition to be evaluated first. I apologize if this is too much of micro optimization. What do you think ?

That wouldn't really be correct since element.port_num isn't valid unless we already know it's a port-related event. And it's not worth worrying about this since it's not remotely a hot path.

 - R.
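The ordering point in this reply can be illustrated with a small stand-alone sketch. It assumes the event record keeps its payload in a union, as the `element.port_num` access in the patch suggests, so the port number is only meaningful once the event type is known to be port-related. All `demo_` names below are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's event types. */
enum demo_event_type {
	DEMO_EVENT_CQ_ERR,       /* not port-related */
	DEMO_EVENT_PORT_ACTIVE,  /* port-related */
	DEMO_EVENT_PORT_ERR      /* port-related */
};

struct demo_event {
	enum demo_event_type type;
	union {
		void   *cq;        /* meaningful for CQ events only */
		uint8_t port_num;  /* meaningful for port events only */
	} element;
};

bool demo_is_port_event(enum demo_event_type type)
{
	return type == DEMO_EVENT_PORT_ACTIVE || type == DEMO_EVENT_PORT_ERR;
}

/*
 * True when an interface bound to my_port should react to the event.
 * The type check comes first: && short-circuits, so port_num is never
 * read for an event whose union holds some other member.
 */
bool demo_should_flush(const struct demo_event *ev, uint8_t my_port)
{
	return demo_is_port_event(ev->type) &&
	       ev->element.port_num == my_port;
}
```

Swapping the two operands would read `port_num` out of a union that may currently hold a pointer, which is the correctness concern raised above; the cost of evaluating the type check first is negligible off the hot path.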
Re: [openib-general] [PATCH] for OFED 1.2
On Tue, 2007-02-27 at 08:45 -0800, Sean Hefty wrote:
> > Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2
>
> You want me to create a patch that adds a file that contains the actual patches?

Yes, actual patches should be created under kernel_patches/fixes. Please update your git tree because the following patch fails:

From 2e7e33936de5f92656c0565ce88f97e796367dae Mon Sep 17 00:00:00 2001
From: Sean Hefty [EMAIL PROTECTED]
Date: Fri, 23 Feb 2007 12:35:43 -0800
Subject: [PATCH] rdma_cm: request reversible paths only

The rdma_cm requires that path records be reversible. Set the reversible bit when issuing an path record query.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9e0ab04..171cce9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1396,11 +1396,13 @@ static int cma_query_ib_route(struct rdma_id_private *id_priv, int timeout_ms,
 	ib_addr_get_dgid(addr, &path_rec.dgid);
 	path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(addr));
 	path_rec.numb_path = 1;
+	path_rec.reversible = 1;
 
 	id_priv->query_id = ib_sa_path_rec_get(&sa_client, id_priv->id.device,
 				id_priv->id.port_num, &path_rec,
 				IB_SA_PATH_REC_DGID | IB_SA_PATH_REC_SGID |
-				IB_SA_PATH_REC_PKEY | IB_SA_PATH_REC_NUMB_PATH,
+				IB_SA_PATH_REC_PKEY | IB_SA_PATH_REC_NUMB_PATH |
+				IB_SA_PATH_REC_REVERSIBLE,
 				timeout_ms, GFP_KERNEL,
 				cma_query_handler, work, &id_priv->query);
 

> Why not apply the patches directly?

To be consistent with 2.6.20 kernel.

-- 
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.
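The rdma_cm patch quoted above changes two things together: it sets `path_rec.reversible` and it adds `IB_SA_PATH_REC_REVERSIBLE` to the component mask. A plausible reading is that a field only constrains the query when its bit is present in the mask. The sketch below illustrates that idea with hypothetical `demo_`/`DEMO_` names and a made-up matching function; it is not the kernel's SA query code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical component-mask bits, one per record field. */
#define DEMO_REC_PKEY        (UINT64_C(1) << 0)
#define DEMO_REC_REVERSIBLE  (UINT64_C(1) << 1)

struct demo_path_rec {
	uint16_t pkey;
	uint8_t  reversible;
};

/*
 * A candidate matches when every field selected by comp_mask is equal
 * to the corresponding field in the query. Fields whose bit is not in
 * comp_mask are ignored, whatever value the query holds for them.
 */
bool demo_path_matches(const struct demo_path_rec *query, uint64_t comp_mask,
		       const struct demo_path_rec *candidate)
{
	if ((comp_mask & DEMO_REC_PKEY) && query->pkey != candidate->pkey)
		return false;
	if ((comp_mask & DEMO_REC_REVERSIBLE) &&
	    query->reversible != candidate->reversible)
		return false;
	return true;
}
```

Under this model, setting `reversible = 1` without the mask bit would have no effect on which paths come back, which would explain why the patch touches both lines.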
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> > Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2
>
> You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?

That's the ofed structure, this was discussed multiple times already. The point is to keep all changes to upstream components separate, to make updating to upstream kernel trivial in the future. Worked quite well for OFED 1.1 -> 1.2 transition.

-- 
MST
Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey
Sean,

On 2/26/07, Sean Hefty [EMAIL PROTECTED] wrote:
> I think the following patch would make ipoib spec compliant. ib_find_cached_pkey is called by ib_cm, rdma_cm, ib_srp, and ib_ipoib. I'm not certain what this change would do to SRP, but the ib_cm and rdma_cm look okay, given that non-reversible paths aren't supported yet anyway.

Sorry for jumping into that thread, but although this patch will make things more spec compliant, it will break functionality we depend on. I suggest that we first find an alternate way to enable usage of partial partition membership before disabling that functionality at all.

--Moni

> --
> ib_find_cached_pkey masks off the upper-bit of the PKey when searching for a match. The upper bit indicates partial or full membership. Ignoring the upper bit can result in a full membership PKey matching with a partial membership PKey. For ipoib, this can result in joining a multicast group that disallows communication between all members.
>
> Signed-off-by: Sean Hefty [EMAIL PROTECTED]
> ---
>  drivers/infiniband/core/cache.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index 558c9a0..6f366c3 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -179,7 +179,7 @@ int ib_find_cached_pkey(struct ib_device *device,
>  	*index = -1;
>
>  	for (i = 0; i < cache->table_len; ++i)
> -		if ((cache->table[i] & 0x7fff) == (pkey & 0x7fff)) {
> +		if (cache->table[i] == pkey) {
>  			*index = i;
>  			ret = 0;
>  			break;
> --
> 1.4.4.3
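To make the difference between the removed and the added comparison concrete, here is a minimal stand-alone sketch. It assumes the usual InfiniBand convention that the upper bit of a 16-bit PKey set to 1 means full membership and 0 means partial membership; the patch description only says the bit distinguishes the two. The `demo_`/`DEMO_` names are hypothetical and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PKEY_BASE_MASK 0x7fffu /* low 15 bits: partition number */
#define DEMO_PKEY_FULL_BIT  0x8000u /* upper bit: assumed 1 = full, 0 = partial */

/* Comparison on the removed line: partition numbers only. */
bool demo_pkey_match_masked(uint16_t a, uint16_t b)
{
	return (a & DEMO_PKEY_BASE_MASK) == (b & DEMO_PKEY_BASE_MASK);
}

/* Comparison on the added line: the membership bit must agree as well. */
bool demo_pkey_match_exact(uint16_t a, uint16_t b)
{
	return a == b;
}

/* Helper for callers that want to inspect the membership bit. */
bool demo_pkey_is_full_member(uint16_t pkey)
{
	return (pkey & DEMO_PKEY_FULL_BIT) != 0;
}
```

With these definitions, 0xffff and 0x7fff name the same partition, so the masked comparison treats them as a match while the exact comparison does not. That is both the spec-compliance gain the patch is after and the behaviour change that the objection in this message is about.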
Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
> > On a second thought based on the fact that on a two port HCA we'll have a 50% miss on the events being delivered, I would move the new condition to be evaluated first. I apologize if this is too much of micro optimization. What do you think ?
>
> That wouldn't really be correct since element.port_num isn't valid unless we already know it's a port-related event.

You're perfectly right, sorry.

> And it's not worth worrying about this since it's not remotely a hot path.

Ok.

--Moni

> - R.
Re: [openib-general] [PATCH] for OFED 1.2
> Yes, actual patches should be created under kernel_patches/fixes. Please update your git tree because the following patch fails:

Can you explain how the patch fails? I don't see how putting the patch into a file helps.

> > Why not apply the patches directly?
>
> To be consistent with 2.6.20 kernel.

You can check out stock 2.6.20 using a tag. Why maintain the ofed code in git if you don't use it to track patches?

- Sean
Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey
> Sorry for jumping into that thread, but although this patch will make things more spec compliant, it will break functionality we depend on. I suggest that we first find an alternate way to enable usage of partial partition membership before disabling that functionality at all.

Can you clarify the functionality you depend on? Are you reliant on ipoib being able to join a multicast group from partial partition membership? If so, do all SA's and switches support this?

- Sean
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> > Yes, actual patches should be created under kernel_patches/fixes. Please update your git tree because the following patch fails:
>
> Can you explain how the patch fails? I don't see how putting the patch into a file helps.

Try applying it?

> > > Why not apply the patches directly?
> >
> > To be consistent with 2.6.20 kernel.
>
> You can check out stock 2.6.20 using a tag. Why maintain the ofed code in git if you don't use it to track patches?

Basically so that conflicts in future merges from upstream are easy to resolve. If you like, let's reopen this for 1.3. We are after freeze in OFED 1.2.

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
On Tue, 2007-02-27 at 18:55 +0200, Michael S. Tsirkin wrote:
> Quoting Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] for OFED 1.2
>
> > > Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2
> >
> > You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?
>
> That's the ofed structure, this was discussed multiple times already. The point is to keep all changes to upstream components separate, to make updating to upstream kernel trivial in the future. Worked quite well for OFED 1.1 -> 1.2 transition.

Having these patches as files is painful for every developer because they cannot create a patch against ofed_1_2/drivers/infiniband/* nor the kernel.org upstream tree. They need to apply all the current patches and then create a patch on top of that. Or hope the patch applies fuzzily.

I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.

my 2 cents...
Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey
On 2/27/07, Sean Hefty [EMAIL PROTECTED] wrote:
> > Sorry for jumping into that thread, but although this patch will make things more spec compliant, it will break functionality we depend on. I suggest that we first find an alternate way to enable usage of partial partition membership before disabling that functionality at all.
>
> Can you clarify the functionality you depend on? Are you reliant on ipoib being able to join a multicast group from partial partition membership?

Exactly.

> If so, do all SA's and switches support this?

I can't commit on all the SA's and switches.

-- Moni

> - Sean
Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey
On Tue, 2007-02-27 at 12:06, Sean Hefty wrote:
> > Sorry for jumping into that thread, but although this patch will make things more spec compliant, it will break functionality we depend on. I suggest that we first find an alternate way to enable usage of partial partition membership before disabling that functionality at all.
>
> Can you clarify the functionality you depend on? Are you reliant on ipoib being able to join a multicast group from partial partition membership? If so, do all SA's and switches support this?

I'm not sure who can speak for all SAs, nor would the vendor SAs necessarily indicate this. From a quick code inspection of OpenSM, it appears to not enforce the compliance properly. Switches do whatever they are told to do by the SM.

-- Hal

> - Sean
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Steve Wise [EMAIL PROTECTED]:
Subject: Re: [openib-general] [PATCH] for OFED 1.2

> On Tue, 2007-02-27 at 18:55 +0200, Michael S. Tsirkin wrote:
> > Quoting Sean Hefty [EMAIL PROTECTED]:
> > Subject: Re: [PATCH] for OFED 1.2
> >
> > > > Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2
> > >
> > > You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?
> >
> > That's the ofed structure, this was discussed multiple times already. The point is to keep all changes to upstream components separate, to make updating to upstream kernel trivial in the future. Worked quite well for OFED 1.1 -> 1.2 transition.
>
> Having these patches as files is painful for every developer because they cannot create a patch against ofed_1_2/drivers/infiniband/* nor the kernel.org upstream tree.

Did you try using quilt which makes managing patch stacks quite easy? If you have quilt installed, OFED scripts actually use it to apply patches, so things are easy.

> They need to apply all the current patches and then create a patch on top of that. Or hope the patch applies fuzzily.

One point I can't stress enough: whatever way you create a patch, developers are expected to build and test it in OFED environment before posting.

> I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.

Well, I experimented with git rebase and it is unfortunately still fragile at this point. I agree using stacked git might be a good idea, I just did not have the chance to experiment with it enough. I had an impression that publishing stg managed branch creates problems for whoever attempts to track it, but I might be wrong.

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
> I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.

Can't we just create a new branch (ofed_1_2_patched) with these patches already applied and in the correct order? Maybe I'm just not understanding the work flow here...

- Sean
Re: [openib-general] [PATCH] for OFED 1.2
It would be great if all of this knowledge is posted to the wiki to avoid repeating this conversation in the future (or one of countless variations of this conversation). For example, I admit to not paying close attention to many of the threads on this list, but this was the first time I'd heard of quilt.

Specifically: if there are tools and methods that are helpful for OFA/OFED development, they should be detailed on the wiki. The wiki is where all permanent knowledge should be posted.

This is just my $0.01...

On Feb 27, 2007, at 12:31 PM, Michael S. Tsirkin wrote:
> Quoting Steve Wise [EMAIL PROTECTED]:
> Subject: Re: [openib-general] [PATCH] for OFED 1.2
>
> > On Tue, 2007-02-27 at 18:55 +0200, Michael S. Tsirkin wrote:
> > > Quoting Sean Hefty [EMAIL PROTECTED]:
> > > Subject: Re: [PATCH] for OFED 1.2
> > >
> > > > > Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2
> > > >
> > > > You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?
> > >
> > > That's the ofed structure, this was discussed multiple times already. The point is to keep all changes to upstream components separate, to make updating to upstream kernel trivial in the future. Worked quite well for OFED 1.1 -> 1.2 transition.
> >
> > Having these patches as files is painful for every developer because they cannot create a patch against ofed_1_2/drivers/infiniband/* nor the kernel.org upstream tree.
>
> Did you try using quilt which makes managing patch stacks quite easy? If you have quilt installed, OFED scripts actually use it to apply patches, so things are easy.
>
> > They need to apply all the current patches and then create a patch on top of that. Or hope the patch applies fuzzily.
>
> One point I can't stress enough: whatever way you create a patch, developers are expected to build and test it in OFED environment before posting.
>
> > I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.
>
> Well, I experimented with git rebase and it is unfortunately still fragile at this point. I agree using stacked git might be a good idea, I just did not have the chance to experiment with it enough. I had an impression that publishing stg managed branch creates problems for whoever attempts to track it, but I might be wrong.
>
> --
> MST

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> > I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.
>
> Can't we just create a new branch (ofed_1_2_patched) with these patches already applied and in the correct order?

Then what do we do when we want to update to new upstream? Throw this branch away? As it is, I just pull then build and remove patches that conflict.

By the way, there are backport patches, etc - it is still incorrect to say that you would be able to generate a patch out of git and know it's a good one without test-build.

> Maybe I'm just not understanding the work flow here...

Sean, please install quilt and try using it for working with the system.

Adding new patch is usually done in this way:

  quilt new patch
  quilt add files
  edit
  quilt refresh
  cp patches/patch kernel_patches/fixes/
  git add kernel_patches/fixes/patch
  git commit kernel_patches/fixes/patch

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
Lots of stuff *is* in wiki already - did you look at pages Vlad created? Things can always be improved, you can add stuff too.

Quoting Jeff Squyres [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> It would be great if all of this knowledge is posted to the wiki to avoid repeating this conversation in the future (or one of countless variations of this conversation). For example, I admit to not paying close attention to many of the threads on this list, but this was the first time I'd heard of quilt.
>
> Specifically: if there are tools and methods that are helpful for OFA/OFED development, they should be detailed on the wiki. The wiki is where all permanent knowledge should be posted.
>
> This is just my $0.01...
>
> On Feb 27, 2007, at 12:31 PM, Michael S. Tsirkin wrote:
> > Quoting Steve Wise [EMAIL PROTECTED]:
> > Subject: Re: [openib-general] [PATCH] for OFED 1.2
> >
> > > On Tue, 2007-02-27 at 18:55 +0200, Michael S. Tsirkin wrote:
> > > > Quoting Sean Hefty [EMAIL PROTECTED]:
> > > > Subject: Re: [PATCH] for OFED 1.2
> > > >
> > > > > > Please send patches that will be added to kernel_patches/fixes. Please update your git tree from git://git.openfabrics.org/~vlad/ofed_1_2/.git ofed_1_2
> > > > >
> > > > > You want me to create a patch that adds a file that contains the actual patches? Why not apply the patches directly?
> > > >
> > > > That's the ofed structure, this was discussed multiple times already. The point is to keep all changes to upstream components separate, to make updating to upstream kernel trivial in the future. Worked quite well for OFED 1.1 -> 1.2 transition.
> > >
> > > Having these patches as files is painful for every developer because they cannot create a patch against ofed_1_2/drivers/infiniband/* nor the kernel.org upstream tree.
> >
> > Did you try using quilt which makes managing patch stacks quite easy? If you have quilt installed, OFED scripts actually use it to apply patches, so things are easy.
> >
> > > They need to apply all the current patches and then create a patch on top of that. Or hope the patch applies fuzzily.
> >
> > One point I can't stress enough: whatever way you create a patch, developers are expected to build and test it in OFED environment before posting.
> >
> > > I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.
> >
> > Well, I experimented with git rebase and it is unfortunately still fragile at this point. I agree using stacked git might be a good idea, I just did not have the chance to experiment with it enough. I had an impression that publishing stg managed branch creates problems for whoever attempts to track it, but I might be wrong.
> >
> > --
> > MST
>
> --
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
> This is just my $0.01...

Thanks for the suggestions, but what does $0.01 buy one in US today?

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
On Tue, 2007-02-27 at 19:44 +0200, Michael S. Tsirkin wrote:
> Quoting Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] for OFED 1.2
>
> > > I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.
> >
> > Can't we just create a new branch (ofed_1_2_patched) with these patches already applied and in the correct order?
>
> Then what do we do when we want to update to new upstream? Throw this branch away? As it is, I just pull then build and remove patches that conflict.
>
> By the way, there are backport patches, etc - it is still incorrect to say that you would be able to generate a patch out of git and know it's a good one without test-build.
>
> > Maybe I'm just not understanding the work flow here...
>
> Sean, please install quilt and try using it for working with the system.
>
> Adding new patch is usually done in this way:
>
>   quilt new patch
>   quilt add files
>   edit
>   quilt refresh
>   cp patches/patch kernel_patches/fixes/
>   git add kernel_patches/fixes/patch
>   git commit kernel_patches/fixes/patch

NOTE: The key to the above process is the assumption that the developer maintains _all_ of the existing patches from kernel_patches/ on top of the ofed_1_2 tree using quilt or stg. Otherwise quilt/stg isn't buying you anything. And this doesn't take into account backports.

Regardless, you need to build, install and test any ofed patch on an ofed system, so you're gonna have extra work:

1) create ofed-specific patch
   build/test it on ofed
   post it to openib-general/ewg

2) create kernel.org patch
   build/test it on kernel.org
   post it to openib-general/lkml/netdev

My .27 cents...
Re: [openib-general] [PATCH] for OFED 1.2
On Feb 27, 2007, at 12:45 PM, Michael S. Tsirkin wrote:
> Lots of stuff *is* in wiki already - did you look at pages Vlad created?

A search for "quilt" on the wiki turns up nothing (I checked before I posted :-) ). And yes, I have [thoroughly] read the pages Vlad created. But the very fact that this conversation is occurring is because either the information is not on the wiki or what is on the wiki is not clear. Otherwise, I suspect that you simply would have pointed Steve to the wiki and said "Please read the fine manual at http://;".

Don't get me wrong; what has already been posted is great. I'm just saying: keep it coming! The wiki should be a living document that changes as our procedures and collective wisdom change. It saves us *all* time over the long run. A one-time dump of information is not nearly as useful as an ever-updated document.

> Things can always be improved, you can add stuff too.

https://wiki.openfabrics.org/tiki-lastchanges.php?days=31 shows that only Tziporet and myself have changed the OFED portion of the wiki over the past month. So -- *you* can add stuff to the wiki, too. :-)

> > This is just my $0.01...

It buys very little, if anything. In fact, a whole $0.02 also buys very little, if anything. So take my comments for what they're worth.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Steve Wise [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> On Tue, 2007-02-27 at 19:44 +0200, Michael S. Tsirkin wrote:
> > Quoting Sean Hefty [EMAIL PROTECTED]:
> > Subject: Re: [PATCH] for OFED 1.2
> >
> > > > I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3.
> > >
> > > Can't we just create a new branch (ofed_1_2_patched) with these patches already applied and in the correct order?
> >
> > Then what do we do when we want to update to new upstream? Throw this branch away? As it is, I just pull then build and remove patches that conflict.
> >
> > By the way, there are backport patches, etc - it is still incorrect to say that you would be able to generate a patch out of git and know it's a good one without test-build.
> >
> > > Maybe I'm just not understanding the work flow here...
> >
> > Sean, please install quilt and try using it for working with the system.
> >
> > Adding new patch is usually done in this way:
> >
> >   quilt new patch
> >   quilt add files
> >   edit
> >   quilt refresh
> >   cp patches/patch kernel_patches/fixes/
> >   git add kernel_patches/fixes/patch
> >   git commit kernel_patches/fixes/patch
>
> NOTE: The key to the above process is the assumption that the developer maintains _all_ of the existing patches from kernel_patches/ on top of the ofed_1_2 tree using quilt or stg. Otherwise quilt/stg isn't buying you anything.

OFED will do this automatically.

> And this doesn't take into account backports.

The process works with backport patches too: you just have to do this:

  quilt pop -a
  quilt new patch
  quilt add files
  edit
  quilt refresh
  quilt push -a

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
> > This is just my $0.01...
>
> It buys very little, if anything. In fact, a whole $0.02 also buys very little, if anything. So take my comments for what they're worth.

Oh, good, I thought deflation is getting out of hand ...

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
> > Lots of stuff *is* in wiki already - did you look at pages Vlad created?
>
> A search for "quilt" on the wiki turns up nothing (I checked before I posted :-) ). And yes, I have [thoroughly] read the pages Vlad created. But the very fact that this conversation is occurring is because either the information is not on the wiki or what is on the wiki is not clear. Otherwise, I suspect that you simply would have pointed Steve to the wiki and said "Please read the fine manual at http://;".

You are right in that, I don't disclaim it. Thanks for the suggestion, I'll try to find the time to add this to wiki.

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
> > > Sean, please install quilt and try using it for working with the system.
> > >
> > > Adding new patch is usually done in this way:
> > >
> > >   quilt new patch
> > >   quilt add files
> > >   edit
> > >   quilt refresh
> > >   cp patches/patch kernel_patches/fixes/
> > >   git add kernel_patches/fixes/patch
> > >   git commit kernel_patches/fixes/patch
> >
> > NOTE: The key to the above process is the assumption that the developer maintains _all_ of the existing patches from kernel_patches/ on top of the ofed_1_2 tree using quilt or stg. Otherwise quilt/stg isn't buying you anything.
>
> OFED will do this automatically.

uh, can you explain this? Given I have a freshly cloned ofed_1_2 git tree, and I want to change cma.c (a good one cuz there are patches). What do I do? There's no quilt stack at all at this point. Right?

> > And this doesn't take into account backports.
>
> The process works with backport patches too: you just have to do this:
>
>   quilt pop -a
>   quilt new patch
>   quilt add files
>   edit
>   quilt refresh
>   quilt push -a

But you cannot keep a stack for more than one backport pushed, right? So you still need to be slapping the stacks of patches around for each backport. Or maybe I'm confused?
Re: [openib-general] [PATCH] for OFED 1.2
> But you cannot keep a stack for more than one backport pushed, right? So you still need to be slapping the stacks of patches around for each backport.

Why not have separate branches for each kernel too?
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Steve Wise [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> > > > Sean, please install quilt and try using it for working with the system.
> > > >
> > > > Adding new patch is usually done in this way:
> > > >
> > > >   quilt new patch
> > > >   quilt add files
> > > >   edit
> > > >   quilt refresh
> > > >   cp patches/patch kernel_patches/fixes/
> > > >   git add kernel_patches/fixes/patch
> > > >   git commit kernel_patches/fixes/patch
> > >
> > > NOTE: The key to the above process is the assumption that the developer maintains _all_ of the existing patches from kernel_patches/ on top of the ofed_1_2 tree using quilt or stg. Otherwise quilt/stg isn't buying you anything.
> >
> > OFED will do this automatically.
>
> uh, can you explain this? Given I have a freshly cloned ofed_1_2 git tree, and I want to change cma.c (a good one cuz there are patches). What do I do? There's no quilt stack at all at this point. Right?

Try running the configure script. After this, quilt applied will show what patches are applied.

> > > And this doesn't take into account backports.
> >
> > The process works with backport patches too: you just have to do this:
> >
> >   quilt pop -a
> >   quilt new patch
> >   quilt add files
> >   edit
> >   quilt refresh
> >   quilt push -a
>
> But you cannot keep a stack for more than one backport pushed, right? So you still need to be slapping the stacks of patches around for each backport. Or maybe I'm confused?

Yes. Fortunately it's not too hard: you can do quilt pop -a and re-run configure for another kernel. Of course for testing the patch, it is easier to commit the change in your tree and then to use openfabrics cross-build functionality that will clone this tree and build for multiple arches/kernels.

-- 
MST
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Sean Hefty [EMAIL PROTECTED]:
Subject: RE: [PATCH] for OFED 1.2

> But you cannot keep a stack for more than one backport pushed, right? So you still need to be slapping the stacks of patches around for each backport.

> Why not have separate branches for each kernel too?

I think it'll be much more work to maintain all these branches. And again, there will be conflicts, and it's too easy to get confused when resolving a conflict. With patches we have scripts to automate this.

--
MST
[openib-general] remove www.openfabrics.org SVN links..
Can someone please update the main www.openfabrics.org web page to remove all references to subversion, and link to a wiki page on how to get the latest source? Thanks.
Re: [openib-general] [PATCH] for OFED 1.2
> I think it'll be much more work to maintain all these branches. And again, there will be conflicts, and it's too easy to get confused when resolving a conflict.

Storing patches in a directory seems confusing to me. They must be applied in a specific order for everything to work, and that knowledge is not captured. Conflicts need to be resolved anyway.

If someone wants to use scripts to make their life easier, that's fine, but scripts shouldn't be a necessity for checking out code and creating patches using git. For OFED they are.
Re: [openib-general] [PATCH] for OFED 1.2
On 19:44 Tue 27 Feb , Michael S. Tsirkin wrote:
> Quoting Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] for OFED 1.2

> I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3. Can't we just create a new branch (ofed_1_2_patched) with these patches already applied and in the correct order?

> Then what we do when we want to update to new upstream? Throw this branch away? As it is, I just pull then build and remove patches that conflict.

You can save this branch as branch-name-upstream-name (or better) and then rebase branch-name to the new upstream.

> By the way, there are backport patches, etc - it is still incorrect to say that you would be able to generate a patch out of git and know it's a good one without test-build.

In a similar way you can track backport patch sets as branches.

> Maybe I'm just not understanding the work flow here...

> Sean, please install quilt and try using it for working with the system.
> Adding new patch is usually done in this way
>
>     quilt new patch
>     quilt add files
>     edit
>     quilt refresh
>     cp patches/patch kernel_patches/fixes/
>     git add kernel_patches/fixes/patch
>     git commit kernel_patches/fixes/patch

It looks strange to me to track patches against patches...

Sasha
[openib-general] ofed_1_2_scripts for bug 372
Hi Vladimir.

I've attached a small patch to the ofed_1_2_scripts build.sh file for the mvapich2() function. This fixes bug 372, where the F90 compiler was not being set properly for the GNU compiler case and other possible compilers in the path were being found. This patch is against the latest ofed_1_2_scripts git.

--
Shaun Rowland [EMAIL PROTECTED]
http://www.cse.ohio-state.edu/~rowland/

diff --git a/build.sh b/build.sh
index ae5ea1e..86894be 100755
--- a/build.sh
+++ b/build.sh
@@ -528,9 +528,9 @@ mvapich2()
 MVAPICH2_COMP_ENV="CC=gcc CXX=g++"
 if [ $is_gfortran -eq 1 ]; then
-MVAPICH2_COMP_ENV="$MVAPICH2_COMP_ENV F77=gfortran"
+MVAPICH2_COMP_ENV="$MVAPICH2_COMP_ENV F77=gfortran F90=gfortran"
 elif [ $is_gcc_g77 -eq 1 ]; then
-MVAPICH2_COMP_ENV="$MVAPICH2_COMP_ENV F77=g77"
+MVAPICH2_COMP_ENV="$MVAPICH2_COMP_ENV F77=g77 F90=g77"
 fi
 ;;
 pathscale)
Re: [openib-general] [PATCH] for OFED 1.2
Quoting Sasha Khapyorsky [EMAIL PROTECTED]:
Subject: Re: [PATCH] for OFED 1.2

> On 19:44 Tue 27 Feb , Michael S. Tsirkin wrote:
> Quoting Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] for OFED 1.2

> I think with stacked git or just git and rebasing at key times, you could keep an ofed_1_2 tree that folks can easily apply patches to... It's too late to change this for 1.2, but you might want to reconsider the design for 1.3. Can't we just create a new branch (ofed_1_2_patched) with these patches already applied and in the correct order?

> Then what we do when we want to update to new upstream? Throw this branch away? As it is, I just pull then build and remove patches that conflict.

> You can save this branch as branch-name-upstream-name (or better) and to rebase branch-name to the new upstream.

rebase does not seem to be too robust when run on such a large repository as the linux kernel. Maybe stacked git will work.

> By the way, there are backport patches, etc - it is still incorrect to say that you would be able to generate a patch out of git and know it's a good one without test-build.

> In similar way you can track backport patch sets as branches.

At the moment it seems like a lot of work. Again, maybe stg makes it easy; I know it's hard with plain git. And I think lots of people (including me) will be confused if we have a ton of branches.

> Maybe I'm just not understanding the work flow here...

> Sean, please install quilt and try using it for working with the system.
> Adding new patch is usually done in this way
>
>     quilt new patch
>     quilt add files
>     edit
>     quilt refresh
>     cp patches/patch kernel_patches/fixes/
>     git add kernel_patches/fixes/patch
>     git commit kernel_patches/fixes/patch

> This looks strange for me to track patches against patches...

One gets used to it :)
Seriously, we have these patches, and we want to version them together with the source they are intended to apply to.

--
MST
Re: [openib-general] failure to create an FMR mapping 1K pages on memfree
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
> Is it really returning -ENOMEM? It seems much more likely that you are hitting the code
>
>     /* For Arbel, all MTTs must fit in the same page. */
>     if (mthca_is_memfree(dev) &&
>         mr->attr.max_pages * sizeof *mr->mem.arbel.mtts > PAGE_SIZE)
>             return -EINVAL;
>
> I guess you could call this limit a driver design issue.

Indeed, sorry for the inaccurate description: mthca_fmr_alloc returns -EINVAL and the fmr pool code returns -ENOMEM. Thanks for the clarification.

Or.
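As a rough illustration of why a request for 1K pages trips the check quoted above, here is a minimal user-space sketch of the same arithmetic. It is not driver code: `mtt_entry_t`, `fmr_mtt_check()` and `fmr_mtt_limit()` are hypothetical names, and the 8-byte entry size and the 4096-byte page size used in the note below are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one MTT entry is a 64-bit word. */
typedef uint64_t mtt_entry_t;

/*
 * Same shape as the quoted check: all MTT entries for one FMR must
 * fit in a single page. Returns 0 if they fit, -EINVAL otherwise.
 */
static int fmr_mtt_check(size_t max_pages, size_t page_size)
{
	if (max_pages * sizeof(mtt_entry_t) > page_size)
		return -EINVAL;
	return 0;
}

/* Largest max_pages value that still passes the check. */
static size_t fmr_mtt_limit(size_t page_size)
{
	return page_size / sizeof(mtt_entry_t);
}
```

Under these assumptions a 4096-byte page holds 512 entries, so a request with max_pages = 512 passes while max_pages = 1024 fails with -EINVAL, which would match the behaviour reported in this thread.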
[openib-general] Fwd: [ANNOUNCE] GIT 1.5.0.2
FYI.

----- Forwarded message from Junio C Hamano [EMAIL PROTECTED] -----

Subject: [ANNOUNCE] GIT 1.5.0.2
Date: Tue, 27 Feb 2007 10:58:22 +0200
In-Reply-To: [EMAIL PROTECTED] (Junio C. Hamano's message of Sun, 18 Feb 2007 18:07:42 -0800)
References: [EMAIL PROTECTED]
From: Junio C Hamano [EMAIL PROTECTED]

The latest maintenance release GIT 1.5.0.2 is available at the usual places:

  http://www.kernel.org/pub/software/scm/git/

  git-1.5.0.2.tar.{gz,bz2}              (tarball)
  git-htmldocs-1.5.0.2.tar.{gz,bz2}     (preformatted docs)
  git-manpages-1.5.0.2.tar.{gz,bz2}     (preformatted docs)
  RPMS/$arch/git-*-1.5.0.2-1.$arch.rpm  (RPM)

GIT v1.5.0.2 Release Notes
==========================

Fixes since v1.5.0.1
--------------------

* Bugfixes

  - Automated merge conflict handling when changes to symbolic links conflicted were completely broken. The merge-resolve strategy created a regular file with conflict markers in it in place of the symbolic link. The default strategy, merge-recursive was even more broken. It removed the path that was pointed at by the symbolic link. Both of these problems have been fixed.

  - 'git diff maint master next' did not correctly give combined diff across three trees.

  - 'git fast-import' portability fix for Solaris.

  - 'git show-ref --verify' without arguments did not error out but segfaulted.

  - 'git diff :tracked-file `pwd`/an-untracked-file' gave an extra slashes after a/ and b/.

  - 'git format-patch' produced too long filenames if the commit message had too long line at the beginning.

  - Running 'make all' and then without changing anything running 'make install' still rebuilt some files. This was inconvenient when building as yourself and then installing as root (especially problematic when the source directory is on NFS and root is mapped to nobody).

  - 'git-rerere' failed to deal with two unconflicted paths that sorted next to each other.

  - 'git-rerere' attempted to open(2) a symlink and failed if there was a conflict. Since a conflicting change to a symlink would not benefit from rerere anyway, the command now ignores conflicting changes to symlinks.

  - 'git-repack' did not like to pass more than 64 arguments internally to underlying 'rev-list' logic, which made it impossible to repack after accumulating many (small) packs in the repository.

  - 'git-diff' to review the combined diff during a conflicted merge were not reading the working tree version correctly when changes to a symbolic link conflicted. It should have read the data using readlink(2) but read from the regular file the symbolic link pointed at.

  - 'git-remote' did not like period in a remote's name.

* Documentation updates

  - added and clarified core.bare, core.legacyheaders configurations.

  - updated git-clone --depth documentation.

* Assorted git-gui fixes.

-
To unsubscribe from this list: send the line unsubscribe git in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html

----- End forwarded message -----

--
MST
Re: [openib-general] [RFC/BUG] DMA vs. CQ race
> On our cell blade + PCI-e Mellanox.

> I don't see anything in arch/powerpc that looks like dma_alloc_coherent() will do anything other than allocate some memory and map it with DMA_BIDIRECTIONAL. So how does this altix fix help in your situation? Am I misreading the Cell IOMMU code?

Shirley, can you clarify why doing dma_alloc_coherent() in the kernel helps on your Cell blade? It really seems that dma_alloc_coherent() just allocates some memory and then does dma_map(DMA_BIDIRECTIONAL), which would be exactly the same as allocating the CQ buffer in userspace and using ib_umem_get() to map it into the kernel.

I'm looking at a possibly cleaner solution to the Altix issue, so I would like to make sure it fixes whatever the bug on Cell is as well. So any details you can provide about the problem you see on Cell would help a lot.

Thanks...
Re: [openib-general] Port error rate detection
On Mon, Feb 19, 2007 at 03:53:36PM -0500, Steven Carter wrote:
> I have a Nagios module that alerts on connectivity, port errors, speed/width problems. I would like to give it the ability to change the severity of the alert depending on whether errors are just present or if they are increasing faster than a specified rate.
>
> The intent is to equip the module to keep the state of the last query and possibly history, but I wanted to make sure that I was not re-inventing the wheel first. Is there an attribute or utility that I am overlooking that will help me do this?

One other thing you might want to take a look at is the Fountain/Goanna node monitoring setup... It's not really anything like the proposed performance manager, but it might get you what you need. (And we'd like some feedback on what it should do differently ;)

http://www.scl.ameslab.gov/Projects/Monitor/
Re: [openib-general] [RFC/BUG] DMA vs. CQ race
Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 01:40:36 PM:

> Shirley, can you clarify why doing dma_alloc_coherent() in the kernel helps on your Cell blade? It really seems that dma_alloc_coherent() just allocates some memory and then does dma_map(DMA_BIDIRECTIONAL), which would be exactly the same as allocating the CQ buffer in userspace and using ib_umem_get() to map it into the kernel.
>
> I'm looking at a possibly cleaner solution to the Altix issue, so I would like to make sure it fixes whatever the bug on Cell is as well. So any details you can provide about the problem you see on Cell would help a lot.
>
> Thanks...

Thanks, Roland. After reviewing the whole thread, I see that the failure on Cell is different from the Altix issue, so this fix might not help Cell. The problem I have might be related to multiple DMAs mapping to the same CQ; the sync might be getting lost somewhere else.

Thanks
Shirley Ma
[openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish
Hello Roland,

Sorry to bother you again. Could you please review the patch below to see whether it can go upstream soon? IPoIB interfaces can't ping each other if the broadcast join succeeds but any other IB multicast join fails (for example, the IB multicast group join for the default IPv6 link-local solicited address) when bringing the interface up. This does impact IPoIB usability in large clusters when MCG LIDs are limited.

Thanks
Shirley Ma

----- Forwarded by Shirley Ma/Beaverton/IBM on 02/27/07 06:23 AM -----

From: Shirley Ma/Beaverton/IBM@IBMUS
Sent by: openib-general-bo[EMAIL PROTECTED]
To: Roland Dreier [EMAIL PROTECTED]
cc: openib-general@openib.org
Date: 02/05/07 06:50 AM
Subject: [openib-general] [PATCH] enable IPoIB only if broadcast join finish

Hi, Roland,

Please review this patch.

According to IPoIB RFC 4391 section 5, once the IPoIB broadcast group has been joined, the interface should be ready for data transfer. In the current IPoIB implementation, the interface is UP and RUNNING only when all default multicast joins are successful. We hit a problem where the broadcast join finishes successfully but the all-hosts multicast join fails.

Here is the patch. If possible, please give your input asap; we have an urgent customer issue that needs to be resolved:

diff -urpN ipoib/ipoib_multicast.c ipoib-multicast/ipoib_multicast.c
--- ipoib/ipoib_multicast.c 2006-11-29 13:57:37.0 -0800
+++ ipoib-multicast/ipoib_multicast.c 2007-02-04 22:34:16.0 -0800
@@ -402,6 +402,11 @@ static void ipoib_mcast_join_complete(in
 		queue_work(ipoib_workqueue, &priv->mcast_task);
 		mutex_unlock(&mcast_mutex);
 		complete(&mcast->done);
+		/*
+		 * broadcast join finished, enable carrier
+		 */
+		if (mcast == priv->broadcast)
+			netif_carrier_on(dev);
 		return;
 	}

@@ -599,7 +604,6 @@ void ipoib_mcast_join_task(void *dev_ptr
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");

 	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
-	netif_carrier_on(dev);
 }

 int ipoib_mcast_start_thread(struct net_device *dev)

(See attached file: ipoib-multicast.patch)

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638
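The behaviour the patch is after can be shown with a small user-space model. This is only a sketch, not IPoIB code: `struct mcast_group`, `struct fake_priv` and `join_complete()` are hypothetical stand-ins, and a plain boolean takes the place of the carrier state that netif_carrier_on() would set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one multicast group. */
struct mcast_group {
	int joined;
};

/* Hypothetical stand-in for the per-interface private data. */
struct fake_priv {
	struct mcast_group *broadcast; /* the broadcast group */
	bool carrier;                  /* models the carrier state */
};

/*
 * Called when a join succeeds. As in the patch, the carrier is
 * raised as soon as the broadcast group is joined, instead of
 * waiting for every multicast join to succeed.
 */
static void join_complete(struct fake_priv *priv, struct mcast_group *mcast)
{
	mcast->joined = 1;
	if (mcast == priv->broadcast)
		priv->carrier = true;
}
```

In this model, an interface whose broadcast join completes has its carrier up even if another group (for example the all-hosts group) never joins, while joining only a non-broadcast group leaves the carrier down.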
Re: [openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish
I don't think this applies any more since Sean's multicast stuff was merged. I didn't realize you wanted to get this merged upstream -- anyway, can you please regenerate the patch against the latest kernel? Thanks
Re: [openib-general] IPOIB NAPI
Roland Dreier [EMAIL PROTECTED] wrote on 02/26/2007 02:36:26 PM:

> No way, it's way too late at this point to change the kernel-user ABI, let alone change all ULPs.
>
>  - R.

Hello Roland,

So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can generate the patch for all ULPs to use this for review. Do you need me to do that?

Thanks
Shirley Ma
Re: [openib-general] IPOIB NAPI
> So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can generate the patch for all ULPs to use this for review. Do you need me to do that?

No, it's not in OFED 1.2 or the upstream kernel. And no one has implemented it for userspace (and I'm somewhat reluctant to break the ABI at this point without some performance numbers to motivate making this API change).

Have the NAPI performance problems with ehca been resolved? We could probably merge IPoIB NAPI for 2.6.22 then, which would pull in the kernel changes at least.

 - R.
[openib-general] cannot instal ofed-1.2 kernel rpm on 2.6.20.1
I built the ofed 1.2 rpms from the OFED-1.2-20070227-0602 build and the kernel rpm fails to install on a 2.6.20.1 kernel:

vic13:/usr/local/src/OFED-1.2-20070227-0602/RPMS/sles-release-10-15.2 # rpm -U kernel-ib-1.2-2.6.20.1.x86_64.rpm
error: Failed dependencies:
    ksym(schedule) = 1000e51 is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(__up_wakeup) = 1042cbb5 is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(pci_request_region) = 10cc2981 is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(skb_dequeue) = 10fc721b is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(mod_timer) = 14777d07 is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(remap_pfn_range) = 155834a8 is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(unregister_netevent_notifier) = 1598dc9d is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(bad_dma_address) = 1675606f is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ksym(dev_get_by_name) = 16ab1a6b is needed by kernel-ib-1.2-2.6.20.1.x86_64
    ... many more of these deleted

Anybody seen this?
Re: [openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish
Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:35:34 PM:

> I don't think this applies any more since Sean's multicast stuff was merged. I didn't realize you wanted to get this merged upstream -- anyway, can you please regenerate the patch against the latest kernel? Thanks

Sure. I will generate a new patch.

Thanks
Shirley Ma
Re: [openib-general] [PATCH] osm: trivial data type change to remove compilation warning
On Mon, 2007-02-26 at 06:20, Yevgeny Kliteynik wrote:
> Hi Hal
>
> Trivial data type change to remove compilation warning. Please apply to the trunk and to the 1.2 branch.
>
> Thanks.
>
> Signed-off-by: Yevgeny Kliteynik [EMAIL PROTECTED]

Thanks. Applied (to both master and ofed_1_2).

-- Hal
Re: [openib-general] IPOIB NAPI
Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:41:44 PM:

> So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can generate the patch for all ULPs to use this for review. Do you need me to do that?

> No, it's not in OFED 1.2 or the upstream kernel. And no one has implemented it for userspace (and I'm somewhat reluctant to break the ABI at this point without some performance numbers to motivate making this API change).
>
> Have the NAPI performance problems with ehca been resolved? We could probably merge IPoIB NAPI for 2.6.22 then, which would pull in the kernel changes at least.
>
>  - R.

We have addressed the NAPI performance issues with the ehca driver. I believe the patches have gone upstream. However, the test results show that it's better to delay polling again until the next NAPI interval, something like this:

    poll-cq
    notify-cq, if missed_event
    netif_rx_reschedule()
    return 1

vs.

    poll-cq,
    notify-cq, if missed_event
    netif_rx_reschedule()
    poll again
    return 0

It seems ehca delivers packets much faster than other HCAs, so "poll again" would stay in the loop many, many times. Since the above change doesn't impact other HCAs, I would recommend it. I have seen the same implementation in other Ethernet drivers.

Thanks
Shirley Ma
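To make the difference between the two poll strategies described above concrete, here is a small user-space model. It is a sketch under stated assumptions, not driver code: `struct fake_cq`, `fake_poll_cq()`, `fake_notify_cq()`, `poll_defer()` and `poll_again()` are all hypothetical names, netif_rx_reschedule() is represented only by the return value, and a "fast HCA" is modelled as one that delivers a new burst of completions each time notification is re-armed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a CQ fed by an HCA; not a kernel structure. */
struct fake_cq {
	int pending;     /* completions waiting to be polled */
	int burst_size;  /* completions that arrive while re-arming */
	int bursts_left; /* how many more such bursts will arrive */
};

/* Models poll-cq: drain everything pending and return the count. */
static int fake_poll_cq(struct fake_cq *cq)
{
	int n = cq->pending;

	cq->pending = 0;
	return n;
}

/* Models notify-cq with missed-event reporting: true if work arrived. */
static bool fake_notify_cq(struct fake_cq *cq)
{
	if (cq->bursts_left == 0)
		return false;
	cq->bursts_left--;
	cq->pending += cq->burst_size;
	return true;
}

/* First variant: on a missed event, return 1 and poll next interval. */
static int poll_defer(struct fake_cq *cq, int *handled)
{
	*handled += fake_poll_cq(cq);
	if (fake_notify_cq(cq))
		return 1; /* stands in for netif_rx_reschedule(); return 1 */
	return 0;
}

/* Second variant: on a missed event, poll again inside the same call. */
static int poll_again(struct fake_cq *cq, int *handled, int *loops)
{
	for (;;) {
		(*loops)++;
		*handled += fake_poll_cq(cq);
		if (!fake_notify_cq(cq))
			return 0;
	}
}
```

In this model, with three bursts still to come, poll_again() makes four passes before it returns, whereas poll_defer() handles one batch, returns 1, and leaves the remaining work for later calls. That is the "stays in the loop" effect the message attributes to a fast HCA; whether it translates into better throughput on real hardware is what the reported test results speak to, not this sketch.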
Re: [openib-general] cannot instal ofed-1.2 kernel rpm on 2.6.20.1
I opened bug 399 to track this.

I also opened bug 398 because I got an error installing opensm with this same OFED-1.2 build.

Steve.

On Tue, 2007-02-27 at 16:43 -0600, Steve Wise wrote:
> I built the ofed 1.2 rpms from the OFED-1.2-20070227-0602 build and the kernel rpm fails to install on a 2.6.20.1 kernel:
>
> vic13:/usr/local/src/OFED-1.2-20070227-0602/RPMS/sles-release-10-15.2 # rpm -U kernel-ib-1.2-2.6.20.1.x86_64.rpm
> error: Failed dependencies:
>     ksym(schedule) = 1000e51 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(__up_wakeup) = 1042cbb5 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(pci_request_region) = 10cc2981 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(skb_dequeue) = 10fc721b is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(mod_timer) = 14777d07 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(remap_pfn_range) = 155834a8 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(unregister_netevent_notifier) = 1598dc9d is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(bad_dma_address) = 1675606f is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(dev_get_by_name) = 16ab1a6b is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ... many more of these deleted
>
> Anybody seen this?
Re: [openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish
Hello Roland,

Here is the new patch against 2.6.20-rc1 kernel. Please review it.

diff -urpN ipoib/ipoib_multicast.c ipoib-link/ipoib_multicast.c
--- ipoib/ipoib_multicast.c 2007-02-27 07:21:50.0 -0800
+++ ipoib-link/ipoib_multicast.c 2007-02-27 07:52:10.0 -0800
@@ -407,6 +407,11 @@ static int ipoib_mcast_join_complete(int
 		queue_delayed_work(ipoib_workqueue, &priv->mcast_task, 0);
 		mutex_unlock(&mcast_mutex);
+		/*
+		 * broadcast join finished, enable carrier
+		 */
+		if (unlikely(mcast == priv->broadcast))
+			netif_carrier_on(dev);
 		return 0;
 	}

@@ -596,7 +601,6 @@ void ipoib_mcast_join_task(struct work_s
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");

 	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
-	netif_carrier_on(dev);
 }

 int ipoib_mcast_start_thread(struct net_device *dev)

(See attached file: ipoib-link.patch)

Thanks
Shirley Ma
[openib-general] [Bug 263] OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop
https://bugs.openfabrics.org/show_bug.cgi?id=263

[EMAIL PROTECTED] changed:

  What   | Removed  | Added
  Status | RESOLVED | CLOSED

--- Comment #14 from [EMAIL PROTECTED] 2007-02-27 21:00 ---
With OFED 1.2 alpha1, I was able to failover/failback an IB port every 10 seconds for 8 hours on RHEL4 x86_64 LionMini SDR and DDR. Will keep testing on other platforms.

--
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
Re: [openib-general] IPOIB NAPI
Quoting Shirley Ma [EMAIL PROTECTED]:
Subject: Re: [openib-general] IPOIB NAPI

> Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:41:44 PM:

> So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can generate the patch for all ULPs to use this for review. Do you need me to do that?

> No, it's not in OFED 1.2 or the upstream kernel. And no one has implemented it for userspace (and I'm somewhat reluctant to break the ABI at this point without some performance numbers to motivate making this API change).
>
> Have the NAPI performance problems with ehca been resolved? We could probably merge IPoIB NAPI for 2.6.22 then, which would pull in the kernel changes at least.
>
>  - R.

> We have addressed the NAPI performance issues with ehca driver. I believe the patches have been upper stream. However the test results show that it's better to delay poll again to next NAPI interval, something like this:
>
>     poll-cq
>     notify-cq, if missed_event
>     netif_rx_reschedule()
>     return 1
>
> vs.
>
>     poll-cq,
>     notify-cq, if missed_event
>     netif_rx_reschedule()
>     poll again
>     return 0
>
> It seems ehca delivering packet much faster than other HCAs. So poll again would stay in the loop for many many times. So the above changes doesn't impact other HCAs, I would recommand it. I saw same implementations on other ethernet drivers.

I'm confused. Which one is faster?

--
MST
[openib-general] [Bug 400] New: OFED 1.2 alpha1 IPoIB HA failover gets QP warnings
https://bugs.openfabrics.org/show_bug.cgi?id=400

Summary: OFED 1.2 alpha1 IPoIB HA failover gets QP warnings
Product: OpenFabrics Linux
Version: 1.2alpha1
Platform: X86-64
OS/Version: RHEL 4
Status: NEW
Severity: normal
Priority: P3
Component: IPoIB
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]

OFED 1.2 alpha1 on RHEL4 U4 x86_64, LionMini DDR HCA.

I have IPoIB HA configured, running traffic via netperf, and bringing up/down a different host IB port every 10 seconds. This is working for several hours, but I see warnings in dmesg, more on server side.

Client dmesg:

ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib_mthca :04:00.0: QP 000404 not found in MGM
ib0: ib_detach_mcast failed (result = -22)
ib0: ipoib_mcast_detach failed (result = -22)
[EMAIL PROTECTED] log]#

Server dmesg:

ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
[EMAIL PROTECTED] log]#
[openib-general] [Bug 400] OFED 1.2 alpha1 IPoIB HA failover gets QP warnings
https://bugs.openfabrics.org/show_bug.cgi?id=400

[EMAIL PROTECTED] changed:

What      |Removed          |Added
AssignedTo|[EMAIL PROTECTED]|[EMAIL PROTECTED]

--- Comment #1 from [EMAIL PROTECTED] 2007-02-27 21:18 ---
Roland, can you take a look at this, please?

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [Bug 400] OFED 1.2 alpha1 IPoIB HA failover gets QP warnings
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)

Looks like this is related to the multicast change that recently went upstream. So this likely affects upstream IPoIB as well.

--
MST
[OFA General] Re: [openib-general] IPOIB NAPI
I'm confused. Which one is faster?

Sorry for the confusion, Michael. The one with return 1 has better throughput.

Thanks
Shirley Ma
___
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[OFA General] [Bug 371] IPoIB HA not working properly with OFED1.2-alpha
https://bugs.openfabrics.org/show_bug.cgi?id=371

[EMAIL PROTECTED] changed:

What|Removed|Added
CC  |       |[EMAIL PROTECTED]
[OFA General] [Bug 371] IPoIB HA not working properly with OFED1.2-alpha
https://bugs.openfabrics.org/show_bug.cgi?id=371

[EMAIL PROTECTED] changed:

What      |Removed          |Added
AssignedTo|[EMAIL PROTECTED]|[EMAIL PROTECTED]

--- Comment #2 from [EMAIL PROTECTED] 2007-02-27 23:08 ---
Assigned to Vlad.
[OFA General] List Address Change Completed
This list has been migrated to the new server, lists.openfabrics.org. Please update any address book or filter settings to reflect the new mailing list address.

Future messages and replies should be sent to this address:
[EMAIL PROTECTED]

The new web address for this list is:
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

If you have any questions, please contact me at [EMAIL PROTECTED]

Regards,
Michael
[OFA General] Re: IPOIB NAPI
Quoting Shirley Ma [EMAIL PROTECTED]:
Subject: Re: IPOIB NAPI

Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:41:44 PM:

So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can generate the patch for all ULPs to use this for review. Do you need me to do that?

No, it's not in OFED 1.2 or the upstream kernel. And no one has implemented it for userspace (and I'm somewhat reluctant to break the ABI at this point without some performance numbers to motivate making this API change).

Have the NAPI performance problems with ehca been resolved? We could probably merge IPoIB NAPI for 2.6.22 then, which would pull in the kernel changes at least.

- R.

We have addressed the NAPI performance issues with the ehca driver. I believe the patches have gone upstream. However, the test results show that it's better to delay polling again until the next NAPI interval, something like this:

    poll-cq
    notify-cq, if missed_event
        netif_rx_reschedule()
        return 1

vs.

    poll-cq
    notify-cq, if missed_event
        netif_rx_reschedule()
        poll again
    return 0

It seems ehca delivers packets much faster than other HCAs, so "poll again" would stay in the loop many, many times. The above change doesn't impact other HCAs, so I would recommend it. I saw the same implementation in other Ethernet drivers.

I have not benchmarked this, but actually the return 1 version makes sense to me too: since a new completion was observed after notify-cq, we likely currently have the HCA writing new completions into the CQ at a high rate, so it makes sense to delay polling by a few cycles, and reduce the number of interrupts in this way. Right?

--
MST
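The two polling strategies compared in the IPOIB NAPI thread above can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `MockCQ`, `poll_defer`, `poll_again`, and the arrival pattern are hypothetical names and data invented for illustration, and the model ignores interrupts, real budget accounting, and the actual `netif_rx_reschedule()` semantics. It only shows where the repeated polling happens in each variant.

```python
from collections import deque


class MockCQ:
    """Toy completion queue (hypothetical, not a verbs or kernel API).

    `arrivals` lists how many completions land at each poll/arm step.
    """

    def __init__(self, arrivals):
        self.arrivals = deque(arrivals)
        self.pending = 0

    def _tick(self):
        # Model the HCA writing new completions while the host is busy.
        if self.arrivals:
            self.pending += self.arrivals.popleft()

    def poll(self, budget):
        """Reap up to `budget` completions; return how many were reaped."""
        self._tick()
        reaped = min(budget, self.pending)
        self.pending -= reaped
        return reaped

    def request_notify(self):
        """Arm the CQ; return True if completions were missed."""
        self._tick()
        return self.pending > 0


def poll_defer(cq, budget):
    """'return 1' variant: one poll/arm pass per invocation."""
    done = cq.poll(budget)
    missed = cq.request_notify()
    # On a missed event, stay scheduled and poll at the next interval.
    return done, 1 if missed else 0


def poll_again(cq, budget):
    """'poll again' variant: loop until arming reports no missed event."""
    done = 0
    passes = 0
    while True:
        done += cq.poll(budget)
        passes += 1
        if not cq.request_notify():
            return done, 0, passes
```

With the assumed arrival pattern `[3, 2, 2, 1]` and a budget of 16, `poll_again` reaps all 8 completions in a single invocation that makes 3 passes, while `poll_defer` reaps the same 8 over three invocations and returns 1 from the first two. In this model, a source that keeps producing completions would keep `poll_again` inside its loop, which is the behaviour the thread attributes to ehca.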
Re: [OFA General] List Address Change Completed
Quoting Lee, Michael Paichi [EMAIL PROTECTED]:
Subject: [OFA General] List Address Change Completed

This list has been migrated to the new server, lists.openfabrics.org. Please update any address book or filter settings to reflect the new mailing list address.

Future messages and replies should be sent to this address:
[EMAIL PROTECTED]

The new web address for this list is:
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

If you have any questions, please contact me at [EMAIL PROTECTED]

Can the subject prefix be made all lower-case, with a dash, please? OFA General -> ofa-general? Upper-case words look like shouting to me, and e.g. Exchange rules are limited in coping with spaces.

--
MST
[OFA General] Re: [PATCH 0/6] ofed_1_2: cxgb3 bug fixes
On Tue, 2007-02-27 at 09:59 -0600, Steve Wise wrote:

Hey Vlad,

These fixes need to be pulled into ofed_1_2 for the Chelsio Ethernet driver. You can pull them directly from my ofa git tree:

git://staging.openfabrics.org/~swise/ofed_1_2 cxgb3_fixes

Thanks,
Steve.

Applied.

--
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.
[ofa-general] RE: [OFA General] List Address Change Completed
Done

-Original Message-
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Tue 2/27/2007 11:23 PM
To: Lee, Michael Paichi
Cc: [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: [OFA General] List Address Change Completed

Quoting Lee, Michael Paichi [EMAIL PROTECTED]:
Subject: [OFA General] List Address Change Completed

This list has been migrated to the new server, lists.openfabrics.org. Please update any address book or filter settings to reflect the new mailing list address.

Future messages and replies should be sent to this address:
[EMAIL PROTECTED]

The new web address for this list is:
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

If you have any questions, please contact me at [EMAIL PROTECTED]

Can the subject prefix be made all lower-case, with dash, please? OFA General - ofa-general? Upper case words look like shouting to me, and e.g. exchange rules are limited in coping with spaces.

--
MST