[openib-general] ofa_1_2_kernel 20070227-0200 daily build status

2007-02-27 Thread vlad
This email was generated automatically, please do not reply


Common build parameters:  --with-ipoib-mod --with-sdp-mod --with-srp-mod 
--with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod 
--with-addr_trans-mod --with-cxgb3-mod 

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.20
Passed on powerpc with linux-2.6.19
Passed on powerpc with linux-2.6.17
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.18
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.16
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.17
Passed on ia64 with linux-2.6.18
Passed on powerpc with linux-2.6.16
Passed on x86_64 with linux-2.6.13
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.12
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on x86_64 with linux-2.6.9-42.ELsmp
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.17
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.14
Passed on x86_64 with linux-2.6.5-7.244-smp
Passed on x86_64 with linux-2.6.16.21-0.8-smp
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.15
Passed on powerpc with linux-2.6.15
Passed on ia64 with linux-2.6.15
Passed on ia64 with linux-2.6.13
Passed on ppc64 with linux-2.6.13
Passed on ia64 with linux-2.6.14
Passed on x86_64 with linux-2.6.18-1.2798.fc6
Passed on ia64 with linux-2.6.16.21-0.8-default

Failed:
Build failed on x86_64 with linux-2.6.9-22.ELsmp
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:167:
 error: ‘ADVERTISE_PAUSE_CAP’ undeclared (first use in this function)
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:167:
 error: (Each undeclared identifier is reported only once
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:167:
 error: for each function it appears in.)
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.c:170:
 error: ‘ADVERTISE_PAUSE_ASYM’ undeclared (first use in this function)
make[3]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3/vsc8211.o]
 Error 1
make[2]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check/drivers/net/cxgb3]
 Error 2
make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-22.ELsmp_x86_64_check]
 Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-22.ELsmp'
make: *** [kernel] Error 2
--
Build failed on x86_64 with linux-2.6.9-34.ELsmp
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c:
 In function ‘add_adapter’:
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c:1061:
 error: ‘adapter_list_lock’ undeclared (first use in this function)
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c:
 In function ‘remove_adapter’:
/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.c:1068:
 error: ‘adapter_list_lock’ undeclared (first use in this function)
make[3]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3/cxgb3_offload.o]
 Error 1
make[2]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check/drivers/net/cxgb3]
 Error 2
make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_2_kernel-20070227-0200_linux-2.6.9-34.ELsmp_x86_64_check]
 Error 2
make[1]: Leaving directory `/home/vlad/kernel.org/x86_64/linux-2.6.9-34.ELsmp'
make: *** [kernel] Error 2
--

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] Fwd: Address List Change Now Scheduled for Wednesday, 2/28/2007

2007-02-27 Thread Jeff Squyres
On Feb 27, 2007, at 2:10 AM, Diego Guella wrote:

 Should I do something to get subscribed to the new mailing list, or
 will I be automatically subscribed?

There is nothing that you need to do; the list is simply being  
migrated from one server to another and changing names in the process.

 The only change is that I have to write messages to  
 [EMAIL PROTECTED], correct?

Correct.  There will be aliases in place to redirect messages from  
the old name to the new name, too.  So the warning is more about  
updating e-mail client filters, etc.





 - Original Message - From: Jeff Squyres [EMAIL PROTECTED]
 To: OpenFabrics General openib-general@openib.org
 Sent: Monday, February 26, 2007 6:05 PM
 Subject: [openib-general] Fwd: Address List Change Now Scheduled  
 for Wednesday, 2/28/2007


 FYI.  In case you missed it the Nth time: THIS LIST IS CHANGING ON
 WEDNESDAY 2/28/2007 (2 days from now).  Really.  For sure this time.
 Trust me.  Honest.

 Please update your addressbooks!



 Begin forwarded message:

 From: Lee, Michael Paichi [EMAIL PROTECTED]
 Date: February 22, 2007 11:44:25 AM EST
 To: Jeff Squyres [EMAIL PROTECTED], Michael S. Tsirkin
 [EMAIL PROTECTED]
 Cc: OpenFabrics General openib-general@openib.org
 Subject: Address List Change Now Scheduled for Wednesday, 2/28/2007

 The list will now be migrated on Wednesday, 2/28/2007.

 List address: [EMAIL PROTECTED]
 Updated change-date:  Wednesday, 2/28/2007

 Michael


 -- 
 Jeff Squyres
 Server Virtualization Business Unit
 Cisco Systems




-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





[openib-general] ib0 shows MAC address as 00-00-00.... is it normal??

2007-02-27 Thread Bala
Hi All,
   We have built and installed OFED-1.1 on a
RHEL-4 machine. Using IPoIB, we set the IPs
for the interfaces and the hosts are able to ping each other,
but ifconfig shows the ib0 MAC address as
shown below:
00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00

--
ib0   Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
      inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
      UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
      RX packets:271465 errors:0 dropped:0 overruns:0 frame:0
      TX packets:1444336 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:128
      RX bytes:15664386 (14.9 MiB)  TX bytes:2718736764 (2.5 GiB)
---

Please let me know whether this is normal, and whether there is any way
to get the real hardware (MAC) address.

regards,
Bala.


 




[openib-general] mpi over IB

2007-02-27 Thread Bala
Hi All,
   We have built and installed OFED-1.1
on RHEL-4 machines, and selected MPI support
while compiling. Please shed some light on how
to use MPI over the IB interface: which
modules are needed, and do we need to install
separate MPI software?

thanks in advance,
-bala-


 




Re: [openib-general] mpi over IB

2007-02-27 Thread Jeff Squyres
During the installation process, the OFED installer should have asked  
you if you wanted to install Open MPI and/or MVAPICH.  Both of these  
MPI implementations are capable of communicating natively over the IB  
interface.

Running MPI applications with Open MPI should natively choose the IB  
interface at run time if your IB network is up and running properly  
(e.g., try running ibv_devinfo to ensure that ports are listed in the  
PORT_ACTIVE state, etc.).  I assume that the same is true with  
MVAPICH as well.



On Feb 27, 2007, at 6:35 AM, Bala wrote:

 Hi All,
We have build and installed OFED-1.1
 on RHEL-4 machines, while compiling selected
 mpi support, pls through some light on how
 to use mpi over IB interface, using what
 modules etc. or do we need to install separate
 mpi software to use.

 thanks in advance,
 -bala-





-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] ib0 shows MAC address as 00-00-00.... is it normal??

2007-02-27 Thread Hal Rosenstock
On Tue, 2007-02-27 at 06:30, Bala wrote:
 Hi All,
We have build and installed OFED-1.1 on
 RHEL-4 machine, using ipoib we set the IPs
 for the interface and able to ping each other,
 but my ifconfig shows ib0 MAC address as
 shown below
 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
 
 --
 ib0   Link encap:UNSPEC  HWaddr
 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
   inet addr:192.168.0.1  Bcast:192.168.0.255 
 Mask:255.255.255.0
   UP BROADCAST RUNNING MULTICAST  MTU:2044 
 Metric:1
   RX packets:271465 errors:0 dropped:0
 overruns:0 frame:0
   TX packets:1444336 errors:0 dropped:0
 overruns:0 carrier:0
   collisions:0 txqueuelen:128
   RX bytes:15664386 (14.9 MiB)  TX
 bytes:2718736764 (2.5 GiB)
 ---
 
 pls let me know is it normal,

Depends on the (truncated) guid for the HCA port.

  is there any way
 to get the real hw/mac address.

ip addr show ib0
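
For reference, the IPoIB link-layer address is 20 bytes long (RFC 4391: one
reserved/flags byte, a 24-bit QPN, then the 16-byte port GID), and
"ip addr show" prints all of it. A minimal sketch of splitting such an
address into its parts, assuming the colon-separated text form; the struct
and function names are hypothetical and not part of OFED:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IPOIB_ALEN 20   /* 1 flags byte + 3-byte QPN + 16-byte GID */

/* Hypothetical container for the decoded address fields. */
struct toy_ipoib_hwaddr {
    uint8_t  flags;
    uint32_t qpn;       /* 24-bit queue pair number */
    uint8_t  gid[16];   /* port GID: subnet prefix followed by port GUID */
};

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int toy_hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Parses "xx:xx:...:xx" (exactly 20 octets). Returns 0 on success, -1 on error. */
static int toy_parse_ipoib_hwaddr(const char *text, struct toy_ipoib_hwaddr *out)
{
    uint8_t raw[TOY_IPOIB_ALEN];
    const char *p = text;
    size_t i;

    for (i = 0; i < TOY_IPOIB_ALEN; i++) {
        int hi = toy_hex_val(p[0]);
        int lo = hi < 0 ? -1 : toy_hex_val(p[1]);

        if (hi < 0 || lo < 0)
            return -1;
        raw[i] = (uint8_t)((hi << 4) | lo);
        p += 2;
        if (i + 1 < TOY_IPOIB_ALEN) {
            if (*p != ':')
                return -1;
            p++;
        }
    }
    if (*p != '\0')
        return -1;

    out->flags = raw[0];
    out->qpn = ((uint32_t)raw[1] << 16) | ((uint32_t)raw[2] << 8) | raw[3];
    memcpy(out->gid, &raw[4], sizeof(out->gid));
    return 0;
}
```

Fed the link/infiniband string from "ip addr show ib0", this yields the QPN
and the port GID; the last 8 bytes of the GID carry the port GUID.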

-- Hal

 regards,
 Bala.
 
 
  
 
 





Re: [openib-general] [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB

2007-02-27 Thread Moni Shoua

Thanks for the comments 

 To fix it, this patch adds a dev field to struct ipoib_neigh which is used
 instead of the struct neighbour dev one.
 
 It seems that in this design, if multiple ipoib interfaces are present, we
 might get an skb such that skb->dev will be different from the new dev field
 in struct ipoib_neigh.
 
 It seems that the result will be that the packet will be sent on a wrong 
 interface.
 Right?
 
I don't see how. The dev field in ipoib_neigh doesn't take part in interface
selection.
As I see it, the skb travels this path:
1. It is passed to bond_dev->hard_start_xmit.
2. bond_dev->hard_start_xmit chooses the current active interface, changes
skb->dev and enqueues it back for transmission.

 In addition, if an IPoIB device is removed before bonding is unloaded it may 
 cause bond0 neighbours (neighbours that point to bond0) to exist after the 
 IPoIB
 device no longer exist. This is why a neighbour cleanup is required during 
 device 
 cleanup. This cleanup scans the arp cache and the ndisc cache to find there 
 neighbours of bond0 which refer also to the relevant ibX. Also, when 
 ib_ipoib module is
 unloaded, the neighbour destructor must be set to NULL because the neighbour 
 function is in
 ib_ipoib.
 For this neigh table cleanup, it is required to export the symbol nd_tbl 
 just like the symbol arp_tbl is.
 
 I wonder about this: is it really true that any allocated neighbour is always 
 in
 either arp_tbl or nd_tbl? For example, could some code have called neigh_hold
 and retained a neighbour that is not in either one of these tables?
 
I got the assumption about neighbours living in one of these 2 tables from
observation and code reading.
I preferred that over keeping track of all ipoib_neighs and putting them in
a list. However, I could
do that instead of neigh_table scanning. Do you think it's better?
As for the example, I didn't understand it. Could you please explain?

 During my tests I found that when running 

  1. modprobe -r ib_mthca (to delete IPoIB interfaces)
  2. ping somewhere on the subnet of bond0

 I get this stack dump (which ends with kernel death)
   [8037ff32] skb_under_panic+0x5c/0x60
   [882e00c2] :ib_ipoib:ipoib_hard_header+0xa6/0xc0
   [803c3c98] arp_create+0x120/0x226
   [803c3dc3] arp_send+0x25/0x3b
   [803c466a] arp_solicit+0x186/0x195
   [8038c0ac] neigh_timer_handler+0x2b5/0x309
   [8038bdf7] neigh_timer_handler+0x0/0x309
   [80239599] run_timer_softirq+0x130/0x19e
   [80235fcc] __do_softirq+0x55/0xc3
   [8020acac] call_softirq+0x1c/0x28
   [8020c02b] do_softirq+0x2c/0x7d
   [8021864a] smp_apic_timer_interrupt+0x57/0x6a
   [80208e19] mwait_idle+0x0/0x45
   [8020a756] apic_timer_interrupt+0x66/0x70
   EOI  [80208e5b] mwait_idle+0x42/0x45
   [80208db1] cpu_idle+0x8b/0xae
   [80217d60] start_secondary+0x47f/0x48f

 The only way I found to avoid this (for now) is to check skb headroom in
 ipoib_hard_header. I guess that this safety check doesn't harm regular IPoIB 
 operation and it seems to solve my problem. However, I would be happy to 
 hear what
 others think of this last issue.
 
 As I said, this seems to indicate a problem in the bonding code.
 But what will happen after you error out in ipoib_hard_header?
 Is the packet dropped? What might break as a result?
 
I will check the hard_header_len issue in the bonding code more carefully. From 
first look
it seems that bonding does borrow the hard_header_len.
Also, my checks show that it is safe to return with error from hard_header().
For example,  in neigh_connected_output:

	err = dev->hard_header(skb, dev, ntohs(skb->protocol),
			       neigh->ha, NULL, skb->len);
	read_unlock_bh(&neigh->lock);
	if (err >= 0)
		err = neigh->ops->queue_xmit(skb);
	else {
		err = -EINVAL;
		kfree_skb(skb);
	}
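
To illustrate the headroom check discussed above, here is a minimal userspace
sketch; it is not the actual patch. struct toy_skb and the function names are
hypothetical stand-ins for the kernel's sk_buff handling, and the 4-byte
length is assumed to match the IPoIB encapsulation header:

```c
#include <errno.h>
#include <stddef.h>

#define TOY_ENCAP_LEN 4   /* assumed IPoIB encapsulation header size */

/* Hypothetical stand-in for the two sk_buff pointers that matter here. */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the current payload */
};

/* Bytes available in front of the payload for prepending headers. */
static size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/*
 * Prepends the header only if it fits. Returns the header length on
 * success and -EINVAL when there is too little headroom, so the caller
 * can drop the packet instead of underrunning the buffer.
 */
static int toy_hard_header(struct toy_skb *skb)
{
    if (toy_headroom(skb) < TOY_ENCAP_LEN)
        return -EINVAL;
    skb->data -= TOY_ENCAP_LEN;
    return TOY_ENCAP_LEN;
}
```

With this shape, a caller such as the neigh_connected_output excerpt above
sees a negative return value and frees the skb.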
 
 I would really appreciate comments.

 thanks

  -MoniS
 






[openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Moni Levy
Hello,
I did a short code review of the ipoib code, concentrating on
partitioning support, and I noticed that the asynchronous event
handler in the ipoib code does not take the port number reported in
the event record into consideration. The effect is that all of
the ib# devices related to that specific HCA are flushed, when it seems
to me that only the one on the relevant port should be. Is that done on
purpose, or am I missing something?
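
A minimal sketch of the per-port filtering being asked about, written as a
userspace model rather than driver code. All toy_ names are hypothetical; in
the kernel the port would presumably come from the event record's
element.port_num field:

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_event_type {
    TOY_EVENT_PORT_ERR,
    TOY_EVENT_PORT_ACTIVE,
    TOY_EVENT_LID_CHANGE,
    TOY_EVENT_PKEY_CHANGE,
    TOY_EVENT_OTHER
};

struct toy_event {
    enum toy_event_type type;
    uint8_t port_num;        /* port the event was reported for */
};

struct toy_priv {
    uint8_t port;            /* port this ib# interface sits on */
    unsigned int flushes;    /* how many times it has been flushed */
};

/* Events that should trigger a flush at all. */
static bool toy_is_flush_event(enum toy_event_type type)
{
    return type == TOY_EVENT_PORT_ERR ||
           type == TOY_EVENT_PORT_ACTIVE ||
           type == TOY_EVENT_LID_CHANGE ||
           type == TOY_EVENT_PKEY_CHANGE;
}

/* Flushes only when the event targets this interface's own port. */
static bool toy_handle_event(struct toy_priv *priv, const struct toy_event *ev)
{
    if (!toy_is_flush_event(ev->type))
        return false;
    if (ev->port_num != priv->port)
        return false;
    priv->flushes++;
    return true;
}
```

With two interfaces on ports 1 and 2 of the same HCA, an event for port 2
flushes only the second one.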

Thanks,
Moni

P.S. I'm working on a patch that should solve another issue caused by
PKEY reordering & ipoib behavior, and the above issue further
complicates things for me.




Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Michael S. Tsirkin
 Quoting Moni Levy [EMAIL PROTECTED]:
 Subject: [RFC] IB/ipoib: Asynchronous events delivered without port parameter.
 
 Hello,
 I did a short code review of the ipoib code concentrating on
 partitioning support and I mentioned that the asynchronous events
 handler in the ipoib code does not take the port number reported in
 the event record into consideration. The effect of that is that all of
 the ib# devices related to that specific HCA are flushed when it seems
 to me that only the relevant port one should be. Is that done on
 purpose, or am I missing something ?
 
 Thanks,
 Moni
 
 p.s. I'm working on a patch that should solve another issue caused by
 PKEY reordering  ipoib behavior and the above issue further
 complicates things for me.

If true, why is this a problem?

-- 
MST




Re: [openib-general] HOWTO check ofa_kernel build from your git tree

2007-02-27 Thread Steve Wise
Where are all the kernel src trees on ssh.openfabrics.org?

I would like to build against specific trees that are failing with
cxgb3...

Also:  

Which RH distro ships:

linux-2.6.9-22.ELsmp

and

linux-2.6.9-34.ELsmp


Thanks,

Steve.



On Mon, 2007-02-26 at 17:07 +0200, Vladimir Sokolovsky wrote:
 On ssh.openfabrics.org:
 Run
 env git_url=/home/mst/scm/ofed_1_2_devel.git git_branch=ofed_1_2 \
   CHECK_LOCAL=yes \
   CHECK_KERNEL_ORG=yes \
   CHECK_CROSS=yes /home/vlad/scripts/build_ofa_kernel.sh
 





[openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Moni Levy
This issue was found during partitioning & SM failover testing. The fix was 
tested over the weekend with pkey reshuffling, removal and addition every few 
seconds, concurrent with OFED restarts. The patch applies on Roland's git tree. 

Changes from v1: 
* added flush flag to ipoib_ib_dev_stop(), ipoib_ib_dev_down() alike
* fixed a bug in device extraction from the work struct
* removed some warnings in case they are caused due to missing PKEY as 
this seems like a valid flow now.

SM reconfiguration or failover possibly causes a shuffling of the values in the 
port pkey table. The current implementation only queries for the index of the 
pkey once, when it creates the device QP and after that moves it into working 
state, and hence does not address this scenario. Fix this by using the 
PKEY_CHANGE event as a trigger to reconfigure the device QP.

Signed-off-by: Moni Levy [EMAIL PROTECTED]
---
 ipoib.h   |4 +++-
 ipoib_ib.c|   51 +--
 ipoib_main.c  |5 +++--
 ipoib_multicast.c |   11 ++-
 ipoib_verbs.c |8 +++-
 5 files changed, 60 insertions(+), 19 deletions(-)
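
For illustration only (not part of the patch): a userspace sketch of why a
pkey index looked up once at QP creation goes stale when the SM reshuffles
the port pkey table, as described in the changelog above. toy_find_pkey is a
hypothetical helper; comparing only the low 15 bits (ignoring the membership
bit) is an assumption about how the lookup should match:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the index of pkey in the table, or -1 if it is absent.
 * Only the low 15 bits are compared (assumption: the top bit is the
 * membership bit and does not identify the partition).
 */
static int toy_find_pkey(const uint16_t *table, size_t len, uint16_t pkey)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if ((table[i] & 0x7fff) == (pkey & 0x7fff))
            return (int)i;
    }
    return -1;
}
```

If the table changes from {0xffff, 0x8001, 0x8002} to {0x8002, 0xffff, 0x8001},
the index for 0x8001 moves from 1 to 2, so an index cached before the change
now selects a different pkey; a pkey removed from the table yields -1.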

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 2594db2..d08ecca 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -205,6 +205,7 @@ struct ipoib_dev_priv {
struct delayed_work pkey_task;
struct delayed_work mcast_task;
struct work_struct flush_task;
+   struct work_struct flush_restart_qp_task;
struct work_struct restart_task;
struct delayed_work ah_reap_task;
 
@@ -334,12 +335,13 @@ struct ipoib_dev_priv *ipoib_intf_alloc(
 
 int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
 void ipoib_ib_dev_flush(struct work_struct *work);
+void ipoib_ib_dev_flush_restart_qp(struct work_struct *work);
 void ipoib_ib_dev_cleanup(struct net_device *dev);
 
 int ipoib_ib_dev_open(struct net_device *dev);
 int ipoib_ib_dev_up(struct net_device *dev);
 int ipoib_ib_dev_down(struct net_device *dev, int flush);
-int ipoib_ib_dev_stop(struct net_device *dev);
+int ipoib_ib_dev_stop(struct net_device *dev, int flush);
 
 int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port);
 void ipoib_dev_cleanup(struct net_device *dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index f2aa923..b0287c1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -415,21 +415,22 @@ int ipoib_ib_dev_open(struct net_device 
 
ret = ipoib_init_qp(dev);
if (ret) {
-   ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
+   if (ret != -ENOENT)
+   ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
return -1;
}
 
ret = ipoib_ib_post_receives(dev);
if (ret) {
ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret);
-   ipoib_ib_dev_stop(dev);
+   ipoib_ib_dev_stop(dev, 1);
return -1;
}
 
ret = ipoib_cm_dev_open(dev);
if (ret) {
ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret);
-   ipoib_ib_dev_stop(dev);
+   ipoib_ib_dev_stop(dev, 1);
return -1;
}
 
@@ -508,7 +509,7 @@ static int recvs_pending(struct net_devi
return pending;
 }
 
-int ipoib_ib_dev_stop(struct net_device *dev)
+int ipoib_ib_dev_stop(struct net_device *dev, int flush)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
struct ib_qp_attr qp_attr;
@@ -581,7 +582,8 @@ timeout:
/* Wait for all AHs to be reaped */
set_bit(IPOIB_STOP_REAPER, &priv->flags);
cancel_delayed_work(&priv->ah_reap_task);
-   flush_workqueue(ipoib_workqueue);
+   if (flush)
+   flush_workqueue(ipoib_workqueue);
 
begin = jiffies;
 
@@ -622,13 +624,17 @@ int ipoib_ib_dev_init(struct net_device 
return 0;
 }
 
-void ipoib_ib_dev_flush(struct work_struct *work)
+static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv, int restart_qp)
 {
-   struct ipoib_dev_priv *cpriv, *priv =
-   container_of(work, struct ipoib_dev_priv, flush_task);
+   struct ipoib_dev_priv *cpriv;
struct net_device *dev = priv->dev;
 
-   if (!test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags) ) {
+   /*
+* ipoib_ib_dev_stop() below may not find the PKey and leave the
+* IPOIB_FLAG_INITIALIZED flag off so flush in that case with restart_qp
+* flag on is Ok.
+*/
+   if (!test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags) && !restart_qp) {
ipoib_dbg(priv, "Not flushing - IPOIB_FLAG_INITIALIZED not set.\n");
return;
}
@@ -641,6 +647,13 @@ void 

Re: [openib-general] [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB

2007-02-27 Thread Michael S. Tsirkin
 Quoting Moni Shoua [EMAIL PROTECTED]:
 Subject: Re: [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB
 
 
 Thanks for the comments 
 
  To fix it, this patch adds a dev field to struct ipoib_neigh which is used
  instead of the struct neighbour dev one.
  
  It seems that in this design, if multiple ipoib interfaces are present, we
  might get an skb such that skb->dev will be different from the new dev field
  in struct ipoib_neigh.
  
  It seems that the result will be that the packet will be sent on a wrong 
  interface.
  Right?
  
 I don't see how. The field dev in ipoib_neigh doesn't take part in interface
 selection.
 As I see it, skb travels this path:
 1. Passed to bond_dev->hard_start_xmit
 2. bond_dev->hard_start_xmit chooses the current active interface, changes
 skb->dev and enqueues it back for xmitting.

ipoib_neigh ah field includes struct ib_ah *.
This selects important parameters which depend on both packet source and
destination interfaces.

I think the right thing might be to compare ipoib_neigh dev and skb->dev,
and destroy ipoib_neigh if these do not match.

  In addition, if an IPoIB device is removed before bonding is unloaded it 
  may 
  cause bond0 neighbours (neighbours that point to bond0) to exist after the 
  IPoIB
  device no longer exist. This is why a neighbour cleanup is required during 
  device 
  cleanup. This cleanup scans the arp cache and the ndisc cache to find 
  there 
  neighbours of bond0 which refer also to the relevant ibX. Also, when 
  ib_ipoib module is
  unloaded, the neighbour destructor must be set to NULL because the 
  neighbour function is in
  ib_ipoib.
  For this neigh table cleanup, it is required to export the symbol nd_tbl 
  just like the symbol arp_tbl is.
  
  I wonder about this: is it really true that any allocated neighbour is 
  always in
  either arp_tbl or nd_tbl? For example, could some code have called 
  neigh_hold
  and retained a neighbour that is not in either one of these tables?
  
 I got the assumption about neighbours living in one of these 2 tables from
 observation and code reading.  I preferred that over keeping track of all
 ipoib_neighs and putting them in a list. However, I could do that instead of
 neigh_table scanning. Do you think it's better?

If some neighbours are not on any tables, it seems using our own lists
(e.g. lists we have in ipoib_path) is the only option, no?

 For the example... I didn't
 understand it. Could you please explain?

grep for neigh_hold. neighbour is only destroyed when ref count goes to 0.
If some code does neigh_hold, it seems neighbour could be removed from table
but destructor not yet called.
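
A toy model of that situation, assuming a simple reference count in which the
table itself holds one reference. The toy_ names are hypothetical and nothing
here is kernel API:

```c
#include <stdbool.h>

struct toy_neigh {
    int  refcnt;     /* outstanding references, including the table's */
    bool in_table;   /* still reachable through the neighbour table? */
    bool destroyed;  /* has the destructor run? */
};

/* A freshly created entry lives in the table, which owns one reference. */
static void toy_neigh_init(struct toy_neigh *n)
{
    n->refcnt = 1;
    n->in_table = true;
    n->destroyed = false;
}

static void toy_neigh_hold(struct toy_neigh *n)
{
    n->refcnt++;
}

/* The destructor runs only when the last reference is dropped. */
static void toy_neigh_release(struct toy_neigh *n)
{
    if (--n->refcnt == 0)
        n->destroyed = true;
}

/* Unlinks the entry from the table and drops the table's reference. */
static void toy_table_remove(struct toy_neigh *n)
{
    if (!n->in_table)
        return;
    n->in_table = false;
    toy_neigh_release(n);
}
```

After a hold followed by a table removal, the entry is no longer in the table
but has not been destroyed, so a cleanup that only scans the tables would miss
it until the holder releases its reference.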

  During my tests I found that when running 
 
 1. modprobe -r ib_mthca (to delete IPoIB interfaces)
 2. ping somewhere on the subnet of bond0
 
  I get this stack dump (which ends with kernel death)
  [8037ff32] skb_under_panic+0x5c/0x60
  [882e00c2] :ib_ipoib:ipoib_hard_header+0xa6/0xc0
  [803c3c98] arp_create+0x120/0x226
  [803c3dc3] arp_send+0x25/0x3b
  [803c466a] arp_solicit+0x186/0x195
  [8038c0ac] neigh_timer_handler+0x2b5/0x309
  [8038bdf7] neigh_timer_handler+0x0/0x309
  [80239599] run_timer_softirq+0x130/0x19e
  [80235fcc] __do_softirq+0x55/0xc3
  [8020acac] call_softirq+0x1c/0x28
  [8020c02b] do_softirq+0x2c/0x7d
  [8021864a] smp_apic_timer_interrupt+0x57/0x6a
  [80208e19] mwait_idle+0x0/0x45
  [8020a756] apic_timer_interrupt+0x66/0x70
  EOI  [80208e5b] mwait_idle+0x42/0x45
  [80208db1] cpu_idle+0x8b/0xae
  [80217d60] start_secondary+0x47f/0x48f
 
  The only way I found to avoid this (for now) is to check skb headroom in
  ipoib_hard_header. I guess that this safety check doesn't harm regular 
  IPoIB 
  operation and it seems to solve my problem. However, I would be happy to 
  hear what
  others think of this last issue.
  
  As I said, this seems to indicate a problem in the bonding code.
  But what will happen after you error out in ipoib_hard_header?
  Is the packet dropped? What might break as a result?
  
 I will check the hard_header_len issue in the bonding code more carefully.
 From first look it seems that bonding does borrow the hard_header_len.

So where does a shorter message come from?

 Also,
 my checks show that it is safe to return with error from
 hard_header().  For example,  in neigh_connected_output:
 
 err = dev->hard_header(skb, dev, ntohs(skb->protocol),
                        neigh->ha, NULL, skb->len);
 read_unlock_bh(&neigh->lock);
 if (err >= 0)
         err = neigh->ops->queue_xmit(skb);
 else {
         err = -EINVAL;
         kfree_skb(skb);
 }
  
  I would really appreciate comments.
 
  thanks
 
   -MoniS
  

-- 
MST


Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Michael S. Tsirkin
I just gave this a cursory glance.
A suggestion: would it not be much simpler to modify the QP from RTS to RTS on
pkey change?

 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
 b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 index f2aa923..b0287c1 100644
 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 @@ -415,21 +415,22 @@ int ipoib_ib_dev_open(struct net_device 
  
   ret = ipoib_init_qp(dev);
   if (ret) {
 - ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
 + if (ret != -ENOENT)
 + ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
   return -1;
   }
 
What's the reason for this?

 @@ -993,6 +993,7 @@ static void ipoib_setup(struct net_devic
   INIT_DELAYED_WORK(&priv->pkey_task,    ipoib_pkey_poll);
   INIT_DELAYED_WORK(&priv->mcast_task,   ipoib_mcast_join_task);
   INIT_WORK(&priv->flush_task,   ipoib_ib_dev_flush);
 + INIT_WORK(&priv->flush_restart_qp_task, ipoib_ib_dev_flush_restart_qp);
   INIT_WORK(&priv->restart_task, ipoib_mcast_restart_task);
   INIT_DELAYED_WORK(&priv->ah_reap_task, ipoib_reap_ah);
  }

Shorter name?

 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 
 b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
 index b303ce6..27d6fd4 100644
 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
 +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
 @@ -232,9 +232,10 @@ static int ipoib_mcast_join_finish(struc
   ret = ipoib_mcast_attach(dev, be16_to_cpu(mcast->mcmember.mlid),
                            &mcast->mcmember.mgid);
   if (ret < 0) {
 - ipoib_warn(priv, "couldn't attach QP to multicast group "
 -IPOIB_GID_FMT "\n",
 -IPOIB_GID_ARG(mcast->mcmember.mgid));
 + if (ret != -ENXIO) /* No PKEY found */
 + ipoib_warn(priv, "couldn't attach QP to multicast group "
 +IPOIB_GID_FMT "\n",
 +IPOIB_GID_ARG(mcast->mcmember.mgid));
  
   clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags);
   return ret;
 @@ -312,7 +313,7 @@ ipoib_mcast_sendonly_join_complete(int s
   status = ipoib_mcast_join_finish(mcast, &multicast->rec);
  
   if (status) {
 - if (mcast->logcount++ < 20)
 + if (mcast->logcount++ < 20 && status != -ENXIO)
   ipoib_dbg_mcast(netdev_priv(dev), "multicast join failed for "
   IPOIB_GID_FMT ", status %d\n",
   IPOIB_GID_ARG(mcast->mcmember.mgid), status);
 @@ -416,7 +417,7 @@ static int ipoib_mcast_join_complete(int
   ", status %d\n",
   IPOIB_GID_ARG(mcast->mcmember.mgid),
   status);
 - } else {
 + } else if (status != -ENXIO) {
   ipoib_warn(priv, "multicast join failed for "
  IPOIB_GID_FMT ", status %d\n",
  IPOIB_GID_ARG(mcast->mcmember.mgid),
 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 
 b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 index 3cb551b..d0384ea 100644
 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 @@ -52,8 +52,10 @@ int ipoib_mcast_attach(struct net_device
   if (ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index)) {
   clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
   ret = -ENXIO;
 + ipoib_dbg(priv, "PKEY %X not found\n", priv->pkey);
   goto out;
   }
 + ipoib_dbg(priv, "PKEY %X found at index %d\n", priv->pkey, pkey_index);
   set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
  
   /* set correct QKey for QP */

Make it PKey or pkey: no text in uppercase in log messages please.

 @@ -105,9 +107,11 @@ int ipoib_init_qp(struct net_device *dev
*/
   ret = ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index);
   if (ret) {
 + ipoib_dbg(priv, "PKEY %X not found.\n", priv->pkey);
   clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
   return ret;
   }
 + ipoib_dbg(priv, "PKEY %X found at index %d.\n", priv->pkey, pkey_index);
   set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
  
   qp_attr.qp_state = IB_QPS_INIT;

This is going a bit overboard on the number of debug messages.

 @@ -260,12 +264,14 @@ void ipoib_event(struct ib_event_handler
   container_of(handler, struct ipoib_dev_priv, event_handler);
  
   if (record->event == IB_EVENT_PORT_ERR    ||
 - record->event == IB_EVENT_PKEY_CHANGE ||
   record->event == IB_EVENT_PORT_ACTIVE ||
   record->event == IB_EVENT_LID_CHANGE  ||
   

Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Roland Dreier
  I just gave this a cursory glance.

I haven't really read it except to think "why is this so complicated?"

  A suggestion: would it not be much simpler to modify the QP from RTS to RTS
  on pkey change?

Changing the P_Key index is not allowed for RTS->RTS.  You would have
to modify the QP RTS->SQD, wait for the SQ to drain, then modify the
P_Key index with SQD->SQD, and finally go SQD->RTS.
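
The sequence described above can be sketched as a small state model. This is only an illustration under stated assumptions: the `toy_` types and functions are hypothetical, they are not the kernel verbs API, and the model encodes only the two rules given in this message (no P_Key index change on RTS->RTS; a change is accepted on SQD->SQD once the send queue has drained).

```c
#include <stdbool.h>

/* Toy model of the two QP states named in the message (hypothetical names). */
enum toy_qp_state { TOY_RTS, TOY_SQD };

struct toy_qp {
    enum toy_qp_state state;
    unsigned int pkey_index;
    unsigned int sq_outstanding;   /* send work requests not yet completed */
};

/*
 * Attempt a state transition, optionally changing the P_Key index.
 * A P_Key index change is accepted only for SQD->SQD with a drained
 * send queue; any other transition carrying one is refused.
 */
static bool toy_modify_qp(struct toy_qp *qp, enum toy_qp_state next,
                          bool set_pkey, unsigned int pkey_index)
{
    if (set_pkey) {
        if (qp->state != TOY_SQD || next != TOY_SQD)
            return false;          /* e.g. RTS->RTS */
        if (qp->sq_outstanding != 0)
            return false;          /* SQ has not drained yet */
        qp->pkey_index = pkey_index;
    }
    qp->state = next;
    return true;
}

/* Full sequence: RTS->SQD, drain, SQD->SQD with the new index, SQD->RTS. */
static bool toy_change_pkey_index(struct toy_qp *qp, unsigned int pkey_index)
{
    if (!toy_modify_qp(qp, TOY_SQD, false, 0))
        return false;
    qp->sq_outstanding = 0;        /* stands in for waiting for the drain */
    if (!toy_modify_qp(qp, TOY_SQD, true, pkey_index))
        return false;
    return toy_modify_qp(qp, TOY_RTS, false, 0);
}
```

In a real driver the drain step is asynchronous, which is part of why the three-step sequence is more work than it looks here.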

 - R.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering
 
   I just gave this a cursory glance.
 
 I haven't really read it except to think "why is this so complicated?"
 
   A suggestion: would it not be much simpler to modify the QP from RTS to
   RTS on pkey change?
 
 Changing the P_Key index is not allowed for RTS->RTS.  You would have
 to modify the QP RTS->SQD, wait for the SQ to drain, then modify the
 P_Key index with SQD->SQD, and finally go SQD->RTS.

True, I misread the spec.

-- 
MST




Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Roland Dreier
 I did a short code review of the ipoib code concentrating on
  partitioning support and I mentioned that the asynchronous events
  handler in the ipoib code does not take the port number reported in
  the event record into consideration. The effect of that is that all of
  the ib# devices related to that specific HCA are flushed when it seems
  to me that only the relevant port one should be. Is that done on
  purpose, or am I missing something ?

I don't think there's any particular reason the code is that way
except for the oversight never being corrected.  But it looks trivial
to fix, like the patch below.  Does that look right to you?

  p.s. I'm working on a patch that should solve another issue caused by
  PKEY reordering  ipoib behavior and the above issue further
  complicates things for me.

Why not fix the issue first then?

commit a27cbe878203076247c1b5287f5ab59ed143b560
Author: Roland Dreier [EMAIL PROTECTED]
Date:   Tue Feb 27 07:37:49 2007 -0800

IPoIB: Only handle async events for one port

An asynchronous event carries the port number that the event occurred
on, so there's no reason for an IPoIB interface to process an event
associated with a different local HCA port.

Signed-off-by: Roland Dreier [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 3cb551b..7f3ec20 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
        struct ipoib_dev_priv *priv =
                container_of(handler, struct ipoib_dev_priv, event_handler);
 
-       if (record->event == IB_EVENT_PORT_ERR    ||
-           record->event == IB_EVENT_PKEY_CHANGE ||
-           record->event == IB_EVENT_PORT_ACTIVE ||
-           record->event == IB_EVENT_LID_CHANGE  ||
-           record->event == IB_EVENT_SM_CHANGE   ||
-           record->event == IB_EVENT_CLIENT_REREGISTER) {
+       if ((record->event == IB_EVENT_PORT_ERR    ||
+            record->event == IB_EVENT_PKEY_CHANGE ||
+            record->event == IB_EVENT_PORT_ACTIVE ||
+            record->event == IB_EVENT_LID_CHANGE  ||
+            record->event == IB_EVENT_SM_CHANGE   ||
+            record->event == IB_EVENT_CLIENT_REREGISTER) &&
+           record->element.port_num == priv->port) {
                ipoib_dbg(priv, "Port state change event\n");
                queue_work(ipoib_workqueue, &priv->flush_task);
        }
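
For readers who want to try the new condition outside the kernel, here is a minimal user-space sketch of the same predicate. The `toy_` enum and struct are hypothetical stand-ins for `enum ib_event_type` and `struct ib_event`; only the shape of the test (event type matches and the port matches) is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the event types tested in the patch. */
enum toy_event {
    TOY_EVENT_PORT_ERR,
    TOY_EVENT_PKEY_CHANGE,
    TOY_EVENT_PORT_ACTIVE,
    TOY_EVENT_LID_CHANGE,
    TOY_EVENT_SM_CHANGE,
    TOY_EVENT_CLIENT_REREGISTER,
    TOY_EVENT_OTHER                /* any event the handler ignores */
};

struct toy_event_record {
    enum toy_event event;
    unsigned int port_num;         /* port the event occurred on */
};

/* True when an interface bound to my_port should queue its flush task. */
static bool toy_should_flush(const struct toy_event_record *rec,
                             unsigned int my_port)
{
    bool port_event =
        rec->event == TOY_EVENT_PORT_ERR ||
        rec->event == TOY_EVENT_PKEY_CHANGE ||
        rec->event == TOY_EVENT_PORT_ACTIVE ||
        rec->event == TOY_EVENT_LID_CHANGE ||
        rec->event == TOY_EVENT_SM_CHANGE ||
        rec->event == TOY_EVENT_CLIENT_REREGISTER;

    return port_event && rec->port_num == my_port;
}
```

With this check, a P_Key change reported for port 2 no longer flushes the interface that sits on port 1 of the same HCA.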




Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Moni Levy
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
   I just gave this a cursory glance.

 I haven't really read it except to think "why is this so complicated?"

Do you mean the complexity of the patch or of the issue?


   A suggestion: would it not be much simpler to modify the QP from RTS to
   RTS on pkey change?

 Changing the P_Key index is not allowed for RTS->RTS.  You would have
 to modify the QP RTS->SQD, wait for the SQ to drain, then modify the
 P_Key index with SQD->SQD, and finally go SQD->RTS.

Do you think solving it that way would be a significant simplification?
We would still have to reuse the handling for missed completions that is
currently implemented in ipoib_ib_dev_stop, and we would still need an
additional work element.

-- Moni


  - R.







Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Michael S. Tsirkin
 Quoting Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [RFC] IB/ipoib: Asynchronous events delivered without port 
 parameter.
 
  I did a short code review of the ipoib code concentrating on
   partitioning support and I mentioned that the asynchronous events
   handler in the ipoib code does not take the port number reported in
   the event record into consideration. The effect of that is that all of
   the ib# devices related to that specific HCA are flushed when it seems
   to me that only the relevant port one should be. Is that done on
   purpose, or am I missing something ?
 
 I don't think there's any particular reason the code is that way
 except for the oversight never being corrected.  But it looks trivial
 to fix, like the patch below.  Does that look right to you?
 
   p.s. I'm working on a patch that should solve another issue caused by
   PKEY reordering  ipoib behavior and the above issue further
   complicates things for me.
 
 Why not fix the issue first then?
 
 commit a27cbe878203076247c1b5287f5ab59ed143b560
 Author: Roland Dreier [EMAIL PROTECTED]
 Date:   Tue Feb 27 07:37:49 2007 -0800
 
 IPoIB: Only handle async events for one port
 
 An asynchronous event carries the port number that the event occurred
 on, so there's no reason for an IPoIB interface to process an event
 associated with a different local HCA port.
 
 Signed-off-by: Roland Dreier [EMAIL PROTECTED]
 
 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 index 3cb551b..7f3ec20 100644
 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 @@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
         struct ipoib_dev_priv *priv =
                 container_of(handler, struct ipoib_dev_priv, event_handler);
  
 -       if (record->event == IB_EVENT_PORT_ERR    ||
 -           record->event == IB_EVENT_PKEY_CHANGE ||
 -           record->event == IB_EVENT_PORT_ACTIVE ||
 -           record->event == IB_EVENT_LID_CHANGE  ||
 -           record->event == IB_EVENT_SM_CHANGE   ||
 -           record->event == IB_EVENT_CLIENT_REREGISTER) {
 +       if ((record->event == IB_EVENT_PORT_ERR    ||
 +            record->event == IB_EVENT_PKEY_CHANGE ||
 +            record->event == IB_EVENT_PORT_ACTIVE ||
 +            record->event == IB_EVENT_LID_CHANGE  ||
 +            record->event == IB_EVENT_SM_CHANGE   ||
 +            record->event == IB_EVENT_CLIENT_REREGISTER) &&
 +           record->element.port_num == priv->port) {
                 ipoib_dbg(priv, "Port state change event\n");
                 queue_work(ipoib_workqueue, &priv->flush_task);
         }

Looks good.

-- 
MST




Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Moni Levy
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
  I did a short code review of the ipoib code concentrating on
   partitioning support and I mentioned that the asynchronous events
   handler in the ipoib code does not take the port number reported in
   the event record into consideration. The effect of that is that all of
   the ib# devices related to that specific HCA are flushed when it seems
   to me that only the relevant port one should be. Is that done on
   purpose, or am I missing something ?

 I don't think there's any particular reason the code is that way
 except for the oversight never being corrected.  But it looks trivial
 to fix, like the patch below.  Does that look right to you?

   p.s. I'm working on a patch that should solve another issue caused by
   PKEY reordering  ipoib behavior and the above issue further
   complicates things for me.

 Why not fix the issue first then?

 commit a27cbe878203076247c1b5287f5ab59ed143b560
 Author: Roland Dreier [EMAIL PROTECTED]
 Date:   Tue Feb 27 07:37:49 2007 -0800

 IPoIB: Only handle async events for one port

 An asynchronous event carries the port number that the event occurred
 on, so there's no reason for an IPoIB interface to process an event
 associated with a different local HCA port.

 Signed-off-by: Roland Dreier [EMAIL PROTECTED]

 diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 index 3cb551b..7f3ec20 100644
 --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
 @@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
         struct ipoib_dev_priv *priv =
                 container_of(handler, struct ipoib_dev_priv, event_handler);

 -       if (record->event == IB_EVENT_PORT_ERR    ||
 -           record->event == IB_EVENT_PKEY_CHANGE ||
 -           record->event == IB_EVENT_PORT_ACTIVE ||
 -           record->event == IB_EVENT_LID_CHANGE  ||
 -           record->event == IB_EVENT_SM_CHANGE   ||
 -           record->event == IB_EVENT_CLIENT_REREGISTER) {
 +       if ((record->event == IB_EVENT_PORT_ERR    ||
 +            record->event == IB_EVENT_PKEY_CHANGE ||
 +            record->event == IB_EVENT_PORT_ACTIVE ||
 +            record->event == IB_EVENT_LID_CHANGE  ||
 +            record->event == IB_EVENT_SM_CHANGE   ||
 +            record->event == IB_EVENT_CLIENT_REREGISTER) &&
 +           record->element.port_num == priv->port) {
                 ipoib_dbg(priv, "Port state change event\n");
                 queue_work(ipoib_workqueue, &priv->flush_task);
         }


That's exactly what I intended to post.

--Moni




Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Roland Dreier
   I haven't really read it except to think "why is this so complicated?"
  
  Do you mean the complexity of the patch or of the issue?

the patch.

   Changing the P_Key index is not allowed for RTS->RTS.  You would have
   to modify the QP RTS->SQD, wait for the SQ to drain, then modify the
   P_Key index with SQD->SQD, and finally go SQD->RTS.
  
  Do you think solving it that way would be a significant simplification?
  We would still have to reuse the handling for missed completions that is
  currently implemented in ipoib_ib_dev_stop, and we would still need an
  additional work element.

no, I don't think SQD is really useful in practice.




Re: [openib-general] HOWTO check ofa_kernel build from your git tree

2007-02-27 Thread Vladimir Sokolovsky
On Tue, 2007-02-27 at 08:23 -0600, Steve Wise wrote:
 Where are all the kernel src trees on ssh.openfabrics.org?
 
 I would like to build against specific trees that are failing with
 cxgb3...
 
/home/vlad/kernel.org/arch/kernel

 Also:  
 
 what RH distro ships:
 
 linux-2.6.9-22.ELsmp
 
RHEL4.0U2
 
 and
 
 linux-2.6.9-34.ELsmp
 
RHEL4.0U3
 
 Thanks,
 
 Steve.
 
 
 
 On Mon, 2007-02-26 at 17:07 +0200, Vladimir Sokolovsky wrote:
  On ssh.openfabrics.org:
  Run
  env git_url=/home/mst/scm/ofed_1_2_devel.git git_branch=ofed_1_2 \
  CHECK_LOCAL=yes \
  CHECK_KERNEL_ORG=yes \
  CHECK_CROSS=yes /home/vlad/scripts/build_ofa_kernel.sh
  




Re: [openib-general] [PATCHv2] IB/ipoib: Fix ipoib handling for pkey reordering

2007-02-27 Thread Moni Levy
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
I haven't really read it except to think "why is this so complicated?"
  
   Do you mean the complexity of the patch or of the issue?

 the patch.

Please advise and I'll change it.


Changing the P_Key index is not allowed for RTS->RTS.  You would have
to modify the QP RTS->SQD, wait for the SQ to drain, then modify the
P_Key index with SQD->SQD, and finally go SQD->RTS.
  
   Do you think solving it that way would be a significant simplification?
   We would still have to reuse the handling for missed completions that is
   currently implemented in ipoib_ib_dev_stop, and we would still need an
   additional work element.

 no, I don't think SQD is really useful in practice.





[openib-general] [PATCH 0/6] ofed_1_2: cxgb3 bug fixes

2007-02-27 Thread Steve Wise

Hey Vlad,

These fixes need to be pulled into ofed_1_2 for the Chelsio Ethernet
driver.

You can pull them directly from my ofa git tree:

git://staging.openfabrics.org/~swise/ofed_1_2 cxgb3_fixes

Thanks,

Steve.




[openib-general] [PATCH 1/6] sysfs attributes are now managed per port, no longer per adapter.

2007-02-27 Thread Steve Wise

sysfs attributes are now managed per port, no longer per adapter.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/cxgb3_main.c |   21 ++++++++++++---------
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
index dfa035a..638b0ab 100755
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -435,26 +435,24 @@ static int setup_sge_qsets(struct adapte
 }
 
 static ssize_t attr_show(struct class_device *cd, char *buf,
-ssize_t(*format) (struct adapter *, char *))
+ssize_t(*format) (struct net_device *, char *))
 {
ssize_t len;
-   struct adapter *adap = to_net_dev(cd)->priv;
 
/* Synchronize with ioctls that may shut down the device */
rtnl_lock();
-   len = (*format) (adap, buf);
+   len = (*format) (to_net_dev(cd), buf);
rtnl_unlock();
return len;
 }
 
 static ssize_t attr_store(struct class_device *cd, const char *buf, size_t len,
- ssize_t(*set) (struct adapter *, unsigned int),
+ ssize_t(*set) (struct net_device *, unsigned int),
  unsigned int min_val, unsigned int max_val)
 {
char *endp;
ssize_t ret;
unsigned int val;
-   struct adapter *adap = to_net_dev(cd)->priv;
 
if (!capable(CAP_NET_ADMIN))
return -EPERM;
@@ -464,7 +462,7 @@ static ssize_t attr_store(struct class_d
return -EINVAL;
 
rtnl_lock();
-   ret = (*set) (adap, val);
+   ret = (*set) (to_net_dev(cd), val);
if (!ret)
ret = len;
rtnl_unlock();
@@ -472,8 +470,9 @@ static ssize_t attr_store(struct class_d
 }
 
 #define CXGB3_SHOW(name, val_expr) \
-static ssize_t format_##name(struct adapter *adap, char *buf) \
+static ssize_t format_##name(struct net_device *dev, char *buf) \
 { \
+   struct adapter *adap = dev->priv; \
return sprintf(buf, "%u\n", val_expr); \
 } \
 static ssize_t show_##name(struct class_device *cd, char *buf) \
@@ -481,8 +480,10 @@ static ssize_t show_##name(struct class_
return attr_show(cd, buf, format_##name); \
 }
 
-static ssize_t set_nfilters(struct adapter *adap, unsigned int val)
+static ssize_t set_nfilters(struct net_device *dev, unsigned int val)
 {
+   struct adapter *adap = dev->priv;
+
if (adap->flags & FULL_INIT_DONE)
return -EBUSY;
if (val && adap->params.rev == 0)
@@ -499,8 +500,10 @@ static ssize_t store_nfilters(struct cla
return attr_store(cd, buf, len, set_nfilters, 0, ~0);
 }
 
-static ssize_t set_nservers(struct adapter *adap, unsigned int val)
+static ssize_t set_nservers(struct net_device *dev, unsigned int val)
 {
+   struct adapter *adap = dev->priv;
+
if (adap->flags & FULL_INIT_DONE)
return -EBUSY;
if (val > t3_mc5_size(&adap->mc5) - adap->params.mc5.nfilters)
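
The pattern in this patch, where the show/store helpers hand the per-port `net_device` to the callback and the callback looks up the adapter itself, can be sketched in user space as follows. All `toy_` names are hypothetical; the real helpers also take `rtnl_lock()` around the callback, which this sketch leaves out.

```c
#include <stdio.h>

/* Hypothetical stand-ins for struct adapter and struct net_device. */
struct toy_adapter {
    unsigned int nfilters;
};

struct toy_net_device {
    int port_id;
    void *priv;                    /* points at the shared adapter */
};

typedef int (*toy_format_fn)(struct toy_net_device *dev, char *buf, size_t len);

/* Generic show helper: passes the port's device, not the adapter. */
static int toy_attr_show(struct toy_net_device *dev, char *buf, size_t len,
                         toy_format_fn format)
{
    return format(dev, buf, len);
}

/* Per-attribute callback: recovers the adapter from the device. */
static int toy_format_nfilters(struct toy_net_device *dev, char *buf, size_t len)
{
    struct toy_adapter *adap = dev->priv;

    return snprintf(buf, len, "%u\n", adap->nfilters);
}
```

Because the callback receives the device, a later attribute can use per-port state without changing the helper signatures again.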




[openib-general] [PATCH 3/6] Update FW version to 3.2

2007-02-27 Thread Steve Wise

Update FW version to 3.2

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/net/cxgb3/t3_hw.c   |    6 ++++--
 drivers/net/cxgb3/version.h |    2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
old mode 100755
new mode 100644
index 365a7f5..eaa7a2e
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -884,11 +884,13 @@ int t3_check_fw_version(struct adapter *
major = G_FW_VERSION_MAJOR(vers);
minor = G_FW_VERSION_MINOR(vers);
 
-   if (type == FW_VERSION_T3 && major == 3 && minor == 1)
+   if (type == FW_VERSION_T3 && major == FW_VERSION_MAJOR &&
+   minor == FW_VERSION_MINOR)
return 0;
 
CH_ERR(adapter, "found wrong FW version(%u.%u), "
-  "driver needs version 3.1\n", major, minor);
+  "driver needs version %u.%u\n", major, minor,
+  FW_VERSION_MAJOR, FW_VERSION_MINOR);
return -EINVAL;
 }
 
diff --git a/drivers/net/cxgb3/version.h b/drivers/net/cxgb3/version.h
old mode 100755
new mode 100644
index 2b67dd5..782a6cf
--- a/drivers/net/cxgb3/version.h
+++ b/drivers/net/cxgb3/version.h
@@ -36,4 +36,6 @@ #define DRV_DESC Chelsio T3 Network Dri
 #define DRV_NAME "cxgb3"
 /* Driver version */
 #define DRV_VERSION "1.0"
+#define FW_VERSION_MAJOR 3
+#define FW_VERSION_MINOR 2
 #endif /* __CHELSIO_VERSION_H */
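
As a quick illustration of what the macro-based check buys, here is a stand-alone sketch. The `toy_`/`TOY_` names are hypothetical, the firmware type test is omitted, and the function returns -1 where the driver returns -EINVAL. The point is only that the expected version now lives in one place and the error message follows it.

```c
#include <stdio.h>

/* Expected firmware version, mirroring the two new macros in version.h. */
#define TOY_FW_VERSION_MAJOR 3
#define TOY_FW_VERSION_MINOR 2

/*
 * Returns 0 when the firmware version matches, -1 otherwise.
 * On mismatch, msg receives the same wording the patch logs.
 */
static int toy_check_fw_version(unsigned int major, unsigned int minor,
                                char *msg, size_t len)
{
    if (major == TOY_FW_VERSION_MAJOR && minor == TOY_FW_VERSION_MINOR)
        return 0;

    snprintf(msg, len,
             "found wrong FW version(%u.%u), driver needs version %u.%u",
             major, minor, TOY_FW_VERSION_MAJOR, TOY_FW_VERSION_MINOR);
    return -1;
}
```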




[openib-general] [PATCH 4/6] Offload packets may be DMAed long after their SGE Tx descriptors are done

2007-02-27 Thread Steve Wise

Offload packets may be DMAed long after their SGE Tx descriptors are done

so they must remain mapped until they are freed rather than until their
descriptors are freed.  Unmap such packets through an skb destructor.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/sge.c |   63 ++-
 1 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
old mode 100755
new mode 100644
index 3f2cf8a..822a598
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -105,6 +105,15 @@ struct unmap_info {/* packet unmapping
 };
 
 /*
+ * Holds unmapping information for Tx packets that need deferred unmapping.
+ * This structure lives at skb->head and must be allocated by callers.
+ */
+struct deferred_unmap_info {
+   struct pci_dev *pdev;
+   dma_addr_t addr[MAX_SKB_FRAGS + 1];
+};
+
+/*
  * Maps a number of flits to the number of Tx descriptors that can hold them.
  * The formula is
  *
@@ -252,10 +261,13 @@ static void free_tx_desc(struct adapter 
struct pci_dev *pdev = adapter->pdev;
unsigned int cidx = q->cidx;
 
+   const int need_unmap = need_skb_unmap() &&
+  q->cntxt_id >= FW_TUNNEL_SGEEC_START;
+
d = &q->sdesc[cidx];
while (n--) {
if (d->skb) {   /* an SGL is present */
-   if (need_skb_unmap())
+   if (need_unmap)
unmap_skb(d->skb, q, cidx, pdev);
if (d->skb->priority == cidx)
kfree_skb(d->skb);
@@ -1227,6 +1239,50 @@ int t3_mgmt_tx(struct adapter *adap, str
 }
 
 /**
+ * deferred_unmap_destructor - unmap a packet when it is freed
+ * @skb: the packet
+ *
+ * This is the packet destructor used for Tx packets that need to remain
+ * mapped until they are freed rather than until their Tx descriptors are
+ * freed.
+ */
+static void deferred_unmap_destructor(struct sk_buff *skb)
+{
+   int i;
+   const dma_addr_t *p;
+   const struct skb_shared_info *si;
+   const struct deferred_unmap_info *dui;
+   const struct unmap_info *ui = (struct unmap_info *)skb->cb;
+
+   dui = (struct deferred_unmap_info *)skb->head;
+   p = dui->addr;
+
+   if (ui->len)
+   pci_unmap_single(dui->pdev, *p++, ui->len, PCI_DMA_TODEVICE);
+
+   si = skb_shinfo(skb);
+   for (i = 0; i < si->nr_frags; i++)
+   pci_unmap_page(dui->pdev, *p++, si->frags[i].size,
+  PCI_DMA_TODEVICE);
+}
+
+static void setup_deferred_unmapping(struct sk_buff *skb, struct pci_dev *pdev,
+const struct sg_ent *sgl, int sgl_flits)
+{
+   dma_addr_t *p;
+   struct deferred_unmap_info *dui;
+
+   dui = (struct deferred_unmap_info *)skb->head;
+   dui->pdev = pdev;
+   for (p = dui->addr; sgl_flits >= 3; sgl++, sgl_flits -= 3) {
+   *p++ = be64_to_cpu(sgl->addr[0]);
+   *p++ = be64_to_cpu(sgl->addr[1]);
+   }
+   if (sgl_flits)
+   *p = be64_to_cpu(sgl->addr[0]);
+}
+
+/**
  * write_ofld_wr - write an offload work request
  * @adap: the adapter
  * @skb: the packet to send
@@ -1262,8 +1318,11 @@ static void write_ofld_wr(struct adapter
sgp = ndesc == 1 ? (struct sg_ent *)&d->flit[flits] : sgl;
sgl_flits = make_sgl(skb, sgp, skb->h.raw, skb->tail - skb->h.raw,
 adap->pdev);
-   if (need_skb_unmap())
+   if (need_skb_unmap()) {
+   setup_deferred_unmapping(skb, adap->pdev, sgp, sgl_flits);
+   skb->destructor = deferred_unmap_destructor;
((struct unmap_info *)skb->cb)->len = skb->tail - skb->h.raw;
+   }
 
write_wr_hdr_sgl(ndesc, skb, d, pidx, q, sgl, flits, sgl_flits,
 gen, from->wr_hi, from->wr_lo);
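
The address-saving loop in setup_deferred_unmapping() is easy to misread, so here is a stand-alone sketch of it. The `toy_` names are hypothetical and the byte-order conversion is left out. It assumes, as the loop's step of three suggests, that one scatter/gather entry carries two addresses and occupies three flits, so a leftover of one or two flits means a final entry with a single address.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy scatter/gather entry: two lengths and two addresses (three flits). */
struct toy_sg_ent {
    uint32_t len[2];
    uint64_t addr[2];
};

/*
 * Copy the DMA addresses described by sgl_flits flits into out[].
 * Returns the number of addresses written.
 */
static size_t toy_collect_addrs(const struct toy_sg_ent *sgl, int sgl_flits,
                                uint64_t *out)
{
    size_t n = 0;

    /* Each full entry contributes both of its addresses. */
    for (; sgl_flits >= 3; sgl++, sgl_flits -= 3) {
        out[n++] = sgl->addr[0];
        out[n++] = sgl->addr[1];
    }
    /* A partial last entry contributes only its first address. */
    if (sgl_flits)
        out[n++] = sgl->addr[0];
    return n;
}
```

The destructor later walks the saved array in the same order: first the linear part of the packet, then one address per page fragment.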




[openib-general] [PATCH 5/6] Improve the traffic recovery after the HW ran out of response queue entries.

2007-02-27 Thread Steve Wise

Improve the traffic recovery after the HW ran out of response queue entries.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/adapter.h |    2 ++
 drivers/net/cxgb3/sge.c     |   15 ++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
old mode 100755
new mode 100644
index 5c97a64..01b99b9
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -121,6 +121,8 @@ struct sge_rspq {   /* state for an SGE r
unsigned long empty;/* # of times queue ran out of credits */
unsigned long nomem;/* # of responses deferred due to no mem */
unsigned long unhandled_irqs;   /* # of spurious intrs */
+   unsigned long starved;
+   unsigned long restarted;
 };
 
 struct tx_desc;
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 822a598..4ff0ab6 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2376,13 +2376,26 @@ static void sge_timer_cb(unsigned long d
spin_unlock(&qs->txq[TXQ_OFLD].lock);
}
lock = (adap->flags & USING_MSIX) ? &qs->rspq.lock :
-   &adap->sge.qs[0].rspq.lock;
+   &adap->sge.qs[0].rspq.lock;
if (spin_trylock_irq(lock)) {
if (!napi_is_scheduled(qs->netdev)) {
+   u32 status = t3_read_reg(adap, A_SG_RSPQ_FL_STATUS);
+
if (qs->fl[0].credits < qs->fl[0].size)
__refill_fl(adap, &qs->fl[0]);
if (qs->fl[1].credits < qs->fl[1].size)
__refill_fl(adap, &qs->fl[1]);
+
+   if (status & (1 << qs->rspq.cntxt_id)) {
+   qs->rspq.starved++;
+   if (qs->rspq.credits) {
+   refill_rspq(adap, &qs->rspq, 1);
+   qs->rspq.credits--;
+   qs->rspq.restarted++;
+   t3_write_reg(adap, A_SG_RSPQ_FL_STATUS,
+1 << qs->rspq.cntxt_id);
+   }
+   }
}
spin_unlock_irq(lock);
}
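
The recovery step added to the timer callback can be modelled without hardware. In this sketch the `toy_` names are hypothetical, the status register is a plain variable, and clearing the bit on write is an assumption about how A_SG_RSPQ_FL_STATUS behaves; the patch itself only shows that the bit is written back.

```c
#include <stdint.h>

/* Toy response queue state; field names follow the patch. */
struct toy_rspq {
    unsigned int cntxt_id;         /* bit position in the status register */
    unsigned int credits;          /* credits the driver still holds */
    unsigned long starved;         /* times the HW reported starvation */
    unsigned long restarted;       /* times a credit was handed back */
};

/*
 * If the queue's bit is set in the status word, count the starvation
 * and, when a credit is available, return one and clear the bit.
 */
static void toy_check_starvation(struct toy_rspq *q, uint32_t *status_reg)
{
    uint32_t bit = (uint32_t)1 << q->cntxt_id;

    if (!(*status_reg & bit))
        return;

    q->starved++;
    if (q->credits) {
        q->credits--;              /* one credit goes back to the HW */
        q->restarted++;
        *status_reg &= ~bit;       /* assumed write-one-to-clear */
    }
}
```

If the driver holds no credits, the starvation is only counted and the next timer tick tries again.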




[openib-general] [PATCH 2/6] Clean up some private ioctls.

2007-02-27 Thread Steve Wise

Clean up some private ioctls.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/cxgb3_ioctl.h |   33 +--
 drivers/net/cxgb3/cxgb3_main.c  |   48 +++
 2 files changed, 15 insertions(+), 66 deletions(-)

diff --git a/drivers/net/cxgb3/cxgb3_ioctl.h b/drivers/net/cxgb3/cxgb3_ioctl.h
old mode 100755
new mode 100644
index a942818..0a82fcd
--- a/drivers/net/cxgb3/cxgb3_ioctl.h
+++ b/drivers/net/cxgb3/cxgb3_ioctl.h
@@ -36,28 +36,17 @@ #define __CHIOCTL_H__
  * Ioctl commands specific to this driver.
  */
 enum {
-   CHELSIO_SETREG = 1024,
-   CHELSIO_GETREG,
-   CHELSIO_SETTPI,
-   CHELSIO_GETTPI,
-   CHELSIO_GETMTUTAB,
-   CHELSIO_SETMTUTAB,
-   CHELSIO_GETMTU,
-   CHELSIO_SET_PM,
-   CHELSIO_GET_PM,
-   CHELSIO_GET_TCAM,
-   CHELSIO_SET_TCAM,
-   CHELSIO_GET_TCB,
-   CHELSIO_GET_MEM,
-   CHELSIO_LOAD_FW,
-   CHELSIO_GET_PROTO,
-   CHELSIO_SET_PROTO,
-   CHELSIO_SET_TRACE_FILTER,
-   CHELSIO_SET_QSET_PARAMS,
-   CHELSIO_GET_QSET_PARAMS,
-   CHELSIO_SET_QSET_NUM,
-   CHELSIO_GET_QSET_NUM,
-   CHELSIO_SET_PKTSCHED,
+   CHELSIO_GETMTUTAB   = 1029,
+   CHELSIO_SETMTUTAB   = 1030,
+   CHELSIO_SET_PM  = 1032,
+   CHELSIO_GET_PM  = 1033,
+   CHELSIO_GET_MEM = 1038,
+   CHELSIO_LOAD_FW = 1041,
+   CHELSIO_SET_TRACE_FILTER= 1044,
+   CHELSIO_SET_QSET_PARAMS = 1045,
+   CHELSIO_GET_QSET_PARAMS = 1046,
+   CHELSIO_SET_QSET_NUM= 1047,
+   CHELSIO_GET_QSET_NUM= 1048,
 };
 
 struct ch_reg {
diff --git a/drivers/net/cxgb3/cxgb3_main.c b/drivers/net/cxgb3/cxgb3_main.c
old mode 100755
new mode 100644
index 638b0ab..0e84c4e
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ b/drivers/net/cxgb3/cxgb3_main.c
@@ -1547,32 +1547,6 @@ static int cxgb_extension_ioctl(struct n
return -EFAULT;
 
switch (cmd) {
-   case CHELSIO_SETREG:{
-   struct ch_reg edata;
-
-   if (!capable(CAP_NET_ADMIN))
-   return -EPERM;
-   if (copy_from_user(&edata, useraddr, sizeof(edata)))
-   return -EFAULT;
-   if ((edata.addr & 3) != 0
-   || edata.addr >= adapter->mmio_len)
-   return -EINVAL;
-   writel(edata.val, adapter->regs + edata.addr);
-   break;
-   }
-   case CHELSIO_GETREG:{
-   struct ch_reg edata;
-
-   if (copy_from_user(&edata, useraddr, sizeof(edata)))
-   return -EFAULT;
-   if ((edata.addr & 3) != 0
-   || edata.addr >= adapter->mmio_len)
-   return -EINVAL;
-   edata.val = readl(adapter->regs + edata.addr);
-   if (copy_to_user(useraddr, &edata, sizeof(edata)))
-   return -EFAULT;
-   break;
-   }
case CHELSIO_SET_QSET_PARAMS:{
int i;
struct qset_params *q;
@@ -1836,10 +1810,10 @@ static int cxgb_extension_ioctl(struct n
return -EINVAL;
 
/*
-   * Version scheme:
-   * bits 0..9: chip version
-   * bits 10..15: chip revision
-   */
+* Version scheme:
+* bits 0..9: chip version
+* bits 10..15: chip revision
+*/
t.version = 3 | (adapter->params.rev << 10);
if (copy_to_user(useraddr, &t, sizeof(t)))
return -EFAULT;
@@ -1888,20 +1862,6 @@ static int cxgb_extension_ioctl(struct n
t.trace_rx);
break;
}
-   case CHELSIO_SET_PKTSCHED:{
-   struct ch_pktsched_params p;
-
-   if (!capable(CAP_NET_ADMIN))
-   return -EPERM;
-   if (!adapter->open_device_map)
-   return -EAGAIN; /* uP and SGE must be running */
-   if (copy_from_user(&p, useraddr, sizeof(p)))
-   return -EFAULT;
-   send_pktsched_cmd(adapter, p.sched, p.idx, p.min, p.max,
- p.binding);
-   break;
-   
-   }
default:
return -EOPNOTSUPP;
}
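
One thing worth checking when reading the enum change: the explicit values are not the numbers the old implicit enum produced. The sketch below puts both side by side, with `OLD_`/`NEW_` prefixes added only so the two can coexist in one file. The patch does not say where the new numbers come from, so treat this purely as arithmetic on the two lists.

```c
/* Values the removed enum assigned implicitly, counting up from 1024. */
enum toy_old_ioctl {
    OLD_CHELSIO_SETREG = 1024,
    OLD_CHELSIO_GETREG,
    OLD_CHELSIO_SETTPI,
    OLD_CHELSIO_GETTPI,
    OLD_CHELSIO_GETMTUTAB,         /* 1028 */
    OLD_CHELSIO_SETMTUTAB,
    OLD_CHELSIO_GETMTU,
    OLD_CHELSIO_SET_PM,
    OLD_CHELSIO_GET_PM,
    OLD_CHELSIO_GET_TCAM,
    OLD_CHELSIO_SET_TCAM,
    OLD_CHELSIO_GET_TCB,
    OLD_CHELSIO_GET_MEM,           /* 1036 */
    OLD_CHELSIO_LOAD_FW,           /* 1037 */
    OLD_CHELSIO_GET_PROTO,
    OLD_CHELSIO_SET_PROTO,
    OLD_CHELSIO_SET_TRACE_FILTER,
    OLD_CHELSIO_SET_QSET_PARAMS,
    OLD_CHELSIO_GET_QSET_PARAMS,
    OLD_CHELSIO_SET_QSET_NUM,
    OLD_CHELSIO_GET_QSET_NUM,      /* 1044 */
    OLD_CHELSIO_SET_PKTSCHED
};

/* Explicit values introduced by the patch. */
enum toy_new_ioctl {
    NEW_CHELSIO_GETMTUTAB        = 1029,
    NEW_CHELSIO_SETMTUTAB        = 1030,
    NEW_CHELSIO_SET_PM           = 1032,
    NEW_CHELSIO_GET_PM           = 1033,
    NEW_CHELSIO_GET_MEM          = 1038,
    NEW_CHELSIO_LOAD_FW          = 1041,
    NEW_CHELSIO_SET_TRACE_FILTER = 1044,
    NEW_CHELSIO_SET_QSET_PARAMS  = 1045,
    NEW_CHELSIO_GET_QSET_PARAMS  = 1046,
    NEW_CHELSIO_SET_QSET_NUM     = 1047,
    NEW_CHELSIO_GET_QSET_NUM     = 1048
};
```

Every surviving command moves by one, two or four, so user-space tools built against the old header would need rebuilding.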




[openib-general] [PATCH 6/6] Populate Rx free list with pages.

2007-02-27 Thread Steve Wise

Populate Rx free list with pages.

Signed-off-by: Divy Le Ray [EMAIL PROTECTED]
---

 drivers/net/cxgb3/adapter.h |9 +
 drivers/net/cxgb3/sge.c |  318 +++
 2 files changed, 235 insertions(+), 92 deletions(-)

diff --git a/drivers/net/cxgb3/adapter.h b/drivers/net/cxgb3/adapter.h
index 01b99b9..80c3d8f 100644
--- a/drivers/net/cxgb3/adapter.h
+++ b/drivers/net/cxgb3/adapter.h
@@ -74,6 +74,11 @@ enum {   /* adapter flags */
 struct rx_desc;
 struct rx_sw_desc;
 
+struct sge_fl_page {
+   struct skb_frag_struct frag;
+   unsigned char *va;
+};
+
 struct sge_fl {/* SGE per free-buffer list state */
unsigned int buf_size;  /* size of each Rx buffer */
unsigned int credits;   /* # of available Rx buffers */
@@ -81,11 +86,13 @@ struct sge_fl { /* SGE per free-buffer
unsigned int cidx;  /* consumer index */
unsigned int pidx;  /* producer index */
unsigned int gen;   /* free list generation */
+   unsigned int cntxt_id;  /* SGE context id for the free list */
+   struct sge_fl_page page;
struct rx_desc *desc;   /* address of HW Rx descriptor ring */
struct rx_sw_desc *sdesc;   /* address of SW Rx descriptor ring */
dma_addr_t phys_addr;   /* physical address of HW ring start */
-   unsigned int cntxt_id;  /* SGE context id for the free list */
unsigned long empty;/* # of times queue ran out of buffers */
+   unsigned long alloc_failed; /* # of times buffer allocation failed */
 };
 
 /*
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 4ff0ab6..c237834 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -45,9 +45,25 @@ #include "firmware_exports.h"
 #define USE_GTS 0
 
 #define SGE_RX_SM_BUF_SIZE 1536
+
+/*
+ * If USE_RX_PAGE is defined, the small freelist populated with (partial)
+ * pages instead of skbs. Pages are carved up into RX_PAGE_SIZE chunks (must
+ * be a multiple of the host page size).
+ */
+#define USE_RX_PAGE
+#define RX_PAGE_SIZE 2048
+
+/*
+ * skb freelist packets are copied into a new skb (and the freelist one is 
+ * reused) if their len is <= 
+ */
 #define SGE_RX_COPY_THRES  256
 
-# define SGE_RX_DROP_THRES 16
+/*
+ * Minimum number of freelist entries before we start dropping TUNNEL frames.
+ */
+#define SGE_RX_DROP_THRES 16
 
 /*
  * Period of the Tx buffer reclaim timer.  This timer does not need to run
@@ -85,7 +101,10 @@ struct tx_sw_desc { /* SW state per Tx 
 };
 
 struct rx_sw_desc {/* SW state per Rx descriptor */
-   struct sk_buff *skb;
+   union {
+   struct sk_buff *skb;
+   struct sge_fl_page page;
+   } t;
 DECLARE_PCI_UNMAP_ADDR(dma_addr);
 };
 
@@ -332,16 +351,27 @@ static void free_rx_bufs(struct pci_dev 
 
pci_unmap_single(pdev, pci_unmap_addr(d, dma_addr),
 q->buf_size, PCI_DMA_FROMDEVICE);
-   kfree_skb(d->skb);
-   d->skb = NULL;
+
+   if (q->buf_size != RX_PAGE_SIZE) {
+   kfree_skb(d->t.skb);
+   d->t.skb = NULL;
+   } else {
+   if (d->t.page.frag.page)
+   put_page(d->t.page.frag.page);
+   d->t.page.frag.page = NULL;
+   }
if (++cidx == q->size)
cidx = 0;
}
+
+   if (q->page.frag.page)
+   put_page(q->page.frag.page);
+   q->page.frag.page = NULL;
 }
 
 /**
  * add_one_rx_buf - add a packet buffer to a free-buffer list
- * @skb: the buffer to add
+ * @va: va of the buffer to add
  * @len: the buffer length
  * @d: the HW Rx descriptor to write
  * @sd: the SW Rx descriptor to write
@@ -351,14 +381,13 @@ static void free_rx_bufs(struct pci_dev 
  * Add a buffer of the given length to the supplied HW and SW Rx
  * descriptors.
  */
-static inline void add_one_rx_buf(struct sk_buff *skb, unsigned int len,
+static inline void add_one_rx_buf(unsigned char *va, unsigned int len,
  struct rx_desc *d, struct rx_sw_desc *sd,
  unsigned int gen, struct pci_dev *pdev)
 {
dma_addr_t mapping;
 
-   sd->skb = skb;
-   mapping = pci_map_single(pdev, skb->data, len, PCI_DMA_FROMDEVICE);
+   mapping = pci_map_single(pdev, va, len, PCI_DMA_FROMDEVICE);
pci_unmap_addr_set(sd, dma_addr, mapping);
 
d->addr_lo = cpu_to_be32(mapping);
@@ -383,14 +412,47 @@ static void refill_fl(struct adapter *ad
 {
struct rx_sw_desc *sd = &q->sdesc[q->pidx];
struct rx_desc *d = &q->desc[q->pidx];
+   struct sge_fl_page *p = &q->page;
 
while (n--) {
-   struct sk_buff *skb = alloc_skb(q->buf_size, gfp);
+   unsigned char *va;
 
-   

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Vladimir Sokolovsky
On Mon, 2007-02-26 at 09:46 -0800, Sean Hefty wrote:
 Vladimir Sokolovsky wrote:
  On Fri, 2007-02-23 at 12:15 -0800, Sean Hefty wrote:
I would like these fixes in OFED 1.2 as well.  What git tree / branch 
  do I
generate a patch against?
   
- Sean
  
  git://git.openfabrics.org/~vlad/ofed_1_2/.git
  branch: ofed_1_2
 
 Can you try pulling from:
 
 git://git.openfabrics.org/~shefty/ofed_1_2.git   ofed_1_2
 
 - Sean

Sean,
Please send patches that will be added to kernel_patches/fixes.

Please update your git tree from
git://git.openfabrics.org/~vlad/ofed_1_2/.git  ofed_1_2


-- 
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Moni Levy
On 2/27/07, Moni Levy [EMAIL PROTECTED] wrote:
 On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
   I did a short code review of the ipoib code concentrating on
partitioning support and I mentioned that the asynchronous events
handler in the ipoib code does not take the port number reported in
the event record into consideration. The effect of that is that all of
the ib# devices related to that specific HCA are flushed when it seems
to me that only the relevant port one should be. Is that done on
purpose, or am I missing something ?
 
  I don't think there's any particular reason the code is that way
  except for the oversight never being corrected.  But it looks trivial
  to fix, like the patch below.  Does that look right to you?
 
p.s. I'm working on a patch that should solve another issue caused by
PKEY reordering & ipoib behavior and the above issue further
complicates things for me.
 
  Why not fix the issue first then?
 
  commit a27cbe878203076247c1b5287f5ab59ed143b560
  Author: Roland Dreier [EMAIL PROTECTED]
  Date:   Tue Feb 27 07:37:49 2007 -0800
 
  IPoIB: Only handle async events for one port
 
  An asynchronous event carries the port number that the event occurred
  on, so there's no reason for an IPoIB interface to process an event
  associated with a different local HCA port.
 
  Signed-off-by: Roland Dreier [EMAIL PROTECTED]
 
  diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 
  b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
  index 3cb551b..7f3ec20 100644
  --- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
  +++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
  @@ -259,12 +259,13 @@ void ipoib_event(struct ib_event_handler *handler,
  struct ipoib_dev_priv *priv =
  container_of(handler, struct ipoib_dev_priv, event_handler);
 
  -   if (record->event == IB_EVENT_PORT_ERR    ||
  -       record->event == IB_EVENT_PKEY_CHANGE ||
  -       record->event == IB_EVENT_PORT_ACTIVE ||
  -       record->event == IB_EVENT_LID_CHANGE  ||
  -       record->event == IB_EVENT_SM_CHANGE   ||
  -       record->event == IB_EVENT_CLIENT_REREGISTER) {
  +   if ((record->event == IB_EVENT_PORT_ERR    ||
  +        record->event == IB_EVENT_PKEY_CHANGE ||
  +        record->event == IB_EVENT_PORT_ACTIVE ||
  +        record->event == IB_EVENT_LID_CHANGE  ||
  +        record->event == IB_EVENT_SM_CHANGE   ||
  +        record->event == IB_EVENT_CLIENT_REREGISTER) &&
  +       record->element.port_num == priv->port) {
  ipoib_dbg(priv, "Port state change event\n");
  queue_work(ipoib_workqueue, &priv->flush_task);
  }
 

 That's exactly what I intended to post.

On a second thought based on the fact that on a two port HCA we'll
have a 50% miss on the events being delivered, I would move the new
condition to be evaluated first. I apologize if this is too much of
micro optimization. What do you think ?

--Moni


 --Moni





Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
Please send patches that will be added to kernel_patches/fixes.

Please update your git tree from
git://git.openfabrics.org/~vlad/ofed_1_2/.git  ofed_1_2

You want me to create a patch that adds a file that contains the actual patches?

Why not apply the patches directly?




Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Roland Dreier
  On a second thought based on the fact that on a two port HCA we'll
  have a 50% miss on the events being delivered, I would move the new
  condition to be evaluated first. I apologize if this is too much of
  micro optimization. What do you think ?

That wouldn't really be correct since element.port_num isn't valid
unless we already know it's a port-related event.

And it's not worth worrying about this since it's not remotely a hot path.

 - R.
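
The ordering point above can be illustrated with a small standalone sketch. The types below are simplified, hypothetical stand-ins for the kernel's event record (the names `sketch_event`, `is_port_event` and `should_flush` are made up for illustration and are not part of the IPoIB code). The assumption being modelled is that `port_num` lives in a union and only means something once the event type is known to be port-related, so the type check has to come before the port comparison.

```c
#include <stdbool.h>

/* Hypothetical, simplified event types; not the kernel's enum. */
enum sketch_event_type {
	SKETCH_EVENT_QP_FATAL,		/* QP-related: element.qp_num is valid */
	SKETCH_EVENT_PORT_ACTIVE,	/* port-related: element.port_num is valid */
	SKETCH_EVENT_PORT_ERR,
	SKETCH_EVENT_LID_CHANGE,
	SKETCH_EVENT_PKEY_CHANGE,
	SKETCH_EVENT_SM_CHANGE,
	SKETCH_EVENT_CLIENT_REREGISTER
};

/* Hypothetical event record: which union member is valid depends on 'event'. */
struct sketch_event {
	enum sketch_event_type event;
	union {
		unsigned int qp_num;
		unsigned char port_num;
	} element;
};

/* True only for the event types that carry a port number. */
static bool is_port_event(enum sketch_event_type type)
{
	switch (type) {
	case SKETCH_EVENT_PORT_ACTIVE:
	case SKETCH_EVENT_PORT_ERR:
	case SKETCH_EVENT_LID_CHANGE:
	case SKETCH_EVENT_PKEY_CHANGE:
	case SKETCH_EVENT_SM_CHANGE:
	case SKETCH_EVENT_CLIENT_REREGISTER:
		return true;
	default:
		return false;
	}
}

/*
 * Check the event type first; && short-circuits, so port_num is read
 * only when the union is known to hold a port number.
 */
static bool should_flush(const struct sketch_event *record, unsigned char port)
{
	return is_port_event(record->event) &&
	       record->element.port_num == port;
}
```

With interfaces on ports 1 and 2, a port error carrying `port_num == 1` makes `should_flush()` true only for the port-1 interface, and a QP event never reaches the `port_num` comparison.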




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Vladimir Sokolovsky
On Tue, 2007-02-27 at 08:45 -0800, Sean Hefty wrote:
 Please send patches that will be added to kernel_patches/fixes.
 
 Please update your git tree from
 git://git.openfabrics.org/~vlad/ofed_1_2/.git  ofed_1_2
 
 You want me to create a patch that adds a file that contains the actual 
 patches?
Yes, actual patches should be created under kernel_patches/fixes.

Please update your git tree because the following patch fails:

From 2e7e33936de5f92656c0565ce88f97e796367dae Mon Sep 17 00:00:00 2001
From: Sean Hefty [EMAIL PROTECTED]
Date: Fri, 23 Feb 2007 12:35:43 -0800
Subject: [PATCH] rdma_cm: request reversible paths only

The rdma_cm requires that path records be reversible.  Set the reversible
bit when issuing a path record query.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]

diff --git a/drivers/infiniband/core/cma.c 
b/drivers/infiniband/core/cma.c
index 9e0ab04..171cce9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1396,11 +1396,13 @@ static int cma_query_ib_route(struct 
rdma_id_private *id_priv, int timeout_ms,
ib_addr_get_dgid(addr, &path_rec.dgid);
path_rec.pkey = cpu_to_be16(ib_addr_get_pkey(addr));
path_rec.numb_path = 1;
+   path_rec.reversible = 1;

id_priv->query_id = ib_sa_path_rec_get(&sa_client,
id_priv->id.device,
id_priv->id.port_num, &path_rec,
IB_SA_PATH_REC_DGID |
IB_SA_PATH_REC_SGID |
-   IB_SA_PATH_REC_PKEY |
IB_SA_PATH_REC_NUMB_PATH,
+   IB_SA_PATH_REC_PKEY |
IB_SA_PATH_REC_NUMB_PATH |
+   IB_SA_PATH_REC_REVERSIBLE,
timeout_ms, GFP_KERNEL,
cma_query_handler, work,
&id_priv->query);


 
 Why not apply the patches directly?
 
To be consistent with 2.6.20 kernel.


-- 
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Sean Hefty [EMAIL PROTECTED]:
 Subject: Re: [PATCH] for OFED 1.2
 
 Please send patches that will be added to kernel_patches/fixes.
 
 Please update your git tree from
 git://git.openfabrics.org/~vlad/ofed_1_2/.git  ofed_1_2
 
 You want me to create a patch that adds a file that contains the actual 
 patches?
 
 Why not apply the patches directly?

That's the ofed structure, this was discussed multiple times already.
The point is to keep all changes to upstream components separate,
to make updating to upstream kernel trivial in the future.

Worked quite well for OFED 1.1 -> 1.2 transition.

-- 
MST




Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey

2007-02-27 Thread Moni Levy
Sean,
On 2/26/07, Sean Hefty [EMAIL PROTECTED] wrote:
 I think the following patch would make ipoib spec compliant.
 ib_find_cached_pkey is called by ib_cm, rdma_cm, ib_srp, and ib_ipoib.
 I'm not certain what this change would do to SRP, but the ib_cm and
 rdma_cm look okay, given that non-reversible paths aren't supported
 yet anyway.

Sorry for jumping into that thread, but although this patch will make
things more spec compliant, it will break functionality we depend on.
I suggest that we first find an alternate way to enable usage of
partial partition membership before disabling that functionality at
all.

--Moni

 --

 ib_find_cached_pkey masks off the upper-bit of the PKey when searching
 for a match.  The upper bit indicates partial or full membership.  Ignoring
 the upper bit can result in a full membership PKey matching with a partial
 membership PKey.  For ipoib, this can result in joining a multicast group
 that disallows communication between all members.

 Signed-off-by: Sean Hefty [EMAIL PROTECTED]
 ---
  drivers/infiniband/core/cache.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
 index 558c9a0..6f366c3 100644
 --- a/drivers/infiniband/core/cache.c
 +++ b/drivers/infiniband/core/cache.c
 @@ -179,7 +179,7 @@ int ib_find_cached_pkey(struct ib_device *device,
 *index = -1;

 for (i = 0; i < cache->table_len; ++i)
 -   if ((cache->table[i] & 0x7fff) == (pkey & 0x7fff)) {
 +   if (cache->table[i] == pkey) {
 *index = i;
 ret = 0;
 break;
 --
 1.4.4.3
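
To make the membership-bit issue concrete, here is a minimal standalone sketch contrasting the two comparisons in the patch above. It is not the kernel code: the function names `find_pkey_masked` and `find_pkey_exact`, the flat `uint16_t` table, and the sample values are assumptions made for illustration. The only facts taken from the thread are the `0x7fff` mask and that the upper bit distinguishes partial from full membership.

```c
#include <stddef.h>
#include <stdint.h>

/* Low 15 bits identify the partition; the top bit is the membership bit. */
#define SKETCH_PKEY_BASE_MASK 0x7fff

/*
 * Old behaviour: ignore the membership bit, so a full-membership PKey
 * can match a partial-membership table entry (and vice versa).
 * Returns the table index, or -1 if nothing matches.
 */
static int find_pkey_masked(const uint16_t *table, size_t len, uint16_t pkey)
{
	for (size_t i = 0; i < len; ++i)
		if ((table[i] & SKETCH_PKEY_BASE_MASK) ==
		    (pkey & SKETCH_PKEY_BASE_MASK))
			return (int)i;
	return -1;
}

/*
 * Patched behaviour: all 16 bits must match, membership bit included.
 * Returns the table index, or -1 if nothing matches.
 */
static int find_pkey_exact(const uint16_t *table, size_t len, uint16_t pkey)
{
	for (size_t i = 0; i < len; ++i)
		if (table[i] == pkey)
			return (int)i;
	return -1;
}
```

For a table holding `0x7fff` (partial membership) and `0x8001`, a lookup of `0xffff` succeeds at index 0 with the masked search and fails with the exact one. That is both the mismatch the patch description warns about and the partial-membership lookup the reply at the top of this message says it relies on.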









Re: [openib-general] [RFC] IB/ipoib: Asynchronous events delivered without port parameter.

2007-02-27 Thread Moni Levy
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
   On a second thought based on the fact that on a two port HCA we'll
   have a 50% miss on the events being delivered, I would move the new
   condition to be evaluated first. I apologize if this is too much of
   micro optimization. What do you think ?

 That wouldn't really be correct since element.port_num isn't valid
 unless we already know it's a port-related event.

You're perfectly right, sorry.


 And it's not worth worrying about this since it's not remotely a hot path.

Ok.

--Moni


  - R.





Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
Yes, actual patches should be created under kernel_patches/fixes.

Please update your git tree because the following patch fails:

Can you explain how the patch fails?  I don't see how putting the patch into a
file helps.

 Why not apply the patches directly?

To be consistent with 2.6.20 kernel.

You can check out stock 2.6.20 using a tag.  Why maintain the ofed code in git
if you don't use it to track patches?

- Sean




Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey

2007-02-27 Thread Sean Hefty
 Sorry for jumping into that thread, but although this patch will make
 things more spec compliant, it will break functionality we depend on.
 I suggest that we first find an alternate way to enable usage of
 partial partition membership before disabling that functionality at
 all.

Can you clarify the functionality you depend on?  Are you reliant on ipoib 
being 
able to join a multicast group from partial partition membership?  If so, do 
all 
SA's and switches support this?

- Sean




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Sean Hefty [EMAIL PROTECTED]:
 Subject: Re: [PATCH] for OFED 1.2
 
 Yes, actual patches should be created under kernel_patches/fixes.
 
 Please update your git tree because the following patch fails:
 
 Can you explain how the patch fails?  I don't see how putting the patch into a
 file helps.

Try applying it?

  Why not apply the patches directly?
 
 To be consistent with 2.6.20 kernel.
 
 You can check out stock 2.6.20 using a tag.  Why maintain the ofed code in git
 if you don't use it to track patches?

Basically so that conflicts in future merges from upstream are easy to resolve.
If you like, let's reopen this for 1.3. We are after freeze in OFED 1.2.

-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Steve Wise
On Tue, 2007-02-27 at 18:55 +0200, Michael S. Tsirkin wrote:
  Quoting Sean Hefty [EMAIL PROTECTED]:
  Subject: Re: [PATCH] for OFED 1.2
  
  Please send patches that will be added to kernel_patches/fixes.
  
  Please update your git tree from
  git://git.openfabrics.org/~vlad/ofed_1_2/.git  ofed_1_2
  
  You want me to create a patch that adds a file that contains the actual 
  patches?
  
  Why not apply the patches directly?
 
 That's the ofed structure, this was discussed multiple times already.
 The point is to keep all changes to upstream components separate,
 to make updating to upstream kernel trivial in the future.
 
 Worked quite well for OFED 1.1 -> 1.2 transition.
 

Having these patches as files is painful for every developer because
they cannot create a patch against ofed_1_2/drivers/infiniband/* nor the
kernel.org upstream tree.  They need to apply all the current patches
and then create a patch on top of that. Or hope the patch applies
fuzzily.  

I think with stacked git or just git and rebasing at key times, you
could keep an ofed_1_2 tree that folks can easily apply patches to...

It's too late to change this for 1.2, but you might want to reconsider
the design for 1.3.


my 2 cents...













Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey

2007-02-27 Thread Moni Levy
On 2/27/07, Sean Hefty [EMAIL PROTECTED] wrote:
  Sorry for jumping into that thread, but although this patch will make
  things more spec compliant, it will break functionality we depend on.
  I suggest that we first find an alternate way to enable usage of
  partial partition membership before disabling that functionality at
  all.

 Can you clarify the functionality you depend on?  Are you reliant on ipoib 
 being
 able to join a multicast group from partial partition membership?

Exactly.

 If so, do all SA's and switches support this?

I can't commit on all the SA's and switches.

-- Moni


 - Sean







Re: [openib-general] [RFC] [PATCH] ib_cache: do not mask upper bit when searching for a pkey

2007-02-27 Thread Hal Rosenstock
On Tue, 2007-02-27 at 12:06, Sean Hefty wrote:
  Sorry for jumping into that thread, but although this patch will make
  things more spec compliant, it will break functionality we depend on.
  I suggest that we first find an alternate way to enable usage of
  partial partition membership before disabling that functionality at
  all.
 
 Can you clarify the functionality you depend on?  Are you reliant on ipoib 
 being 
 able to join a multicast group from partial partition membership?  If so, do 
 all 
 SA's and switches support this?

I'm not sure who can speak for all SAs nor necessarily would the vendor
SAs indicate this. From a quick code inspection of OpenSM, it appears to
not enforce the compliance properly.

Switches do whatever they are told to do by the SM.

-- Hal

 - Sean
 
 





Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Steve Wise [EMAIL PROTECTED]:
 Subject: Re: [openib-general] [PATCH] for OFED 1.2
 
 On Tue, 2007-02-27 at 18:55 +0200, Michael S. Tsirkin wrote:
   Quoting Sean Hefty [EMAIL PROTECTED]:
   Subject: Re: [PATCH] for OFED 1.2
   
   Please send patches that will be added to kernel_patches/fixes.
   
   Please update your git tree from
   git://git.openfabrics.org/~vlad/ofed_1_2/.git  ofed_1_2
   
   You want me to create a patch that adds a file that contains the actual 
   patches?
   
   Why not apply the patches directly?
  
  That's the ofed structure, this was discussed multiple times already.
  The point is to keep all changes to upstream components separate,
  to make updating to upstream kernel trivial in the future.
  
  Worked quite well for OFED 1.1 -> 1.2 transition.
  
 
 Having these patches as files is painful for every developer because
 they cannot create a patch against ofed_1_2/drivers/infiniband/* nor the
 kernel.org upstream tree.

Did you try using quilt which makes managing patch stacks quite easy?
If you have quilt installed, OFED scripts actually use it
to apply patches, so things are easy.

 They need to apply all the current patches
 and then create a patch on top of that. Or hope the patch applies
 fuzzily.  

One point I can't stress enough: whatever way you create a patch,
developers are expected to build and test it in OFED environment
before posting.

 I think with stacked git or just git and rebasing at key times, you
 could keep an ofed_1_2 tree that folks can easily apply patches to...
 
 It's too late to change this for 1.2, but you might want to reconsider
 the design for 1.3.

Well, I experimented with git rebase and it is unfortunately still
fragile at this point.

I agree using stacked git might be a good idea, I just did not
have the chance to experiment with it enough. I had an impression
that publishing stg managed branch creates problems for whoever
attempts to track it, but I might be wrong.


-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
I think with stacked git or just git and rebasing at key times, you
could keep an ofed_1_2 tree that folks can easily apply patches to...

It's too late to change this for 1.2, but you might want to reconsider
the design for 1.3.

Can't we just create a new branch (ofed_1_2_patched) with these patches already
applied and in the correct order?  

Maybe I'm just not understanding the work flow here...

- Sean




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Jeff Squyres
It would be great if all of this knowledge is posted to the wiki to  
avoid repeating this conversation in the future (or one of countless  
variations of this conversation).  For example, I admit to not paying  
close attention to many of the threads on this list, but this was the  
first time I'd heard of quilt.

Specifically: if there are tools and methods that are helpful for OFA/ 
OFED development, they should be detailed on the wiki.  The wiki is  
where all permanent knowledge should be posted.

This is just my $0.01...





-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Sean Hefty [EMAIL PROTECTED]:
 Subject: Re: [PATCH] for OFED 1.2
 
 I think with stacked git or just git and rebasing at key times, you
 could keep an ofed_1_2 tree that folks can easily apply patches to...
 
 It's too late to change this for 1.2, but you might want to reconsider
 the design for 1.3.
 
 Can't we just create a new branch (ofed_1_2_patched) with these patches 
 already
 applied and in the correct order?  

Then what do we do when we want to update to a new upstream? Throw this branch away?
As it is, I just pull then build and remove patches that conflict.

By the way, there are backport patches, etc - it is still incorrect
to say that you would be able to generate a patch out of git
and know it's a good one without test-build.

 Maybe I'm just not understanding the work flow here...

Sean, please install quilt and try using it for working with the system.
Adding a new patch is usually done this way:
quilt new patch
quilt add files
edit
quilt refresh

cp patches/patch kernel_patches/fixes/
git add kernel_patches/fixes/patch
git commit kernel_patches/fixes/patch


-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
Lots of stuff *is* in wiki already - did you look at pages Vlad created?
Things can always be improved, you can add stuff too.



-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 This is just my $0.01...

Thanks for the suggestions, but what does $0.01 buy one in US today?

-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Steve Wise
On Tue, 2007-02-27 at 19:44 +0200, Michael S. Tsirkin wrote:
  Quoting Sean Hefty [EMAIL PROTECTED]:
  Subject: Re: [PATCH] for OFED 1.2
  
  I think with stacked git or just git and rebasing at key times, you
  could keep an ofed_1_2 tree that folks can easily apply patches to...
  
  It's too late to change this for 1.2, but you might want to reconsider
  the design for 1.3.
  
  Can't we just create a new branch (ofed_1_2_patched) with these patches 
  already
  applied and in the correct order?  
 
 Then what we do when we want to update to new upstream? Throw this branch 
 away?
 As it is, I just pull then build and remove patches that conflict.
 
 By the way, there are backport patches, etc - it is still incorrect
 to say that you would be able to generate a patch out of git
 and know it's a good one without test-build.
 
  Maybe I'm just not understanding the work flow here...
 
 Sean, please install quilt and try using it for working with the system.
 Adding new patch is usually done in this way
 quilt new patch
 quilt add files
 edit
 quilt refresh
 
 cp patches/patch kernel_patches/fixes/
 git add kernel_patches/fixes/patch
 git commit kernel_patches/fixes/patch

NOTE: The key to the above process is the assumption that the developer
maintains _all_ of the existing patches from kernel_patches/ on top of
the ofed_1_2 tree using quilt or stg.  Otherwise quilt/stg isn't buying
you anything.

And this doesn't take into account backports.

Regardless, you need to build, install and test any ofed patch on an
ofed system, so you're gonna have extra work:

1) create ofed-specific patch
   build/test it on ofed
   post it to openib-general/ewg

2) create kernel.org patch
   build/test it on kernel.org
   post it to openib-general/lkml/netdev


My .27 cents...











Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Jeff Squyres
On Feb 27, 2007, at 12:45 PM, Michael S. Tsirkin wrote:

 Lots of stuff *is* in wiki already - did you look at pages Vlad  
 created?

A search for quilt on the wiki turns up nothing (I checked before I  
posted :-) ).

And yes, I have [thoroughly] read the pages Vlad created.  But the  
very fact that this conversation is occurring is because either the  
information is not on the wiki or what is on the wiki is not clear.   
Otherwise, I suspect that you simply would have pointed Steve to the  
wiki and said Please read the fine manual at http://;.

Don't get me wrong; what has already been posted is great.  I'm just  
saying: keep it coming!  The wiki should be a living document that  
changes as our procedures and collective wisdom changes.  It saves us  
*all* time over the long run.  A one-time dump of information is not  
nearly as useful as an ever-updated document.

 Things can always be improved, you can add stuff too.

https://wiki.openfabrics.org/tiki-lastchanges.php?days=31 shows that  
only Tziporet and myself have changed the OFED portion of the wiki  
over the past month.

So -- *you* can add stuff to the wiki, too.  :-)

 This is just my $0.01...

It buys very little, if anything.  In fact, a whole $0.02 also buys  
very little, if anything.  So take my comments for what they're worth.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Steve Wise [EMAIL PROTECTED]:
 Subject: Re: [PATCH] for OFED 1.2
 
 On Tue, 2007-02-27 at 19:44 +0200, Michael S. Tsirkin wrote:
   Quoting Sean Hefty [EMAIL PROTECTED]:
   Subject: Re: [PATCH] for OFED 1.2
   
   I think with stacked git or just git and rebasing at key times, you
   could keep an ofed_1_2 tree that folks can easily apply patches to...
   
   It's too late to change this for 1.2, but you might want to reconsider
   the design for 1.3.
   
   Can't we just create a new branch (ofed_1_2_patched) with these patches
   already applied and in the correct order?
  
  Then what do we do when we want to update to new upstream? Throw this
  branch away?
  As it is, I just pull then build and remove patches that conflict.
  
  By the way, there are backport patches, etc - it is still incorrect
  to say that you would be able to generate a patch out of git
  and know it's a good one without test-build.
  
   Maybe I'm just not understanding the work flow here...
  
  Sean, please install quilt and try using it for working with the system.
  Adding new patch is usually done in this way
  quilt new patch
  quilt add files
  edit
  quilt refresh
  
  cp patches/patch kernel_patches/fixes/
  git add kernel_patches/fixes/patch
  git commit kernel_patches/fixes/patch
 
 NOTE: The key to the above process is the assumption that the developer
 maintains _all_ of the existing patches from kernel_patches/ on top of
 the ofed_1_2 tree using quilt or stg.  Otherwise quilt/stg isn't buying
 you anything.

OFED will do this automatically.

 And this doesn't take into account backports.

The process works with backport patches too: you just have to do this

 quilt pop -a
 
   quilt new patch
   quilt add files
   edit
   quilt refresh
 
 quilt push -a
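
The effect of popping everything before creating the patch can be sketched with a toy model of a quilt series. This is only an illustration, not quilt itself: the class, its method names and the patch file names are hypothetical, and it assumes quilt's documented behaviour that a new patch is inserted right after the topmost applied patch.

```python
class PatchStack:
    """Toy model of a quilt series: an ordered list plus a count of applied patches."""

    def __init__(self, series):
        self.series = list(series)  # order in which the patches must apply
        self.applied = 0            # the first `applied` entries are applied

    def pop_all(self):
        """Models `quilt pop -a`: nothing stays applied."""
        self.applied = 0

    def push_all(self):
        """Models `quilt push -a`: the whole series is applied."""
        self.applied = len(self.series)

    def new(self, name):
        """Models `quilt new`: insert after the topmost applied patch."""
        self.series.insert(self.applied, name)
        self.applied += 1

    def top(self):
        """Name of the topmost applied patch, or None if nothing is applied."""
        return self.series[self.applied - 1] if self.applied else None


# Hypothetical patch names, for illustration only.
stack = PatchStack(["fixes/a.patch", "backport/2.6.16/b.patch"])
stack.push_all()                  # state after configure applied everything
stack.pop_all()                   # quilt pop -a
stack.new("fixes/my_fix.patch")   # quilt new / add / edit / refresh
stack.push_all()                  # quilt push -a
print(stack.series)
```

In this model the new patch ends up at the bottom of the series, so the existing fixes and backport patches are re-applied on top of it by the final push.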



-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
  This is just my $0.01...
 
 It buys very little, if anything.  In fact, a whole $0.02 also buys  
 very little, if anything.  So take my comments for what they're worth.

Oh, good, I thought deflation is getting out of hand ...

-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
  Lots of stuff *is* in wiki already - did you look at pages Vlad
  created?
 
 A search for quilt on the wiki turns up nothing (I checked before I  
 posted :-) ).
 
 And yes, I have [thoroughly] read the pages Vlad created.  But the  
 very fact that this conversation is occurring is because either the  
 information is not on the wiki or what is on the wiki is not clear.   
 Otherwise, I suspect that you simply would have pointed Steve to the  
 wiki and said "Please read the fine manual at http://".

You are right about that, I don't dispute it.
Thanks for the suggestion, I'll try to find the time to add this to the wiki.

-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Steve Wise
   
   Sean, please install quilt and try using it for working with the system.
   Adding new patch is usually done in this way
   quilt new patch
   quilt add files
   edit
   quilt refresh
   
   cp patches/patch kernel_patches/fixes/
   git add kernel_patches/fixes/patch
   git commit kernel_patches/fixes/patch
  
  NOTE: The key to the above process is the assumption that the developer
  maintains _all_ of the existing patches from kernel_patches/ on top of
  the ofed_1_2 tree using quilt or stg.  Otherwise quilt/stg isn't buying
  you anything.
 
 OFED will do this automatically.
 

uh, can you explain this?  Given I have a freshly cloned ofed_1_2 git
tree, and I want to change cma.c (a good one cuz there are patches).
What do I do?  There's no quilt stack at all at this point.  Right?  


  And this doesn't take into account backports.
 
 The process works with backport patches too: you just have to do this
 
  quilt pop -a
  
quilt new patch
quilt add files
edit
quilt refresh
  
  quilt push -a


But you cannot keep a stack for more than one backport pushed, right?
So you still need to be slapping the stacks of patches around for each
backport.  

Or maybe I'm confused?








Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
But you cannot keep a stack for more than one backport pushed, right?
So you still need to be slapping the stacks of patches around for each
backport.

Why not have separate branches for each kernel too?




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Steve Wise [EMAIL PROTECTED]:
 Subject: Re: [PATCH] for OFED 1.2
 

Sean, please install quilt and try using it for working with the system.
Adding new patch is usually done in this way
quilt new patch
quilt add files
edit
quilt refresh

cp patches/patch kernel_patches/fixes/
git add kernel_patches/fixes/patch
git commit kernel_patches/fixes/patch
   
   NOTE: The key to the above process is the assumption that the developer
   maintains _all_ of the existing patches from kernel_patches/ on top of
   the ofed_1_2 tree using quilt or stg.  Otherwise quilt/stg isn't buying
   you anything.
  
  OFED will do this automatically.
  
 
 uh, can you explain this?  Given I have a freshly cloned ofed_1_2 git
 tree, and I want to change cma.c (a good one cuz there are patches).
 What do I do?  There's no quilt stack at all at this point.  Right?  

Try running the configure script.
After this, 'quilt applied' will show which patches are applied.

   And this doesn't take into account backports.
  
  The process works with backport patches too: you just have to do this
  
   quilt pop -a
   
 quilt new patch
 quilt add files
 edit
 quilt refresh
   
   quilt push -a
 
 
 But you cannot keep a stack for more than one backport pushed, right?
 So you still need to be slapping the stacks of patches around for each
 backport.  
 
 Or maybe I'm confused?

Yes.

Fortunately it's not too hard: you can do
quilt pop -a
and re-run configure for another kernel.

Of course for testing the patch, it is easier to commit the change in your tree
and then to use openfabrics cross-build functionality that will clone this
tree and build for multiple arches/kernels.


-- 
MST




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Sean Hefty [EMAIL PROTECTED]:
 Subject: RE: [PATCH] for OFED 1.2
 
 But you cannot keep a stack for more than one backport pushed, right?
 So you still need to be slapping the stacks of patches around for each
 backport.
 
 Why not have separate branches for each kernel too?

I think it'll be much more work to maintain all these branches.
And again, there will be conflicts, and it's too easy to get confused when
resolving a conflict.

With patches we have scripts to automate this.


-- 
MST




[openib-general] remove www.openfabrics.org SVN links..

2007-02-27 Thread Troy Benjegerdes
Can someone please update the main www.openfabrics.org web page to
remove all references to subversion, and link to a wiki page on how to
get the latest source?

Thanks.




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sean Hefty
 I think it'll be much more work to maintain all these branches.
 And again, there will be conflicts, and it's too easy to get confused when
 resolving a conflict.

Storing patches in a directory seems confusing to me.  They must be applied in
a specific order for everything to work, and that knowledge is not captured.
Conflicts need to be resolved anyway.

If someone wants to use scripts to make their life easier, that's fine, but
they shouldn't be a necessity for checking out code and creating patches using
git.  For OFED they are.




Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Sasha Khapyorsky
On 19:44 Tue 27 Feb , Michael S. Tsirkin wrote:
  Quoting Sean Hefty [EMAIL PROTECTED]:
  Subject: Re: [PATCH] for OFED 1.2
  
  I think with stacked git or just git and rebasing at key times, you
  could keep an ofed_1_2 tree that folks can easily apply patches to...
  
  It's too late to change this for 1.2, but you might want to reconsider
  the design for 1.3.
  
  Can't we just create a new branch (ofed_1_2_patched) with these patches
  already applied and in the correct order?
 
 Then what do we do when we want to update to new upstream? Throw this
 branch away?
 As it is, I just pull then build and remove patches that conflict.

You can save this branch as branch-name-upstream-name (or better)
and rebase branch-name to the new upstream.

 By the way, there are backport patches, etc - it is still incorrect
 to say that you would be able to generate a patch out of git
 and know it's a good one without test-build.

In a similar way you can track backport patch sets as branches.

  Maybe I'm just not understanding the work flow here...
 
 Sean, please install quilt and try using it for working with the system.
 Adding new patch is usually done in this way
 quilt new patch
 quilt add files
 edit
 quilt refresh
 
 cp patches/patch kernel_patches/fixes/
 git add kernel_patches/fixes/patch
 git commit kernel_patches/fixes/patch

It looks strange to me to track patches against patches...

Sasha




[openib-general] ofed_1_2_scripts for bug 372

2007-02-27 Thread Shaun Rowland

Hi Vladimir. I've attached a small patch to the ofed_1_2_scripts
build.sh file for the mvapich2() function. This fixes bug 372 where the
F90 compiler was not being set properly for the GNU compiler case and
other possible compilers in the path were being found. This patch is
against the latest ofed_1_2_scripts git.
--
Shaun Rowland   [EMAIL PROTECTED]
http://www.cse.ohio-state.edu/~rowland/
diff --git a/build.sh b/build.sh
index ae5ea1e..86894be 100755
--- a/build.sh
+++ b/build.sh
@@ -528,9 +528,9 @@ mvapich2()
 MVAPICH2_COMP_ENV=CC=gcc CXX=g++
 
 if [ $is_gfortran -eq 1 ]; then
-MVAPICH2_COMP_ENV=$MVAPICH2_COMP_ENV F77=gfortran
+MVAPICH2_COMP_ENV=$MVAPICH2_COMP_ENV F77=gfortran F90=gfortran
 elif [ $is_gcc_g77 -eq 1 ]; then
-MVAPICH2_COMP_ENV=$MVAPICH2_COMP_ENV F77=g77
+MVAPICH2_COMP_ENV=$MVAPICH2_COMP_ENV F77=g77 F90=g77
 fi
 ;;
 pathscale)

Re: [openib-general] [PATCH] for OFED 1.2

2007-02-27 Thread Michael S. Tsirkin
 Quoting Sasha Khapyorsky [EMAIL PROTECTED]:
 Subject: Re: [PATCH] for OFED 1.2
 
 On 19:44 Tue 27 Feb , Michael S. Tsirkin wrote:
   Quoting Sean Hefty [EMAIL PROTECTED]:
   Subject: Re: [PATCH] for OFED 1.2
   
   I think with stacked git or just git and rebasing at key times, you
   could keep an ofed_1_2 tree that folks can easily apply patches to...
   
   It's too late to change this for 1.2, but you might want to reconsider
   the design for 1.3.
   
   Can't we just create a new branch (ofed_1_2_patched) with these patches
   already applied and in the correct order?
  
  Then what do we do when we want to update to new upstream? Throw this
  branch away?
  As it is, I just pull then build and remove patches that conflict.
 
 You can save this branch as branch-name-upstream-name (or better)
 and rebase branch-name to the new upstream.

rebase does not seem to be too robust when run on such a large repository as the
linux kernel.  Maybe stacked git will work.

  By the way, there are backport patches, etc - it is still incorrect
  to say that you would be able to generate a patch out of git
  and know it's a good one without test-build.
 
 In a similar way you can track backport patch sets as branches.

At the moment it seems like a lot of work. Again, maybe stg makes it easy,
I know it's hard with plain git.

And I think lots of people (including me) will be confused if we have a ton of 
branches.

   Maybe I'm just not understanding the work flow here...
  
  Sean, please install quilt and try using it for working with the system.
  Adding new patch is usually done in this way
  quilt new patch
  quilt add files
  edit
  quilt refresh
  
  cp patches/patch kernel_patches/fixes/
  git add kernel_patches/fixes/patch
  git commit kernel_patches/fixes/patch
 
 It looks strange to me to track patches against patches...

One gets used to it :)
Seriously, we have these patches, and we want to version them together
with source they are intended to apply to.

-- 
MST




Re: [openib-general] failure to create an FMR mapping 1K pages on memfree

2007-02-27 Thread Or Gerlitz
On 2/27/07, Roland Dreier [EMAIL PROTECTED] wrote:
 Is it really returning -ENOMEM?  It seems much more likely that you
 are hitting the code

 /* For Arbel, all MTTs must fit in the same page. */
 if (mthca_is_memfree(dev) &&
     mr->attr.max_pages * sizeof *mr->mem.arbel.mtts > PAGE_SIZE)
         return -EINVAL;

 I guess you could call this limit a driver design issue.

Indeed, sorry for the inaccurate description: mthca_fmr_alloc returns
-EINVAL and the fmr pool code returns -ENOMEM. Thanks for the
clarification.
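
To see why a 1K-page FMR trips the check quoted above, the arithmetic can be sketched as follows. This is a rough model, not driver code: it assumes 4 KiB pages and 8-byte MTT entries, and the function name is hypothetical.

```python
PAGE_SIZE = 4096      # assumption: 4 KiB pages
MTT_ENTRY_SIZE = 8    # assumption: each MTT entry is a 64-bit value


def fmr_alloc_status(max_pages, memfree=True):
    """Mirror the quoted check: on mem-free HCAs all MTTs must fit in one page."""
    if memfree and max_pages * MTT_ENTRY_SIZE > PAGE_SIZE:
        return "-EINVAL"
    return "ok"


# Largest max_pages that still fits in a single page under these assumptions.
max_ok = PAGE_SIZE // MTT_ENTRY_SIZE

print(max_ok, fmr_alloc_status(max_ok), fmr_alloc_status(1024))
```

Under these assumptions the limit works out to 512 pages per FMR on a mem-free device, so a request for 1024 pages is rejected.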

Or.




[openib-general] Fwd: [ANNOUNCE] GIT 1.5.0.2

2007-02-27 Thread Michael S. Tsirkin
FYI.

- Forwarded message from Junio C Hamano [EMAIL PROTECTED] -

Subject: [ANNOUNCE] GIT 1.5.0.2
Date: Tue, 27 Feb 2007 10:58:22 +0200
In-Reply-To: [EMAIL PROTECTED] (Junio C. Hamano's message of Sun, 18 Feb 2007
18:07:42 -0800)
References: [EMAIL PROTECTED]
From: Junio C Hamano [EMAIL PROTECTED]

The latest maintenance release GIT 1.5.0.2 is available at the
usual places:

  http://www.kernel.org/pub/software/scm/git/

  git-1.5.0.2.tar.{gz,bz2}  (tarball)
  git-htmldocs-1.5.0.2.tar.{gz,bz2} (preformatted docs)
  git-manpages-1.5.0.2.tar.{gz,bz2} (preformatted docs)
  RPMS/$arch/git-*-1.5.0.2-1.$arch.rpm  (RPM)


GIT v1.5.0.2 Release Notes
==========================

Fixes since v1.5.0.1
--------------------

* Bugfixes

  - Automated merge conflict handling when changes to symbolic
    links conflicted were completely broken.  The merge-resolve
    strategy created a regular file with conflict markers in it
    in place of the symbolic link.  The default strategy,
    merge-recursive was even more broken.  It removed the path
    that was pointed at by the symbolic link.  Both of these
    problems have been fixed.

  - 'git diff maint master next' did not correctly give combined
    diff across three trees.

  - 'git fast-import' portability fix for Solaris.

  - 'git show-ref --verify' without arguments did not error out
    but segfaulted.

  - 'git diff :tracked-file `pwd`/an-untracked-file' gave an extra
    slashes after a/ and b/.

  - 'git format-patch' produced too long filenames if the commit
    message had too long line at the beginning.

  - Running 'make all' and then without changing anything
    running 'make install' still rebuilt some files.  This
    was inconvenient when building as yourself and then
    installing as root (especially problematic when the source
    directory is on NFS and root is mapped to nobody).

  - 'git-rerere' failed to deal with two unconflicted paths that
    sorted next to each other.

  - 'git-rerere' attempted to open(2) a symlink and failed if
    there was a conflict.  Since a conflicting change to a
    symlink would not benefit from rerere anyway, the command
    now ignores conflicting changes to symlinks.

  - 'git-repack' did not like to pass more than 64 arguments
    internally to underlying 'rev-list' logic, which made it
    impossible to repack after accumulating many (small) packs
    in the repository.

  - 'git-diff' to review the combined diff during a conflicted
    merge were not reading the working tree version correctly
    when changes to a symbolic link conflicted.  It should have
    read the data using readlink(2) but read from the regular
    file the symbolic link pointed at.

  - 'git-remote' did not like period in a remote's name.

* Documentation updates

  - added and clarified core.bare, core.legacyheaders configurations.

  - updated git-clone --depth documentation.


* Assorted git-gui fixes.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

- End forwarded message -

-- 
MST




Re: [openib-general] [RFC/BUG] DMA vs. CQ race

2007-02-27 Thread Roland Dreier
On our cell blade + PCI-e Mellanox.
  
  I don't see anything in arch/powerpc that looks like
  dma_alloc_coherent() will do anything other than allocate some memory
  and map it with DMA_BIDIRECTIONAL.  So how does this altix fix help in
  your situation?  Am I misreading the Cell IOMMU code?

Shirley, can you clarify why doing dma_alloc_coherent() in the kernel
helps on your Cell blade?  It really seems that dma_alloc_coherent()
just allocates some memory and then does dma_map(DMA_BIDIRECTIONAL),
which would be exactly the same as allocating the CQ buffer in
userspace and using ib_umem_get() to map it into the kernel.

I'm looking at a possibly cleaner solution to the Altix issue, so I
would like to make sure it fixes whatever the bug on Cell is as well.
So any details you can provide about the problem you see on Cell would
help a lot.

Thanks...




Re: [openib-general] Port error rate detection

2007-02-27 Thread Troy Benjegerdes
On Mon, Feb 19, 2007 at 03:53:36PM -0500, Steven Carter wrote:
 I have a Nagios module that alerts on connectivity, port errors, 
 speed/width problems.  I would like to give it the ability to change the 
 severity of the alert depending on whether errors are just present or if 
 they are increasing faster than a specified rate.  The intent is to 
 equip the module to keep the state of the last query and possibly 
 history, but I wanted to make sure that I was not re-inventing the wheel 
 first.  Is there an attribute or utility that I am overlooking that will 
 help me do this?

One other thing you might want to take a look at is the Fountain/Goanna
node monitoring setup... It's not really anything like the proposed
performance manager, but it might get you what you need. (And we'd like
some feedback on what it should do differently ;)

http://www.scl.ameslab.gov/Projects/Monitor/
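
For the rate-based severity idea in the quoted question, a minimal sketch might look like the following. It is not taken from any existing Nagios plugin: the Sample type, the classify function, the severity strings and the threshold are all hypothetical, and treating a decreasing counter as a reset is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since some epoch
    errors: int       # cumulative port error counter at that time


def classify(prev, cur, max_rate):
    """Return a Nagios-style severity from the previous and current samples."""
    if cur.errors == 0:
        return "OK"
    if prev is None or cur.timestamp <= prev.timestamp:
        # Errors are present but there is no usable history to compute a rate.
        return "WARNING"
    delta = cur.errors - prev.errors
    if delta < 0:
        # Assumption: the counter was reset between the two queries.
        delta = cur.errors
    rate = delta / (cur.timestamp - prev.timestamp)  # errors per second
    return "CRITICAL" if rate > max_rate else "WARNING"


print(classify(Sample(0.0, 10), Sample(60.0, 100), max_rate=0.5))
```

The module would persist the last Sample per port between runs; errors that are present but not growing stay at WARNING, and only growth above max_rate escalates.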




Re: [openib-general] [RFC/BUG] DMA vs. CQ race

2007-02-27 Thread Shirley Ma





Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 01:40:36 PM:

 Shirley, can you clarify why doing dma_alloc_coherent() in the kernel
 helps on your Cell blade?  It really seems that dma_alloc_coherent()
 just allocates some memory and then does dma_map(DMA_BIDIRECTIONAL),
 which would be exactly the same as allocating the CQ buffer in
 userspace and using ib_umem_get() to map it into the kernel.

 I'm looking at a possibly cleaner solution to the Altix issue, so I
 would like to make sure it fixes whatever the bug on Cell is as well.
 So any details you can provide about the problem you see on Cell would
 help a lot.

 Thanks...

Thanks, Roland. After reviewing the whole thread, I see that the failure on
Cell is different from the Altix issue, so this fix might not help Cell. The
problem I have might be related to multiple DMAs mapping to the same CQ; the
sync might be getting lost somewhere else.

Thanks
Shirley Ma

[openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish

2007-02-27 Thread Shirley Ma





Hello Roland,

  Sorry to bother you again. Could you please review the patch below to see
whether it can go upstream soon? IPoIB nodes can't ping each other if the
broadcast join succeeds but any other IB multicast join fails (for example,
the IB multicast group join for the default IPv6 link-local solicited
address) when bringing the interface up. This hurts IPoIB usability in
large clusters where MCG LIDs are limited.

Thanks
Shirley Ma


- Forwarded by Shirley Ma/Beaverton/IBM on 02/27/07 06:23 AM -

From:    Shirley Ma/Beaverton/IBM@IBMUS
Sent by: openib-general-bo[EMAIL PROTECTED]
To:      Roland Dreier [EMAIL PROTECTED]
cc:      openib-general@openib.org
Date:    02/05/07 06:50 AM
Subject: [openib-general] [PATCH] enable IPoIB only if broadcast join finish




Hi, Roland,

Please review this patch. According to IPoIB RFC 4391 section 5, once the IPoIB
broadcast group has been joined, the interface should be ready for data
transfer. In the current IPoIB implementation, the interface is UP and RUNNING
only when all default multicast joins have succeeded. We hit a problem when the
broadcast join finished successfully but the all-hosts multicast join
failed.
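
The difference between the two carrier policies can be sketched as a small model. This is only an illustration of the behaviour described above, not the driver logic: the function names and the group labels in the dictionary are hypothetical.

```python
def carrier_on_current(join_results):
    """Current behaviour: carrier only after every default group has joined."""
    return all(join_results.values())


def carrier_on_patched(join_results):
    """Proposed behaviour: carrier as soon as the broadcast join has succeeded."""
    return join_results.get("broadcast", False)


# Hypothetical outcome: broadcast joined, the other default groups failed.
joins = {"broadcast": True, "ipv4-all-hosts": False, "ipv6-solicited": False}

print(carrier_on_current(joins), carrier_on_patched(joins))
```

With this outcome the current policy leaves the interface without carrier, while the proposed policy brings it up once the broadcast group is joined.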

Here is the patch. If possible, please give your input asap; we have an
urgent customer issue that needs to be resolved:

diff -urpN ipoib/ipoib_multicast.c ipoib-multicast/ipoib_multicast.c
--- ipoib/ipoib_multicast.c 2006-11-29 13:57:37.0 -0800
+++ ipoib-multicast/ipoib_multicast.c 2007-02-04 22:34:16.0 -0800
@@ -402,6 +402,11 @@ static void ipoib_mcast_join_complete(in
 			queue_work(ipoib_workqueue, &priv->mcast_task);
 		mutex_unlock(&mcast_mutex);
 		complete(&mcast->done);
+		/*
+		 * broadcast join finished, enable carrier
+		 */
+		if (mcast == priv->broadcast)
+			netif_carrier_on(dev);
 		return;
 	}
 
@@ -599,7 +604,6 @@ void ipoib_mcast_join_task(void *dev_ptr
 	ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
 
 	clear_bit(IPOIB_MCAST_RUN, &priv->flags);
-	netif_carrier_on(dev);
 }
 
 int ipoib_mcast_start_thread(struct net_device *dev)

(See attached file: ipoib-multicast.patch)

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638



ipoib-multicast.patch
Description: Binary data

Re: [openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish

2007-02-27 Thread Roland Dreier
I don't think this applies any more since Sean's multicast stuff was
merged.  I didn't realize you wanted to get this merged upstream --
anyway, can you please regenerate the patch against the latest kernel?

Thanks




Re: [openib-general] IPOIB NAPI

2007-02-27 Thread Shirley Ma




Roland Dreier [EMAIL PROTECTED] wrote on 02/26/2007 02:36:26 PM:
 No way, it's way too late at this point to change the kernel-user ABI,
 let alone change all ULPs.

  - R.

Hello Roland,

So is IBV_CQ_REPORT_MISSED_EVENTS already part of OFED-1.2? I can
generate a patch for all ULPs to use it, for review. Do you need me to
do that?

Thanks
Shirley Ma

Re: [openib-general] IPOIB NAPI

2007-02-27 Thread Roland Dreier
  So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can
  generate the patch for all ULPs to use this for review. Do you need me to
  do that?

No, it's not in OFED 1.2 or the upstream kernel.  And no one has
implemented it for userspace (and I'm somewhat reluctant to break the
ABI at this point without some performance numbers to motivate making
this API change).

Have the NAPI performance problems with ehca been resolved?  We could
probably merge IPoIB NAPI for 2.6.22 then, which would pull in the
kernel changes at least.

 - R.




[openib-general] cannot instal ofed-1.2 kernel rpm on 2.6.20.1

2007-02-27 Thread Steve Wise
I built the ofed 1.2 rpms from the OFED-1.2-20070227-0602 build and the
kernel rpm fails to install on a 2.6.20.1 kernel:

vic13:/usr/local/src/OFED-1.2-20070227-0602/RPMS/sles-release-10-15.2 # rpm -U kernel-ib-1.2-2.6.20.1.x86_64.rpm
error: Failed dependencies:
        ksym(schedule) = 1000e51 is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(__up_wakeup) = 1042cbb5 is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(pci_request_region) = 10cc2981 is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(skb_dequeue) = 10fc721b is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(mod_timer) = 14777d07 is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(remap_pfn_range) = 155834a8 is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(unregister_netevent_notifier) = 1598dc9d is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(bad_dma_address) = 1675606f is needed by kernel-ib-1.2-2.6.20.1.x86_64
        ksym(dev_get_by_name) = 16ab1a6b is needed by kernel-ib-1.2-2.6.20.1.x86_64

...

many more of these deleted

Anybody seen this?










Re: [openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish

2007-02-27 Thread Shirley Ma




Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:35:34 PM:

 I don't think this applies any more since Sean's multicast stuff was
 merged.  I didn't realize you wanted to get this merged upstream --
 anyway, can you please regenerate the patch against the latest kernel?

 Thanks

Sure. I will generate a new patch.

Thanks
Shirley Ma

Re: [openib-general] [PATCH] osm: trivial data type change to remove compilation warning

2007-02-27 Thread Hal Rosenstock
On Mon, 2007-02-26 at 06:20, Yevgeny Kliteynik wrote: 
 Hi Hal
 
 Trivial data type change to remove compilation warning.
 Please apply to the trunk and to the 1.2 branch.
 
 Thanks.
 
 Signed-off-by: Yevgeny Kliteynik [EMAIL PROTECTED]

Thanks. Applied (to both master and ofed_1_2).

-- Hal





Re: [openib-general] IPOIB NAPI

2007-02-27 Thread Shirley Ma




Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:41:44 PM:

   So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can
   generate the patch for all ULPs to use this for review. Do you need me to
   do that?

 No, it's not in OFED 1.2 or the upstream kernel.  And no one has
 implemented it for userspace (and I'm somewhat reluctant to break the
 ABI at this point without some performance numbers to motivate making
 this API change).

 Have the NAPI performance problems with ehca been resolved?  We could
 probably merge IPoIB NAPI for 2.6.22 then, which would pull in the
 kernel changes at least.

  - R.

We have addressed the NAPI performance issues with the ehca driver. I believe
the patches have gone upstream. However, the test results show that it's
better to delay polling again until the next NAPI interval, something like this:

poll-cq
notify-cq, if missed_event && netif_rx_reschedule()
return 1

vs.

poll-cq
notify-cq, if missed_event && netif_rx_reschedule()
poll again
return 0

It seems ehca delivers packets much faster than other HCAs, so "poll again"
would stay in the loop many, many times. Since the above change doesn't
impact other HCAs, I would recommend it. I have seen the same implementation in
other Ethernet drivers.
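
The two variants can be compared with a toy simulation. This is a sketch under simplifying assumptions, not IPoIB code: the queue stands in for the CQ, the arrivals iterator stands in for packets that land while the CQ is being re-armed, and all function names are hypothetical.

```python
from collections import deque


def drain(cq, budget):
    """Poll up to `budget` completions off the queue; return how many were taken."""
    n = 0
    while cq and n < budget:
        cq.popleft()
        n += 1
    return n


def poll_defer(cq, budget, arrivals):
    """Variant 1: poll once, re-arm, and return 1 if events were missed."""
    drain(cq, budget)
    cq.extend(range(next(arrivals, 0)))  # packets landing while we re-arm
    return 1 if cq else 0                # 1 = poll again in the next interval


def poll_again(cq, budget, arrivals):
    """Variant 2: keep polling inside the same call while events were missed."""
    rounds = 0
    while True:
        drain(cq, budget)
        rounds += 1
        cq.extend(range(next(arrivals, 0)))
        if not cq:
            return 0, rounds


# A fast producer: four more packets show up during each of the first four re-arms.
defer_result = poll_defer(deque(range(4)), 8, iter([4, 4, 4, 4, 0]))
again_result = poll_again(deque(range(4)), 8, iter([4, 4, 4, 4, 0]))
print(defer_result, again_result)
```

With a fast producer the second variant stays inside one poll call for as long as new completions keep arriving, while the first hands the leftover work to the next NAPI poll.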

Thanks
Shirley Ma

Re: [openib-general] cannot instal ofed-1.2 kernel rpm on 2.6.20.1

2007-02-27 Thread Steve Wise
I opened bug 399 to track this.

I also opened bug 398 because I got an error installing opensm with this
same OFED-1.2 build.


Steve.



On Tue, 2007-02-27 at 16:43 -0600, Steve Wise wrote:
> I built the ofed 1.2 rpms from the OFED-1.2-20070227-0602 build and the
> kernel rpm fails to install on a 2.6.20.1 kernel:
>
> vic13:/usr/local/src/OFED-1.2-20070227-0602/RPMS/sles-release-10-15.2 # rpm -U kernel-ib-1.2-2.6.20.1.x86_64.rpm
> error: Failed dependencies:
>     ksym(schedule) = 1000e51 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(__up_wakeup) = 1042cbb5 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(pci_request_region) = 10cc2981 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(skb_dequeue) = 10fc721b is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(mod_timer) = 14777d07 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(remap_pfn_range) = 155834a8 is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(unregister_netevent_notifier) = 1598dc9d is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(bad_dma_address) = 1675606f is needed by kernel-ib-1.2-2.6.20.1.x86_64
>     ksym(dev_get_by_name) = 16ab1a6b is needed by kernel-ib-1.2-2.6.20.1.x86_64
>
> ...
>
> many more of these deleted
>
> Anybody seen this?
 
 
 
 
 
 
 



Re: [openib-general] Fw: [PATCH] enable IPoIB only if broadcast join finish

2007-02-27 Thread Shirley Ma




Hello Roland,

Here is the new patch against the 2.6.20-rc1 kernel. Please review it.

diff -urpN ipoib/ipoib_multicast.c ipoib-link/ipoib_multicast.c
--- ipoib/ipoib_multicast.c        2007-02-27 07:21:50.0 -0800
+++ ipoib-link/ipoib_multicast.c   2007-02-27 07:52:10.0 -0800
@@ -407,6 +407,11 @@ static int ipoib_mcast_join_complete(int
                         queue_delayed_work(ipoib_workqueue,
                                            &priv->mcast_task, 0);
                 mutex_unlock(&mcast_mutex);
+                /*
+                 * broadcast join finished, enable carrier
+                 */
+                if (unlikely(mcast == priv->broadcast))
+                        netif_carrier_on(dev);
                 return 0;
         }
 
@@ -596,7 +601,6 @@ void ipoib_mcast_join_task(struct work_s
         ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n");
 
         clear_bit(IPOIB_MCAST_RUN, &priv->flags);
-        netif_carrier_on(dev);
 }
 
 int ipoib_mcast_start_thread(struct net_device *dev)

(See attached file: ipoib-link.patch)
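
As a rough illustration of the behaviour the patch is after, the sketch below models the carrier decision in plain userspace C. All names here (`toy_mcast`, `toy_priv`, `toy_join_complete()`) are hypothetical stand-ins, not the driver's structures; the real code works on `struct ipoib_dev_priv` and calls `netif_carrier_on()`. The point it shows is that the carrier comes up as soon as the broadcast group join completes, rather than only after every multicast group has been joined.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver state the patch touches. */
struct toy_mcast {
    int id;
};

struct toy_priv {
    struct toy_mcast *broadcast;  /* the broadcast group entry */
    bool carrier_ok;              /* stands in for the netdev carrier state */
};

/* Called when the join for one group completes successfully.
 * Only the broadcast group's completion enables the carrier. */
static void toy_join_complete(struct toy_priv *priv, struct toy_mcast *mcast)
{
    if (mcast == priv->broadcast)
        priv->carrier_ok = true;
}
```

In this model, completing the join for any other group leaves `carrier_ok` false; it flips to true only when the group passed in is the one `priv->broadcast` points to.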

Thanks
Shirley Ma

ipoib-link.patch
Description: Binary data

[openib-general] [Bug 263] OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop

2007-02-27 Thread bugzilla-daemon
https://bugs.openfabrics.org/show_bug.cgi?id=263


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED




--- Comment #14 from [EMAIL PROTECTED]  2007-02-27 21:00 ---
With OFED 1.2 alpha1, I was able to failover/failback an IB port every 10
seconds for 8 hours on RHEL4 x86_64 LionMini SDR and DDR.  Will keep testing on
other platforms.


-- 
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.




Re: [openib-general] IPOIB NAPI

2007-02-27 Thread Michael S. Tsirkin
> Quoting Shirley Ma [EMAIL PROTECTED]:
> Subject: Re: [openib-general] IPOIB NAPI
>
> Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:41:44 PM:
>
> > > So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can
> > > generate the patch for all ULPs to use this for review. Do you need me to
> > > do that?
> >
> > No, it's not in OFED 1.2 or the upstream kernel.  And no one has
> > implemented it for userspace (and I'm somewhat reluctant to break the
> > ABI at this point without some performance numbers to motivate making
> > this API change).
> >
> > Have the NAPI performance problems with ehca been resolved?  We could
> > probably merge IPoIB NAPI for 2.6.22 then, which would pull in the
> > kernel changes at least.
> >
> >  - R.
>
> We have addressed the NAPI performance issues with the ehca driver. I believe
> the patches have gone upstream. However, the test results show that it's
> better to defer the repeat poll to the next NAPI interval, something like this:
>
> poll-cq
> notify-cq, if missed_event && netif_rx_reschedule()
> return 1
>
> vs.
>
> poll-cq
> notify-cq, if missed_event && netif_rx_reschedule()
> poll again
> return 0
>
> ehca seems to deliver packets much faster than other HCAs, so the "poll again"
> variant would stay in the loop many, many times. Since the above change doesn't
> impact other HCAs, I would recommend it. I have seen the same implementation in
> other Ethernet drivers.

I'm confused. Which one is faster?

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [Bug 400] New: OFED 1.2 alpha1 IPoIB HA failover gets QP warnings

2007-02-27 Thread bugzilla-daemon
https://bugs.openfabrics.org/show_bug.cgi?id=400

           Summary: OFED 1.2 alpha1 IPoIB HA failover gets QP warnings
           Product: OpenFabrics Linux
           Version: 1.2alpha1
          Platform: X86-64
        OS/Version: RHEL 4
            Status: NEW
          Severity: normal
          Priority: P3
         Component: IPoIB
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


OFED 1.2 alpha1 on RHEL4 U4 x86_64, LionMini DDR HCA.

I have IPoIB HA configured, running traffic via netperf, and bringing up/down a
different host IB port every 10 seconds.

This has been working for several hours, but I see warnings in dmesg, more of
them on the server side.

Client dmesg:

ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib_mthca :04:00.0: QP 000404 not found in MGM
ib0: ib_detach_mcast failed (result = -22)
ib0: ipoib_mcast_detach failed (result = -22)
[EMAIL PROTECTED] log]#

Server dmesg:

ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib_mthca :04:00.0: QP 000405 not found in MGM
ib1: ib_detach_mcast failed (result = -22)
ib1: ipoib_mcast_detach failed (result = -22)
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
[EMAIL PROTECTED] log]#


-- 
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[openib-general] [Bug 400] OFED 1.2 alpha1 IPoIB HA failover gets QP warnings

2007-02-27 Thread bugzilla-daemon
https://bugs.openfabrics.org/show_bug.cgi?id=400


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|[EMAIL PROTECTED]           |[EMAIL PROTECTED]




--- Comment #1 from [EMAIL PROTECTED]  2007-02-27 21:18 ---
Roland, can you take a look at this, please?


-- 
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.



Re: [openib-general] [Bug 400] OFED 1.2 alpha1 IPoIB HA failover gets QP warnings

2007-02-27 Thread Michael S. Tsirkin
> ib1: dev_queue_xmit failed to requeue packet
> ib_mthca :04:00.0: QP 000405 not found in MGM
> ib1: ib_detach_mcast failed (result = -22)
> ib1: ipoib_mcast_detach failed (result = -22)

Looks like this is related to the multicast change that recently went upstream.
So this likely affects upstream IPoIB as well.
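
As a side note on reading the log: assuming the usual kernel convention that a negative result is a negated errno, `result = -22` corresponds to `EINVAL` ("Invalid argument"), since errno 22 is `EINVAL` on Linux. The helper below is a hypothetical userspace sketch (the name `toy_result_text()` is made up for illustration) that turns such a value into its errno text.

```c
#include <errno.h>
#include <string.h>

/* Turn a kernel-style return value into readable text.
 * Assumes negative values are negated errno codes. */
static const char *toy_result_text(int result)
{
    return result < 0 ? strerror(-result) : "success";
}
```

On a Linux system, `toy_result_text(-22)` yields the same string as `strerror(EINVAL)`.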

-- 
MST




[OFA General] Re: [openib-general] IPOIB NAPI

2007-02-27 Thread Shirley Ma




> I'm confused. Which one is faster?

Sorry for the confusion, Michael. The one with return 1 has better
throughput.

Thanks
Shirley Ma
___
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[OFA General] [Bug 371] IPoIB HA not working properly with OFED1.2-alpha

2007-02-27 Thread bugzilla-daemon
https://bugs.openfabrics.org/show_bug.cgi?id=371


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[EMAIL PROTECTED]




-- 
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.



[OFA General] [Bug 371] IPoIB HA not working properly with OFED1.2-alpha

2007-02-27 Thread bugzilla-daemon
https://bugs.openfabrics.org/show_bug.cgi?id=371


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|[EMAIL PROTECTED]           |[EMAIL PROTECTED]




--- Comment #2 from [EMAIL PROTECTED]  2007-02-27 23:08 ---
Assigned to Vlad.


-- 
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.


[OFA General] List Address Change Completed

2007-02-27 Thread Lee, Michael Paichi
This list has been migrated to the new server, lists.openfabrics.org.  Please 
update any address book or filter settings to reflect the new mailing list 
address.  Future messages and replies should be sent to this address:

[EMAIL PROTECTED]

The new web address for this list is:

http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

If you have any questions, please contact me at [EMAIL PROTECTED]   

Regards,
Michael

[OFA General] Re: IPOIB NAPI

2007-02-27 Thread Michael S. Tsirkin
> Quoting Shirley Ma [EMAIL PROTECTED]:
> Subject: Re: IPOIB NAPI
>
> Roland Dreier [EMAIL PROTECTED] wrote on 02/27/2007 02:41:44 PM:
>
> > > So the IBV_CQ_REPORT_MISSED_EVENTS has been part of OFED-1.2 already? I can
> > > generate the patch for all ULPs to use this for review. Do you need me to
> > > do that?
> >
> > No, it's not in OFED 1.2 or the upstream kernel.  And no one has
> > implemented it for userspace (and I'm somewhat reluctant to break the
> > ABI at this point without some performance numbers to motivate making
> > this API change).
> >
> > Have the NAPI performance problems with ehca been resolved?  We could
> > probably merge IPoIB NAPI for 2.6.22 then, which would pull in the
> > kernel changes at least.
> >
> >  - R.
>
> We have addressed the NAPI performance issues with the ehca driver. I believe
> the patches have gone upstream. However, the test results show that it's
> better to defer the repeat poll to the next NAPI interval, something like this:
>
> poll-cq
> notify-cq, if missed_event && netif_rx_reschedule()
> return 1
>
> vs.
>
> poll-cq
> notify-cq, if missed_event && netif_rx_reschedule()
> poll again
> return 0
>
> ehca seems to deliver packets much faster than other HCAs, so the "poll again"
> variant would stay in the loop many, many times. Since the above change doesn't
> impact other HCAs, I would recommend it. I have seen the same implementation in
> other Ethernet drivers.

I have not benchmarked this, but the return 1 version actually makes sense to
me too: since a new completion was observed after notify-cq, the HCA is
probably writing new completions into the CQ at a high rate right now, so it
makes sense to delay polling by a few cycles and reduce the number of
interrupts that way.

Right?

-- 
MST



Re: [OFA General] List Address Change Completed

2007-02-27 Thread Michael S. Tsirkin
> Quoting Lee, Michael Paichi [EMAIL PROTECTED]:
> Subject: [OFA General] List Address Change Completed
>
> This list has been migrated to the new server, lists.openfabrics.org.  Please
> update any address book or filter settings to reflect the new mailing list
> address.  Future messages and replies should be sent to this address:
>
> [EMAIL PROTECTED]
>
> The new web address for this list is:
>
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> If you have any questions, please contact me at [EMAIL PROTECTED]

Can the subject prefix be made all lower-case, with a dash, please?
OFA General -> ofa-general?

Upper-case words look like shouting to me, and e.g. Exchange rules
cope poorly with spaces.

-- 
MST


[OFA General] Re: [PATCH 0/6] ofed_1_2: cxgb3 bug fixes

2007-02-27 Thread Vladimir Sokolovsky
On Tue, 2007-02-27 at 09:59 -0600, Steve Wise wrote:
> Hey Vlad,
>
> These fixes need to be pulled into ofed_1_2 for the Chelsio Ethernet
> driver.
>
> You can pull them directly from my ofa git tree:
>
> git://staging.openfabrics.org/~swise/ofed_1_2 cxgb3_fixes
>
> Thanks,
>
> Steve.

Applied.

-- 
Vladimir Sokolovsky [EMAIL PROTECTED]
Mellanox Technologies Ltd.



[ofa-general] RE: [OFA General] List Address Change Completed

2007-02-27 Thread Lee, Michael Paichi
Done


-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Tue 2/27/2007 11:23 PM
To: Lee, Michael Paichi
Cc: [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: [OFA General] List Address Change Completed

> Quoting Lee, Michael Paichi [EMAIL PROTECTED]:
> Subject: [OFA General] List Address Change Completed
>
> This list has been migrated to the new server, lists.openfabrics.org.  Please
> update any address book or filter settings to reflect the new mailing list
> address.  Future messages and replies should be sent to this address:
>
> [EMAIL PROTECTED]
>
> The new web address for this list is:
>
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> If you have any questions, please contact me at [EMAIL PROTECTED]

Can the subject prefix be made all lower-case, with a dash, please?
OFA General -> ofa-general?

Upper-case words look like shouting to me, and e.g. Exchange rules
cope poorly with spaces.

-- 
MST

