from:"Roland Dreier"

Re: [PATCH] CDC-NCM: remove "connected" log message

2021-01-05 Thread Roland Dreier

> now that you put it that way, I get the merit of what you are saying.
> Very well. I will submit the first set of patches.
>
> May I add your "Tested-by"?

Yes, absolutely:

Tested-by: Roland Dreier

Re: [PATCH] CDC-NCM: remove "connected" log message

2021-01-04 Thread Roland Dreier

> > to preserve the legacy behavior rather than changing the behavior of
> > every usbnet driver all at once?  Like make a new
> > usbnet_get_link_ksettings_nonmdio and update only cdc_ncm to use it?
>
> Then I would have to touch them all. The problem is that the MDIO
> stuff really is pretty much a layering violation. It should never
> have been default. But now it is.

I don't understand this.  Your 0001 patch changes the behavior of
usbnet_get_link_ksettings() and you have to touch all of the 8 drivers
that use it if you don't want to change their behavior.  If you keep
the old usbnet_get_link_ksettings() and add
usbnet_get_link_ksettings_nonmdio() then you can just update cdc_ncm
to start with, and then gradually migrate other drivers.  And
eventually fix the layering violation and get rid of the legacy
function when the whole transition is done.

 - R.

Re: [PATCH] CDC-NCM: remove "connected" log message

2020-12-31 Thread Roland Dreier

I haven't tried these patches yet but they don't look quite right to
me.  inlining the first 0001 patch:

 > diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
 > index 1447da1d5729..bcd17f6d6de6 100644
 > --- a/drivers/net/usb/usbnet.c
 > +++ b/drivers/net/usb/usbnet.c
 > @@ -944,7 +944,7 @@ EXPORT_SYMBOL_GPL(usbnet_open);
 >   * they'll probably want to use this base set.
 >   */
 >
 > -int usbnet_get_link_ksettings(struct net_device *net,
 > +int usbnet_get_link_ksettings_mdio(struct net_device *net,
 >struct ethtool_link_ksettings *cmd)
 >  {
 >  struct usbnet *dev = netdev_priv(net);
 > @@ -956,6 +956,32 @@ int usbnet_get_link_ksettings(struct net_device *net,
 >
 >  return 0;
 >  }
 > +EXPORT_SYMBOL_GPL(usbnet_get_link_ksettings_mdio);

why keep and export the old function when it will have no callers?

 > +int usbnet_get_link_ksettings(struct net_device *net,
 > +struct ethtool_link_ksettings *cmd)
 > +{
 > +struct usbnet *dev = netdev_priv(net);
 > +
 > +/* the assumption that speed is equal on tx and rx
 > + * is deeply engrained into the networking layer.
 > + * For wireless stuff it is not true.
 > + * We assume that rxspeed matters more.
 > + */
 > +if (dev->rxspeed != SPEED_UNKNOWN)
 > +cmd->base.speed = dev->rxspeed / 100;
 > +else if (dev->txspeed != SPEED_UNKNOWN)
 > +cmd->base.speed = dev->txspeed / 100;
 > +/* if a minidriver does not record speed we try to
 > + * fall back on MDIO
 > + */
 > +else if (!dev->mii.mdio_read)
 > +cmd->base.speed = SPEED_UNKNOWN;
 > +else
 > +mii_ethtool_get_link_ksettings(&dev->mii, cmd);
 > +
 > +return 0;

This is a change in behavior for every driver that doesn't set rxspeed
/ txspeed - the old get_link function would return EOPNOTSUPP if
mdio_read isn't implemented, now we give SPEED_UNKNOWN with a
successful return code.

 > @@ -1661,6 +1687,8 @@ usbnet_probe (struct usb_interface *udev,
const struct usb_device_id *prod)
 >  dev->intf = udev;
 >  dev->driver_info = info;
 >  dev->driver_name = name;
 > +dev->rxspeed = -1; /* unknown or handled by MII */
 > +dev->txspeed = -1;

Minor nit: if we're going to test these against SPEED_UNKNOWN above,
then I think it's clearer to initialize them to that value via the
same constant.

 > diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h
 > index 88a7673894d5..f748c758f82a 100644
 > --- a/include/linux/usb/usbnet.h
 > +++ b/include/linux/usb/usbnet.h
 > @@ -267,8 +269,11 @@ extern void usbnet_purge_paused_rxq(struct usbnet *);
 >
 >  extern int usbnet_get_link_ksettings(struct net_device *net,
 >   struct ethtool_link_ksettings *cmd);
 > -extern int usbnet_set_link_ksettings(struct net_device *net,
 > +extern int usbnet_set_link_ksettings_mdio(struct net_device *net,
 >   const struct ethtool_link_ksettings *cmd);
 > +/* Legacy - to be used if you really need an error to be returned */
 > +extern int usbnet_set_link_ksettings(struct net_device *net,
 > +const struct ethtool_link_ksettings *cmd);
 >  extern u32 usbnet_get_link(struct net_device *net);
 >  extern u32 usbnet_get_msglevel(struct net_device *);
 >  extern void usbnet_set_msglevel(struct net_device *, u32);

I think this was meant to be changing get_link, not set_link.

Also I don't understand the "Legacy" comment.  Is that referring to
the EOPNOTSUPP change I mentioned above?  If so, wouldn't it be better
to preserve the legacy behavior rather than changing the behavior of
every usbnet driver all at once?  Like make a new
usbnet_get_link_ksettings_nonmdio and update only cdc_ncm to use it?

 - R.

Re: [PATCH] CDC-NCM: remove "connected" log message

2020-12-29 Thread Roland Dreier

> I looked at them again and found that there is a way to get
> the same effect that will make maintenance easier in the long run.
> Could I send them to you later this week for testing?

Yes, please.  I have a good test setup now so I can easily try out patches.

Thanks,
  Roland

Re: [PATCH] CDC-NCM: remove "connected" log message

2020-12-28 Thread Roland Dreier

> Applied to net and queued for LTS, thanks!

Thanks - is Oliver's series of 3 patches that get rid of the other
half of the log spam also on the way upstream?

 - R.

[PATCH] CDC-NCM: remove "connected" log message

2020-12-23 Thread Roland Dreier

The cdc_ncm driver passes network connection notifications up to
usbnet_link_change(), which is the right place for any logging.
Remove the netdev_info() duplicating this from the driver itself.

This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN"
(ID 20f4:e02b) adapter from spamming the kernel log with

cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected

messages every 60 msec or so.

Signed-off-by: Roland Dreier 
---
 drivers/net/usb/cdc_ncm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index a45fcc44facf..50d3a4e6d445 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1850,9 +1850,6 @@ static void cdc_ncm_status(struct usbnet *dev, struct urb 
*urb)
 * USB_CDC_NOTIFY_NETWORK_CONNECTION notification shall be
 * sent by device after USB_CDC_NOTIFY_SPEED_CHANGE.
 */
-   netif_info(dev, link, dev->net,
-  "network connection: %sconnected\n",
-  !!event->wValue ? "" : "dis");
usbnet_link_change(dev, !!event->wValue, 0);
break;
 
-- 
2.29.2

Re: cdc_ncm kernel log spam with trendnet 2.5G USB adapter

2020-12-22 Thread Roland Dreier

On Tue, Dec 22, 2020 at 6:49 PM Jakub Kicinski  wrote:

> I'm not sure what the story here is but if this change is expected to
> get into the networking tree we'll need a fresh posting. This sort of
> scissored reply does not get into patchwork.

OK, will resend.  Too bad about patchwork, "git am" drops everything
before scissors lines by default.

> It sounds like you're getting tens of those messages a second, we can
> remove the message but the device is still generating spurious events,
> wasting CPU cycles. Was blocking those events deemed unfeasible?

I certainly don't know enough about the USB CDC class to know why the
spurious messages are showing up or whether they could be suppressed
without a fix in the adapter firmware.  But even ~30 spurious messages
per second doesn't seem so bad for a multi-gig adapter that might be
handling 100,000 or more packets per second.

 - R.

Re: cdc_ncm kernel log spam with trendnet 2.5G USB adapter

2020-12-19 Thread Roland Dreier

(Apologies, trying one more time with a better mailer)

Sorry it took so long, but I finally got a chance to test the patches.  They
seem to work well, but they only get rid of the downlink / uplink speed spam -
I still get the following filling my kernel log with a patched kernel:

  [   29.830383] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected
  [   29.894359] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected
  [   29.958601] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected
  [   30.022473] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected
  [   30.086548] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected

with the below patch on top of your 3, then my kernel log is clean.

Please apply your patches plus my patch, and feel free to add

Tested-by: Roland Dreier 

to the other three.

--- 8< -- 8< ---

Subject: [PATCH] CDC-NCM: remove "connected" log message

The cdc_ncm driver passes network connection notifications up to
usbnet_link_change(), which is the right place for any logging.
Remove the netdev_info() duplicating this from the driver itself.

This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN"
(ID 20f4:e02b) adapter from spamming the kernel log with

cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected

messages every 60 msec or so.

Signed-off-by: Roland Dreier 
---
 drivers/net/usb/cdc_ncm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index a45fcc44facf..50d3a4e6d445 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1850,9 +1850,6 @@ static void cdc_ncm_status(struct usbnet *dev, struct urb 
*urb)
 * USB_CDC_NOTIFY_NETWORK_CONNECTION notification shall be
 * sent by device after USB_CDC_NOTIFY_SPEED_CHANGE.
 */
-   netif_info(dev, link, dev->net,
-  "network connection: %sconnected\n",
-  !!event->wValue ? "" : "dis");
usbnet_link_change(dev, !!event->wValue, 0);
break;
 
-- 
2.29.2

Re: cdc_ncm kernel log spam with trendnet 2.5G USB adapter

2020-12-01 Thread Roland Dreier

> your criticism was valid. Could you test the attached patches?

Thanks - I will build a kernel and report back.

 - R.

cdc_ncm kernel log spam with trendnet 2.5G USB adapter

2020-11-16 Thread Roland Dreier

Hi, I recently got a 2.5G USB adapter, and although it works fine, the
driver continually spams the kernel log with messages like

[127662.025647] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: network connection: connected
[127662.057680] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: 2500 mbit/s downlink
2500 mbit/s uplink
[127662.089794] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: network connection: connected
[127662.121831] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: 2500 mbit/s downlink
2500 mbit/s uplink
[127662.153858] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: network connection: connected
...

Looking at the code in cdc_ncm.c it seems the device is just sending
USB_CDC_NOTIFY_NETWORK_CONNECTION and USB_CDC_NOTIFY_SPEED_CHANGE urbs
over and over, and the driver logs every single one.

Should we add code to the driver to keep track of what the last event
was and only log if the state has really change?

Thanks!
  Roland

Full details on the adapter, in case it matters:

Bus 004 Device 005: ID 20f4:e02b TRENDnet USB 10/100/1G/2.5G LAN
Device Descriptor:
  bLength18
  bDescriptorType 1
  bcdUSB   3.20
  bDeviceClass0
  bDeviceSubClass 0
  bDeviceProtocol 0
  bMaxPacketSize0 9
  idVendor   0x20f4 TRENDnet
  idProduct  0xe02b
  bcdDevice   30.04
  iManufacturer   1
  iProduct2
  iSerial 6
  bNumConfigurations  3
  Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength   0x0039
bNumInterfaces  1
bConfigurationValue 1
iConfiguration  0
bmAttributes 0xa0
  (Bus Powered)
  Remote Wakeup
MaxPower  512mA
Interface Descriptor:
  bLength 9
  bDescriptorType 4
  bInterfaceNumber0
  bAlternateSetting   0
  bNumEndpoints   3
  bInterfaceClass   255 Vendor Specific Class
  bInterfaceSubClass255 Vendor Specific Subclass
  bInterfaceProtocol  0
  iInterface  0
  Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81  EP 1 IN
bmAttributes2
  Transfer TypeBulk
  Synch Type   None
  Usage Type   Data
wMaxPacketSize 0x0400  1x 1024 bytes
bInterval   0
bMaxBurst   3
  Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02  EP 2 OUT
bmAttributes2
  Transfer TypeBulk
  Synch Type   None
  Usage Type   Data
wMaxPacketSize 0x0400  1x 1024 bytes
bInterval   0
bMaxBurst   3
  Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83  EP 3 IN
bmAttributes3
  Transfer TypeInterrupt
  Synch Type   None
  Usage Type   Data
wMaxPacketSize 0x0002  1x 2 bytes
bInterval  11
bMaxBurst   0
  Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength   0x0068
bNumInterfaces  2
bConfigurationValue 2
iConfiguration  0
bmAttributes 0xa0
  (Bus Powered)
  Remote Wakeup
MaxPower  512mA
Interface Descriptor:
  bLength 9
  bDescriptorType 4
  bInterfaceNumber0
  bAlternateSetting   0
  bNumEndpoints   1
  bInterfaceClass 2 Communications
  bInterfaceSubClass 13
  bInterfaceProtocol  0
  iInterface  5
  CDC Header:
bcdCDC   1.10
  CDC Union:
bMasterInterface0
bSlaveInterface 1
  CDC Ethernet:
iMacAddress  3 (??)
bmEthernetStatistics0x0031500f
wMaxSegmentSize   1518
wNumberMCFilters0x8000
bNumberPowerFilters  0
  CDC NCM:
bcdNcmVersion1.00
bmNetworkCapabilities 0x2b
  8-byte ntb input size
  max datagram size
  net address
  packet filter
  Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83  EP 3 IN
bmAttributes3
  Transfer TypeInterrupt
  Synch Type   None
  Usage Type   Data
wMaxPacketSize 0x0010  1x 16 bytes
bInterval   8
bMaxBurst   0
Interface Descriptor:
  bLength 9

[PATCH] qed: Fix potential use-after-free in qed_spq_post()

2018-01-15 Thread Roland Dreier

From: Roland Dreier 

We need to check if p_ent->comp_mode is QED_SPQ_MODE_EBLOCK before
calling qed_spq_add_entry().  The test is fine is the mode is EBLOCK,
but if it isn't then qed_spq_add_entry() might kfree(p_ent).

Signed-off-by: Roland Dreier 
---
 drivers/net/ethernet/qlogic/qed/qed_spq.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c 
b/drivers/net/ethernet/qlogic/qed/qed_spq.c
index be48d9abd001..3588081b2e27 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_spq.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c
@@ -776,6 +776,7 @@ int qed_spq_post(struct qed_hwfn *p_hwfn,
int rc = 0;
struct qed_spq *p_spq = p_hwfn ? p_hwfn->p_spq : NULL;
bool b_ret_ent = true;
+   bool eblock;
 
if (!p_hwfn)
return -EINVAL;
@@ -794,6 +795,11 @@ int qed_spq_post(struct qed_hwfn *p_hwfn,
if (rc)
goto spq_post_fail;
 
+   /* Check if entry is in block mode before qed_spq_add_entry,
+* which might kfree p_ent.
+*/
+   eblock = (p_ent->comp_mode == QED_SPQ_MODE_EBLOCK);
+
/* Add the request to the pending queue */
rc = qed_spq_add_entry(p_hwfn, p_ent, p_ent->priority);
if (rc)
@@ -811,7 +817,7 @@ int qed_spq_post(struct qed_hwfn *p_hwfn,
 
spin_unlock_bh(&p_spq->lock);
 
-   if (p_ent->comp_mode == QED_SPQ_MODE_EBLOCK) {
+   if (eblock) {
/* For entries in QED BLOCK mode, the completion code cannot
 * perform the necessary cleanup - if it did, we couldn't
 * access p_ent here to see whether it's successful or not.
-- 
2.14.1

Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs

2017-08-04 Thread Roland Dreier

> I think the conclusion is that a hard-wired ACS capability is a
> positive indication of isolation for a multifunction device, the code
> is intended to support this and appears to do so, and Roland was going
> to investigate the sightings that inspired this patch in more detail.
> Dropping for now is appropriate.  Thanks,

That's right.  I confirmed that the issue we found was due to another PCI quirk.

It may make sense to add more 82599 variants to the table, but the X540 and X550
work without a quirk.

Sorry for the noise.

 - R.

Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs

2017-07-24 Thread Roland Dreier

> Is there a misunderstanding of the code flow here?  We're never setting
> EC.  In the first code block we're just masking out requested
> capabilities where unimplemented capabilities is the same as
> implemented + enabled.  We're not adding EC to the request, we're
> just not removing it based on the implemented capabilities because we
> don't interpret it the same way.  Thus if someone wants to test a
> device for EC, it really needs to implement EC, we cannot assume it
> based on lack of support for EC in the ACS capability.  As you point
> out, nobody really cares about EC yet though.

I guess I find the semantics confusing.  For every other bit,
pci_acs_enabled() returns true if the bit is either enabled or not
implemented.  For EC, it returns false if the bit is not implemented.

It's not clear to me what the use case for checking for PCI_ACS_EC
enabled would be.  Seems like checking for EC in the capabilities
register would be more useful.

 - R.

Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs

2017-07-24 Thread Roland Dreier

> I think that the device implementing ACS and not implementing certain
> features that are "must be implemented if..." is a positive indication
> that the device does not have that behavior and therefore the ability
> to control that behavior is unnecessary.  pci_acs_flags_enabled() is
> coded with this intention:
>
> static bool pci_acs_flags_enabled(struct pci_dev *pdev, u16 acs_flags)
> {
> int pos;
> u16 cap, ctrl;
>
> pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ACS);
> if (!pos)
> return false;
>
> /*
>  * Except for egress control, capabilities are either required
>  * or only required if controllable.  Features missing from the
>  * capability field can therefore be assumed as hard-wired enabled.
>  */
> pci_read_config_word(pdev, pos + PCI_ACS_CAP, &cap);
> acs_flags &= (cap | PCI_ACS_EC);
>
> 
>
> pci_read_config_word(pdev, pos + PCI_ACS_CTRL, &ctrl);
> return (ctrl & acs_flags) == acs_flags;
>
> 

I'm not sure it makes sense to look for the EC bit in the control register if
the capability bit says it is not supported.  In fact I think it would
violate the PCI spec to set the bit in the control register if it is zero in the
capability register - the spec says:

  Default value of this bit is 0b. Must be hardwired to 0b if the ACS P2P
  Egress Control functionality is not implemented.

I don't understand the reasoning behind forcing the EC bit in commit
83db7e0bdb70.

I think the only reason this works is that nowhere in the kernel uses
this function to check PCI_ACS_EC.

> At least that's my interpretation and how I've indicated to folks at
> Intel how things should work for a device that does not support p2p.
> Of course this all hinges on having an ACS capability.  Without an ACS
> capability we must assume any sort of p2p is possible unless we have a
> quirk to tell us otherwise.  With ACS, the absence of capabilities is a
> positive indication that those forms of p2p are not possible (depending
> on the spec wording for that capability).  If the code doesn't behave
> that way, let's fix it. Thanks,

Thanks - I'm looking more closely at what goes wrong with X550, since
from reading the code it looks like it should work, but people have observed
that it doesn't on our system.  However the issue might be elsewhere.

However I'll send a patch removing that "| PCI_ACS_EC" from the check if
you agree - maybe I'm misunderstanding the logic but I don't see how it
could work if it ever became relevant.

 - R.

Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs

2017-07-20 Thread Roland Dreier

On Thu, Jul 20, 2017 at 3:15 PM, Alex Williamson
 wrote:

> Most of the ACS capabilities are worded as "Must be implemented by
> devices that implement ..."  Shouldn't a hard-wired ACS capability
> sufficiently describe that, or is there something wrong with how
> they're hard wired?

Interesting question.  I just looked at what the PCI spec says about
the various bits for ACS functions in multi-function devices.  Many of
the functions are "must not be implemented."  Of the ones that are
"must be implemented if..." the key is that the if is for devices that
support peer-to-peer traffic.

I think the Intel NICs are compliant with the spec - they don't
support any ACS mechanisms for controlling peer-to-peer traffic, but
they also don't implement peer-to-peer traffic.  This means that the
PCI standard way of knowing that it is safe to assign individual
functions does not prove it is safe - but with device-specific
knowledge we do know it is safe.  Hence a quirk to give that
device-specific information to the kernel.

 - R.

[PATCH] PCI: Update ACS quirk for more Intel 10G NICs

2017-07-20 Thread Roland Dreier

From: Roland Dreier 

Add one more variant of the 82599 plus the device IDs for X540 and X550
variants.  Intel has confirmed that none of these devices does peer-to-peer
between functions.  The X540 and X550 have added ACS capabilities in their
PCI config space, but the ACS control register is hard-wired to 0 for both
devices, so we still need the quirk for IOMMU grouping to allow assignment
of individual SR-IOV functions.

Signed-off-by: Roland Dreier 
---
 drivers/pci/quirks.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6967c6b4cf6b..b939db671326 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4335,12 +4335,33 @@ static const struct pci_dev_acs_enabled {
{ PCI_VENDOR_ID_INTEL, 0x1507, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x1514, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x151C, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1528, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x1529, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x154A, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x152A, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x154D, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x154F, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x1551, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x1558, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1560, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x1563, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15AA, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15AB, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15AC, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15AD, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15AE, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15B0, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15AB, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15C2, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15C3, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15C4, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15C6, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15C7, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15C8, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15CE, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15E4, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15E5, pci_quirk_mf_endpoint_acs },
+   { PCI_VENDOR_ID_INTEL, 0x15D1, pci_quirk_mf_endpoint_acs },
/* 82580 */
{ PCI_VENDOR_ID_INTEL, 0x1509, pci_quirk_mf_endpoint_acs },
{ PCI_VENDOR_ID_INTEL, 0x150E, pci_quirk_mf_endpoint_acs },
-- 
2.11.0

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-08 Thread Roland Dreier

On Fri, Jul 8, 2016 at 9:51 AM, Jason Gunthorpe
 wrote:
> So, it appears, the dst and neigh can be used for all performances cases.
>
> For the non performance dst == null case, can we just burn cycles and
> stuff the daddr in front of the packet at hardheader time, even if we
> have to copy?

OK, sounds interesting.

Unfortunately the scope of this work has gotten to the point where I
can't take it on right now.  My system is running 4.4.y for now
(before struct skb_gso_cb grew) so I think shrinking struct skb_gso_cb
to 8 bytes plus changing SKB_SGO_CB_OFFSET to 20 will work for now.
Hope someone is able to come up with a real fix before I need to
upgrade to 4.10.y...

 - R.

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-08 Thread Roland Dreier

On Thu, Jul 7, 2016 at 4:14 PM, Jason Gunthorpe
 wrote:
> We have neighbour_priv, and ndo_neigh_construct/destruct now ..
>
> A first blush that would seem to be enough to let ipoib store the AH
> and other path information in the neigh and avoid the cb? At least the
> example in clip sure looks like what ipoib needs to do.

Do you think those new facilities let us go back to using the neigh
and still avoid the issues that led to commit b63b70d87741 ("IPoIB:
Use a private hash table for path lookup in xmit path")?

 - R.

Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-07 Thread Roland Dreier

>> struct skb_gso_cb {
>> int mac_offset;
>> int encap_level;
>> __u16   csum_start;
>> };

> This is based on an out-dated version of this struct.  The 4.7 RC
> kernel has a few more fields that were added to support local checksum
> offload for encapsulated frames.

Thanks for pointing that out.  I hit the perf regression on 4.4.y
(stable) and looked at the struct there.  I see that latest upstream
has changed, and I agree that this struct really can't shrink below 10
bytes.

Since IP needs 20 bytes, GSO needs 10 bytes and IPoIB needs 20 bytes,
we're 2 bytes over the 48 that are available in cb[].  So this is
harder to fix than just changing skb_gso_cb and SKB_SGO_CB_OFFSET
unfortunately.

>> What is the best way to keep the crash fix but not kill IPoIB performance?
>
> It seems like what would probably need to happen is to move where the
> IPoIB address is stored.  I'm not sure the control buffer is really
> the best place for it since the cb gets overwritten at various levels,
> and storing 20 bytes makes it hard to avoid bumping up against the
> size restrictions of the buffer.  Seeing as how the IPoIB hwaddr is
> generated around the same time we generate the L2 header for the
> frame, I wonder if you couldn't get away with using a bit of extra skb
> headroom to store it and then use a offset from the MAC header to
> access it.  An added bonus would be that with a few tricks with
> SKB_GSO_CB(skb)->mac_offset you might even be able to set things up so
> that you copy the hwaddr when you copy the header for each fragment
> instead of having to go and copy the hwaddr out of the cb and clone it
> for each frame.

Can we assume there are 20 bytes of skb headroom available?  What if
we're forwarding an skb received on an Ethernet device?

The reason we moved to the cb storage is that in the past, trying to
hide some data in the actual skb buffer that we don't actually send
led to some awkward-at-best code.  (As I recall GRO was difficult to
handle before commit 936d7de3d736 "IPoIB: Stop lying about
hard_header_len and use skb->cb to stash LL addresses")  But maybe
there's a third way to handle this other than the old way and the
skb->cb way.

 - R.

Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-07-06 Thread Roland Dreier

On Thu, Jan 7, 2016 at 3:00 AM, Konstantin Khlebnikov  wrote:
> Or just shift GSO CB and add couple checks like
> BUILD_BUG_ON(sizeof(SKB_GSO_CB(skb)->room) < sizeof(*IPCB(skb)));

Resurrecting this old thread, because the patch that ultimately went
upstream (commit 9207f9d45b0a / net: preserve IP control block during
GSO segmentation) causes a huge IPoIB performance regression (to the
point of being unusable):
https://bugzilla.kernel.org/show_bug.cgi?id=111921

I don't think anyone has explained what goes wrong or why IPoIB works
the way it does.  The underlying difference that IPoIB has from other
drivers is that there are two levels of address resolution.  First,
normal ARP / ND resolves an IP address to a "hardware" address.  The
difference is that in IPoIB, the hardware address is an IB GID (plus a
QPN, but we can ignore that).  To actually send data to that GID, the
IPoIB driver has to do a second lookup - it needs to ask the IB subnet
manager for a path record that tells it how to reach that GID.

In particular this means that "destination address" (as the IP / ARP
layer understands it) actually isn't in the packet anywhere - there's
nothing like an ethernet header as there is for "normal" network
drivers.  Instead, the driver stashes the address in skb->cb during
hard_header_ops->create() and then looks at it in the xmit routine -
this was designed way back around when commit a0417fa3a18a / net: Make
qdisc_skb_cb upper size bound explicit. was merged.  The expectation
was that the part of the cb after sizeof (struct qdisc_skb_cb) would
be preserved.

The problem with commit 9207f9d45b0a is that GSO operations now access
cb after SKB_SGO_CB_OFFSET==32, which lands right in the middle of
where IPoIB stashes its hwaddr.

It seems that the intent of the commit is to preserve the IP control
block - struct inet_skb_parm (and presumably struct inet6_skb_parm) -
even when using SKB_GSO_CB().  Seems like both inet_skb_parm and
inet6_skb_parm are 20 bytes.  IPoIB uses the part of cb after 28
bytes, so if we could squeeze struct skb_gso_cb down to 8 bytes and
set SKB_SGO_CB_OFFSET to 20, then everything would work.  The struct
is

struct skb_gso_cb {
int mac_offset;
int encap_level;
__u16   csum_start;
};

is it feasible to make encap_level a __u16 (which would make the
overall struct exactly 8 bytes)?  If I understand this correctly, 64K
nested encapsulations seems like quite a bit for a packet...

Or, earlier in this thread, having the GSO in ip_output and other gso
paths save and restore the IP/IP6 control block was suggested as an
alternate approach.  I don't know if there are performance
implications to that.

What is the best way to keep the crash fix but not kill IPoIB performance?

Thanks!
 - R.

[PATCH 3.19 and earlier] fib_rules: Fix dump_rules() not to exit early

2015-10-05 Thread Roland Dreier

From: Roland Dreier 

Backports of 41fc014332d9 ("fib_rules: fix fib rule dumps across
multiple skbs") introduced a regression in "ip rule show" - it ends up
dumping the first rule over and over and never exiting, because 3.19
and earlier are missing commit 053c095a82cf ("netlink: make
nlmsg_end() and genlmsg_end() void"), so fib_nl_fill_rule() ends up
returning skb->len (i.e. > 0) in the success case.

Fix this by checking the return code for < 0 instead of != 0.

Signed-off-by: Roland Dreier 
---
Hi, this is needed for all stable trees earlier than 4.0 that have
picked up 41fc014332d9; so far looks like at least 3.10.y and 3.14.y
have made such releases.

 net/core/fib_rules.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 627e517077e4..84340a2605ed 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -606,7 +606,7 @@ static int dump_rules(struct sk_buff *skb, struct 
netlink_callback *cb,
err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid,
   cb->nlh->nlmsg_seq, RTM_NEWRULE,
   NLM_F_MULTI, ops);
-   if (err)
+   if (err < 0)
break;
 skip:
idx++;
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH net] fib_rules: fix fib rule dumps across multiple skbs

2015-10-02 Thread Roland Dreier

On Tue, Sep 22, 2015 at 9:40 PM, Roopa Prabhu  wrote:
> +   err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid,
> +  cb->nlh->nlmsg_seq, RTM_NEWRULE,
> +  NLM_F_MULTI, ops);
> +   if (err)

FWIW I believe this breaks pre-4.0 stable kernels (unfortunately it
just showed up in 3.10.90).  In kernels without 053c095a82cf
("netlink: make nlmsg_end() and genlmsg_end() void") then
fib_nl_fill_rule() returns a positive value (skb->len) on success, so
we break out of the loop here immediately.  Symptom is "ip rule show"
loops forever printing the first rule.

After I finish testing a fix (as trivial as changing to "if (err <
0)") here, I'll send it to -stable guys.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Shift calculation wrong for single sge entries.

2008-02-25 Thread Roland Dreier

Thanks, applied, although I assume based on the Signed-off-by line
that you left out a

  From: Bryan Rosenburg <[EMAIL PROTECTED]>

at the top (to get the authorship in git correctly).

 > RDMA/cxgb3: Shift calculation wrong for single sge entries.

BTW, there's no need to duplicate the subject line in the message
body, but if you are going to do it, please put a "Subject:" before
it.  Otherwise I just have to edit it out by hand to avoid git putting
the subject in twice.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH RESEND] libertas: Remove unused exports

2008-02-23 Thread Roland Dreier

Any chance of getting this applied?  It seems the build is still
broken on ia64 at least due to the export of static functions.
---


The libertas driver exports a number of symbols with no in-tree
users; remove these unused exports.  lbs_reset_device() is completely
unused, with no callers at all, so remove the function completely.

A couple of these unused exported symbols are static, which causes the
following build error on ia64 with gcc 4.2.3:

drivers/net/wireless/libertas/main.c:1375: error: __ksymtab_lbs_remove_mesh 
causes a section type conflict
drivers/net/wireless/libertas/main.c:1354: error: __ksymtab_lbs_add_mesh 
causes a section type conflict

This patch fixes this problem.  I don't have hardware, so this is not
run-tested, but I tested the build with

CONFIG_LIBERTAS=y
CONFIG_LIBERTAS_USB=m
CONFIG_LIBERTAS_CS=m
CONFIG_LIBERTAS_SDIO=m

and there were no problems with undefined symbols.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
diff --git a/drivers/net/wireless/libertas/cmd.c 
b/drivers/net/wireless/libertas/cmd.c
index eab0203..b3c1acb 100644
--- a/drivers/net/wireless/libertas/cmd.c
+++ b/drivers/net/wireless/libertas/cmd.c
@@ -1040,7 +1040,6 @@ int lbs_mesh_access(struct lbs_private *priv, uint16_t 
cmd_action,
lbs_deb_leave(LBS_DEB_CMD);
return ret;
 }
-EXPORT_SYMBOL_GPL(lbs_mesh_access);
 
 int lbs_mesh_config(struct lbs_private *priv, uint16_t enable, uint16_t chan)
 {
@@ -1576,7 +1575,6 @@ done:
lbs_deb_leave_args(LBS_DEB_HOST, "ret %d", ret);
return ret;
 }
-EXPORT_SYMBOL_GPL(lbs_prepare_and_send_command);
 
 /**
  *  @brief This function allocates the command buffer and link
diff --git a/drivers/net/wireless/libertas/decl.h 
b/drivers/net/wireless/libertas/decl.h
index aaacd9b..4e22341 100644
--- a/drivers/net/wireless/libertas/decl.h
+++ b/drivers/net/wireless/libertas/decl.h
@@ -69,7 +69,6 @@ struct lbs_private *lbs_add_card(void *card, struct device 
*dmdev);
 int lbs_remove_card(struct lbs_private *priv);
 int lbs_start_card(struct lbs_private *priv);
 int lbs_stop_card(struct lbs_private *priv);
-int lbs_reset_device(struct lbs_private *priv);
 void lbs_host_to_card_done(struct lbs_private *priv);
 
 int lbs_update_channel(struct lbs_private *priv);
diff --git a/drivers/net/wireless/libertas/main.c 
b/drivers/net/wireless/libertas/main.c
index 84fb49c..1eaf6af 100644
--- a/drivers/net/wireless/libertas/main.c
+++ b/drivers/net/wireless/libertas/main.c
@@ -1351,8 +1350,6 @@ done:
lbs_deb_leave_args(LBS_DEB_MESH, "ret %d", ret);
return ret;
 }
-EXPORT_SYMBOL_GPL(lbs_add_mesh);
-
 
 static void lbs_remove_mesh(struct lbs_private *priv)
 {
@@ -1372,7 +1369,6 @@ static void lbs_remove_mesh(struct lbs_private *priv)
free_netdev(mesh_dev);
lbs_deb_leave(LBS_DEB_MESH);
 }
-EXPORT_SYMBOL_GPL(lbs_remove_mesh);
 
 /**
  *  @brief This function finds the CFP in
@@ -1458,20 +1454,6 @@ void lbs_interrupt(struct lbs_private *priv)
 }
 EXPORT_SYMBOL_GPL(lbs_interrupt);
 
-int lbs_reset_device(struct lbs_private *priv)
-{
-   int ret;
-
-   lbs_deb_enter(LBS_DEB_MAIN);
-   ret = lbs_prepare_and_send_command(priv, CMD_802_11_RESET,
-   CMD_ACT_HALT, 0, 0, NULL);
-   msleep_interruptible(10);
-
-   lbs_deb_leave_args(LBS_DEB_MAIN, "ret %d", ret);
-   return ret;
-}
-EXPORT_SYMBOL_GPL(lbs_reset_device);
-
 static int __init lbs_init_module(void)
 {
lbs_deb_enter(LBS_DEB_MAIN);
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fail loopbackconnections.

2008-02-13 Thread Roland Dreier

Steve, I had to update the patch adding an include and fixing the
function declaration (as below)... but how much testing have you done
with this??

commit 8704e9a8790cc9e394198663c1c9150c899fb9a2
Author: Steve Wise <[EMAIL PROTECTED]>
Date:   Tue Feb 12 16:09:29 2008 -0600

RDMA/cxgb3: Fail loopback connections

The cxgb3 HW and driver don't support loopback RDMA connections.  So
fail any connection attempt where the destination address is local.

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
    Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>

diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c 
b/drivers/infiniband/hw/cxgb3/iwch_cm.c
index e9a08fa..320f2b6 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1784,6 +1785,17 @@ err:
return err;
 }
 
+static int is_loopback_dst(struct iw_cm_id *cm_id)
+{
+   struct net_device *dev;
+
+   dev = ip_dev_find(&init_net, cm_id->remote_addr.sin_addr.s_addr);
+   if (!dev)
+   return 0;
+   dev_put(dev);
+   return 1;
+}
+
 int iwch_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
 {
int err = 0;
@@ -1791,6 +1803,11 @@ int iwch_connect(struct iw_cm_id *cm_id, struct 
iw_cm_conn_param *conn_param)
struct iwch_ep *ep;
struct rtable *rt;
 
+   if (is_loopback_dst(cm_id)) {
+   err = -ENOSYS;
+   goto out;
+   }
+
ep = alloc_ep(sizeof(*ep), GFP_KERNEL);
if (!ep) {
printk(KERN_ERR MOD "%s - cannot alloc ep.\n", __FUNCTION__);
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fail loopbackconnections.

2008-02-13 Thread Roland Dreier

  > how can a static void function return 0?

good question... I've fixed the patch in my tree.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fail loopback connections.

2008-02-12 Thread Roland Dreier

applied, although:

 > +static void is_loopback_dst(struct iw_cm_id *cm_id)
 > +{
 > +struct net_device *dev;
 > +
 > +dev = ip_dev_find(&init_net, cm_id->remote_addr.sin_addr.s_addr);
 > +if (!dev)
 > +return 0;
 > +dev_put(dev);
 > +return 1;
 > +}

is there any way this could trigger when it should, like if I'm trying
to make a connection from one local device to a different local device
(which should work fine)?

 - R.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] libertas: Remove unused exports

2008-02-12 Thread Roland Dreier

The libertas driver exports a number of symbols with no in-tree
users; remove these unused exports.  lbs_reset_device() is completely
unused, with no callers at all, so remove the function completely.

A couple of these unused exported symbols are static, which causes the
following build error on ia64 with gcc 4.2.3:

drivers/net/wireless/libertas/main.c:1375: error: __ksymtab_lbs_remove_mesh 
causes a section type conflict
drivers/net/wireless/libertas/main.c:1354: error: __ksymtab_lbs_add_mesh 
causes a section type conflict

This patch fixes this problem.  I don't have hardware, so this is not
run-tested, but I tested the build with

CONFIG_LIBERTAS=y
CONFIG_LIBERTAS_USB=m
CONFIG_LIBERTAS_CS=m
CONFIG_LIBERTAS_SDIO=m

and there were no problems with undefined symbols.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
Here's the patch that removes the unused exports (and a few more that
I found).

diff --git a/drivers/net/wireless/libertas/cmd.c 
b/drivers/net/wireless/libertas/cmd.c
index eab0203..b3c1acb 100644
--- a/drivers/net/wireless/libertas/cmd.c
+++ b/drivers/net/wireless/libertas/cmd.c
@@ -1040,7 +1040,6 @@ int lbs_mesh_access(struct lbs_private *priv, uint16_t 
cmd_action,
lbs_deb_leave(LBS_DEB_CMD);
return ret;
 }
-EXPORT_SYMBOL_GPL(lbs_mesh_access);
 
 int lbs_mesh_config(struct lbs_private *priv, uint16_t enable, uint16_t chan)
 {
@@ -1576,7 +1575,6 @@ done:
lbs_deb_leave_args(LBS_DEB_HOST, "ret %d", ret);
return ret;
 }
-EXPORT_SYMBOL_GPL(lbs_prepare_and_send_command);
 
 /**
  *  @brief This function allocates the command buffer and link
diff --git a/drivers/net/wireless/libertas/decl.h 
b/drivers/net/wireless/libertas/decl.h
index aaacd9b..4e22341 100644
--- a/drivers/net/wireless/libertas/decl.h
+++ b/drivers/net/wireless/libertas/decl.h
@@ -69,7 +69,6 @@ struct lbs_private *lbs_add_card(void *card, struct device 
*dmdev);
 int lbs_remove_card(struct lbs_private *priv);
 int lbs_start_card(struct lbs_private *priv);
 int lbs_stop_card(struct lbs_private *priv);
-int lbs_reset_device(struct lbs_private *priv);
 void lbs_host_to_card_done(struct lbs_private *priv);
 
 int lbs_update_channel(struct lbs_private *priv);
diff --git a/drivers/net/wireless/libertas/main.c 
b/drivers/net/wireless/libertas/main.c
index 84fb49c..1eaf6af 100644
--- a/drivers/net/wireless/libertas/main.c
+++ b/drivers/net/wireless/libertas/main.c
@@ -1351,8 +1350,6 @@ done:
lbs_deb_leave_args(LBS_DEB_MESH, "ret %d", ret);
return ret;
 }
-EXPORT_SYMBOL_GPL(lbs_add_mesh);
-
 
 static void lbs_remove_mesh(struct lbs_private *priv)
 {
@@ -1372,7 +1369,6 @@ static void lbs_remove_mesh(struct lbs_private *priv)
free_netdev(mesh_dev);
lbs_deb_leave(LBS_DEB_MESH);
 }
-EXPORT_SYMBOL_GPL(lbs_remove_mesh);
 
 /**
  *  @brief This function finds the CFP in
@@ -1458,20 +1454,6 @@ void lbs_interrupt(struct lbs_private *priv)
 }
 EXPORT_SYMBOL_GPL(lbs_interrupt);
 
-int lbs_reset_device(struct lbs_private *priv)
-{
-   int ret;
-
-   lbs_deb_enter(LBS_DEB_MAIN);
-   ret = lbs_prepare_and_send_command(priv, CMD_802_11_RESET,
-   CMD_ACT_HALT, 0, 0, NULL);
-   msleep_interruptible(10);
-
-   lbs_deb_leave_args(LBS_DEB_MAIN, "ret %d", ret);
-   return ret;
-}
-EXPORT_SYMBOL_GPL(lbs_reset_device);
-
 static int __init lbs_init_module(void)
 {
lbs_deb_enter(LBS_DEB_MAIN);
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] mlx4: needs vmalloc.h for vmap()

2008-02-11 Thread Roland Dreier

thanks, just merged the same patch from Olof Johansson.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] libertas: Don't mark exported symbols as static

2008-02-08 Thread Roland Dreier

 > Why not pull the exports? they aren't used anywhere in the existing kernel.

I'm guessing there's some not-(yet-)merged mesh networking stuff that
uses the symbols, but it doesn't matter much to me...

 - R.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] libertas: Don't mark exported symbols as static

2008-02-08 Thread Roland Dreier

Marking exported symbols as static causes the following build error on
ia64 with gcc 4.2.3:

drivers/net/wireless/libertas/main.c:1375: error: __ksymtab_lbs_remove_mesh 
causes a section type conflict
drivers/net/wireless/libertas/main.c:1354: error: __ksymtab_lbs_add_mesh causes 
a section type conflict

Therefore, remove the static marking on lbs_remove_mesh and lbs_add_mesh.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
diff --git a/drivers/net/wireless/libertas/main.c 
b/drivers/net/wireless/libertas/main.c
index 84fb49c..a688ce8 100644
--- a/drivers/net/wireless/libertas/main.c
+++ b/drivers/net/wireless/libertas/main.c
@@ -253,8 +253,8 @@ static ssize_t lbs_anycast_set(struct device *dev,
 
 static int lbs_add_rtap(struct lbs_private *priv);
 static void lbs_remove_rtap(struct lbs_private *priv);
-static int lbs_add_mesh(struct lbs_private *priv);
-static void lbs_remove_mesh(struct lbs_private *priv);
+int lbs_add_mesh(struct lbs_private *priv);
+void lbs_remove_mesh(struct lbs_private *priv);
 
 
 /**
@@ -1296,7 +1296,7 @@ EXPORT_SYMBOL_GPL(lbs_stop_card);
  *  @param privA pointer to the struct lbs_private structure
  *  @return   0 if successful, -X otherwise
  */
-static int lbs_add_mesh(struct lbs_private *priv)
+int lbs_add_mesh(struct lbs_private *priv)
 {
struct net_device *mesh_dev = NULL;
int ret = 0;
@@ -1354,7 +1354,7 @@ done:
 EXPORT_SYMBOL_GPL(lbs_add_mesh);
 
 
-static void lbs_remove_mesh(struct lbs_private *priv)
+void lbs_remove_mesh(struct lbs_private *priv)
 {
struct net_device *mesh_dev;
 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] [1/2] Skip empty hash buckets faster in /proc/net/tcp

2008-01-30 Thread Roland Dreier

 > On a 2GB Core2 system here I see a time cat /proc/net/tcp > /dev/null
 > constently dropping from 0.44s to 0.4-0.8s system time with this change.

Seems like there must be a typo in either the before or after times
you report here?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] cxgb3: Remove incorrect __devinit annotations

2008-01-29 Thread Roland Dreier

When PCI error recovery was added to cxgb3, a function t3_io_slot_reset()
was added.  This function can call back into t3_prep_adapter() at any
time, so t3_prep_adapter() can no longer be marked __devinit.
This patch removes the __devinit annotation from t3_prep_adapter() and
all the functions that it calls, which fixes

WARNING: drivers/net/cxgb3/built-in.o(.text+0x2427): Section mismatch in 
reference from the function t3_io_slot_reset() to the function 
.devinit.text:t3_prep_adapter()

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 drivers/net/cxgb3/mc5.c   |2 +-
 drivers/net/cxgb3/sge.c   |2 +-
 drivers/net/cxgb3/t3_hw.c |   22 ++
 3 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/net/cxgb3/mc5.c b/drivers/net/cxgb3/mc5.c
index 84c1ffa..4c4d6e8 100644
--- a/drivers/net/cxgb3/mc5.c
+++ b/drivers/net/cxgb3/mc5.c
@@ -452,7 +452,7 @@ void t3_mc5_intr_handler(struct mc5 *mc5)
t3_write_reg(adap, A_MC5_DB_INT_CAUSE, cause);
 }
 
-void __devinit t3_mc5_prep(struct adapter *adapter, struct mc5 *mc5, int mode)
+void t3_mc5_prep(struct adapter *adapter, struct mc5 *mc5, int mode)
 {
 #define K * 1024
 
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index cb684d3..9ca8c66 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -2836,7 +2836,7 @@ void t3_sge_init(struct adapter *adap, struct sge_params 
*p)
  * defaults for the assorted SGE parameters, which admins can change until
  * they are used to initialize the SGE.
  */
-void __devinit t3_sge_prep(struct adapter *adap, struct sge_params *p)
+void t3_sge_prep(struct adapter *adap, struct sge_params *p)
 {
int i;
 
diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c
index 7469935..a99496a 100644
--- a/drivers/net/cxgb3/t3_hw.c
+++ b/drivers/net/cxgb3/t3_hw.c
@@ -2675,7 +2675,7 @@ void t3_tp_set_max_rxsize(struct adapter *adap, unsigned 
int size)
 V_PMMAXXFERLEN0(size) | V_PMMAXXFERLEN1(size));
 }
 
-static void __devinit init_mtus(unsigned short mtus[])
+static void init_mtus(unsigned short mtus[])
 {
/*
 * See draft-mathis-plpmtud-00.txt for the values.  The min is 88 so
@@ -2703,7 +2703,7 @@ static void __devinit init_mtus(unsigned short mtus[])
 /*
  * Initial congestion control parameters.
  */
-static void __devinit init_cong_ctrl(unsigned short *a, unsigned short *b)
+static void init_cong_ctrl(unsigned short *a, unsigned short *b)
 {
a[0] = a[1] = a[2] = a[3] = a[4] = a[5] = a[6] = a[7] = a[8] = 1;
a[9] = 2;
@@ -3354,8 +3354,7 @@ out_err:
  * Determines a card's PCI mode and associated parameters, such as speed
  * and width.
  */
-static void __devinit get_pci_mode(struct adapter *adapter,
-  struct pci_params *p)
+static void get_pci_mode(struct adapter *adapter, struct pci_params *p)
 {
static unsigned short speed_map[] = { 33, 66, 100, 133 };
u32 pci_mode, pcie_cap;
@@ -3395,8 +3394,7 @@ static void __devinit get_pci_mode(struct adapter 
*adapter,
  * capabilities and default speed/duplex/flow-control/autonegotiation
  * settings.
  */
-static void __devinit init_link_config(struct link_config *lc,
-  unsigned int caps)
+static void init_link_config(struct link_config *lc, unsigned int caps)
 {
lc->supported = caps;
lc->requested_speed = lc->speed = SPEED_INVALID;
@@ -3419,7 +3417,7 @@ static void __devinit init_link_config(struct link_config 
*lc,
  * Calculates the size of an MC7 memory in bytes from the value of its
  * configuration register.
  */
-static unsigned int __devinit mc7_calc_size(u32 cfg)
+static unsigned int mc7_calc_size(u32 cfg)
 {
unsigned int width = G_WIDTH(cfg);
unsigned int banks = !!(cfg & F_BKS) + 1;
@@ -3430,8 +3428,8 @@ static unsigned int __devinit mc7_calc_size(u32 cfg)
return MBs << 20;
 }
 
-static void __devinit mc7_prep(struct adapter *adapter, struct mc7 *mc7,
-  unsigned int base_addr, const char *name)
+static void mc7_prep(struct adapter *adapter, struct mc7 *mc7,
+unsigned int base_addr, const char *name)
 {
u32 cfg;
 
@@ -3517,7 +3515,7 @@ static int t3_reset_adapter(struct adapter *adapter)
return 0;
 }
 
-static int __devinit init_parity(struct adapter *adap)
+static int init_parity(struct adapter *adap)
 {
int i, err, addr;
 
@@ -3552,8 +3550,8 @@ static int __devinit init_parity(struct adapter *adap)
  * for some adapter tunables, take PHYs out of reset, and initialize the MDIO
  * interface.
  */
-int __devinit t3_prep_adapter(struct adapter *adapter,
- const struct adapter_info *ai, int reset)
+int t3_prep_adapter(struct adapter *adapter, const struct adapter_info *ai,
+   int reset)
 {
int ret;
unsigned int i,

Re: [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks.

2008-01-24 Thread Roland Dreier

thanks, applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RESEND 3/3] RDMA/cxgb3: Mark qp as privileged based on user capabilities.

2008-01-22 Thread Roland Dreier

thanks, applied 1-3
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[RFC PATCH] IPoIB: improve IPv4/IPv6 to IB mcast mapping functions

2008-01-04 Thread Roland Dreier

Any objection to merging the following for 2.6.25?

[Rolf -- I think it makes more sense to delete the overwriting of the
P_Key in ipoib_multicast.c in this patch rather than later in the
series; do you agree?]

Thanks,
  Roland


From: Rolf Manderscheid <[EMAIL PROTECTED]>

An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs.  The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local, and they also leave the partition key field
uninitialised.  This patch adds a parameter (the link-level broadcast
address) to the mapping routines, allowing them to initialise both the
scope and the P_Key appropriately, and fixes up the call sites.

The next step will be to add a way to configure the scope for an IPoIB
interface.

Signed-off-by: Rolf Manderscheid <[EMAIL PROTECTED]>
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 drivers/infiniband/core/cma.c  |4 +---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |4 
 include/net/if_inet6.h |   11 +++
 include/net/ip.h   |   10 ++
 net/ipv4/arp.c |2 +-
 net/ipv6/ndisc.c   |2 +-
 6 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 312ec74..982836e 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2610,11 +2610,9 @@ static void cma_set_mgid(struct rdma_id_private *id_priv,
/* IPv6 address is an SA assigned MGID. */
memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
} else {
-   ip_ib_mc_map(sin->sin_addr.s_addr, mc_map);
+   ip_ib_mc_map(sin->sin_addr.s_addr, dev_addr->broadcast, mc_map);
if (id_priv->id.ps == RDMA_PS_UDP)
mc_map[7] = 0x01;   /* Use RDMA CM signature */
-   mc_map[8] = ib_addr_get_pkey(dev_addr) >> 8;
-   mc_map[9] = (unsigned char) ib_addr_get_pkey(dev_addr);
*mgid = *(union ib_gid *) (mc_map + 4);
}
 }
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 
b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 858ada1..2628339 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -788,10 +788,6 @@ void ipoib_mcast_restart_task(struct work_struct *work)
 
memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid);
 
-   /* Add in the P_Key */
-   mgid.raw[4] = (priv->pkey >> 8) & 0xff;
-   mgid.raw[5] = priv->pkey & 0xff;
-
mcast = __ipoib_mcast_find(dev, &mgid);
if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, 
&mcast->flags)) {
struct ipoib_mcast *nmcast;
diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 448eccb..b24508a 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -269,18 +269,21 @@ static inline void ipv6_arcnet_mc_map(const struct 
in6_addr *addr, char *buf)
buf[0] = 0x00;
 }
 
-static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf)
+static inline void ipv6_ib_mc_map(const struct in6_addr *addr,
+ const unsigned char *broadcast, char *buf)
 {
+   unsigned char scope = broadcast[5] & 0xF;
+
buf[0]  = 0;/* Reserved */
buf[1]  = 0xff; /* Multicast QPN */
buf[2]  = 0xff;
buf[3]  = 0xff;
buf[4]  = 0xff;
-   buf[5]  = 0x12; /* link local scope */
+   buf[5]  = 0x10 | scope; /* scope from broadcast address */
buf[6]  = 0x60; /* IPv6 signature */
buf[7]  = 0x1b;
-   buf[8]  = 0;/* P_Key */
-   buf[9]  = 0;
+   buf[8]  = broadcast[8]; /* P_Key */
+   buf[9]  = broadcast[9];
memcpy(buf + 10, addr->s6_addr + 6, 10);
 }
 #endif
diff --git a/include/net/ip.h b/include/net/ip.h
index 840dd91..50c8889 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -266,20 +266,22 @@ static inline void ip_eth_mc_map(__be32 naddr, char *buf)
  * Leave P_Key as 0 to be filled in by driver.
  */
 
-static inline void ip_ib_mc_map(__be32 naddr, char *buf)
+static inline void ip_ib_mc_map(__be32 naddr, const unsigned char *broadcast, 
char *buf)
 {
__u32 addr;
+   unsigned char scope = broadcast[5] & 0xF;
+
buf[0]  = 0;/* Reserved */
buf[1]  = 0xff; /* Multicast QPN */
buf[2]  = 0xff;
buf[3]  = 0xff;
addr= ntohl(naddr);
buf[4]  = 0xff;
-   buf[5]  = 0x12; /* link local scope */
+   buf[5]  = 0x10 | scope; /* scope from broadcast address */

Bogus commit 70eba18b ("make bnx2x select ZLIB_INFLATE")?

2007-12-08 Thread Roland Dreier

Commit 70eba18b ("make bnx2x select ZLIB_INFLATE") in Linus's tree
seems bogus.  As far as I can tell, bnx2x is not upstream yet, and the
commit in question actually adds "select ZLIB_INFLATE" to the TEHUTI
config, since there is no BNX2X config option (and also I don't see
any reference to zlib in any tehuti files).

I assume this is some sort of merge problem that got added to the
wrong tree and should be reverted upstream.

 - R.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH]: fix lro_gen_skb() alignment

2007-11-30 Thread Roland Dreier

 > >> -   skb = netdev_alloc_skb(lro_mgr->dev, hlen);
 > >> +   skb = netdev_alloc_skb(lro_mgr->dev, hlen + NET_IP_ALIGN);
 > > NET_IP_ALIGN should only be used if you're DMAing into the skb head.
 > > Otherwise you should say 2.  It would be nice to have another macro
 > > for that I suppose.
 > 
 > It is certainly simple enough to say 2.  Thank you for pointing
 > this out.  I have attached a patch to do that..
 > 
 > Signed off by: Andrew Gallatin <[EMAIL PROTECTED]>

Isn't the value of 2 ethernet-specific (to round the 14-byte header up
to 16)?  Given that the rest of the lro code is fairly careful to
calculate mac_hdr_len etc it seems as if it would be cleaner to make
this independent of the specific L2 being used.

(And I plan on using the LRO module for IP-over-InfiniBand so this is
not completely theoretical)

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2.6.25 2/2] RDMA/cxgb3: Support 5.0 firmware.

2007-11-28 Thread Roland Dreier

OK, applied 1 and 2...

 > Note: this change requires 5.0 firmware.

I assume the change to the cxgb3 FW versions is pending in a net
driver change for 2.6.25?
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-26 Thread Roland Dreier

 > Agreed. On first glance, I was intrigued but:
 > 
 > 1) Why is everyone so concerned that export symbol space is large?
 >  - does it cost cpu or running memory?
 >  - does it cause bugs?
 >  - or are you just worried about "evil modules"?
 > 
 > 2) These aren't real namespaces
 >  - all global names still have to be unique
 >  - still have to handle the "non-modular build" namespace conflicts
 >  - there isn't a big problem with conflicting symbols today.

Perhaps changing the name from "namespace" to "interface" would help?
Then a module could have something like

MODULE_USE_INTERFACE(foo);

and I think that makes it clearer what the advantage of this is: it
marks symbols as being part of a certain interface, requires modules
that use that interface to declare that use explicitly, and allows
reviewers to say "Hey why is this code using the scsi interface when
it's a webcam driver?"

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-25 Thread Roland Dreier

 > Except C doesn't have namespaces and this mechanism doesn't create them.  So 
 > this is just complete and utter makework; as I said before, noone's going to 
 > confuse all those udp_* functions if they're not in the udp namespace.

I don't understand why you're so opposed to organizing the kernel's
exported symbols in a more self-documenting way.  It seems pretty
clear to me that having a mechanism that requires modules to make
explicit which (semi-)internal APIs makes reviewing easier, makes it
easier to communicate "please don't use that API" to module authors,
and takes at least a small step towards bringing the kernel's exported
API under control.  What's the real downside?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-25 Thread Roland Dreier

 > > I agree that we shouldn't make things too hard for out-of-tree
 > > modules, but I disagree with your first statement: there clearly is a
 > > large class of symbols that are used by multiple modules but which are
 > > not generically useful -- they are only useful by a certain small class
 > > of modules.
 > 
 > If it is so clear, you should be able to easily provide examples?

Sure -- Andi's example of symbols required only by TCP congestion
modules; the SCSI internals that Christoph wants to mark; the symbols
exported by my mlx4_core driver (which I admit are currently only used
by the mlx4_ib driver, but which will also be used by at least the
ethernet NIC driver for the same hardware).  I thought this was
already covered repeatedly in the thread and indeed in Andi's code so
there was no need to repeat it...
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-25 Thread Roland Dreier

 > Yes, and if a symbol is already used by multiple modules, it's generically 
 > useful.  And if so, why restrict it to in-tree modules?

I agree that we shouldn't make things too hard for out-of-tree
modules, but I disagree with your first statement: there clearly is a
large class of symbols that are used by multiple modules but which are
not generically useful -- they are only useful by a certain small class
of modules.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-25 Thread Roland Dreier

 > This patch allows to export symbols only for specific modules by 
 > introducing symbol name spaces. A module name space has a white
 > list of modules that are allowed to import symbols for it; all others
 > can't use the symbols.
 > 
 > It adds two new macros: 
 > 
 > MODULE_NAMESPACE_ALLOW(namespace, module);

I definitely like the idea of organizing exported symbols into
namespaces.  However, I feel like it would make more sense to have
something like

MODULE_NAMESPACE_IMPORT(namespace);

inside the modules that use a namespace.  This matches the way C
preprocessor #includes, C++ namespaces, perl "use", etc. works and so
it is probably closer to how programmers think.  It does weaken the
enforcement of review a little bit, since there are no changes to the
site where things are exported to catch, but git makes it easy to
sneak that kind of change in anyway.

The practical benefits are that you don't end up with stupid patch
conflicts caused by one patch adding MODULE_NAMESPACE_ALLOW() for
module "foo" and another patch adding it for module "bar" at the same
place, and that it becomes simpler for people to test or deliver, say,
a new TCP congestion module without having to rebuild the whole kernel
(which is an especially huge pain for distro kernels).

In any case I would make use of this in whatever form it gets merged
in -- Mellanox ConnectX hardware can operate simultaneously as an
InfiniBand adapter and a 10Gb ethernet NIC, and so my driver is split
up into a low-level mlx4_core driver that exports symbols for other
mlx4 drivers to use; clearly it only makes sense to export them to
other parts of the driver, and in this case the difference between
"ALLOW" and "IMPORT" semantics is not a big deal.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2.6.24] RDMA/cxgb3: Set the max_qp_init_rd_atom attribute.

2007-11-09 Thread Roland Dreier

thanks, applied.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 11/14 v2] nes: OpenFabrics kernel verbs

2007-10-31 Thread Roland Dreier

 > +/**
 > + * nes_post_send
 > + */
 > +static int nes_post_send(struct ib_qp *ibqp, struct ib_send_wr *ib_wr,
 > +struct ib_send_wr **bad_wr)

 > ...

 > +switch (ib_wr->opcode) {

 > ...

 > +if (ib_wr->num_sge > 
 > nesdev->nesadapter->max_sge) {
 > +err = -EINVAL;
 > +break;
 > +}

 > ...

 > +default:
 > +/* error */
 > +err = -EINVAL;
 > +break;

looks like if you detect an error while posting a work request, you
break out of the switch statement but just continue through the while
loop going through the list of work reuqests.  Which doesn't seem like
it will work very well.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 10/14 v2] nes: eeprom and phy routines

2007-10-31 Thread Roland Dreier

 > +/* TODO: deal with EEPROM endian issues */

This is pretty scary.  Is the driver broken on big-endian systems now?

 > +/*
 > +"Everything you wanted to know about CRC algorithms, but were afraid to ask
 > + for fear that errors in your understanding might be detected." Version  : 
 > 3.

etc etc... can all this be replaced with what's in lib/crc32.c?  (I
hope so)

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/14 v2] nes: module and device initialization

2007-10-30 Thread Roland Dreier

 > Thanks Roland.  Let me know when you have your branch ready.

OK, I pushed out a "neteffect" branch at

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git

This has the driver from your git tree plus some compile fixes and
cleanups (added as separate commits, so you can see what I did).  If
it suits you, let's work against that tree to continue cleaning things
up -- you can send me patches or git pull requests to pick up new things.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/14 v2] nes: module and device initialization

2007-10-26 Thread Roland Dreier

OK, a couple quick review comments and a process comment too:

 - First step in the driver is to kill off a lot of the #ifdefs:

 > +#ifdef IRQF_SHARED

The upstream driver really shouldn't have compatibility gunk for older
kernels... just make it build against the kernel it's in.

 > +#ifdef OFED_1_2

Same... kernel code shouldn't worry about OFED.

 > +#ifdef CONFIG_PCI_MSI
 > +if (nesdev->msi_enabled) {
 > +pci_disable_msi(pcidev);
 > +}
 > +#endif

This can be much simpler, because pci_disable_msi() is always
available and is a NOP if the config option is off or MSI is not
enabled.  So you can just unconditionally do

pci_disable_msi(pcidev);

 > +#ifdef NES_NAPI

I don't see anything that defines NES_NAPI.  I think for the final
merge we want a NAPI-only driver (ie no ifdef at all)... is there any
performance or other reason to ever build a non-NAPI driver (for a
modern kernel)?

OK, on a process level, my plan is to pull the current driver into a
"neteffect" branch in my git tree with the intention of merging it for
2.6.25.  I'll let you know when that's ready (probably early next
week).  I'll probably do some cleanups there, and you can send me
cleanup/fix patches against that branch any time too.  We should try
to keep the cycle time short: the interval between the first posting
of this driver and the current one was pretty long, and there's a lot
of cleanup to do to get ready for the next merge window.  Does that
plan make sense?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 13/14 v2] nes: kernel build infrastructure

2007-10-21 Thread Roland Dreier

 > +config INFINIBAND_NES_DEBUG
 > +bool "Verbose debugging output"
 > +depends on INFINIBAND_NES
 > +default n
 > +---help---
 > +  This option causes the NetEffect RNIC driver to produce debug
 > +  messages.  Select this if you are developing the driver
 > +  or trying to diagnose a problem.

If you make this default n then no distro will have it enabled and
you'll have to rebuild to debug anything.  Better to have the default
be enabled and make it controllable at runtime too with a module
parameter.  (you can look at what mthca does for an example of what I mean)
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH 14/14 v2] nes: kernel build infrastructure

2007-10-21 Thread Roland Dreier

 > +
 > +EXTRA_CFLAGS += -DNES_MINICM

I don't see anyplace NES_MINICM is used.  Delete this line?

 > +
 > +obj-$(CONFIG_INFINIBAND_NES) += iw_nes.o
 > +
 > +iw_nes-objs := nes.o nes_hw.o nes_nic.o nes_utils.o nes_verbs.o nes_cm.o
 > +

Also the file has an extra blank line at the beginning and end.  Might
as well kill them.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/14 v2] nes: device structures and defines

2007-10-19 Thread Roland Dreier

 > You are starting off on the wrong foot.

???

 > > +if(!(expr)) { 
 > > \
 > > +  printk(KERN_ERR PFX "Assertion failed! %s, %s, %s, line %d\n",  \
 > > + #expr, __FILE__, __FUNCTION__, __LINE__);
 > > \
 > > +}
 > 
 > Use BUG_ON

I agree that there's no need to invent a driver-private assertion
macro, but (to first order at least) drivers should never use BUG_ON.
I don't want some glitch in a network driver that the system could
probably survive to be turned into a panic by BUG_ON -- WARN_ON seems
infinitely preferable.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/14 v2] nes: module and device initialization

2007-10-19 Thread Roland Dreier

Thanks... I am kind of overloaded trying to handle the last few things
for the 2.6.24 merge window, but I will look at this next week, and I
expect we should be able to merge the driver for 2.6.25 unless there
are unexpected hangups.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 5/8] [MV643XX_ETH] Remove SHARED_REGS register address bias

2007-10-18 Thread Roland Dreier

 > +static void __iomem *mv643xx_eth_base;

 > +return readl(((void __iomem *)mv643xx_eth_base) + offset);

Given the declaration of mv643xx_eth_base as void __iomem * already, I
don't understand why you need the cast to the same type here (and
elsewhere in the driver).

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bonding support for eth1394?

2007-10-13 Thread Roland Dreier

 > The bonding sources have a few occurrences of EOPNOTSUPP.  Unless I
 > missed something, they are all related to setting the hardware address
 > of the interface.  AFAICS this is impossible with IP over FireWire.  If
 > it is crucial to bonding to be able to change the slaves' hardware
 > addresses, then you are out of luck.

There are a few changes to the bonding driver pending that will add
support for bonding IP-over-InfiniBand interfaces.  IPoIB also cannot
change its HW address, so the patches address that issue.

Once those patches land, bonding eth1394 interfaces may "just work".

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-11 Thread Roland Dreier

 > It happens only when ib interfaces are slaves of a bonding device.
 > I thought before that the stuck is in napi_disable() but it's almost right.
 > I put prints before and after call to napi_disable and see that it is called 
 > twice.
 > I'll try to investigate in this direction.
 > 
 > ib0: stopping interface
 > ib0: before napi_disable
 > ib0: after napi_disable
 > ib0: downing ib_dev
 > ib0: All sends and receives done.
 > ib0: stopping interface
 > ib0: before napi_disable

Yes, two napi_disable()s in a row without a matching napi_enable()
will deadlock.  I guess the question is why the ipoib interface is
being stopped twice.

If you just take the net-2.6.24 tree (without bonding patches), does
bonding for ethernet interfaces work OK, or is there a similar problem
with double napi_disable()?  How about bonding of ethernet after this
batch of bonding patches?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-10-10 Thread Roland Dreier

 > I also ran a test for the code in the branch of 2.6.24 and found a problem.
 > I see that ifconfig down doesn't return (for IPoIB interfaces) and it's 
 > stuck in napi_disable() in the kernel (any idea why?)

For what it's worth, I took the upstream 2.6.23 git tree and merged in
Dave's latest net-2.6.24 tree and my latest for-2.6.24 tree and tried
that.  I brought up an IPoIB interface, sent a few pings, and did
ifconfig down, and it worked fine.

Can you try the same thing without the bonding patches to see if your
setup works OK too?

Also can you give more details about what you do to get ifconfig down stuck?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH 4/4] ibm_emac: Convert to use napi_struct independent of struct net_device

2007-10-09 Thread Roland Dreier

Sorry... wrong subject here; it should have been "ibm_newemac: ..."

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/4] ibm_emac: Convert to use napi_struct independent of struct net_device

2007-10-09 Thread Roland Dreier

Commit da3dedd9 ("[NET]: Make NAPI polling independent of struct
net_device objects.") changed the interface to NAPI polling.  Fix up
the ibm_newemac driver so that it works with this new interface.  This
is actually a nice cleanup because ibm_newemac is one of the drivers
that wants to have multiple NAPI structures for a single net_device.

Compile-tested only as I don't have a system that uses the ibm_newemac
driver.  This conversion the conversion for the ibm_emac driver that
was tested on real PowerPC 440SPe hardware.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 drivers/net/ibm_newemac/mal.c |   55 ++--
 drivers/net/ibm_newemac/mal.h |2 +-
 2 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ibm_newemac/mal.c b/drivers/net/ibm_newemac/mal.c
index c4335b7..5885411 100644
--- a/drivers/net/ibm_newemac/mal.c
+++ b/drivers/net/ibm_newemac/mal.c
@@ -235,10 +235,10 @@ static irqreturn_t mal_serr(int irq, void *dev_instance)
 
 static inline void mal_schedule_poll(struct mal_instance *mal)
 {
-   if (likely(netif_rx_schedule_prep(&mal->poll_dev))) {
+   if (likely(napi_schedule_prep(&mal->napi))) {
MAL_DBG2(mal, "schedule_poll" NL);
mal_disable_eob_irq(mal);
-   __netif_rx_schedule(&mal->poll_dev);
+   __napi_schedule(&mal->napi);
} else
MAL_DBG2(mal, "already in poll" NL);
 }
@@ -318,8 +318,7 @@ void mal_poll_disable(struct mal_instance *mal, struct 
mal_commac *commac)
msleep(1);
 
/* Synchronize with the MAL NAPI poller. */
-   while (test_bit(__LINK_STATE_RX_SCHED, &mal->poll_dev.state))
-   msleep(1);
+   napi_disable(&mal->napi);
 }
 
 void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac)
@@ -330,11 +329,11 @@ void mal_poll_enable(struct mal_instance *mal, struct 
mal_commac *commac)
// XXX might want to kick a poll now...
 }
 
-static int mal_poll(struct net_device *ndev, int *budget)
+static int mal_poll(struct napi_struct *napi, int budget)
 {
-   struct mal_instance *mal = netdev_priv(ndev);
+   struct mal_instance *mal = container_of(napi, struct mal_instance, 
napi);
struct list_head *l;
-   int rx_work_limit = min(ndev->quota, *budget), received = 0, done;
+   int received = 0;
unsigned long flags;
 
MAL_DBG2(mal, "poll(%d) %d ->" NL, *budget,
@@ -358,26 +357,21 @@ static int mal_poll(struct net_device *ndev, int *budget)
int n;
if (unlikely(test_bit(MAL_COMMAC_POLL_DISABLED, &mc->flags)))
continue;
-   n = mc->ops->poll_rx(mc->dev, rx_work_limit);
+   n = mc->ops->poll_rx(mc->dev, budget);
if (n) {
received += n;
-   rx_work_limit -= n;
-   if (rx_work_limit <= 0) {
-   done = 0;
-   // XXX What if this is the last one ?
-   goto more_work;
-   }
+   budget -= n;
+   if (budget <= 0)
+   goto more_work; // XXX What if this is the last 
one ?
}
}
 
/* We need to disable IRQs to protect from RXDE IRQ here */
spin_lock_irqsave(&mal->lock, flags);
-   __netif_rx_complete(ndev);
+   __napi_complete(napi);
mal_enable_eob_irq(mal);
spin_unlock_irqrestore(&mal->lock, flags);
 
-   done = 1;
-
/* Check for "rotting" packet(s) */
list_for_each(l, &mal->poll_list) {
struct mal_commac *mc =
@@ -387,12 +381,12 @@ static int mal_poll(struct net_device *ndev, int *budget)
if (unlikely(mc->ops->peek_rx(mc->dev) ||
 test_bit(MAL_COMMAC_RX_STOPPED, &mc->flags))) {
MAL_DBG2(mal, "rotting packet" NL);
-   if (netif_rx_reschedule(ndev, received))
+   if (napi_reschedule(napi))
mal_disable_eob_irq(mal);
else
MAL_DBG2(mal, "already in poll list" NL);
 
-   if (rx_work_limit > 0)
+   if (budget > 0)
goto again;
else
goto more_work;
@@ -401,13 +395,8 @@ static int mal_poll(struct net_device *ndev, int *budget)
}
 
  more_work:
-   ndev->quota -= received;
-   *budget -= received;
-
-   MAL_DBG2(mal, "poll() %d <- %d" NL, *budget,
-done ? 0 : 1);
-
-   re

[PATCH 1/4] IPoIB: Fix unused variable warning

2007-10-09 Thread Roland Dreier

The conversion to use netdevice internal stats left an unused variable
in ipoib_neigh_free(), since there's no longer any reason to get
netdev_priv() in order to increment dropped packets.  Delete the
unused priv variable.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 6b1b4b2..855c9de 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -854,7 +854,6 @@ struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour 
*neighbour)
 
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh)
 {
-   struct ipoib_dev_priv *priv = netdev_priv(dev);
struct sk_buff *skb;
*to_ipoib_neigh(neigh->neighbour) = NULL;
while ((skb = __skb_dequeue(&neigh->queue))) {
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/4] ibm_emac: Convert to use napi_struct independent of struct net_device

2007-10-09 Thread Roland Dreier

Commit da3dedd9 ("[NET]: Make NAPI polling independent of struct
net_device objects.") changed the interface to NAPI polling.  Fix up
the ibm_emac driver so that it works with this new interface.  This is
actually a nice cleanup because ibm_emac is one of the drivers that
wants to have multiple NAPI structures for a single net_device.

Tested with the internal MAC of a PowerPC 440SPe SoC with an AMCC
'Yucca' evaluation board.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 drivers/net/ibm_emac/ibm_emac_mal.c |   48 --
 drivers/net/ibm_emac/ibm_emac_mal.h |2 +-
 include/linux/netdevice.h   |   10 +++
 3 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ibm_emac/ibm_emac_mal.c 
b/drivers/net/ibm_emac/ibm_emac_mal.c
index cabd984..cc3ddc9 100644
--- a/drivers/net/ibm_emac/ibm_emac_mal.c
+++ b/drivers/net/ibm_emac/ibm_emac_mal.c
@@ -207,10 +207,10 @@ static irqreturn_t mal_serr(int irq, void *dev_instance)
 
 static inline void mal_schedule_poll(struct ibm_ocp_mal *mal)
 {
-   if (likely(netif_rx_schedule_prep(&mal->poll_dev))) {
+   if (likely(napi_schedule_prep(&mal->napi))) {
MAL_DBG2("%d: schedule_poll" NL, mal->def->index);
mal_disable_eob_irq(mal);
-   __netif_rx_schedule(&mal->poll_dev);
+   __napi_schedule(&mal->napi);
} else
MAL_DBG2("%d: already in poll" NL, mal->def->index);
 }
@@ -273,11 +273,11 @@ static irqreturn_t mal_rxde(int irq, void *dev_instance)
return IRQ_HANDLED;
 }
 
-static int mal_poll(struct net_device *ndev, int *budget)
+static int mal_poll(struct napi_struct *napi, int budget)
 {
-   struct ibm_ocp_mal *mal = ndev->priv;
+   struct ibm_ocp_mal *mal = container_of(napi, struct ibm_ocp_mal, napi);
struct list_head *l;
-   int rx_work_limit = min(ndev->quota, *budget), received = 0, done;
+   int received = 0;
 
MAL_DBG2("%d: poll(%d) %d ->" NL, mal->def->index, *budget,
 rx_work_limit);
@@ -295,38 +295,34 @@ static int mal_poll(struct net_device *ndev, int *budget)
list_for_each(l, &mal->poll_list) {
struct mal_commac *mc =
list_entry(l, struct mal_commac, poll_list);
-   int n = mc->ops->poll_rx(mc->dev, rx_work_limit);
+   int n = mc->ops->poll_rx(mc->dev, budget);
if (n) {
received += n;
-   rx_work_limit -= n;
-   if (rx_work_limit <= 0) {
-   done = 0;
+   budget -= n;
+   if (budget <= 0)
goto more_work; // XXX What if this is the last 
one ?
-   }
}
}
 
/* We need to disable IRQs to protect from RXDE IRQ here */
local_irq_disable();
-   __netif_rx_complete(ndev);
+   __napi_complete(napi);
mal_enable_eob_irq(mal);
local_irq_enable();
 
-   done = 1;
-
/* Check for "rotting" packet(s) */
list_for_each(l, &mal->poll_list) {
struct mal_commac *mc =
list_entry(l, struct mal_commac, poll_list);
if (unlikely(mc->ops->peek_rx(mc->dev) || mc->rx_stopped)) {
MAL_DBG2("%d: rotting packet" NL, mal->def->index);
-   if (netif_rx_reschedule(ndev, received))
+   if (napi_reschedule(napi))
mal_disable_eob_irq(mal);
else
MAL_DBG2("%d: already in poll list" NL,
 mal->def->index);
 
-   if (rx_work_limit > 0)
+   if (budget > 0)
goto again;
else
goto more_work;
@@ -335,12 +331,8 @@ static int mal_poll(struct net_device *ndev, int *budget)
}
 
   more_work:
-   ndev->quota -= received;
-   *budget -= received;
-
-   MAL_DBG2("%d: poll() %d <- %d" NL, mal->def->index, *budget,
-done ? 0 : 1);
-   return done ? 0 : 1;
+   MAL_DBG2("%d: poll() %d <- %d" NL, mal->def->index, budget, received);
+   return received;
 }
 
 static void mal_reset(struct ibm_ocp_mal *mal)
@@ -425,11 +417,8 @@ static int __init mal_probe(struct ocp_device *ocpdev)
mal->def = ocpdev->def;
 
INIT_LIST_HEAD(&mal->poll_list);
-   set_bit(__LINK_STATE_START, &mal->poll_dev.state);
-   mal->poll_dev.weight = CONFIG_IBM_EMAC_POLL_WEIGHT;
-   mal->

[PATCH 3/4] ibm_new_emac: Nuke SET_MODULE_OWNER() use

2007-10-09 Thread Roland Dreier

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 drivers/net/ibm_newemac/core.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c
index ce127b9..8ea5009 100644
--- a/drivers/net/ibm_newemac/core.c
+++ b/drivers/net/ibm_newemac/core.c
@@ -2549,7 +2549,6 @@ static int __devinit emac_probe(struct of_device *ofdev,
dev->ndev = ndev;
dev->ofdev = ofdev;
dev->blist = blist;
-   SET_MODULE_OWNER(ndev);
SET_NETDEV_DEV(ndev, &ofdev->dev);
 
/* Initialize some embedded data structures */
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Roland Dreier

 > Before you add new entries to your list, how is that ibm driver NAPI
 > conversion coming along? :-)

OK, thanks for the kick in the pants, I have a couple of patches for
net-2.6.24 coming (including an unrelated trivial warning fix for
IPoIB).

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Roland Dreier

 > Before you add new entries to your list, how is that ibm driver NAPI
 > conversion coming along? :-)

I still haven't done much.  OK, I will try to get my board booting
again this week.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Roland Dreier

 > I'd say we can probably try to get rid of it in 2.6.25, this is
 > assuming we get driver authors to cooperate and do the conversions
 > or alternatively some other motivated person.
 > 
 > I can just threaten to do them all and that should get the driver
 > maintainers going :-)

I can definitely kill LLTX for IPoIB by 2.6.25 and I just added it to
my TODO list so I don't forget.

In fact if 2.6.23 drags on long enough I may do it for 2.6.24
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24

2007-10-05 Thread Roland Dreier

 > I tested this by simulating a slow passive side responder, and it worked as
 > expected for those tests.  Using an MRA does add another MAD to the CM 
 > exchange,
 > which is why it is sent only after seeing a duplicate request.  
 > Alternatively,
 > we can take the OFED module parameter patch.

What the heck, I added this for 2.6.24.  If it doesn't work out we can
back it out.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Vague maybe ppp-related panic report for 2.6.23-rc9

2007-10-04 Thread Roland Dreier

Just as a quick update -- I seem to only be able to reproduce this
crash when my ppp session drops, which seems associated with marginal
signal.  And unfortunately I have great coverage at home so I haven't
been able to reproduce this again today.  Maybe on the train tomorrow
I can crash my laptop...

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] mac80211: Fix TX after monitor interface is converted to managed

2007-10-04 Thread Roland Dreier

 > Programming with assertions (and BUG_ON is a form of that) is
 > generally a good practice.  Almost any book or other source on
 > good programming practices will agree.  Yes, it can be overdone.
 > But I don't really think that is the case here, since the check is
 > relatively inexpensive and the consequence should it ever *somehow*
 > happen could be a something wierd (crash, corruption, etc) w/o any
 > other indication of what occured.

The problem with BUG_ON is that it kills the whole system.  So every
time you add a BUG_ON into code, you have to weigh whether the problem
you detected is so severe that the right response is to panic.  For
example, I can see panicking on something fundamental like corrupted
page tables.  However I would submit that the wireless stack should
*never* use BUG_ON -- printing a warning and trying to limp on seems
preferable to me.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Vague maybe ppp-related panic report for 2.6.23-rc9

2007-10-04 Thread Roland Dreier

 > I don't want to jump the gun on the analysis but it just might
 > be the packet sharing fixes Herbert put in a short time ago.
 > 
 > What you could do is go back to say rc2 and see if you still get
 > the panics, then bisect from there to narrow it down.
 > 
 > If rc2 still gives the panic, it's something else, perhaps device
 > specific.

OK thanks.  I'm still trying to find a good way to reproduce it, so
I'm going to try to get it to happen with -rc9 while running in the
console, and at least take a picture of the backtrace.  If I find a
good way to make it happen then I'll try rc2 and let you know.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Vague maybe ppp-related panic report for 2.6.23-rc9

2007-10-04 Thread Roland Dreier

I'm running 2.6.23-rc9 on my laptop, and when in a coffee shop I use
a Verizon EVDO card to get network access.  This is a kyocera device
that looks like a serial adapter behind an ohci usb controller, and
uses the airprime driver (for usb device 0c88:17da).  The actual IP
networking is ppp over that serial adapter.

The connection is slightly flaky (I blame this on the Verizon
coverage), so I end up restarting things every so often.  The point of
this email (at last) is that I've just seen my second panic (only
info: blinking caps lock led, since I've been in X) that seems
correlated to stopping/restarting pppd.

Sorry for the lack of detail -- I've just switched to running in the
console so if I can provoke the crash again I'll get a little more
info.  I just wanted to mention this in case someone has seen
something similar or has any good ideas about how to capture debugging
output on a laptop (when I'm not home and have no other boxes handy).

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: tcp bw in 2.6

2007-10-02 Thread Roland Dreier

 > It would be really great to see numbers with a more recent kernel
 > than 2.6.18

FWIW Debian has binaries for 2.6.21 in testing and for 2.6.22 in
unstable so it should be very easy for Larry to try at least those.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24

2007-10-02 Thread Roland Dreier

 > >OK -- just to make sure I'm understanding what you're saying: have you
 > >confirmed that your proposed [CM MRA] patches actually fix the issue?
 > 
 > Not directly.  I cannot easily test kernel patches on our larger, production
 > clusters.  We've seen the issue with specific applications on 512 and 1024
 > cores, but I've only been able to test the patch on a 48-core cluster.  I 
 > have
 > verified that it successfully increases the timeout to where it *should* 
 > work,
 > but cannot absolutely confirm that it will fix the problem.  I'm unlikely to
 > know that until the production clusters move to an OFED release (1.3?)
 > containing this patch.

Umm... this is a difficult situation for me to merge the changes then.
We're changing the CM retry behavior blind here.  How do we know that
the MRA changes don't make the scalability issue worse?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] IPoIB: Convert to netdevice internal stats

2007-09-28 Thread Roland Dreier

 > How is that ibm_emac NAPI conversion coming along? :-)

Sorry, trying to reduce my backlog first, but it is still on my list
of things to work on :)

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] IPoIB: Convert to netdevice internal stats

2007-09-28 Thread Roland Dreier

Use the stats member of struct netdevice in IPoIB, so we can save
memory by deleting the stats member of struct ipoib_dev_priv, and save
code by deleting ipoib_get_stats().

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
Dave, can you queue this in net-2.6.24 please?  I would ordinarily
merge IPoIB changes but since this depends on the netdevice internal
stats change it becomes a cross-tree dependency if I try to do that.
And I'd like to get it queued in git now before the merge window.

Thanks...

 drivers/infiniband/ulp/ipoib/ipoib.h   |2 --
 drivers/infiniband/ulp/ipoib/ipoib_cm.c|   20 ++--
 drivers/infiniband/ulp/ipoib/ipoib_ib.c|   18 +-
 drivers/infiniband/ulp/ipoib/ipoib_main.c  |   22 +++---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   10 +-
 5 files changed, 31 insertions(+), 41 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 3a6ef14..1e627ee 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -300,8 +300,6 @@ struct ipoib_dev_priv {
 
struct ib_event_handler event_handler;
 
-   struct net_device_stats stats;
-
struct net_device *parent;
struct list_head child_intfs;
struct list_head list;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 
b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 08b4676..1afd93c 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -430,7 +430,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct 
ib_wc *wc)
ipoib_dbg(priv, "cm recv error "
   "(status=%d, wrid=%d vend_err %x)\n",
   wc->status, wr_id, wc->vendor_err);
-   ++priv->stats.rx_dropped;
+   ++dev->stats.rx_dropped;
goto repost;
}
 
@@ -457,7 +457,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct 
ib_wc *wc)
 * this packet and reuse the old buffer.
 */
ipoib_dbg(priv, "failed to allocate receive buffer %d\n", 
wr_id);
-   ++priv->stats.rx_dropped;
+   ++dev->stats.rx_dropped;
goto repost;
}
 
@@ -474,8 +474,8 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct 
ib_wc *wc)
skb_pull(skb, IPOIB_ENCAP_LEN);
 
dev->last_rx = jiffies;
-   ++priv->stats.rx_packets;
-   priv->stats.rx_bytes += skb->len;
+   ++dev->stats.rx_packets;
+   dev->stats.rx_bytes += skb->len;
 
skb->dev = dev;
/* XXX get correct PACKET_ type here */
@@ -512,8 +512,8 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff 
*skb, struct ipoib_cm_
if (unlikely(skb->len > tx->mtu)) {
ipoib_warn(priv, "packet len %d (> %d) too long to send, 
dropping\n",
   skb->len, tx->mtu);
-   ++priv->stats.tx_dropped;
-   ++priv->stats.tx_errors;
+   ++dev->stats.tx_dropped;
+   ++dev->stats.tx_errors;
ipoib_cm_skb_too_long(dev, skb, tx->mtu - IPOIB_ENCAP_LEN);
return;
}
@@ -532,7 +532,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff 
*skb, struct ipoib_cm_
tx_req->skb = skb;
addr = ib_dma_map_single(priv->ca, skb->data, skb->len, DMA_TO_DEVICE);
if (unlikely(ib_dma_mapping_error(priv->ca, addr))) {
-   ++priv->stats.tx_errors;
+   ++dev->stats.tx_errors;
dev_kfree_skb_any(skb);
return;
}
@@ -542,7 +542,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff 
*skb, struct ipoib_cm_
if (unlikely(post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1),
addr, skb->len))) {
ipoib_warn(priv, "post_send failed\n");
-   ++priv->stats.tx_errors;
+   ++dev->stats.tx_errors;
ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE);
dev_kfree_skb_any(skb);
} else {
@@ -580,8 +580,8 @@ static void ipoib_cm_handle_tx_wc(struct net_device *dev, 
struct ipoib_cm_tx *tx
ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, 
DMA_TO_DEVICE);
 
/* FIXME: is this right? Shouldn't we only increment on success? */
-   ++priv->stats.tx_packets;
-   priv->stats.tx_bytes += tx_req->skb->len;
+   ++dev->stats.tx_packets;
+   dev->stats.tx_bytes += tx_req->skb->len;
 
dev_kfree_skb_any(tx_req->skb);
 
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
b/drivers/infinib

Re: [PATCH V6 0/9] net/bonding: ADD IPoIB support for the bonding driver

2007-09-25 Thread Roland Dreier

 >  Roland, are you comfortable with the IB changes in patches 1 and 2?

Yes, they look fine to me.  Feel free to apply, with

Acked-by: Roland Dreier <[EMAIL PROTECTED]>

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister

2007-09-23 Thread Roland Dreier

 > The action in bonding to a detach of slave is to unregister the master (see 
 > patch 10).
 > This can't be done from the context of unregister_netdevice itself (it is 
 > protected by rtnl_lock).

I'm confused.  Your patch has:

 > +ipoib_slave_detach(cpriv->dev);
 >  unregister_netdev(cpriv->dev);

And ipoib_slave_detach() is:

 > +static inline void ipoib_slave_detach(struct net_device *dev)
 > +{
 > +rtnl_lock();
 > +netdev_slave_detach(dev);
 > +rtnl_unlock();
 > +}

so you are calling netdev_slave_detach() with the rtnl lock held.
Why can't you make the same call from the start of unregister_netdevice()?

Anyway, if the rtnl lock is a problem, can you just add the call to
netdev_slave_detach() to unregister_netdev() before it takes the rtnl lock?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] bnx2: factor out gzip unpacker

2007-09-21 Thread Roland Dreier

 > Just change the makefiles to always install gzip'ed modules
 > modutils knows how to unzip them on the fly.

But that leaves the uncompressed firmware blobs in .data that ends up
in unswappable kernel memory.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] RDMA/CMA: Use neigh_event_send() to initiate neighbour discovery.

2007-09-20 Thread Roland Dreier

 > Roland - can you please queue this up for 2.6.24?

Done, thanks.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister

2007-09-20 Thread Roland Dreier

 > +ipoib_slave_detach(cpriv->dev);
 >  unregister_netdev(cpriv->dev);

Maybe you already answered this before, but I'm still not clear why
this notifier call can't just be added to the start of
unregister_netdevice(), so we can avoid having driver needing to know
anything about bonding internals?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister

2007-09-19 Thread Roland Dreier

 > I think that if there are no other comments, I will submit the entire 11 
 > patches 
 > again (with changes) to make it easier to merge into the kernel. Since the 
 > most of the 
 > content in the patch series is in bonding I thought it would be right that 
 > Jay will 
 > push all the patches to the networking git. Is it OK with you Roland?

Yes, that's fine.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] [PATCH] RDMA/CMA: Implement rdma_resolve_ip retry enhancement.

2007-09-19 Thread Roland Dreier

Thanks for the patch...

 > If an application is calling rdma_resolve_ip() and a status of -ENODATA is 
 > returned from addr_resolve_local/remote(), the timeout mechanism waits until 
 > the application's timeout occurs before rechecking the address resolution 
 > status; the application will wait until it's full timeout occurs.  This case 
 > is seen when the work thread call to process_req() is made before the arp 
 > packet is processed.

I'm having a hard time understanding this changelog.  Could you please
resend with a description that lets me understand:

 - What the current behavior is and what is wrong with that;
 - What the behavior should be;
 - And how your patch changes the behavior to be correct.

 > This patch is in addition to Steve Wise's neigh_event_send patch to initiate 
 > neighbour discovery sent on 9/12/2007.

Does this mean it depends on Steve's patch being applied first?

Also please try to keep lines in the changelog to 72 characters or so...

 > @@ -136,6 +137,7 @@ static void set_timeout(unsigned long ti
 >  static void queue_req(struct addr_req *req)
 >  {
 >  struct addr_req *temp_req;
 > +unsigned long req_timeout = msecs_to_jiffies(MIN_ADDR_TIMEOUT_MS) + 
 > jiffies;
 >  
 >  mutex_lock(&lock);
 >  list_for_each_entry_reverse(temp_req, &req_list, list) {
 > @@ -145,8 +147,10 @@ static void queue_req(struct addr_req *r
 >  
 >  list_add(&req->list, &temp_req->list);
 >  
 > -if (req_list.next == &req->list)
 > +if (req_list.next == &req->list) {
 > +req_timeout = min(req_timeout, req->timeout);
 >  set_timeout(req->timeout);
 > +}
 >  mutex_unlock(&lock);
 >  }

I don't understand this code.  It seems you keep track of the minimum
timeout, and then ignore the value you computed.  What am I missing?

Thanks,
  Roland
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: new NAPI interface broken

2007-09-19 Thread Roland Dreier

 > One other thing I observed is that I can not unload the module as the
 > ref counter of the eth device is too low. I haven't tracked down the 
 > source of this problem yet.

I suspect that this is because netif_rx_reschedule() was missing a
dev_hold() to match the dev_put() in netif_rx_complete().  Dave merged
a fix for that problem yesterday, so current net-2.6.24 should be OK.
Let us know if it's not OK, because then there's another bug...

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang

2007-09-18 Thread Roland Dreier

 > > and I think that is only in Dave's net-2.6.24 tree now, right?
 > 
 > Nope, that was what I downloaded yesterday:
 > 
 > VERSION = 2
 > PATCHLEVEL = 6
 > SUBLEVEL = 23
 > EXTRAVERSION =-rc6
 > NAME = Pink Farting Weasel

Please double check your tree.  I just very carefully looked at my
trees, and the poll_more: label is added in commit 6b460a71 ("[NET]:
Make NAPI polling independent of struct net_device objects.") which is
only in the net-2.6.24 tree.  Of course Dave did not change the
version information in the Makefile since he wouldn't want Linus to
pick up any extra strange changes when he pulls, so a net-2.6.24 tree
will look like 2.6.23-rc6 as you quoted.

And the refcounting bug I fixed is only in net-2.6.24.

 > To be clear, netif_rx_schedule while we are still in the poll list will not
 > do any harm as it does nothing since NAPI_STATE_SCHED is still set (cleared
 > by netif_rx_complete which has not yet run). Effectively we lost/delayed
 > processing an interrupt, if I understood the code right.

Right, we lose an interrupt, and since the CQ events are one-shot, we
never get another one, and the interface is effectively dead.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()

2007-09-18 Thread Roland Dreier

 > Further complicating things is that you need to setup a ppc32
 > cross-build environment to even build test a conversion, and I'm not
 > comfortable doing the surgery until I can test build the thing.

OK, I actually have a system with a ppc 440 SoC that uses this driver,
so I'll try to get things to the stage where I can boot net-2.6.24 on
it and see if I can get the driver working...
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH net-2.6.24] Fix documentation for dev_put()/dev_hold()

2007-09-18 Thread Roland Dreier

It looks like the comments for dev_put() and dev_hold() got reversed somehow.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 include/linux/netdevice.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index be5fe05..239ae68 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1030,7 +1030,7 @@ extern intnetdev_budget;
 extern void netdev_run_todo(void);
 
 /**
- * dev_put - get reference to device
+ * dev_put - release reference to device
  * @dev: network device
  *
  * Hold reference to device to keep it from being freed.
@@ -1041,7 +1041,7 @@ static inline void dev_put(struct net_device *dev)
 }
 
 /**
- * dev_hold - release reference to device
+ * dev_hold - get reference to device
  * @dev: network device
  *
  * Release reference to device to allow it to be freed.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()

2007-09-18 Thread Roland Dreier

netif_rx_complete() takes a netdev parameter and does dev_put() on
that netdev, so netif_rx_reschedule() needs to also take a netdev
parameter and do dev_hold() on it to avoid reference counts from
getting becoming negative because of unbalanced dev_put()s.

This should fix the problem reported by Krishna Kumar
<[EMAIL PROTECTED]> with IPoIB waiting forever for netdev refcounts
to become 0 during module unload.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
Dave, feel free to roll this up into earlier NAPI conversion patches
(assuming I'm understanding things correctly and this patch actually
makes sense!).

BTW, it looks like drivers/net/ibm_emac/ibm_emac_mal.c would not have 
built in the current net-2.6.24 tree, since its call to
netif_rx_reschedule() was left with the netdev parameter.  So that
file does not need to be touched in this patch.

 drivers/infiniband/ulp/ipoib/ipoib_ib.c |2 +-
 drivers/net/arm/ep93xx_eth.c|2 +-
 drivers/net/ehea/ehea_main.c|2 +-
 drivers/net/ibmveth.c   |2 +-
 include/linux/netdevice.h   |   21 +++--
 5 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 6a2bff4..481e4b6 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -320,7 +320,7 @@ poll_more:
if (unlikely(ib_req_notify_cq(priv->cq,
  IB_CQ_NEXT_COMP |
  IB_CQ_REPORT_MISSED_EVENTS)) &&
-   netif_rx_reschedule(napi))
+   netif_rx_reschedule(dev, napi))
goto poll_more;
}
 
diff --git a/drivers/net/arm/ep93xx_eth.c b/drivers/net/arm/ep93xx_eth.c
index f3858d1..7f016f3 100644
--- a/drivers/net/arm/ep93xx_eth.c
+++ b/drivers/net/arm/ep93xx_eth.c
@@ -309,7 +309,7 @@ poll_some_more:
}
spin_unlock_irq(&ep->rx_lock);
 
-   if (more && netif_rx_reschedule(napi))
+   if (more && netif_rx_reschedule(dev, napi))
goto poll_some_more;
}
 
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 4a5ab4a..9a499f4 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -636,7 +636,7 @@ static int ehea_poll(struct napi_struct *napi, int budget)
if (!cqe && !cqe_skb)
return 0;
 
-   if (!netif_rx_reschedule(napi))
+   if (!netif_rx_reschedule(dev, napi))
return 0;
}
 
diff --git a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c
index b8d7cec..b94f266 100644
--- a/drivers/net/ibmveth.c
+++ b/drivers/net/ibmveth.c
@@ -973,7 +973,7 @@ static int ibmveth_poll(struct napi_struct *napi, int 
budget)
netif_rx_complete(netdev, napi);
 
if (ibmveth_rxq_pending_buffer(adapter) &&
-   netif_rx_reschedule(napi)) {
+   netif_rx_reschedule(netdev, napi)) {
lpar_rc = h_vio_signal(adapter->vdev->unit_address,
   VIO_IRQ_DISABLE);
goto restart_poll;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index be5fe05..0dbf185 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1198,16 +1198,6 @@ static inline u32 netif_msg_init(int debug_value, int 
default_msg_enable_bits)
return (1 << debug_value) - 1;
 }
 
-/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete().  */
-static inline int netif_rx_reschedule(struct napi_struct *n)
-{
-   if (napi_schedule_prep(n)) {
-   __napi_schedule(n);
-   return 1;
-   }
-   return 0;
-}
-
 /* Test if receive needs to be scheduled but only if up */
 static inline int netif_rx_schedule_prep(struct net_device *dev,
 struct napi_struct *napi)
@@ -1234,6 +1224,17 @@ static inline void netif_rx_schedule(struct net_device 
*dev,
__netif_rx_schedule(dev, napi);
 }
 
+/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete().  */
+static inline int netif_rx_reschedule(struct net_device *dev,
+ struct napi_struct *napi)
+{
+   if (napi_schedule_prep(napi)) {
+   __netif_rx_schedule(dev, napi);
+   return 1;
+   }
+   return 0;
+}
+
 /* same as netif_rx_complete, except that local_irq_save(flags)
  * has already been issued
  */
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister

2007-09-18 Thread Roland Dreier

 > Maybe this new notification function should be in net/core/dev.c
 > instead of exporting call_netdevice_notifiers()?

Or actually, does it work to add the call to the notifiers directly in
unregister_netdev() so that device drivers don't have to worry about it?

(And is the existing patch missing a call to notifiers in ipoib_vlan_delete()?)

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: InfiniBand/RDMA merge plans for 2.6.24

2007-09-18 Thread Roland Dreier

 > Roland, could you merge the common TX CQ patch please?
 > It actually fixes a real problem.

Yes, I will, but it collides with the net-2.6.24 NAPI rework I think,
so it may not go in until a few days after the merge window.

Have you verified that the patch cures the interrupt overload issues?
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang

2007-09-18 Thread Roland Dreier

Thanks for testing on ehca...

 > While using IPoIB over EHCA (rc6 bits), unregister_netdev hangs with

I don't think you're actually using rc6 bits, since in your patch you have:

 > -poll_more:

and I think that is only in Dave's net-2.6.24 tree now, right?

 > The problem is that the poll handler does netif_rx_complete (which
 > does a dev_put) followed by netif_rx_reschedule() to schedule for
 > more receives (which again does a dev_put). This reduces refcount to
 > < 0 (depending on how many times netif_rx_complete followed by
 > netif_rx_reschedule was called).

Dave, the real problem seems to be that netif_rx_recschedule() calls
__napi_schedule() rather than __netif_rx_schedule(), so it misses the
call to dev_hold() that is needed to balance the dev_put() in
netif_rx_complete().  The current netif_rx_reschedule() looks like it
really should be napi_reschedule(), and we need a new function that
takes a netdev too.  Or am I misunderstanding the refcounting?

I'll send a patch once I've had some breakfast and had a chance to at
least compile it...

Krishna, unfortunately your proposed fix has a race:

 > -netif_rx_complete(dev, napi);
 > -if (unlikely(ib_req_notify_cq(priv->cq,
 > -  IB_CQ_NEXT_COMP |
 > -  IB_CQ_REPORT_MISSED_EVENTS)) &&
 > -netif_rx_reschedule(napi))
 > -goto poll_more;
 > +if (likely(!ib_req_notify_cq(priv->cq,
 > + IB_CQ_NEXT_COMP |
 > + IB_CQ_REPORT_MISSED_EVENTS)))

It is possible for an interrupt to happen immediately right here,
before the netif_rx_complete(), so that netif_rx_schedule() gets
called while we are still on the poll list.

 > +netif_rx_complete(dev, napi);

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister

2007-09-17 Thread Roland Dreier

 >  Conceptually, I see your point and I'm ok with doing it either
 > way.  My only question is, would this change would make the ipoib module
 > dependent upon having the bonding module loaded (to resolve all of the
 > symbols)?

Yes, I guess so, if that function is in bonding.  Hmm, that wouldn't
be a good change.

Maybe this new notification function should be in net/core/dev.c
instead of exporting call_netdevice_notifiers()?

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/11] IB/ipoib: Bound the net device to the ipoib_neigh structue

2007-09-17 Thread Roland Dreier

Overall idea looks good... one comment:

 > +if (n->dev->flags & IFF_MASTER) {
 > +/* n->dev is not an IPoIB device and we have
 > +to take priv from elsewhere */
 > +neigh = *to_ipoib_neigh(n);
 > +if (neigh) {
 > +priv = netdev_priv(neigh->dev);
 > +ipoib_dbg(priv, "neigh_destructor for bonding device: 
 > %s\n",
 > +  n->dev->name);
 > +} else
 > +return;
 > +}

seems like it would be cleaner not to worry about bonding here and
just use neigh->dev all the time rather than having two ways to look
up the device.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister

2007-09-17 Thread Roland Dreier

OK with me.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister

2007-09-17 Thread Roland Dreier

Actually, thinking about this some more... would it be cleaner to more
the knowledge about bonding out of the ipoib driver?  in other words,
export something similar to

 > +static int ipoib_slave_detach(struct net_device *dev)
 > +{
 > +int ret = 0;
 > +if (dev->flags & IFF_SLAVE) {
 > +dev->priv_flags |= IFF_SLAVE_DETACH;
 > +rtnl_lock();
 > +ret = call_netdevice_notifiers(NETDEV_CHANGE, dev);
 > +rtnl_unlock();
 > +}
 > +return ret;
 > +}

for drivers to use, rather than putting use of IFF_SLAVE and
IFF_SLAVE_DETACH outside of the bonding driver.

Also it seems this function could return void, since both call sites
ignore the return value and I don't see anything sensible that IPoIB
could do with the notifier chain return value anyway.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 04/11] IB/ipoib: Verify address handle validity on send

2007-09-17 Thread Roland Dreier

Looks fine overall, with one minor nitpick:

 > -if (unlikely(memcmp(&neigh->dgid.raw,
 > +if (unlikely((memcmp(&neigh->dgid.raw,
 >  skb->dst->neighbour->ha + 4,
 > -sizeof(union ib_gid {
 > +sizeof(union ib_gid))) ||
 > + (neigh->dev != dev))) {

the indentation here makes this confusing to read -- I would just do:

} else if (neigh->ah) {
if (unlikely(memcmp(&neigh->dgid.raw,
skb->dst->neighbour->ha + 4,
-   sizeof(union ib_gid {
+   sizeof(union ib_gid)) ||
+neigh->dev != dev)) {
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/11] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag

2007-09-17 Thread Roland Dreier

I tried to look at the ipoib stuff in this series... this patch looks
fine but it doesn't actually touch ipoib, so the subject line is a bit
misleading...
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24

2007-09-17 Thread Roland Dreier

 > The IGMP enabling patch posted by me on September 2nd isn't on your list
 > http://lists.openfabrics.org/pipermail/general/2007-September/040250.html
 > can you add it?

Yes, I lost that somehow.  I will add it to my list of things to take
a look at (no opinion yet).

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: InfiniBand/RDMA merge plans for 2.6.24

2007-09-17 Thread Roland Dreier

 > > IPoIB CM handles this properly by gathering together single pages in
 > > skbs' fragment lists.

 > Then can we reuse IPoIB CM code here?

Yes, if possible, refactoring things so that the rx skb allocation
code becomes common between CM and non-CM would definitely make sense.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24

2007-09-14 Thread Roland Dreier

 > > I've been meaning to track down the bnx2 iscsi offload patch to look
 > > and see if this issue is addressed, since the same problem seems to
 > > exist: it seems an iscsi connection and a main stack tcp connection
 > > might share the same 4-tuple unless something is done to avoid that
 > > happening.

 > iSCSI does not do passive listens, only active connections to the
 > target.  But you're right, the port space is still shared between iSCSI
 > and the main stack.  We currently rely on user apps binding to the main
 > stack to reserve certain ephemeral ports, and telling the iSCSI driver
 > which ports to use.

Got it... I wasn't thinking that clearly, but it is clear that a full
4-tuple collision with only active connections is quite unlikely.  I
guess you would have to make both an offloaded and a non-offloaded
iSCSI connection to the same target and get really unlucky with
ephemeral port allocation.  So in practice I guess it's not an issue
at all with your driver yet.

However, do you have any plans to support iSCSI offload for targets?
Also, looking at the first CNIC patch, I can't help but notice that
you seem to have at least some support for iWARP there.  How does the
CNIC look?  Does it share the same interface/addresses as the
non-offload NIC, or does it create a completely separate netdevice?

I want to make sure that whatever solution we come up with for cxgb3
doesn't cause problems for you.  And of course if you have a better
idea than what Steve has come up with, that would be great :)

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: InfiniBand/RDMA merge plans for 2.6.24

2007-09-14 Thread Roland Dreier

 > The patch is just needed to pick up broadcast MTU size instead of hard
 > coding 2K right now. SKB allocation shouldn't be different with Ethernet
 > Jambo Frame and IPoIB-CM which 64K MTU. I don't understand why it's
 > different. Could you please explain this?

It's exactly the same problem as ethernet jumbo frames.  A web search
for '"order 1" failure e1000' might be interesting.

IPoIB CM handles this properly by gathering together single pages in
skbs' fragment lists.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts.

2007-09-14 Thread Roland Dreier

 > Maybe you should automatically create an alias each time new interface
 > is added so that admin would not care about proper aliases?

I agree that makes much more sense from a user interface point of
view.  Unfortunately an alias without an address doesn't make sense,
so there doesn't seem to be a way to implement that.

 - R.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

1 2 3 4 >

1 - 100 of 309 matches

Mail list logo