Re: [PATCH] CDC-NCM: remove "connected" log message
> now that you put it that way, I get the merit of what you are saying. > Very well. I will submit the first set of patches. > > May I add your "Tested-by"? Yes, absolutely: Tested-by: Roland Dreier
Re: [PATCH] CDC-NCM: remove "connected" log message
> > to preserve the legacy behavior rather than changing the behavior of > > every usbnet driver all at once? Like make a new > > usbnet_get_link_ksettings_nonmdio and update only cdc_ncm to use it? > > Then I would have to touch them all. The problem is that the MDIO > stuff really is pretty much a layering violation. It should never > have been default. But now it is. I don't understand this. Your 0001 patch changes the behavior of usbnet_get_link_ksettings() and you have to touch all of the 8 drivers that use it if you don't want to change their behavior. If you keep the old usbnet_get_link_ksettings() and add usbnet_get_link_ksettings_nonmdio() then you can just update cdc_ncm to start with, and then gradually migrate other drivers. And eventually fix the layering violation and get rid of the legacy function when the whole transition is done. - R.
Re: [PATCH] CDC-NCM: remove "connected" log message
I haven't tried these patches yet but they don't look quite right to me. inlining the first 0001 patch: > diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c > index 1447da1d5729..bcd17f6d6de6 100644 > --- a/drivers/net/usb/usbnet.c > +++ b/drivers/net/usb/usbnet.c > @@ -944,7 +944,7 @@ EXPORT_SYMBOL_GPL(usbnet_open); > * they'll probably want to use this base set. > */ > > -int usbnet_get_link_ksettings(struct net_device *net, > +int usbnet_get_link_ksettings_mdio(struct net_device *net, >struct ethtool_link_ksettings *cmd) > { > struct usbnet *dev = netdev_priv(net); > @@ -956,6 +956,32 @@ int usbnet_get_link_ksettings(struct net_device *net, > > return 0; > } > +EXPORT_SYMBOL_GPL(usbnet_get_link_ksettings_mdio); why keep and export the old function when it will have no callers? > +int usbnet_get_link_ksettings(struct net_device *net, > +struct ethtool_link_ksettings *cmd) > +{ > +struct usbnet *dev = netdev_priv(net); > + > +/* the assumption that speed is equal on tx and rx > + * is deeply engrained into the networking layer. > + * For wireless stuff it is not true. > + * We assume that rxspeed matters more. > + */ > +if (dev->rxspeed != SPEED_UNKNOWN) > +cmd->base.speed = dev->rxspeed / 100; > +else if (dev->txspeed != SPEED_UNKNOWN) > +cmd->base.speed = dev->txspeed / 100; > +/* if a minidriver does not record speed we try to > + * fall back on MDIO > + */ > +else if (!dev->mii.mdio_read) > +cmd->base.speed = SPEED_UNKNOWN; > +else > +mii_ethtool_get_link_ksettings(&dev->mii, cmd); > + > +return 0; This is a change in behavior for every driver that doesn't set rxspeed / txspeed - the old get_link function would return EOPNOTSUPP if mdio_read isn't implemented, now we give SPEED_UNKNOWN with a successful return code. > @@ -1661,6 +1687,8 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod) > dev->intf = udev; > dev->driver_info = info; > dev->driver_name = name; > +dev->rxspeed = -1; /* unknown or handled by MII */ > +dev->txspeed = -1; Minor nit: if we're going to test these against SPEED_UNKNOWN above, then I think it's clearer to initialize them to that value via the same constant. > diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h > index 88a7673894d5..f748c758f82a 100644 > --- a/include/linux/usb/usbnet.h > +++ b/include/linux/usb/usbnet.h > @@ -267,8 +269,11 @@ extern void usbnet_purge_paused_rxq(struct usbnet *); > > extern int usbnet_get_link_ksettings(struct net_device *net, > struct ethtool_link_ksettings *cmd); > -extern int usbnet_set_link_ksettings(struct net_device *net, > +extern int usbnet_set_link_ksettings_mdio(struct net_device *net, > const struct ethtool_link_ksettings *cmd); > +/* Legacy - to be used if you really need an error to be returned */ > +extern int usbnet_set_link_ksettings(struct net_device *net, > +const struct ethtool_link_ksettings *cmd); > extern u32 usbnet_get_link(struct net_device *net); > extern u32 usbnet_get_msglevel(struct net_device *); > extern void usbnet_set_msglevel(struct net_device *, u32); I think this was meant to be changing get_link, not set_link. Also I don't understand the "Legacy" comment. Is that referring to the EOPNOTSUPP change I mentioned above? If so, wouldn't it be better to preserve the legacy behavior rather than changing the behavior of every usbnet driver all at once? Like make a new usbnet_get_link_ksettings_nonmdio and update only cdc_ncm to use it? - R.
Re: [PATCH] CDC-NCM: remove "connected" log message
> I looked at them again and found that there is a way to get > the same effect that will make maintenance easier in the long run. > Could I send them to you later this week for testing? Yes, please. I have a good test setup now so I can easily try out patches. Thanks, Roland
Re: [PATCH] CDC-NCM: remove "connected" log message
> Applied to net and queued for LTS, thanks! Thanks - is Oliver's series of 3 patches that get rid of the other half of the log spam also on the way upstream? - R.
[PATCH] CDC-NCM: remove "connected" log message
The cdc_ncm driver passes network connection notifications up to usbnet_link_change(), which is the right place for any logging. Remove the netdev_info() duplicating this from the driver itself. This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN" (ID 20f4:e02b) adapter from spamming the kernel log with cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected messages every 60 msec or so. Signed-off-by: Roland Dreier --- drivers/net/usb/cdc_ncm.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index a45fcc44facf..50d3a4e6d445 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -1850,9 +1850,6 @@ static void cdc_ncm_status(struct usbnet *dev, struct urb *urb) * USB_CDC_NOTIFY_NETWORK_CONNECTION notification shall be * sent by device after USB_CDC_NOTIFY_SPEED_CHANGE. */ - netif_info(dev, link, dev->net, - "network connection: %sconnected\n", - !!event->wValue ? "" : "dis"); usbnet_link_change(dev, !!event->wValue, 0); break; -- 2.29.2
Re: cdc_ncm kernel log spam with trendnet 2.5G USB adapter
On Tue, Dec 22, 2020 at 6:49 PM Jakub Kicinski wrote: > I'm not sure what the story here is but if this change is expected to > get into the networking tree we'll need a fresh posting. This sort of > scissored reply does not get into patchwork. OK, will resend. Too bad about patchwork, "git am" drops everything before scissors lines by default. > It sounds like you're getting tens of those messages a second, we can > remove the message but the device is still generating spurious events, > wasting CPU cycles. Was blocking those events deemed unfeasible? I certainly don't know enough about the USB CDC class to know why the spurious messages are showing up or whether they could be suppressed without a fix in the adapter firmware. But even ~30 spurious messages per second doesn't seem so bad for a multi-gig adapter that might be handling 100,000 or more packets per second. - R.
Re: cdc_ncm kernel log spam with trendnet 2.5G USB adapter
(Apologies, trying one more time with a better mailer) Sorry it took so long, but I finally got a chance to test the patches. They seem to work well, but they only get rid of the downlink / uplink speed spam - I still get the following filling my kernel log with a patched kernel: [ 29.830383] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected [ 29.894359] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected [ 29.958601] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected [ 30.022473] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected [ 30.086548] cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected with the below patch on top of your 3, then my kernel log is clean. Please apply your patches plus my patch, and feel free to add Tested-by: Roland Dreier to the other three. --- 8< -- 8< --- Subject: [PATCH] CDC-NCM: remove "connected" log message The cdc_ncm driver passes network connection notifications up to usbnet_link_change(), which is the right place for any logging. Remove the netdev_info() duplicating this from the driver itself. This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN" (ID 20f4:e02b) adapter from spamming the kernel log with cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected messages every 60 msec or so. Signed-off-by: Roland Dreier --- drivers/net/usb/cdc_ncm.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c index a45fcc44facf..50d3a4e6d445 100644 --- a/drivers/net/usb/cdc_ncm.c +++ b/drivers/net/usb/cdc_ncm.c @@ -1850,9 +1850,6 @@ static void cdc_ncm_status(struct usbnet *dev, struct urb *urb) * USB_CDC_NOTIFY_NETWORK_CONNECTION notification shall be * sent by device after USB_CDC_NOTIFY_SPEED_CHANGE. */ - netif_info(dev, link, dev->net, - "network connection: %sconnected\n", - !!event->wValue ? "" : "dis"); usbnet_link_change(dev, !!event->wValue, 0); break; -- 2.29.2
Re: cdc_ncm kernel log spam with trendnet 2.5G USB adapter
> your criticism was valid. Could you test the attached patches? Thanks - I will build a kernel and report back. - R.
cdc_ncm kernel log spam with trendnet 2.5G USB adapter
Hi, I recently got a 2.5G USB adapter, and although it works fine, the driver continually spams the kernel log with messages like [127662.025647] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: network connection: connected [127662.057680] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: 2500 mbit/s downlink 2500 mbit/s uplink [127662.089794] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: network connection: connected [127662.121831] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: 2500 mbit/s downlink 2500 mbit/s uplink [127662.153858] cdc_ncm 4-1:2.0 enx3c8cf8f942e0: network connection: connected ... Looking at the code in cdc_ncm.c it seems the device is just sending USB_CDC_NOTIFY_NETWORK_CONNECTION and USB_CDC_NOTIFY_SPEED_CHANGE urbs over and over, and the driver logs every single one. Should we add code to the driver to keep track of what the last event was and only log if the state has really change? Thanks! Roland Full details on the adapter, in case it matters: Bus 004 Device 005: ID 20f4:e02b TRENDnet USB 10/100/1G/2.5G LAN Device Descriptor: bLength18 bDescriptorType 1 bcdUSB 3.20 bDeviceClass0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 9 idVendor 0x20f4 TRENDnet idProduct 0xe02b bcdDevice 30.04 iManufacturer 1 iProduct2 iSerial 6 bNumConfigurations 3 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0039 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 512mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber0 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass255 Vendor Specific Subclass bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes2 Transfer TypeBulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 3 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes2 Transfer TypeBulk Synch Type None Usage Type Data wMaxPacketSize 0x0400 1x 1024 bytes bInterval 0 bMaxBurst 3 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes3 Transfer TypeInterrupt Synch Type None Usage Type Data wMaxPacketSize 0x0002 1x 2 bytes bInterval 11 bMaxBurst 0 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0068 bNumInterfaces 2 bConfigurationValue 2 iConfiguration 0 bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 512mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 13 bInterfaceProtocol 0 iInterface 5 CDC Header: bcdCDC 1.10 CDC Union: bMasterInterface0 bSlaveInterface 1 CDC Ethernet: iMacAddress 3 (??) bmEthernetStatistics0x0031500f wMaxSegmentSize 1518 wNumberMCFilters0x8000 bNumberPowerFilters 0 CDC NCM: bcdNcmVersion1.00 bmNetworkCapabilities 0x2b 8-byte ntb input size max datagram size net address packet filter Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes3 Transfer TypeInterrupt Synch Type None Usage Type Data wMaxPacketSize 0x0010 1x 16 bytes bInterval 8 bMaxBurst 0 Interface Descriptor: bLength 9
[PATCH] qed: Fix potential use-after-free in qed_spq_post()
From: Roland Dreier We need to check if p_ent->comp_mode is QED_SPQ_MODE_EBLOCK before calling qed_spq_add_entry(). The test is fine is the mode is EBLOCK, but if it isn't then qed_spq_add_entry() might kfree(p_ent). Signed-off-by: Roland Dreier --- drivers/net/ethernet/qlogic/qed/qed_spq.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c index be48d9abd001..3588081b2e27 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_spq.c +++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c @@ -776,6 +776,7 @@ int qed_spq_post(struct qed_hwfn *p_hwfn, int rc = 0; struct qed_spq *p_spq = p_hwfn ? p_hwfn->p_spq : NULL; bool b_ret_ent = true; + bool eblock; if (!p_hwfn) return -EINVAL; @@ -794,6 +795,11 @@ int qed_spq_post(struct qed_hwfn *p_hwfn, if (rc) goto spq_post_fail; + /* Check if entry is in block mode before qed_spq_add_entry, +* which might kfree p_ent. +*/ + eblock = (p_ent->comp_mode == QED_SPQ_MODE_EBLOCK); + /* Add the request to the pending queue */ rc = qed_spq_add_entry(p_hwfn, p_ent, p_ent->priority); if (rc) @@ -811,7 +817,7 @@ int qed_spq_post(struct qed_hwfn *p_hwfn, spin_unlock_bh(&p_spq->lock); - if (p_ent->comp_mode == QED_SPQ_MODE_EBLOCK) { + if (eblock) { /* For entries in QED BLOCK mode, the completion code cannot * perform the necessary cleanup - if it did, we couldn't * access p_ent here to see whether it's successful or not. -- 2.14.1
Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs
> I think the conclusion is that a hard-wired ACS capability is a > positive indication of isolation for a multifunction device, the code > is intended to support this and appears to do so, and Roland was going > to investigate the sightings that inspired this patch in more detail. > Dropping for now is appropriate. Thanks, That's right. I confirmed that the issue we found was due to another PCI quirk. It may make sense to add more 82599 variants to the table, but the X540 and X550 work without a quirk. Sorry for the noise. - R.
Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs
> Is there a misunderstanding of the code flow here? We're never setting > EC. In the first code block we're just masking out requested > capabilities where unimplemented capabilities is the same as > implemented + enabled. We're not adding EC to the request, we're > just not removing it based on the implemented capabilities because we > don't interpret it the same way. Thus if someone wants to test a > device for EC, it really needs to implement EC, we cannot assume it > based on lack of support for EC in the ACS capability. As you point > out, nobody really cares about EC yet though. I guess I find the semantics confusing. For every other bit, pci_acs_enabled() returns true if the bit is either enabled or not implemented. For EC, it returns false if the bit is not implemented. It's not clear to me what the use case for checking for PCI_ACS_EC enabled would be. Seems like checking for EC in the capabilities register would be more useful. - R.
Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs
> I think that the device implementing ACS and not implementing certain > features that are "must be implemented if..." is a positive indication > that the device does not have that behavior and therefore the ability > to control that behavior is unnecessary. pci_acs_flags_enabled() is > coded with this intention: > > static bool pci_acs_flags_enabled(struct pci_dev *pdev, u16 acs_flags) > { > int pos; > u16 cap, ctrl; > > pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ACS); > if (!pos) > return false; > > /* > * Except for egress control, capabilities are either required > * or only required if controllable. Features missing from the > * capability field can therefore be assumed as hard-wired enabled. > */ > pci_read_config_word(pdev, pos + PCI_ACS_CAP, &cap); > acs_flags &= (cap | PCI_ACS_EC); > > > > pci_read_config_word(pdev, pos + PCI_ACS_CTRL, &ctrl); > return (ctrl & acs_flags) == acs_flags; > > I'm not sure it makes sense to look for the EC bit in the control register if the capability bit says it is not supported. In fact I think it would violate the PCI spec to set the bit in the control register if it is zero in the capability register - the spec says: Default value of this bit is 0b. Must be hardwired to 0b if the ACS P2P Egress Control functionality is not implemented. I don't understand the reasoning behind forcing the EC bit in commit 83db7e0bdb70. I think the only reason this works is that nowhere in the kernel uses this function to check PCI_ACS_EC. > At least that's my interpretation and how I've indicated to folks at > Intel how things should work for a device that does not support p2p. > Of course this all hinges on having an ACS capability. Without an ACS > capability we must assume any sort of p2p is possible unless we have a > quirk to tell us otherwise. With ACS, the absence of capabilities is a > positive indication that those forms of p2p are not possible (depending > on the spec wording for that capability). If the code doesn't behave > that way, let's fix it. Thanks, Thanks - I'm looking more closely at what goes wrong with X550, since from reading the code it looks like it should work, but people have observed that it doesn't on our system. However the issue might be elsewhere. However I'll send a patch removing that "| PCI_ACS_EC" from the check if you agree - maybe I'm misunderstanding the logic but I don't see how it could work if it ever became relevant. - R.
Re: [PATCH] PCI: Update ACS quirk for more Intel 10G NICs
On Thu, Jul 20, 2017 at 3:15 PM, Alex Williamson wrote: > Most of the ACS capabilities are worded as "Must be implemented by > devices that implement ..." Shouldn't a hard-wired ACS capability > sufficiently describe that, or is there something wrong with how > they're hard wired? Interesting question. I just looked at what the PCI spec says about the various bits for ACS functions in multi-function devices. Many of the functions are "must not be implemented." Of the ones that are "must be implemented if..." the key is that the if is for devices that support peer-to-peer traffic. I think the Intel NICs are compliant with the spec - they don't support any ACS mechanisms for controlling peer-to-peer traffic, but they also don't implement peer-to-peer traffic. This means that the PCI standard way of knowing that it is safe to assign individual functions does not prove it is safe - but with device-specific knowledge we do know it is safe. Hence a quirk to give that device-specific information to the kernel. - R.
[PATCH] PCI: Update ACS quirk for more Intel 10G NICs
From: Roland Dreier Add one more variant of the 82599 plus the device IDs for X540 and X550 variants. Intel has confirmed that none of these devices does peer-to-peer between functions. The X540 and X550 have added ACS capabilities in their PCI config space, but the ACS control register is hard-wired to 0 for both devices, so we still need the quirk for IOMMU grouping to allow assignment of individual SR-IOV functions. Signed-off-by: Roland Dreier --- drivers/pci/quirks.c | 21 + 1 file changed, 21 insertions(+) diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 6967c6b4cf6b..b939db671326 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4335,12 +4335,33 @@ static const struct pci_dev_acs_enabled { { PCI_VENDOR_ID_INTEL, 0x1507, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1514, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x151C, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x1528, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1529, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x154A, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x152A, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x154D, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x154F, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1551, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x1558, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x1560, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x1563, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15AA, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15AB, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15AC, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15AD, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15AE, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15B0, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15AB, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15C2, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15C3, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15C4, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15C6, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15C7, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15C8, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15CE, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15E4, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15E5, pci_quirk_mf_endpoint_acs }, + { PCI_VENDOR_ID_INTEL, 0x15D1, pci_quirk_mf_endpoint_acs }, /* 82580 */ { PCI_VENDOR_ID_INTEL, 0x1509, pci_quirk_mf_endpoint_acs }, { PCI_VENDOR_ID_INTEL, 0x150E, pci_quirk_mf_endpoint_acs }, -- 2.11.0
Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
On Fri, Jul 8, 2016 at 9:51 AM, Jason Gunthorpe wrote: > So, it appears, the dst and neigh can be used for all performances cases. > > For the non performance dst == null case, can we just burn cycles and > stuff the daddr in front of the packet at hardheader time, even if we > have to copy? OK, sounds interesting. Unfortunately the scope of this work has gotten to the point where I can't take it on right now. My system is running 4.4.y for now (before struct skb_gso_cb grew) so I think shrinking struct skb_gso_cb to 8 bytes plus changing SKB_SGO_CB_OFFSET to 20 will work for now. Hope someone is able to come up with a real fix before I need to upgrade to 4.10.y... - R.
Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
On Thu, Jul 7, 2016 at 4:14 PM, Jason Gunthorpe wrote: > We have neighbour_priv, and ndo_neigh_construct/destruct now .. > > A first blush that would seem to be enough to let ipoib store the AH > and other path information in the neigh and avoid the cb? At least the > example in clip sure looks like what ipoib needs to do. Do you think those new facilities let us go back to using the neigh and still avoid the issues that led to commit b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path")? - R.
Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
>> struct skb_gso_cb { >> int mac_offset; >> int encap_level; >> __u16 csum_start; >> }; > This is based on an out-dated version of this struct. The 4.7 RC > kernel has a few more fields that were added to support local checksum > offload for encapsulated frames. Thanks for pointing that out. I hit the perf regression on 4.4.y (stable) and looked at the struct there. I see that latest upstream has changed, and I agree that this struct really can't shrink below 10 bytes. Since IP needs 20 bytes, GSO needs 10 bytes and IPoIB needs 20 bytes, we're 2 bytes over the 48 that are available in cb[]. So this is harder to fix than just changing skb_gso_cb and SKB_SGO_CB_OFFSET unfortunately. >> What is the best way to keep the crash fix but not kill IPoIB performance? > > It seems like what would probably need to happen is to move where the > IPoIB address is stored. I'm not sure the control buffer is really > the best place for it since the cb gets overwritten at various levels, > and storing 20 bytes makes it hard to avoid bumping up against the > size restrictions of the buffer. Seeing as how the IPoIB hwaddr is > generated around the same time we generate the L2 header for the > frame, I wonder if you couldn't get away with using a bit of extra skb > headroom to store it and then use a offset from the MAC header to > access it. An added bonus would be that with a few tricks with > SKB_GSO_CB(skb)->mac_offset you might even be able to set things up so > that you copy the hwaddr when you copy the header for each fragment > instead of having to go and copy the hwaddr out of the cb and clone it > for each frame. Can we assume there are 20 bytes of skb headroom available? What if we're forwarding an skb received on an Ethernet device? The reason we moved to the cb storage is that in the past, trying to hide some data in the actual skb buffer that we don't actually send led to some awkward-at-best code. (As I recall GRO was difficult to handle before commit 936d7de3d736 "IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses") But maybe there's a third way to handle this other than the old way and the skb->cb way. - R.
Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
On Thu, Jan 7, 2016 at 3:00 AM, Konstantin Khlebnikov wrote: > Or just shift GSO CB and add couple checks like > BUILD_BUG_ON(sizeof(SKB_GSO_CB(skb)->room) < sizeof(*IPCB(skb))); Resurrecting this old thread, because the patch that ultimately went upstream (commit 9207f9d45b0a / net: preserve IP control block during GSO segmentation) causes a huge IPoIB performance regression (to the point of being unusable): https://bugzilla.kernel.org/show_bug.cgi?id=111921 I don't think anyone has explained what goes wrong or why IPoIB works the way it does. The underlying difference that IPoIB has from other drivers is that there are two levels of address resolution. First, normal ARP / ND resolves an IP address to a "hardware" address. The difference is that in IPoIB, the hardware address is an IB GID (plus a QPN, but we can ignore that). To actually send data to that GID, the IPoIB driver has to do a second lookup - it needs to ask the IB subnet manager for a path record that tells it how to reach that GID. In particular this means that "destination address" (as the IP / ARP layer understands it) actually isn't in the packet anywhere - there's nothing like an ethernet header as there is for "normal" network drivers. Instead, the driver stashes the address in skb->cb during hard_header_ops->create() and then looks at it in the xmit routine - this was designed way back around when commit a0417fa3a18a / net: Make qdisc_skb_cb upper size bound explicit. was merged. The expectation was that the part of the cb after sizeof (struct qdisc_skb_cb) would be preserved. The problem with commit 9207f9d45b0a is that GSO operations now access cb after SKB_SGO_CB_OFFSET==32, which lands right in the middle of where IPoIB stashes its hwaddr. It seems that the intent of the commit is to preserve the IP control block - struct inet_skb_parm (and presumably struct inet6_skb_parm) - even when using SKB_GSO_CB(). Seems like both inet_skb_parm and inet6_skb_parm are 20 bytes. IPoIB uses the part of cb after 28 bytes, so if we could squeeze struct skb_gso_cb down to 8 bytes and set SKB_SGO_CB_OFFSET to 20, then everything would work. The struct is struct skb_gso_cb { int mac_offset; int encap_level; __u16 csum_start; }; is it feasible to make encap_level a __u16 (which would make the overall struct exactly 8 bytes)? If I understand this correctly, 64K nested encapsulations seems like quite a bit for a packet... Or, earlier in this thread, having the GSO in ip_output and other gso paths save and restore the IP/IP6 control block was suggested as an alternate approach. I don't know if there are performance implications to that. What is the best way to keep the crash fix but not kill IPoIB performance? Thanks! - R.
[PATCH 3.19 and earlier] fib_rules: Fix dump_rules() not to exit early
From: Roland Dreier Backports of 41fc014332d9 ("fib_rules: fix fib rule dumps across multiple skbs") introduced a regression in "ip rule show" - it ends up dumping the first rule over and over and never exiting, because 3.19 and earlier are missing commit 053c095a82cf ("netlink: make nlmsg_end() and genlmsg_end() void"), so fib_nl_fill_rule() ends up returning skb->len (i.e. > 0) in the success case. Fix this by checking the return code for < 0 instead of != 0. Signed-off-by: Roland Dreier --- Hi, this is needed for all stable trees earlier than 4.0 that have picked up 41fc014332d9; so far looks like at least 3.10.y and 3.14.y have made such releases. net/core/fib_rules.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index 627e517077e4..84340a2605ed 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -606,7 +606,7 @@ static int dump_rules(struct sk_buff *skb, struct netlink_callback *cb, err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, RTM_NEWRULE, NLM_F_MULTI, ops); - if (err) + if (err < 0) break; skip: idx++; -- 2.5.0 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net] fib_rules: fix fib rule dumps across multiple skbs
On Tue, Sep 22, 2015 at 9:40 PM, Roopa Prabhu wrote: > + err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid, > + cb->nlh->nlmsg_seq, RTM_NEWRULE, > + NLM_F_MULTI, ops); > + if (err) FWIW I believe this breaks pre-4.0 stable kernels (unfortunately it just showed up in 3.10.90). In kernels without 053c095a82cf ("netlink: make nlmsg_end() and genlmsg_end() void") then fib_nl_fill_rule() returns a positive value (skb->len) on success, so we break out of the loop here immediately. Symptom is "ip rule show" loops forever printing the first rule. After I finish testing a fix (as trivial as changing to "if (err < 0)") here, I'll send it to -stable guys. - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Shift calculation wrong for single sge entries.
Thanks, applied, although I assume based on the Signed-off-by line that you left out a From: Bryan Rosenburg <[EMAIL PROTECTED]> at the top (to get the authorship in git correctly). > RDMA/cxgb3: Shift calculation wrong for single sge entries. BTW, there's no need to duplicate the subject line in the message body, but if you are going to do it, please put a "Subject:" before it. Otherwise I just have to edit it out by hand to avoid git putting the subject in twice. - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH RESEND] libertas: Remove unused exports
Any chance of getting this applied? It seems the build is still broken on ia64 at least due to the export of static functions. --- The libertas driver exports a number of symbols with no in-tree users; remove these unused exports. lbs_reset_device() is completely unused, with no callers at all, so remove the function completely. A couple of these unused exported symbols are static, which causes the following build error on ia64 with gcc 4.2.3: drivers/net/wireless/libertas/main.c:1375: error: __ksymtab_lbs_remove_mesh causes a section type conflict drivers/net/wireless/libertas/main.c:1354: error: __ksymtab_lbs_add_mesh causes a section type conflict This patch fixes this problem. I don't have hardware, so this is not run-tested, but I tested the build with CONFIG_LIBERTAS=y CONFIG_LIBERTAS_USB=m CONFIG_LIBERTAS_CS=m CONFIG_LIBERTAS_SDIO=m and there were no problems with undefined symbols. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- diff --git a/drivers/net/wireless/libertas/cmd.c b/drivers/net/wireless/libertas/cmd.c index eab0203..b3c1acb 100644 --- a/drivers/net/wireless/libertas/cmd.c +++ b/drivers/net/wireless/libertas/cmd.c @@ -1040,7 +1040,6 @@ int lbs_mesh_access(struct lbs_private *priv, uint16_t cmd_action, lbs_deb_leave(LBS_DEB_CMD); return ret; } -EXPORT_SYMBOL_GPL(lbs_mesh_access); int lbs_mesh_config(struct lbs_private *priv, uint16_t enable, uint16_t chan) { @@ -1576,7 +1575,6 @@ done: lbs_deb_leave_args(LBS_DEB_HOST, "ret %d", ret); return ret; } -EXPORT_SYMBOL_GPL(lbs_prepare_and_send_command); /** * @brief This function allocates the command buffer and link diff --git a/drivers/net/wireless/libertas/decl.h b/drivers/net/wireless/libertas/decl.h index aaacd9b..4e22341 100644 --- a/drivers/net/wireless/libertas/decl.h +++ b/drivers/net/wireless/libertas/decl.h @@ -69,7 +69,6 @@ struct lbs_private *lbs_add_card(void *card, struct device *dmdev); int lbs_remove_card(struct lbs_private *priv); int lbs_start_card(struct lbs_private *priv); int lbs_stop_card(struct lbs_private *priv); -int lbs_reset_device(struct lbs_private *priv); void lbs_host_to_card_done(struct lbs_private *priv); int lbs_update_channel(struct lbs_private *priv); diff --git a/drivers/net/wireless/libertas/main.c b/drivers/net/wireless/libertas/main.c index 84fb49c..1eaf6af 100644 --- a/drivers/net/wireless/libertas/main.c +++ b/drivers/net/wireless/libertas/main.c @@ -1351,8 +1350,6 @@ done: lbs_deb_leave_args(LBS_DEB_MESH, "ret %d", ret); return ret; } -EXPORT_SYMBOL_GPL(lbs_add_mesh); - static void lbs_remove_mesh(struct lbs_private *priv) { @@ -1372,7 +1369,6 @@ static void lbs_remove_mesh(struct lbs_private *priv) free_netdev(mesh_dev); lbs_deb_leave(LBS_DEB_MESH); } -EXPORT_SYMBOL_GPL(lbs_remove_mesh); /** * @brief This function finds the CFP in @@ -1458,20 +1454,6 @@ void lbs_interrupt(struct lbs_private *priv) } EXPORT_SYMBOL_GPL(lbs_interrupt); -int lbs_reset_device(struct lbs_private *priv) -{ - int ret; - - lbs_deb_enter(LBS_DEB_MAIN); - ret = lbs_prepare_and_send_command(priv, CMD_802_11_RESET, - CMD_ACT_HALT, 0, 0, NULL); - msleep_interruptible(10); - - lbs_deb_leave_args(LBS_DEB_MAIN, "ret %d", ret); - return ret; -} -EXPORT_SYMBOL_GPL(lbs_reset_device); - static int __init lbs_init_module(void) { lbs_deb_enter(LBS_DEB_MAIN); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fail loopbackconnections.
Steve, I had to update the patch adding an include and fixing the function declaration (as below)... but how much testing have you done with this?? commit 8704e9a8790cc9e394198663c1c9150c899fb9a2 Author: Steve Wise <[EMAIL PROTECTED]> Date: Tue Feb 12 16:09:29 2008 -0600 RDMA/cxgb3: Fail loopback connections The cxgb3 HW and driver don't support loopback RDMA connections. So fail any connection attempt where the destination address is local. Signed-off-by: Steve Wise <[EMAIL PROTECTED]> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c b/drivers/infiniband/hw/cxgb3/iwch_cm.c index e9a08fa..320f2b6 100644 --- a/drivers/infiniband/hw/cxgb3/iwch_cm.c +++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include @@ -1784,6 +1785,17 @@ err: return err; } +static int is_loopback_dst(struct iw_cm_id *cm_id) +{ + struct net_device *dev; + + dev = ip_dev_find(&init_net, cm_id->remote_addr.sin_addr.s_addr); + if (!dev) + return 0; + dev_put(dev); + return 1; +} + int iwch_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param) { int err = 0; @@ -1791,6 +1803,11 @@ int iwch_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param) struct iwch_ep *ep; struct rtable *rt; + if (is_loopback_dst(cm_id)) { + err = -ENOSYS; + goto out; + } + ep = alloc_ep(sizeof(*ep), GFP_KERNEL); if (!ep) { printk(KERN_ERR MOD "%s - cannot alloc ep.\n", __FUNCTION__); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fail loopbackconnections.
> how can a static void function return 0? good question... I've fixed the patch in my tree. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH 2.6.25] RDMA/cxgb3: Fail loopback connections.
applied, although: > +static void is_loopback_dst(struct iw_cm_id *cm_id) > +{ > +struct net_device *dev; > + > +dev = ip_dev_find(&init_net, cm_id->remote_addr.sin_addr.s_addr); > +if (!dev) > +return 0; > +dev_put(dev); > +return 1; > +} is there any way this could trigger when it should, like if I'm trying to make a connection from one local device to a different local device (which should work fine)? - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] libertas: Remove unused exports
The libertas driver exports a number of symbols with no in-tree users; remove these unused exports. lbs_reset_device() is completely unused, with no callers at all, so remove the function completely. A couple of these unused exported symbols are static, which causes the following build error on ia64 with gcc 4.2.3: drivers/net/wireless/libertas/main.c:1375: error: __ksymtab_lbs_remove_mesh causes a section type conflict drivers/net/wireless/libertas/main.c:1354: error: __ksymtab_lbs_add_mesh causes a section type conflict This patch fixes this problem. I don't have hardware, so this is not run-tested, but I tested the build with CONFIG_LIBERTAS=y CONFIG_LIBERTAS_USB=m CONFIG_LIBERTAS_CS=m CONFIG_LIBERTAS_SDIO=m and there were no problems with undefined symbols. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- Here's the patch that removes the unused exports (and a few more that I found). diff --git a/drivers/net/wireless/libertas/cmd.c b/drivers/net/wireless/libertas/cmd.c index eab0203..b3c1acb 100644 --- a/drivers/net/wireless/libertas/cmd.c +++ b/drivers/net/wireless/libertas/cmd.c @@ -1040,7 +1040,6 @@ int lbs_mesh_access(struct lbs_private *priv, uint16_t cmd_action, lbs_deb_leave(LBS_DEB_CMD); return ret; } -EXPORT_SYMBOL_GPL(lbs_mesh_access); int lbs_mesh_config(struct lbs_private *priv, uint16_t enable, uint16_t chan) { @@ -1576,7 +1575,6 @@ done: lbs_deb_leave_args(LBS_DEB_HOST, "ret %d", ret); return ret; } -EXPORT_SYMBOL_GPL(lbs_prepare_and_send_command); /** * @brief This function allocates the command buffer and link diff --git a/drivers/net/wireless/libertas/decl.h b/drivers/net/wireless/libertas/decl.h index aaacd9b..4e22341 100644 --- a/drivers/net/wireless/libertas/decl.h +++ b/drivers/net/wireless/libertas/decl.h @@ -69,7 +69,6 @@ struct lbs_private *lbs_add_card(void *card, struct device *dmdev); int lbs_remove_card(struct lbs_private *priv); int lbs_start_card(struct lbs_private *priv); int lbs_stop_card(struct lbs_private *priv); -int lbs_reset_device(struct lbs_private *priv); void lbs_host_to_card_done(struct lbs_private *priv); int lbs_update_channel(struct lbs_private *priv); diff --git a/drivers/net/wireless/libertas/main.c b/drivers/net/wireless/libertas/main.c index 84fb49c..1eaf6af 100644 --- a/drivers/net/wireless/libertas/main.c +++ b/drivers/net/wireless/libertas/main.c @@ -1351,8 +1350,6 @@ done: lbs_deb_leave_args(LBS_DEB_MESH, "ret %d", ret); return ret; } -EXPORT_SYMBOL_GPL(lbs_add_mesh); - static void lbs_remove_mesh(struct lbs_private *priv) { @@ -1372,7 +1369,6 @@ static void lbs_remove_mesh(struct lbs_private *priv) free_netdev(mesh_dev); lbs_deb_leave(LBS_DEB_MESH); } -EXPORT_SYMBOL_GPL(lbs_remove_mesh); /** * @brief This function finds the CFP in @@ -1458,20 +1454,6 @@ void lbs_interrupt(struct lbs_private *priv) } EXPORT_SYMBOL_GPL(lbs_interrupt); -int lbs_reset_device(struct lbs_private *priv) -{ - int ret; - - lbs_deb_enter(LBS_DEB_MAIN); - ret = lbs_prepare_and_send_command(priv, CMD_802_11_RESET, - CMD_ACT_HALT, 0, 0, NULL); - msleep_interruptible(10); - - lbs_deb_leave_args(LBS_DEB_MAIN, "ret %d", ret); - return ret; -} -EXPORT_SYMBOL_GPL(lbs_reset_device); - static int __init lbs_init_module(void) { lbs_deb_enter(LBS_DEB_MAIN); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mlx4: needs vmalloc.h for vmap()
thanks, just merged the same patch from Olof Johansson. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] libertas: Don't mark exported symbols as static
> Why not pull the exports? they aren't used anywhere in the existing kernel. I'm guessing there's some not-(yet-)merged mesh networking stuff that uses the symbols, but it doesn't matter much to me... - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] libertas: Don't mark exported symbols as static
Marking exported symbols as static causes the following build error on ia64 with gcc 4.2.3: drivers/net/wireless/libertas/main.c:1375: error: __ksymtab_lbs_remove_mesh causes a section type conflict drivers/net/wireless/libertas/main.c:1354: error: __ksymtab_lbs_add_mesh causes a section type conflict Therefore, remove the static marking on lbs_remove_mesh and lbs_add_mesh. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- diff --git a/drivers/net/wireless/libertas/main.c b/drivers/net/wireless/libertas/main.c index 84fb49c..a688ce8 100644 --- a/drivers/net/wireless/libertas/main.c +++ b/drivers/net/wireless/libertas/main.c @@ -253,8 +253,8 @@ static ssize_t lbs_anycast_set(struct device *dev, static int lbs_add_rtap(struct lbs_private *priv); static void lbs_remove_rtap(struct lbs_private *priv); -static int lbs_add_mesh(struct lbs_private *priv); -static void lbs_remove_mesh(struct lbs_private *priv); +int lbs_add_mesh(struct lbs_private *priv); +void lbs_remove_mesh(struct lbs_private *priv); /** @@ -1296,7 +1296,7 @@ EXPORT_SYMBOL_GPL(lbs_stop_card); * @param privA pointer to the struct lbs_private structure * @return 0 if successful, -X otherwise */ -static int lbs_add_mesh(struct lbs_private *priv) +int lbs_add_mesh(struct lbs_private *priv) { struct net_device *mesh_dev = NULL; int ret = 0; @@ -1354,7 +1354,7 @@ done: EXPORT_SYMBOL_GPL(lbs_add_mesh); -static void lbs_remove_mesh(struct lbs_private *priv) +void lbs_remove_mesh(struct lbs_private *priv) { struct net_device *mesh_dev; -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [1/2] Skip empty hash buckets faster in /proc/net/tcp
> On a 2GB Core2 system here I see a time cat /proc/net/tcp > /dev/null > constently dropping from 0.44s to 0.4-0.8s system time with this change. Seems like there must be a typo in either the before or after times you report here? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] cxgb3: Remove incorrect __devinit annotations
When PCI error recovery was added to cxgb3, a function t3_io_slot_reset() was added. This function can call back into t3_prep_adapter() at any time, so t3_prep_adapter() can no longer be marked __devinit. This patch removes the __devinit annotation from t3_prep_adapter() and all the functions that it calls, which fixes WARNING: drivers/net/cxgb3/built-in.o(.text+0x2427): Section mismatch in reference from the function t3_io_slot_reset() to the function .devinit.text:t3_prep_adapter() Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/net/cxgb3/mc5.c |2 +- drivers/net/cxgb3/sge.c |2 +- drivers/net/cxgb3/t3_hw.c | 22 ++ 3 files changed, 12 insertions(+), 14 deletions(-) diff --git a/drivers/net/cxgb3/mc5.c b/drivers/net/cxgb3/mc5.c index 84c1ffa..4c4d6e8 100644 --- a/drivers/net/cxgb3/mc5.c +++ b/drivers/net/cxgb3/mc5.c @@ -452,7 +452,7 @@ void t3_mc5_intr_handler(struct mc5 *mc5) t3_write_reg(adap, A_MC5_DB_INT_CAUSE, cause); } -void __devinit t3_mc5_prep(struct adapter *adapter, struct mc5 *mc5, int mode) +void t3_mc5_prep(struct adapter *adapter, struct mc5 *mc5, int mode) { #define K * 1024 diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c index cb684d3..9ca8c66 100644 --- a/drivers/net/cxgb3/sge.c +++ b/drivers/net/cxgb3/sge.c @@ -2836,7 +2836,7 @@ void t3_sge_init(struct adapter *adap, struct sge_params *p) * defaults for the assorted SGE parameters, which admins can change until * they are used to initialize the SGE. */ -void __devinit t3_sge_prep(struct adapter *adap, struct sge_params *p) +void t3_sge_prep(struct adapter *adap, struct sge_params *p) { int i; diff --git a/drivers/net/cxgb3/t3_hw.c b/drivers/net/cxgb3/t3_hw.c index 7469935..a99496a 100644 --- a/drivers/net/cxgb3/t3_hw.c +++ b/drivers/net/cxgb3/t3_hw.c @@ -2675,7 +2675,7 @@ void t3_tp_set_max_rxsize(struct adapter *adap, unsigned int size) V_PMMAXXFERLEN0(size) | V_PMMAXXFERLEN1(size)); } -static void __devinit init_mtus(unsigned short mtus[]) +static void init_mtus(unsigned short mtus[]) { /* * See draft-mathis-plpmtud-00.txt for the values. The min is 88 so @@ -2703,7 +2703,7 @@ static void __devinit init_mtus(unsigned short mtus[]) /* * Initial congestion control parameters. */ -static void __devinit init_cong_ctrl(unsigned short *a, unsigned short *b) +static void init_cong_ctrl(unsigned short *a, unsigned short *b) { a[0] = a[1] = a[2] = a[3] = a[4] = a[5] = a[6] = a[7] = a[8] = 1; a[9] = 2; @@ -3354,8 +3354,7 @@ out_err: * Determines a card's PCI mode and associated parameters, such as speed * and width. */ -static void __devinit get_pci_mode(struct adapter *adapter, - struct pci_params *p) +static void get_pci_mode(struct adapter *adapter, struct pci_params *p) { static unsigned short speed_map[] = { 33, 66, 100, 133 }; u32 pci_mode, pcie_cap; @@ -3395,8 +3394,7 @@ static void __devinit get_pci_mode(struct adapter *adapter, * capabilities and default speed/duplex/flow-control/autonegotiation * settings. */ -static void __devinit init_link_config(struct link_config *lc, - unsigned int caps) +static void init_link_config(struct link_config *lc, unsigned int caps) { lc->supported = caps; lc->requested_speed = lc->speed = SPEED_INVALID; @@ -3419,7 +3417,7 @@ static void __devinit init_link_config(struct link_config *lc, * Calculates the size of an MC7 memory in bytes from the value of its * configuration register. */ -static unsigned int __devinit mc7_calc_size(u32 cfg) +static unsigned int mc7_calc_size(u32 cfg) { unsigned int width = G_WIDTH(cfg); unsigned int banks = !!(cfg & F_BKS) + 1; @@ -3430,8 +3428,8 @@ static unsigned int __devinit mc7_calc_size(u32 cfg) return MBs << 20; } -static void __devinit mc7_prep(struct adapter *adapter, struct mc7 *mc7, - unsigned int base_addr, const char *name) +static void mc7_prep(struct adapter *adapter, struct mc7 *mc7, +unsigned int base_addr, const char *name) { u32 cfg; @@ -3517,7 +3515,7 @@ static int t3_reset_adapter(struct adapter *adapter) return 0; } -static int __devinit init_parity(struct adapter *adap) +static int init_parity(struct adapter *adap) { int i, err, addr; @@ -3552,8 +3550,8 @@ static int __devinit init_parity(struct adapter *adap) * for some adapter tunables, take PHYs out of reset, and initialize the MDIO * interface. */ -int __devinit t3_prep_adapter(struct adapter *adapter, - const struct adapter_info *ai, int reset) +int t3_prep_adapter(struct adapter *adapter, const struct adapter_info *ai, + int reset) { int ret; unsigned int i,
Re: [PATCH 2.6.25] RDMA/cxgb3: Fix the T3A workaround checks.
thanks, applied. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RESEND 3/3] RDMA/cxgb3: Mark qp as privileged based on user capabilities.
thanks, applied 1-3 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH] IPoIB: improve IPv4/IPv6 to IB mcast mapping functions
Any objection to merging the following for 2.6.25? [Rolf -- I think it makes more sense to delete the overwriting of the P_Key in ipoib_multicast.c in this patch rather than later in the series; do you agree?] Thanks, Roland From: Rolf Manderscheid <[EMAIL PROTECTED]> An IPoIB subnet on an IB fabric that spans multiple IB subnets can't use link-local scope in multicast GIDs. The existing routines that map IP/IPv6 multicast addresses into IB link-level addresses hard-code the scope to link-local, and they also leave the partition key field uninitialised. This patch adds a parameter (the link-level broadcast address) to the mapping routines, allowing them to initialise both the scope and the P_Key appropriately, and fixes up the call sites. The next step will be to add a way to configure the scope for an IPoIB interface. Signed-off-by: Rolf Manderscheid <[EMAIL PROTECTED]> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/infiniband/core/cma.c |4 +--- drivers/infiniband/ulp/ipoib/ipoib_multicast.c |4 include/net/if_inet6.h | 11 +++ include/net/ip.h | 10 ++ net/ipv4/arp.c |2 +- net/ipv6/ndisc.c |2 +- 6 files changed, 16 insertions(+), 17 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 312ec74..982836e 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2610,11 +2610,9 @@ static void cma_set_mgid(struct rdma_id_private *id_priv, /* IPv6 address is an SA assigned MGID. */ memcpy(mgid, &sin6->sin6_addr, sizeof *mgid); } else { - ip_ib_mc_map(sin->sin_addr.s_addr, mc_map); + ip_ib_mc_map(sin->sin_addr.s_addr, dev_addr->broadcast, mc_map); if (id_priv->id.ps == RDMA_PS_UDP) mc_map[7] = 0x01; /* Use RDMA CM signature */ - mc_map[8] = ib_addr_get_pkey(dev_addr) >> 8; - mc_map[9] = (unsigned char) ib_addr_get_pkey(dev_addr); *mgid = *(union ib_gid *) (mc_map + 4); } } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index 858ada1..2628339 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -788,10 +788,6 @@ void ipoib_mcast_restart_task(struct work_struct *work) memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid); - /* Add in the P_Key */ - mgid.raw[4] = (priv->pkey >> 8) & 0xff; - mgid.raw[5] = priv->pkey & 0xff; - mcast = __ipoib_mcast_find(dev, &mgid); if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { struct ipoib_mcast *nmcast; diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index 448eccb..b24508a 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -269,18 +269,21 @@ static inline void ipv6_arcnet_mc_map(const struct in6_addr *addr, char *buf) buf[0] = 0x00; } -static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf) +static inline void ipv6_ib_mc_map(const struct in6_addr *addr, + const unsigned char *broadcast, char *buf) { + unsigned char scope = broadcast[5] & 0xF; + buf[0] = 0;/* Reserved */ buf[1] = 0xff; /* Multicast QPN */ buf[2] = 0xff; buf[3] = 0xff; buf[4] = 0xff; - buf[5] = 0x12; /* link local scope */ + buf[5] = 0x10 | scope; /* scope from broadcast address */ buf[6] = 0x60; /* IPv6 signature */ buf[7] = 0x1b; - buf[8] = 0;/* P_Key */ - buf[9] = 0; + buf[8] = broadcast[8]; /* P_Key */ + buf[9] = broadcast[9]; memcpy(buf + 10, addr->s6_addr + 6, 10); } #endif diff --git a/include/net/ip.h b/include/net/ip.h index 840dd91..50c8889 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -266,20 +266,22 @@ static inline void ip_eth_mc_map(__be32 naddr, char *buf) * Leave P_Key as 0 to be filled in by driver. */ -static inline void ip_ib_mc_map(__be32 naddr, char *buf) +static inline void ip_ib_mc_map(__be32 naddr, const unsigned char *broadcast, char *buf) { __u32 addr; + unsigned char scope = broadcast[5] & 0xF; + buf[0] = 0;/* Reserved */ buf[1] = 0xff; /* Multicast QPN */ buf[2] = 0xff; buf[3] = 0xff; addr= ntohl(naddr); buf[4] = 0xff; - buf[5] = 0x12; /* link local scope */ + buf[5] = 0x10 | scope; /* scope from broadcast address */
Bogus commit 70eba18b ("make bnx2x select ZLIB_INFLATE")?
Commit 70eba18b ("make bnx2x select ZLIB_INFLATE") in Linus's tree seems bogus. As far as I can tell, bnx2x is not upstream yet, and the commit in question actually adds "select ZLIB_INFLATE" to the TEHUTI config, since there is no BNX2X config option (and also I don't see any reference to zlib in any tehuti files). I assume this is some sort of merge problem that got added to the wrong tree and should be reverted upstream. - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH]: fix lro_gen_skb() alignment
> >> - skb = netdev_alloc_skb(lro_mgr->dev, hlen); > >> + skb = netdev_alloc_skb(lro_mgr->dev, hlen + NET_IP_ALIGN); > > NET_IP_ALIGN should only be used if you're DMAing into the skb head. > > Otherwise you should say 2. It would be nice to have another macro > > for that I suppose. > > It is certainly simple enough to say 2. Thank you for pointing > this out. I have attached a patch to do that.. > > Signed off by: Andrew Gallatin <[EMAIL PROTECTED]> Isn't the value of 2 ethernet-specific (to round the 14-byte header up to 16)? Given that the rest of the lro code is fairly careful to calculate mac_hdr_len etc it seems as if it would be cleaner to make this independent of the specific L2 being used. (And I plan on using the LRO module for IP-over-InfiniBand so this is not completely theoretical) - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2.6.25 2/2] RDMA/cxgb3: Support 5.0 firmware.
OK, applied 1 and 2... > Note: this change requires 5.0 firmware. I assume the change to the cxgb3 FW versions is pending in a net driver change for 2.6.25? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> Agreed. On first glance, I was intrigued but: > > 1) Why is everyone so concerned that export symbol space is large? > - does it cost cpu or running memory? > - does it cause bugs? > - or are you just worried about "evil modules"? > > 2) These aren't real namespaces > - all global names still have to be unique > - still have to handle the "non-modular build" namespace conflicts > - there isn't a big problem with conflicting symbols today. Perhaps changing the name from "namespace" to "interface" would help? Then a module could have something like MODULE_USE_INTERFACE(foo); and I think that makes it clearer what the advantage of this is: it marks symbols as being part of a certain interface, requires modules that use that interface to declare that use explicitly, and allows reviewers to say "Hey why is this code using the scsi interface when it's a webcam driver?" - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> Except C doesn't have namespaces and this mechanism doesn't create them. So > this is just complete and utter makework; as I said before, noone's going to > confuse all those udp_* functions if they're not in the udp namespace. I don't understand why you're so opposed to organizing the kernel's exported symbols in a more self-documenting way. It seems pretty clear to me that having a mechanism that requires modules to make explicit which (semi-)internal APIs makes reviewing easier, makes it easier to communicate "please don't use that API" to module authors, and takes at least a small step towards bringing the kernel's exported API under control. What's the real downside? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> > I agree that we shouldn't make things too hard for out-of-tree > > modules, but I disagree with your first statement: there clearly is a > > large class of symbols that are used by multiple modules but which are > > not generically useful -- they are only useful by a certain small class > > of modules. > > If it is so clear, you should be able to easily provide examples? Sure -- Andi's example of symbols required only by TCP congestion modules; the SCSI internals that Christoph wants to mark; the symbols exported by my mlx4_core driver (which I admit are currently only used by the mlx4_ib driver, but which will also be used by at least the ethernet NIC driver for the same hardware). I thought this was already covered repeatedly in the thread and indeed in Andi's code so there was no need to repeat it... - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> Yes, and if a symbol is already used by multiple modules, it's generically > useful. And if so, why restrict it to in-tree modules? I agree that we shouldn't make things too hard for out-of-tree modules, but I disagree with your first statement: there clearly is a large class of symbols that are used by multiple modules but which are not generically useful -- they are only useful by a certain small class of modules. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
> This patch allows to export symbols only for specific modules by > introducing symbol name spaces. A module name space has a white > list of modules that are allowed to import symbols for it; all others > can't use the symbols. > > It adds two new macros: > > MODULE_NAMESPACE_ALLOW(namespace, module); I definitely like the idea of organizing exported symbols into namespaces. However, I feel like it would make more sense to have something like MODULE_NAMESPACE_IMPORT(namespace); inside the modules that use a namespace. This matches the way C preprocessor #includes, C++ namespaces, perl "use", etc. works and so it is probably closer to how programmers think. It does weaken the enforcement of review a little bit, since there are no changes to the site where things are exported to catch, but git makes it easy to sneak that kind of change in anyway. The practical benefits are that you don't end up with stupid patch conflicts caused by one patch adding MODULE_NAMESPACE_ALLOW() for module "foo" and another patch adding it for module "bar" at the same place, and that it becomes simpler for people to test or deliver, say, a new TCP congestion module without having to rebuild the whole kernel (which is an especially huge pain for distro kernels). In any case I would make use of this in whatever form it gets merged in -- Mellanox ConnectX hardware can operate simultaneously as an InfiniBand adapter and a 10Gb ethernet NIC, and so my driver is split up into a low-level mlx4_core driver that exports symbols for other mlx4 drivers to use; clearly it only makes sense to export them to other parts of the driver, and in this case the difference between "ALLOW" and "IMPORT" semantics is not a big deal. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2.6.24] RDMA/cxgb3: Set the max_qp_init_rd_atom attribute.
thanks, applied. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 11/14 v2] nes: OpenFabrics kernel verbs
> +/** > + * nes_post_send > + */ > +static int nes_post_send(struct ib_qp *ibqp, struct ib_send_wr *ib_wr, > +struct ib_send_wr **bad_wr) > ... > +switch (ib_wr->opcode) { > ... > +if (ib_wr->num_sge > > nesdev->nesadapter->max_sge) { > +err = -EINVAL; > +break; > +} > ... > +default: > +/* error */ > +err = -EINVAL; > +break; looks like if you detect an error while posting a work request, you break out of the switch statement but just continue through the while loop going through the list of work reuqests. Which doesn't seem like it will work very well. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/14 v2] nes: eeprom and phy routines
> +/* TODO: deal with EEPROM endian issues */ This is pretty scary. Is the driver broken on big-endian systems now? > +/* > +"Everything you wanted to know about CRC algorithms, but were afraid to ask > + for fear that errors in your understanding might be detected." Version : > 3. etc etc... can all this be replaced with what's in lib/crc32.c? (I hope so) - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/14 v2] nes: module and device initialization
> Thanks Roland. Let me know when you have your branch ready. OK, I pushed out a "neteffect" branch at git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git This has the driver from your git tree plus some compile fixes and cleanups (added as separate commits, so you can see what I did). If it suits you, let's work against that tree to continue cleaning things up -- you can send me patches or git pull requests to pick up new things. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/14 v2] nes: module and device initialization
OK, a couple quick review comments and a process comment too: - First step in the driver is to kill off a lot of the #ifdefs: > +#ifdef IRQF_SHARED The upstream driver really shouldn't have compatibility gunk for older kernels... just make it build against the kernel it's in. > +#ifdef OFED_1_2 Same... kernel code shouldn't worry about OFED. > +#ifdef CONFIG_PCI_MSI > +if (nesdev->msi_enabled) { > +pci_disable_msi(pcidev); > +} > +#endif This can be much simpler, because pci_disable_msi() is always available and is a NOP if the config option is off or MSI is not enabled. So you can just unconditionally do pci_disable_msi(pcidev); > +#ifdef NES_NAPI I don't see anything that defines NES_NAPI. I think for the final merge we want a NAPI-only driver (ie no ifdef at all)... is there any performance or other reason to ever build a non-NAPI driver (for a modern kernel)? OK, on a process level, my plan is to pull the current driver into a "neteffect" branch in my git tree with the intention of merging it for 2.6.25. I'll let you know when that's ready (probably early next week). I'll probably do some cleanups there, and you can send me cleanup/fix patches against that branch any time too. We should try to keep the cycle time short: the interval between the first posting of this driver and the current one was pretty long, and there's a lot of cleanup to do to get ready for the next merge window. Does that plan make sense? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 13/14 v2] nes: kernel build infrastructure
> +config INFINIBAND_NES_DEBUG > +bool "Verbose debugging output" > +depends on INFINIBAND_NES > +default n > +---help--- > + This option causes the NetEffect RNIC driver to produce debug > + messages. Select this if you are developing the driver > + or trying to diagnose a problem. If you make this default n then no distro will have it enabled and you'll have to rebuild to debug anything. Better to have the default be enabled and make it controllable at runtime too with a module parameter. (you can look at what mthca does for an example of what I mean) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH 14/14 v2] nes: kernel build infrastructure
> + > +EXTRA_CFLAGS += -DNES_MINICM I don't see anyplace NES_MINICM is used. Delete this line? > + > +obj-$(CONFIG_INFINIBAND_NES) += iw_nes.o > + > +iw_nes-objs := nes.o nes_hw.o nes_nic.o nes_utils.o nes_verbs.o nes_cm.o > + Also the file has an extra blank line at the beginning and end. Might as well kill them. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/14 v2] nes: device structures and defines
> You are starting off on the wrong foot. ??? > > +if(!(expr)) { > > \ > > + printk(KERN_ERR PFX "Assertion failed! %s, %s, %s, line %d\n", \ > > + #expr, __FILE__, __FUNCTION__, __LINE__); > > \ > > +} > > Use BUG_ON I agree that there's no need to invent a driver-private assertion macro, but (to first order at least) drivers should never use BUG_ON. I don't want some glitch in a network driver that the system could probably survive to be turned into a panic by BUG_ON -- WARN_ON seems infinitely preferable. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/14 v2] nes: module and device initialization
Thanks... I am kind of overloaded trying to handle the last few things for the 2.6.24 merge window, but I will look at this next week, and I expect we should be able to merge the driver for 2.6.25 unless there are unexpected hangups. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/8] [MV643XX_ETH] Remove SHARED_REGS register address bias
> +static void __iomem *mv643xx_eth_base; > +return readl(((void __iomem *)mv643xx_eth_base) + offset); Given the declaration of mv643xx_eth_base as void __iomem * already, I don't understand why you need the cast to the same type here (and elsewhere in the driver). - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bonding support for eth1394?
> The bonding sources have a few occurrences of EOPNOTSUPP. Unless I > missed something, they are all related to setting the hardware address > of the interface. AFAICS this is impossible with IP over FireWire. If > it is crucial to bonding to be able to change the slaves' hardware > addresses, then you are out of luck. There are a few changes to the bonding driver pending that will add support for bonding IP-over-InfiniBand interfaces. IPoIB also cannot change its HW address, so the patches address that issue. Once those patches land, bonding eth1394 interfaces may "just work". - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue
> It happens only when ib interfaces are slaves of a bonding device. > I thought before that the stuck is in napi_disable() but it's almost right. > I put prints before and after call to napi_disable and see that it is called > twice. > I'll try to investigate in this direction. > > ib0: stopping interface > ib0: before napi_disable > ib0: after napi_disable > ib0: downing ib_dev > ib0: All sends and receives done. > ib0: stopping interface > ib0: before napi_disable Yes, two napi_disable()s in a row without a matching napi_enable() will deadlock. I guess the question is why the ipoib interface is being stopped twice. If you just take the net-2.6.24 tree (without bonding patches), does bonding for ethernet interfaces work OK, or is there a similar problem with double napi_disable()? How about bonding of ethernet after this batch of bonding patches? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/ipoib: Bound the net device to the ipoib_neigh structue
> I also ran a test for the code in the branch of 2.6.24 and found a problem. > I see that ifconfig down doesn't return (for IPoIB interfaces) and it's > stuck in napi_disable() in the kernel (any idea why?) For what it's worth, I took the upstream 2.6.23 git tree and merged in Dave's latest net-2.6.24 tree and my latest for-2.6.24 tree and tried that. I brought up an IPoIB interface, sent a few pings, and did ifconfig down, and it worked fine. Can you try the same thing without the bonding patches to see if your setup works OK too? Also can you give more details about what you do to get ifconfig down stuck? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH 4/4] ibm_emac: Convert to use napi_struct independent of struct net_device
Sorry... wrong subject here; it should have been "ibm_newemac: ..." - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] ibm_emac: Convert to use napi_struct independent of struct net_device
Commit da3dedd9 ("[NET]: Make NAPI polling independent of struct net_device objects.") changed the interface to NAPI polling. Fix up the ibm_newemac driver so that it works with this new interface. This is actually a nice cleanup because ibm_newemac is one of the drivers that wants to have multiple NAPI structures for a single net_device. Compile-tested only as I don't have a system that uses the ibm_newemac driver. This conversion the conversion for the ibm_emac driver that was tested on real PowerPC 440SPe hardware. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/net/ibm_newemac/mal.c | 55 ++-- drivers/net/ibm_newemac/mal.h |2 +- 2 files changed, 20 insertions(+), 37 deletions(-) diff --git a/drivers/net/ibm_newemac/mal.c b/drivers/net/ibm_newemac/mal.c index c4335b7..5885411 100644 --- a/drivers/net/ibm_newemac/mal.c +++ b/drivers/net/ibm_newemac/mal.c @@ -235,10 +235,10 @@ static irqreturn_t mal_serr(int irq, void *dev_instance) static inline void mal_schedule_poll(struct mal_instance *mal) { - if (likely(netif_rx_schedule_prep(&mal->poll_dev))) { + if (likely(napi_schedule_prep(&mal->napi))) { MAL_DBG2(mal, "schedule_poll" NL); mal_disable_eob_irq(mal); - __netif_rx_schedule(&mal->poll_dev); + __napi_schedule(&mal->napi); } else MAL_DBG2(mal, "already in poll" NL); } @@ -318,8 +318,7 @@ void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac) msleep(1); /* Synchronize with the MAL NAPI poller. */ - while (test_bit(__LINK_STATE_RX_SCHED, &mal->poll_dev.state)) - msleep(1); + napi_disable(&mal->napi); } void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac) @@ -330,11 +329,11 @@ void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac) // XXX might want to kick a poll now... } -static int mal_poll(struct net_device *ndev, int *budget) +static int mal_poll(struct napi_struct *napi, int budget) { - struct mal_instance *mal = netdev_priv(ndev); + struct mal_instance *mal = container_of(napi, struct mal_instance, napi); struct list_head *l; - int rx_work_limit = min(ndev->quota, *budget), received = 0, done; + int received = 0; unsigned long flags; MAL_DBG2(mal, "poll(%d) %d ->" NL, *budget, @@ -358,26 +357,21 @@ static int mal_poll(struct net_device *ndev, int *budget) int n; if (unlikely(test_bit(MAL_COMMAC_POLL_DISABLED, &mc->flags))) continue; - n = mc->ops->poll_rx(mc->dev, rx_work_limit); + n = mc->ops->poll_rx(mc->dev, budget); if (n) { received += n; - rx_work_limit -= n; - if (rx_work_limit <= 0) { - done = 0; - // XXX What if this is the last one ? - goto more_work; - } + budget -= n; + if (budget <= 0) + goto more_work; // XXX What if this is the last one ? } } /* We need to disable IRQs to protect from RXDE IRQ here */ spin_lock_irqsave(&mal->lock, flags); - __netif_rx_complete(ndev); + __napi_complete(napi); mal_enable_eob_irq(mal); spin_unlock_irqrestore(&mal->lock, flags); - done = 1; - /* Check for "rotting" packet(s) */ list_for_each(l, &mal->poll_list) { struct mal_commac *mc = @@ -387,12 +381,12 @@ static int mal_poll(struct net_device *ndev, int *budget) if (unlikely(mc->ops->peek_rx(mc->dev) || test_bit(MAL_COMMAC_RX_STOPPED, &mc->flags))) { MAL_DBG2(mal, "rotting packet" NL); - if (netif_rx_reschedule(ndev, received)) + if (napi_reschedule(napi)) mal_disable_eob_irq(mal); else MAL_DBG2(mal, "already in poll list" NL); - if (rx_work_limit > 0) + if (budget > 0) goto again; else goto more_work; @@ -401,13 +395,8 @@ static int mal_poll(struct net_device *ndev, int *budget) } more_work: - ndev->quota -= received; - *budget -= received; - - MAL_DBG2(mal, "poll() %d <- %d" NL, *budget, -done ? 0 : 1); - - re
[PATCH 1/4] IPoIB: Fix unused variable warning
The conversion to use netdevice internal stats left an unused variable in ipoib_neigh_free(), since there's no longer any reason to get netdev_priv() in order to increment dropped packets. Delete the unused priv variable. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/ipoib/ipoib_main.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index 6b1b4b2..855c9de 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -854,7 +854,6 @@ struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour) void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh) { - struct ipoib_dev_priv *priv = netdev_priv(dev); struct sk_buff *skb; *to_ipoib_neigh(neigh->neighbour) = NULL; while ((skb = __skb_dequeue(&neigh->queue))) { - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/4] ibm_emac: Convert to use napi_struct independent of struct net_device
Commit da3dedd9 ("[NET]: Make NAPI polling independent of struct net_device objects.") changed the interface to NAPI polling. Fix up the ibm_emac driver so that it works with this new interface. This is actually a nice cleanup because ibm_emac is one of the drivers that wants to have multiple NAPI structures for a single net_device. Tested with the internal MAC of a PowerPC 440SPe SoC with an AMCC 'Yucca' evaluation board. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/net/ibm_emac/ibm_emac_mal.c | 48 -- drivers/net/ibm_emac/ibm_emac_mal.h |2 +- include/linux/netdevice.h | 10 +++ 3 files changed, 28 insertions(+), 32 deletions(-) diff --git a/drivers/net/ibm_emac/ibm_emac_mal.c b/drivers/net/ibm_emac/ibm_emac_mal.c index cabd984..cc3ddc9 100644 --- a/drivers/net/ibm_emac/ibm_emac_mal.c +++ b/drivers/net/ibm_emac/ibm_emac_mal.c @@ -207,10 +207,10 @@ static irqreturn_t mal_serr(int irq, void *dev_instance) static inline void mal_schedule_poll(struct ibm_ocp_mal *mal) { - if (likely(netif_rx_schedule_prep(&mal->poll_dev))) { + if (likely(napi_schedule_prep(&mal->napi))) { MAL_DBG2("%d: schedule_poll" NL, mal->def->index); mal_disable_eob_irq(mal); - __netif_rx_schedule(&mal->poll_dev); + __napi_schedule(&mal->napi); } else MAL_DBG2("%d: already in poll" NL, mal->def->index); } @@ -273,11 +273,11 @@ static irqreturn_t mal_rxde(int irq, void *dev_instance) return IRQ_HANDLED; } -static int mal_poll(struct net_device *ndev, int *budget) +static int mal_poll(struct napi_struct *napi, int budget) { - struct ibm_ocp_mal *mal = ndev->priv; + struct ibm_ocp_mal *mal = container_of(napi, struct ibm_ocp_mal, napi); struct list_head *l; - int rx_work_limit = min(ndev->quota, *budget), received = 0, done; + int received = 0; MAL_DBG2("%d: poll(%d) %d ->" NL, mal->def->index, *budget, rx_work_limit); @@ -295,38 +295,34 @@ static int mal_poll(struct net_device *ndev, int *budget) list_for_each(l, &mal->poll_list) { struct mal_commac *mc = list_entry(l, struct mal_commac, poll_list); - int n = mc->ops->poll_rx(mc->dev, rx_work_limit); + int n = mc->ops->poll_rx(mc->dev, budget); if (n) { received += n; - rx_work_limit -= n; - if (rx_work_limit <= 0) { - done = 0; + budget -= n; + if (budget <= 0) goto more_work; // XXX What if this is the last one ? - } } } /* We need to disable IRQs to protect from RXDE IRQ here */ local_irq_disable(); - __netif_rx_complete(ndev); + __napi_complete(napi); mal_enable_eob_irq(mal); local_irq_enable(); - done = 1; - /* Check for "rotting" packet(s) */ list_for_each(l, &mal->poll_list) { struct mal_commac *mc = list_entry(l, struct mal_commac, poll_list); if (unlikely(mc->ops->peek_rx(mc->dev) || mc->rx_stopped)) { MAL_DBG2("%d: rotting packet" NL, mal->def->index); - if (netif_rx_reschedule(ndev, received)) + if (napi_reschedule(napi)) mal_disable_eob_irq(mal); else MAL_DBG2("%d: already in poll list" NL, mal->def->index); - if (rx_work_limit > 0) + if (budget > 0) goto again; else goto more_work; @@ -335,12 +331,8 @@ static int mal_poll(struct net_device *ndev, int *budget) } more_work: - ndev->quota -= received; - *budget -= received; - - MAL_DBG2("%d: poll() %d <- %d" NL, mal->def->index, *budget, -done ? 0 : 1); - return done ? 0 : 1; + MAL_DBG2("%d: poll() %d <- %d" NL, mal->def->index, budget, received); + return received; } static void mal_reset(struct ibm_ocp_mal *mal) @@ -425,11 +417,8 @@ static int __init mal_probe(struct ocp_device *ocpdev) mal->def = ocpdev->def; INIT_LIST_HEAD(&mal->poll_list); - set_bit(__LINK_STATE_START, &mal->poll_dev.state); - mal->poll_dev.weight = CONFIG_IBM_EMAC_POLL_WEIGHT; - mal->
[PATCH 3/4] ibm_new_emac: Nuke SET_MODULE_OWNER() use
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/net/ibm_newemac/core.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c index ce127b9..8ea5009 100644 --- a/drivers/net/ibm_newemac/core.c +++ b/drivers/net/ibm_newemac/core.c @@ -2549,7 +2549,6 @@ static int __devinit emac_probe(struct of_device *ofdev, dev->ndev = ndev; dev->ofdev = ofdev; dev->blist = blist; - SET_MODULE_OWNER(ndev); SET_NETDEV_DEV(ndev, &ofdev->dev); /* Initialize some embedded data structures */ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3][NET_BATCH] net core use batching
> Before you add new entries to your list, how is that ibm driver NAPI > conversion coming along? :-) OK, thanks for the kick in the pants, I have a couple of patches for net-2.6.24 coming (including an unrelated trivial warning fix for IPoIB). - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3][NET_BATCH] net core use batching
> Before you add new entries to your list, how is that ibm driver NAPI > conversion coming along? :-) I still haven't done much. OK, I will try to get my board booting again this week. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3][NET_BATCH] net core use batching
> I'd say we can probably try to get rid of it in 2.6.25, this is > assuming we get driver authors to cooperate and do the conversions > or alternatively some other motivated person. > > I can just threaten to do them all and that should get the driver > maintainers going :-) I can definitely kill LLTX for IPoIB by 2.6.25 and I just added it to my TODO list so I don't forget. In fact if 2.6.23 drags on long enough I may do it for 2.6.24 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24
> I tested this by simulating a slow passive side responder, and it worked as > expected for those tests. Using an MRA does add another MAD to the CM > exchange, > which is why it is sent only after seeing a duplicate request. > Alternatively, > we can take the OFED module parameter patch. What the heck, I added this for 2.6.24. If it doesn't work out we can back it out. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Vague maybe ppp-related panic report for 2.6.23-rc9
Just as a quick update -- I seem to only be able to reproduce this crash when my ppp session drops, which seems associated with marginal signal. And unfortunately I have great coverage at home so I haven't been able to reproduce this again today. Maybe on the train tomorrow I can crash my laptop... - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mac80211: Fix TX after monitor interface is converted to managed
> Programming with assertions (and BUG_ON is a form of that) is > generally a good practice. Almost any book or other source on > good programming practices will agree. Yes, it can be overdone. > But I don't really think that is the case here, since the check is > relatively inexpensive and the consequence should it ever *somehow* > happen could be a something wierd (crash, corruption, etc) w/o any > other indication of what occured. The problem with BUG_ON is that it kills the whole system. So every time you add a BUG_ON into code, you have to weigh whether the problem you detected is so severe that the right response is to panic. For example, I can see panicking on something fundamental like corrupted page tables. However I would submit that the wireless stack should *never* use BUG_ON -- printing a warning and trying to limp on seems preferable to me. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Vague maybe ppp-related panic report for 2.6.23-rc9
> I don't want to jump the gun on the analysis but it just might > be the packet sharing fixes Herbert put in a short time ago. > > What you could do is go back to say rc2 and see if you still get > the panics, then bisect from there to narrow it down. > > If rc2 still gives the panic, it's something else, perhaps device > specific. OK thanks. I'm still trying to find a good way to reproduce it, so I'm going to try to get it to happen with -rc9 while running in the console, and at least take a picture of the backtrace. If I find a good way to make it happen then I'll try rc2 and let you know. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Vague maybe ppp-related panic report for 2.6.23-rc9
I'm running 2.6.23-rc9 on my laptop, and when in a coffee shop I use a Verizon EVDO card to get network access. This is a kyocera device that looks like a serial adapter behind an ohci usb controller, and uses the airprime driver (for usb device 0c88:17da). The actual IP networking is ppp over that serial adapter. The connection is slightly flaky (I blame this on the Verizon coverage), so I end up restarting things every so often. The point of this email (at last) is that I've just seen my second panic (only info: blinking caps lock led, since I've been in X) that seems correlated to stopping/restarting pppd. Sorry for the lack of detail -- I've just switched to running in the console so if I can provoke the crash again I'll get a little more info. I just wanted to mention this in case someone has seen something similar or has any good ideas about how to capture debugging output on a laptop (when I'm not home and have no other boxes handy). - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: tcp bw in 2.6
> It would be really great to see numbers with a more recent kernel > than 2.6.18 FWIW Debian has binaries for 2.6.21 in testing and for 2.6.22 in unstable so it should be very easy for Larry to try at least those. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24
> >OK -- just to make sure I'm understanding what you're saying: have you > >confirmed that your proposed [CM MRA] patches actually fix the issue? > > Not directly. I cannot easily test kernel patches on our larger, production > clusters. We've seen the issue with specific applications on 512 and 1024 > cores, but I've only been able to test the patch on a 48-core cluster. I > have > verified that it successfully increases the timeout to where it *should* > work, > but cannot absolutely confirm that it will fix the problem. I'm unlikely to > know that until the production clusters move to an OFED release (1.3?) > containing this patch. Umm... this is a difficult situation for me to merge the changes then. We're changing the CM retry behavior blind here. How do we know that the MRA changes don't make the scalability issue worse? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IPoIB: Convert to netdevice internal stats
> How is that ibm_emac NAPI conversion coming along? :-) Sorry, trying to reduce my backlog first, but it is still on my list of things to work on :) - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] IPoIB: Convert to netdevice internal stats
Use the stats member of struct netdevice in IPoIB, so we can save memory by deleting the stats member of struct ipoib_dev_priv, and save code by deleting ipoib_get_stats(). Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- Dave, can you queue this in net-2.6.24 please? I would ordinarily merge IPoIB changes but since this depends on the netdevice internal stats change it becomes a cross-tree dependency if I try to do that. And I'd like to get it queued in git now before the merge window. Thanks... drivers/infiniband/ulp/ipoib/ipoib.h |2 -- drivers/infiniband/ulp/ipoib/ipoib_cm.c| 20 ++-- drivers/infiniband/ulp/ipoib/ipoib_ib.c| 18 +- drivers/infiniband/ulp/ipoib/ipoib_main.c | 22 +++--- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 10 +- 5 files changed, 31 insertions(+), 41 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h index 3a6ef14..1e627ee 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib.h +++ b/drivers/infiniband/ulp/ipoib/ipoib.h @@ -300,8 +300,6 @@ struct ipoib_dev_priv { struct ib_event_handler event_handler; - struct net_device_stats stats; - struct net_device *parent; struct list_head child_intfs; struct list_head list; diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c index 08b4676..1afd93c 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c @@ -430,7 +430,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) ipoib_dbg(priv, "cm recv error " "(status=%d, wrid=%d vend_err %x)\n", wc->status, wr_id, wc->vendor_err); - ++priv->stats.rx_dropped; + ++dev->stats.rx_dropped; goto repost; } @@ -457,7 +457,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) * this packet and reuse the old buffer. */ ipoib_dbg(priv, "failed to allocate receive buffer %d\n", wr_id); - ++priv->stats.rx_dropped; + ++dev->stats.rx_dropped; goto repost; } @@ -474,8 +474,8 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc) skb_pull(skb, IPOIB_ENCAP_LEN); dev->last_rx = jiffies; - ++priv->stats.rx_packets; - priv->stats.rx_bytes += skb->len; + ++dev->stats.rx_packets; + dev->stats.rx_bytes += skb->len; skb->dev = dev; /* XXX get correct PACKET_ type here */ @@ -512,8 +512,8 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ if (unlikely(skb->len > tx->mtu)) { ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", skb->len, tx->mtu); - ++priv->stats.tx_dropped; - ++priv->stats.tx_errors; + ++dev->stats.tx_dropped; + ++dev->stats.tx_errors; ipoib_cm_skb_too_long(dev, skb, tx->mtu - IPOIB_ENCAP_LEN); return; } @@ -532,7 +532,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ tx_req->skb = skb; addr = ib_dma_map_single(priv->ca, skb->data, skb->len, DMA_TO_DEVICE); if (unlikely(ib_dma_mapping_error(priv->ca, addr))) { - ++priv->stats.tx_errors; + ++dev->stats.tx_errors; dev_kfree_skb_any(skb); return; } @@ -542,7 +542,7 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_ if (unlikely(post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1), addr, skb->len))) { ipoib_warn(priv, "post_send failed\n"); - ++priv->stats.tx_errors; + ++dev->stats.tx_errors; ib_dma_unmap_single(priv->ca, addr, skb->len, DMA_TO_DEVICE); dev_kfree_skb_any(skb); } else { @@ -580,8 +580,8 @@ static void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ipoib_cm_tx *tx ib_dma_unmap_single(priv->ca, tx_req->mapping, tx_req->skb->len, DMA_TO_DEVICE); /* FIXME: is this right? Shouldn't we only increment on success? */ - ++priv->stats.tx_packets; - priv->stats.tx_bytes += tx_req->skb->len; + ++dev->stats.tx_packets; + dev->stats.tx_bytes += tx_req->skb->len; dev_kfree_skb_any(tx_req->skb); diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infinib
Re: [PATCH V6 0/9] net/bonding: ADD IPoIB support for the bonding driver
> Roland, are you comfortable with the IB changes in patches 1 and 2? Yes, they look fine to me. Feel free to apply, with Acked-by: Roland Dreier <[EMAIL PROTECTED]> - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
> The action in bonding to a detach of slave is to unregister the master (see > patch 10). > This can't be done from the context of unregister_netdevice itself (it is > protected by rtnl_lock). I'm confused. Your patch has: > +ipoib_slave_detach(cpriv->dev); > unregister_netdev(cpriv->dev); And ipoib_slave_detach() is: > +static inline void ipoib_slave_detach(struct net_device *dev) > +{ > +rtnl_lock(); > +netdev_slave_detach(dev); > +rtnl_unlock(); > +} so you are calling netdev_slave_detach() with the rtnl lock held. Why can't you make the same call from the start of unregister_netdevice()? Anyway, if the rtnl lock is a problem, can you just add the call to netdev_slave_detach() to unregister_netdev() before it takes the rtnl lock? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] bnx2: factor out gzip unpacker
> Just change the makefiles to always install gzip'ed modules > modutils knows how to unzip them on the fly. But that leaves the uncompressed firmware blobs in .data that ends up in unswappable kernel memory. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] RDMA/CMA: Use neigh_event_send() to initiate neighbour discovery.
> Roland - can you please queue this up for 2.6.24? Done, thanks. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
> +ipoib_slave_detach(cpriv->dev); > unregister_netdev(cpriv->dev); Maybe you already answered this before, but I'm still not clear why this notifier call can't just be added to the start of unregister_netdevice(), so we can avoid having driver needing to know anything about bonding internals? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister
> I think that if there are no other comments, I will submit the entire 11 > patches > again (with changes) to make it easier to merge into the kernel. Since the > most of the > content in the patch series is in bonding I thought it would be right that > Jay will > push all the patches to the networking git. Is it OK with you Roland? Yes, that's fine. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] [PATCH] RDMA/CMA: Implement rdma_resolve_ip retry enhancement.
Thanks for the patch... > If an application is calling rdma_resolve_ip() and a status of -ENODATA is > returned from addr_resolve_local/remote(), the timeout mechanism waits until > the application's timeout occurs before rechecking the address resolution > status; the application will wait until it's full timeout occurs. This case > is seen when the work thread call to process_req() is made before the arp > packet is processed. I'm having a hard time understanding this changelog. Could you please resend with a description that lets me understand: - What the current behavior is and what is wrong with that; - What the behavior should be; - And how your patch changes the behavior to be correct. > This patch is in addition to Steve Wise's neigh_event_send patch to initiate > neighbour discovery sent on 9/12/2007. Does this mean it depends on Steve's patch being applied first? Also please try to keep lines in the changelog to 72 characters or so... > @@ -136,6 +137,7 @@ static void set_timeout(unsigned long ti > static void queue_req(struct addr_req *req) > { > struct addr_req *temp_req; > +unsigned long req_timeout = msecs_to_jiffies(MIN_ADDR_TIMEOUT_MS) + > jiffies; > > mutex_lock(&lock); > list_for_each_entry_reverse(temp_req, &req_list, list) { > @@ -145,8 +147,10 @@ static void queue_req(struct addr_req *r > > list_add(&req->list, &temp_req->list); > > -if (req_list.next == &req->list) > +if (req_list.next == &req->list) { > +req_timeout = min(req_timeout, req->timeout); > set_timeout(req->timeout); > +} > mutex_unlock(&lock); > } I don't understand this code. It seems you keep track of the minimum timeout, and then ignore the value you computed. What am I missing? Thanks, Roland - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: new NAPI interface broken
> One other thing I observed is that I can not unload the module as the > ref counter of the eth device is too low. I haven't tracked down the > source of this problem yet. I suspect that this is because netif_rx_reschedule() was missing a dev_hold() to match the dev_put() in netif_rx_complete(). Dave merged a fix for that problem yesterday, so current net-2.6.24 should be OK. Let us know if it's not OK, because then there's another bug... - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang
> > and I think that is only in Dave's net-2.6.24 tree now, right? > > Nope, that was what I downloaded yesterday: > > VERSION = 2 > PATCHLEVEL = 6 > SUBLEVEL = 23 > EXTRAVERSION =-rc6 > NAME = Pink Farting Weasel Please double check your tree. I just very carefully looked at my trees, and the poll_more: label is added in commit 6b460a71 ("[NET]: Make NAPI polling independent of struct net_device objects.") which is only in the net-2.6.24 tree. Of course Dave did not change the version information in the Makefile since he wouldn't want Linus to pick up any extra strange changes when he pulls, so a net-2.6.24 tree will look like 2.6.23-rc6 as you quoted. And the refcounting bug I fixed is only in net-2.6.24. > To be clear, netif_rx_schedule while we are still in the poll list will not > do any harm as it does nothing since NAPI_STATE_SCHED is still set (cleared > by netif_rx_complete which has not yet run). Effectively we lost/delayed > processing an interrupt, if I understood the code right. Right, we lose an interrupt, and since the CQ events are one-shot, we never get another one, and the interface is effectively dead. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()
> Further complicating things is that you need to setup a ppc32 > cross-build environment to even build test a conversion, and I'm not > comfortable doing the surgery until I can test build the thing. OK, I actually have a system with a ppc 440 SoC that uses this driver, so I'll try to get things to the stage where I can boot net-2.6.24 on it and see if I can get the driver working... - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-2.6.24] Fix documentation for dev_put()/dev_hold()
It looks like the comments for dev_put() and dev_hold() got reversed somehow. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- include/linux/netdevice.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index be5fe05..239ae68 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1030,7 +1030,7 @@ extern intnetdev_budget; extern void netdev_run_todo(void); /** - * dev_put - get reference to device + * dev_put - release reference to device * @dev: network device * * Hold reference to device to keep it from being freed. @@ -1041,7 +1041,7 @@ static inline void dev_put(struct net_device *dev) } /** - * dev_hold - release reference to device + * dev_hold - get reference to device * @dev: network device * * Release reference to device to allow it to be freed. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()
netif_rx_complete() takes a netdev parameter and does dev_put() on that netdev, so netif_rx_reschedule() needs to also take a netdev parameter and do dev_hold() on it to avoid reference counts from getting becoming negative because of unbalanced dev_put()s. This should fix the problem reported by Krishna Kumar <[EMAIL PROTECTED]> with IPoIB waiting forever for netdev refcounts to become 0 during module unload. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- Dave, feel free to roll this up into earlier NAPI conversion patches (assuming I'm understanding things correctly and this patch actually makes sense!). BTW, it looks like drivers/net/ibm_emac/ibm_emac_mal.c would not have built in the current net-2.6.24 tree, since its call to netif_rx_reschedule() was left with the netdev parameter. So that file does not need to be touched in this patch. drivers/infiniband/ulp/ipoib/ipoib_ib.c |2 +- drivers/net/arm/ep93xx_eth.c|2 +- drivers/net/ehea/ehea_main.c|2 +- drivers/net/ibmveth.c |2 +- include/linux/netdevice.h | 21 +++-- 5 files changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c index 6a2bff4..481e4b6 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -320,7 +320,7 @@ poll_more: if (unlikely(ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS)) && - netif_rx_reschedule(napi)) + netif_rx_reschedule(dev, napi)) goto poll_more; } diff --git a/drivers/net/arm/ep93xx_eth.c b/drivers/net/arm/ep93xx_eth.c index f3858d1..7f016f3 100644 --- a/drivers/net/arm/ep93xx_eth.c +++ b/drivers/net/arm/ep93xx_eth.c @@ -309,7 +309,7 @@ poll_some_more: } spin_unlock_irq(&ep->rx_lock); - if (more && netif_rx_reschedule(napi)) + if (more && netif_rx_reschedule(dev, napi)) goto poll_some_more; } diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c index 4a5ab4a..9a499f4 100644 --- a/drivers/net/ehea/ehea_main.c +++ b/drivers/net/ehea/ehea_main.c @@ -636,7 +636,7 @@ static int ehea_poll(struct napi_struct *napi, int budget) if (!cqe && !cqe_skb) return 0; - if (!netif_rx_reschedule(napi)) + if (!netif_rx_reschedule(dev, napi)) return 0; } diff --git a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c index b8d7cec..b94f266 100644 --- a/drivers/net/ibmveth.c +++ b/drivers/net/ibmveth.c @@ -973,7 +973,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget) netif_rx_complete(netdev, napi); if (ibmveth_rxq_pending_buffer(adapter) && - netif_rx_reschedule(napi)) { + netif_rx_reschedule(netdev, napi)) { lpar_rc = h_vio_signal(adapter->vdev->unit_address, VIO_IRQ_DISABLE); goto restart_poll; diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index be5fe05..0dbf185 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1198,16 +1198,6 @@ static inline u32 netif_msg_init(int debug_value, int default_msg_enable_bits) return (1 << debug_value) - 1; } -/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). */ -static inline int netif_rx_reschedule(struct napi_struct *n) -{ - if (napi_schedule_prep(n)) { - __napi_schedule(n); - return 1; - } - return 0; -} - /* Test if receive needs to be scheduled but only if up */ static inline int netif_rx_schedule_prep(struct net_device *dev, struct napi_struct *napi) @@ -1234,6 +1224,17 @@ static inline void netif_rx_schedule(struct net_device *dev, __netif_rx_schedule(dev, napi); } +/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). */ +static inline int netif_rx_reschedule(struct net_device *dev, + struct napi_struct *napi) +{ + if (napi_schedule_prep(napi)) { + __netif_rx_schedule(dev, napi); + return 1; + } + return 0; +} + /* same as netif_rx_complete, except that local_irq_save(flags) * has already been issued */ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister
> Maybe this new notification function should be in net/core/dev.c > instead of exporting call_netdevice_notifiers()? Or actually, does it work to add the call to the notifiers directly in unregister_netdev() so that device drivers don't have to worry about it? (And is the existing patch missing a call to notifiers in ipoib_vlan_delete()?) - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: InfiniBand/RDMA merge plans for 2.6.24
> Roland, could you merge the common TX CQ patch please? > It actually fixes a real problem. Yes, I will, but it collides with the net-2.6.24 NAPI rework I think, so it may not go in until a few days after the merge window. Have you verified that the patch cures the interrupt overload issues? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang
Thanks for testing on ehca... > While using IPoIB over EHCA (rc6 bits), unregister_netdev hangs with I don't think you're actually using rc6 bits, since in your patch you have: > -poll_more: and I think that is only in Dave's net-2.6.24 tree now, right? > The problem is that the poll handler does netif_rx_complete (which > does a dev_put) followed by netif_rx_reschedule() to schedule for > more receives (which again does a dev_put). This reduces refcount to > < 0 (depending on how many times netif_rx_complete followed by > netif_rx_reschedule was called). Dave, the real problem seems to be that netif_rx_recschedule() calls __napi_schedule() rather than __netif_rx_schedule(), so it misses the call to dev_hold() that is needed to balance the dev_put() in netif_rx_complete(). The current netif_rx_reschedule() looks like it really should be napi_reschedule(), and we need a new function that takes a netdev too. Or am I misunderstanding the refcounting? I'll send a patch once I've had some breakfast and had a chance to at least compile it... Krishna, unfortunately your proposed fix has a race: > -netif_rx_complete(dev, napi); > -if (unlikely(ib_req_notify_cq(priv->cq, > - IB_CQ_NEXT_COMP | > - IB_CQ_REPORT_MISSED_EVENTS)) && > -netif_rx_reschedule(napi)) > -goto poll_more; > +if (likely(!ib_req_notify_cq(priv->cq, > + IB_CQ_NEXT_COMP | > + IB_CQ_REPORT_MISSED_EVENTS))) It is possible for an interrupt to happen immediately right here, before the netif_rx_complete(), so that netif_rx_schedule() gets called while we are still on the poll list. > +netif_rx_complete(dev, napi); - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister
> Conceptually, I see your point and I'm ok with doing it either > way. My only question is, would this change would make the ipoib module > dependent upon having the bonding module loaded (to resolve all of the > symbols)? Yes, I guess so, if that function is in bonding. Hmm, that wouldn't be a good change. Maybe this new notification function should be in net/core/dev.c instead of exporting call_netdevice_notifiers()? - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/11] IB/ipoib: Bound the net device to the ipoib_neigh structue
Overall idea looks good... one comment: > +if (n->dev->flags & IFF_MASTER) { > +/* n->dev is not an IPoIB device and we have > +to take priv from elsewhere */ > +neigh = *to_ipoib_neigh(n); > +if (neigh) { > +priv = netdev_priv(neigh->dev); > +ipoib_dbg(priv, "neigh_destructor for bonding device: > %s\n", > + n->dev->name); > +} else > +return; > +} seems like it would be cleaner not to worry about bonding here and just use neigh->dev all the time rather than having two ways to look up the device. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister
OK with me. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/11] IB/ipoib: Notify the world before doing unregister
Actually, thinking about this some more... would it be cleaner to more the knowledge about bonding out of the ipoib driver? in other words, export something similar to > +static int ipoib_slave_detach(struct net_device *dev) > +{ > +int ret = 0; > +if (dev->flags & IFF_SLAVE) { > +dev->priv_flags |= IFF_SLAVE_DETACH; > +rtnl_lock(); > +ret = call_netdevice_notifiers(NETDEV_CHANGE, dev); > +rtnl_unlock(); > +} > +return ret; > +} for drivers to use, rather than putting use of IFF_SLAVE and IFF_SLAVE_DETACH outside of the bonding driver. Also it seems this function could return void, since both call sites ignore the return value and I don't see anything sensible that IPoIB could do with the notifier chain return value anyway. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/11] IB/ipoib: Verify address handle validity on send
Looks fine overall, with one minor nitpick: > -if (unlikely(memcmp(&neigh->dgid.raw, > +if (unlikely((memcmp(&neigh->dgid.raw, > skb->dst->neighbour->ha + 4, > -sizeof(union ib_gid { > +sizeof(union ib_gid))) || > + (neigh->dev != dev))) { the indentation here makes this confusing to read -- I would just do: } else if (neigh->ah) { if (unlikely(memcmp(&neigh->dgid.raw, skb->dst->neighbour->ha + 4, - sizeof(union ib_gid { + sizeof(union ib_gid)) || +neigh->dev != dev)) { - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/11] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag
I tried to look at the ipoib stuff in this series... this patch looks fine but it doesn't actually touch ipoib, so the subject line is a bit misleading... - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24
> The IGMP enabling patch posted by me on September 2nd isn't on your list > http://lists.openfabrics.org/pipermail/general/2007-September/040250.html > can you add it? Yes, I lost that somehow. I will add it to my list of things to take a look at (no opinion yet). - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: InfiniBand/RDMA merge plans for 2.6.24
> > IPoIB CM handles this properly by gathering together single pages in > > skbs' fragment lists. > Then can we reuse IPoIB CM code here? Yes, if possible, refactoring things so that the rx skb allocation code becomes common between CM and non-CM would definitely make sense. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24
> > I've been meaning to track down the bnx2 iscsi offload patch to look > > and see if this issue is addressed, since the same problem seems to > > exist: it seems an iscsi connection and a main stack tcp connection > > might share the same 4-tuple unless something is done to avoid that > > happening. > iSCSI does not do passive listens, only active connections to the > target. But you're right, the port space is still shared between iSCSI > and the main stack. We currently rely on user apps binding to the main > stack to reserve certain ephemeral ports, and telling the iSCSI driver > which ports to use. Got it... I wasn't thinking that clearly, but it is clear that a full 4-tuple collision with only active connections is quite unlikely. I guess you would have to make both an offloaded and a non-offloaded iSCSI connection to the same target and get really unlucky with ephemeral port allocation. So in practice I guess it's not an issue at all with your driver yet. However, do you have any plans to support iSCSI offload for targets? Also, looking at the first CNIC patch, I can't help but notice that you seem to have at least some support for iWARP there. How does the CNIC look? Does it share the same interface/addresses as the non-offload NIC, or does it create a completely separate netdevice? I want to make sure that whatever solution we come up with for cxgb3 doesn't cause problems for you. And of course if you have a better idea than what Steve has come up with, that would be great :) - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: InfiniBand/RDMA merge plans for 2.6.24
> The patch is just needed to pick up broadcast MTU size instead of hard > coding 2K right now. SKB allocation shouldn't be different with Ethernet > Jambo Frame and IPoIB-CM which 64K MTU. I don't understand why it's > different. Could you please explain this? It's exactly the same problem as ethernet jumbo frames. A web search for '"order 1" failure e1000' might be interesting. IPoIB CM handles this properly by gathering together single pages in skbs' fragment lists. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts.
> Maybe you should automatically create an alias each time new interface > is added so that admin would not care about proper aliases? I agree that makes much more sense from a user interface point of view. Unfortunately an alias without an address doesn't make sense, so there doesn't seem to be a way to implement that. - R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html