RE: [PATCH] [NET]: Fix TX bug VLAN in VLAN
2007/11/27, Herbert Xu <[EMAIL PROTECTED]>:
> On Tue, Nov 27, 2007 at 02:32:49PM +0900, Joonwoo Park wrote:
> >
> > Thanks Herbert.
> > Well.. I think patch would work properly for AF_PACKET also.
> > (I did not insert BUG() macro in my patch)
> > How do you think?
>
> Are you sure? I thought you need to check both in the xmit function.
> That is,
>
>	if (veth->h_vlan_proto != htons(ETH_P_8021Q) ||
>	    VLAN_DEV_INFO(dev)->flags & VLAN_FLAG_REORDER_HDR) {
>
> Otherwise you'll miss AF_PACKET packets when REORDER is off.

Thanks Herbert!
I agree with you.

Thanks.
Joonwoo

[NET]: Fix TX bug VLAN in VLAN

Fix misbehavior of vlan_dev_hard_start_xmit() for recursive encapsulations.

Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>
---
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 7a36878..4f99bb8 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -462,7 +462,8 @@ int vlan_dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * OTHER THINGS LIKE FDDI/TokenRing/802.3 SNAPs...
 	 */
-	if (veth->h_vlan_proto != htons(ETH_P_8021Q)) {
+	if (veth->h_vlan_proto != htons(ETH_P_8021Q) ||
+		VLAN_DEV_INFO(dev)->flags & VLAN_FLAG_REORDER_HDR) {
 		int orig_headroom = skb_headroom(skb);
 		unsigned short veth_TCI;
---
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
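The combined check from this thread can be modelled in userspace C. This is only a sketch under stated assumptions: needs_vlan_header() and check_vlan_cases() are hypothetical helpers, not kernel functions, and the value 0x1 for VLAN_FLAG_REORDER_HDR is assumed for the example. It shows why both conditions matter: a frame that already starts with an 802.1Q tag is passed through only when REORDER_HDR is off; with REORDER_HDR on, that tag is treated as payload (the VLAN-in-VLAN case) and a header is still added.

```c
#include <arpa/inet.h>   /* htons() */
#include <assert.h>
#include <stdint.h>

#ifndef ETH_P_8021Q
#define ETH_P_8021Q 0x8100            /* 802.1Q VLAN ethertype */
#endif
#ifndef VLAN_FLAG_REORDER_HDR
#define VLAN_FLAG_REORDER_HDR 0x1     /* assumed value, illustration only */
#endif

/*
 * Hypothetical helper mirroring the condition in the patch.
 * Returns 1 when the xmit path should insert a VLAN header itself.
 */
int needs_vlan_header(uint16_t h_vlan_proto_be, unsigned int dev_flags)
{
	return h_vlan_proto_be != htons(ETH_P_8021Q) ||
	       (dev_flags & VLAN_FLAG_REORDER_HDR) != 0;
}

/* The four combinations discussed in the thread. */
void check_vlan_cases(void)
{
	/* Untagged frame: always gets a header. */
	assert(needs_vlan_header(htons(0x0800), 0) == 1);
	assert(needs_vlan_header(htons(0x0800), VLAN_FLAG_REORDER_HDR) == 1);
	/* Already tagged, REORDER off (e.g. built via AF_PACKET): pass through. */
	assert(needs_vlan_header(htons(ETH_P_8021Q), 0) == 0);
	/* Tagged, REORDER on: the tag belongs to the payload, add the outer one. */
	assert(needs_vlan_header(htons(ETH_P_8021Q), VLAN_FLAG_REORDER_HDR) == 1);
}
```

Calling check_vlan_cases() from any test program exercises all four cases.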
Re: [PATCH 2/10] [SKBUFF]: Add skb_morph
From: Herbert Xu <[EMAIL PROTECTED]>

> On Mon, Nov 26, 2007 at 03:50:22PM +0900, Yasuyuki KOZAKAI wrote:
> >
> > The refcount of nfct is leaked by this function. As a result,
> > nf_conntrack_ipv6.ko cannot be unloaded after doing "ping6 -s 2000 ..." .
> > dst->dst and dst->secpath are also needed to be released, I think.
> >
> > Please consider to apply this patch.
>
> Good catch! Thanks for spotting this.
>
> I'm going to add the following patch to net-2.6.

That looks better.

-- Yasuyuki Kozakai
Re: Does tc-prio really work as advertised?
On 26-11-2007 23:25, Jarek Poplawski wrote:
...
> Are you doing this on the same box? I was tracing this long time ago too,
> and, if I didn't miss something, it was about the place! So, as I recall
> (after finding some old message) this TOS is considered only for packets
> going through the FORWARD chain. (But, I haven't checked this at all now,
> so "no complaints"...)

...Too exactly! Iptables aren't needed for this, so "going through the forward." should be enough...

Jarek P.
Re: [PATCH] [NET]: Fix TX bug VLAN in VLAN
On Tue, Nov 27, 2007 at 02:32:49PM +0900, Joonwoo Park wrote:
>
> Thanks Herbert.
> Well.. I think patch would work properly for AF_PACKET also.
> (I did not insert BUG() macro in my patch)
> How do you think?

Are you sure? I thought you need to check both in the xmit function.
That is,

	if (veth->h_vlan_proto != htons(ETH_P_8021Q) ||
	    VLAN_DEV_INFO(dev)->flags & VLAN_FLAG_REORDER_HDR) {

Otherwise you'll miss AF_PACKET packets when REORDER is off.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Tue, 2007-11-27 at 15:49 +1100, Rusty Russell wrote:
> On Monday 26 November 2007 17:15:44 Roland Dreier wrote:
> > > Except C doesn't have namespaces and this mechanism doesn't create them.
> > > So this is just complete and utter makework; as I said before, noone's
> > > going to confuse all those udp_* functions if they're not in the udp
> > > namespace.
> >
> > I don't understand why you're so opposed to organizing the kernel's
> > exported symbols in a more self-documenting way.
>
> No, I was the one who moved exports near their declarations. That's
> organised. I just don't see how this new "organization" will help: oh good,
> I won't accidentally use the udp functions any more?!?
>
> > It seems pretty
> > clear to me that having a mechanism that requires modules to make
> > explicit which (semi-)internal APIs makes reviewing easier
>
> Perhaps you've got lots of patches where people are using internal APIs they
> shouldn't?

Maybe the issue is "who can tell" since what is external and what is internal is not explicitly defined?

> > , makes it
> > easier to communicate "please don't use that API" to module authors,
>
> Well, introduce an EXPORT_SYMBOL_INTERNAL(). It's a lot less code. But you'd
> still need to show that people are having trouble knowing what APIs to use.
>
> > and takes at least a small step towards bringing the kernel's exported
> > API under control.
>
> There is no "exported API" to bring under control.

Hmm...apparently, there are those that are struggling...

> There are symbols we
> expose for the kernel's own use which can be used by external modules at
> their own risk.
>
> > What's the real downside?
>
> No. That's the wrong question. What's the real upside?

Explicitly documenting what comprises the kernel API (external, supported) and what comprises the kernel implementation (internal, not supported).

> Let's not put code in the core because "it doesn't seem to hurt".

agreed.

> I'm sure you think there's a real problem, but I'm still waiting for someone
> to *show* it to me. Then we can look at solutions.

I think the benefits should include:

- forcing developers to identify their exports as part of the implementation or as part of the kernel API
- making it easier for reviewers to identify when developers are adding to the kernel API and thereby focusing the appropriate level of review to the new function
- making it obvious to developers when they are binding their implementation to a particular kernel release

> Rusty.
Re: [PATCH] [NET]: Fix TX bug VLAN in VLAN
2007/11/26, Herbert Xu <[EMAIL PROTECTED]>:
> On Fri, Nov 23, 2007 at 12:12:52PM +, Joonwoo Park wrote:
> > This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=8766
> >
> > Is it possible?
> > BUG((veth->h_vlan_proto != htons(ETH_P_8021Q)) &&
> >     !(VLAN_DEV_INFO(dev)->flags & VLAN_FLAG_REORDER_HDR))
> > I'm afraid, queued packet before vconfig set_flag would do that.
>
> Yes, AF_PACKET would do that. So you should check both.
>

Thanks Herbert.
Well.. I think patch would work properly for AF_PACKET also.
(I did not insert BUG() macro in my patch)
How do you think?

Thanks
Joonwoo
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Monday 26 November 2007 17:15:44 Roland Dreier wrote:
> > Except C doesn't have namespaces and this mechanism doesn't create them.
> > So this is just complete and utter makework; as I said before, noone's
> > going to confuse all those udp_* functions if they're not in the udp
> > namespace.
>
> I don't understand why you're so opposed to organizing the kernel's
> exported symbols in a more self-documenting way.

No, I was the one who moved exports near their declarations. That's organised. I just don't see how this new "organization" will help: oh good, I won't accidentally use the udp functions any more?!?

> It seems pretty
> clear to me that having a mechanism that requires modules to make
> explicit which (semi-)internal APIs makes reviewing easier

Perhaps you've got lots of patches where people are using internal APIs they shouldn't?

> , makes it
> easier to communicate "please don't use that API" to module authors,

Well, introduce an EXPORT_SYMBOL_INTERNAL(). It's a lot less code. But you'd still need to show that people are having trouble knowing what APIs to use.

> and takes at least a small step towards bringing the kernel's exported
> API under control.

There is no "exported API" to bring under control. There are symbols we expose for the kernel's own use which can be used by external modules at their own risk.

> What's the real downside?

No. That's the wrong question. What's the real upside?

Let's not put code in the core because "it doesn't seem to hurt".

I'm sure you think there's a real problem, but I'm still waiting for someone to *show* it to me. Then we can look at solutions.

Rusty.
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Monday 26 November 2007 16:58:08 Roland Dreier wrote:
> > > I agree that we shouldn't make things too hard for out-of-tree
> > > modules, but I disagree with your first statement: there clearly is a
> > > large class of symbols that are used by multiple modules but which are
> > > not generically useful -- they are only useful by a certain small
> > > class of modules.
> >
> > If it is so clear, you should be able to easily provide examples?
>
> Sure -- Andi's example of symbols required only by TCP congestion
> modules;

Exactly. Why exactly should someone not write a new TCP congestion module?

> the SCSI internals that Christoph wants to mark

He didn't justify those though, either.

> ; the symbols exported by my mlx4_core driver (which I admit are
> currently only used by the mlx4_ib driver, but which will also be used
> by at least the ethernet NIC driver for the same hardware).

Right. So presumably there will only ever be two drivers using this core code, so no new users will ever be written?

Now we've found one use case, is it worth the complexity of namespaces? Is it worth the halfway point of export-to-module? What problem will it solve?

> I thought this was
> already covered repeatedly in the thread and indeed in Andi's code so
> there was no need to repeat it...

No, we've seen the solution and various people applying it. I'm still trying to discover the problem it's solving.

Hope that helps,
Rusty.
Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.
Anil Veerabhadrappa wrote:
>> The sysfs bits related to the hba should use one of the scsi sysfs
>> facilities or if they are related to iscsi bits and are generic then
>> through the iscsi hba
>
> bnx2i needs 2 sysfs entries -
> 1. QP size info - this is used to size per connection shared data
>    structures to issue work requests to chip (login, scsi cmd, tmf,
>    nopin) and get completions from the chip (scsi completions, async
>    messages, etc'). This is a iSCSI HBA attribute
> 2. port mapper - we can be more flexible on classifying this as either
>    iSCSI HBA attribute or bnx2i driver global attribute
> Can hooks be added to iSCSI transport class to include these?

Which ones were they exactly? I think JamesB wanted only common transport values in the transport class. If it is driver specific then it should go on the host or target or device with the scsi_host_template attrs.

> It's a chicken & egg issue to put "port mapper" sysfs entry in scsi host
> attributes. Application won't see sysfs unless initiator creates an

Sorry for the late response. I was on vacation.

That is only with how you coded it today. I asked you to do something like qla4xxx where the session and host are not so closely bound.

> iSCSI session and driver can't create an iSCSI session without a tcp

That is not right with how things are today even. The iscsi_session struct can be created before the tcp connection. This was done because we thought we were going to have to use only sysfs for all setup and management (we ended up with netlink and sysfs though).

> port. I was wondering if there is a better way than using IOCTL in this
> situation?
Re: [XFRM]: Fix leak of expired xfrm_states
On Mon, Nov 26, 2007 at 05:52:15PM +0100, Patrick McHardy wrote:
>
> OK, here's a patch to use xfrm_state_put in __xfrm_state_delete().
> I've checked the other callers and it should be fine. lock ordering
> between x->lock and xfrm_state_gc_lock also doesn't seem to be an
> issue.

Patch applied. Thanks a lot Patrick!
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 1/1][ATM]: [he] initialize lock and tasklet earlier
On Tue, Nov 27, 2007 at 11:03:41AM +0800, Herbert Xu wrote:
> On Mon, Nov 26, 2007 at 04:33:31PM +, chas williams - CONTRACTOR wrote:
> > if you are lucky (unlucky?) enough to have shared interrupts, the
> > interrupt handler can be called before the tasklet and lock are ready
> > for use.
>
> Patch applied. Thanks Chas!

Oh please don't forget to add a Signed-off-by header next time. I've added it for you this time.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 1/1][ATM]: [he] initialize lock and tasklet earlier
On Mon, Nov 26, 2007 at 04:33:31PM +, chas williams - CONTRACTOR wrote:
> if you are lucky (unlucky?) enough to have shared interrupts, the
> interrupt handler can be called before the tasklet and lock are ready
> for use.

Patch applied. Thanks Chas!
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: DST_NOHASH flag and IPsec transformers routing tables - need some clarification
Ian Brown <[EMAIL PROTECTED]> wrote:
>
> NOHASH hints that we do not keep the
> an entry in a hash. I doubt that such dst_entries , which are created with
> IPsec and so has the DST_NOHASH flag set, are not kept in the routing cache?

Exactly, they're not in the routing cache (for IPv4 anyway, there is no cache at all for IPv6).

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: mm snapshot broken-out-2007-11-20-01-45 Build Fail - net/wireless driver
On Tue, Nov 20, 2007 at 06:53:15PM +0530, Kamalesh Babulal wrote:
> Hi Andrew,
>
> The kernel build fails, with following message
>
>   LD      drivers/net/wireless/built-in.o
> drivers/net/wireless/rtl8187.o: In function `rtl8225z2_rf_init':
> (.opd+0x180): multiple definition of `rtl8225z2_rf_init'
> drivers/net/wireless/rtl8180.o:(.opd+0x1b0): first defined here
> drivers/net/wireless/rtl8187.o: In function `rtl8225z2_rf_init':
> /root/linux-2.6.24-rc3/drivers/net/wireless/rtl8187_rtl8225.c:571: multiple definition of `.rtl8225z2_rf_init'
> drivers/net/wireless/rtl8180.o:/root/linux-2.6.24-rc3/drivers/net/wireless/rtl8180_rtl8225.c:561: first defined here
> ld: Warning: size of symbol `.rtl8225z2_rf_init' changed from 3836 in drivers/net/wireless/rtl8180.o to 3544 in drivers/net/wireless/rtl8187.o

The patch below is a little ugly but will allow allyesconfig to work. I don't know enough about the Realtek devices to make intelligent suggestions on how to fix this particular problem. Clearly the 2 drivers share a lot of common code so perhaps they can be merged?

I assumed that the RTL8180 is still somewhat WiP, based on the commit message for a2645795713c4374ff2efda960251cdc30b63430 (wireless-2.6.git).

Apologies for the uber long CC line, wasn't sure who can be pruned.

From: Tony Breeds <[EMAIL PROTECTED]>

Temporarily ensure that Realtek 8185 and 8187 aren't compiled together.

These two drivers share a number of common (global) functions. While RTL8180 is still being worked on, ensure that it's not built together with the RTL8187 (i.e. allyesconfig).

Signed-off-by: Tony Breeds <[EMAIL PROTECTED]>
---
 drivers/net/wireless/Kconfig |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
index 82e5de7..ab2eac0 100644
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -555,6 +555,7 @@ config USB_ZD1201
 config RTL8180
 	tristate "Realtek 8185 PCI support"
 	depends on MAC80211 && PCI && WLAN_80211 && EXPERIMENTAL
+	depends on !RTL8187
 	select EEPROM_93CX6
 config RTL8187

Yours Tony

  linux.conf.au    http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!
Re: [PATCH] ehea: Add kdump support
In message <[EMAIL PROTECTED]> you wrote:
> Michael Ellerman wrote on 26.11.2007 09:16:28:
> > Solutions that might be better:
> >
> > a) if there are a finite number of handles and we can predict their
> > values, just delete them all in the kdump kernel before the driver
> > loads.
>
> Guessing the values does not work, because of the handle structure
> defined by the hypervisor.
>
> > b) if there are a small & finite number of handles, save their values
> > in a device tree property and have the kdump kernel read them and
> > delete them before the driver loads.
>
> 5*16*nr_ports+1+1= >82. a ML16 has 4 adapters with up to 16 ports, so the
> number is not small anymore

I assume this machine with a huge number of adapters has a huge amount of memory too! :-)

> The device tree functions are currently not exported. We can add this.
> If you crashdump to a new kernel, will it get the device tree
> representation of the crashed kernel or of the initial one of open
> firmware?

The kexec tools userspace controls this. Normally it just takes the current device tree plus some modifications (e.g. initrd location changes). So provided the ehea driver exports this info somewhere, it can be grabbed by the kexec tools and stuffed in the device tree of the new kernel.

That being said, the proper place to have this would be the original device tree.

> > c) if neither of those work, provide a minimal routine that _only_
> > deletes the handles in the crashed kernel.
>
> I would hope this has the highest chance to actually work.
> For this we would have to add a proper notifier chain.
> Do you agree?
>
> > d)
>
> Firmware change? But that's not something you will get very soon.
>
> Christoph R.
[PATCH] SGISEEQ: use cached memory access to make driver work on IP28
Following patch is clearly 2.6.25 material and is needed to get SGI IP28 machines supported.

Thomas.

SGI IP28 machines would need special treatment (enable adding additional wait states) when accessing memory uncached. To avoid this pain I changed the driver to use only cached access to memory.

Signed-off-by: Thomas Bogendoerfer <[EMAIL PROTECTED]>
---
 drivers/net/sgiseeq.c |  239 ++---
 1 files changed, 166 insertions(+), 73 deletions(-)

diff --git a/drivers/net/sgiseeq.c b/drivers/net/sgiseeq.c
index ff40563..3145ca1 100644
--- a/drivers/net/sgiseeq.c
+++ b/drivers/net/sgiseeq.c
@@ -12,7 +12,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -53,14 +52,35 @@ static char *sgiseeqstr = "SGI Seeq8003";
 			    sp->tx_old + (SEEQ_TX_BUFFERS - 1) - sp->tx_new : \
 			    sp->tx_old - sp->tx_new - 1)
+#define VIRT_TO_DMA(sp, v) ((sp)->srings_dma + \
+			    (dma_addr_t)((unsigned long)(v) - \
+					 (unsigned long)((sp)->rx_desc)))
+
+#define DMA_SYNC_DESC_CPU(dev, addr) \
+	do { dma_cache_sync((dev)->dev.parent, (void *)addr, \
+	     sizeof(struct sgiseeq_rx_desc), DMA_FROM_DEVICE); } while (0)
+
+#define DMA_SYNC_DESC_DEV(dev, addr) \
+	do { dma_cache_sync((dev)->dev.parent, (void *)addr, \
+	     sizeof(struct sgiseeq_rx_desc), DMA_TO_DEVICE); } while (0)
+
+/* Copy frames shorter than rx_copybreak, otherwise pass on up in
+ * a full sized sk_buff. Value of 100 stolen from tulip.c (!alpha).
+ */
+static int rx_copybreak = 100;
+
+#define PAD_SIZE	(128 - sizeof(struct hpc_dma_desc) - sizeof(void *))
+
 struct sgiseeq_rx_desc {
 	volatile struct hpc_dma_desc rdma;
-	volatile signed int buf_vaddr;
+	u8 padding[PAD_SIZE];
+	struct sk_buff *skb;
 };
 struct sgiseeq_tx_desc {
 	volatile struct hpc_dma_desc tdma;
-	volatile signed int buf_vaddr;
+	u8 padding[PAD_SIZE];
+	struct sk_buff *skb;
 };
 /*
@@ -163,35 +183,55 @@ static int seeq_init_ring(struct net_device *dev)
 	/* Setup tx ring. */
 	for(i = 0; i < SEEQ_TX_BUFFERS; i++) {
-		if (!sp->tx_desc[i].tdma.pbuf) {
-			unsigned long buffer;
-
-			buffer = (unsigned long) kmalloc(PKT_BUF_SZ, GFP_KERNEL);
-			if (!buffer)
-				return -ENOMEM;
-			sp->tx_desc[i].buf_vaddr = CKSEG1ADDR(buffer);
-			sp->tx_desc[i].tdma.pbuf = CPHYSADDR(buffer);
-		}
 		sp->tx_desc[i].tdma.cntinfo = TCNTINFO_INIT;
+		DMA_SYNC_DESC_DEV(dev, &sp->tx_desc[i]);
 	}
 	/* And now the rx ring. */
 	for (i = 0; i < SEEQ_RX_BUFFERS; i++) {
 		if (!sp->rx_desc[i].rdma.pbuf) {
-			unsigned long buffer;
+			dma_addr_t dma_addr;
+			struct sk_buff *skb = netdev_alloc_skb(dev, PKT_BUF_SZ);
-			buffer = (unsigned long) kmalloc(PKT_BUF_SZ, GFP_KERNEL);
-			if (!buffer)
+			if (skb == NULL)
 				return -ENOMEM;
-			sp->rx_desc[i].buf_vaddr = CKSEG1ADDR(buffer);
-			sp->rx_desc[i].rdma.pbuf = CPHYSADDR(buffer);
+			skb_reserve(skb, 2);
+			dma_addr = dma_map_single(dev->dev.parent,
+						  skb->data - 2,
+						  PKT_BUF_SZ, DMA_FROM_DEVICE);
+			sp->rx_desc[i].skb = skb;
+			sp->rx_desc[i].rdma.pbuf = dma_addr;
 		}
 		sp->rx_desc[i].rdma.cntinfo = RCNTINFO_INIT;
+		DMA_SYNC_DESC_DEV(dev, &sp->rx_desc[i]);
 	}
 	sp->rx_desc[i - 1].rdma.cntinfo |= HPCDMA_EOR;
+	DMA_SYNC_DESC_DEV(dev, &sp->rx_desc[i - 1]);
 	return 0;
 }
+static void seeq_purge_ring(struct net_device *dev)
+{
+	struct sgiseeq_private *sp = netdev_priv(dev);
+	int i;
+
+	/* clear tx ring. */
+	for (i = 0; i < SEEQ_TX_BUFFERS; i++) {
+		if (sp->tx_desc[i].skb) {
+			dev_kfree_skb(sp->tx_desc[i].skb);
+			sp->tx_desc[i].skb = NULL;
+		}
+	}
+
+	/* And now the rx ring. */
+	for (i = 0; i < SEEQ_RX_BUFFERS; i++) {
+		if (sp->rx_desc[i].skb) {
+			dev_kfree_skb(sp->rx_desc[i].skb);
+			sp->rx_desc[i].skb = NULL;
+		}
+	}
+}
+
 #ifdef DEBUG
 static struct sgiseeq_private *gpriv;
 static struct net_device *gdev;
@@ -258,8 +298,8 @@ static int init_seeq(struct net_device *dev, struct sgiseeq_private *sp, sregs
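The descriptor layout and the VIRT_TO_DMA arithmetic in this patch can be tried out in userspace. The following is a sketch under stated assumptions, not driver code: the three-word hpc_dma_desc_sketch stand-in, the *_sketch names, and the example bus address are all hypothetical. PAD_SIZE pads every descriptor to exactly 128 bytes (presumably so that each one can be synced on its own; the patch does not state the reason), and the bus address of a descriptor is the ring's DMA base plus the descriptor's byte offset inside the ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct hpc_dma_desc; three 32-bit words is an assumption. */
struct hpc_dma_desc_sketch {
	uint32_t pbuf;
	uint32_t cntinfo;
	uint32_t pnext;
};

/* Same formula as the patch: fill the descriptor up to 128 bytes. */
#define PAD_SIZE (128 - sizeof(struct hpc_dma_desc_sketch) - sizeof(void *))

struct rx_desc_sketch {
	struct hpc_dma_desc_sketch rdma;
	uint8_t padding[PAD_SIZE];
	void *skb;                      /* struct sk_buff * in the driver */
};

/*
 * Same arithmetic as VIRT_TO_DMA: DMA base of the ring plus the offset
 * of this descriptor from the start of the ring.
 */
uint64_t virt_to_dma_sketch(uint64_t ring_dma_base,
			    const struct rx_desc_sketch *ring,
			    const struct rx_desc_sketch *desc)
{
	assert(desc >= ring);           /* descriptor must lie inside the ring */
	return ring_dma_base + (uint64_t)((uintptr_t)desc - (uintptr_t)ring);
}
```

With a ring of such descriptors and a made-up base of 0x1000, entry 2 maps to 0x1000 + 2 * 128.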
Re: [PATCH 31/59] drivers/net/ixgb: Add missing "space"
Joe Perches wrote:
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> ---
>  drivers/net/ixgbe/ixgbe_common.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_common.c b/drivers/net/ixgbe/ixgbe_common.c
> index 512e3b2..b7e50bc 100644
> --- a/drivers/net/ixgbe/ixgbe_common.c
> +++ b/drivers/net/ixgbe/ixgbe_common.c
> @@ -950,7 +950,7 @@ s32 ixgbe_setup_fc(struct ixgbe_hw *hw, s32 packetbuf_num)
>  	u32 rmcs_reg;
>
>  	if (packetbuf_num < 0 || packetbuf_num > 7)
> -		hw_dbg(hw, "Invalid packet buffer number [%d], expected range"
> +		hw_dbg(hw, "Invalid packet buffer number [%d], expected range "
>  		       "is 0-7\n", packetbuf_num);
>
>  	frctl_reg = IXGBE_READ_REG(hw, IXGBE_FCTRL);

Jeff, please apply in case you didn't do so yet.

Acked-by: Auke Kok <[EMAIL PROTECTED]>

Auke
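The reason for the one-character fix: C joins adjacent string literals with nothing in between, so a format string split across two lines needs its own trailing space. A minimal userspace sketch follows; the function names are hypothetical and only the tail of the message is reproduced.

```c
#include <assert.h>
#include <string.h>

/* Before the patch: the two literals are glued together as "rangeis". */
const char *msg_without_space(void)
{
	return "expected range"
	       "is 0-7\n";
}

/* After the patch: the trailing space keeps the words apart. */
const char *msg_with_space(void)
{
	return "expected range "
	       "is 0-7\n";
}

void check_concatenation(void)
{
	assert(strcmp(msg_without_space(), "expected rangeis 0-7\n") == 0);
	assert(strcmp(msg_with_space(), "expected range is 0-7\n") == 0);
}
```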
Re: [PATCH] e1000: Fix for 32 bits platforms with 64 bits resources
Jeff Garzik wrote:
> Benjamin Herrenschmidt wrote:
>> The e1000 driver stores the content of the PCI resources into
>> unsigned long's before ioremapping. This breaks on 32 bits
>> platforms that support 64 bits MMIO resources such as ppc 44x.
>>
>> This fixes it by removing those temporary variables and passing
>> directly the result of pci_resource_start/len to ioremap.
>>
>> The side effect is that I removed the assignments to the netdev
>> fields mem_start, mem_end and base_addr, which are totally useless
>> for PCI devices.
>>
>> Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
>> --
>>
>>  drivers/net/e1000/e1000_main.c | 18 +-
>>  1 file changed, 5 insertions(+), 13 deletions(-)
>
> Looks good to me. auke?

yes, please apply.

Acked-by: Auke Kok <[EMAIL PROTECTED]>

Auke
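The truncation described above can be reproduced in userspace by modelling a 32-bit unsigned long with uint32_t. This is a sketch only: the function names and the example address 0x1fe000000 are made up for illustration and do not come from the e1000 driver.

```c
#include <assert.h>
#include <stdint.h>

/* Models storing a 64-bit resource address in a 32-bit "unsigned long". */
uint32_t store_in_ulong32(uint64_t resource_start)
{
	return (uint32_t)resource_start;   /* upper 32 bits are dropped */
}

/* Returns 1 if the address survives the round trip unchanged. */
int survives_ulong32(uint64_t resource_start)
{
	return (uint64_t)store_in_ulong32(resource_start) == resource_start;
}

void check_truncation(void)
{
	/* A resource below 4 GiB is unaffected. */
	assert(survives_ulong32(0xfe000000ULL) == 1);
	/* A resource above 4 GiB is silently mapped to the wrong address. */
	assert(survives_ulong32(0x1fe000000ULL) == 0);
	assert(store_in_ulong32(0x1fe000000ULL) == 0xfe000000U);
}
```

Passing the 64-bit value straight to the consumer, as the patch does with ioremap, avoids the intermediate narrow variable altogether.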
Re: [PATCH 12/21] r8169: confusion between hardware and IP header alignment
Martin Michlmayr <[EMAIL PROTECTED]> :
> * Francois Romieu <[EMAIL PROTECTED]> [2007-11-26 00:05]:
> > > I'd like to backport the fix to the 2.6.18 kernel that is in our
> > > stable release and have a couple of questions:
> > >  - Does your later patch "align the IP header when there is no DMA
> > >    constraint" fix any bugs or is it merely an improvement?
> > It fixes a "it was faster before" problem.
>
> Before the patch I'm interested in backporting, right? In that case,
> the patch I suggested would fix a bug but also introduce a performance
> regression. So maybe the later patch should also be backported. What
> do you think?

The regression was due to cc9f022d97d08e4e36d38661857991fe91447d68.

cc9f022d.. does not appear to be in v2.6.18.8.

[...]
> > >  - Should I change "align" to 8 for RTL_CFG_1, as it's done in
> > >    current kernels?
> > No. RTL_CFG_1 is for the 8168 (slightly different beast).
>
> Yes, but might 8168 users face similar problems as the one I saw?

Clearly.

There is a pentachiée* of patches to handle correctly the different flavors of 8168 devices in the recent kernels (not speaking for the genuine bugs). Until someone analyzes the log of the r8169, extracts the relevant pieces and tests them with 2.6.18.x, it will be hard to claim to support the 8168 with this kernel.

[*] in French in the original.

>
> > If you have 6dccd16b7c2703e8bbf8bca62b5cf248332afbe2 applied, you
> > want c946b3047205d7e107be16885bbb42ab9f10350a too.
> ...
> > If you move from the 8110SB to the 8110SC, you will probably want to apply
> > 65d916d95314566f426cc40ff0f17b754a773b0b
>
> I don't have 6dccd16b7c2703e8bbf8bca62b5cf248332afbe2 and it still
> says 8110SB, so I won't need these patches for my device. However,
> these patches look like candidates to put into 2.6.18 anyway. I'd
> like to avoid backporting too many changes from 2.6.24 to 2.6.18, but
> are there any fixes we should absolutely have?

Leaving aside the "align" fixes, the short list below contains some candidates in reverse order:

315917d23fdd20a0f4ff99b9228de5840d9d276c
9cb427b6ff0b3e235c518acf5c1fcbbfc95f0ae2
d03902b8864d7814c938f67befade5a3bba68708 | you should already have those
a27993f3d9daca0dffa26577a83822db99c952e2 |
eb2a021c4710b98081daa797d5a729ac23c240cd
2efa53f373ed811d4860904f5205b8a3b376e253
99f252b097a3bd6280047ba2175b605671da4a23
1371fa6db0bbb8e23f988a641f5ae7361bc629dd

It's gross though: there are 99 changes from v2.6.18.8 to current master for the r8169 driver and some registers init changes may have been partially reverted later.

Sorry if it is not terribly specific but I really, really, really need to spend more time on several bugs first.

[...]
> With 2.6.24-rc2-g8c086340-dirty:
> r8169 Gigabit Ethernet driver 2.2LK loaded
> eth0: RTL8169sb/8110sb at 0xe085c200, 00:14:fd:10:33:8e, XID 1000 IRQ 27
> r8169 Gigabit Ethernet driver 2.2LK loaded
> eth1: RTL8169sb/8110sb at 0xe085e300, 00:14:fd:10:33:8f, XID 1000 IRQ 30

Nice. Thanks.

-- 
Ueimor
Re: Does tc-prio really work as advertised?
Joerg Pommnitz wrote, On 11/23/2007 03:47 PM:
> Hello all,
> I might make a fool out of me, but I think the prio qdisc doesn't work as
> advertised in any document I could lay my hands on.

Marketing?

>
> My problem was that the link quality reported by the olsr.org olsrd degraded
> depending on the amount of payload traffic transferred through an adhoc/mesh
> interface. The LQ is calculated from the packet loss of LQ Hello packets sent
> through this interface. To make sure normal traffic does not interfere with
> this value, olsrd sets the TOS field to 0x10 (Minimize-Delay) by default.
> This should give olsr traffic the highest priority on the link.
>
> Investigating this issue I replaced the default Pfifo_fast with a prio qdisc
> and attached a pfifo on each of the bands:
>
> INTERFACE=wifi0
> tc qdisc add dev $INTERFACE root handle 1: prio
> tc qdisc add dev $INTERFACE parent 1:1 handle 10: pfifo
> tc qdisc add dev $INTERFACE parent 1:2 handle 20: pfifo
> tc qdisc add dev $INTERFACE parent 1:3 handle 30: pfifo
>
> Then I used ping -Q TOSVALUE to send packets with different TOS values through
> the interface. tcpdump confirmed the correct TOS values in the outgoing
> packets.
>
> With "tc -s qdisc ls dev wifi0" I could observe the effects of the different
> TOS values. The result: no effect at all! Every single packet used the band
> indicated by the first value in the priomap (e.g. band 1 by default, in my
> case the pfifo with handle 20:). I can't square this observation with the
> available documentation.
>
> Looking at the source code, it seems that sched_prio uses the skb->priority
> value to select the outgoing band. According to some documentation I found,
> an application can set this value.
>
> Now I'm at a loss. I can work around this problem with filters, but I don't
> think that this is the correct solution. Any suggestions?
>

Are you doing this on the same box? I was tracing this long time ago too, and, if I didn't miss something, it was about the place! So, as I recall (after finding some old message) this TOS is considered only for packets going through the FORWARD chain. (But, I haven't checked this at all now, so "no complaints"...)

Regards,
Jarek P.
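Joerg's observation is consistent with skb->priority staying at 0 for locally generated packets. A simplified sketch of the band lookup, assuming the default priomap documented for the prio qdisc and no filters attached; prio_band() and check_bands() are hypothetical names, and the real classifier does more than this:

```c
#include <assert.h>

#define TC_PRIO_MAX 15

/* Default priomap as documented for the prio qdisc. */
static const unsigned char default_priomap[TC_PRIO_MAX + 1] = {
	1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1
};

/* Simplified band selection: only skb->priority is consulted. */
unsigned int prio_band(unsigned int skb_priority)
{
	return default_priomap[skb_priority & TC_PRIO_MAX];
}

void check_bands(void)
{
	/* Priority never set (0): first priomap entry, i.e. band 1. */
	assert(prio_band(0) == 1);
	/* Priority 6 (the "interactive" class) would reach band 0. */
	assert(prio_band(6) == 0);
	/* Priority 2 (bulk) ends up in the last band. */
	assert(prio_band(2) == 2);
}
```

In the setup quoted above, band 1 corresponds to class 1:2 and therefore to the pfifo with handle 20:, which is where every packet was seen.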
[PATCH 1/3][RESEND] phylib: add PHY interface modes for internal delay for tx and rx only
Allow phylib specification of cases where hardware needs to configure
PHYs for Internal Delay only on either RX or TX (not both).

Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
Tested-by: Anton Vorontsov <[EMAIL PROTECTED]>
Acked-by: Li Yang <[EMAIL PROTECTED]>
---
 include/linux/phy.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index f0742b6..e10763d 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -58,6 +58,8 @@ typedef enum {
 	PHY_INTERFACE_MODE_RMII,
 	PHY_INTERFACE_MODE_RGMII,
 	PHY_INTERFACE_MODE_RGMII_ID,
+	PHY_INTERFACE_MODE_RGMII_RXID,
+	PHY_INTERFACE_MODE_RGMII_TXID,
 	PHY_INTERFACE_MODE_RTBI
 } phy_interface_t;
-- 
1.5.2.2
[PATCH 3/3][RESEND] ucc_geth: handle passing of RX-only and TX-only internal delay PHY connection type parameters
Extend the RGMII-Internal Delay specification case to include TX-only and
RX-only variants.

Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
Tested-by: Anton Vorontsov <[EMAIL PROTECTED]>
Acked-by: Li Yang <[EMAIL PROTECTED]>
---
 drivers/net/ucc_geth.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index a3ff270..7f68990 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -1460,6 +1460,8 @@ static int adjust_enet_interface(struct ucc_geth_private *ugeth)
 	if ((ugeth->phy_interface == PHY_INTERFACE_MODE_RMII) ||
 	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII) ||
 	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII_ID) ||
+	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID) ||
+	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII_TXID) ||
 	    (ugeth->phy_interface == PHY_INTERFACE_MODE_RTBI)) {
 		upsmr |= UPSMR_RPM;
 		switch (ugeth->max_speed) {
@@ -1557,6 +1559,8 @@ static void adjust_link(struct net_device *dev)
 			if ((ugeth->phy_interface == PHY_INTERFACE_MODE_RMII) ||
 			    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII) ||
 			    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII_ID) ||
+			    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID) ||
+			    (ugeth->phy_interface == PHY_INTERFACE_MODE_RGMII_TXID) ||
 			    (ugeth->phy_interface == PHY_INTERFACE_MODE_RTBI)) {
 				if (phydev->speed == SPEED_10)
 					upsmr |= UPSMR_R10M;
@@ -3795,6 +3799,10 @@ static phy_interface_t to_phy_interface(const char *phy_connection_type)
 		return PHY_INTERFACE_MODE_RGMII;
 	if (strcasecmp(phy_connection_type, "rgmii-id") == 0)
 		return PHY_INTERFACE_MODE_RGMII_ID;
+	if (strcasecmp(phy_connection_type, "rgmii-txid") == 0)
+		return PHY_INTERFACE_MODE_RGMII_TXID;
+	if (strcasecmp(phy_connection_type, "rgmii-rxid") == 0)
+		return PHY_INTERFACE_MODE_RGMII_RXID;
 	if (strcasecmp(phy_connection_type, "rtbi") == 0)
 		return PHY_INTERFACE_MODE_RTBI;
 
@@ -3889,6 +3897,8 @@ static int ucc_geth_probe(struct of_device* ofdev, const struct of_device_id *ma
 	case PHY_INTERFACE_MODE_GMII:
 	case PHY_INTERFACE_MODE_RGMII:
 	case PHY_INTERFACE_MODE_RGMII_ID:
+	case PHY_INTERFACE_MODE_RGMII_RXID:
+	case PHY_INTERFACE_MODE_RGMII_TXID:
 	case PHY_INTERFACE_MODE_TBI:
 	case PHY_INTERFACE_MODE_RTBI:
 		max_speed = SPEED_1000;
-- 
1.5.2.2
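As an aside, the string-to-mode mapping that to_phy_interface() performs with a chain of strcasecmp() calls could also be written as a table lookup. This is only a sketch of that alternative, not what the driver does; the enum, the table and lookup_phy_mode() are hypothetical names, and the fallback value is an assumption.

```c
#include <stddef.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Hypothetical stand-in for a subset of phy_interface_t. */
enum phy_mode {
	MODE_UNKNOWN,
	MODE_RGMII,
	MODE_RGMII_ID,
	MODE_RGMII_RXID,
	MODE_RGMII_TXID,
	MODE_RTBI,
};

static const struct {
	const char *name;
	enum phy_mode mode;
} phy_modes[] = {
	{ "rgmii",      MODE_RGMII },
	{ "rgmii-id",   MODE_RGMII_ID },
	{ "rgmii-rxid", MODE_RGMII_RXID },
	{ "rgmii-txid", MODE_RGMII_TXID },
	{ "rtbi",       MODE_RTBI },
};

/* Case-insensitive lookup of a phy-connection-type string. */
enum phy_mode lookup_phy_mode(const char *type)
{
	size_t i;

	for (i = 0; i < sizeof(phy_modes) / sizeof(phy_modes[0]); i++)
		if (strcasecmp(type, phy_modes[i].name) == 0)
			return phy_modes[i].mode;

	return MODE_UNKNOWN;	/* assumed fallback, for illustration only */
}
```

Adding a new variant then means adding one table row instead of another if/return pair.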
[PATCH 2.6.24-rc3] Fix /proc/net breakage
Pavel Emelyanov <[EMAIL PROTECTED]> writes: > Rafael J. Wysocki wrote: >> On Monday, 19 of November 2007, Pavel Machek wrote: >>> Hi! >>> >>> I think that this worked before: >>> >>> [EMAIL PROTECTED]:/proc# find . -name "timer_info" >>> find: WARNING: Hard link count is wrong for ./net: this may be a bug >>> in your filesystem driver. Automatically turning on find's -noleaf >>> option. Earlier results may have failed to include directories that >>> should have been searched. >>> [EMAIL PROTECTED]:/proc# >> >> I'm seeing that too. > > I have a better things with 2.6.24-rc3 ;) > > # cd /proc/net > # ls .. > ls: reading directory ..: Not a directory > > and this > > # cd /proc > # find > ... > ./net > find: . changed during execution of find > # find net > find: net changed during execution of find > # find net/ > > > Moreover. Program that opens /proc/net and dumps the /proc/self/fd > files produces the following: > > # cd / > # a.out /proc/net > ... > lr-x-- 1 root root 64 Nov 20 18:02 3 -> /proc/net/net (deleted) > ... > # cd /proc/net > # a.out . > ... > lr-x-- 1 root root 64 Nov 20 18:03 3 -> /proc/net/net (deleted) > ... > # a.out .. > ... > lr-x-- 1 root root 64 Nov 20 18:03 3 -> /proc/net > ... Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify ->lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. 
As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]> --- fs/proc/generic.c | 12 ++- fs/proc/proc_net.c | 86 +++ include/linux/proc_fs.h |3 ++ 3 files changed, 19 insertions(+), 82 deletions(-) diff --git a/fs/proc/generic.c b/fs/proc/generic.c index a9806bc..c2b7523 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -374,9 +374,16 @@ static int proc_delete_dentry(struct dentry * dentry) return 1; } +static int proc_revalidate_dentry(struct dentry *dentry, struct nameidata *nd) +{ + d_drop(dentry); + return 0; +} + static struct dentry_operations proc_dentry_operations = { .d_delete = proc_delete_dentry, + .d_revalidate = proc_revalidate_dentry, }; /* @@ -397,8 +404,11 @@ struct dentry *proc_lookup(struct inode * dir, struct dentry *dentry, struct nam if (de->namelen != dentry->d_name.len) continue; if (!memcmp(dentry->d_name.name, de->name, de->namelen)) { - unsigned int ino = de->low_ino; + unsigned int ino; + if (de->shadow_proc) + de = de->shadow_proc(current, de); + ino = de->low_ino; de_get(de); spin_unlock(&proc_subdir_lock); error = -EINVAL; diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c index 131f9c6..0afe21e 100644 --- a/fs/proc/proc_net.c +++ b/fs/proc/proc_net.c @@ -50,89 +50,14 @@ struct net *get_proc_net(const struct inode *inode) } EXPORT_SYMBOL_GPL(get_proc_net); -static struct proc_dir_entry *proc_net_shadow; +static struct proc_dir_entry *shadow_pde; -static struct dentry *proc_net_shadow_dentry(struct dentry *parent, +static struct proc_dir_entry *proc_net_shadow(struct task_struct *task, struct proc_dir_entry *de) { - struct dentry *shadow = NULL; - struct inode *inode; - if (!de) - goto out; - de_get(de); - inode = proc_get_inode(parent->d_inode->i_sb, de->low_ino, de); - if (!inode) - goto out_de_put; - shadow = d_alloc_name(parent, de->name); - if 
(!shadow) - goto out_iput; - shadow->d_op = parent->d_op; /* proc_dentry_operations */ - d_instantiate(shadow, inode); -out: - return shadow; -out_iput: - iput(inode); -out_de_put: - de_put(de); - goto out; -} - -static void *proc_net_follow_link(struct dentry *parent, struct nameidata *nd) -{ - struct net *net = current->nsproxy->net_ns; - struct dentry *shadow; - shado
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
 > Agreed. On first glance, I was intrigued but:
 >
 > 1) Why is everyone so concerned that export symbol space is large?
 >    - does it cost cpu or running memory?
 >    - does it cause bugs?
 >    - or are you just worried about "evil modules"?
 >
 > 2) These aren't real namespaces
 >    - all global names still have to be unique
 >    - still have to handle the "non-modular build" namespace conflicts
 >    - there isn't a big problem with conflicting symbols today.

Perhaps changing the name from "namespace" to "interface" would help?
Then a module could have something like

	MODULE_USE_INTERFACE(foo);

and I think that makes it clearer what the advantage of this is: it
marks symbols as being part of a certain interface, requires modules
that use that interface to declare that use explicitly, and allows
reviewers to say "Hey why is this code using the scsi interface when
it's a webcam driver?"

 - R.
[PATCH 0/3][RESEND] fixups for mpc8360 rev. 2.1 erratum #2 (RGMII Timing)
These 3 patches are a resend of patches 2-4 (out of 5) that were originally
sent 2007-11-05* (patches 1 and 5 were picked up by Kumar to go through
powerpc).

Jeff, Leo has acked these; please consider them for 2.6.24.

Thanks,

Kim

* http://marc.info/?l=linux-netdev&m=119428688804765&w=1
[PATCH 2/3][RESEND] phylib: marvell: add support for TX-only and RX-only Internal Delay
Previously, Internal Delay specification implied the delay be applied to
both TX and RX. This patch allows for separate TX/RX-only internal delay
specification.

Signed-off-by: Kim Phillips <[EMAIL PROTECTED]>
Tested-by: Anton Vorontsov <[EMAIL PROTECTED]>
Acked-by: Li Yang <[EMAIL PROTECTED]>
---
 drivers/net/phy/marvell.c |   26 +++++++++++++++++---------
 1 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 035fd41..f057407 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -143,21 +143,29 @@ static int m88e1111_config_init(struct phy_device *phydev)
 	int err;
 
 	if ((phydev->interface == PHY_INTERFACE_MODE_RGMII) ||
-	    (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)) {
+	    (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) ||
+	    (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID) ||
+	    (phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)) {
 		int temp;
 
-		if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) {
-			temp = phy_read(phydev, MII_M1111_PHY_EXT_CR);
-			if (temp < 0)
-				return temp;
+		temp = phy_read(phydev, MII_M1111_PHY_EXT_CR);
+		if (temp < 0)
+			return temp;
 
+		if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID) {
 			temp |= (MII_M1111_RX_DELAY | MII_M1111_TX_DELAY);
-
-			err = phy_write(phydev, MII_M1111_PHY_EXT_CR, temp);
-			if (err < 0)
-				return err;
+		} else if (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID) {
+			temp &= ~MII_M1111_TX_DELAY;
+			temp |= MII_M1111_RX_DELAY;
+		} else if (phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID) {
+			temp &= ~MII_M1111_RX_DELAY;
+			temp |= MII_M1111_TX_DELAY;
 		}
 
+		err = phy_write(phydev, MII_M1111_PHY_EXT_CR, temp);
+		if (err < 0)
+			return err;
+
 		temp = phy_read(phydev, MII_M1111_PHY_EXT_SR);
 		if (temp < 0)
 			return temp;
-- 
1.5.2.2
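The bit manipulation in this patch can be tried out in isolation. Below is a minimal userspace sketch, assuming the delay bits sit at 0x80 (RX) and 0x02 (TX) in the extended control register; the enum, the macro names and apply_rgmii_delay() are made up for illustration and are not phylib API.

```c
#include <stdint.h>

#define RX_DELAY 0x80u	/* assumed bit position, for illustration */
#define TX_DELAY 0x02u	/* assumed bit position, for illustration */

/* Hypothetical stand-in for the four RGMII interface modes. */
enum rgmii_mode { RGMII, RGMII_ID, RGMII_RXID, RGMII_TXID };

/* Return the register value to write back for the requested mode. */
uint16_t apply_rgmii_delay(uint16_t ext_cr, enum rgmii_mode mode)
{
	switch (mode) {
	case RGMII_ID:		/* delay on both directions */
		ext_cr |= RX_DELAY | TX_DELAY;
		break;
	case RGMII_RXID:	/* RX delay only: clear TX, set RX */
		ext_cr &= ~TX_DELAY;
		ext_cr |= RX_DELAY;
		break;
	case RGMII_TXID:	/* TX delay only: clear RX, set TX */
		ext_cr &= ~RX_DELAY;
		ext_cr |= TX_DELAY;
		break;
	case RGMII:		/* delay bits left as found, as in the patch */
		break;
	}
	return ext_cr;
}
```

Note that the RX-only and TX-only cases explicitly clear the other bit, so a value left over from the bootloader or a previous configuration does not survive.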
Re: [2.6 patch] ipv4/arp.c:arp_process(): remove bogus #ifdef mess
On Mon, Nov 26, 2007 at 11:19:26PM +0800, Herbert Xu wrote:
> On Sun, Nov 25, 2007 at 04:30:03PM +0000, Adrian Bunk wrote:
> > > >
> > > > Please look at net/ipv4/arp.c:arp_process()
> > > >
> > > > Am I right that CONFIG_NET_ETHERNET=n and CONFIG_NETDEV_1000=y or
> > > > CONFIG_NETDEV_10000=y will not be handled correctly there?
> > > >
> > > > And the best solution is to nuke all #ifdef's in this function and make
> > > > the code unconditionally available?
> > >
> > > I think removing those specific ifdefs in arp_process()
> > > is the best option, yes.
> >
> > Patch below.
>
> Thanks Adrian. Patch applied to net-2.6.
>
> Do we need this for stable too?

Unless I'm misunderstanding the code, we currently wrongly ignore some ARP
packets based on the setting of an unrelated option, so it seems to be a
-stable candidate once it's in Linus' tree.

> Cheers,

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
[PATCH 5/6] skge: increase TX threshold for Jumbo
Need to increase the TX threshold when doing jumbo frames on a dual-port
board to avoid underruns. (Code from sk98lin.)

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

--- a/drivers/net/skge.c	2007-11-21 14:05:59.000000000 -0800
+++ b/drivers/net/skge.c	2007-11-21 14:06:15.000000000 -0800
@@ -1633,15 +1633,14 @@ static void genesis_mac_init(struct skge
 	}
 	xm_write16(hw, port, XM_RX_CMD, r);
 
-
 	/* We want short frames padded to 60 bytes. */
 	xm_write16(hw, port, XM_TX_CMD, XM_TX_AUTO_PAD);
 
-	/*
-	 * Bump up the transmit threshold. This helps hold off transmit
-	 * underruns when we're blasting traffic from both ports at once.
-	 */
-	xm_write16(hw, port, XM_TX_THR, 512);
+	/* Increase threshold for jumbo frames on dual port */
+	if (hw->ports > 1 && jumbo)
+		xm_write16(hw, port, XM_TX_THR, 1020);
+	else
+		xm_write16(hw, port, XM_TX_THR, 512);
 
 	/*
 	 * Enable the reception of all error frames. This is is

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH 3/6] skge: retry on MAC shutdown
Make sure to retry when shutting down the MAC. This code is copied from
the sk98lin driver.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

--- a/drivers/net/skge.c	2007-11-21 14:40:58.000000000 -0800
+++ b/drivers/net/skge.c	2007-11-21 14:44:02.000000000 -0800
@@ -1713,7 +1713,7 @@ static void genesis_stop(struct skge_por
 {
 	struct skge_hw *hw = skge->hw;
 	int port = skge->port;
-	u32 reg;
+	unsigned retries = 1000;
 
 	genesis_reset(hw, port);
 
@@ -1721,20 +1721,17 @@ static void genesis_stop(struct skge_por
 	skge_write16(hw, B3_PA_CTRL,
 		     port == 0 ? PA_CLR_TO_TX1 : PA_CLR_TO_TX2);
 
-	/*
-	 * If the transfer sticks at the MAC the STOP command will not
-	 * terminate if we don't flush the XMAC's transmit FIFO !
-	 */
-	xm_write32(hw, port, XM_MODE,
-			xm_read32(hw, port, XM_MODE)|XM_MD_FTF);
-
-
 	/* Reset the MAC */
-	skge_write16(hw, SK_REG(port, TX_MFF_CTRL1), MFF_SET_MAC_RST);
+	skge_write16(hw, SK_REG(port, TX_MFF_CTRL1), MFF_CLR_MAC_RST);
+	do {
+		skge_write16(hw, SK_REG(port, TX_MFF_CTRL1), MFF_SET_MAC_RST);
+		if (!(skge_read16(hw, SK_REG(port, TX_MFF_CTRL1)) & MFF_SET_MAC_RST))
+			break;
+	} while (--retries > 0);
 
 	/* For external PHYs there must be special handling */
 	if (hw->phy_type != SK_PHY_XMAC) {
-		reg = skge_read32(hw, B2_GP_IO);
+		u32 reg = skge_read32(hw, B2_GP_IO);
 		if (port == 0) {
 			reg |= GP_DIR_0;
 			reg &= ~GP_IO_0;

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
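The loop added here follows a common pattern: issue a command, read back the status, and give up after a fixed number of attempts. A generic sketch of that pattern against a fake device follows; struct fake_dev, its fields and issue_with_retry() are invented for illustration and say nothing about how the real XMAC registers behave.

```c
#include <stdbool.h>

/* Fake device: the command only "sticks" after writes_needed attempts. */
struct fake_dev {
	unsigned writes_needed;
	unsigned writes_seen;
};

static void dev_write_cmd(struct fake_dev *d)
{
	d->writes_seen++;
}

static bool dev_cmd_done(const struct fake_dev *d)
{
	return d->writes_seen >= d->writes_needed;
}

/*
 * Re-issue the command until the readback reports completion or the
 * retry budget is used up. The caller must pass retries >= 1.
 * Returns true if the command took effect.
 */
bool issue_with_retry(struct fake_dev *d, unsigned retries)
{
	do {
		dev_write_cmd(d);
		if (dev_cmd_done(d))
			return true;
	} while (--retries > 0);

	return false;
}
```

Unlike the hunk above, the sketch reports whether the budget ran out, which a caller could use to log a warning.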
[PATCH 0/6] skge update (for 2.6.24)
This resolves the skge problems that show up on dual ported and fiber
attached boards. These fixes have been validated by the users that
reported regressions with earlier patches.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH 4/6] skge: fiber link up/down fix
The driver would not work over fibre if the other end went down and then
came back up (it would require reloading the driver). The correct approach
is to manage the link the same way for both TP and fibre.

Resolves the problem described in:
	http://lkml.org/lkml/2007/11/6/395

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

--- a/drivers/net/skge.c	2007-11-21 14:44:02.000000000 -0800
+++ b/drivers/net/skge.c	2007-11-21 14:46:34.000000000 -0800
@@ -1095,16 +1095,9 @@ static void xm_link_down(struct skge_hw
 {
 	struct net_device *dev = hw->dev[port];
 	struct skge_port *skge = netdev_priv(dev);
-	u16 cmd = xm_read16(hw, port, XM_MMU_CMD);
 
 	xm_write16(hw, port, XM_IMSK, XM_IMSK_DISABLE);
 
-	cmd &= ~(XM_MMU_ENA_RX | XM_MMU_ENA_TX);
-	xm_write16(hw, port, XM_MMU_CMD, cmd);
-
-	/* dummy read to ensure writing */
-	xm_read16(hw, port, XM_MMU_CMD);
-
 	if (netif_carrier_ok(dev))
 		skge_link_down(skge);
 }
@@ -1194,6 +1187,7 @@ static void genesis_init(struct skge_hw
 static void genesis_reset(struct skge_hw *hw, int port)
 {
 	const u8 zero[8]  = { 0 };
+	u32 reg;
 
 	skge_write8(hw, SK_REG(port, GMAC_IRQ_MSK), 0);
 
@@ -1209,6 +1203,11 @@ static void genesis_reset(struct skge_hw
 		xm_write16(hw, port, PHY_BCOM_INT_MASK, 0xffff);
 
 	xm_outhash(hw, port, XM_HSM, zero);
+
+	/* Flush TX and RX fifo */
+	reg = xm_read32(hw, port, XM_MODE);
+	xm_write32(hw, port, XM_MODE, reg | XM_MD_FTF);
+	xm_write32(hw, port, XM_MODE, reg | XM_MD_FRF);
 }
 
 
@@ -1714,6 +1713,12 @@ static void genesis_stop(struct skge_por
 	struct skge_hw *hw = skge->hw;
 	int port = skge->port;
 	unsigned retries = 1000;
+	u16 cmd;
+
+	/* Disable Tx and Rx */
+	cmd = xm_read16(hw, port, XM_MMU_CMD);
+	cmd &= ~(XM_MMU_ENA_RX | XM_MMU_ENA_TX);
+	xm_write16(hw, port, XM_MMU_CMD, cmd);
 
 	genesis_reset(hw, port);
 

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH 1/6] skge: FIFO Ram calculation error
The calculation of usable FIFO RAM is wrong in the skge driver.
First, it doesn't take into account the reserved area on the original
SysKonnect Genesis boards. Second, it has an off-by-one error because
hw->ports is either 1 or 2.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

--- a/drivers/net/skge.c	2007-11-21 09:50:08.000000000 -0800
+++ b/drivers/net/skge.c	2007-11-21 12:18:59.000000000 -0800
@@ -2619,8 +2619,8 @@ static int skge_up(struct net_device *de
 		yukon_mac_init(hw, port);
 	spin_unlock_bh(&hw->phy_lock);
 
-	/* Configure RAMbuffers */
-	chunk = hw->ram_size / ((hw->ports + 1)*2);
+	/* Configure RAMbuffers - equally between ports and tx/rx */
+	chunk = (hw->ram_size - hw->ram_offset) / (hw->ports * 2);
 	ram_addr = hw->ram_offset + 2 * chunk * port;
 
 	skge_ramset(hw, rxqaddr[port], ram_addr, chunk);

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
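To see the effect of the two formulas side by side, here is a small sketch with the arithmetic pulled out into plain functions. The function names and the sample numbers in the note below are made up; only the two expressions come from the patch.

```c
/* Old formula: ignores the reserved area and divides by (ports + 1) * 2. */
unsigned old_chunk(unsigned ram_size, unsigned ports)
{
	return ram_size / ((ports + 1) * 2);
}

/*
 * New formula: usable RAM (total minus reserved offset) split equally
 * between the ports and between rx and tx on each port.
 */
unsigned new_chunk(unsigned ram_size, unsigned ram_offset, unsigned ports)
{
	return (ram_size - ram_offset) / (ports * 2);
}
```

With ram_size = 1024, no reserved area and two ports, the old formula gives 170 per queue (four queues use 680, leaving about a third unused), while the new one gives 256. With a reserved area of 256 and one port, the new formula gives 384 per queue.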
[PATCH 2/6] skge: receive flush logic
A receive FIFO overrun is not a catastrophic condition, so don't flush the
FIFO when it happens.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

--- a/drivers/net/skge.c	2007-11-21 14:36:31.000000000 -0800
+++ b/drivers/net/skge.c	2007-11-21 14:40:58.000000000 -0800
@@ -1801,11 +1801,6 @@ static void genesis_mac_intr(struct skge
 		xm_write32(hw, port, XM_MODE, XM_MD_FTF);
 		++dev->stats.tx_fifo_errors;
 	}
-
-	if (status & XM_IS_RXF_OV) {
-		xm_write32(hw, port, XM_MODE, XM_MD_FRF);
-		++dev->stats.rx_fifo_errors;
-	}
 }
 
 static void genesis_link_up(struct skge_port *skge)
@@ -1862,9 +1857,9 @@ static void genesis_link_up(struct skge_
 
 	xm_write32(hw, port, XM_MODE, mode);
 
-	/* Turn on detection of Tx underrun, Rx overrun */
+	/* Turn on detection of Tx underrun */
 	msk = xm_read16(hw, port, XM_IMSK);
-	msk &= ~(XM_IS_RXF_OV | XM_IS_TXF_UR);
+	msk &= ~XM_IS_TXF_UR;
 	xm_write16(hw, port, XM_IMSK, msk);
 
 	xm_read16(hw, port, XM_ISRC);

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH 6/6] skge version 1.13
Version for 2.6.24

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

--- a/drivers/net/skge.c	2007-11-21 14:07:37.000000000 -0800
+++ b/drivers/net/skge.c	2007-11-21 14:08:35.000000000 -0800
@@ -44,7 +44,7 @@
 #include "skge.h"
 
 #define DRV_NAME		"skge"
-#define DRV_VERSION		"1.12"
+#define DRV_VERSION		"1.13"
 #define PFX			DRV_NAME " "
 
 #define DEFAULT_TX_RING_SIZE	128

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
[PATCH] XFRM: SPD auditing fix to include the netmask/prefix-length
Currently the netmask/prefix-length of an IPsec SPD entry is not included in any of the SPD related audit messages. This can cause a problem when the audit log is examined as the netmask/prefix-length is vital in determining what network traffic is affected by a particular SPD entry. This patch fixes this problem by adding two additional fields, "src_prefixlen" and "dst_prefixlen", to the SPD audit messages to indicate the source and destination netmasks. These new fields are only included in the audit message when the netmask/prefix-length is less than the address length, i.e. the SPD entry applies to a network address and not a host address. Example audit message: type=UNKNOWN[1415] msg=audit(1196105849.752:25): auid=0 \ subj=root:system_r:unconfined_t:s0-s0:c0.c1023 op=SPD-add res=1 \ src=192.168.0.0 src_prefixlen=24 dst=192.168.1.0 dst_prefixlen=24 In addition, this patch also fixes a few other things in the xfrm_audit_common_policyinfo() function. The IPv4 string formatting was converted to use the standard NIPQUAD_FMT constant, the memcpy() was removed from the IPv6 code path and replaced with a typecast (the memcpy() was acting as a slow, implicit typecast anyway), and two local variables were created to make referencing the XFRM security context and selector information cleaner. 
Signed-off-by: Paul Moore <[EMAIL PROTECTED]> --- net/xfrm/xfrm_policy.c | 44 ++-- 1 files changed, 26 insertions(+), 18 deletions(-) diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index b702bd8..bd70d79 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -2123,29 +2123,37 @@ void __init xfrm_init(void) static inline void xfrm_audit_common_policyinfo(struct xfrm_policy *xp, struct audit_buffer *audit_buf) { - if (xp->security) + struct xfrm_sec_ctx *ctx = xp->security; + struct xfrm_selector *sel = &xp->selector; + + if (ctx) audit_log_format(audit_buf, " sec_alg=%u sec_doi=%u sec_obj=%s", -xp->security->ctx_alg, xp->security->ctx_doi, -xp->security->ctx_str); +ctx->ctx_alg, ctx->ctx_doi, ctx->ctx_str); - switch(xp->selector.family) { + switch(sel->family) { case AF_INET: - audit_log_format(audit_buf, " src=%u.%u.%u.%u dst=%u.%u.%u.%u", -NIPQUAD(xp->selector.saddr.a4), -NIPQUAD(xp->selector.daddr.a4)); + audit_log_format(audit_buf, " src=" NIPQUAD_FMT, +NIPQUAD(sel->saddr.a4)); + if (sel->prefixlen_s != 32) + audit_log_format(audit_buf, " src_prefixlen=%d", +sel->prefixlen_s); + audit_log_format(audit_buf, " dst=" NIPQUAD_FMT, +NIPQUAD(sel->daddr.a4)); + if (sel->prefixlen_d != 32) + audit_log_format(audit_buf, " dst_prefixlen=%d", +sel->prefixlen_d); break; case AF_INET6: - { - struct in6_addr saddr6, daddr6; - - memcpy(&saddr6, xp->selector.saddr.a6, - sizeof(struct in6_addr)); - memcpy(&daddr6, xp->selector.daddr.a6, - sizeof(struct in6_addr)); - audit_log_format(audit_buf, - " src=" NIP6_FMT " dst=" NIP6_FMT, - NIP6(saddr6), NIP6(daddr6)); - } + audit_log_format(audit_buf, " src=" NIP6_FMT, +NIP6(*(struct in6_addr *)sel->saddr.a6)); + if (sel->prefixlen_s != 128) + audit_log_format(audit_buf, " src_prefixlen=%d", +sel->prefixlen_s); + audit_log_format(audit_buf, " dst=" NIP6_FMT, +NIP6(*(struct in6_addr *)sel->daddr.a6)); + if (sel->prefixlen_d != 128) + audit_log_format(audit_buf, " dst_prefixlen=%d", +sel->prefixlen_d); break; } 
 }
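The rule the patch applies (log the prefix length only when the selector describes a network rather than a single host) can be sketched in userspace as follows. format_audit_addr() and host_prefixlen() are hypothetical helpers written for this illustration, the address is passed in as an already formatted string, and none of this is the kernel audit API.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Full address length in bits for a family, or -1 if unknown. */
static int host_prefixlen(int family)
{
	switch (family) {
	case AF_INET:
		return 32;
	case AF_INET6:
		return 128;
	default:
		return -1;
	}
}

/*
 * Write " <name>=<addr>" into buf and, when prefixlen differs from the
 * full address length, also " <name>_prefixlen=<n>".
 * Returns the number of characters written (snprintf semantics).
 */
int format_audit_addr(char *buf, size_t len, const char *name,
		      int family, const char *addr, int prefixlen)
{
	int n = snprintf(buf, len, " %s=%s", name, addr);

	if (n < 0 || (size_t)n >= len)
		return n;	/* error or truncated: stop here */

	if (prefixlen != host_prefixlen(family))
		n += snprintf(buf + n, len - (size_t)n, " %s_prefixlen=%d",
			      name, prefixlen);
	return n;
}
```

Called with ("src", AF_INET, "192.168.0.0", 24) this yields " src=192.168.0.0 src_prefixlen=24", in the style of the example audit message in the patch description, while a /32 host selector yields only " src=...".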
[PATCH] ethtool: fix typo on setting speed 10000
From: Jesse Brandeburg <[EMAIL PROTECTED]>

Fix the typo in the speed 10000 setting.

Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]>
Signed-off-by: Auke Kok <[EMAIL PROTECTED]>
---
 ethtool.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 3adf843..a668b49 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -524,7 +524,7 @@ static void parse_cmdline(int argc, char **argp)
 					speed_wanted = SPEED_1000;
 				else if (!strcmp(argp[i], "2500"))
 					speed_wanted = SPEED_2500;
-				else if (!strcmp(argp[1], "10000"))
+				else if (!strcmp(argp[i], "10000"))
 					speed_wanted = SPEED_10000;
 				else
 					show_usage(1);
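The bug is easy to reproduce outside ethtool: inside an argument loop, every comparison has to use the loop index, otherwise a value is only recognised when it happens to sit at one fixed position. A stripped-down sketch follows; parse_speed() is a hypothetical function written for this note, not ethtool code.

```c
#include <string.h>

/*
 * Scan argp[] for "speed <value>" and return the speed in Mb/s,
 * or -1 if the keyword is missing or the value is not recognised.
 */
int parse_speed(int argc, char **argp)
{
	int i;

	for (i = 0; i + 1 < argc; i++) {
		if (strcmp(argp[i], "speed") != 0)
			continue;

		i++;	/* the value follows the keyword */
		if (!strcmp(argp[i], "10"))
			return 10;
		if (!strcmp(argp[i], "100"))
			return 100;
		if (!strcmp(argp[i], "1000"))
			return 1000;
		if (!strcmp(argp[i], "2500"))
			return 2500;
		if (!strcmp(argp[i], "10000"))	/* argp[i], not argp[1] */
			return 10000;
		return -1;
	}
	return -1;
}
```

With the index hard-coded to 1, the last comparison would look at the wrong argument whenever the speed value is not the second element of argp.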
Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.
On Mon, 26 Nov 2007 12:28:14 +1100
Rusty Russell <[EMAIL PROTECTED]> wrote:

> On Monday 26 November 2007 07:27:03 Roland Dreier wrote:
> >  > This patch allows to export symbols only for specific modules by
> >  > introducing symbol name spaces. A module name space has a white
> >  > list of modules that are allowed to import symbols for it; all others
> >  > can't use the symbols.
> >  >
> >  > It adds two new macros:
> >  >
> >  > MODULE_NAMESPACE_ALLOW(namespace, module);
> >
> > I definitely like the idea of organizing exported symbols into
> > namespaces. However, I feel like it would make more sense to have
> > something like
> >
> > MODULE_NAMESPACE_IMPORT(namespace);
>
> Except C doesn't have namespaces and this mechanism doesn't create them. So
> this is just complete and utter makework; as I said before, noone's going to
> confuse all those udp_* functions if they're not in the udp namespace.
>
> For better or worse, this is not C++.
>

Agreed. On first glance, I was intrigued but:

1) Why is everyone so concerned that export symbol space is large?
   - does it cost cpu or running memory?
   - does it cause bugs?
   - or are you just worried about "evil modules"?

2) These aren't real namespaces
   - all global names still have to be unique
   - still have to handle the "non-modular build" namespace conflicts
   - there isn't a big problem with conflicting symbols today.

So why bother adding complexity?

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
Re: [PATCHv2 1/3] NET_SCHED: PSPacer qdisc module
Ryousei Takano wrote: This patch includes the PSPacer (Precise Software Pacer) qdisc module, which achieves precise transmission bandwidth control. You can find more information at the project web page (http://www.gridmpi.org/gridtcp.jsp). Thanks for the update, but you didn't answer any of my questions. Another round of comments below. Signed-off-by: Ryousei Takano <[EMAIL PROTECTED]> --- include/linux/pkt_sched.h | 37 ++ net/sched/Kconfig |9 + net/sched/Makefile|1 + net/sched/sch_psp.c | 958 + 4 files changed, 1005 insertions(+), 0 deletions(-) create mode 100644 net/sched/sch_psp.c diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h index 919af93..fda41cd 100644 --- a/include/linux/pkt_sched.h +++ b/include/linux/pkt_sched.h @@ -430,6 +430,43 @@ enum { #define TCA_ATM_MAX (__TCA_ATM_MAX - 1) +/* Precise Software Pacer section */ + +#define TC_PSP_MAXDEPTH (8) + +typedef long long gapclock_t; It seems you only add to this, does it need to be signed? How about using a fixed size type (u64) and getting rid of the typedef? + +enum { + MODE_NORMAL = 0, + MODE_STATIC = 1, +}; + +struct tc_psp_copt +{ + __u32 level; + __u32 mode; + __u32 rate; +}; + +struct tc_psp_qopt +{ + __u32 defcls; + __u32 rate; +}; What unit is rate measured in? + +struct tc_psp_xstats +{ + __u32 bytes; /* gap packet statistics */ + __u32 packets; +}; How about using gnet_stats_basic for this? + +enum +{ + TCA_PSP_UNSPEC, + TCA_PSP_COPT, + TCA_PSP_QOPT, +}; + /* Network emulator */ +++ b/net/sched/sch_psp.c @@ -0,0 +1,958 @@ +/* + * net/sched/sch_psp.c PSPacer: Precise Software Pacer + * + * Copyright (C) 2004-2007 National Institute of Advanced + * Industrial Science and Technology (AIST), Japan. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ * + * Authors:Ryousei Takano, <[EMAIL PROTECTED]> + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* PSPacer achieves precise rate regulation results, and no microscopic + * burst transmission which exceeds the limit is generated. + * + * The basic idea is that transmission timing can be precisely controlled, + * if packets are sent back-to-back at the wire rate. PSPacer controls + * the packet transmision intervals by inserting additional packets, + * called gap packets, between adjacent packets. The transmission interval + * can be controlled accurately by adjusting the number and size of the gap + * packets. PSPacer uses the 802.3x PAUSE frame as the gap packet. + * + * For the purpose of adjusting the gap size, this Qdisc maintains a byte + * clock which is recorded by a total transmitted byte per connection. + * Each sub-class has a class local clock which is used to make decision + * whether to send a packet or not. If there is not any packets to send, + * gap packets are inserted. + * + * References: + * [1] R.Takano, T.Kudoh, Y.Kodama, M.Matsuda, H.Tezuka, and Y.Ishikawa, + * "Design and Evaluation of Precise Software Pacing Mechanisms for + * Fast Long-Distance Networks", PFLDnet2005. + * [2] http://www.gridmpi.org/gridtcp.jsp + */ + +#define HW_GAP (16)/* Preamble(8) + Inter Frame Gap(8) */ +#define FCS(4) /* Frame Check Sequence(4) */ +#define MIN_GAP (64) /* Minimum size of gap packet */ +#define MIN_TARGET_RATE (1000) /* 1 KB/s (= 8 Kbps) */ What is the reason for this minimum? + +#define PSP_HSIZE (16) + +#define BIT2BYTE(n) ((n) >> 3) Please remove this and simply open code "/ BITS_PER_BYTE" in the only spot using it. 
+ +struct psp_class +{ + u32 classid;/* class id */ + int refcnt; /* reference count */ + + struct gnet_stats_basic bstats; /* basic stats */ + struct gnet_stats_queue qstats; /* queue stats */ + + int level; /* class level in hierarchy */ + struct psp_class *parent; /* parent class */ + struct list_head sibling; /* sibling classes */ + struct list_head children; /* child classes */ + + struct Qdisc *qdisc;/* leaf qdisc */ + + struct tcf_proto *filter_list; /* filter list */ + int filter_cnt; /* filter count */ + + struct list_head hlist; /* hash list */ + struct list_head dlist; /* drop list */ + struct list_head plist; /* normal/pacing class qdisc list */ + + int activity; /* activity flag */ +#define FLAG_ACTIV
Re: [PATCH] ehea: Add kdump support
Hi,

On Mon, Nov 26, 2007 at 01:41:37PM -0200, Luke Browning wrote:
> On Mon, 2007-11-26 at 19:16 +1100, Michael Ellerman wrote:
> >
> > For kdump we have to assume that the kernel is fundamentally broken,

If I may so humbly suggest: since ehea is a power6 thing only, we should
refocus our energies on "hypervisor assisted dump", which solves all of
these problems.

In short, upon crash, the hypervisor will reset the pci devices into
working order, and will then boot a new fresh kernel into a tiny corner
of ram. The rest of ram is not cleared, and can be dumped. After the
dump, the mem is returned to general use.

The key point here, for ehea, is "the hypervisor will reset the device
state to something rational".

Preliminary patches are at
http://patchwork.ozlabs.org/linuxppc/patch?id=14884
and following.

--linas
Re: [PATCH 01/01] ipv6: RFC4214 Support (v2.5)
In article <[EMAIL PROTECTED]> (at Mon, 26 Nov 2007 09:16:16 -0800), "Templin, Fred L" <[EMAIL PROTECTED]> says:

> From: Fred L. Templin <[EMAIL PROTECTED]>
>
> This patch includes support for the Intra-Site Automatic Tunnel
> Addressing Protocol (ISATAP) per RFC4214. It uses the SIT
> module, and is configured using extensions to the "iproute2"
> utility. The diffs are specific to the Linux 2.6.24-rc2 kernel
> distribution.
>
> This version includes the diff for ./include/linux/if.h which was
> missing in the v2.4 submission and is needed to make the
> patch compile. The patch has been installed, compiled and
> tested in a clean 2.6.24-rc2 kernel build area.
>
> Signed-off-by: Fred L. Templin <[EMAIL PROTECTED]>

Acked-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>

Note:

With linux-2.6:
| % patch -p1 < /tmp/isatap.patch
| patching file include/linux/if.h
| patching file include/linux/if_tunnel.h
| patching file include/linux/in.h
| patching file include/net/addrconf.h
| patching file net/ipv6/addrconf.c
| patching file net/ipv6/route.c
| Hunk #1 succeeded at 1660 (offset -8 lines).
| patching file net/ipv6/sit.c

With net-2.6.24:
| % patch -p1 < /tmp/isatap.patch
| patching file include/linux/if.h
| patching file include/linux/if_tunnel.h
| patching file include/linux/in.h
| patching file include/net/addrconf.h
| patching file net/ipv6/addrconf.c
| Hunk #1 succeeded at 378 (offset -1 lines).
| Hunk #2 succeeded at 1441 (offset -1 lines).
| Hunk #3 succeeded at 1479 (offset -1 lines).
| Hunk #4 succeeded at 2210 (offset -1 lines).
| patching file net/ipv6/route.c
| Hunk #1 succeeded at 1727 (offset 59 lines).
| patching file net/ipv6/sit.c

--yoshfuji
Re: [RFC] bridging: don't forward EAPOL frames
On Thu, 22 Nov 2007 14:23:28 +0100
Johannes Berg <[EMAIL PROTECTED]> wrote:

> This patch makes the bridging code drop EAPOL frames as recommended by
> 802.1X-2004 in C.3.3.
>
> Is this really the right place to put it?
> ---
>  include/linux/if_ether.h |    1 +
>  include/net/ieee80211.h  |    6 ------
>  net/bridge/br_input.c    |    3 +++
>  3 files changed, 4 insertions(+), 6 deletions(-)
>
> --- everything.orig/include/linux/if_ether.h	2007-11-22 11:47:14.178686360 +0100
> +++ everything/include/linux/if_ether.h	2007-11-22 11:48:21.438679036 +0100
> @@ -74,6 +74,7 @@
>  #define ETH_P_ATMFATE	0x8884		/* Frame-based ATM Transport
>  					 * over Ethernet
>  					 */
> +#define ETH_P_PAE	0x888E		/* Port Access Entity (IEEE 802.1X) */
>  #define ETH_P_AOE	0x88A2		/* ATA over Ethernet */
>  #define ETH_P_TIPC	0x88CA		/* TIPC */
>
> --- everything.orig/include/net/ieee80211.h	2007-11-22 11:46:29.908682888 +0100
> +++ everything/include/net/ieee80211.h	2007-11-22 11:48:51.908679037 +0100
> @@ -183,12 +183,6 @@ const char *escape_essid(const char *ess
>  #endif
>  #include /* new driver API */
>
> -#ifndef ETH_P_PAE
> -#define ETH_P_PAE 0x888E /* Port Access Entity (IEEE 802.1X) */
> -#endif /* ETH_P_PAE */
> -
> -#define ETH_P_PREAUTH 0x88C7 /* IEEE 802.11i pre-authentication */
> -
>  #ifndef ETH_P_80211_RAW
>  #define ETH_P_80211_RAW (ETH_P_ECONET + 1)
>  #endif
> --- everything.orig/net/bridge/br_input.c	2007-11-22 11:54:44.798683106 +0100
> +++ everything/net/bridge/br_input.c	2007-11-22 11:57:23.248680285 +0100
> @@ -145,6 +145,9 @@ struct sk_buff *br_handle_frame(struct n
>  		}
>  	}
>
> +	if (unlikely(skb->protocol == htons(ETH_P_PAE)))
> +		goto drop;
> +
>  	switch (p->state) {
>  	case BR_STATE_FORWARDING:

Not needed because the bridge is already handling it:
 1) If running STP (i.e. a true bridge), then all link-local multicast
    is only received by the bridge and never forwarded.
 2) If not running STP (i.e. an invisible bridge), then it will be
    forwarded.

Despite what the standards say, many users are using the bridging code
for invisible firewalls etc., and in those cases they want STP and EAPOL
frames to be forwarded.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
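For readers following the proposed hunk: the test it adds is meant to compare the frame's protocol field, which is kept in network byte order, against htons(ETH_P_PAE). A minimal userspace sketch of that comparison on a raw Ethernet header is below. The helper name frame_is_eapol and the ETH_HDR_LEN constant are made up for illustration; this is not the bridge code.

```c
#include <arpa/inet.h>	/* htons */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef ETH_P_PAE
#define ETH_P_PAE 0x888E	/* Port Access Entity (IEEE 802.1X), value from the patch */
#endif
#define ETH_HDR_LEN 14		/* 6 dst + 6 src + 2 ethertype (illustrative name) */

/* Hypothetical helper: 1 if the frame's ethertype is EAPOL, 0 otherwise. */
static int frame_is_eapol(const uint8_t *frame, size_t len)
{
	uint16_t proto;

	if (len < ETH_HDR_LEN)
		return 0;

	/* The ethertype sits at offset 12 and is in network byte order. */
	memcpy(&proto, frame + 12, sizeof(proto));

	/* Must be a comparison; an assignment here would always be true. */
	return proto == htons(ETH_P_PAE);
}

static void frame_is_eapol_selfcheck(void)
{
	uint8_t eapol[ETH_HDR_LEN] = { [12] = 0x88, [13] = 0x8E };
	uint8_t ipv4[ETH_HDR_LEN] = { [12] = 0x08, [13] = 0x00 };

	assert(frame_is_eapol(eapol, sizeof(eapol)) == 1);
	assert(frame_is_eapol(ipv4, sizeof(ipv4)) == 0);
	assert(frame_is_eapol(eapol, 4) == 0);	/* truncated header */
}
```

Whether such a check belongs in the bridge input path at all is exactly what the reply above disputes.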
[PATCH 2.6.25 2/2] RDMA/cxgb3: Support 5.0 firmware.
RDMA/cxgb3: Support 5.0 firmware.

The 5.0 firmware now supports translating sgls in recv wrs, so remove
the host driver logic currently doing the translation.

Note: this change requires 5.0 firmware.

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
---

 drivers/infiniband/hw/cxgb3/iwch_qp.c |   21 ++-------------------
 1 files changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_qp.c b/drivers/infiniband/hw/cxgb3/iwch_qp.c
index dd89b6b..9bb8112 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_qp.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_qp.c
@@ -208,36 +208,19 @@ static int iwch_sgl2pbl_map(struct iwch_dev *rhp, struct ib_sge *sg_list,
 static int iwch_build_rdma_recv(struct iwch_dev *rhp, union t3_wr *wqe,
 				struct ib_recv_wr *wr)
 {
-	int i, err = 0;
-	u32 pbl_addr[4];
-	u8 page_size[4];
+	int i;
 	if (wr->num_sge > T3_MAX_SGE)
 		return -EINVAL;
-	err = iwch_sgl2pbl_map(rhp, wr->sg_list, wr->num_sge, pbl_addr,
-			       page_size);
-	if (err)
-		return err;
-	wqe->recv.pagesz[0] = page_size[0];
-	wqe->recv.pagesz[1] = page_size[1];
-	wqe->recv.pagesz[2] = page_size[2];
-	wqe->recv.pagesz[3] = page_size[3];
 	wqe->recv.num_sgle = cpu_to_be32(wr->num_sge);
 	for (i = 0; i < wr->num_sge; i++) {
 		wqe->recv.sgl[i].stag = cpu_to_be32(wr->sg_list[i].lkey);
 		wqe->recv.sgl[i].len = cpu_to_be32(wr->sg_list[i].length);
-
-		/* to in the WQE == the offset into the page */
-		wqe->recv.sgl[i].to = cpu_to_be64(((u32) wr->sg_list[i].addr) %
-				(1UL << (12 + page_size[i])));
-
-		/* pbl_addr is the adapters address in the PBL */
-		wqe->recv.pbl_addr[i] = cpu_to_be32(pbl_addr[i]);
+		wqe->recv.sgl[i].to = cpu_to_be64(wr->sg_list[i].addr);
 	}
 	for (; i < T3_MAX_SGE; i++) {
 		wqe->recv.sgl[i].stag = 0;
 		wqe->recv.sgl[i].len = 0;
 		wqe->recv.sgl[i].to = 0;
-		wqe->recv.pbl_addr[i] = 0;
 	}
 	return 0;
 }
[PATCH 2.6.25 0/2] RDMA/cxgb3 patches for 2.6.25
Hey Roland,

Please pull these two iw_cxgb3 patches for 2.6.25.

The 5.0 firmware change must be committed along with the cxgb3 NIC
changes submitted here:

http://lkml.org/lkml/2007/11/16/224

and merged by Jeff here:

http://lkml.org/lkml/2007/11/23/180

Shortlog:

RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call.
RDMA/cxgb3: Support 5.0 firmware.

---

Steve.
[PATCH 2.6.25 1/2] RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call.
RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call.

Currently the call into cxgb3 to get the driver info is not serialized.
The iw_cxgb3 module needs to hold the rtnl_lock around the ethtool ops
call like dev_ioctl() does.

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
---

 drivers/infiniband/hw/cxgb3/iwch_provider.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index b5436ca..69b1204 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -1053,7 +1054,9 @@ static ssize_t show_fw_ver(struct class_device *cdev, char *buf)
 	struct net_device *lldev = dev->rdev.t3cdev_p->lldev;
 
 	PDBG("%s class dev 0x%p\n", __FUNCTION__, cdev);
+	rtnl_lock();
 	lldev->ethtool_ops->get_drvinfo(lldev, &info);
+	rtnl_unlock();
 	return sprintf(buf, "%s\n", info.fw_version);
 }
 
@@ -1065,7 +1068,9 @@ static ssize_t show_hca(struct class_device *cdev, char *buf)
 	struct net_device *lldev = dev->rdev.t3cdev_p->lldev;
 
 	PDBG("%s class dev 0x%p\n", __FUNCTION__, cdev);
+	rtnl_lock();
 	lldev->ethtool_ops->get_drvinfo(lldev, &info);
+	rtnl_unlock();
 	return sprintf(buf, "%s\n", info.driver);
 }
[PATCH 01/01] iproute2-2.6.23: RFC4214 Support (v2.5)
From: Fred L. Templin <[EMAIL PROTECTED]> This patch includes support for the Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) per RFC4214. The following diffs are specific to the iproute2-2.6.23 software distribution. This message includes the full and patchable diff text; please use this version to apply patches. Signed-off-by: Fred L. Templin <[EMAIL PROTECTED]> --- --- iproute2-2.6.23/ip/iptunnel.c.orig 2007-11-08 16:27:24.0 -0800 +++ iproute2-2.6.23/ip/iptunnel.c 2007-11-19 05:57:47.0 -0800 @@ -39,7 +39,7 @@ static void usage(void) __attribute__((n static void usage(void) { fprintf(stderr, "Usage: ip tunnel { add | change | del | show } [ NAME ]\n"); - fprintf(stderr, " [ mode { ipip | gre | sit } ] [ remote ADDR ] [ local ADDR ]\n"); + fprintf(stderr, " [ mode { ipip | gre | sit | isatap } ] [ remote ADDR ] [ local ADDR ]\n"); fprintf(stderr, " [ [i|o]seq ] [ [i|o]key KEY ] [ [i|o]csum ]\n"); fprintf(stderr, " [ ttl TTL ] [ tos TOS ] [ [no]pmtudisc ] [ dev PHYS_DEV ]\n"); fprintf(stderr, "\n"); @@ -55,6 +55,7 @@ static int parse_args(int argc, char **a { int count = 0; char medium[IFNAMSIZ]; + int isatap = 0; memset(p, 0, sizeof(*p)); memset(&medium, 0, sizeof(medium)); @@ -90,6 +91,13 @@ static int parse_args(int argc, char **a exit(-1); } p->iph.protocol = IPPROTO_IPV6; + } else if (strcmp(*argv, "isatap") == 0) { + if (p->iph.protocol && p->iph.protocol != IPPROTO_IPV6) { + fprintf(stderr,"You managed to ask for more than one tunnel mode.\n"); + exit(-1); + } + p->iph.protocol = IPPROTO_IPV6; + isatap++; } else { fprintf(stderr,"Cannot guess tunnel mode.\n"); exit(-1); @@ -212,6 +220,10 @@ static int parse_args(int argc, char **a p->iph.protocol = IPPROTO_IPIP; else if (memcmp(p->name, "sit", 3) == 0) p->iph.protocol = IPPROTO_IPV6; + else if (memcmp(p->name, "isatap", 6) == 0) { + p->iph.protocol = IPPROTO_IPV6; + isatap++; + } } if (p->iph.protocol == IPPROTO_IPIP || p->iph.protocol == IPPROTO_IPV6) { @@ -239,6 +251,14 @@ static int 
parse_args(int argc, char **a fprintf(stderr, "Broadcast tunnel requires a source address.\n"); return -1; } + if (isatap) { + if (p->iph.daddr) { + fprintf(stderr, "no remote with isatap.\n"); + return -1; + } + p->i_flags |= SIT_ISATAP; + } + return 0; } - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/01] ipv6: RFC4214 Support (v2.5)
From: Fred L. Templin <[EMAIL PROTECTED]> This patch includes support for the Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses the SIT module, and is configured using extensions to the "iproute2" utility. The diffs are specific to the Linux 2.6.24-rc2 kernel distribution. This version includes the diff for ./include/linux/if.h which was missing in the v2.4 submission and is needed to make the patch compile. The patch has been installed, compiled and tested in a clean 2.6.24-rc2 kernel build area. Signed-off-by: Fred L. Templin <[EMAIL PROTECTED]> --- --- linux-2.6.24-rc2/include/linux/if.h.orig2007-11-08 12:05:47.0 -0800 +++ linux-2.6.24-rc2/include/linux/if.h 2007-11-08 08:26:44.0 -0800 @@ -61,6 +61,7 @@ #define IFF_MASTER_ALB 0x10/* bonding master, balance-alb. */ #define IFF_BONDING0x20/* bonding master or slave */ #define IFF_SLAVE_NEEDARP 0x40 /* need ARPs for validation */ +#define IFF_ISATAP 0x80/* ISATAP interface (RFC4214) */ #define IF_GET_IFACE 0x0001 /* for querying only */ #define IF_GET_PROTO 0x0002 --- linux-2.6.24-rc2/include/linux/if_tunnel.h.orig 2007-11-19 03:54:12.0 -0800 +++ linux-2.6.24-rc2/include/linux/if_tunnel.h 2007-11-19 03:55:58.0 -0800 @@ -17,6 +17,9 @@ #define GRE_FLAGS __constant_htons(0x00F8) #define GRE_VERSION__constant_htons(0x0007) +/* i_flags values for SIT mode */ +#defineSIT_ISATAP 0x0001 + struct ip_tunnel_parm { charname[IFNAMSIZ]; --- linux-2.6.24-rc2/include/linux/in.h.orig2007-11-09 08:00:32.0 -0800 +++ linux-2.6.24-rc2/include/linux/in.h 2007-11-12 07:37:05.0 -0800 @@ -253,6 +253,14 @@ struct sockaddr_in { #define ZERONET(x) (((x) & htonl(0xff00)) == htonl(0x)) #define LOCAL_MCAST(x) (((x) & htonl(0xFF00)) == htonl(0xE000)) +/* Special-Use IPv4 Addresses (RFC3330) */ +#define PRIVATE_10(x) (((x) & htonl(0xff00)) == htonl(0x0A00)) +#define LINKLOCAL_169(x) (((x) & htonl(0x)) == htonl(0xA9FE)) +#define PRIVATE_172(x) (((x) & htonl(0xfff0)) == htonl(0xAC10)) +#define TEST_192(x)(((x) & 
htonl(0xff00)) == htonl(0xC200)) +#define ANYCAST_6TO4(x)(((x) & htonl(0xff00)) == htonl(0xC0586300)) +#define PRIVATE_192(x) (((x) & htonl(0x)) == htonl(0xC0A8)) +#define TEST_198(x)(((x) & htonl(0xfffe)) == htonl(0xC612)) #endif #endif /* _LINUX_IN_H */ --- linux-2.6.24-rc2/include/net/addrconf.h.orig2007-11-08 12:06:17.0 -0800 +++ linux-2.6.24-rc2/include/net/addrconf.h 2007-11-19 05:47:48.0 -0800 @@ -17,6 +17,7 @@ #define IPV6_MAX_ADDRESSES 16 +#include #include struct prefix_info { @@ -241,6 +242,24 @@ static inline int ipv6_addr_is_ll_all_ro addr->s6_addr32[3] == htonl(0x0002)); } +static inline int ipv6_isatap_eui64(u8 *eui, __be32 addr) +{ + eui[0] = (ZERONET(addr) || PRIVATE_10(addr) || LOOPBACK(addr) || + LINKLOCAL_169(addr) || PRIVATE_172(addr) || TEST_192(addr) || + ANYCAST_6TO4(addr) || PRIVATE_192(addr) || TEST_198(addr) || + MULTICAST(addr) || BADCLASS(addr)) ? 0x00 : 0x02; + eui[1] = 0; + eui[2] = 0x5E; + eui[3] = 0xFE; + memcpy (eui+4, &addr, 4); + return 0; +} + +static inline int ipv6_addr_is_isatap(const struct in6_addr *addr) +{ + return ((addr->s6_addr32[2] | htonl(0x0200)) == htonl(0x02005EFE)); +} + #ifdef CONFIG_PROC_FS extern int if6_proc_init(void); extern void if6_proc_exit(void); --- linux-2.6.24-rc2/net/ipv6/addrconf.c.orig 2007-11-19 03:43:06.0 -0800 +++ linux-2.6.24-rc2/net/ipv6/addrconf.c2007-11-19 13:29:36.0 -0800 @@ -379,6 +379,13 @@ static struct inet6_dev * ipv6_add_dev(s "%s: Disabled Privacy Extensions\n", dev->name); ndev->cnf.use_tempaddr = -1; + + if (dev->type == ARPHRD_SIT && (dev->priv_flags & IFF_ISATAP)) { + printk(KERN_INFO + "%s: Disabled Multicast RS\n", + dev->name); + ndev->cnf.rtr_solicits = 0; + } } else { in6_dev_hold(ndev); ipv6_regen_rndid((unsigned long) ndev); @@ -1435,6 +1442,9 @@ static int ipv6_generate_eui64(u8 *eui, return addrconf_ifid_arcnet(eui, dev); case ARPHRD_INFINIBAND: return addrconf_ifid_infiniband(eui, dev); + case ARPHRD_SIT: + if (dev->priv_flags & IFF_ISATAP) + return 
ipv6_isatap_eui64(eui, *(__be32 *)dev->dev_addr); } return -1; } @@ -1470,7 +1480,7 @@ regen: * * - Reserved subnet anycast (RFC 2526) * 110
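The interface-identifier logic added to addrconf.h above can be tried out in userspace. The following is a sketch under stated assumptions, not the kernel code: the function names are made up, and since the mask constants in the posted macros did not survive intact in this copy, the address ranges are written out from RFC 3330 rather than copied from the patch.

```c
#include <arpa/inet.h>	/* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: 1 if the IPv4 address (network byte order) is in
 * one of the special-use ranges the patch tests for, i.e. it is not a
 * globally unique unicast address. Ranges per RFC 3330.
 */
static int ipv4_not_global(uint32_t addr_be)
{
	uint32_t a = ntohl(addr_be);

	return (a & 0xff000000u) == 0x00000000u ||	/* 0.0.0.0/8 */
	       (a & 0xff000000u) == 0x0a000000u ||	/* 10.0.0.0/8 */
	       (a & 0xff000000u) == 0x7f000000u ||	/* 127.0.0.0/8 */
	       (a & 0xffff0000u) == 0xa9fe0000u ||	/* 169.254.0.0/16 */
	       (a & 0xfff00000u) == 0xac100000u ||	/* 172.16.0.0/12 */
	       (a & 0xffffff00u) == 0xc0000200u ||	/* 192.0.2.0/24 */
	       (a & 0xffffff00u) == 0xc0586300u ||	/* 192.88.99.0/24 */
	       (a & 0xffff0000u) == 0xc0a80000u ||	/* 192.168.0.0/16 */
	       (a & 0xfffe0000u) == 0xc6120000u ||	/* 198.18.0.0/15 */
	       (a & 0xf0000000u) == 0xe0000000u ||	/* 224.0.0.0/4 */
	       (a & 0xf0000000u) == 0xf0000000u;	/* 240.0.0.0/4 */
}

/* Build the 8-byte ISATAP interface identifier: [00|02]:00:5E:FE + IPv4. */
static void isatap_eui64(uint8_t eui[8], uint32_t addr_be)
{
	eui[0] = ipv4_not_global(addr_be) ? 0x00 : 0x02;
	eui[1] = 0x00;
	eui[2] = 0x5E;
	eui[3] = 0xFE;
	memcpy(eui + 4, &addr_be, 4);
}

/* Match an ISATAP identifier regardless of the 0x02 bit in the first byte. */
static int isatap_iid_match(const uint8_t eui[8])
{
	return (eui[0] | 0x02) == 0x02 && eui[1] == 0x00 &&
	       eui[2] == 0x5E && eui[3] == 0xFE;
}

static void isatap_selfcheck(void)
{
	const uint8_t want_private[8] = { 0x00, 0x00, 0x5E, 0xFE, 10, 0, 0, 1 };
	const uint8_t want_global[8] = { 0x02, 0x00, 0x5E, 0xFE, 8, 8, 8, 8 };
	uint8_t eui[8];

	isatap_eui64(eui, htonl(0x0a000001u));		/* 10.0.0.1 */
	assert(memcmp(eui, want_private, 8) == 0);
	assert(isatap_iid_match(eui));

	isatap_eui64(eui, htonl(0x08080808u));		/* 8.8.8.8 */
	assert(memcmp(eui, want_global, 8) == 0);
	assert(isatap_iid_match(eui));
}
```

So a tunnel whose local address is 10.0.0.1 would get the identifier 0000:5EFE:0A00:0001, while a globally unique address sets the first byte to 0x02.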
[PATCH 2/2] Eliminate unused argument from sk_stream_alloc_pskb
The 3rd argument is always zero (according to grep :) Eliminate it and merge the function with sk_stream_alloc_skb. This saves 44 more bytes, and together with the previous patch we have: add/remove: 1/0 grow/shrink: 0/8 up/down: 183/-751 (-568) function old new delta sk_stream_alloc_skb- 183+183 ip_rt_init 529 525 -4 arp_ignore 112 107 -5 __inet_lookup_listener 284 274 -10 tcp_sendmsg 25832481-102 tcp_sendpage14491300-149 tso_fragment 417 258-159 tcp_fragment1149 988-161 __tcp_push_pending_frames 19981837-161 Question: is this 2.6.24 material (good space saving) or should I rework this against 2.6.25 (it applies with fuzzes, but seems to compile)? Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- diff --git a/include/net/sock.h b/include/net/sock.h index 492dc4a..a469ed8 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1230,15 +1230,7 @@ static inline void sk_stream_moderate_sndbuf(struct sock *sk) } } -struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, - int size, int mem, gfp_t gfp); - -static inline struct sk_buff *sk_stream_alloc_skb(struct sock *sk, - int size, - gfp_t gfp) -{ - return sk_stream_alloc_pskb(sk, size, 0, gfp); -} +struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp); static inline struct page *sk_stream_alloc_page(struct sock *sk) { diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0dfda20..1965c37 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -501,8 +501,7 @@ static inline void tcp_push(struct sock *sk, int flags, int mss_now, } } -struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, - int size, int mem, gfp_t gfp) +struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp) { struct sk_buff *skb; @@ -511,7 +510,6 @@ struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, skb = alloc_skb_fclone(size + sk->sk_prot->max_header, gfp); if (skb) { - skb->truesize += mem; if (sk_stream_wmem_schedule(sk, skb->truesize)) { /* * Make sure that we have exactly size bytes @@ -564,8 +562,7 @@ 
new_segment: if (!sk_stream_memory_free(sk)) goto wait_for_sndbuf; - skb = sk_stream_alloc_pskb(sk, 0, 0, - sk->sk_allocation); + skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation); if (!skb) goto wait_for_memory; @@ -745,8 +742,8 @@ new_segment: if (!sk_stream_memory_free(sk)) goto wait_for_sndbuf; - skb = sk_stream_alloc_pskb(sk, select_size(sk), - 0, sk->sk_allocation); + skb = sk_stream_alloc_skb(sk, select_size(sk), + sk->sk_allocation); if (!skb) goto wait_for_memory; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index e5130a7..132e16b 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1183,7 +1183,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len, if (skb->len != skb->data_len) return tcp_fragment(sk, skb, len, mss_now); - buff = sk_stream_alloc_pskb(sk, 0, 0, GFP_ATOMIC); + buff = sk_stream_alloc_skb(sk, 0, GFP_ATOMIC); if (unlikely(buff == NULL)) return -ENOMEM; -- 1.5.3.4 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Uninline the sk_stream_alloc_pskb
This function seems too big for inlining. Indeed, it saves half-a-kilo when uninlined: add/remove: 1/0 grow/shrink: 0/7 up/down: 195/-719 (-524) function old new delta sk_stream_alloc_pskb - 195+195 ip_rt_init 529 525 -4 __inet_lookup_listener 284 274 -10 tcp_sendmsg 25832486 -97 tcp_sendpage14491305-144 tso_fragment 417 267-150 tcp_fragment1149 992-157 __tcp_push_pending_frames 19981841-157 Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> --- diff --git a/include/net/sock.h b/include/net/sock.h index 67e35c7..492dc4a 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1230,33 +1230,8 @@ static inline void sk_stream_moderate_sndbuf(struct sock *sk) } } -static inline struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, - int size, int mem, - gfp_t gfp) -{ - struct sk_buff *skb; - - /* The TCP header must be at least 32-bit aligned. */ - size = ALIGN(size, 4); - - skb = alloc_skb_fclone(size + sk->sk_prot->max_header, gfp); - if (skb) { - skb->truesize += mem; - if (sk_stream_wmem_schedule(sk, skb->truesize)) { - /* -* Make sure that we have exactly size bytes -* available to the caller, no more, no less. -*/ - skb_reserve(skb, skb_tailroom(skb) - size); - return skb; - } - __kfree_skb(skb); - } else { - sk->sk_prot->enter_memory_pressure(); - sk_stream_moderate_sndbuf(sk); - } - return NULL; -} +struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, + int size, int mem, gfp_t gfp); static inline struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 8e65182..0dfda20 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -501,6 +501,33 @@ static inline void tcp_push(struct sock *sk, int flags, int mss_now, } } +struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, + int size, int mem, gfp_t gfp) +{ + struct sk_buff *skb; + + /* The TCP header must be at least 32-bit aligned. 
*/ + size = ALIGN(size, 4); + + skb = alloc_skb_fclone(size + sk->sk_prot->max_header, gfp); + if (skb) { + skb->truesize += mem; + if (sk_stream_wmem_schedule(sk, skb->truesize)) { + /* +* Make sure that we have exactly size bytes +* available to the caller, no more, no less. +*/ + skb_reserve(skb, skb_tailroom(skb) - size); + return skb; + } + __kfree_skb(skb); + } else { + sk->sk_prot->enter_memory_pressure(); + sk_stream_moderate_sndbuf(sk); + } + return NULL; +} + static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffset, size_t psize, int flags) { -- 1.5.3.4 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Missing audit information in xfrm_audit_common_policyinfo()?
On Monday 26 November 2007 11:47:09 am Joy Latten wrote:
> Paul Moore <[EMAIL PROTECTED]> wrote on 11/21/2007 03:34:31 PM:
> > I just noticed that the IPsec auditing code does not appear to audit the
> > netmask for the selector source and destination addresses in
> > xfrm_audit_common_policyinfo(). Before I threw a patch together I
> > thought I would check to see if there was a reason for this that I am
> > missing ...
>
> I don't think we ever discussed including netmask when we added the
> ipsec audit info...

Hmmm ... okay. I'm almost certain it should be included when auditing
changes to the SPD as the netmask/prefixlen is very important when
considering which traffic will be matched by a particular SPD entry.
I'm working on a patch now.

-- 
paul moore
linux security @ hp
Re: [XFRM]: Fix leak of expired xfrm_states
Herbert Xu wrote:
> On Mon, Nov 26, 2007 at 04:56:01PM +0100, Patrick McHardy wrote:
>> That should work as long as we keep the del_timer_sync to avoid
>> a use-after-free. It seems a bit fragile though.
>
> Well we're relying on the del_timer_sync already to avoid the
> ref count on the timer. Otherwise if the admin deletes the SA
> while the timer is running it'll go up in smoke too.
>
> If you look in the history you'll find that the same patch that
> removed the ref count on the timer introduced the call to
> del_timer_sync :)

OK, here's a patch to use xfrm_state_put in __xfrm_state_delete().
I've checked the other callers and it should be fine. Lock ordering
between x->lock and xfrm_state_gc_lock also doesn't seem to be an
issue.

commit ba63b1baf5d8a63f3bb3097a7201de75c1b77e2d
Author: Patrick McHardy <[EMAIL PROTECTED]>
Date:   Mon Nov 26 16:00:50 2007 +0100

    [XFRM]: Fix leak of expired xfrm_states

    The xfrm_timer calls __xfrm_state_delete, which drops the final reference
    manually without triggering destruction of the state. Change it to use
    xfrm_state_put to add the state to the gc list when we're dropping the
    last reference. The timer function may still continue to use the state
    safely since the final destruction does a del_timer_sync().

    Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 224b44e..cf43c49 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -552,7 +552,7 @@ int __xfrm_state_delete(struct xfrm_state *x)
 		 * The xfrm_state_alloc call gives a reference, and that
 		 * is what we are dropping here.
 		 */
-		__xfrm_state_put(x);
+		xfrm_state_put(x);
 		err = 0;
 	}
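To make the difference between the two put variants in this thread concrete, here is a toy model. It is only a sketch: struct toy_state and both functions are invented stand-ins, the counter is a plain int rather than an atomic, and "destruction" is reduced to setting a flag that represents being queued on the gc list.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct xfrm_state; all names here are illustrative. */
struct toy_state {
	int refcnt;
	bool on_gc_list;	/* set once destruction has been scheduled */
};

/* Behaves like __xfrm_state_put() as described above: decrement only. */
static void toy_put_nodestroy(struct toy_state *x)
{
	x->refcnt--;
}

/* Behaves like xfrm_state_put() as described above: destroy at zero. */
static void toy_put(struct toy_state *x)
{
	if (--x->refcnt == 0)
		x->on_gc_list = true;
}

static void toy_put_selfcheck(void)
{
	struct toy_state leaked = { .refcnt = 1, .on_gc_list = false };
	struct toy_state freed = { .refcnt = 1, .on_gc_list = false };

	/* Dropping the last reference without the zero check: nothing
	 * ever schedules destruction, so the state is leaked. */
	toy_put_nodestroy(&leaked);
	assert(leaked.refcnt == 0 && !leaked.on_gc_list);

	/* Dropping it with the zero check queues the state for gc. */
	toy_put(&freed);
	assert(freed.refcnt == 0 && freed.on_gc_list);
}
```

The first case corresponds to the leak reported in the thread: the count reaches zero but nothing ever runs the destructor.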
[PATCH 1/1][ATM]: [he] initialize lock and tasklet earlier
If you are lucky (unlucky?) enough to have shared interrupts, the
interrupt handler can be called before the tasklet and lock are ready
for use.

commit 44b3e82778b0edf73147529c8b1c115d241a6a4d
Author: chas williams - CONTRACTOR <[EMAIL PROTECTED]>
Date:   Mon Nov 26 11:30:33 2007 -0500

    [ATM]: [he] initialize lock and tasklet earlier

diff --git a/drivers/atm/he.c b/drivers/atm/he.c
index d33aba6..3b64a99 100644
--- a/drivers/atm/he.c
+++ b/drivers/atm/he.c
@@ -394,6 +394,11 @@ he_init_one(struct pci_dev *pci_dev, const struct pci_device_id *pci_ent)
 	he_dev->atm_dev->dev_data = he_dev;
 	atm_dev->dev_data = he_dev;
 	he_dev->number = atm_dev->number;
+#ifdef USE_TASKLET
+	tasklet_init(&he_dev->tasklet, he_tasklet, (unsigned long) he_dev);
+#endif
+	spin_lock_init(&he_dev->global_lock);
+
 	if (he_start(atm_dev)) {
 		he_stop(he_dev);
 		err = -ENODEV;
@@ -1173,11 +1178,6 @@ he_start(struct atm_dev *dev)
 	if ((err = he_init_irq(he_dev)) != 0)
 		return err;
 
-#ifdef USE_TASKLET
-	tasklet_init(&he_dev->tasklet, he_tasklet, (unsigned long) he_dev);
-#endif
-	spin_lock_init(&he_dev->global_lock);
-
 	/* 4.11 enable pci bus controller state machines */
 	host_cntl |= (OUTFF_ENB | CMDFF_ENB |
 				QUICK_RD_RETRY | QUICK_WR_RETRY | PERR_INT_ENB);
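The ordering problem this patch addresses can be shown with a toy model. This is a sketch only: struct toy_dev and the functions are invented, the "interrupt" is an ordinary function call made at the point where a shared line could fire, and a boolean stands in for the lock having been initialized.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device; field names are illustrative, not from drivers/atm/he.c. */
struct toy_dev {
	bool lock_ready;	/* stands in for spin_lock_init() having run */
	bool irq_registered;	/* stands in for the IRQ having been requested */
};

/* Returns 0 if the handler ran safely, -1 if it used an unready lock. */
static int toy_irq(const struct toy_dev *d)
{
	return d->lock_ready ? 0 : -1;
}

/* Old ordering: request the IRQ first, initialize the lock afterwards. */
static int toy_start_old(struct toy_dev *d)
{
	int rc;

	d->irq_registered = true;
	rc = toy_irq(d);	/* another device on the shared line fires */
	d->lock_ready = true;
	return rc;
}

/* New ordering: the lock is ready before the IRQ can be delivered. */
static int toy_start_new(struct toy_dev *d)
{
	d->lock_ready = true;
	d->irq_registered = true;
	return toy_irq(d);	/* same early interrupt, now harmless */
}

static void toy_start_selfcheck(void)
{
	struct toy_dev a = { false, false };
	struct toy_dev b = { false, false };

	assert(toy_start_old(&a) == -1);
	assert(toy_start_new(&b) == 0);
}
```

With a dedicated interrupt line the old ordering usually goes unnoticed, because the device itself does not raise interrupts that early; a shared line removes that guarantee.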
Re: [PATCHv2 2/3] TC: PSPacer qdisc module
I am sorry I sent an old patch. Please see this one. -- [PATCHv2 2/3] TC: PSPacer qdisc module This patch includes the PSPacer (Precise Software Pacer) qdisc tc part, which achieves precise transmission bandwidth control. You can find more information at the project web page (http://www.gridmpi.org/gridtcp.jsp). Signed-off-by: Ryousei Takano <[EMAIL PROTECTED]> --- include/linux/pkt_sched.h | 37 + tc/Makefile |1 + tc/q_psp.c| 199 + 3 files changed, 237 insertions(+), 0 deletions(-) create mode 100644 tc/q_psp.c diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h index 268c515..ed21e26 100644 --- a/include/linux/pkt_sched.h +++ b/include/linux/pkt_sched.h @@ -430,6 +430,43 @@ enum { #define TCA_ATM_MAX(__TCA_ATM_MAX - 1) +/* Precise Software Pacer section */ + +#define TC_PSP_MAXDEPTH (8) + +typedef long long gapclock_t; + +enum { + MODE_NORMAL = 0, + MODE_STATIC = 1, +}; + +struct tc_psp_copt +{ + __u32 level; + __u32 mode; + __u32 rate; +}; + +struct tc_psp_qopt +{ + __u32 defcls; + __u32 rate; +}; + +struct tc_psp_xstats +{ + __u32 bytes; /* gap packet statistics */ + __u32 packets; +}; + +enum +{ + TCA_PSP_UNSPEC, + TCA_PSP_COPT, + TCA_PSP_QOPT, +}; + /* Network emulator */ enum diff --git a/tc/Makefile b/tc/Makefile index a715566..836df9d 100644 --- a/tc/Makefile +++ b/tc/Makefile @@ -12,6 +12,7 @@ TCMODULES += q_prio.o TCMODULES += q_tbf.o TCMODULES += q_cbq.o TCMODULES += q_rr.o +TCMODULES += q_psp.o TCMODULES += q_netem.o TCMODULES += f_rsvp.o TCMODULES += f_u32.o diff --git a/tc/q_psp.c b/tc/q_psp.c new file mode 100644 index 000..1806b66 --- /dev/null +++ b/tc/q_psp.c @@ -0,0 +1,199 @@ +/* + * q_psp.c PSPacer: Precise Software Pacer + * + * Copyright (C) 2004-2007 National Institute of Advanced + * Industrial Science and Technology (AIST), Japan. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Authors:Ryousei Takano, <[EMAIL PROTECTED]> + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "utils.h" +#include "tc_util.h" + +static void explain(void) +{ + fprintf(stderr, +"Usage: ... qdisc add ... psp [ default N ] [rate RATE]\n" +" default minor id of class to which unclassified packets are sent {0}\n" +" rate physical interface bandwidth\n\n" +"... class add ... psp mode M [ rate MBPS ]\n" +" mode target rate estimation method (NORMAL=0 STATIC=1) {0}\n" +" rate rate allocated to this class\n"); +} + +static void explain1(char *arg) +{ + fprintf(stderr, "Illegal \"%s\"\n", arg); + explain(); +} + + +static int psp_parse_opt(struct qdisc_util *qu, int argc, char **argv, +struct nlmsghdr *n) +{ + struct tc_psp_qopt qopt; + struct rtattr *tail; + memset(&qopt, 0, sizeof(qopt)); + + while (argc > 0) { + if (matches(*argv, "rate") == 0) { + NEXT_ARG(); + if (get_rate(&qopt.rate, *argv)) { + explain1("rate"); + return -1; + } + } else if (matches(*argv, "default") == 0) { + NEXT_ARG(); + if (get_u32(&qopt.defcls, *argv, 16)) { + explain1("default"); + return -1; + } + } else if (matches(*argv, "help") == 0) { + explain(); + return -1; + } else { + fprintf(stderr, "What is \"%s\"?\n", *argv); + explain(); + return -1; + } + argc--; + argv++; + } + + tail = NLMSG_TAIL(n); + addattr_l(n, 1024, TCA_OPTIONS, NULL, 0); + addattr_l(n, 2024, TCA_OPTIONS, &qopt, NLMSG_ALIGN(sizeof(qopt))); + tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail; + return 0; +} + +static int psp_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt) +{ + struct rtattr *tb[TCA_PSP_QOPT+1]; + struct tc_psp_copt *copt; + struct tc_psp_qopt *qopt; + 
SPRINT_BUF(b); + + if (opt == NULL) + return 0; + + memset(tb, 0, sizeof(tb)); + parse_rtattr_nested(tb, TCA_PSP_QOPT, opt); + + if (tb[TCA_PSP_COPT]) { + copt = RTA_DATA(tb[TCA_PSP_COPT]); + if (RTA_PAYLOAD(tb[TCA_PSP_COPT]) < sizeo
Re: [XFRM]: Fix leak of expired xfrm_states
On Mon, Nov 26, 2007 at 04:56:01PM +0100, Patrick McHardy wrote:
>
> That should work as long as we keep the del_timer_sync to avoid
> a use-after-free. It seems a bit fragile though.

Well we're relying on the del_timer_sync already to avoid the
ref count on the timer. Otherwise if the admin deletes the SA
while the timer is running it'll go up in smoke too.

If you look in the history you'll find that the same patch that
removed the ref count on the timer introduced the call to
del_timer_sync :)

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[git patches] net driver fixes
Please pull from 'upstream-linus' branch of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream-linus to receive the following updates: drivers/net/Kconfig|2 +- drivers/net/amd8111e.c |6 ++ drivers/net/bfin_mac.c |2 +- drivers/net/ehea/ehea.h|2 +- drivers/net/ehea/ehea_main.c | 20 drivers/net/ehea/ehea_qmr.h|4 ++-- drivers/net/forcedeth.c| 38 +- drivers/net/ibm_newemac/core.c | 31 --- drivers/net/ibm_newemac/core.h |1 + drivers/net/sky2.c |6 +- drivers/net/smc911x.c | 19 +-- drivers/net/smc911x.h |2 +- drivers/net/smc91x.h |2 +- drivers/net/tulip/dmfe.c |4 ++-- drivers/net/usb/dm9601.c |2 +- include/linux/pci_ids.h|4 16 files changed, 84 insertions(+), 61 deletions(-) Ayaz Abdulla (2): forcedeth: new mcp79 pci ids forcedeth boot delay fix Benjamin Herrenschmidt (1): ibm_newemac: Fix possible lockup on close Jeff Garzik (1): dmfe: checkpatch fix (add whitespace) Jiri Bohac (1): amd8111e: don't call napi_enable if configured w/o NAPI Maxim Levitsky (1): NET: dmfe: don't access configuration space in D3 state Mike Frysinger (1): Blackfin SMC91x Driver: punt CONFIG_BFIN -- we already have CONFIG_BLACKFIN Peter Korsgaard (4): dm9601: Fix printk smc911x: Fix undefined CONFIG_ symbol warning smc911x: Fix unused variable warning. 
smc911x: Fix multicast handling Stephen Hemminger (1): sky2: disable rx checksum on Yukon XL Thomas Klein (2): ehea: Improve tx packets counting ehea: Reworked rcv queue handling to log only fatal errors Vitja Makarov (1): Blackfin EMAC driver: fix bug - NAT doesn't work with bfin_mac driver diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig index e8d69b0..1437b37 100644 --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -888,7 +888,7 @@ config SMC91X tristate "SMC 91C9x/91C1xxx support" select CRC32 select MII - depends on ARM || REDWOOD_5 || REDWOOD_6 || M32R || SUPERH || SOC_AU1X00 || BFIN + depends on ARM || REDWOOD_5 || REDWOOD_6 || M32R || SUPERH || SOC_AU1X00 || BLACKFIN help This is a driver for SMC's 91x series of Ethernet chipsets, including the SMC91C94 and the SMC91C111. Say Y if you want it diff --git a/drivers/net/amd8111e.c b/drivers/net/amd8111e.c index eebf5bb..e7fdd81 100644 --- a/drivers/net/amd8111e.c +++ b/drivers/net/amd8111e.c @@ -1340,7 +1340,9 @@ static int amd8111e_close(struct net_device * dev) struct amd8111e_priv *lp = netdev_priv(dev); netif_stop_queue(dev); +#ifdef CONFIG_AMD8111E_NAPI napi_disable(&lp->napi); +#endif spin_lock_irq(&lp->lock); @@ -1372,7 +1374,9 @@ static int amd8111e_open(struct net_device * dev ) dev->name, dev)) return -EAGAIN; +#ifdef CONFIG_AMD8111E_NAPI napi_enable(&lp->napi); +#endif spin_lock_irq(&lp->lock); @@ -1380,7 +1384,9 @@ static int amd8111e_open(struct net_device * dev ) if(amd8111e_restart(dev)){ spin_unlock_irq(&lp->lock); +#ifdef CONFIG_AMD8111E_NAPI napi_disable(&lp->napi); +#endif if (dev->irq) free_irq(dev->irq, dev); return -ENOMEM; diff --git a/drivers/net/bfin_mac.c b/drivers/net/bfin_mac.c index 084acfd..f0f8516 100644 --- a/drivers/net/bfin_mac.c +++ b/drivers/net/bfin_mac.c @@ -676,7 +676,7 @@ static void bf537mac_rx(struct net_device *dev) skb->protocol = eth_type_trans(skb, dev); #if defined(BFIN_MAC_CSUM_OFFLOAD) skb->csum = current_rx_ptr->status.ip_payload_csum; - 
skb->ip_summed = CHECKSUM_PARTIAL; + skb->ip_summed = CHECKSUM_COMPLETE; #endif netif_rx(skb); diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h index f78e5bf..5f82a46 100644 --- a/drivers/net/ehea/ehea.h +++ b/drivers/net/ehea/ehea.h @@ -40,7 +40,7 @@ #include #define DRV_NAME "ehea" -#define DRV_VERSION"EHEA_0080" +#define DRV_VERSION"EHEA_0083" /* eHEA capability flags */ #define DLPAR_PORT_ADD_REM 1 diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c index f0319f1..869e160 100644 --- a/drivers/net/ehea/ehea_main.c +++ b/drivers/net/ehea/ehea_main.c @@ -136,7 +136,7 @@ static struct net_device_stats *ehea_get_stats(struct net_device *dev) struct ehea_port *port = netdev_priv(dev); struct net_device_stats *stats = &port->stats; struct hcp_ehea_port_cb2 *cb2; - u64 hret, rx_packets; + u64 hret, rx_packets, tx_packets; int i; memset(stats, 0, sizeof(*stats)); @@ -162,7 +162,11 @@ static struct net
Re: [XFRM]: Fix leak of expired xfrm_states
Herbert Xu wrote:
> On Mon, Nov 26, 2007 at 04:51:42PM +0100, Patrick McHardy wrote:
>> It actually won't get freed at all currently since nothing is
>> calling __xfrm_state_destroy(). __xfrm_state_delete() uses
>> __xfrm_state_put(), which only decrements the refcount, but
>> doesn't perform destruction.
>>
>> This is visible when looking at the xfrm[46]_mode_{tunnel,transport}
>> module reference counts, they climb higher and higher over time.
>
> Oh I see. How about just removing those double underscores then?

That should work as long as we keep the del_timer_sync to avoid a
use-after-free. It seems a bit fragile though.
Re: [XFRM]: Fix leak of expired xfrm_states
On Mon, Nov 26, 2007 at 04:51:42PM +0100, Patrick McHardy wrote:
>
> It actually won't get freed at all currently since nothing is
> calling __xfrm_state_destroy(). __xfrm_state_delete() uses
> __xfrm_state_put(), which only decrements the refcount, but
> doesn't perform destruction.
>
> This is visible when looking at the xfrm[46]_mode_{tunnel,transport}
> module reference counts, they climb higher and higher over time.

Oh I see. How about just removing those double underscores then?

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [XFRM]: Fix leak of expired xfrm_states
Herbert Xu wrote:
> On Mon, Nov 26, 2007 at 04:05:27PM +0100, Patrick McHardy wrote:
>> This patch fixes a xfrm_state leak, which appears to be a
>> regression from the reference count simplifications.
>
> I was going to say this was a good find :)
>
> But digging deeper it seems that it might not be a bug after all.
> Even though the ref count on x may now drop to zero, it won't be
> freed until del_timer_sync returns which should be sufficient, no?

It actually won't get freed at all currently since nothing is
calling __xfrm_state_destroy(). __xfrm_state_delete() uses
__xfrm_state_put(), which only decrements the refcount, but
doesn't perform destruction.

This is visible when looking at the xfrm[46]_mode_{tunnel,transport}
module reference counts, they climb higher and higher over time.
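For readers following the thread, here is a minimal userspace sketch of the distinction being discussed: a "raw" put that only decrements, versus a put that also destroys the object when the count reaches zero. All toy_* names are hypothetical stand-ins invented for illustration; this is not the kernel's xfrm code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct xfrm_state; not the kernel definition. */
struct toy_state {
	int refcnt;
	int *destroyed;		/* set to 1 when the destructor runs */
};

static void toy_state_destroy(struct toy_state *x)
{
	*x->destroyed = 1;
	free(x);
}

/* Like the double-underscore put described above: decrement only. */
static void toy_state_put_raw(struct toy_state *x)
{
	x->refcnt--;
}

/* Like the plain put: decrement, and destroy on reaching zero. */
static void toy_state_put(struct toy_state *x)
{
	assert(x->refcnt > 0);
	if (--x->refcnt == 0)
		toy_state_destroy(x);
}

/* Allocate a state holding one reference; returns NULL on failure. */
static struct toy_state *toy_state_alloc(int *destroyed)
{
	struct toy_state *x = malloc(sizeof(*x));

	if (!x)
		return NULL;
	x->refcnt = 1;
	x->destroyed = destroyed;
	*destroyed = 0;
	return x;
}
```

If the last reference is dropped with toy_state_put_raw(), the count reaches zero but the destructor never runs, which is the kind of leak reported in this thread.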
[PATCHv2 2/3] TC: PSPacer qdisc module
This patch includes the PSPacer (Precise Software Pacer) qdisc tc part, which achieves precise transmission bandwidth control. You can find more information at the project web page (http://www.gridmpi.org/gridtcp.jsp). Signed-off-by: Ryousei Takano <[EMAIL PROTECTED]> --- include/linux/pkt_sched.h | 38 + tc/Makefile |1 + tc/q_psp.c| 199 + 3 files changed, 238 insertions(+), 0 deletions(-) create mode 100644 tc/q_psp.c diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h index 268c515..c708082 100644 --- a/include/linux/pkt_sched.h +++ b/include/linux/pkt_sched.h @@ -430,6 +430,44 @@ enum { #define TCA_ATM_MAX(__TCA_ATM_MAX - 1) +/* Precise Software Pacer section */ + +#define TC_PSP_MAXDEPTH (8) + +typedef long long gapclock_t; + +enum { + MODE_NORMAL = 0, + MODE_STATIC = 1, +}; + +struct tc_psp_copt +{ + __u32 level; + __u32 mode; + __u32 rate; +}; + +struct tc_psp_qopt +{ + __u32 defcls; + __u32 rate; + __u32 direct_pkts; +}; + +struct tc_psp_xstats +{ + __u32 bytes; /* gap packet statistics */ + __u32 packets; +}; + +enum +{ + TCA_PSP_UNSPEC, + TCA_PSP_COPT, + TCA_PSP_QOPT, +}; + /* Network emulator */ enum diff --git a/tc/Makefile b/tc/Makefile index a715566..836df9d 100644 --- a/tc/Makefile +++ b/tc/Makefile @@ -12,6 +12,7 @@ TCMODULES += q_prio.o TCMODULES += q_tbf.o TCMODULES += q_cbq.o TCMODULES += q_rr.o +TCMODULES += q_psp.o TCMODULES += q_netem.o TCMODULES += f_rsvp.o TCMODULES += f_u32.o diff --git a/tc/q_psp.c b/tc/q_psp.c new file mode 100644 index 000..1806b66 --- /dev/null +++ b/tc/q_psp.c @@ -0,0 +1,199 @@ +/* + * q_psp.c PSPacer: Precise Software Pacer + * + * Copyright (C) 2004-2007 National Institute of Advanced + * Industrial Science and Technology (AIST), Japan. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ * + * Authors:Ryousei Takano, <[EMAIL PROTECTED]> + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "utils.h" +#include "tc_util.h" + +static void explain(void) +{ + fprintf(stderr, +"Usage: ... qdisc add ... psp [ default N ] [rate RATE]\n" +" default minor id of class to which unclassified packets are sent {0}\n" +" rate physical interface bandwidth\n\n" +"... class add ... psp mode M [ rate MBPS ]\n" +" mode target rate estimation method (NORMAL=0 STATIC=1) {0}\n" +" rate rate allocated to this class\n"); +} + +static void explain1(char *arg) +{ + fprintf(stderr, "Illegal \"%s\"\n", arg); + explain(); +} + + +static int psp_parse_opt(struct qdisc_util *qu, int argc, char **argv, +struct nlmsghdr *n) +{ + struct tc_psp_qopt qopt; + struct rtattr *tail; + memset(&qopt, 0, sizeof(qopt)); + + while (argc > 0) { + if (matches(*argv, "rate") == 0) { + NEXT_ARG(); + if (get_rate(&qopt.rate, *argv)) { + explain1("rate"); + return -1; + } + } else if (matches(*argv, "default") == 0) { + NEXT_ARG(); + if (get_u32(&qopt.defcls, *argv, 16)) { + explain1("default"); + return -1; + } + } else if (matches(*argv, "help") == 0) { + explain(); + return -1; + } else { + fprintf(stderr, "What is \"%s\"?\n", *argv); + explain(); + return -1; + } + argc--; + argv++; + } + + tail = NLMSG_TAIL(n); + addattr_l(n, 1024, TCA_OPTIONS, NULL, 0); + addattr_l(n, 2024, TCA_OPTIONS, &qopt, NLMSG_ALIGN(sizeof(qopt))); + tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail; + return 0; +} + +static int psp_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt) +{ + struct rtattr *tb[TCA_PSP_QOPT+1]; + struct tc_psp_copt *copt; + struct tc_psp_qopt *qopt; + SPRINT_BUF(b); + + if (opt == NULL) + return 0; + + memset(tb, 0, sizeof(tb)); + parse_rtattr_nested(tb, TCA_PSP_QOPT, opt); + + if (tb[TCA_PSP_COPT]) { + copt = RTA_DATA(tb[TCA_PSP_COPT]); + if (RTA_PAYLOAD(tb[TCA_PSP_COPT]) < sizeof(*copt)) + return -1; + fprintf
[PATCHv2 3/3] TC: PSPacer man page
This patch includes the man page of the PSPacer (Precise Software Pacing) qdisc module. Signed-off-by: Ryousei Takano <[EMAIL PROTECTED]> --- man/man8/tc-psp.8 | 166 + 1 files changed, 166 insertions(+), 0 deletions(-) create mode 100644 man/man8/tc-psp.8 diff --git a/man/man8/tc-psp.8 b/man/man8/tc-psp.8 new file mode 100644 index 000..a6e26bf --- /dev/null +++ b/man/man8/tc-psp.8 @@ -0,0 +1,166 @@ +.TH PSP 8 "13 October 2007" "iproute2" "Linux" +.SH NAME +PSP \- Precise Software Pacer +.SH SYNOPSIS +.B tc qdisc ... dev +dev +.B ( parent +classid +.B | root) [ handle +major: +.B ] psp [ default +minor-id +.B ] [ rate +rate +.B ] + +.B tc class ... dev +dev +.B parent +major:[minor] +.B [ classid +major:minor +.B ] psp rate +rate +.B ] [ mode +mode +.B ] + +.SH DESCRIPTION +Precise Software Pacer (PSPacer) is a classful queuing discipline +which controls traffic with +.BR tc (8) +command. +PSP achieves a precise pacing per class. + +.SH GAP PACKET +The key to realizing precise pacing is to control the starting time of +the transmission of each packet. We propose a simple yet accurate +mechanism to trigger the transmission of a packet. That is, to insert +a gap packet between the real packets. The gap packet produces a gap +between sequentially transmitted real packets. +We employ a PAUSE packet as a gap packet. A PAUSE packet is defined in +the IEEE 802.3x flow control. + +By changing the gap packet size, the starting time of +the next real packet transmission can be precisely controlled. +For example, to control a half rate transmission, a gap packet is inserted +between every real packet where the gap packet size is the same as +that of the real packets. + +.SH BYTE CLOCK SCHEDULING +Packet transmission is scheduled based on the inter-packet gap of each +class (i.e. target rate). +If the network has multiple bottleneck links, it is necessary to +schedule the order of packet transmission and the packet interval. 
+ +PSPacer maintains a virtual clock which is counted by the total transmitted +byte instead of real time clock. Each sub-class has its local clock +which is used to make decision whether to send a packet or not. +If there is an idle time, a gap packet is inserted. + +.SH CLASSIFICATION +Within one PSP instance, many classes may exist. Each of these classes +contains its own qdisc. + +When enqueuing a packet, PSP starts at the root and uses various methods to +determine which class should be used to obtain the data to be enqueued. + +In the standard configuration, this process is rather easy. +At each node we look for an instruction, and then go to the class the +instruction refers to. If the class found is a leaf-node (without +children), we enqueue the packet there. If it is not yet a leaf node, we do +the same thing over again starting from that node. + +The following actions are performed in order at each node we visit, until +move to another node, or terminates the process. +.TP +(i) +Consult filters attached to the class. If we are at a leaf node, we are done. +Otherwise, restart. +.TP +(ii) +If none of the above returned with an instruction, send to the default class. +.P +./ This algorithm makes sure that a packet always ends up somewhere, even while +./ you are busy building your configuration. + +.SH QDISC +The root of a PSP qdisc class tree has the following parameters: + +.TP +parent major:minor | root +This mandatory parameter determines the place of the PSP instance, +either at the +.B root +of an interface or within an existing class. +.TP +handle major: +Like all other qdiscs, the PSP can be assigned a handle. It should consist only +of a major number, followed by a colon. Optional, but it is very useful +if classes will be generated within this qdisc. +.TP +default minor-id +Unclassified traffic is sent to the class with this minor-id. +.TP +rate rate +Optional. You can explicitly specify the maximum transmission rate. 
+For example, if a 33MHz/32bit PCI bus is used to connect a Gigabit +Ethernet network interface, the bottleneck is the PCI bus, and the +system can not transmit packets at the rate of gigabit/sec. + +.SH CLASSES +Classes have a host of parameters to configure their operation. + +.TP +parent major:minor +Specifies the place of this class within the hierarchy. If attached directly +to a qdisc and not to another class, minor can be omitted. Mandatory. +.TP +classid major:minor +Like qdiscs, classes can be named. The major number must be equal to the +major number of the qdisc to which it belongs. Optional, but needed if this +class is going to have children. +.TP +rate rate +Maximum transmission rate this class including all its children are assigned. +Optional, but required if this class is set to mode 1 (static target rate). +.TP +mode mode +Range from 0 to 1. The mode 0 is without pacing. The mode 1 is +pacing based on static target rate estimation.
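The half-rate example in the GAP PACKET section of the man page generalizes to a simple proportion. The sketch below is an illustration only: gap_bytes() is a hypothetical helper, not code from the PSPacer patches, and it ignores the preamble, inter-frame gap, FCS and minimum gap packet size that a real implementation would have to account for.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes of gap to insert after one real packet so
 * that the real traffic averages target_rate on a link that always
 * transmits at max_rate. Both rates must use the same unit.
 *
 * A packet of pkt_len bytes should occupy the share target/max of the
 * wire, so pkt_len + gap = pkt_len * max / target.
 */
static uint64_t gap_bytes(uint64_t pkt_len, uint64_t max_rate,
			  uint64_t target_rate)
{
	assert(target_rate > 0 && target_rate <= max_rate);
	return pkt_len * max_rate / target_rate - pkt_len;
}
```

With a target of half the link rate, the gap equals the packet size, which matches the example in the man page; at full rate the gap is zero.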
[PATCHv2 net-2.6.25 0/3] PSPacer qdisc module
Hi all,

This is the 2nd version of the PSPacer patches.

PSPacer (Precise Software Pacer) is a qdisc module which realizes
precise transmission bandwidth control. It smooths the bursty traffic
that TCP often generates, without any special hardware.
For your information, please see my previous post:
http://marc.info/?l=linux-netdev&m=119570861526290&w=2

Changes:
* checked by the checkpatch.pl script.
* introduced struct gaphdr.
* removed the HTB-way of using a "direct class".
* removed unnecessary skb_reserve() and magic values in alloc_gap_packet().
* added a proper check when skb_clone() fails in psp_dequeue().
* used qdisc_tree_decrease_qlen() in psp_graft().

Usage:
# tc qdisc add dev eth0 root handle 1: psp default 1
# tc class add dev eth0 parent 1: classid 1:1 psp rate 500mbit
# tc qdisc add dev eth0 parent 1:1 handle 10: pfifo

Patches:
[1/3] PSPacer kernel part
[2/3] PSPacer tc part
[3/3] PSPacer tc man page

Best regards,
Ryousei Takano
[PATCHv2 1/3] NET_SCHED: PSPacer qdisc module
This patch includes the PSPacer (Precise Software Pacer) qdisc module, which achieves precise transmission bandwidth control. You can find more information at the project web page (http://www.gridmpi.org/gridtcp.jsp). Signed-off-by: Ryousei Takano <[EMAIL PROTECTED]> --- include/linux/pkt_sched.h | 37 ++ net/sched/Kconfig |9 + net/sched/Makefile|1 + net/sched/sch_psp.c | 958 + 4 files changed, 1005 insertions(+), 0 deletions(-) create mode 100644 net/sched/sch_psp.c diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h index 919af93..fda41cd 100644 --- a/include/linux/pkt_sched.h +++ b/include/linux/pkt_sched.h @@ -430,6 +430,43 @@ enum { #define TCA_ATM_MAX(__TCA_ATM_MAX - 1) +/* Precise Software Pacer section */ + +#define TC_PSP_MAXDEPTH (8) + +typedef long long gapclock_t; + +enum { + MODE_NORMAL = 0, + MODE_STATIC = 1, +}; + +struct tc_psp_copt +{ + __u32 level; + __u32 mode; + __u32 rate; +}; + +struct tc_psp_qopt +{ + __u32 defcls; + __u32 rate; +}; + +struct tc_psp_xstats +{ + __u32 bytes; /* gap packet statistics */ + __u32 packets; +}; + +enum +{ + TCA_PSP_UNSPEC, + TCA_PSP_COPT, + TCA_PSP_QOPT, +}; + /* Network emulator */ enum diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 9c15c48..ec40e43 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -184,6 +184,15 @@ config NET_SCH_DSMARK To compile this code as a module, choose M here: the module will be called sch_dsmark. +config NET_SCH_PSP + tristate "Precise Software Pacer (PSP)" + ---help--- + Say Y here if you want to include PSPacer module, which means + that you will be able to control precise pacing. + + To compile this driver as a module, choose M here: the + module will be called sch_psp. 
+ config NET_SCH_NETEM tristate "Network emulator (NETEM)" ---help--- diff --git a/net/sched/Makefile b/net/sched/Makefile index 81ecbe8..85425c2 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -27,6 +27,7 @@ obj-$(CONFIG_NET_SCH_TBF) += sch_tbf.o obj-$(CONFIG_NET_SCH_TEQL) += sch_teql.o obj-$(CONFIG_NET_SCH_PRIO) += sch_prio.o obj-$(CONFIG_NET_SCH_ATM) += sch_atm.o +obj-$(CONFIG_NET_SCH_PSP) += sch_psp.o obj-$(CONFIG_NET_SCH_NETEM)+= sch_netem.o obj-$(CONFIG_NET_CLS_U32) += cls_u32.o obj-$(CONFIG_NET_CLS_ROUTE4) += cls_route.o diff --git a/net/sched/sch_psp.c b/net/sched/sch_psp.c new file mode 100644 index 000..f475b50 --- /dev/null +++ b/net/sched/sch_psp.c @@ -0,0 +1,958 @@ +/* + * net/sched/sch_psp.c PSPacer: Precise Software Pacer + * + * Copyright (C) 2004-2007 National Institute of Advanced + * Industrial Science and Technology (AIST), Japan. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Authors:Ryousei Takano, <[EMAIL PROTECTED]> + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* PSPacer achieves precise rate regulation results, and no microscopic + * burst transmission which exceeds the limit is generated. + * + * The basic idea is that transmission timing can be precisely controlled, + * if packets are sent back-to-back at the wire rate. PSPacer controls + * the packet transmision intervals by inserting additional packets, + * called gap packets, between adjacent packets. The transmission interval + * can be controlled accurately by adjusting the number and size of the gap + * packets. PSPacer uses the 802.3x PAUSE frame as the gap packet. 
+ * + * For the purpose of adjusting the gap size, this Qdisc maintains a byte + * clock which is recorded by a total transmitted byte per connection. + * Each sub-class has a class local clock which is used to make decision + * whether to send a packet or not. If there is not any packets to send, + * gap packets are inserted. + * + * References: + * [1] R.Takano, T.Kudoh, Y.Kodama, M.Matsuda, H.Tezuka, and Y.Ishikawa, + * "Design and Evaluation of Precise Software Pacing Mechanisms for + * Fast Long-Distance Networks", PFLDnet2005. + * [2] http://www.gridmpi.org/gridtcp.jsp + */ + +#define HW_GAP (16)/* Preamble(8) + Inter Frame Gap(8) */ +#define FCS(4) /* Frame Check Sequence(4) */ +#define MIN_GAP (64) /* Minimum size of gap packet */ +#define MIN_TARGET_RATE (1000) /* 1 KB/s (= 8 Kbps) */ + +#define PSP_HSIZE (16) + +#define BIT2BYTE(n) ((n) >> 3) + +struct psp_class +{ + u32 classid;/* class id */ + int refcnt;
Re: [XFRM]: Fix leak of expired xfrm_states
On Mon, Nov 26, 2007 at 04:05:27PM +0100, Patrick McHardy wrote:
> This patch fixes a xfrm_state leak, which appears to be a
> regression from the reference count simplifications.

I was going to say this was a good find :)

But digging deeper it seems that it might not be a bug after all.
Even though the ref count on x may now drop to zero, it won't be
freed until del_timer_sync returns which should be sufficient, no?

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] ehea: Add kdump support
On Mon, 2007-11-26 at 19:16 +1100, Michael Ellerman wrote:
>
> Hi Thomas,
>
> I'm sorry, but this patch is all wrong IMHO.
>
> For kdump we have to assume that the kernel is fundamentally broken,
> we've panicked, so something bad has happened - every line of kernel
> code that is run decreases the chance that we'll successfully make it
> into the kdump kernel.

I agree with Michael.

> Solutions that might be better:
>
> a) if there are a finite number of handles and we can predict their
> values, just delete them all in the kdump kernel before the driver
> loads.

This is a good solution if handles are predefined.

> b) if there are a small & finite number of handles, save their values
> in a device tree property and have the kdump kernel read them and
> delete them before the driver loads.

Also good but is more complicated.

> c) if neither of those work, provide a minimal routine that _only_
> deletes the handles in the crashed kernel.
>

d) Can the driver or configuration method for the driver query PHYP to
determine if there are any pre-existing mappings...

Regards,
Luke
Re: [PATCH net-2.6.25] Fix pcounter build error without CONFIG_SMP
On Mon, Nov 26, 2007 at 01:23:05PM -0200, Arnaldo Carvalho de Melo wrote:
> On Mon, Nov 26, 2007 at 04:35:38PM +0200, Ilpo Järvinen wrote:
> >
> > I keep getting this build error and couldn't find anyone fixing
> > it in archives. ...Maybe all net developers except me build
> > just SMP kernels :-).
>
> Thank you! UP is for sissies anyway :-P
>
> Acked-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

Ilpo, you need a new machine :)

Thanks for fixing this! Patch applied to net-2.6.25.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 3/4] fib_hash: kmalloc + memset conversion to kzalloc
On Mon, Nov 26, 2007 at 10:24:03AM +0000, Joonwoo Park wrote:
> fib_hash: kmalloc + memset conversion to kzalloc
> fix to avoid memset entirely.

Patch applied.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 2/4] fib_semantics: kmalloc + memset conversion to kzalloc
On Mon, Nov 26, 2007 at 10:24:03AM +0000, Joonwoo Park wrote:
> fib_semantics: kmalloc + memset conversion to kzalloc
> fix to avoid memset entirely.

Also applied.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 1/4] xfrm_hash: kmalloc + memset conversion to kzalloc
On Mon, Nov 26, 2007 at 10:23:51AM +0000, Joonwoo Park wrote:
> 2007/11/26, Patrick McHardy <[EMAIL PROTECTED]>:
> > How about also switching vmalloc/get_free_pages to GFP_ZERO
> > and getting rid of the memset entirely while you're at it?
> >
>
> xfrm_hash: kmalloc + memset conversion to kzalloc
> fix to avoid memset entirely.

Patch applied. Thanks everyone!
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
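The conversion in this series replaces "allocate, then memset to zero" with a single zeroing allocation. A rough userspace analogy is sketched below, using malloc() plus memset() versus calloc(); in the kernel the corresponding call is kzalloc(). The toy_bucket type and the function names are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hash bucket; stands in for the tables in the patches. */
struct toy_bucket {
	void *first;
};

/* Before: allocate, then clear the memory in a second step. */
static struct toy_bucket *table_alloc_memset(size_t n)
{
	struct toy_bucket *t = malloc(n * sizeof(struct toy_bucket));

	if (!t)
		return NULL;
	memset(t, 0, n * sizeof(struct toy_bucket));
	return t;
}

/* After: ask the allocator for zeroed memory directly. */
static struct toy_bucket *table_alloc_zeroed(size_t n)
{
	return calloc(n, sizeof(struct toy_bucket));
}

/* Returns 1 if every byte in the buffer is zero, 0 otherwise. */
static int is_all_zero(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Both allocators hand back the same zeroed table; the second form simply leaves no window in which a caller could forget the memset.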
Re: [PATCH net-2.6.25] Fix pcounter build error without CONFIG_SMP
On Mon, Nov 26, 2007 at 04:35:38PM +0200, Ilpo Järvinen wrote:
>
> I keep getting this build error and couldn't find anyone fixing
> it in archives. ...Maybe all net developers except me build
> just SMP kernels :-).

Thank you! UP is for sissies anyway :-P

Acked-by: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>

> In file included from include/net/sock.h:50,
>                  from ipc/mqueue.c:35:
> include/linux/pcounter.h: In function 'pcounter_add':
> include/linux/pcounter.h:87: error: 'struct pcounter' has no
> member named 'value'
> make[1]: *** [ipc/mqueue.o] Error 1
> make: *** [ipc] Error 2
>
> Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
> ---
>  include/linux/pcounter.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/pcounter.h b/include/linux/pcounter.h
> index 620aade..9c4760a 100644
> --- a/include/linux/pcounter.h
> +++ b/include/linux/pcounter.h
> @@ -84,7 +84,7 @@ static inline int pcounter_getval(const struct pcounter *self)
>
>  static inline void pcounter_add(struct pcounter *self, int inc)
>  {
> -	self->value += inc;
> +	self->val += inc;
>  }
>
>  static inline int pcounter_getval(const struct pcounter *self)
> --
> 1.5.0.6
Re: [2.6 patch] ipv4/arp.c:arp_process(): remove bogus #ifdef mess
On Sun, Nov 25, 2007 at 04:30:03PM +0000, Adrian Bunk wrote:
>
> > > Please look at net/ipv4/arp.c:arp_process()
> > >
> > > Am I right that CONFIG_NET_ETHERNET=n and CONFIG_NETDEV_1000=y or
> > > CONFIG_NETDEV_10000=y will not be handled correctly there?
> > >
> > > And the best solution is to nuke all #ifdef's in this function and make
> > > the code unconditionally available?
> >
> > I think removing those specific ifdefs in arp_process()
> > is the best option, yes.
>
> Patch below.

Thanks Adrian. Patch applied to net-2.6. Do we need this for stable too?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 2/10] [SKBUFF]: Add skb_morph
On Mon, Nov 26, 2007 at 03:50:22PM +0900, Yasuyuki KOZAKAI wrote: > > The refcount of nfct is leaked by this function. As a result, > nf_conntrack_ipv6.ko cannot be unloaded after doing "ping6 -s 2000 ..." . > dst->dst and dst->secpath are also needed to be released, I think. > > Please consider to apply this patch. Good catch! Thanks for spotting this. I'm going to add the following patch to net-2.6. [SKBUFF]: Free old skb properly in skb_morph The skb_morph function only freed the data part of the dst skb, but leaked the auxiliary data such as the netfilter fields. This patch fixes this by moving the relevant parts from __kfree_skb to skb_release_all and calling it in skb_morph. It also makes kfree_skbmem static since it's no longer called anywhere else and it now no longer does skb_release_data. Thanks to Yasuyuki KOZAKAI for finding this problem and posting a patch for it. Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 91140fe..bddd50b 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -356,7 +356,6 @@ static inline struct sk_buff *alloc_skb_fclone(unsigned int size, return __alloc_skb(size, priority, 1, -1); } -extern void kfree_skbmem(struct sk_buff *skb); extern struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src); extern struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t priority); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 32d5826..5b4ce9b 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -275,12 +275,11 @@ static void skb_release_data(struct sk_buff *skb) /* * Free an skbuff by memory without cleaning the state. 
 */ -void kfree_skbmem(struct sk_buff *skb) +static void kfree_skbmem(struct sk_buff *skb) { struct sk_buff *other; atomic_t *fclone_ref; - skb_release_data(skb); switch (skb->fclone) { case SKB_FCLONE_UNAVAILABLE: kmem_cache_free(skbuff_head_cache, skb); @@ -307,16 +306,8 @@ void kfree_skbmem(struct sk_buff *skb) } } -/** - * __kfree_skb - private function - * @skb: buffer - * - * Free an sk_buff. Release anything attached to the buffer. - * Clean the state. This is an internal helper function. Users should - * always call kfree_skb - */ - -void __kfree_skb(struct sk_buff *skb) +/* Free everything but the sk_buff shell. */ +static void skb_release_all(struct sk_buff *skb) { dst_release(skb->dst); #ifdef CONFIG_XFRM @@ -340,7 +331,21 @@ void __kfree_skb(struct sk_buff *skb) skb->tc_verd = 0; #endif #endif + skb_release_data(skb); +} + +/** + * __kfree_skb - private function + * @skb: buffer + * + * Free an sk_buff. Release anything attached to the buffer. + * Clean the state. This is an internal helper function. Users should + * always call kfree_skb + */ +void __kfree_skb(struct sk_buff *skb) +{ + skb_release_all(skb); kfree_skbmem(skb); } @@ -441,7 +446,7 @@ static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb) */ struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src) { - skb_release_data(dst); + skb_release_all(dst); return __skb_clone(dst, src); } EXPORT_SYMBOL_GPL(skb_morph);
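To make the leak described in the commit message concrete, here is a small userspace model. It is only a sketch under simplifying assumptions: the toy_* types and functions are invented for illustration, each "attachment" is reduced to a bare reference count, and only two auxiliary fields are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* A bare reference count standing in for a refcounted kernel object. */
struct toy_ref {
	int count;
};

/* Toy buffer: two auxiliary references plus the data area. */
struct toy_skb {
	struct toy_ref *dst;	/* stands in for skb->dst */
	struct toy_ref *nfct;	/* stands in for the conntrack reference */
	struct toy_ref *data;	/* stands in for the shared data area */
};

static void toy_ref_put(struct toy_ref *r)
{
	if (r) {
		assert(r->count > 0);
		r->count--;
	}
}

static struct toy_ref *toy_ref_get(struct toy_ref *r)
{
	if (r)
		r->count++;
	return r;
}

/* Releases only the data part, like the old skb_morph did. */
static void toy_release_data(struct toy_skb *skb)
{
	toy_ref_put(skb->data);
	skb->data = NULL;
}

/* Releases the auxiliary references as well as the data part. */
static void toy_release_all(struct toy_skb *skb)
{
	toy_ref_put(skb->dst);
	skb->dst = NULL;
	toy_ref_put(skb->nfct);
	skb->nfct = NULL;
	toy_release_data(skb);
}

/* Overwrites every field of dst with a new reference taken from src. */
static void toy_clone_into(struct toy_skb *dst, const struct toy_skb *src)
{
	dst->dst = toy_ref_get(src->dst);
	dst->nfct = toy_ref_get(src->nfct);
	dst->data = toy_ref_get(src->data);
}

/* Old behaviour: the previous dst and nfct references are never put. */
static void toy_morph_leaky(struct toy_skb *dst, const struct toy_skb *src)
{
	toy_release_data(dst);
	toy_clone_into(dst, src);
}

/* Fixed behaviour: everything attached to dst is released first. */
static void toy_morph_fixed(struct toy_skb *dst, const struct toy_skb *src)
{
	toy_release_all(dst);
	toy_clone_into(dst, src);
}
```

In the leaky variant the old auxiliary references keep a nonzero count forever, which in the real report showed up as a module that could not be unloaded.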
Re: [PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property
On Mon, 2007-11-26 at 17:29 +0300, Vitaly Bordug wrote:
> fixed-link says: register new "Fixed/emulated PHY", i.e. PHY that
> not connected to the real MDIO bus.
>
> Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]>
> Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]>
>
> ---
>
>  Documentation/powerpc/booting-without-of.txt |    3 +
>  arch/powerpc/sysdev/fsl_soc.c                |   56 ++
>  2 files changed, 42 insertions(+), 17 deletions(-)
>
>
> diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
> index e9a3cb1..cf25070 100644
> --- a/Documentation/powerpc/booting-without-of.txt
> +++ b/Documentation/powerpc/booting-without-of.txt
> @@ -1254,6 +1254,9 @@ platforms are moved over to use the flattened-device-tree model.
>        services interrupts for this device.
>      - phy-handle : The phandle for the PHY connected to this ethernet
>        controller.
> +    - fixed-link : <a b c> where a is emulated phy id - choose any,
> +      but unique to the all specified fixed-links, b is duplex - 0 half,
> +      1 full, c is link speed - d#10/d#100/d#1000.

Good work! May I suggest adding a "d" to <a b c> where d is
flow control - 0 no, 1 yes

flow control or not just popped up here today so I got a use for it.

 Jocke
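As an illustration of the three cells described in the quoted documentation hunk (PHY id, duplex, speed), a parser might look roughly like the sketch below. This is not the fsl_soc.c code from the patch; toy_fixed_link and toy_parse_fixed_link are hypothetical names, and the validation rules are an assumption based only on the values listed in the documentation text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of a three-cell fixed-link property. */
struct toy_fixed_link {
	uint32_t phy_id;	/* a: emulated PHY id */
	int full_duplex;	/* b: 0 half, 1 full */
	uint32_t speed;		/* c: 10, 100 or 1000 */
};

/* Returns 0 on success, -1 if the cells do not match the description. */
static int toy_parse_fixed_link(const uint32_t *cells, size_t ncells,
				struct toy_fixed_link *out)
{
	assert(out != NULL);
	if (!cells || ncells != 3)
		return -1;
	if (cells[1] > 1)
		return -1;
	if (cells[2] != 10 && cells[2] != 100 && cells[2] != 1000)
		return -1;
	out->phy_id = cells[0];
	out->full_duplex = (int)cells[1];
	out->speed = cells[2];
	return 0;
}
```

A fourth cell for flow control, as suggested in the reply, would amount to one more field and one more range check in a parser of this shape.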
[XFRM]: Fix leak of expired xfrm_states
This patch fixes a xfrm_state leak, which appears to be a regression from the reference count simplifications. commit 817252c2a475371f9764883c7d0f0cde63b3cfe8 Author: Patrick McHardy <[EMAIL PROTECTED]> Date: Mon Nov 26 16:00:50 2007 +0100 [XFRM]: Fix leak of expired xfrm_states The xfrm_timer calls __xfrm_state_delete, which drops the final reference manually without triggering destruction of the state. Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 224b44e..11e9a48 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -416,7 +416,7 @@ static inline unsigned long make_jiffies(long secs) static void xfrm_timer_handler(unsigned long data) { - struct xfrm_state *x = (struct xfrm_state*)data; + struct xfrm_state *x = (struct xfrm_state*)data, *del = NULL; unsigned long now = get_seconds(); long next = LONG_MAX; int warn = 0; @@ -479,6 +479,8 @@ expired: goto resched; } + del = x; + xfrm_state_hold(del); err = __xfrm_state_delete(x); if (!err && x->id.spi) km_state_expired(x, 1, 0); @@ -488,6 +490,8 @@ expired: out: spin_unlock(&x->lock); + if (del) + xfrm_state_put(del); } static void xfrm_replay_timer_handler(unsigned long data);
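The shape of the fix, taking an extra reference under the lock and dropping it with a destroying put only after the lock is released, can be sketched in userspace as follows. The toy_* names are invented for illustration, the "lock" is just a flag, and the bare decrement stands in for the delete path that drops its reference without destroying the state.

```c
#include <assert.h>
#include <stddef.h>

struct toy_state {
	int refcnt;
	int locked;	/* 1 while the toy lock is held */
	int destroyed;	/* set by the destructor */
};

static void toy_destroy(struct toy_state *x)
{
	/* Destroying an object whose lock is still held would be a bug. */
	assert(!x->locked);
	x->destroyed = 1;
}

static void toy_hold(struct toy_state *x)
{
	x->refcnt++;
}

/* Destroying put: runs the destructor when the count reaches zero. */
static void toy_put(struct toy_state *x)
{
	if (--x->refcnt == 0)
		toy_destroy(x);
}

/* Mirrors the shape of the patched handler's expiry path. */
static void toy_timer_expired(struct toy_state *x)
{
	struct toy_state *del = NULL;

	x->locked = 1;

	del = x;
	toy_hold(del);	/* extra reference keeps x alive under the lock */
	x->refcnt--;	/* delete path: drops a reference, never destroys */

	x->locked = 0;
	if (del)
		toy_put(del);	/* final put, outside the lock */
}
```

Without the hold and the deferred put, the count in this model would reach zero inside the locked region with no destructor call at all.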
[PATCH net-2.6.25] Fix pcounter build error without CONFIG_SMP
I keep getting this build error and couldn't find anyone fixing it in archives. ...Maybe all net developers except me build just SMP kernels :-). In file included from include/net/sock.h:50, from ipc/mqueue.c:35: include/linux/pcounter.h: In function 'pcounter_add': include/linux/pcounter.h:87: error: 'struct pcounter' has no member named 'value' make[1]: *** [ipc/mqueue.o] Error 1 make: *** [ipc] Error 2 Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]> --- include/linux/pcounter.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/include/linux/pcounter.h b/include/linux/pcounter.h index 620aade..9c4760a 100644 --- a/include/linux/pcounter.h +++ b/include/linux/pcounter.h @@ -84,7 +84,7 @@ static inline int pcounter_getval(const struct pcounter *self) static inline void pcounter_add(struct pcounter *self, int inc) { - self->value += inc; + self->val += inc; } static inline int pcounter_getval(const struct pcounter *self) -- 1.5.0.6
[PATCH 3/3] [POWERPC] MPC8349E-mITX: Vitesse 7385 PHY is not connected to the MDIO bus
...thus use fixed-link to register proper "Fixed PHY" Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]> Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]> --- arch/powerpc/boot/dts/mpc8349emitx.dts | 11 ++- 1 files changed, 2 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/boot/dts/mpc8349emitx.dts b/arch/powerpc/boot/dts/mpc8349emitx.dts index 5072f6d..e2d00f1 100644 --- a/arch/powerpc/boot/dts/mpc8349emitx.dts +++ b/arch/powerpc/boot/dts/mpc8349emitx.dts @@ -115,14 +115,6 @@ reg = <1c>; device_type = "ethernet-phy"; }; - - /* Vitesse 7385 */ - phy1f: [EMAIL PROTECTED] { - interrupt-parent = < &ipic >; - interrupts = <12 8>; - reg = <1f>; - device_type = "ethernet-phy"; - }; }; [EMAIL PROTECTED] { @@ -159,7 +151,8 @@ local-mac-address = [ 00 00 00 00 00 00 ]; interrupts = <23 8 24 8 25 8>; interrupt-parent = < &ipic >; - phy-handle = < &phy1f >; + /* Vitesse 7385 isn't on the MDIO bus */ + fixed-link = <1 1 d#1000>; linux,network-index = <1>; }; - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] [NET] phy/fixed.c: rework to not duplicate PHY layer functionality
With that patch fixed.c now fully emulates MDIO bus, thus no need to duplicate PHY layer functionality. That, in turn, drastically simplifies the code, and drops down line count. As an additional bonus, now there is no need to register MDIO bus for each PHY, all emulated PHYs placed on the platform fixed MDIO bus. There is also no more need to pre-allocate PHYs via .config option, this is all now handled dynamically. p.s. Don't even try to understand patch content! Better: apply patch and look into resulting drivers/net/phy/fixed.c. Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]> Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]> --- drivers/net/phy/Kconfig | 32 +-- drivers/net/phy/fixed.c | 427 - include/linux/phy_fixed.h | 49 ++--- 3 files changed, 176 insertions(+), 332 deletions(-) diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig index 54b2ba9..a05c614 100644 --- a/drivers/net/phy/Kconfig +++ b/drivers/net/phy/Kconfig @@ -61,34 +61,12 @@ config ICPLUS_PHY Currently supports the IP175C PHY. config FIXED_PHY - tristate "Drivers for PHY emulation on fixed speed/link" + tristate "Drivers for MDIO Bus/PHY emulation on fixed speed/link" ---help--- - Adds the driver to PHY layer to cover the boards that do not have any PHY bound, - but with the ability to manipulate the speed/link in software. The relevant MII - speed/duplex parameters could be effectively handled in a user-specified function. - Currently tested with mpc866ads. 
- -config FIXED_MII_10_FDX - bool "Emulation for 10M Fdx fixed PHY behavior" - depends on FIXED_PHY - -config FIXED_MII_100_FDX - bool "Emulation for 100M Fdx fixed PHY behavior" - depends on FIXED_PHY - -config FIXED_MII_1000_FDX - bool "Emulation for 1000M Fdx fixed PHY behavior" - depends on FIXED_PHY - -config FIXED_MII_AMNT -int "Number of emulated PHYs to allocate " -depends on FIXED_PHY -default "1" ----help--- -Sometimes it is required to have several independent emulated -PHYs on the bus (in case of multi-eth but phy-less HW for instance). -This control will have specified number allocated for each fixed -PHY type enabled. + Adds the platform "fixed" MDIO Bus to cover the boards that use + PHYs that are not connected to the real MDIO bus. + + Currently tested with mpc866ads and mpc8349e-mitx. config MDIO_BITBANG tristate "Support for bitbanged MDIO buses" diff --git a/drivers/net/phy/fixed.c b/drivers/net/phy/fixed.c index 5619182..31719b3 100644 --- a/drivers/net/phy/fixed.c +++ b/drivers/net/phy/fixed.c @@ -1,362 +1,237 @@ /* - * drivers/net/phy/fixed.c + * Fixed MDIO bus (MDIO bus emulation with fixed PHYs) * - * Driver for fixed PHYs, when transceiver is able to operate in one fixed mode. + * Author: Vitaly Bordug <[EMAIL PROTECTED]> + * Anton Vorontsov <[EMAIL PROTECTED]> * - * Author: Vitaly Bordug - * - * Copyright (c) 2006 MontaVista Software, Inc. + * Copyright (c) 2006-2007 MontaVista Software, Inc. * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the * Free Software Foundation; either version 2 of the License, or (at your * option) any later version. 
- * */ + #include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include #include +#include +#include #include -#include #include #include -#include -#include -#include - -/* we need to track the allocated pointers in order to free them on exit */ -static struct fixed_info *fixed_phy_ptrs[CONFIG_FIXED_MII_AMNT*MAX_PHY_AMNT]; - -/*- - * If something weird is required to be done with link/speed, - * network driver is able to assign a function to implement this. - * May be useful for PHY's that need to be software-driven. - *-*/ -int fixed_mdio_set_link_update(struct phy_device *phydev, - int (*link_update) (struct net_device *, - struct fixed_phy_status *)) -{ - struct fixed_info *fixed; - - if (link_update == NULL) - return -EINVAL; - - if (phydev) { - if (phydev->bus) { - fixed = phydev->bus->priv; - fixed->link_update = link_update; - return 0; - } - } - return -EINVAL; -} +#define MII_REGS_NUM 29 -EXPORT_SYMBOL(fixed_mdio_set_link_update); +struct fixed_mdio_bus { + int irqs[PHY_MAX_ADDR]; + struct mii_bus mii_bus; + struct list_head phys;
[PATCH 2/3] [POWERPC] fsl_soc: add support for gianfar for fixed-link property
fixed-link says: register new "Fixed/emulated PHY", i.e. a PHY that is not connected to the real MDIO bus. Signed-off-by: Vitaly Bordug <[EMAIL PROTECTED]> Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]> --- Documentation/powerpc/booting-without-of.txt |3 + arch/powerpc/sysdev/fsl_soc.c| 56 ++ 2 files changed, 42 insertions(+), 17 deletions(-) diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index e9a3cb1..cf25070 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1254,6 +1254,9 @@ platforms are moved over to use the flattened-device-tree model. services interrupts for this device. - phy-handle : The phandle for the PHY connected to this ethernet controller. +- fixed-link : <a b c> where a is emulated phy id - choose any, + but unique to the all specified fixed-links, b is duplex - 0 half, + 1 full, c is link speed - d#10/d#100/d#1000. Recommended properties: diff --git a/arch/powerpc/sysdev/fsl_soc.c b/arch/powerpc/sysdev/fsl_soc.c index 3ace747..e06a5c9 100644 --- a/arch/powerpc/sysdev/fsl_soc.c +++ b/arch/powerpc/sysdev/fsl_soc.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -193,7 +194,6 @@ static const char *gfar_tx_intr = "tx"; static const char *gfar_rx_intr = "rx"; static const char *gfar_err_intr = "error"; - static int __init gfar_of_init(void) { struct device_node *np; @@ -277,29 +277,51 @@ static int __init gfar_of_init(void) gfar_data.interface = PHY_INTERFACE_MODE_MII; ph = of_get_property(np, "phy-handle", NULL); - phy = of_find_node_by_phandle(*ph); + if (ph == NULL) { + struct fixed_phy_status status = {}; + u32 *fixed_link; - if (phy == NULL) { - ret = -ENODEV; - goto unreg; - } + fixed_link = (u32*)of_get_property(np, "fixed-link",NULL); + if (!fixed_link) { + ret = -ENODEV; + goto unreg; + } - mdio = of_get_parent(phy); + status.link = 1; + status.duplex = fixed_link[1]; + status.speed = 
fixed_link[2]; + + ret = fixed_phy_add(PHY_POLL, fixed_link[0], &status); + if (ret) + goto unreg; + + gfar_data.bus_id = 0; + gfar_data.phy_id = fixed_link[0]; + } else { + phy = of_find_node_by_phandle(*ph); + + if (phy == NULL) { + ret = -ENODEV; + goto unreg; + } + + mdio = of_get_parent(phy); + + id = of_get_property(phy, "reg", NULL); + ret = of_address_to_resource(mdio, 0, &res); + if (ret) { + of_node_put(phy); + of_node_put(mdio); + goto unreg; + } + + gfar_data.phy_id = *id; + gfar_data.bus_id = res.start; - id = of_get_property(phy, "reg", NULL); - ret = of_address_to_resource(mdio, 0, &res); - if (ret) { of_node_put(phy); of_node_put(mdio); - goto unreg; } - gfar_data.phy_id = *id; - gfar_data.bus_id = res.start; - - of_node_put(phy); - of_node_put(mdio); - ret = platform_device_add_data(gfar_dev, &gfar_data, sizeof(struct - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
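The way the patch above turns the three fixed-link cells (PHY id, duplex, speed) into a link status can be sketched in plain C. This is a user-space illustration only: struct demo_fixed_status and demo_parse_fixed_link() are hypothetical names, and the range checks are an addition that the patch itself does not perform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the status fields the patch fills in. */
struct demo_fixed_status {
    int link;
    int speed;
    int duplex;
};

/*
 * Decode a three-cell fixed-link property: cell 0 is the emulated
 * PHY id, cell 1 the duplex (0 half, 1 full), cell 2 the speed
 * (10, 100 or 1000). Returns 0 on success, -1 if the property is
 * missing, too short, or outside the documented ranges.
 */
static int demo_parse_fixed_link(const uint32_t *cells, size_t ncells,
                                 uint32_t *phy_id,
                                 struct demo_fixed_status *st)
{
    if (cells == NULL || ncells < 3)
        return -1;
    if (cells[1] > 1)
        return -1;
    if (cells[2] != 10 && cells[2] != 100 && cells[2] != 1000)
        return -1;

    *phy_id = cells[0];
    st->link = 1;               /* a fixed link is always reported up */
    st->duplex = (int)cells[1];
    st->speed = (int)cells[2];
    return 0;
}
```

Feeding it the cells 1, 1, 1000 (the values used for the MPC8349E-mITX board in patch 3/3) yields PHY id 1, full duplex, 1000 Mbit/s.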
Re: DST_NOHASH flag and IPsec transformers routing tables - need some clarification
Hello, Thanks! This makes things a bit clearer. What I understand from the code (__dst_free() in dst.c) is that if we are freeing a dst_entry (by calling __dst_free()), then in case this dst_entry was created as an IPsec dst_entry, it has the DST_NOHASH flag set and it is the first in the list; all dst_entries in this list represent IPsec transformations, except the last one; and in such a case, dst_free traverses the list until the last dst_entry. And if this dst_entry was NOT created as an IPsec dst_entry, then we free it and only it. But I am sorry, I still don't understand the semantics: why is the name DST_NOHASH? NOHASH hints that we do not keep the entry in a hash. Is it really the case that such dst_entries, which are created by IPsec and so have the DST_NOHASH flag set, are not kept in the routing cache? Ian On Nov 26, 2007 4:51 AM, Herbert Xu <[EMAIL PROTECTED]> wrote: > Ian Brown <[EMAIL PROTECTED]> wrote: > > > > 3) in net/core/dst.c: > > struct dst_entry *dst_destroy(struct dst_entry * dst) > >{ > >... > >... > > int nohash = dst->flags & DST_NOHASH; > >... > >... > > } > > You were so close :) > > This is where it's used. Look harder. > > Cheers, > -- > Visit Openswan at http://www.openswan.org/ > Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> > Home Page: http://gondor.apana.org.au/~herbert/ > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt >
Re: [PATCH] Fix memory leak in inet_hashtables.h when NUMA is on
On Fri, Nov 23, 2007 at 07:13:11PM +0300, Pavel Emelyanov wrote: > The inet_ehash_locks_alloc() looks like this: > > #ifdef CONFIG_NUMA > if (size > PAGE_SIZE) > x = vmalloc(...); > else > #endif > x = kmalloc(...); > > Unlike it, the inet_ehash_locks_free() looks like this: > > #ifdef CONFIG_NUMA > if (size > PAGE_SIZE) > vfree(x); > else > #else > kfree(x); > #endif > > The error is obvious - if the NUMA is on and the size > is less than the PAGE_SIZE we leak the pointer (kfree is > inside the #else branch). > > Compiler doesn't warn us because after the kfree(x) there's > a "x = NULL" assignment, so here's another (minor?) bug: we > don't set x to NULL under certain circumstances. > > Boring explanation, I know... Patch explains it better. > > Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> Good catch! Applied to net-2.6. Thanks. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
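The preprocessor mistake Pavel describes is easy to reproduce outside the kernel. Below is a minimal user-space sketch of the corrected pairing, assuming malloc()/free() as stand-ins for vmalloc()/kmalloc() and vfree()/kfree(); DEMO_PAGE_SIZE, the demo_* helpers and the two counters are all hypothetical and exist only to make the chosen branch observable. The point is that the free side uses #endif where the buggy version had #else, so the small-allocation free is compiled in either way and the pointer is always cleared.

```c
#include <stddef.h>
#include <stdlib.h>

#define CONFIG_NUMA 1          /* pretend NUMA is enabled */
#define DEMO_PAGE_SIZE 4096u   /* hypothetical page size */

static int demo_big_frees;     /* counts "vfree"-style releases */
static int demo_small_frees;   /* counts "kfree"-style releases */

static void *demo_big_alloc(size_t size)   { return malloc(size); }
static void *demo_small_alloc(size_t size) { return malloc(size); }
static void demo_big_free(void *p)   { free(p); demo_big_frees++; }
static void demo_small_free(void *p) { free(p); demo_small_frees++; }

static void *demo_locks_alloc(size_t size)
{
#ifdef CONFIG_NUMA
    if (size > DEMO_PAGE_SIZE)
        return demo_big_alloc(size);
#endif
    return demo_small_alloc(size);
}

static void demo_locks_free(void **x, size_t size)
{
#ifdef CONFIG_NUMA
    if (size > DEMO_PAGE_SIZE)
        demo_big_free(*x);
    else
#endif                          /* #endif here, not #else */
        demo_small_free(*x);
    *x = NULL;                  /* reached on every path */
}
```

Mirroring the structure of the allocator in the release function is the design choice that matters: any size that took the small path on allocation takes the small path on release.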
Re: [PATCH net-2.6.25] Make macro to specify the ptype_base size
On Fri, Nov 23, 2007 at 04:47:36PM +0300, Pavel Emelyanov wrote: > Currently this size is 16, but as the comment says this > is so only because all the chains (except one) has the > length 1. I think, that some day this may change, so > growing this hash will be much easier. > > Besides, symbolic names are read better than magic constants. > > Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> Applied to net-2.6.25. Thanks! -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
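As an illustration of the "symbolic names instead of magic constants" argument, here is a small sketch. It assumes the table is indexed by the low bits of the protocol number taken in host byte order; the DEMO_* macros and demo_ptype_hash() are hypothetical names chosen for this example and are not taken from the patch.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* One place to change if the table ever needs to grow. */
#define DEMO_PTYPE_HASH_SIZE 16   /* must stay a power of two */
#define DEMO_PTYPE_HASH_MASK (DEMO_PTYPE_HASH_SIZE - 1)

/* Map a network-order protocol number to a bucket index. */
static unsigned int demo_ptype_hash(uint16_t type_be)
{
    return ntohs(type_be) & DEMO_PTYPE_HASH_MASK;
}
```

Because the mask is derived from the size, doubling DEMO_PTYPE_HASH_SIZE is a one-line change, which is the ease of growth the patch description is after.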
Re: [PATCH net-2.6.25] Name magic constants in sock_wake_async()
On Fri, Nov 23, 2007 at 04:43:11PM +0300, Pavel Emelyanov wrote: > The sock_wake_async() performs slightly different actions > depending on the "how" argument. Unfortunately this argument > only has numerical magic values. > > I propose to give names to these constants to help people > reading this function's callers understand what's going on > without looking into this function all the time. > > I suppose this is 2.6.25 material, but if it's not (or the > naming seems poor/bad/awful), I can rework it against the > current net-2.6 tree. > > Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]> I like this but I admit I'm no good with names either :) Patch applied to net-2.6.25. Thanks Pavel! -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] [NET]: Fix TX bug VLAN in VLAN
On Fri, Nov 23, 2007 at 12:12:52PM +, Joonwoo Park wrote: > This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=8766 > > Is it possible? > BUG((veth->h_vlan_proto != htons(ETH_P_8021Q)) && !(VLAN_DEV_INFO(dev)->flags > & VLAN_FLAG_REORDER_HDR)) > I'm afraid, queued packet before vconfig set_flag would do that. Yes, AF_PACKET would do that. So you should check both. Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
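The "check both" advice can be written down as a single predicate. This is only a sketch of the condition under discussion, not the driver code: demo_must_insert_tag() and the DEMO_* constants are hypothetical, with 0x8100 being the 802.1Q ethertype and 0x1 assumed as the bit for the reorder-header flag.

```c
#include <stdint.h>
#include <arpa/inet.h>

#define DEMO_ETH_P_8021Q      0x8100  /* 802.1Q ethertype */
#define DEMO_FLAG_REORDER_HDR 0x1     /* assumed flag bit */

/*
 * Decide whether the transmit path has to insert a VLAN header.
 * proto_be is the ethertype found in the frame (network order),
 * dev_flags are the VLAN device flags.
 */
static int demo_must_insert_tag(uint16_t proto_be, unsigned int dev_flags)
{
    return proto_be != htons(DEMO_ETH_P_8021Q) ||
           (dev_flags & DEMO_FLAG_REORDER_HDR) != 0;
}
```

On one reading of the thread, the first half covers frames that arrive without any 802.1Q header (the AF_PACKET case with REORDER off), and the second half covers VLAN-in-VLAN, where the frame already starts with a tag but the device still has to add its own.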
Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
On Mon, 26 Nov 2007, Joonwoo Park wrote: > 2007/11/26, Robert P. J. Day <[EMAIL PROTECTED]>: > > i realized that. but all you can say is that only amb_init() calls > > setup_dev() *currently*. when you're not looking, someone else might > > (for whatever reason) call setup_dev() from elsewhere, and *that* call > > might not zero that memory area. > > > > IMHO, the only safe transforms of kmalloc+memset -> kzalloc are those > > in which the flow of control is unmistakable and invariant. splitting > > that across a function call seems like a dangerous thing to do. > > (except, of course, in the case, where the kzalloc() is added inside > > the function -- then all callers are entitled to simplify *their* > > code. but that's different.) > > > > in any event, i just thought i'd point it out. if you're absolutely > > sure there will never be another call to setup_dev() from somewhere > > else, then, yes, it's safe. > > > > I understand your point and partially agree with you. > But isn't it an unfounded fear? i don't know, i just thought i'd mention it. if no one thinks it's an issue, it's certainly fine with me. rday Robert P. J. Day Linux Consulting, Training and Annoying Kernel Pedantry Waterloo, Ontario, CANADA http://crashcourse.ca
Re: [PATCH 6/7] [IPSEC]: Lock state when copying non-atomic fields to user-space
On Mon, Nov 26, 2007 at 11:18:45AM +0800, Herbert Xu wrote: > > I'm just going to revert this patch for 2.6.24 since we've lived > with this race for so long anyway. Actually, instead of reverting it completely I'm just going to remove the newly added locks which should be just as effective. This would reduce the churn in the code as we'd be putting most of it back soon anyway. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- diff --git a/net/key/af_key.c b/net/key/af_key.c index 3b2d864..878039b 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -1015,9 +1015,7 @@ static inline struct sk_buff *pfkey_xfrm_state2msg(struct xfrm_state *x) { struct sk_buff *skb; - spin_lock_bh(&x->lock); skb = __pfkey_xfrm_state2msg(x, 1, 3); - spin_unlock_bh(&x->lock); return skb; } diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index d41588d..e75dbdc 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -507,7 +507,6 @@ static int copy_to_user_state_extra(struct xfrm_state *x, struct xfrm_usersa_info *p, struct sk_buff *skb) { - spin_lock_bh(&x->lock); copy_to_user_state(x, p); if (x->coaddr) @@ -515,7 +514,6 @@ static int copy_to_user_state_extra(struct xfrm_state *x, if (x->lastused) NLA_PUT_U64(skb, XFRMA_LASTUSED, x->lastused); - spin_unlock_bh(&x->lock); if (x->aalg) NLA_PUT(skb, XFRMA_ALG_AUTH, alg_len(x->aalg), x->aalg); - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
2007/11/26, Robert P. J. Day <[EMAIL PROTECTED]>: > i realized that. but all you can say is that only amb_init() calls > setup_dev() *currently*. when you're not looking, someone else might > (for whatever reason) call setup_dev() from elsewhere, and *that* call > might not zero that memory area. > > IMHO, the only safe transforms of kmalloc+memset -> kzalloc are those > in which the flow of control is unmistakable and invariant. splitting > that across a function call seems like a dangerous thing to do. > (except, of course, in the case, where the kzalloc() is added inside > the function -- then all callers are entitled to simplify *their* > code. but that's different.) > > in any event, i just thought i'd point it out. if you're absolutely > sure there will never be another call to setup_dev() from somewhere > else, then, yes, it's safe. > I understand your point and partially agree with you. But isn't it an unfounded fear? Thanks Joonwoo
Re: [PATCH] ehea: Add kdump support
Michael Ellerman wrote on 26.11.2007 09:16:28: > Solutions that might be better: > > a) if there are a finite number of handles and we can predict their > values, just delete them all in the kdump kernel before the driver > loads. Guessing the values does not work, because of the handle structure defined by the hypervisor. > b) if there are a small & finite number of handles, save their values > in a device tree property and have the kdump kernel read them and > delete them before the driver loads. 5*16*nr_ports+1+1= >82. a ML16 has 4 adapters with up to 16 ports, so the number is not small anymore The device tree functions are currently not exported. If you crashdump to a new kernel, will it get the device tree representation of the crashed kernel or of the initial one of open firmware? > c) if neither of those work, provide a minimal routine that _only_ > deletes the handles in the crashed kernel. I would hope this has the highest chance to actually work. For this we would have to add a proper notifier chain. Do you agree? > d) Firmware change? But that's not something you will get very soon. Christoph R. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] add compare_ether_addr_unaligned
On Fri, Nov 23, 2007 at 09:26:31PM +0800, Herbert Xu wrote: > On Fri, Nov 23, 2007 at 12:09:22AM +, Daniel Drake wrote: > > David Miller found a problem in a wireless driver where I was using > > compare_ether_addr() on potentially unaligned data. Document that > > compare_ether_addr() is not safe for use everywhere, and add an equivalent > > function that works regardless of alignment. > > > > Signed-off-by: Daniel Drake <[EMAIL PROTECTED]> > > Patch applied to net-2.6. Thanks. Since it turned out that this function wouldn't be useful in the case that you originally intended it for, I'm going to drop it until such a time when a new need arises. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
On Mon, 26 Nov 2007, Joonwoo Park wrote: > 2007/11/26, Robert P. J. Day <[EMAIL PROTECTED]>: > > i'm not sure the above is a safe thing to do, as you're zeroing that > > area, then making a function call and assuming, upon entry to the > > function call, that the caller has done the right thing. i don't see > > how you can count on that, depending on who else might want to call > > that routine and whether they get sloppy about it. unless you're > > prepared to guarantee that there will never be another call to > > setup_dev() from elsewhere. > > Thanks for your response. But setup_dev is static function and only > amb_init calls it. i realized that. but all you can say is that only amb_init() calls setup_dev() *currently*. when you're not looking, someone else might (for whatever reason) call setup_dev() from elsewhere, and *that* call might not zero that memory area. IMHO, the only safe transforms of kmalloc+memset -> kzalloc are those in which the flow of control is unmistakable and invariant. splitting that across a function call seems like a dangerous thing to do. (except, of course, in the case, where the kzalloc() is added inside the function -- then all callers are entitled to simplify *their* code. but that's different.) in any event, i just thought i'd point it out. if you're absolutely sure there will never be another call to setup_dev() from somewhere else, then, yes, it's safe. rday Robert P. J. Day Linux Consulting, Training and Annoying Kernel Pedantry Waterloo, Ontario, CANADA http://crashcourse.ca - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
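One way to address the concern raised here is to keep the zeroing and the setup behind a single entry point, so that no caller can reach the setup routine with uninitialised memory. The following is a user-space sketch of that idea, assuming calloc() as a stand-in for kzalloc(); struct demo_dev, demo_dev_setup() and demo_dev_create() are hypothetical and unrelated to the ambassador driver's real types.

```c
#include <stdlib.h>

/* Hypothetical device record; fields exist only for illustration. */
struct demo_dev {
    int irq;
    void *cookie;
    unsigned char pool[4];
};

/*
 * Fills in the known fields only. It relies on the caller having
 * handed it zeroed memory, which is exactly the fragile contract
 * the discussion is about.
 */
static void demo_dev_setup(struct demo_dev *dev, void *cookie)
{
    dev->cookie = cookie;
}

/*
 * Single entry point: allocation, zeroing and setup happen together,
 * so the contract of demo_dev_setup() cannot be broken by a caller.
 */
static struct demo_dev *demo_dev_create(void *cookie)
{
    struct demo_dev *dev = calloc(1, sizeof(*dev));

    if (dev == NULL)
        return NULL;
    demo_dev_setup(dev, cookie);
    return dev;
}
```

A caller uses demo_dev_create() and later free(); fields that the setup routine never touches, such as irq and pool, are guaranteed to start at zero.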
Re: [PATCH 1/4] xfrm_hash: kmalloc + memset conversion to kzalloc
> > i believe the more common standard for the above is: > > else if (hashdist) { > > to reduce the level of overall indentation, no? > It was written that way because there was a memset inside that block, but I have reworked it by removing the memset. Thanks. Joonwoo
Re: [PATCH 3/4] fib_hash: kmalloc + memset conversion to kzalloc
fib_hash: kmalloc + memset conversion to kzalloc fix to avoid memset entirely. Thanks. Joonwoo Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c index 527a6e0..9d0cee2 100644 --- a/net/ipv4/fib_hash.c +++ b/net/ipv4/fib_hash.c @@ -102,10 +102,10 @@ static struct hlist_head *fz_hash_alloc(int divisor) unsigned long size = divisor * sizeof(struct hlist_head); if (size <= PAGE_SIZE) { - return kmalloc(size, GFP_KERNEL); + return kzalloc(size, GFP_KERNEL); } else { return (struct hlist_head *) - __get_free_pages(GFP_KERNEL, get_order(size)); + __get_free_pages(GFP_KERNEL | __GFP_ZERO, get_order(size)); } } @@ -174,8 +174,6 @@ static void fn_rehash_zone(struct fn_zone *fz) ht = fz_hash_alloc(new_divisor); if (ht) { - memset(ht, 0, new_divisor * sizeof(struct hlist_head)); - write_lock_bh(&fib_hash_lock); old_ht = fz->fz_hash; fz->fz_hash = ht; @@ -219,7 +217,6 @@ fn_new_zone(struct fn_hash *table, int z) kfree(fz); return NULL; } - memset(fz->fz_hash, 0, fz->fz_divisor * sizeof(struct hlist_head *)); fz->fz_order = z; fz->fz_mask = inet_make_mask(z); --- - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/4] fib_semantics: kmalloc + memset conversion to kzalloc
fib_semantics: kmalloc + memset conversion to kzalloc fix to avoid memset entirely. Thanks. Joonwoo Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 1351a26..352f8c4 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -605,10 +605,10 @@ static inline unsigned int fib_laddr_hashfn(__be32 val) static struct hlist_head *fib_hash_alloc(int bytes) { if (bytes <= PAGE_SIZE) - return kmalloc(bytes, GFP_KERNEL); + return kzalloc(bytes, GFP_KERNEL); else return (struct hlist_head *) - __get_free_pages(GFP_KERNEL, get_order(bytes)); + __get_free_pages(GFP_KERNEL | __GFP_ZERO, get_order(bytes)); } static void fib_hash_free(struct hlist_head *hash, int bytes) @@ -712,12 +712,8 @@ struct fib_info *fib_create_info(struct fib_config *cfg) if (!new_info_hash || !new_laddrhash) { fib_hash_free(new_info_hash, bytes); fib_hash_free(new_laddrhash, bytes); - } else { - memset(new_info_hash, 0, bytes); - memset(new_laddrhash, 0, bytes); - + } else fib_hash_move(new_info_hash, new_laddrhash, new_size); - } if (!fib_hash_size) goto failure; --- - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
2007/11/26, Robert P. J. Day <[EMAIL PROTECTED]>: > i'm not sure the above is a safe thing to do, as you're zeroing that > area, then making a function call and assuming, upon entry to the > function call, that the caller has done the right thing. i don't see > how you can count on that, depending on who else might want to call > that routine and whether they get sloppy about it. unless you're > prepared to guarantee that there will never be another call to > setup_dev() from elsewhere. > Thanks for your response. But setup_dev is static function and only amb_init calls it. IMO it's safe. Thanks. Joonwoo - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 1/4] xfrm_hash: kmalloc + memset conversion to kzalloc
2007/11/26, Patrick McHardy <[EMAIL PROTECTED]>: > How about also switching vmalloc/get_free_pages to GFP_ZERO > and getting rid of the memset entirely while you're at it? > xfrm_hash: kmalloc + memset conversion to kzalloc fix to avoid memset entirely. Thanks Patrick. Thanks. Joonwoo Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> --- diff --git a/net/xfrm/xfrm_hash.c b/net/xfrm/xfrm_hash.c index 55ab579..a2023ec 100644 --- a/net/xfrm/xfrm_hash.c +++ b/net/xfrm/xfrm_hash.c @@ -17,17 +17,14 @@ struct hlist_head *xfrm_hash_alloc(unsigned int sz) struct hlist_head *n; if (sz <= PAGE_SIZE) - n = kmalloc(sz, GFP_KERNEL); + n = kzalloc(sz, GFP_KERNEL); else if (hashdist) - n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); + n = __vmalloc(sz, GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL); else n = (struct hlist_head *) - __get_free_pages(GFP_KERNEL | __GFP_NOWARN, + __get_free_pages(GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO, get_order(sz)); - if (n) - memset(n, 0, sz); - return n; } --- - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
On Mon, 26 Nov 2007, Joonwoo Park wrote: > atm/ambassador: kmalloc + memset conversion to kzalloc > > Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]> > > Thanks. > Joonwoo > > --- > diff --git a/drivers/atm/ambassador.c b/drivers/atm/ambassador.c > index b34b382..4f99ba3 100644 > --- a/drivers/atm/ambassador.c > +++ b/drivers/atm/ambassador.c > @@ -2163,7 +2163,6 @@ static int __devinit amb_init (amb_dev * dev) > static void setup_dev(amb_dev *dev, struct pci_dev *pci_dev) > { >unsigned char pool; > - memset (dev, 0, sizeof(amb_dev)); > >// set up known dev items straight away >dev->pci_dev = pci_dev; > @@ -2253,7 +2252,7 @@ static int __devinit amb_probe(struct pci_dev *pci_dev, > const struct pci_device_ > goto out_disable; > } > > - dev = kmalloc (sizeof(amb_dev), GFP_KERNEL); > + dev = kzalloc(sizeof(amb_dev), GFP_KERNEL); > if (!dev) { > PRINTK (KERN_ERR, "out of memory!"); > err = -ENOMEM; > --- i'm not sure the above is a safe thing to do, as you're zeroing that area, then making a function call and assuming, upon entry to the function call, that the caller has done the right thing. i don't see how you can count on that, depending on who else might want to call that routine and whether they get sloppy about it. unless you're prepared to guarantee that there will never be another call to setup_dev() from elsewhere. rday Robert P. J. Day Linux Consulting, Training and Annoying Kernel Pedantry Waterloo, Ontario, CANADA http://crashcourse.ca - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] xfrm_hash: kmalloc + memset conversion to kzalloc
Joonwoo Park wrote: diff --git a/net/xfrm/xfrm_hash.c b/net/xfrm/xfrm_hash.c index 55ab579..37795bd 100644 --- a/net/xfrm/xfrm_hash.c +++ b/net/xfrm/xfrm_hash.c @@ -17,16 +17,17 @@ struct hlist_head *xfrm_hash_alloc(unsigned int sz) struct hlist_head *n; if (sz <= PAGE_SIZE) - n = kmalloc(sz, GFP_KERNEL); - else if (hashdist) - n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); - else - n = (struct hlist_head *) - __get_free_pages(GFP_KERNEL | __GFP_NOWARN, -get_order(sz)); - - if (n) - memset(n, 0, sz); + n = kzalloc(sz, GFP_KERNEL); + else { + if (hashdist) + n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); + else + n = (struct hlist_head *) + __get_free_pages(GFP_KERNEL | __GFP_NOWARN, +get_order(sz)); + if (n) + memset(n, 0, sz); How about also switching vmalloc/get_free_pages to GFP_ZERO and getting rid of the memset entirely while you're at it? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] xfrm_hash: kmalloc + memset conversion to kzalloc
On Mon, 26 Nov 2007, Joonwoo Park wrote:

> xfrm_hash: kmalloc + memset conversion to kzalloc
>
> Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>
>
> Thanks.
> Joonwoo
>
> ---
> diff --git a/net/xfrm/xfrm_hash.c b/net/xfrm/xfrm_hash.c
> index 55ab579..37795bd 100644
> --- a/net/xfrm/xfrm_hash.c
> +++ b/net/xfrm/xfrm_hash.c
> @@ -17,16 +17,17 @@ struct hlist_head *xfrm_hash_alloc(unsigned int sz)
>  	struct hlist_head *n;
>
>  	if (sz <= PAGE_SIZE)
> -		n = kmalloc(sz, GFP_KERNEL);
> -	else if (hashdist)
> -		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
> -	else
> -		n = (struct hlist_head *)
> -			__get_free_pages(GFP_KERNEL | __GFP_NOWARN,
> -					 get_order(sz));
> -
> -	if (n)
> -		memset(n, 0, sz);
> +		n = kzalloc(sz, GFP_KERNEL);
> +	else {
> +		if (hashdist)

i believe the more common standard for the above is:

	else if (hashdist) {

to reduce the level of overall indentation, no?

rday

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca
[PATCH 2/4] fib_semantics: kmalloc + memset conversion to kzalloc
fib_semantics: kmalloc + memset conversion to kzalloc

Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>

Thanks.
Joonwoo

---
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 1351a26..87a1e72 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -605,10 +605,15 @@ static inline unsigned int fib_laddr_hashfn(__be32 val)
 static struct hlist_head *fib_hash_alloc(int bytes)
 {
 	if (bytes <= PAGE_SIZE)
-		return kmalloc(bytes, GFP_KERNEL);
-	else
-		return (struct hlist_head *)
+		return kzalloc(bytes, GFP_KERNEL);
+	else {
+		struct hlist_head *hash;
+		hash = (struct hlist_head *)
 			__get_free_pages(GFP_KERNEL, get_order(bytes));
+		if (hash)
+			memset(hash, 0, bytes);
+		return hash;
+	}
 }
 
 static void fib_hash_free(struct hlist_head *hash, int bytes)
@@ -712,12 +717,8 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 		if (!new_info_hash || !new_laddrhash) {
 			fib_hash_free(new_info_hash, bytes);
 			fib_hash_free(new_laddrhash, bytes);
-		} else {
-			memset(new_info_hash, 0, bytes);
-			memset(new_laddrhash, 0, bytes);
-
+		} else
 			fib_hash_move(new_info_hash, new_laddrhash, new_size);
-		}
 
 		if (!fib_hash_size)
 			goto failure;
---
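The patch above moves zeroing out of the caller and into the allocation helper, so fib_create_info() no longer needs its own memset() calls. A minimal userspace sketch of that caller-side pattern follows, under stated assumptions: all `demo_*` names are hypothetical, calloc() stands in for the zeroing allocator, and, unlike fib_hash_move(), the sketch simply installs empty tables and does not rehash existing entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_head { void *first; };	/* stand-in for struct hlist_head */

/* Returns zero-filled storage, so callers never need their own memset(). */
struct demo_head *demo_hash_alloc(size_t bytes)
{
	return calloc(1, bytes);
}

/* Accepts NULL so that a partially failed pair can be released blindly. */
void demo_hash_free(struct demo_head *hash)
{
	free(hash);
}

struct demo_tables {
	struct demo_head *info_hash;
	struct demo_head *laddrhash;
	size_t size;			/* number of buckets in each table */
};

/* Replace both tables with empty ones; returns 0 on success, -1 on failure. */
int demo_tables_grow(struct demo_tables *t, size_t new_size)
{
	size_t bytes = new_size * sizeof(struct demo_head);
	struct demo_head *new_info, *new_laddr;

	assert(t != NULL && new_size > 0);
	new_info = demo_hash_alloc(bytes);
	new_laddr = demo_hash_alloc(bytes);

	if (!new_info || !new_laddr) {
		demo_hash_free(new_info);
		demo_hash_free(new_laddr);
		return -1;
	}

	/* Both tables arrive zeroed; install them directly. */
	demo_hash_free(t->info_hash);
	demo_hash_free(t->laddrhash);
	t->info_hash = new_info;
	t->laddrhash = new_laddr;
	t->size = new_size;
	return 0;
}
```

With the helper owning the zeroing, the success branch shrinks to a single statement, which is what lets the patch drop the braces around the fib_hash_move() call.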
[PATCH 3/4] fib_hash: kmalloc + memset conversion to kzalloc
fib_hash: kmalloc + memset conversion to kzalloc

Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>

Thanks.
Joonwoo

---
diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c
index 527a6e0..2874fe7 100644
--- a/net/ipv4/fib_hash.c
+++ b/net/ipv4/fib_hash.c
@@ -102,10 +102,14 @@ static struct hlist_head *fz_hash_alloc(int divisor)
 	unsigned long size = divisor * sizeof(struct hlist_head);
 
 	if (size <= PAGE_SIZE) {
-		return kmalloc(size, GFP_KERNEL);
+		return kzalloc(size, GFP_KERNEL);
 	} else {
-		return (struct hlist_head *)
+		struct hlist_head *hash;
+		hash = (struct hlist_head *)
 			__get_free_pages(GFP_KERNEL, get_order(size));
+		if (hash)
+			memset(hash, 0, size);
+		return hash;
 	}
 }
 
@@ -174,8 +178,6 @@ static void fn_rehash_zone(struct fn_zone *fz)
 	ht = fz_hash_alloc(new_divisor);
 
 	if (ht) {
-		memset(ht, 0, new_divisor * sizeof(struct hlist_head));
-
 		write_lock_bh(&fib_hash_lock);
 		old_ht = fz->fz_hash;
 		fz->fz_hash = ht;
@@ -219,7 +221,6 @@ fn_new_zone(struct fn_hash *table, int z)
 		kfree(fz);
 		return NULL;
 	}
-	memset(fz->fz_hash, 0, fz->fz_divisor * sizeof(struct hlist_head *));
 	fz->fz_order = z;
 	fz->fz_mask = inet_make_mask(z);
 
---
[PATCH 4/4] atm/ambassador: kmalloc + memset conversion to kzalloc
atm/ambassador: kmalloc + memset conversion to kzalloc

Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>

Thanks.
Joonwoo

---
diff --git a/drivers/atm/ambassador.c b/drivers/atm/ambassador.c
index b34b382..4f99ba3 100644
--- a/drivers/atm/ambassador.c
+++ b/drivers/atm/ambassador.c
@@ -2163,7 +2163,6 @@ static int __devinit amb_init (amb_dev * dev)
 static void setup_dev(amb_dev *dev, struct pci_dev *pci_dev)
 {
   unsigned char pool;
-  memset (dev, 0, sizeof(amb_dev));
 
   // set up known dev items straight away
   dev->pci_dev = pci_dev;
@@ -2253,7 +2252,7 @@ static int __devinit amb_probe(struct pci_dev *pci_dev, const struct pci_device_
     goto out_disable;
   }
 
-  dev = kmalloc (sizeof(amb_dev), GFP_KERNEL);
+  dev = kzalloc(sizeof(amb_dev), GFP_KERNEL);
   if (!dev) {
     PRINTK (KERN_ERR, "out of memory!");
     err = -ENOMEM;
---
[PATCH 1/4] xfrm_hash: kmalloc + memset conversion to kzalloc
xfrm_hash: kmalloc + memset conversion to kzalloc

Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>

Thanks.
Joonwoo

---
diff --git a/net/xfrm/xfrm_hash.c b/net/xfrm/xfrm_hash.c
index 55ab579..37795bd 100644
--- a/net/xfrm/xfrm_hash.c
+++ b/net/xfrm/xfrm_hash.c
@@ -17,16 +17,17 @@ struct hlist_head *xfrm_hash_alloc(unsigned int sz)
 	struct hlist_head *n;
 
 	if (sz <= PAGE_SIZE)
-		n = kmalloc(sz, GFP_KERNEL);
-	else if (hashdist)
-		n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
-	else
-		n = (struct hlist_head *)
-			__get_free_pages(GFP_KERNEL | __GFP_NOWARN,
-					 get_order(sz));
-
-	if (n)
-		memset(n, 0, sz);
+		n = kzalloc(sz, GFP_KERNEL);
+	else {
+		if (hashdist)
+			n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL);
+		else
+			n = (struct hlist_head *)
+				__get_free_pages(GFP_KERNEL | __GFP_NOWARN,
+						 get_order(sz));
+		if (n)
+			memset(n, 0, sz);
+	}
 
 	return n;
 }
---
Re: bonding sysfs output
Andrew Morton <[EMAIL PROTECTED]> writes:

> On Sun, 25 Nov 2007 16:12:57 +0100 Wagner Ferenc <[EMAIL PROTECTED]> wrote:
>
>> I propose it as a fix for trailing NULs and spaces like eg.
>>
>> $ od -c /sys/class/net/bond0/bonding/slaves
>> 000 e t h - l e f t e t h - r i g
>> 020 h t \n \0
>> 025
>>
>> I'm afraid there're other problems with "++more++" handling, but let's
>> not consider those just yet. Find the patch attached. The first
>> hunk also renames buffer to buf, for consistency's sake.
>>
>> The original version had varying behaviour for Not Applicable cases.
>> This patch also settles for empty files (not even a line feed) in
>> those cases, but I'm not sure about the general policy on this matter.
>
> hm, there are a lot of changes there. Were they all actually needed to fix
> the one bug which you have described?

Trailing NULs are present in each file under /sys/class/net/*/bonding
and also in /sys/class/net/bonding_masters. That is, in every file
provided by drivers/net/bonding/bond_sysfs.c. Most of the patch is
concerned with this.

Closely related is the presence of trailing spaces in multivalue
files. There are three such files; one of them already has the
trailing space removed. This patch removes it from the other two.
While doing so, it also renames one function argument from 'buffer'
to 'buf', for consistency.

On the policy side: some files are not applicable to some types of
bonds, and return a single linefeed in that case. The one exception
returns 'NA\n'. The patch changes all of these cases into empty files.

If these are worthy changes, I'm absolutely willing to split up the
patch into three parts as above.
--
Thanks,
Feri.
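The od output quoted above ends at offset 025 (octal), i.e. 21 bytes, which is two more than the 19 bytes of "eth-left eth-right\n": one trailing space and one trailing NUL. A minimal userspace sketch of a formatter that avoids both follows. It is not the bond_sysfs.c code: `demo_show_names()` is a hypothetical helper, it silently stops when the buffer is full instead of doing any "++more++" handling, and returning an empty buffer for an empty list merely mirrors the policy proposed in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Join names with single spaces and end with exactly one newline.
 * Returns the number of payload bytes, excluding the terminating NUL,
 * or 0 (empty output) when there are no names.
 */
size_t demo_show_names(char *buf, size_t bufsz,
		       const char *const *names, size_t count)
{
	size_t len = 0;

	assert(buf != NULL && bufsz > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		size_t n = strlen(names[i]);

		/* need room for the name, one separator and the NUL */
		if (len + n + 2 > bufsz)
			break;
		memcpy(buf + len, names[i], n);
		len += n;
		buf[len++] = ' ';
	}
	if (len > 0)
		buf[len - 1] = '\n';	/* overwrite the last separator */
	buf[len] = '\0';
	return len;
}
```

The returned length counts only the payload, so a caller that reports it as the file size will not expose the NUL terminator; reporting the length plus one is one way a stray `\0` could end up in such output.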