Re: [PATCH 3/9] powerpc32: checksum_wrappers_64 becomes checksum_wrappers
Hi Scott,

> I wonder why it was 64-bit specific in the first place.

I think it was part of a series where I added my 64bit assembly checksum routines, and I didn't step back and think that the wrapper code would be useful on 32 bit.

Anton
--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
ehea work queues
Hi,

I booted 2.6.23-rc8 and noticed that ehea loves its workqueues:

# ps aux|grep ehea
root 3266 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3268 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3269 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3270 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3271 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3272 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3273 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3274 0.0 0.000 ?S 11:02 0:00 [ehea_driver_wq/]
root 3275 0.0 0.000 ?S 11:02 0:00 [ehea_wq/0]
root 3276 0.0 0.000 ?S 11:02 0:00 [ehea_wq/1]
root 3278 0.0 0.000 ?S 11:02 0:00 [ehea_wq/2]
root 3279 0.0 0.000 ?S 11:02 0:00 [ehea_wq/3]
root 3280 0.0 0.000 ?S 11:02 0:00 [ehea_wq/4]
root 3281 0.0 0.000 ?S 11:02 0:00 [ehea_wq/5]
root 3282 0.0 0.000 ?S 11:02 0:00 [ehea_wq/6]
root 3283 0.0 0.000 ?S 11:02 0:00 [ehea_wq/7]

(notice also that the ehea_driver_wq/XXX exceeds TASK_COMM_LEN).

Since they are both infrequent events and not performance critical (memory hotplug and driver reset), can we just use schedule_work?

Anton
Re: select(0, ..) is valid ?
Hi Hugh,

> It's interesting that compat_core_sys_select() shows this kmalloc(0) failure but core_sys_select() does not. That's because core_sys_select() avoids kmalloc by using a buffer on the stack for small allocations (and 0 sure is small). Shouldn't compat_core_sys_select() do just the same? Or is SLUB going to be so efficient that doing so is a waste of time?

Nice catch, the original optimisation from Andi is:

http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f

And I think it makes sense for the compat code to do it too.

Anton
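The stack-buffer optimisation discussed above can be sketched in userspace C. This is only an illustration of the pattern, not the kernel code: with_scratch(), STACK_BUF_BYTES and the 256 byte threshold are hypothetical, and malloc() stands in for kmalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold for the sketch, not the value the kernel uses. */
#define STACK_BUF_BYTES 256

/*
 * Zero `size` bytes of scratch space. Small requests (including 0) use an
 * on-stack buffer, larger ones fall back to the heap.
 * Returns 0 on success, -1 if the heap allocation failed.
 * *used_heap reports which path was taken (for illustration only).
 */
int with_scratch(size_t size, int *used_heap)
{
	char stack_buf[STACK_BUF_BYTES];
	char *buf = stack_buf;

	assert(used_heap != NULL);
	*used_heap = 0;

	if (size > sizeof(stack_buf)) {
		buf = malloc(size);
		if (!buf)
			return -1;
		*used_heap = 1;
	}

	memset(buf, 0, size);	/* size == 0 touches nothing */

	if (buf != stack_buf)
		free(buf);
	return 0;
}
```

With this shape a zero length request never reaches the allocator at all, which is why the non-compat path did not show the kmalloc(0) failure.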
Fix return code in pci-skeleton.c
We assign the return value of register_netdev to i, but return rc later on. Fix it.

Signed-off-by: Anton Blanchard [EMAIL PROTECTED]
---

diff --git a/drivers/net/pci-skeleton.c b/drivers/net/pci-skeleton.c
index 00ca0fd..6ca4e4f 100644
--- a/drivers/net/pci-skeleton.c
+++ b/drivers/net/pci-skeleton.c
@@ -710,8 +710,8 @@ match:
 		tp->chipset, rtl_chip_info[tp->chipset].name);
 
-	i = register_netdev (dev);
-	if (i)
+	rc = register_netdev (dev);
+	if (rc)
 		goto err_out_unmap;
 
 	DPRINTK ("EXIT, returning 0\n");
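A minimal sketch of why the mismatch matters, assuming rc still holds 0 from an earlier successful call when the error path is taken. fake_register(), probe_buggy() and probe_fixed() are hypothetical stand-ins, not functions from pci-skeleton.c.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for register_netdev(): 0 or a negative errno. */
static int fake_register(int fail)
{
	return fail ? -ENODEV : 0;
}

/* Buggy shape: the error lands in i, but rc is what gets returned. */
int probe_buggy(int fail)
{
	int i, rc = 0;	/* rc assumed to be 0 from earlier steps */

	i = fake_register(fail);
	if (i)
		goto err_out;
	return 0;

err_out:
	return rc;	/* still 0, so the caller never sees the failure */
}

/* Fixed shape: store the result in the variable the error path returns. */
int probe_fixed(int fail)
{
	int rc;

	rc = fake_register(fail);
	if (rc)
		goto err_out;
	return 0;

err_out:
	return rc;
}
```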
Re: [PATCH 2.6.19-rc3 2/2] ehea: 64K page support fix
Hi,

> +#ifdef CONFIG_PPC_64K_PAGES
> +	/* To support 64k pages we must round to 64k page boundary */
> +	epas->kernel.addr =
> +		ioremap((paddr_kernel & 0x), PAGE_SIZE) +
> +		(paddr_kernel & 0x);
> +#else
> 	epas->kernel.addr = ioremap(paddr_kernel, PAGE_SIZE);
> +#endif

Can't you just use PAGE_MASK, ~PAGE_MASK and remove the ifdefs completely?

Anton
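The PAGE_MASK suggestion comes down to the arithmetic below. This is a userspace sketch: the DEMO_ macros mirror the usual kernel definitions but are defined locally, and the 64K page size is an assumption chosen to match the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: a 64K page configuration. */
#define DEMO_PAGE_SHIFT	16
#define DEMO_PAGE_SIZE	((uint64_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

/* Page-aligned base: the address one would hand to ioremap(). */
uint64_t page_base(uint64_t paddr)
{
	return paddr & DEMO_PAGE_MASK;
}

/* Offset within the page: added back to the mapped address afterwards. */
uint64_t page_offset(uint64_t paddr)
{
	return paddr & ~DEMO_PAGE_MASK;
}
```

Because both helpers are written in terms of the page size, the same expression works for 4K and 64K pages and no ifdef is needed.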
Re: [2.6.19 PATCH 2/7] ehea: pHYP interface
Hi,

> I asked SO to recount arguments and we've come to a conclusion that there're in fact 19 args not 18 as the name suggests.

19 args is I-N-S-A-N-E. It will be partially cleaned up by:

http://ozlabs.org/pipermail/linuxppc-dev/2006-July/024556.html

However it doesn't fix the fact someone has architected such a crazy interface :(

Anton
Re: [PATCH 1/6] ehea: interface to network stack
> Is a conditional cheaper than a divide? In case of a misprediction I would assume it to be significantly slower and I don't know the ratio of mispredictions for this branch.

A quick scan of the web shows 40 cycles for athlon64 idiv, and it's similarly slow on many other cpus. Even assuming you mispredict every branch it's going to be a win.

Anton
Re: [PATCH 3/6] ehea: queue management
Hi,

> I agree, stubs were removed. Thanks.

What is going to be done about the debug infrastructure in the ehea driver? The entry and exit traces really need to go, and any other debug you think is important to users needs to go into debugfs or something similar.

I see a similar issue in the ehca driver that I am in the middle of reviewing.

Anton
Re: [PATCH 1/6] ehea: interface to network stack
Hi,

> --- linux-2.6.18-rc4-orig/drivers/net/ehea/ehea_main.c	1969-12-31

> +#define DEB_PREFIX "main"

Doesn't appear to be used.

> +static struct net_device_stats *ehea_get_stats(struct net_device *dev)
...
> +	cb2 = kzalloc(H_CB_ALIGNMENT, GFP_KERNEL);

I can't see where this gets freed.

> +
> +	skb_index = ((index - i
> +		      + port_res->skb_arr_sq_len)
> +		     % port_res->skb_arr_sq_len);

This is going to force an expensive divide. It's much better to change this to the simpler and quicker:

	i++;
	if (i >= max)
		i = 0;

There are a few places in the driver that can be changed to do this.

> +static int ehea_setup_single_port(struct ehea_adapter *adapter,
> +				  int portnum, struct device_node *dn)
...
> +	cb4 = kzalloc(H_CB_ALIGNMENT, GFP_KERNEL);

I can't see where this is freed.

Anton
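The increment-and-wrap idiom suggested above can be checked against the modulo form in a few lines. A sketch only: ring_next() and ring_next_mod() are hypothetical helpers, and they cover the forward step by one rather than the driver's exact index expression.

```c
#include <assert.h>

/* Advance a ring index with a compare instead of a divide. */
unsigned int ring_next(unsigned int i, unsigned int len)
{
	assert(len > 0 && i < len);

	i++;
	if (i >= len)
		i = 0;
	return i;
}

/* Reference version using the modulo, for comparison. */
unsigned int ring_next_mod(unsigned int i, unsigned int len)
{
	assert(len > 0);
	return (i + 1) % len;
}
```

The two agree whenever the index starts inside the ring, so the compare form is a drop-in replacement as long as the index only ever moves by one.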
Re: [PATCH 2/6] ehea: pHYP interface
Hi,

> --- linux-2.6.18-rc4-orig/drivers/net/ehea/ehea_phyp.c	1969-12-31 16:00:00.0 -0800

> +u64 ehea_h_alloc_resource_eq(const u64 hcp_adapter_handle,
...
> +u64 hipz_h_reregister_pmr(const u64 adapter_handle,
...
> +static inline int hcp_galpas_ctor(struct h_galpas *galpas,

Be nice to have some consistent names, hipz_ and hcp_ is kind of cryptic.

> +#define H_QP_CR_STATE_RESET 0x0100 /* Reset */

Probably want ULL on here and the other 64bit constants.

Anton
Re: [PATCH 4/6] ehea: header files
Hi,

> drivers/net/ehea/ehea.h| 452

> +#define EHEA_DRIVER_NAME "IBM eHEA"

You are using this for ethtool get_drvinfo. I'm not sure if it should match the module name, and I worry about having a space in the name. Any ideas on what we should be doing here?

> +#define NET_IP_ALIGN 0

Shouldn't override this in your driver.

> +#define EDEB_P_GENERIC(level, idstring, format, args...) \
> +#define EDEB_P_GENERIC(level,idstring,format,args...) \
> +#define EDEB(level, format, args...) \
> +#define EDEB_ERR(level, format, args...) \
> +#define EDEB_EN(level, format, args...) \
> +#define EDEB_EX(level, format, args...) \
> +#define EDEB_DMP(level, adr, len, format, args...) \

There are a lot of debug statements in the driver. When doing a review I stripped them all out to make it easier to read. As suggested by others, using the standard debug macros (where still required) would be a good idea.

Anton
Re: [PATCH 3/6] ehea: queue management
Hi,

> --- linux-2.6.18-rc4-orig/drivers/net/ehea/ehea_ethtool.c	1969-12-31

> +static void netdev_get_pauseparam(struct net_device *dev,
> +				  struct ethtool_pauseparam *pauseparam)
> +{
> +	printk("get pauseparam\n");
> +}

There are a number of stubbed out ethtool functions like this. Best not to implement them and allow the upper layers to return a correct error.

Anton
Re: [PATCH 4/6] ehea: header files
> --- linux-2.6.18-rc4-orig/drivers/net/ehea/ehea.h	1969-12-31

> +extern void exit(int);

Should be able to remove that prototype :)

Anton
Re: e100: checksum mismatch on 82551ER rev10
Hi,

> If the EEPROM has a broken checksum, the user should have an option that allows him to try and use the device anyways, end of story.

I've come across this problem a number of times on e1000 chips (to be clear it was vendor programming issues). The driver has the option to read and write the EEPROM already. All we need is the ability for the driver to hang around so that we can use ethtool to fix it.

At the moment we carry an out of tree patch to do this.

Anton
Re: [PATCH 38 of 39] IB/ipath - More changes to support InfiniPath on PowerPC 970 systems
Hi,

> Please fix the generic code if it doesn't provide the facility you need at the moment. Don't shoe horn it into your driver just to make up for that.

I've had 3 drivers asking for write combining recently so I agree this is a good idea. How about ioremap_wc as suggested by Willy:

http://marc.theaimsgroup.com/?l=linux-kernel&m=114374741828040&w=2

Anton
Re: [PATCH 3/4] myri10ge - Driver core
Hi,

> We didn't get any ppc64 with PCI-E to run Linux so far. What performance drop should we expect with our current code ?

We have seen 20% improvement on ppc64 running some networking workloads when forcing 128 byte alignment (instead of 16 byte alignment). DMA writes have to get cacheline aligned (in power of 2 steps) on some IO chips.

> I am not sure what you mean. The only ppc64 with PCI-E that we have seen so far (a G5) couldn't do write combining according to Apple.

I'm thinking more generally, MTRRs are x86 specific and it would be good to have a more generic way to enable write combining.

Anton
[PATCH] Allow skb headroom to be overridden
Previously we added NET_IP_ALIGN so an architecture can override the padding done to align headers. The next step is to allow the skb headroom to be overridden.

We currently always reserve 16 bytes to grow into, meaning all DMAs start 16 bytes into a cacheline. On ppc64 we really want DMA writes to start on a cacheline boundary, so we increase that headroom to one cacheline.

Signed-off-by: Anton Blanchard [EMAIL PROTECTED]
---

Index: kernel/include/linux/skbuff.h
===
--- kernel.orig/include/linux/skbuff.h	2006-03-22 17:53:33.250531451 -0600
+++ kernel/include/linux/skbuff.h	2006-03-22 18:02:31.25608 -0600
@@ -956,6 +956,25 @@ static inline void skb_reserve(struct sk
 #define NET_IP_ALIGN	2
 #endif
 
+/*
+ * The networking layer reserves some headroom in skb data (via
+ * dev_alloc_skb). This is used to avoid having to reallocate skb data when
+ * the header has to grow. In the default case, if the header has to grow
+ * 16 bytes or less we avoid the reallocation.
+ *
+ * Unfortunately this headroom changes the DMA alignment of the resulting
+ * network packet. As for NET_IP_ALIGN, this unaligned DMA is expensive
+ * on some architectures. An architecture can override this value,
+ * perhaps setting it to a cacheline in size (since that will maintain
+ * cacheline alignment of the DMA). It must be a power of 2.
+ *
+ * Various parts of the networking layer expect at least 16 bytes of
+ * headroom, you should not reduce this.
+ */
+#ifndef NET_SKB_PAD
+#define NET_SKB_PAD	16
+#endif
+
 extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
@@ -1045,9 +1064,9 @@ static inline void __skb_queue_purge(str
 static inline struct sk_buff *__dev_alloc_skb(unsigned int length,
 					      gfp_t gfp_mask)
 {
-	struct sk_buff *skb = alloc_skb(length + 16, gfp_mask);
+	struct sk_buff *skb = alloc_skb(length + NET_SKB_PAD, gfp_mask);
 	if (likely(skb))
-		skb_reserve(skb, 16);
+		skb_reserve(skb, NET_SKB_PAD);
 	return skb;
 }
 #else
@@ -1085,13 +1104,15 @@ static inline struct sk_buff *dev_alloc_
  */
 static inline int skb_cow(struct sk_buff *skb, unsigned int headroom)
 {
-	int delta = (headroom > 16 ? headroom : 16) - skb_headroom(skb);
+	int delta = (headroom > NET_SKB_PAD ? headroom : NET_SKB_PAD) -
+			skb_headroom(skb);
 
 	if (delta < 0)
 		delta = 0;
 
 	if (delta || skb_cloned(skb))
-		return pskb_expand_head(skb, (delta + 15) & ~15, 0, GFP_ATOMIC);
+		return pskb_expand_head(skb, (delta + (NET_SKB_PAD-1)) &
+				~(NET_SKB_PAD-1), 0, GFP_ATOMIC);
 	return 0;
 }
 
Index: kernel/include/asm-powerpc/system.h
===
--- kernel.orig/include/asm-powerpc/system.h	2006-03-22 17:53:33.250531451 -0600
+++ kernel/include/asm-powerpc/system.h	2006-03-22 17:54:06.487187558 -0600
@@ -363,8 +363,11 @@ __cmpxchg(volatile void *ptr, unsigned l
  * powers of 2 writes until it reaches sufficient alignment).
  *
  * Based on this we disable the IP header alignment in network drivers.
+ * We also modify NET_SKB_PAD to be a cacheline in size, thus maintaining
+ * cacheline alignment of buffers.
  */
-#define NET_IP_ALIGN 0
+#define NET_IP_ALIGN	0
+#define NET_SKB_PAD	L1_CACHE_BYTES
 #endif
 
 #define arch_align_stack(x) (x)
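The "must be a power of 2" requirement in the comment follows from the rounding done in skb_cow. A small userspace sketch of that rounding, where round_up_pad() is a hypothetical helper and not a kernel function:

```c
#include <assert.h>

/*
 * Round delta up to the next multiple of pad. The mask trick only works
 * when pad is a power of 2, which is why the patch requires it.
 */
unsigned int round_up_pad(unsigned int delta, unsigned int pad)
{
	assert(pad != 0 && (pad & (pad - 1)) == 0);

	return (delta + (pad - 1)) & ~(pad - 1);
}
```

With pad set to 16 this is the old (delta + 15) & ~15 expression. With a 128 byte cacheline, a 1 byte shortfall expands the head by a full 128 bytes, so the data keeps its cacheline alignment after the reallocation.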
Re: [PATCH 4/4] TCP Cubic use Newton-Raphson
Hi Stephen,

> Replace cube root algorithm with a faster version using Newton-Raphson. Surprisingly, doing the scaled div64_64 is faster than a true 64 bit division on 64 bit CPU's.

Interesting, what cpu was this on? Was there much difference between the two methods?

Anton
Re: [PATCH netdev-2.6 1/3] ixgb: TSO fixes
Hi,

> TSO fixes
> - fix rare early completion when using TSO
> - extra descriptor for the sentinel descriptor

Is this the same bug as e1000? The extra DMA descriptor is going to be costly, especially on 10Gb. Would the e1000_unmap_and_free_tx_resource trick used in e1000 work instead?

(Actually I noticed we are still adding sentinel descriptors in e1000 even now we are doing the e1000_unmap_and_free_tx_resource trick, could we remove it?)

Anton
Re: [PATCH] disable DEBUG in ibmveth
Any chance we can get this patch in?

Anton

At the moment ibmveth has DEBUG enabled which is rather verbose. Disable it.

Signed-off-by: Anton Blanchard [EMAIL PROTECTED]
---

Index: foobar2/drivers/net/ibmveth.c
===
--- foobar2.orig/drivers/net/ibmveth.c	2005-07-06 07:49:42.0 +1000
+++ foobar2/drivers/net/ibmveth.c	2005-07-14 13:23:32.117030579 +1000
@@ -59,7 +59,7 @@
 
 #include "ibmveth.h"
 
-#define DEBUG 1
+#undef DEBUG
 
 #define ibmveth_printk(fmt, args...) \
   printk(KERN_INFO "%s: " fmt, __FILE__, ## args)
Align DMA buffers to a cacheline
Hi,

On ppc64 we want DMA writes (i.e. receive packets) to be aligned, hopefully on a cacheline boundary. The NET_IP_ALIGN patch got us part of the way, so that DMAs were 16 byte aligned.

The following patch addresses the other source of misalignment. Every skb has 16 bytes of headroom reserved for various optimisations; unfortunately this means all receive packets start 16 bytes into a cacheline. By allowing this to be overridden (via the NET_SKB_PAD define), we can define the headroom to be a cacheline and maintain cacheline alignment.

Signed-off-by: Anton Blanchard [EMAIL PROTECTED]
---

Any thoughts on this? The other option was to override dev_alloc_skb, but I would prefer not to if possible.

Index: gr_work/include/linux/skbuff.h
===
--- gr_work.orig/include/linux/skbuff.h	2005-11-08 01:58:45.830672271 -0600
+++ gr_work/include/linux/skbuff.h	2005-11-08 02:26:19.349053315 -0600
@@ -930,6 +930,23 @@
 #define NET_IP_ALIGN	2
 #endif
 
+/*
+ * The networking layer reserves some headroom in skb data (via
+ * dev_alloc_skb). This is used to avoid having to reallocate skb data when
+ * the header has to grow. In the default case, if the header has to grow
+ * 16 bytes or less we avoid the reallocation.
+ *
+ * Unfortunately this headroom changes the DMA alignment of the resulting
+ * network packet. As for NET_IP_ALIGN, this unaligned DMA is expensive
+ * on some architectures. An architecture can override this value,
+ * perhaps setting it to 0 (if the optimisation is not seen as important),
+ * or to a cacheline in size (since that will maintain cacheline alignment
+ * of the DMA).
+ */
+#ifndef NET_SKB_PAD
+#define NET_SKB_PAD	16
+#endif
+
 extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
@@ -1019,9 +1036,9 @@
 static inline struct sk_buff *__dev_alloc_skb(unsigned int length,
 					      gfp_t gfp_mask)
 {
-	struct sk_buff *skb = alloc_skb(length + 16, gfp_mask);
+	struct sk_buff *skb = alloc_skb(length + NET_SKB_PAD, gfp_mask);
 	if (likely(skb))
-		skb_reserve(skb, 16);
+		skb_reserve(skb, NET_SKB_PAD);
 	return skb;
 }
 #else
Index: gr_work/include/asm-ppc64/system.h
===
--- gr_work.orig/include/asm-ppc64/system.h	2005-11-08 01:58:45.833671796 -0600
+++ gr_work/include/asm-ppc64/system.h	2005-11-08 02:27:19.276618633 -0600
@@ -297,8 +297,11 @@
  * powers of 2 writes until it reaches sufficient alignment).
  *
  * Based on this we disable the IP header alignment in network drivers.
+ * We also modify NET_SKB_PAD to be a cacheline in size, thus maintaining
+ * cacheline alignment of buffers.
  */
-#define NET_IP_ALIGN 0
+#define NET_IP_ALIGN	0
+#define NET_SKB_PAD	L1_CACHE_BYTES
 
 #define arch_align_stack(x) (x)
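To make the alignment argument concrete, here is a sketch of where received data starts inside a cacheline for a given headroom. It assumes the underlying buffer itself begins on a cacheline boundary and uses a 128 byte cacheline purely as an example value; dma_offset_in_cacheline() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Offset into a cacheline at which packet data begins, for a buffer that
 * starts cacheline aligned and has `headroom` bytes reserved in front.
 * `cacheline` must be a power of 2.
 */
unsigned int dma_offset_in_cacheline(unsigned int headroom,
				     unsigned int cacheline)
{
	assert(cacheline != 0 && (cacheline & (cacheline - 1)) == 0);

	return headroom & (cacheline - 1);
}
```

Under those assumptions the default 16 byte headroom leaves every receive DMA starting 16 bytes into a cacheline, while a headroom of one cacheline brings the offset back to 0.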
ethtool phys_id sleeps for long periods with rtnl_lock taken
Hi,

We had a bad ethernet card that wouldn't initialise, so I figured we could blink all other ethernets in the box to identify the bad one. I was surprised to find we could only blink one card at a time; the other ethtool processes were all blocked in D state.

It turns out the ethtool ioctl takes the rtnl semaphore:

	rtnl_lock();
	ret = dev_ethtool(ifr);
	rtnl_unlock();

And all the ->phys_id methods I looked at sleep for the amount of time you want to blink for, e.g. for e100:

static int e100_phys_id(struct net_device *netdev, u32 data)
{
	struct nic *nic = netdev_priv(netdev);

	if(!data || data > (u32)(MAX_SCHEDULE_TIMEOUT / HZ))
		data = (u32)(MAX_SCHEDULE_TIMEOUT / HZ);
	mod_timer(&nic->blink_timer, jiffies);
	msleep_interruptible(data * 1000);
	del_timer_sync(&nic->blink_timer);
	mdio_write(netdev, nic->mii.phy_id, MII_LED_CONTROL, 0);

	return 0;
}

While it's a bit annoying to only be able to blink one card at a time, the fact we can hold the rtnl semaphore for a very long time sounds bad.

Maybe we should instead fire off the blink timer, return immediately and stop the blink when our timeout has been exceeded. If the user wanted to clear a blink they could then issue another ethtool call with a 0 length timeout.

Anton
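The proposal at the end could look roughly like the sketch below. It is a userspace model with simulated ticks, not driver code: struct blink_state, blink_request() and blink_tick() are hypothetical names, and tick counter wraparound is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical blink state; none of these names come from a real driver. */
struct blink_state {
	bool active;
	bool led_on;
	unsigned long deadline;	/* tick at which blinking stops */
};

/*
 * Start blinking for `secs` seconds, or stop if secs == 0. Returns at
 * once instead of sleeping, so no lock is held while the LED blinks.
 */
void blink_request(struct blink_state *b, unsigned long now,
		   unsigned long secs, unsigned long hz)
{
	assert(b != NULL);

	if (secs == 0) {
		b->active = false;
		b->led_on = false;
		return;
	}
	b->active = true;
	b->deadline = now + secs * hz;
}

/*
 * Periodic timer callback: toggle the LED until the deadline passes.
 * Returns true if the timer should be re-armed.
 */
bool blink_tick(struct blink_state *b, unsigned long now)
{
	assert(b != NULL);

	if (!b->active)
		return false;
	if (now >= b->deadline) {
		b->active = false;
		b->led_on = false;
		return false;
	}
	b->led_on = !b->led_on;
	return true;
}
```

Since the request returns immediately, several cards could blink at the same time, and a second request with a 0 timeout clears a blink early.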
Re: [PATCH] disable DEBUG in ibmveth
> The trailing 1 will cause a warning.

Good point.

--

At the moment ibmveth has DEBUG enabled which is rather verbose. Disable it.

Signed-off-by: Anton Blanchard [EMAIL PROTECTED]

Index: foobar2/drivers/net/ibmveth.c
===
--- foobar2.orig/drivers/net/ibmveth.c	2005-07-06 07:49:42.0 +1000
+++ foobar2/drivers/net/ibmveth.c	2005-07-14 13:23:32.117030579 +1000
@@ -59,7 +59,7 @@
 
 #include "ibmveth.h"
 
-#define DEBUG 1
+#undef DEBUG
 
 #define ibmveth_printk(fmt, args...) \
   printk(KERN_INFO "%s: " fmt, __FILE__, ## args)