Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
Arjan van de Ven wrote: Kok, Auke wrote: Jeff Garzik wrote: Andrew Morton wrote: On Fri, 29 Jun 2007 14:39:20 -0700 Kok, Auke [EMAIL PROTECTED] wrote: That's why we want to introduce a second e1000 driver (named differently, pick any name) that contains the new code base, side-by-side into the kernel with the current e1000. Sounds like a reasonable approach to me (it has plenty of precedent). But I forget what all the other issues were, so ignore me. Given past history with duplicate drivers and the problems that they cause -- I know, I've caused some of those problems :( -- I strongly recommend against when it can be avoided. Leaving e1000 with current hardware, and a new e1001 for newer hardware should be easier to manage for all involved, without the headaches that duplicate drivers cause. Jeff, ok first you hate the old e1000 and now you don't want to get rid of it ;) No -- e1000 is big and bloated but also stable and working and a key popular NIC driver for a lot of people. That means the transition has a wide impact, and thus should be considered a lot more carefully than (say) rewriting 8139cp. I reject the notion that a flag day switchover for a huge mass of e1000 users is the correct path. I do not think that best serves Linux users. It is better to get a new driver out in the field and working for a small population, let time pass, then consider if we really want to transition the entire e1000 population to the new driver. That leaves the mass of e1000 users with a working driver while the new driver proves itself. Since a small population of users is required to use the new driver, by virtue of new/old PCI ID split, you are guaranteed a population of test users immediately for the new driver. When the new driver proves itself stable, you will have plenty of knowledge from which to decide whether to transition 8257x users, all e1000 users, or whatever. I appreciate the pain a temporary dual driver situation gives; it comes down to a few things that I can think of right now, if you see more please add to the list. 1) users who find a bug in the new one silently use the old one rather than reporting the bug; and only scream when the old one eventually goes away (see ALSA/OSS duplication) 2) users who enable both in KConfig may get a random one 3) distros really prefer only 1 driver per PCI ID for their infrastructure tools 4) there will be resistance against deleting the old one meaning it might not happen You are missing the largest source of pain and headache: Users will use the default driver, which means no field testing at all until flag day, with obvious results. Furthermore, Linux kernel history demonstrates that temporary dual driver situations are rarely temporary. Thus, selling it as such in the face of all contrary experience is pure hyperbole. Jeff - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
Roland Dreier wrote: one possibility would be to merge e1000new with support only for chips not supported by e1000, and semi-freeze e1000 (fixes only, new device support goes into e1000new). indeed. Jeff - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
James Chapman wrote: I envisage something like this:- e1000_core.c- common code used by all drivers of the e1000 family. Exports functions used by actual drivers. Loadable as a separate module when built as a module. e1000_82541.c- driver for 82541 e1000_82542.c- driver for 82542 e1000_x.c- driver for x Please ignore the statement I made above. Having read the code and chip docs again, I realize I jumped to the wrong conclusions when I saw separate source files for each of the 7 chip variants in the new driver. Just wanted to nip it in the bud before it causes any wasted time discussing it. :) -- James Chapman Katalix Systems Ltd http://www.katalix.com Catalysts for your Embedded Linux software development - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
I reject the notion that a flag day switchover for a huge mass of e1000 users is the correct path. I do not think that best serves Linux users. so the options we have is do it one pci id a time; and the suggestion by others like Christoph has been to make the split at the PCI Express switchover. Do you agree with that approach? I appreciate the pain a temporary dual driver situation gives; it comes down to a few things that I can think of right now, if you see more please add to the list. 1) users who find a bug in the new one silently use the old one rather than reporting the bug; and only scream when the old one eventually goes away (see ALSA/OSS duplication) 2) users who enable both in KConfig may get a random one 3) distros really prefer only 1 driver per PCI ID for their infrastructure tools 4) there will be resistance against deleting the old one meaning it might not happen You are missing the largest source of pain and headache: Users will use the default driver, which means no field testing at all until flag day, with obvious results. which is why the proposal that was below the text you quoted talks about working with the distro kernel maintainers and get the new driver default for their development releases etc etc that adds a significant level of granularity. For example I'd not be surprised if Fedora and Ubuntu test releases have orders of magnitude more users than kernel.org self compiloe kernels. Furthermore, Linux kernel history demonstrates that temporary dual driver situations are rarely temporary. Thus, selling it as such in the face of all contrary experience is pure hyperbole. history also demonstrates it can be done (just look at Adrian Bunk and the OSS drivers) as long as someone is determined to do it. It's not easy, especially cases where the two drivers have different maintainers who disagree, or represent different interfaces/functionality; but it's very not impossible either. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
Kok, Auke wrote: Jeff Garzik wrote: * The multitude of tiny, fine-grained operations for MAC, NVM, PHY, etc. is a signal that organization is backwards. You should be creating hardware-specific high level operations (PHY layer hooks, net_device hooks, interrupt handler) that call out to more-generic functions when necessary. Doing so eliminates the need to create a new hook for every little twirl in the code path. I think this is the key point that needs to be addressed in the new driver. All of the netdevice glue such as PCI ID tables, netdev open, close etc should be in its own driver for each device, which calls out to common code where appropriate. are you talking about the internals of e1000_phy/mac/nvm etc? i agree that the amount of forward/backward mapping in here is a bit of a spaghetti and could be more clear Make it more clear by restructuring it, not by adding comments though. * The multitude of files makes it difficult to review. Much easier in one file, or a small few. well, at least the files are reasonably well structured by topic. Combining small files just to make reviewing easier seems strange to me, besides making it easier to jump around in an editor it doesn't add any value to the code organization, and just encourages more forward declarations and horrible ordering issues. There's a lot of common code across the e1000 family. But all of the chip-specific code for an individual driver could be in one file. I envisage something like this:- e1000_core.c- common code used by all drivers of the e1000 family. Exports functions used by actual drivers. Loadable as a separate module when built as a module. e1000_82541.c - driver for 82541 e1000_82542.c - driver for 82542 e1000_x.c - driver for x If phylib were used, the phy-specific parts would move out naturally into a phy driver for each phy device. And if e1000_core.c were implemented well, few changes should occur over time as more e1000 devices are released. There would obviously be no overlap of PCI IDs in each driver. For the initial version, support just one device - more can be added later. The important thing is to get the structure right in the initial version. -- James Chapman Katalix Systems Ltd http://www.katalix.com Catalysts for your Embedded Linux software development - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
On 7/6/07, Jeff Garzik [EMAIL PROTECTED] wrote: OK, just looked through the driver. I think its structured inside-out from what it should be. * The multitude of tiny, fine-grained operations for MAC, NVM, PHY, etc. is a signal that organization is backwards. You should be creating hardware-specific high level operations (PHY layer hooks, net_device hooks, interrupt handler) that call out to more-generic functions when necessary. Doing so eliminates the need to create a new hook for every little twirl in the code path. I think the nice thing about the driver's organization is that it gives a clear overview of how the driver works, without delving into generation-specific details. This fixes a problem that the old driver had, where it was very easy to get lost in the woods of family-specific workarounds. In the long run, a driver is easier to maintain if you can easily follow the code path for a particular hardware generation. Creating e1001_8257x_do_this_thing(), which calls more generic code as needed, is easier to review and doesn't require all sorts of indirection through APIs. Basically make a library of e1000-routines. The alternative is a polymorphic approach (i.e. using function pointers so generic code can call specific code). Both approaches are used extensively in the kernel -- I happen to think the latter approach is better suited to the e1000's current issue. (For what is a driver but specific code called from generic code? This just reuses this helpful design pattern at a lower level.) Doing so also means that many workarounds for older hardware disappear from the most-travelled code paths over time. The 64k boundary check found in e1000new is an easy example of something that really shouldn't pollute newer code at all [yes, even though it reduces to 'return 1' for most]. Given that the new driver would only support PCIe devices (i.e. relatively new/good HW) I would think this would cut down on the workaround hooks, leaving the HW is genuinely different hooks. * The multitude of files makes it difficult to review. Much easier in one file, or a small few. Removing support for pre-PCIe in the new driver would help alleviate this. Regards -- Andy - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
Jeff Garzik wrote: OK, just looked through the driver. I think its structured inside-out from what it should be. Comments: * is a clear improvement from current e1000 * The multitude of tiny, fine-grained operations for MAC, NVM, PHY, etc. is a signal that organization is backwards. You should be creating hardware-specific high level operations (PHY layer hooks, net_device hooks, interrupt handler) that call out to more-generic functions when necessary. Doing so eliminates the need to create a new hook for every little twirl in the code path. In the long run, a driver is easier to maintain if you can easily follow the code path for a particular hardware generation. Creating e1001_8257x_do_this_thing(), which calls more generic code as needed, is easier to review and doesn't require all sorts of indirection through APIs. Doing so also means that many workarounds for older hardware disappear from the most-travelled code paths over time. The 64k boundary check found in e1000new is an easy example of something that really shouldn't pollute newer code at all [yes, even though it reduces to 'return 1' for most]. are you talking about the internals of e1000_phy/mac/nvm etc? i agree that the amount of forward/backward mapping in here is a bit of a spaghetti and could be more clear * The multitude of files makes it difficult to review. Much easier in one file, or a small few. well, at least the files are reasonably well structured by topic. Combining small files just to make reviewing easier seems strange to me, besides making it easier to jump around in an editor it doesn't add any value to the code organization, and just encourages more forward declarations and horrible ordering issues. * bitfields consider these gone ;) * check for PCI DMA mapping failure *draws blank* specific one? I'm not seeing what you mean here * atomic_t irq_sem is reinventing the wheel (and too heavy for you needs, too?). You might as well use a lock or mutex or whatnot at that point, since you are using a locked instruction. tg3 might also have some hints in this regard. absolutely, I really don't like irq_sem and several people insist even that it can be removed (allthough I've demonstrated that plain removing it actually breaks on pci-e adapters). I had left it in since it currently works fine, but it's high on my list to fix. * like I noted in the last email, the quickest path to upstream is to start SMALL. Create the smallest working driver, review it heavily, get it upstream. Add all hardwarefeatures after that. In this line, I will try to drop things like multiqueue from it completely to avoid making it over-complex (and not used at all right now), as is the case in a lot of points right now. Thanks for reviewing. Auke - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
Jeff Garzik wrote: Andrew Morton wrote: On Fri, 29 Jun 2007 14:39:20 -0700 Kok, Auke [EMAIL PROTECTED] wrote: That's why we want to introduce a second e1000 driver (named differently, pick any name) that contains the new code base, side-by-side into the kernel with the current e1000. Sounds like a reasonable approach to me (it has plenty of precedent). But I forget what all the other issues were, so ignore me. Given past history with duplicate drivers and the problems that they cause -- I know, I've caused some of those problems :( -- I strongly recommend against when it can be avoided. Leaving e1000 with current hardware, and a new e1001 for newer hardware should be easier to manage for all involved, without the headaches that duplicate drivers cause. An e1001 approach also means we have a much greater chance of encouraging Intel down the path of clean driver-ness. ok, FWIW, here is this new driver: I've posted this new driver (currently called e1000new) on my git tree: git-pull git://lost.foo-projects.org/~ahkok/git/netdev-2.6 e1000new It contains a single git-commit with the following content (on top of current master branch from Jeff's netdev tree): --- commit d2ec375736ac23a3432534112e0f8fc172ee9db4 Author: Auke Kok [EMAIL PROTECTED] Date: Fri Jun 29 15:48:04 2007 -0700 e1000new: new e1000 driver for current e1000 hardware This driver is a cleaned up version of the current e1000 driver. It introduces a fully rewritten abstraction between MAC, PHY, NVM and other low-level hardware features such as manageability. Per- hardware family specific code. Mac type checks are almost completely eliminated and replaced with hardware feature/issue or capability flags. This allows us much easier going forward to add new hardware support or debug issues. Finally, the driver structures were reorganized and restructured to align better and remove holes. tx and rx structures are now basically the same and simplify setup code etc. Signed-off-by: Auke Kok [EMAIL PROTECTED] :100644 100644 7d57f4a... 0bad8f3... M drivers/net/Kconfig :100644 100644 a77affa... 0b58e78... M drivers/net/Makefile :00 100644 000... 5bdf481... A drivers/net/e1000new/80003es2lan.c :00 100644 000... f139050... A drivers/net/e1000new/80003es2lan.h :00 100644 000... 1279521... A drivers/net/e1000new/82540.c :00 100644 000... dc20510... A drivers/net/e1000new/82541.c :00 100644 000... 5fab7ff... A drivers/net/e1000new/82541.h :00 100644 000... e18a1be... A drivers/net/e1000new/82542.c :00 100644 000... dad89ae... A drivers/net/e1000new/82543.c :00 100644 000... 6d14aba... A drivers/net/e1000new/82543.h :00 100644 000... 3283841... A drivers/net/e1000new/82571.c :00 100644 000... 5912f47... A drivers/net/e1000new/82571.h :00 100644 000... 703d938... A drivers/net/e1000new/Makefile :00 100644 000... 2de032e... A drivers/net/e1000new/api.c :00 100644 000... e429f8c... A drivers/net/e1000new/api.h :00 100644 000... 59f34db... A drivers/net/e1000new/defines.h :00 100644 000... 71b775d... A drivers/net/e1000new/e1000.h :00 100644 000... 803ed2c... A drivers/net/e1000new/ethtool.c :00 100644 000... 848debc... A drivers/net/e1000new/hw.h :00 100644 000... e5adbff... A drivers/net/e1000new/ich8lan.c :00 100644 000... 557858f... A drivers/net/e1000new/ich8lan.h :00 100644 000... 8f589a3... A drivers/net/e1000new/mac.c :00 100644 000... 075e326... A drivers/net/e1000new/mac.h :00 100644 000... 3b6f6bc... A drivers/net/e1000new/main.c :00 100644 000... e5fee2e... A drivers/net/e1000new/manage.c :00 100644 000... 27606ea... A drivers/net/e1000new/manage.h :00 100644 000... 2c9ccb4... A drivers/net/e1000new/nvm.c :00 100644 000... e59c8ad... A drivers/net/e1000new/nvm.h :00 100644 000... 3c943f1... A drivers/net/e1000new/param.c :00 100644 000... 45338ef... A drivers/net/e1000new/phy.c :00 100644 000... 957957d... A drivers/net/e1000new/phy.h :00 100644 000... 669ad03... A drivers/net/e1000new/regs.h --- You can also get it through http: http://foo-projects.org/~sofar/e1000new.patch [883KB] http://foo-projects.org/~sofar/e1000new.patch.bz2 [121KB] Unfortunately it's too big to send directly to the list. While I understand everyone's reservation for going a certain course forward, I would really appreciate it if people would review this driver and give it attention. Any feedback is appreciated, and whatever course we might end up going, can be used. Thanks, Auke - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
Kok, Auke wrote: Jeff Garzik wrote: Andrew Morton wrote: On Fri, 29 Jun 2007 14:39:20 -0700 Kok, Auke [EMAIL PROTECTED] wrote: That's why we want to introduce a second e1000 driver (named differently, pick any name) that contains the new code base, side-by-side into the kernel with the current e1000. Sounds like a reasonable approach to me (it has plenty of precedent). But I forget what all the other issues were, so ignore me. Given past history with duplicate drivers and the problems that they cause -- I know, I've caused some of those problems :( -- I strongly recommend against when it can be avoided. Leaving e1000 with current hardware, and a new e1001 for newer hardware should be easier to manage for all involved, without the headaches that duplicate drivers cause. Jeff, ok first you hate the old e1000 and now you don't want to get rid of it ;) I appreciate the pain a temporary dual driver situation gives; it comes down to a few things that I can think of right now, if you see more please add to the list. 1) users who find a bug in the new one silently use the old one rather than reporting the bug; and only scream when the old one eventually goes away (see ALSA/OSS duplication) 2) users who enable both in KConfig may get a random one 3) distros really prefer only 1 driver per PCI ID for their infrastructure tools 4) there will be resistance against deleting the old one meaning it might not happen While the pain won't be zero, I know there are some simple things that can be done to make it significantly less: 1) Make sure that the various feature-removal files and KConfig clearly mark the old one as going away early on, make the old one printk that it's the older driver and stick a hard date on it. This should at least ease [1] and [4] above 2) After a minimal amount of shake-out time (say one kernel release), make the old e1000 not export the PCI IDs in it's module, but make it a manual modprobe. This means people can still use the old one, but distro tools will pick the new one automatically [and if there is any doubt about a specific PCI ID model temporarily, that one can be done the other way around] 3) Have a config option which does what you say; allow both drivers to be there but with non-overlapping PCI IDs; where exactly the new/old split is is up for discussion, and it can even move over time 4) Work with the distro maintainers to default their development snapshots as quickly as possible to the new driver so that it gets as much testing as possible quickly; yet at the same time for their security fixes and released updates they can use the 3) option 5) once there is some confidence in the new driver, put a patch into -mm that makes the old driver really hard to select in KConfig; this in preparation for removal later on (hey this worked for devfs ;) 6) put more or less a code freeze on the old driver; it should remain compiling but it's ok to be bug-to-bug compatible with what is there at the moment the new driver goes into the tree. Personally I'm convinced this is a managable process with a reasonable assurance that the old driver really can go away after 2 kernel releases (as long as Auke and co are on top of bugreports about regressions, I don't see why not). What is needed of course is some level of strictness/determination to get rid of the old one after the reasonable time period. if you have other concerns about temporarily having two drivers that are real showstoppers , please put them on the table; it might well be possible to find a reasonable way to solve/mitigate them. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFR: New e1000 driver (e1000new), was: Re: e1000: backport ich9 support from 7.5.5 ?
one possibility would be to merge e1000new with support only for chips not supported by e1000, and semi-freeze e1000 (fixes only, new device support goes into e1000new). Then if there are some devices that are more naturally supported by e1000new, we could later on merge patches to move support for those devices from e1000 to e1000new. Presumably this would simplify e1000, since it would have to support a smaller set of older devices. e1000new would stay relatively clean too since it wouldn't have to handle anything too old. And at every stage of the process, any given NIC would have only one driver in the tree. - R. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html