http://www.symantec.com/connect/articles/how-fudge-linux-ahci-ata-drivers-ich10-controllers

How To Fudge Linux AHCI & ATA Drivers for ICH10 Controllers

Being a hard-and-fast Windows administrator for the last 6 years, I've always felt a tad intimidated by Altiris' Linux automation environment. I used it, and was generally just grateful that it worked. The appearance of the Dell Optiplex 760 and HP dc7900 however, and their inability to image, forced me to confront my demon. With no vendor help on the horizon, the only way forward seemed to be to delve into the innards of Linux automation. This article tries to capture all I learned as I fumbled along on my journey to patch the ATA & AHCI drivers in the Linux source.

Introduction

Over the past few months, many Altiris admins have been trying unsuccessfully to image the new generation of HP and Dell machines. No matter where you turned, you couldn't seem to escape the error 'Disk not found' (I found that even when I closed my eyes this error appeared etched across the inside of my eyelids). These errors are particularly frustrating to those of us who use Linux automation and aren't Linux administrators. We would just like it all to work.

So, what was the problem? Well, it's all down to the new Intel ICH10 SATA controllers. In particular, the issue is that the drivers for these controllers are not present in the Linux kernel distributed with Deployment Solution's Linux automation environment. This is not surprising: with the release of Deployment Solution 6.8 SP2 back in 2007, Altiris decided to freeze the Linux automation kernel. This decision looks sensible. If Altiris were to constantly sync their automation environment with the latest kernel release, they would constantly have to test their code against the newer kernels, and the support resource required would mushroom as the portfolio of supported kernels grew.
Freezing the kernel version has the great advantage that it lowers the resource overhead required for product development: it allows the programs compiled by Altiris to execute safely across all the DS versions supporting the same kernel. The downside of all this is that to gain the latest driver support we must add drivers to Linux automation via Bootdisk Creator. This is not unusual; after all, we often have to do this to expand hardware support in DOS and WinPE. The problem in this instance of ICH10 support was that although modern Linux kernels natively support these controllers, no drivers were available for the older 2.6.18.8 kernel used in Linux automation. Oh dear.

At this point I should emphasize that I am not a Linux Guru! If you are a Linux Guru, please don't flame me for what comes next...

The Linux Kernels

If you are new to Linux you might be wondering what all the fuss is about, right? After all, it's all Linux, so why all this talk about kernel versions? And what the heck is a kernel anyway?

The Linux kernel is the heart of the Linux operating system. This is the supervisory code that provides a framework which allows programs to execute and interact with the computer's hardware. The kernel controls the system: it initializes the computer's hardware and provides access to (and management of) processes, memory and the filesystem. The kernel is freely downloadable from www.kernel.org, and it forms the core of many Linux distributions such as SuSE and Debian, which layer applications, services and a nice user interface on top of the kernel.

The Linux kernel is open source, and is constantly being developed by hordes of programmers. Rarely a quarter goes by, it seems, without new revisions being released to provide new features, code improvements, or enhanced hardware support. In order to keep track of all this, the kernel has historically kept to a strict versioning system.
The kernel version is commonly broken down into four fields, <version>.<major revision>.<minor revision>.<build number>
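To make the four fields concrete, here is a minimal sketch in C that splits a version string such as 2.6.18.8 into its parts and compares two versions. The struct and function names are my own invention for illustration; they are not taken from the kernel source or from any Altiris tool.

```c
#include <assert.h>
#include <stdio.h>

/* Field names follow the four-part breakdown above; they are illustrative. */
struct kernel_version {
    int version;
    int major_revision;
    int minor_revision;
    int build_number;
};

/* Parse "a.b.c.d" into its four fields.
 * Returns 1 on success, 0 if the text is not exactly four
 * dot-separated integers (for example "2.6.18" or "2.6.18.8-rc1"). */
int parse_kernel_version(const char *text, struct kernel_version *out)
{
    char trailing; /* only assigned if extra characters follow the 4th field */
    int fields;

    assert(text != NULL && out != NULL);
    fields = sscanf(text, "%d.%d.%d.%d%c",
                    &out->version, &out->major_revision,
                    &out->minor_revision, &out->build_number, &trailing);
    return fields == 4;
}

/* Returns -1 if a is older than b, 0 if equal, 1 if a is newer. */
int compare_kernel_versions(const struct kernel_version *a,
                            const struct kernel_version *b)
{
    const int lhs[4] = { a->version, a->major_revision,
                         a->minor_revision, a->build_number };
    const int rhs[4] = { b->version, b->major_revision,
                         b->minor_revision, b->build_number };
    int i;

    /* The leftmost field that differs decides the ordering. */
    for (i = 0; i < 4; i++) {
        if (lhs[i] != rhs[i])
            return lhs[i] < rhs[i] ? -1 : 1;
    }
    return 0;
}
```

Parsing "2.6.18.8" with parse_kernel_version fills the struct with 2, 6, 18 and 8. Comparing that against any 2.6.27.x version returns -1, because the third field (18 against 27) is the first one to differ.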
As you've probably already gathered, Linux has quite a few kernel versions, and they are not at all easy to keep track of. Being a Windows guy, I tend (to the horror of Linux gurus) to equate these to Windows releases. So, just as the Microsoft world has seen Windows evolve all the way from 3.1 to Vista, the Linux world has seen a similar evolution in the Linux kernel. I therefore tend to equate the Linux 1.x series of kernels to the Windows 3.1/95/9x family, and the Linux 2.x series to the Windows NT family, which takes us from Windows NT all the way to Windows 7. Each major revision of the 2.x kernel we can playfully map thus,
This is spookily neat, isn't it? My life will be complete should the 2.8 kernel come out in Q4 2009 with Windows 7 ;-) With this approach you can then think of the minor fields as being akin to the Windows service packs and hotfixes. I hope this demystifies kernel versioning, but as I can now feel the heat rising from the Guru direction, I'll move swiftly on...

The Linux and Microsoft Driver Models

With the above in mind, Altiris freezing the 2.6.18.8 kernel seems to equate pretty much to using WinPE 2.1 (which is based on Server 2008), but perhaps without the service pack and a bundle of hotfixes. This doesn't sound too bad, so why the big problem with driver support? To answer this, we now must enter an area where Linux and Microsoft massively differ: their device driver models.

To explain, let's first look at Microsoft's approach. With each Windows release, Microsoft commits itself to a stable driver application binary interface, which sits as an abstraction layer between the OS and the driver. This approach decouples the driver from the operating system, allowing each to develop independently without compromising stability. The big advantage of this approach for both Microsoft and the hardware vendor is that neither need share their source code; they just need to talk the same language by committing to this common interface. The end result is that drivers tend to be fairly stable over the life cycle of the OS. This however comes at the cost of drivers not being able to take direct advantage of operating system enhancements, and they are generally incompatible between operating systems.

The approach that Linux uses is diametrically opposed to the above: the kernel development team do not commit to a stable interface. The kernel developers are free to change the kernel's driver interfaces with each kernel release.
As a result, the best way for hardware vendors to guarantee Linux support is to have their driver source code peer reviewed and accepted as part of the kernel. This way it will be maintained by the kernel developers, who will ensure it continues to work smoothly as the driver interfaces evolve. Hardware vendors who accept this approach will find their hardware supported throughout all future kernel releases.

In short, the Linux approach embraces device drivers as part of the core operating system. There are no barriers, with the aim of nothing being proprietary, but the continual evolution of the kernel and driver interfaces requires that vendors release their source and work with the kernel developers for best hardware support. Microsoft's approach puts a barrier between their OS development team and the hardware vendors, who are kept at the gates so to speak. Each other's territory is guarded and respected.

The 2.6.18.8 kernel and I/O Controller Hub (ICH) Support

With the above in mind, we can now begin to understand why freezing the Linux automation kernel is now perhaps not such a good idea. Let's look at the timeline for the ICH hubs.
Taking the example of the Linux 2.6.18.8 kernel, this was released in February 2007, and as such had to miss out on ICH9 and ICH10 support (they had not been invented yet). By the time ICH10 was released in June 2008, the 2.6.18.8 kernel was a bouncy 16 months old and was already eighty releases behind the current kernel. That's a lot of kernels.

Once ICH10 came into the mainstream kernel in 2008, compatibility for future kernel releases was guaranteed through the maintenance program, but older kernels will never see this support; to backport all the currently supported drivers to operate on previous kernel releases is a vast amount of work. And this is the reason why new hardware is such a problem for Altiris automation: someone has to backport the driver so that it operates with this specific kernel release (with its version of the driver interface), or else fudge an existing driver written for the 2.6.18.8 kernel to work with the new device. Just compiling the new source against the old kernel is destined for failure, by design.

Preparing Ubuntu for Compiling Linux Automation Drivers

Having realized I had to start messing with Linux automation, the first thing I did was hit the knowledge base. This came up with a great article, How to compile drivers for Linux Automation. There are also a couple of Juice articles by lordmithrandir and TheDude05, which I came upon later and which are good reading too. I'm not going to copy the KB article in here, but here are the steps which give you a Linux environment suitable for compiling Linux automation drivers,
This now puts you in a position to compile drivers for Linux automation. It's also a lot of work for a Windows administrator, so if you got here give yourself a hearty clap on the back. You deserve it.

A Tale Of Woe - recompiling drivers from new kernels against old kernels

The first thing I had to do was find out if a modern kernel had the ICH10 support I was looking for. The plan was that if a modern kernel worked, perhaps I could grab the source code and compile it against our older kernel. I did all this despite being told that the chances of recompiling a new driver against an old kernel were practically zero. What can I say? I was desperate...

So, I grabbed the latest SuSE Linux CD, bunged it onto my dc7900 and let it work its magic. It installed perfectly. So, now the hunt began for the driver. Joe Doupnik told me that the fellow we were looking for was ahci.c. I started hunting, and indeed this was there in the kernel source. Fabulous. So, on my Ubuntu VM I downloaded the 2.6.27 kernel source for the Ubuntu distro and took a look to see if this was there too. And lovely, it was...
And this is where it all started looking really doubtful. A full version increment in the source. But, still being desperate, I copied the new ahci.c over the one supplied with the 2.6.18.8 kernel, typed 'make modules', and waited for my new driver to pop out of the Make Magic. It didn't of course; the screen scrolled endlessly with errors. This was a non-starter, and so it was time for a rethink.

Alan Cox and 'How to Fudge a Driver'

At this point I did what any sysadmin would do. I begged for help. I posted to Linux forums and to a mailing list given at the top of the driver source, and waited. Within hours Alan Cox replied with a great tip.
"Assuming you don't need any of the latest and greatest features
however the 2.6.18 ahci driver *ought* to drive the ICH10 SATA
controller just fine if you add the idents from 2.6.28 (or the class
match for AHCI) to it"
I found this startling. What Alan was saying was that the driver embedded in the 2.6.18.8 kernel would probably work if I just let it know the new controller's device IDs. Add the new IDs, recompile, and it would probably all work. So, I had a look in the source code for the old driver to see where the ICH support was located. Here is the relevant snippet,

    /* Intel */
    { PCI_VENDOR_ID_INTEL, 0x2652, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH6 */
    { PCI_VENDOR_ID_INTEL, 0x2653, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH6M */
    { PCI_VENDOR_ID_INTEL, 0x27c1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH7 */
    { PCI_VENDOR_ID_INTEL, 0x27c5, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH7M */
    { PCI_VENDOR_ID_INTEL, 0x27c3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH7R */
    { PCI_VENDOR_ID_AL, 0x5288, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ULi M5288 */
    { PCI_VENDOR_ID_INTEL, 0x2681, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ESB2 */
    { PCI_VENDOR_ID_INTEL, 0x2682, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ESB2 */
    { PCI_VENDOR_ID_INTEL, 0x2683, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ESB2 */
    { PCI_VENDOR_ID_INTEL, 0x27c6, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH7-M DH */
    { PCI_VENDOR_ID_INTEL, 0x2821, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH8 */
    { PCI_VENDOR_ID_INTEL, 0x2822, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH8 */
    { PCI_VENDOR_ID_INTEL, 0x2824, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH8 */
    { PCI_VENDOR_ID_INTEL, 0x2829, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH8M */
    { PCI_VENDOR_ID_INTEL, 0x282a, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH8M */

So, natively this driver only supported up to the ICH8 controller. What I did find interesting in looking at all the Intel controllers was that the only difference in the data structures that described them was their device IDs. No other magic seemed to be required.
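To illustrate why a missing device ID is enough to leave a controller unclaimed, and what Alan's "class match" alternative amounts to, here is a cut-down userspace model of ID-table matching. It is a sketch under assumptions, not the kernel's code: the struct and function names are made up, the tables hold only a couple of rows, and the subsystem IDs used in the usage note are arbitrary. The matching rule itself (wildcards for the four IDs, then a masked comparison of the class code) mirrors how I understand the kernel to treat the fields in each table row, and 0x010601 is the PCI class code for an AHCI-mode SATA controller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID          0xffffffffu /* stands in for PCI_ANY_ID */
#define VENDOR_INTEL    0x8086u
#define CLASS_SATA_AHCI 0x010601u   /* mass storage / SATA / AHCI */

/* Simplified model of one row in a driver's PCI ID table. */
struct id_entry {
    uint32_t vendor, device;
    uint32_t subvendor, subdevice;
    uint32_t class_code, class_mask;
    const char *label;
};

/* The identifiers a PCI function reports about itself. */
struct pci_function {
    uint32_t vendor, device;
    uint32_t subvendor, subdevice;
    uint32_t class_code;
};

/* Stock table: knows nothing newer than ICH8. */
const struct id_entry stock_table[] = {
    { VENDOR_INTEL, 0x2821, ANY_ID, ANY_ID, 0, 0, "ICH8" },
    { VENDOR_INTEL, 0x2829, ANY_ID, ANY_ID, 0, 0, "ICH8M" },
    { 0, 0, 0, 0, 0, 0, NULL } /* terminator */
};

/* Patched table: the same rows plus one ICH10 device ID. */
const struct id_entry patched_table[] = {
    { VENDOR_INTEL, 0x2821, ANY_ID, ANY_ID, 0, 0, "ICH8" },
    { VENDOR_INTEL, 0x2829, ANY_ID, ANY_ID, 0, 0, "ICH8M" },
    { VENDOR_INTEL, 0x3a02, ANY_ID, ANY_ID, 0, 0, "ICH10" },
    { 0, 0, 0, 0, 0, 0, NULL }
};

/* Class match: accept any vendor and device that reports the AHCI class. */
const struct id_entry class_table[] = {
    { ANY_ID, ANY_ID, ANY_ID, ANY_ID, CLASS_SATA_AHCI, 0xffffffu, "generic AHCI" },
    { 0, 0, 0, 0, 0, 0, NULL }
};

static int entry_matches(const struct id_entry *e, const struct pci_function *f)
{
    return (e->vendor == ANY_ID || e->vendor == f->vendor) &&
           (e->device == ANY_ID || e->device == f->device) &&
           (e->subvendor == ANY_ID || e->subvendor == f->subvendor) &&
           (e->subdevice == ANY_ID || e->subdevice == f->subdevice) &&
           /* a class_mask of 0 makes the class comparison always succeed */
           ((e->class_code ^ f->class_code) & e->class_mask) == 0;
}

/* Return the first row that matches, or NULL if the driver would ignore it. */
const struct id_entry *find_match(const struct id_entry *table,
                                  const struct pci_function *f)
{
    assert(table != NULL && f != NULL);
    for (; table->vendor || table->subvendor || table->class_mask; table++) {
        if (entry_matches(table, f))
            return table;
    }
    return NULL;
}
```

With a function reporting vendor 0x8086, device 0x3a02 and class 0x010601, find_match returns NULL for stock_table but a valid row for both patched_table and class_table. The class-match row saves listing every device ID, at the price of claiming any controller that advertises the AHCI class whether or not the old driver really copes with it.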
Whatever underlying function did the comms was using the same method for all the Intel controllers. This looked very good. So, all I had to do now was to add entries for all the ICH9/10 device IDs I could muster. This resulted in the following additions,

    { PCI_VENDOR_ID_INTEL, 0x2922, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9 */
    { PCI_VENDOR_ID_INTEL, 0x2923, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9 */
    { PCI_VENDOR_ID_INTEL, 0x2924, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9 */
    { PCI_VENDOR_ID_INTEL, 0x2925, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9 */
    { PCI_VENDOR_ID_INTEL, 0x2927, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9 */
    { PCI_VENDOR_ID_INTEL, 0x2929, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9M */
    { PCI_VENDOR_ID_INTEL, 0x292a, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9M */
    { PCI_VENDOR_ID_INTEL, 0x292b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9M */
    { PCI_VENDOR_ID_INTEL, 0x292c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9M */
    { PCI_VENDOR_ID_INTEL, 0x292f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9M */
    { PCI_VENDOR_ID_INTEL, 0x294d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9 */
    { PCI_VENDOR_ID_INTEL, 0x294e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH9M */
    { PCI_VENDOR_ID_INTEL, 0x502a, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* Tolapai */
    { PCI_VENDOR_ID_INTEL, 0x502b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* Tolapai */
    { PCI_VENDOR_ID_INTEL, 0x3a02, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH10 */
    { PCI_VENDOR_ID_INTEL, 0x3a05, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH10 */
    { PCI_VENDOR_ID_INTEL, 0x3a25, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* ICH10 */
    { PCI_VENDOR_ID_INTEL, 0x3b24, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* PCH RAID */
    { PCI_VENDOR_ID_INTEL, 0x3b25, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* PCH RAID */
    { PCI_VENDOR_ID_INTEL, 0x3b2b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* PCH RAID */
    { PCI_VENDOR_ID_INTEL, 0x3b2c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, board_ahci }, /* PCH RAID */

Compiling the revised source came up trumps, resulting in a driver which was able to successfully image all the hardware I had lurking around with a modern AHCI controller, including the latest HP dc7900 and Dell Optiplex 760. This driver can be found in the Juice Download, Linux AHCI Driver for Dell Optiplex 760 and HP dc7900.

Getting ICH Controllers to image in IDE Compatibility Mode

The next challenge was to get these computers working no matter the controller configuration in the BIOS. You see, in IDE compatibility mode the controller presents different device IDs to the PCI bus, to reflect that only the ATA/IDE features of the controller are accessible. The driver file which manages this IDE/ATA communication is ata_piix.c. In the 2.6.18.8 source, the relevant section for the Intel controllers is below,

    /* 82801EB (ICH5) */
    { 0x8086, 0x24d1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich5_sata },
    /* 82801EB (ICH5) */
    { 0x8086, 0x24df, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich5_sata },
    /* 6300ESB (ICH5 variant with broken PCS present bits) */
    { 0x8086, 0x25a3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, esb_sata },
    /* 6300ESB pretending RAID */
    { 0x8086, 0x25b0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, esb_sata },
    /* 82801FB/FW (ICH6/ICH6W) */
    { 0x8086, 0x2651, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich6_sata },
    /* 82801FR/FRW (ICH6R/ICH6RW) */
    { 0x8086, 0x2652, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich6_sata_ahci },
    /* 82801FBM ICH6M (ICH6R with only port 0 and 2 implemented) */
    { 0x8086, 0x2653, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich6m_sata_ahci },
    /* 82801GB/GR/GH (ICH7, identical to ICH6) */
    { 0x8086, 0x27c0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich6_sata_ahci },
    /* 2801GBM/GHM (ICH7M, identical to ICH6M) */
    { 0x8086, 0x27c4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich7m_sata_ahci },
    /* Enterprise Southbridge 2 (where's the datasheet?) */
    { 0x8086, 0x2680, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich6_sata_ahci },
    /* SATA Controller 1 IDE (ICH8, no datasheet yet) */
    { 0x8086, 0x2820, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller 2 IDE (ICH8, ditto) */
    { 0x8086, 0x2825, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* Mobile SATA Controller IDE (ICH8M, ditto) */
    { 0x8086, 0x2828, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },

This looked a little more troubling. In IDE emulation, the ICH controllers seemed to require some subtle differences to support each generation. I just had to hope that whatever trick they used for the ICH8 series would also work for the ICH9 and ICH10. So, my addition to the Intel section looks like,

    /* SATA Controller IDE (ICH9) */
    { 0x8086, 0x2920, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH9) */
    { 0x8086, 0x2921, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH9) */
    { 0x8086, 0x2926, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH9M) */
    { 0x8086, 0x2928, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH9M) */
    { 0x8086, 0x292d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH9M) */
    { 0x8086, 0x292e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH10) */
    { 0x8086, 0x3a00, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH10) */
    { 0x8086, 0x3a06, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH10) */
    { 0x8086, 0x3a20, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
    /* SATA Controller IDE (ICH10) */
    { 0x8086, 0x3a26, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },

And this too, amazingly, appears to work well. This driver can be found in the Juice Download, Linux AHCI Driver for Dell Optiplex 760 and HP dc7900.
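The difference between the two tables is the last field: in ahci.c every Intel row ends in board_ahci, whereas in ata_piix.c it varies by generation. As far as I can tell that field acts as an index selecting per-controller handling inside the driver, which is why pointing the ICH9/10 IDs at ich8_sata_ahci is a bet that ICH8's handling still fits. The sketch below models just that idea. The enum names are taken from the snippet above, but the struct, the functions and the descriptive strings are placeholders of my own and say nothing about what the real driver does for each board.

```c
#include <assert.h>
#include <stddef.h>

/* Board identifiers as they appear in the last column of the table. */
enum board_id {
    ich5_sata, esb_sata, ich6_sata, ich6_sata_ahci,
    ich6m_sata_ahci, ich7m_sata_ahci, ich8_sata_ahci,
    board_count
};

/* Placeholder notes standing in for the per-board settings. */
static const char *const board_notes[board_count] = {
    [ich5_sata]       = "ICH5 handling",
    [esb_sata]        = "6300ESB handling",
    [ich6_sata]       = "ICH6 handling",
    [ich6_sata_ahci]  = "ICH6R/ICH7 handling",
    [ich6m_sata_ahci] = "ICH6M handling",
    [ich7m_sata_ahci] = "ICH7M handling",
    [ich8_sata_ahci]  = "ICH8 handling",
};

struct ide_entry {
    unsigned device;     /* PCI device ID shown in IDE compatibility mode */
    enum board_id board; /* which per-board handling to use */
};

/* A few stock rows, then two of the added rows reusing the ICH8 board. */
static const struct ide_entry ide_table[] = {
    { 0x24d1, ich5_sata },
    { 0x2651, ich6_sata },
    { 0x2820, ich8_sata_ahci },
    { 0x2920, ich8_sata_ahci }, /* added: ICH9 */
    { 0x3a00, ich8_sata_ahci }, /* added: ICH10 */
};

/* Return the board index for a device ID, or -1 if the ID is unknown. */
int lookup_board(unsigned device)
{
    size_t i;

    for (i = 0; i < sizeof ide_table / sizeof ide_table[0]; i++) {
        if (ide_table[i].device == device)
            return (int)ide_table[i].board;
    }
    return -1;
}

/* Return the placeholder note for a valid board index. */
const char *board_note(int board)
{
    assert(board >= 0 && board < board_count);
    return board_notes[board];
}
```

Here lookup_board(0x3a00) and lookup_board(0x2820) return the same index, so an ICH10 in IDE mode gets exactly the treatment an ICH8 would. An ID that is not in the table returns -1, which corresponds to the 'Disk not found' situation.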
Summary and Musings

What I have tried to do in today's article is show why it's so damn difficult getting drivers to work in Linux automation, and further to de-mystify a little of the driver build process. The continually evolving nature of the kernel's driver interface means that this will continue to be a thorn in Altiris' (and our) backsides. So how do we solve this? Well, a few options occurred to me during my few days delving into Linux. They are,
There will of course be other options; these are just those that instantly come to mind. Even if nothing exciting happens to Linux automation over the next year or so, I hope that this article will be of benefit in 2010. You see, the ICH11 controllers will probably come to market next summer, and once again hardware will emerge which can't be imaged. I suspect this trick of copying the relevant device IDs into ahci.c and ata_piix.c will continue to work, as Intel are unlikely to break compatibility.

Further Reading

In addition to the Juice and KB articles referred to already, here are some more documents which you might find useful,
Acknowledgements

Thanks again to Joe Doupnik at Oxford University, and Alan Cox at Linux.org. Their assistance was very much appreciated.