Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
Hi,

You may have already found out that there's a problem using
pci_alloc_consistent and friends in the USB layer which will only be
obvious on CPUs where they need to do page table remapping: that is,
pci_alloc_consistent/pci_free_consistent aren't guaranteed to be
interrupt-safe.

I'm not sure what the correct way around this is yet, but I do know it's
a major problem. ;( Maybe we need to do a get_free_pages-type thing with
this and keep a set amount of consistent area in reserve for atomic
allocations (as per GFP_ATOMIC)? Yes, I know it's not nice, but I don't
see any other option at the moment with USB.

(Yes, I'm hacking the 2.2.18 ohci driver for my own ends to get
something up and running on one of my machines.)

Russell King [EMAIL PROTECTED]
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
On Sat, Jan 20, 2001, Russell King <[EMAIL PROTECTED]> wrote:
> Johannes Erdfelt writes:
> > They need to be visible via DMA. They need to be 16 byte aligned. We
> > also have QH's which have similar requirements, but we don't use as
> > many of them.
>
> Can we get away from the "16 byte aligned" and make it "n byte aligned"?
> I believe that slab already has support for this?

If you look at the part of the message that I quoted and you cut off,
the requirement for UHCI is that the data structures MUST be 16 byte
aligned.

I don't mind if the API is more generalized, but those are the
requirements that were asked about in this specific case.

JE
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
Russell King wrote:
> Manfred Spraul writes:
> > Not yet, but that would be a 2 line patch (currently it's hardcoded to
> > BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN
> > flag).
>
> I don't think there's a problem then. However, if slab can be told "I
> want 1024 bytes aligned to 1024 bytes" then I can get rid of
> arch/arm/mm/small_page.c (separate problem to the one we're discussing
> though) ;)

That's easy, I'll include it in my next slab update.

--
    Manfred
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
Manfred Spraul writes:
> Not yet, but that would be a 2 line patch (currently it's hardcoded to
> BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN
> flag).

I don't think there's a problem then. However, if slab can be told "I
want 1024 bytes aligned to 1024 bytes" then I can get rid of
arch/arm/mm/small_page.c (separate problem to the one we're discussing
though) ;)

> But there are 2 other problems:
> * kmem_cache_alloc returns one pointer, pci_alloc_consistent 2 pointers:
>   one dma address, one virtual address.
> * The code relies on the virt_to_page() macro.

What I'm wondering is: what about a wrapper around the slab allocator,
in a similar way to how pci_alloc_consistent() is a wrapper around gfp?
Since the slab allocator returns "pointers" in the same space as gfp
returns page references, there shouldn't be a problem (Linus may
complain here).

ie, we could make pci_alloc_consistent() a little more intelligent and
allocate from the slab for small sizes, but use gfp for larger sizes?

Comments, anyone (DaveM, Linus, et al)?

Russell King [EMAIL PROTECTED]
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
Russell King wrote:
> Johannes Erdfelt writes:
> > They need to be visible via DMA. They need to be 16 byte aligned. We
> > also have QH's which have similar requirements, but we don't use as
> > many of them.
>
> Can we get away from the "16 byte aligned" and make it "n byte aligned"?
> I believe that slab already has support for this?

Not yet, but that would be a 2 line patch (currently it's hardcoded to
BYTES_PER_WORD align or L1_CACHE_BYTES, depending on the HWCACHE_ALIGN
flag).

But there are 2 other problems:
* kmem_cache_alloc returns one pointer, pci_alloc_consistent 2 pointers:
  one dma address, one virtual address.
* The code relies on the virt_to_page() macro.

The second problem is the difficult one; I don't see how I could remove
that dependency without a major overhaul.

--
    Manfred
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
Johannes Erdfelt writes:
> They need to be visible via DMA. They need to be 16 byte aligned. We
> also have QH's which have similar requirements, but we don't use as
> many of them.

Can we get away from the "16 byte aligned" and make it "n byte aligned"?
I believe that slab already has support for this?

Russell King [EMAIL PROTECTED]
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
On Sat, Jan 20, 2001, Manfred Spraul <[EMAIL PROTECTED]> wrote:
> > TD's are around 32 bytes big (actually, they may be 48 or even 64
> > now, I haven't checked recently). That's a waste of space for an
> > entire page.
> >
> > However, having every driver implement its own slab cache seems a
> > complete waste of time when we already have the code to do so in
> > mm/slab.c. It would be nice if we could extend the generic slab code
> > to understand the PCI DMA API for us.
>
> I missed the beginning of the thread:
> What are the exact requirements for TD's?
> I have 3 tiny updates for mm/slab.c that I'll send to Linus as soon as
> 2.4 has stabilized a bit more, perhaps I can integrate the code for USB.

They need to be visible via DMA. They need to be 16 byte aligned. We
also have QH's which have similar requirements, but we don't use as many
of them.

JE
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
> TD's are around 32 bytes big (actually, they may be 48 or even 64 now,
> I haven't checked recently). That's a waste of space for an entire
> page.
>
> However, having every driver implement its own slab cache seems a
> complete waste of time when we already have the code to do so in
> mm/slab.c. It would be nice if we could extend the generic slab code to
> understand the PCI DMA API for us.

I missed the beginning of the thread: what are the exact requirements
for TD's?

I have 3 tiny updates for mm/slab.c that I'll send to Linus as soon as
2.4 has stabilized a bit more; perhaps I can integrate the code for USB.

--
    Manfred
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
On Sat, Jan 20, 2001, Russell King <[EMAIL PROTECTED]> wrote:
> Johannes Erdfelt writes:
> > On Fri, Jan 19, 2001, Miles Lane <[EMAIL PROTECTED]> wrote:
> > > Johannes Erdfelt wrote:
> > > > TODO
> > > >
> > > > - The PCI DMA architecture is horribly inefficient on x86 and
> > > >   ia64. The result is a page is allocated for each TD. This is
> > > >   evil. Perhaps a slab cache internally? Or modify the generic
> > > >   slab cache to handle PCI DMA pages instead?
> > >
> > > This might be the kind of thing to run past Linus when the 2.5 tree
> > > opens up. Are these inefficiencies necessary evils due to
> > > workarounds for whacky bugs in BIOSen or PCI chipsets or are they
> > > due to poor design/implementation?
> >
> > Looks like poor design/implementation. Or perhaps it was designed for
> > another reason than I want to use it for.
>
> Why? What are you trying to do? Allocate one area per small structure?
> Why not allocate one big area and allocate from that (like the tulip
> drivers do for their TX and RX rings)?
>
> I don't really know what you're trying to do/what the problem is
> because there isn't enough context left in the original mail above, and
> I have no idea whether the original mail appeared here or where I can
> read it.

I was hoping the context from the original TODO up there was sufficient,
and it looked like it was enough.

TD's are around 32 bytes big (actually, they may be 48 or even 64 now, I
haven't checked recently). That's a waste of space for an entire page.

However, having every driver implement its own slab cache seems a
complete waste of time when we already have the code to do so in
mm/slab.c. It would be nice if we could extend the generic slab code to
understand the PCI DMA API for us.

> > I should also check architectures other than x86 and ia64.
>
> This is an absolute must.

Not really. The 2 interesting architectures are x86 and ia64 since
that's where you commonly see UHCI controllers. While you can add UHCI
controllers to most any other architecture which has PCI, you usually
see OHCI on those systems.

I was curious to see whether any other architectures implemented it
differently, or whether I was just expecting too much out of the API.
You pretty much confirmed my suspicions when you suggested doing what
the tulip driver does.

JE
Re: Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
Johannes Erdfelt writes:
> On Fri, Jan 19, 2001, Miles Lane <[EMAIL PROTECTED]> wrote:
> > Johannes Erdfelt wrote:
> > > TODO
> > >
> > > - The PCI DMA architecture is horribly inefficient on x86 and ia64.
> > >   The result is a page is allocated for each TD. This is evil.
> > >   Perhaps a slab cache internally? Or modify the generic slab cache
> > >   to handle PCI DMA pages instead?
> >
> > This might be the kind of thing to run past Linus when the 2.5 tree
> > opens up. Are these inefficiencies necessary evils due to workarounds
> > for whacky bugs in BIOSen or PCI chipsets or are they due to poor
> > design/implementation?
>
> Looks like poor design/implementation. Or perhaps it was designed for
> another reason than I want to use it for.

Why? What are you trying to do? Allocate one area per small structure?
Why not allocate one big area and allocate from that (like the tulip
drivers do for their TX and RX rings)?

I don't really know what you're trying to do/what the problem is because
there isn't enough context left in the original mail above, and I have
no idea whether the original mail appeared here or where I can read it.

> I should also check architectures other than x86 and ia64.

This is an absolute must.

Russell King [EMAIL PROTECTED]
http://www.arm.linux.org.uk/personal/aboutme.html
THE developer of ARM Linux
Inefficient PCI DMA usage (was: [experimental patch] UHCI updates)
On Fri, Jan 19, 2001, Miles Lane <[EMAIL PROTECTED]> wrote:
> Johannes Erdfelt wrote:
> > TODO
> >
> > - The PCI DMA architecture is horribly inefficient on x86 and ia64.
> >   The result is a page is allocated for each TD. This is evil.
> >   Perhaps a slab cache internally? Or modify the generic slab cache
> >   to handle PCI DMA pages instead?
>
> This might be the kind of thing to run past Linus when the 2.5 tree
> opens up. Are these inefficiencies necessary evils due to workarounds
> for whacky bugs in BIOSen or PCI chipsets or are they due to poor
> design/implementation?

Looks like poor design/implementation. Or perhaps it was designed for
another reason than I want to use it for.

2.5 is probably where any core changes will happen, if any. But for now
I suspect I'll need to work around it in my driver.

I should also check architectures other than x86 and ia64.

JE