Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, Jun 28, 2007 at 10:24:24PM -0700, Andrew Morton wrote:
> > Really, it would be great if we could treat kmalloc() objects
> > just like real pages.
>
> From a high level, that seems like a bad idea.  kmalloc() gives you a
> virtual address and you really shouldn't be poking around at that memory's
> underlying page's pageframe metadata.
>
> However we can of course do tasteless and weird things if the benefit is
> sufficient

Hey, when we had exactly that issue coming up with xfs/ext3 recovery over
iscsi/aoe you said it's fine :)  End result is that XFS got fixed and ext3
is still broken..

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Sat, Jun 30, 2007 at 12:11:38AM +0100, Alan Cox wrote:
> > DMA to or from memory should be done via the DMA mapping API.  If we're
> > DMAing to/from a limited range within a page, either we should be using
> > dma_map_single(), or dma_map_page() with an appropriate offset and size.
>
> If those ranges overlap a cache line then the dma mapping API will not
> save your backside.

There's nothing much that the DMA API can do though.  Consider DMA to a
result buffer which is, eg, only 16 bytes in size.  So you get passed a
size of '16' to the DMA API.  What should you do at this point?  BUG()?
What if you have 64 or 128 byte cache lines?

> > sizes, but they do happen.  We handle this on ARM by writing back
> > the overlapped lines and invalidating the rest before the DMA operation
> > commences, and hope that the overlapped lines aren't touched for the
> > duration of the DMA.)
>
> The combination of "hope" and "DMA" isn't a good one for stable system
> design. In this situation we should be waving large red flags

I agree.  However, I don't think this is an issue for the DMA API to
handle; it's something that driver authors need to be aware of.

If they wish to do a DMA to a kmalloc'd buffer or even a page, we could
require that offsets and sizes are cache line aligned.  However, remember
that turning on slab debugging turns off cache line alignment, so imposing
such a requirement implies that the slab debugging will break DMA, or
driver authors also have to be aware of that and do their own alignment
internally, *or* we provide an allocator which does unconditionally align.

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
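The last option above, an allocator that aligns unconditionally, can be
illustrated in userspace C.  This is only a sketch of the idea and not
kernel code: ASSUMED_LINE, round_up_to_line(), is_line_aligned(),
line_aligned_alloc() and alloc_demo() are hypothetical names, and the
64-byte line size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache line size for the sketch; real hardware varies (32, 64, 128...). */
#define ASSUMED_LINE 64u

/* Round n up to the next multiple of line (line must be a power of two). */
size_t round_up_to_line(size_t n, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    return (n + line - 1) & ~(line - 1);
}

/* True if both the start address and the length are whole cache lines. */
int is_line_aligned(uintptr_t addr, size_t len, size_t line)
{
    return (addr % line) == 0 && (len % line) == 0;
}

/*
 * Hypothetical allocator that always hands out whole cache lines,
 * independent of any debugging option.  C11 aligned_alloc() wants the
 * size to be a multiple of the alignment, hence the rounding.
 */
void *line_aligned_alloc(size_t n)
{
    return aligned_alloc(ASSUMED_LINE, round_up_to_line(n ? n : 1, ASSUMED_LINE));
}

/* Usage: allocate n bytes and report whether the buffer owns its lines. */
int alloc_demo(size_t n)
{
    void *p = line_aligned_alloc(n);
    int ok;

    if (!p)
        return 0;
    ok = is_line_aligned((uintptr_t)p, round_up_to_line(n ? n : 1, ASSUMED_LINE),
                         ASSUMED_LINE);
    free(p);
    return ok;
}
```

With this scheme the 16-byte result buffer from the example occupies a
full line by itself, so a writeback or invalidate of the buffer cannot
touch a neighbouring object.  The price is the padding per allocation.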
Re: [PATCH] Containment measures for slab objects on scatter gather lists
> DMA to or from memory should be done via the DMA mapping API.  If we're
> DMAing to/from a limited range within a page, either we should be using
> dma_map_single(), or dma_map_page() with an appropriate offset and size.

If those ranges overlap a cache line then the dma mapping API will not
save your backside.

On a system with a 32 byte cache granularity what happens if you get two
dma mapping calls for x and x+16?  Right now the thing that avoids this
occurring is that the allocators don't pack stuff in that hard, so x+16
always belongs to the same driver and we can hope driver authors are
sensible.

> sizes, but they do happen.  We handle this on ARM by writing back
> the overlapped lines and invalidating the rest before the DMA operation
> commences, and hope that the overlapped lines aren't touched for the
> duration of the DMA.)

The combination of "hope" and "DMA" isn't a good one for stable system
design. In this situation we should be waving large red flags
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, 29 Jun 2007 13:45:29 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Fri, 29 Jun 2007 13:16:57 +0100
> Alan Cox <[EMAIL PROTECTED]> wrote:
>
> > > If those operations involve modifying that slab page's pageframe then what
> > > stops concurrent dma'ers from stomping on each other's changes?  As in:
> > > why aren't we already buggy?
> >
> > Or DMA operations falling out with CPU operations in the same memory
> > area. Not all platforms have hardware consistency and some will blat the
> > entire page out of cache.
>
> Is that just a performance problem, or can data be lost here?  It depends
> on the meaning of "blat": writeback?  invalidate?  More details, please.

Invalidate. Sorry, didn't realise they hadn't discovered that word down
under.

If you've got something packing objects in tight we are going to have fun
with cache handling simply because the CPU cache granularity may mean that
the invalidate also invalidates a few bytes on (ie a 12 byte object will
invalidate 16 bytes of memory) and you've just removed any CPU held
changes in the start of the next object.

Alan
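The arithmetic behind the 12-byte example can be written down in a few
lines of userspace C.  This is an illustration only: line_floor(),
invalidate_span_end() and invalidate_spill() are hypothetical helpers,
not kernel functions, and the line size is passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Start of the cache line containing addr (line must be a power of two). */
uintptr_t line_floor(uintptr_t addr, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    return addr & ~((uintptr_t)line - 1);
}

/* First address past the last cache line touched by [addr, addr + len). */
uintptr_t invalidate_span_end(uintptr_t addr, size_t len, size_t line)
{
    assert(len > 0);
    return line_floor(addr + len - 1, line) + line;
}

/*
 * Number of bytes after the end of the object that are discarded along
 * with it when the object's lines are invalidated.  Anything the CPU had
 * dirty in those bytes (the start of the next object) is lost.
 */
size_t invalidate_spill(uintptr_t addr, size_t len, size_t line)
{
    return (size_t)(invalidate_span_end(addr, len, line) - (addr + len));
}
```

For a 12-byte object at a 16-byte-aligned address with 16-byte
granularity, invalidate_spill() gives 4: the first four bytes of the
neighbouring object go with it.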
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, Jun 29, 2007 at 01:45:29PM -0700, Andrew Morton wrote:
> On Fri, 29 Jun 2007 13:16:57 +0100
> Alan Cox <[EMAIL PROTECTED]> wrote:
>
> > > If those operations involve modifying that slab page's pageframe then what
> > > stops concurrent dma'ers from stomping on each other's changes?  As in:
> > > why aren't we already buggy?
> >
> > Or DMA operations falling out with CPU operations in the same memory
> > area. Not all platforms have hardware consistency and some will blat the
> > entire page out of cache.
>
> Is that just a performance problem, or can data be lost here?  It depends
> on the meaning of "blat": writeback?  invalidate?  More details, please.
>
> I'm dyin here and nobody will talk to me.  If the kernel is already doing
> these things, why aren't we already buggy?  Is it because we don't actually
> modify the pageframes of these dma-to-from-kmalloced pages?  But we were
> thinking of doing so in the future?

I think people are getting too het up about this.

DMA to or from memory should be done via the DMA mapping API.  If we're
DMAing to/from a limited range within a page, either we should be using
dma_map_single(), or dma_map_page() with an appropriate offset and size.

Other cache flushing functions should not be called for DMA operations;
any cache handling required by non-coherent architectures should be done
by the DMA API only.

However, with non-coherent aliasing architectures (such as those with
aliasing VIPT or VIVT caches) there is an additional requirement on PIO
to page cache.  If the page we're writing data to has some cache lines
allocated to it, we potentially hit those cache lines and the data doesn't
hit the underlying page.  Later on, when we come to map the page into
userspace, the data may still be sitting in the cache lines corresponding
with the kernel's mapping.

Therefore, there is a requirement to ensure that the cache state WRT the
kernel's mapping is the same irrespective of the method by which data ends
up in the page.  That means that for these caches, the data PIO'd into the
page must be written back to the underlying page before the page is handed
to userspace.

The two are completely separate; it seems to me from the above discussion
that people are confusing the two scenarios, and mixing DMA with the PIO
cache handling.  Please don't, you'll only get more and more confused.

(Note: with the dma_map_* API, architectures have to be sensible when
they're passed offsets and sizes which aren't cacheline aligned.
Technically, it's buggy to ask for non-L1 line aligned offsets and sizes,
but they do happen.  We handle this on ARM by writing back the overlapped
lines and invalidating the rest before the DMA operation commences, and
hope that the overlapped lines aren't touched for the duration of the DMA.)

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
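The split described in the note, write back the partly covered lines and
invalidate the rest, can be modelled as a small planning function.  This
is a userspace sketch of the bookkeeping only, assuming a power-of-two
line size; struct dma_inv_plan and plan_invalidate() are hypothetical
names and this is not the ARM implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which lines of a DMA buffer need which treatment before the transfer. */
struct dma_inv_plan {
    int writeback_head;      /* first line is shared with data before the buffer */
    int writeback_tail;      /* last line is shared with data after the buffer */
    size_t invalidate_only;  /* lines that lie entirely inside the buffer */
};

struct dma_inv_plan plan_invalidate(uintptr_t start, size_t len, size_t line)
{
    struct dma_inv_plan p = { 0, 0, 0 };
    uintptr_t mask = (uintptr_t)line - 1;
    uintptr_t end, first, last_end;
    size_t total, partial;

    assert(len > 0 && line != 0 && (line & (line - 1)) == 0);

    end = start + len;
    first = start & ~mask;            /* start of the first line touched */
    last_end = (end + mask) & ~mask;  /* end of the last line touched */
    total = (size_t)(last_end - first) / line;

    p.writeback_head = (start != first);
    p.writeback_tail = (end != last_end);
    partial = (size_t)p.writeback_head + (size_t)p.writeback_tail;

    /* A buffer inside a single line has one shared line, not two. */
    if (total == 1 && partial == 2) {
        p.writeback_tail = 0;
        partial = 1;
    }

    p.invalidate_only = total - partial;
    return p;
}
```

The lines flagged for writeback are exactly the ones where the "hope"
applies: nothing else may dirty them while the DMA is in flight.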
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, 29 Jun 2007 13:16:57 +0100 Alan Cox <[EMAIL PROTECTED]> wrote:
> > If those operations involve modifying that slab page's pageframe then what
> > stops concurrent dma'ers from stomping on each other's changes?  As in:
> > why aren't we already buggy?
>
> Or DMA operations falling out with CPU operations in the same memory
> area. Not all platforms have hardware consistency and some will blat the
> entire page out of cache.

Is that just a performance problem, or can data be lost here?  It depends
on the meaning of "blat": writeback?  invalidate?  More details, please.

I'm dyin here and nobody will talk to me.  If the kernel is already doing
these things, why aren't we already buggy?  Is it because we don't actually
modify the pageframes of these dma-to-from-kmalloced pages?  But we were
thinking of doing so in the future?
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, 29 Jun 2007, Hugh Dickins wrote:
> I stand by my page_mapping patch, and the remark I made before,
> that page_mapping(page) is the correct place to check this.  What is
> page_mapping(page) for?  Precisely to return the struct address_space*
> from page->mapping when that's what's in there, and not when that field
> has been reused for something else.
>
> So lines like
> > +	mapping = PageSlab(page) ? NULL : page_mapping(page);
> seem to miss the point.

They certainly point out to the reader that one can expect a slab page
here where one may not expect one since flush_dcache_page is a page cache
function.

> I agree that the only clash found yet has been in flush_dcache_page,
> so some bytes and branches can indeed be saved by just doing the
> test in there.  Oh, but your VM_BUG_ON cancels out that saving.
> And if we were to try to save bytes and branches there, it's the
> synthetic swapper_space business (only required in a couple of
> places) I'd be wanting to cut out.

VM_BUG_ON is different from BUG_ON. BUG_ON is always checked. VM_BUG_ON
depends on a debug config option.

> To me this all seems like a big fuss to excuse your surprise:

I am weirded out by the use of terms like "accusations" and "excuses" in
these discussions. But if it makes you feel better

Others seemed to have encountered the same surprises before me. So it is
better to point out these issues in the sources. There is the danger of
other pagecache functions getting called for slab pages in the I/O layer
and the check in page_mapping provides some protection from such changes.
The checks/comments in the functions where we allow slab page use help the
ones enhancing the code to keep these issues in mind and help them to not
be surprised in turn.
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, Christoph Lameter wrote:
> I had a talk with James Bottomley last night and it seems that there is an
> established way of using the page structs of slab objects in the block
> layer. Drivers may use the DMA interfaces to issue control commands. In
> that case they may allocate a short structure via the slab allocator and
> put the control commands into that slab object.
>
> The driver will then perform a sg_init_one() on the slab object.
> sg_init_one() calls sg_set_buf() which determines the page struct of a
> page. In this case sg_set_buf() will determine the page struct of a slab
> object. The dma layer may then perform operations on the "slab page". The
> block layer folks seem to have spent some time to make this work right.

Yes, I don't see why this comes as such a surprise and horror to you,
so much in need of dire WARNINGs.  kmalloc memory is not a different
kind of memory from what you get from the page allocators.

I stand by my page_mapping patch, and the remark I made before,
that page_mapping(page) is the correct place to check this.  What is
page_mapping(page) for?  Precisely to return the struct address_space*
from page->mapping when that's what's in there, and not when that field
has been reused for something else.

So lines like
> +	mapping = PageSlab(page) ? NULL : page_mapping(page);
seem to miss the point.

I agree that the only clash found yet has been in flush_dcache_page,
so some bytes and branches can indeed be saved by just doing the
test in there.  Oh, but your VM_BUG_ON cancels out that saving.
And if we were to try to save bytes and branches there, it's the
synthetic swapper_space business (only required in a couple of
places) I'd be wanting to cut out.

To me this all seems like a big fuss to excuse your surprise:
so please don't expect an Ack from me; but if others prefer
this, I won't be Nacking.  (Though I'll probably whine about
it into eternity ;)

Hugh
Re: [PATCH] Containment measures for slab objects on scatter gather lists
> If those operations involve modifying that slab page's pageframe then what
> stops concurrent dma'ers from stomping on each other's changes?  As in:
> why aren't we already buggy?

Or DMA operations falling out with CPU operations in the same memory
area. Not all platforms have hardware consistency and some will blat the
entire page out of cache.
Re: [PATCH] Containment measures for slab objects on scatter gather lists
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Fri, 29 Jun 2007 00:00:39 -0700 (PDT)

> On Thu, 28 Jun 2007, David Miller wrote:
>
> > Really, it would be great if we could treat kmalloc() objects
> > just like real pages.  Everything wants to do I/O on pages
> > but sometimes (like the networking) you have a kmalloc
> > chunk which is technically just a part of a page.
> >
> > The fact that there is no easy way to make this work is
> > frustrating :-)
>
> There is an easy way: Allocate a page and just use the first N bytes. You
> can specify the bytes to be used when putting the memory onto the scatter
> gather list. This wastes memory but it works. You have real refcounting
> since you got a real page.
>
> How frequent are these objects?

Every single network packet.
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, David Miller wrote:
> Really, it would be great if we could treat kmalloc() objects
> just like real pages.  Everything wants to do I/O on pages
> but sometimes (like the networking) you have a kmalloc
> chunk which is technically just a part of a page.
>
> The fact that there is no easy way to make this work is
> frustrating :-)

There is an easy way: Allocate a page and just use the first N bytes. You
can specify the bytes to be used when putting the memory onto the scatter
gather list. This wastes memory but it works. You have real refcounting
since you got a real page.

How frequent are these objects?
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, Andrew Morton wrote:
> If those operations _don't_ involve modifying the pageframe (hopes this is
> true) then we're read-only and things become much easier?

This is true right now.

We are way off topic ...
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, David Miller wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Thu, 28 Jun 2007 22:24:24 -0700
>
> > So what happens when two quite different threads of control are doing
> > IO against two hunks of kmalloced memory which happen to come from the same
> > page?  Either some (kernel-wide) locking is needed, or that pageframe needs
> > to be treated as readonly?
>
> Or you put an atomic_t at the beginning or tail of every SLAB
> object.  It's a space cost not a runtime cost for the common
> case which is:

Hmmm... We could do something like

	kmem_cache_get(slab, object)

and

	kmem_cache_put(slab, object)

kmem_cache_get would disable allocations from the slab and increment a
refcount.

kmem_cache_put would enable allocations again if the refcount reaches one.

The problem is that freeing an object may cause writes to the object. F.e.
poisoning will overwrite the object on free. SLUB will put its free
pointer in the first words etc.
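The proposed get/put pair can be modelled as a tiny state machine in
userspace C.  Everything here is hypothetical: kmem_cache_get() and
kmem_cache_put() do not exist as described, so the sketch uses its own
names (struct slab_model, slab_model_get(), slab_model_put()).  It also
assumes that a slab holds a base reference of one while no I/O is
outstanding, which is one reading of "reaches one" above.  Locking is
left out on purpose; this only shows the transitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of one slab: a pin count plus an "allocations allowed" flag. */
struct slab_model {
    int refcount;    /* assumed: 1 means no I/O pins are outstanding */
    bool alloc_ok;   /* may new objects be handed out from this slab? */
};

void slab_model_init(struct slab_model *s)
{
    s->refcount = 1;
    s->alloc_ok = true;
}

/* Pin the slab for I/O: stop allocations and take a reference. */
void slab_model_get(struct slab_model *s)
{
    s->alloc_ok = false;
    s->refcount++;
}

/* Drop a pin; allocations resume once only the base reference is left. */
void slab_model_put(struct slab_model *s)
{
    assert(s->refcount > 1);
    if (--s->refcount == 1)
        s->alloc_ok = true;
}

/* Usage: two overlapping I/Os against objects in the same slab. */
int slab_model_demo(void)
{
    struct slab_model s;

    slab_model_init(&s);
    slab_model_get(&s);          /* first I/O starts */
    slab_model_get(&s);          /* second I/O starts */
    slab_model_put(&s);          /* first I/O done, slab still pinned */
    if (s.alloc_ok)
        return 0;
    slab_model_put(&s);          /* second I/O done */
    return s.alloc_ok && s.refcount == 1;
}
```

The model says nothing about the problem raised in the last paragraph:
a free of a pinned object would still write to it (poisoning, the SLUB
free pointer), so frees would have to be deferred or blocked as well.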
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, David Miller wrote: From: Andrew Morton [EMAIL PROTECTED] Date: Thu, 28 Jun 2007 22:24:24 -0700 So what happens when two quite different threads of control are doing IO against two hunks of kmalloced memory which happen to come from the same page? Either some (kernel-wide) locking is needed, or that pageframe needs to be treated as readonly? Or you put an atomic_t at the beginning or tail of every SLAB object. It's a space cost not a runtime cost for the common case which is: Hmmm... We could do something like kmem_cache_get(slab, object) and kmem_cache_put(slab, object) kmem_cache_get would disable allocations from the slab and increment a refcount. kmem_cache_put would enable allocations again if the refcount reaches one. The problem is that freeing an object may cause writes to the object. F.e. poisoning will overwrite the object on free. SLUB will put its free pointer in the first words etc. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, Andrew Morton wrote: If those operations _don't_ involve modifying the pageframe (hopes this is true) then we're read-only and things become much easier? This is true right now. We are way off topic ... - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, David Miller wrote: Really, it would be great if we could treat kmalloc() objects just like real pages. Everything wants to do I/O on pages but sometimes (like the networking) you have a kmalloc chunk which is technically just a part of a page. The fact that there is no easy way to make this work is frustrating :-) There is easy way: Allocate a page and just use the first N bytes. You can specify the bytes to be used when putting the memory onto the scatter gather list. This wastes memory but it works. You have real refcounting since you got a real page. How frequent are these objects? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
From: Christoph Lameter [EMAIL PROTECTED] Date: Fri, 29 Jun 2007 00:00:39 -0700 (PDT) On Thu, 28 Jun 2007, David Miller wrote: Really, it would be great if we could treat kmalloc() objects just like real pages. Everything wants to do I/O on pages but sometimes (like the networking) you have a kmalloc chunk which is technically just a part of a page. The fact that there is no easy way to make this work is frustrating :-) There is easy way: Allocate a page and just use the first N bytes. You can specify the bytes to be used when putting the memory onto the scatter gather list. This wastes memory but it works. You have real refcounting since you got a real page. How frequent are these objects? Every single network packet. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
If those operations involve modifying that slab page's pageframe then what stops concurrent dma'ers from stomping on each other's changes? As in: why aren't we already buggy? Or DMA operations falling out with CPU operations in the same memory area. Not all platforms have hardware consistency and some will blat the entire page out of cache. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, Christoph Lameter wrote: I had a talk with James Bottomley last night and it seems that there is an established way of using the page structs of slab objects in the block layer. Drivers may use the DMA interfaces to issue control commands. In that case they may allocate a short structure via the slab allocator and put the control commands into that slab object. The driver will then perform a sg_init_one() on the slab object. sg_init_one() calls sg_set_buf() which determines the page struct of a page. In this case sg_set_buf() will determine the page struct of a slab object. The dma layer may then perform operations on the slab page. The block layer folks seem to have spend some time to make this work right. Yes, I don't see why this comes as such a surprise and horror to you, so much in need of dire WARNINGs. kmalloc memory is not a different kind of memory from what you get from the page allocators. I stand by my page_mapping patch, and the remark I made before, that page_mapping(page) is the correct place to check this. What is page_mapping(page) for? Precisely to return the struct address_space* from page-mapping when that's what's in there, and not when that field has been reused for something else. So lines like + mapping = PageSlab(page) ? NULL : page_mapping(page); seem to miss the point. I agree that the only clash found yet has been in flush_dcache_page, so some bytes and branches can indeed be saved by just doing the test in there. Oh, but your VM_BUG_ON cancels out that saving. And if we were to try to save bytes and branches there, it's the synthetic swapper_space business (only required in a couple of places) I'd be wanting to cut out. To me this all seems like a big fuss to excuse your surprise: so please don't expect an Ack from me; but if others prefer this, I won't be Nacking. 
(Though I'll probably whine about it into eternity ;) Hugh - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, 29 Jun 2007, Hugh Dickins wrote: I stand by my page_mapping patch, and the remark I made before, that page_mapping(page) is the correct place to check this. What is page_mapping(page) for? Precisely to return the struct address_space* from page-mapping when that's what's in there, and not when that field has been reused for something else. So lines like + mapping = PageSlab(page) ? NULL : page_mapping(page); seem to miss the point. They certainly point out to the reader that one can expect a slab page here where one may not expect one since flush_dcache_page is a page cache function. I agree that the only clash found yet has been in flush_dcache_page, so some bytes and branches can indeed be saved by just doing the test in there. Oh, but your VM_BUG_ON cancels out that saving. And if we were to try to save bytes and branches there, it's the synthetic swapper_space business (only required in a couple of places) I'd be wanting to cut out. VM_BUG_ON is different from BUG_ON. BUG_ON is always checked. VM_BUG_ON depends on a debug config option. To me this all seems like a big fuss to excuse your surprise: I am weirded out by the use of terms like accusations and excuses in these discussions. But if it makes you feel better Others seemed to have encountered the same surprises before me. So it is better to point out these issues in the sources. There is the danger of other pagecache functions getting called for slab pages in the I/O layer and the check in page_mapping provides some protection from such changes. The checks/comments in the functions where we allow slab page use help the ones enhancing the code to keep these issues in mind and help them to not be surprised in turn. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, 29 Jun 2007 13:16:57 +0100 Alan Cox [EMAIL PROTECTED] wrote: If those operations involve modifying that slab page's pageframe then what stops concurrent dma'ers from stomping on each other's changes? As in: why aren't we already buggy? Or DMA operations falling out with CPU operations in the same memory area. Not all platforms have hardware consistency and some will blat the entire page out of cache. Is that just a performance problem, or can data be lost here? It depends on the meaning of blat: writeback? invalidate? More details, please. I'm dyin here and nobody will talk to me. If the kernel is already doing these things, why aren't we already buggy? Is it because we don't actually modify the pageframes of these dma-to-from-kmalloced pages? But we were thinking of doing so in the future? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, Jun 29, 2007 at 01:45:29PM -0700, Andrew Morton wrote:
> On Fri, 29 Jun 2007 13:16:57 +0100 Alan Cox [EMAIL PROTECTED] wrote:
>
> > > If those operations involve modifying that slab page's pageframe then what
> > > stops concurrent dma'ers from stomping on each other's changes? As in:
> > > why aren't we already buggy?
> >
> > Or DMA operations falling out with CPU operations in the same memory area.
> > Not all platforms have hardware consistency and some will blat the entire
> > page out of cache.
>
> Is that just a performance problem, or can data be lost here? It depends
> on the meaning of blat: writeback? invalidate? More details, please.
>
> I'm dyin here and nobody will talk to me. If the kernel is already doing
> these things, why aren't we already buggy? Is it because we don't actually
> modify the pageframes of these dma-to-from-kmalloced pages? But we were
> thinking of doing so in the future?

I think people are getting too het up about this.

DMA to or from memory should be done via the DMA mapping API. If we're
DMAing to/from a limited range within a page, either we should be using
dma_map_single(), or dma_map_page() with an appropriate offset and size.

Other cache flushing functions should not be called for DMA operations;
any cache handling required by non-coherent architectures should be done
by the DMA API only.

However, with non-coherent aliasing architectures (such as those with
aliasing VIPT or VIVT caches) there is an additional requirement on PIO
to page cache. If the page we're writing data to has some cache lines
allocated to it, we potentially hit those cache lines and the data
doesn't hit the underlying page.

Later on, when we come to map the page into userspace, the data may
still be sitting in the cache lines corresponding with the kernel's
mapping.

Therefore, there is a requirement to ensure that the cache state WRT the
kernel's mapping is the same irrespective of the method by which data
ends up in the page. That means that for these caches, the data PIO'd
into the page must be written back to the underlying page before the
page is handed to userspace.

The two are completely separate; it seems to me from the above
discussion that people are confusing the two scenarios, and mixing DMA
with the PIO cache handling. Please don't, you'll only get more and more
confused.

(Note: with the dma_map_* API, architectures have to be sensible when
they're passed offsets and sizes which aren't cacheline aligned.
Technically, it's buggy to ask for non-L1 line aligned offsets and
sizes, but they do happen. We handle this on ARM by writing back
the overlapped lines and invalidating the rest before the DMA operation
commences, and hope that the overlapped lines aren't touched for the
duration of the DMA.)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Fri, 29 Jun 2007 13:45:29 -0700 Andrew Morton [EMAIL PROTECTED] wrote:

> On Fri, 29 Jun 2007 13:16:57 +0100 Alan Cox [EMAIL PROTECTED] wrote:
>
> > > If those operations involve modifying that slab page's pageframe then what
> > > stops concurrent dma'ers from stomping on each other's changes? As in:
> > > why aren't we already buggy?
> >
> > Or DMA operations falling out with CPU operations in the same memory area.
> > Not all platforms have hardware consistency and some will blat the entire
> > page out of cache.
>
> Is that just a performance problem, or can data be lost here? It depends
> on the meaning of blat: writeback? invalidate? More details, please.

Invalidate. Sorry, didn't realise they hadn't discovered that word down
under.

If you've got something packing objects in tight we are going to have
fun with cache handling, simply because the CPU cache granularity may
mean that the invalidate also invalidates a few bytes on (ie a 12 byte
object will invalidate 16 bytes of memory) and you've just removed any
CPU held changes in the start of the next object.

Alan
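The arithmetic behind Alan's "12 byte object will invalidate 16 bytes" can be sketched in plain userspace C. This is only an illustration of the rounding, not kernel code: `struct span`, `invalidate_span()` and `collateral_tail()` are hypothetical names, and the cache line size is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the byte range an invalidate really covers. */
struct span {
    uintptr_t start;  /* first byte affected (inclusive) */
    uintptr_t end;    /* one past the last byte affected */
};

/*
 * Round [addr, addr + len) outwards to whole cache lines.
 * `line` is the cache line size and is assumed to be a power of two.
 */
static struct span invalidate_span(uintptr_t addr, size_t len, size_t line)
{
    struct span s;

    s.start = addr & ~(uintptr_t)(line - 1);                   /* round down */
    s.end = (addr + len + line - 1) & ~(uintptr_t)(line - 1);  /* round up */
    return s;
}

/* Bytes past the end of the object that are invalidated along with it. */
static size_t collateral_tail(uintptr_t addr, size_t len, size_t line)
{
    return (size_t)(invalidate_span(addr, len, line).end - (addr + len));
}
```

With a 16 byte line, a 12 byte object at a line-aligned address yields a 16 byte span and a 4 byte tail, which is where the start of a tightly packed neighbouring object would sit.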
Re: [PATCH] Containment measures for slab objects on scatter gather lists
> DMA to or from memory should be done via the DMA mapping API. If we're
> DMAing to/from a limited range within a page, either we should be using
> dma_map_single(), or dma_map_page() with an appropriate offset and size.

If those ranges overlap a cache line then the dma mapping API will not
save your backside. On a system with a 32 byte cache granularity, what
happens if you get two dma mapping calls for x and x+16? Right now the
thing that avoids this occurring is that the allocators don't pack stuff
in that hard, so x+16 always belongs to the same driver and we can hope
driver authors are sensible.

> sizes, but they do happen. We handle this on ARM by writing back
> the overlapped lines and invalidating the rest before the DMA operation
> commences, and hope that the overlapped lines aren't touched for the
> duration of the DMA.)

The combination of "hope" and "DMA" isn't a good one for stable system
design. In this situation we should be waving large red flags.
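The x and x+16 case can be checked mechanically. The following is a minimal userspace sketch, not part of the DMA API: `buffers_share_line()` is a hypothetical helper that reports whether two non-empty buffers touch a common cache line for a given line size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if buffers [a, a + a_len) and [b, b + b_len)
 * touch at least one common cache line. Both lengths must be non-zero and
 * `line` is the cache line size in bytes.
 */
static bool buffers_share_line(uintptr_t a, size_t a_len,
                               uintptr_t b, size_t b_len, size_t line)
{
    uintptr_t a_first = a / line;
    uintptr_t a_last = (a + a_len - 1) / line;
    uintptr_t b_first = b / line;
    uintptr_t b_last = (b + b_len - 1) / line;

    /* Two closed intervals of line indices overlap iff neither lies
     * entirely before the other. */
    return a_first <= b_last && b_first <= a_last;
}
```

For two 16 byte buffers at x and x+16 with x aligned to 32 bytes, the check is true at 32 byte granularity and false at 16 byte granularity, so whether the two mappings can disturb each other depends on the platform's line size rather than on anything either caller passed in.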
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007 22:37:34 -0700 (PDT) David Miller <[EMAIL PROTECTED]> wrote:

> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Thu, 28 Jun 2007 22:24:24 -0700
>
> > So what happens when two quite different threads of control are doing
> > IO against two hunks of kmalloced memory which happen to come from the same
> > page? Either some (kernel-wide) locking is needed, or that pageframe needs
> > to be treated as readonly?
>
> Or you put an atomic_t at the beginning or tail of every SLAB
> object. It's a space cost not a runtime cost for the common
> case which is:
>
>	smp_rmb();
>	if (atomic_read(&slab_obj->count) == 1)
>		really_free_it();
>	else if (atomic_dec_and_test(...))
>
> Note I don't like this variant either. :)

but, but... Christoph said 'The dma layer may then perform operations on
the "slab page"'. If those operations involve modifying that slab page's
pageframe then what stops concurrent dma'ers from stomping on each
other's changes? As in: why aren't we already buggy?

If those operations _don't_ involve modifying the pageframe (hopes this
is true) then we're read-only and things become much easier?
Re: [PATCH] Containment measures for slab objects on scatter gather lists
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Thu, 28 Jun 2007 22:24:24 -0700

> So what happens when two quite different threads of control are doing
> IO against two hunks of kmalloced memory which happen to come from the same
> page? Either some (kernel-wide) locking is needed, or that pageframe needs
> to be treated as readonly?

Or you put an atomic_t at the beginning or tail of every SLAB
object. It's a space cost not a runtime cost for the common
case which is:

	smp_rmb();
	if (atomic_read(&slab_obj->count) == 1)
		really_free_it();
	else if (atomic_dec_and_test(...))

Note I don't like this variant either. :)
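The per-object counter idea can be modelled in userspace with C11 atomics. This is a sketch under stated assumptions, not the kernel's implementation: `struct counted_obj`, `obj_init()`, `obj_get()` and `obj_put()` are hypothetical names, the acquire load stands in for the `smp_rmb()` plus `atomic_read()` pair, and it assumes a caller can only take a new reference while already holding one.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical header placed at the start or tail of each object. */
struct counted_obj {
    atomic_int count;  /* 1 means only the allocating owner holds it */
};

static void obj_init(struct counted_obj *o)
{
    atomic_init(&o->count, 1);
}

/* Take an extra reference; the caller must already hold one. */
static void obj_get(struct counted_obj *o)
{
    atomic_fetch_add_explicit(&o->count, 1, memory_order_relaxed);
}

/*
 * Drop a reference. Returns true when the caller held the last reference
 * and is now responsible for really freeing the object.
 */
static bool obj_put(struct counted_obj *o)
{
    /* Common case: sole owner, so a plain load is enough and no atomic
     * read-modify-write is issued. The count is left at 1. */
    if (atomic_load_explicit(&o->count, memory_order_acquire) == 1)
        return true;

    /* Shared case: decrement and free only if this was the last holder. */
    return atomic_fetch_sub_explicit(&o->count, 1,
                                     memory_order_acq_rel) == 1;
}
```

The space cost is one counter per object; the runtime cost in the unshared case is a single load, which matches the "space cost not a runtime cost" trade-off described above.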
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007 22:06:06 -0700 (PDT) David Miller <[EMAIL PROTECTED]> wrote:

> From: Christoph Lameter <[EMAIL PROTECTED]>
> Date: Thu, 28 Jun 2007 21:39:01 -0700 (PDT)
>
> > H... Maybe we are creating more of a mess with this. Isn't there some
> > other way to handle these objects?
>
> That's where I was going with the silly idea to use another
> allocator :)
>
> Really, it would be great if we could treat kmalloc() objects
> just like real pages.

From a high level, that seems like a bad idea. kmalloc() gives you a
virtual address and you really shouldn't be poking around at that
memory's underlying page's pageframe metadata.

However we can of course do tasteless and weird things if the benefit is
sufficient.

> Everything wants to do I/O on pages
> but sometimes (like the networking) you have a kmalloc
> chunk which is technically just a part of a page.

hm. So what happens when two quite different threads of control are
doing IO against two hunks of kmalloced memory which happen to come from
the same page? Either some (kernel-wide) locking is needed, or that
pageframe needs to be treated as readonly?
Re: [PATCH] Containment measures for slab objects on scatter gather lists
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Thu, 28 Jun 2007 21:39:01 -0700 (PDT)

> H... Maybe we are creating more of a mess with this. Isn't there some
> other way to handle these objects?

That's where I was going with the silly idea to use another
allocator :)

Really, it would be great if we could treat kmalloc() objects
just like real pages. Everything wants to do I/O on pages
but sometimes (like the networking) you have a kmalloc
chunk which is technically just a part of a page.

The fact that there is no easy way to make this work is
frustrating :-)
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, David Miller wrote:

> > You can get such a reference and then the slab page will be in limbo if
> > all objects are freed until that reference is given up. The reference
> > method is also used by kmem_cache_vacate() (but that is slab internal).
>
> What about if someone kfree()'s that object meanwhile?

If that is the last object in the slab then the page is deslabified and
it will stick around as a regular page until its refcount reaches zero.
There is slab functionality in SLUB that uses this trick to ensure that
the slab does not go away.

> Can we bump the SLAB object count just like we can a page?

You can bump the slab page count. But you have to use
virt_to_head_page() instead of virt_to_page() since slab pages may be
compound pages (in both SLAB and SLUB). Incrementing the refcount of a
tail page will cause an oops on free.

> That's basically what's happening in the stuff Jens is
> working on, he needs to grab a reference to a SLAB
> object just like one can a page. Even if there is an
> intervening kfree (like a put_page()) the SLAB object is
> still live until all the references are put, and thus it
> can't get reallocated and given to another client by SLAB.

If you kfree the object then all slab allocators will feel free to
immediately alloc the object for other purposes.

If you want to prohibit further allocations from a slab then I can
likely give you a function call that removes a slab from the partial
lists so that the slab will not be allocated from. But this will only
work with SLUB. At some point the slab will then need to be returned to
the partial slab lists.

H... Maybe we are creating more of a mess with this. Isn't there some
other way to handle these objects?
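The distinction between virt_to_page() and virt_to_head_page() can be illustrated with a toy userspace model. Nothing below is the kernel's `struct page` layout or API: `struct toy_page` and the `toy_*` functions are hypothetical stand-ins, and the model simply assumes that the reference count lives on the head page of a compound page.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/* Toy stand-in for a page frame descriptor (not the kernel's struct page). */
struct toy_page {
    int refcount;           /* meaningful on the head page only */
    struct toy_page *head;  /* a head page points at itself */
};

/* Set up `n` descriptors as one compound page backing a contiguous buffer. */
static void toy_compound_init(struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pages[i].refcount = 0;
        pages[i].head = &pages[0];
    }
    pages[0].refcount = 1;  /* the allocator's own reference */
}

/* Analogue of virt_to_page(): the descriptor covering a byte offset. */
static struct toy_page *toy_offset_to_page(struct toy_page *pages, size_t off)
{
    return &pages[off / TOY_PAGE_SIZE];
}

/* Analogue of virt_to_head_page(): always resolve to the head descriptor. */
static struct toy_page *toy_offset_to_head_page(struct toy_page *pages,
                                                size_t off)
{
    return toy_offset_to_page(pages, off)->head;
}

static void toy_get_page(struct toy_page *p)
{
    p->refcount++;
}

/* Returns 1 when the last reference has been dropped. */
static int toy_put_page(struct toy_page *p)
{
    return --p->refcount == 0;
}
```

In this model an object at offset 5000 of a two-page compound page lies in the tail descriptor, so a reference taken through `toy_offset_to_page()` would land on a counter nobody consults, while `toy_offset_to_head_page()` pins the descriptor that actually keeps the memory alive.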
Re: [PATCH] Containment measures for slab objects on scatter gather lists
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Thu, 28 Jun 2007 21:22:22 -0700 (PDT)

> On Thu, 28 Jun 2007, David Miller wrote:
>
> > > Still a better solution would be to not use the slab allocator at all for
> > > the objects that are used to send commands to the devices. These are not
> > > permanent and grabbing a page from the pcp lists and putting it back is
> > > likely as fast as performing a kmalloc.
> >
> > Jens Axboe wants to get references to the page structs behind
> > kmalloc() allocated pages in his networking splice work.
>
> You can get such a reference and then the slab page will be in limbo if
> all objects are freed until that reference is given up. The reference
> method is also used by kmem_cache_vacate() (but that is slab internal).

What about if someone kfree()'s that object meanwhile?

Can we bump the SLAB object count just like we can a page?

That's basically what's happening in the stuff Jens is
working on, he needs to grab a reference to a SLAB
object just like one can a page. Even if there is an
intervening kfree (like a put_page()) the SLAB object is
still live until all the references are put, and thus it
can't get reallocated and given to another client by SLAB.
Re: [PATCH] Containment measures for slab objects on scatter gather lists
On Thu, 28 Jun 2007, David Miller wrote:

> > Still a better solution would be to not use the slab allocator at all for
> > the objects that are used to send commands to the devices. These are not
> > permanent and grabbing a page from the pcp lists and putting it back is
> > likely as fast as performing a kmalloc.
>
> Jens Axboe wants to get references to the page structs behind
> kmalloc() allocated pages in his networking splice work.

You can get such a reference and then the slab page will be in limbo if
all objects are freed until that reference is given up. The reference
method is also used by kmem_cache_vacate() (but that is slab internal).

> We could make a special allocator in the networking that carves
> chunks out of pages but I'm sure you'll find that about as stupid
> and wasteful as I do :-)

Well then I guess this containment patch is as good as we can get. It
makes sure that things do not get out of hand.
Re: [PATCH] Containment measures for slab objects on scatter gather lists
From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Thu, 28 Jun 2007 21:01:36 -0700 (PDT)

> Modify the functions in the affected arches to check for PageSlab() and
> use a NULL mapping if such a page is encountered. This may only be
> necessary for parisc and arm since sparc64 and xtensa do not scan over
> processes mapping a page but I have modified those two arches also for
> correctness's sake since they use page_mapping() in flush_dcache_page().
>
> Still a better solution would be to not use the slab allocator at all for
> the objects that are used to send commands to the devices. These are not
> permanent and grabbing a page from the pcp lists and putting it back is
> likely as fast as performing a kmalloc.

Jens Axboe wants to get references to the page structs behind
kmalloc() allocated pages in his networking splice work.

We pass scatterlists around, but networking buffers are composed
of a kmalloc()'d data header area for packet headers and some of
the initial packet data, then a true scatterlist of page/offset/len
triplets.

Splice wants to work with pages everywhere.

This is a recurring theme, we should provide some kind of solution
to these issues instead of wishing they would go away.

We could make a special allocator in the networking that carves
chunks out of pages but I'm sure you'll find that about as stupid
and wasteful as I do :-)