Re: [patch 1/2] splice: dont steal
On Thu, Mar 15, 2007 at 01:54:32PM +0100, Jens Axboe wrote:
> On Thu, Mar 15 2007, Nick Piggin wrote:
> > On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> > > On Thu, Mar 15 2007, Nick Piggin wrote:
> > > >
> > > > We should be able to allow for it with the new a_ops API I'm working
> > > > on.
> > >
> > > "Should be" and in progress stuff, is it guaranteed to get there?
> >
> > Well considering that it is needed in order to solve 3 different deadlock
> > scenarios in the core write(2) path without taking a big performance hit,
> > I'd hope so ;)
> >
> > It isn't guaranteed, but I have only had positive feedback so far. Would
> > take a while to actually get merged, though.
>
> It's not that I don't believe you, I'm just a little reluctant to rip
> stuff out with a promise to fix it later when foo and bar are merged,
> since things like that have a tendency not to get done because they are
> forgotten :-)

Fair enough. The API side is trivial, all I need to do is set a single flag
and make splice pass down the page, and set that flag when stealing.

Filesystems might vary from trivial to impossible, but I think most should
be OK. If the flag is there then they at least have the option.

> Do you have a test case for stealing failures? What I'm really asking is
> how critical is this?

I guess you could fill a filesystem completely, and have a sparse file in it.
Then steal a page and splice it in. The prepare_write should fail, but the
page will still be in pagecache, until it gets reclaimed, then it will go
back to zeroes. (no I don't have a test case ;)).

You could do something like remove the page if prepare_write fails, but
there is still a window where a read can see it. Basically I can't see a
way that it can possibly work within our current prepare_write API, and it
is a data corruption bug, so in my opinion it is a candidate for 2.6.21 +
stable.
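For readers who want a starting point for the test case described above, here is a minimal userspace sketch of the splice side only. It moves data from one file to another through a pipe and passes SPLICE_F_MOVE, the flag that asks the kernel to move pages rather than copy them. The helper name splice_move is hypothetical and does not come from the thread. The splice(2) man page describes SPLICE_F_MOVE as only a hint, so this code cannot force a steal; actually reproducing the failure would additionally need a completely full filesystem and a sparse destination file, which this sketch does not set up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): move up to `len` bytes from
 * in_fd to out_fd through a pipe, passing SPLICE_F_MOVE as a hint.
 * Both descriptors use their current file offsets.
 * Returns the number of bytes moved, or -1 with errno set.
 */
static ssize_t splice_move(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;
    int saved;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    while (len > 0) {
        /* Source file -> pipe: at most one pipe's worth per call. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
        if (n < 0)
            goto fail;
        if (n == 0)
            break;                      /* end of the source file */
        len -= (size_t)n;

        /* Pipe -> destination file: drain everything just queued. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m < 0)
                goto fail;
            if (m == 0) {               /* unexpected: nothing consumed */
                errno = EIO;
                goto fail;
            }
            n -= m;
            total += m;
        }
    }

    close(p[0]);
    close(p[1]);
    return total;

fail:
    saved = errno;                      /* close() may clobber errno */
    close(p[0]);
    close(p[1]);
    errno = saved;
    return -1;
}
```

A caller would open the source and destination files, position them, and call splice_move(in_fd, out_fd, nbytes). A return of -1 with errno set to ENOSPC on the second splice is the situation the message above is concerned with.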
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/2] splice: dont steal
On Thu, Mar 15 2007, Nick Piggin wrote:
> On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> > On Thu, Mar 15 2007, Nick Piggin wrote:
> > >
> > > We should be able to allow for it with the new a_ops API I'm working
> > > on.
> >
> > "Should be" and in progress stuff, is it guaranteed to get there?
>
> Well considering that it is needed in order to solve 3 different deadlock
> scenarios in the core write(2) path without taking a big performance hit,
> I'd hope so ;)
>
> It isn't guaranteed, but I have only had positive feedback so far. Would
> take a while to actually get merged, though.

It's not that I don't believe you, I'm just a little reluctant to rip
stuff out with a promise to fix it later when foo and bar are merged,
since things like that have a tendency not to get done because they are
forgotten :-)

Do you have a test case for stealing failures? What I'm really asking is
how critical is this?

-- 
Jens Axboe
Re: [patch 1/2] splice: dont steal
On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> On Thu, Mar 15 2007, Nick Piggin wrote:
> >
> > We should be able to allow for it with the new a_ops API I'm working
> > on.
>
> "Should be" and in progress stuff, is it guaranteed to get there?

Well considering that it is needed in order to solve 3 different deadlock
scenarios in the core write(2) path without taking a big performance hit,
I'd hope so ;)

It isn't guaranteed, but I have only had positive feedback so far. Would
take a while to actually get merged, though.

Nick
Re: [patch 1/2] splice: dont steal
On Thu, Mar 15 2007, Nick Piggin wrote:
> On Thu, Mar 15, 2007 at 12:52:37PM +0100, Jens Axboe wrote:
> > On Wed, Mar 14 2007, Nick Piggin wrote:
> > > Here are a couple of splice patches I found when digging in the area.
> > > I could be wrong, so I'd appreciate confirmation.
> > >
> > > Untested other than compile, because I don't have a good splice test
> > > setup.
> > >
> > > Considering these are data corruption / information leak issues, then
> > > we could do worse than to merge them in 2.6.21 and earlier stable
> > > trees.
> > >
> > > Does anyone really use splice stealing?
> >
> > That's a damn shame, I'd greatly prefer if we can try and fix it
> > instead. Splice isn't really all that used yet to my knowledge, but
> > stealing is one of the niftier features I think. Otherwise you're just
> > copying data again.
>
> We should be able to allow for it with the new a_ops API I'm working
> on.

"Should be" and in progress stuff, is it guaranteed to get there?

> Basically we can pass the page down to the filesystem, and tell it to
> attempt to install that page in-place.
>
> The problem is that we can't just put this page here hoping the fs can
> take it, because it might fail allocating blocks, for example.
>
> Anyway, we can still copy files with 1 less copy than read/write ;)

It's not about 1 vs 2 copies, it's more about 0 vs 1 copy. But yes, we
can file copy with fewer copies.

> It is a nifty feature, but I think it is more of a niche than simply
> saving that 1 copy, because you have to know that the source isn't
> going to be used again.

Well yes, same as when you free() a page. A little more tricky, but
that's mainly the vm assumptions/requirements for page stealing.

> But I'll try to support it with begin_write.

-- 
Jens Axboe
Re: [patch 1/2] splice: dont steal
On Thu, Mar 15, 2007 at 12:52:37PM +0100, Jens Axboe wrote:
> On Wed, Mar 14 2007, Nick Piggin wrote:
> > Here are a couple of splice patches I found when digging in the area.
> > I could be wrong, so I'd appreciate confirmation.
> >
> > Untested other than compile, because I don't have a good splice test
> > setup.
> >
> > Considering these are data corruption / information leak issues, then
> > we could do worse than to merge them in 2.6.21 and earlier stable
> > trees.
> >
> > Does anyone really use splice stealing?
>
> That's a damn shame, I'd greatly prefer if we can try and fix it
> instead. Splice isn't really all that used yet to my knowledge, but
> stealing is one of the niftier features I think. Otherwise you're just
> copying data again.

We should be able to allow for it with the new a_ops API I'm working
on.

Basically we can pass the page down to the filesystem, and tell it to
attempt to install that page in-place.

The problem is that we can't just put this page here hoping the fs can
take it, because it might fail allocating blocks, for example.

Anyway, we can still copy files with 1 less copy than read/write ;)

It is a nifty feature, but I think it is more of a niche than simply
saving that 1 copy, because you have to know that the source isn't
going to be used again.

But I'll try to support it with begin_write.
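The ordering problem described in this message can be illustrated with a small userspace toy model. This is not kernel code: every name in it (toy_file, toy_alloc_block, steal_then_prepare, prepare_then_install, toy_reclaim) is invented for illustration, and the "filesystem" is just a counter of free blocks. It contrasts installing a stolen page before the filesystem has agreed to take it with one possible reading of the proposed approach, where the filesystem gets to refuse before anything becomes visible to readers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SZ 16                  /* toy page size, not the kernel's */

/* Toy file with a single page: cache state plus block allocation state. */
struct toy_file {
    bool cached;                    /* is a page present in the toy cache? */
    char cache[PAGE_SZ];            /* contents a reader would see */
    bool has_block;                 /* has a disk block been allocated? */
    int  free_blocks;               /* blocks left on the toy filesystem */
};

/* Stand-in for the filesystem's prepare step; fails when out of space. */
static bool toy_alloc_block(struct toy_file *f)
{
    if (f->has_block)
        return true;
    if (f->free_blocks == 0)
        return false;               /* models a full filesystem */
    f->free_blocks--;
    f->has_block = true;
    return true;
}

/* A reader sees cached data if present, otherwise zeroes (a hole). */
static void toy_read(const struct toy_file *f, char out[PAGE_SZ])
{
    if (f->cached)
        memcpy(out, f->cache, PAGE_SZ);
    else
        memset(out, 0, PAGE_SZ);
}

/* Problematic order: make the stolen page visible, then ask the fs. */
static bool steal_then_prepare(struct toy_file *f, const char page[PAGE_SZ])
{
    memcpy(f->cache, page, PAGE_SZ);
    f->cached = true;               /* readers can see the data from here */
    return toy_alloc_block(f);      /* a failure here comes too late */
}

/* Alternative order: the fs may refuse before anything is visible. */
static bool prepare_then_install(struct toy_file *f, const char page[PAGE_SZ])
{
    if (!toy_alloc_block(f))
        return false;               /* cache untouched on failure */
    memcpy(f->cache, page, PAGE_SZ);
    f->cached = true;
    return true;
}

/* Toy reclaim: a cached page with no backing block is simply dropped. */
static void toy_reclaim(struct toy_file *f)
{
    if (!f->has_block)
        f->cached = false;
}
```

With free_blocks set to 0, steal_then_prepare reports failure yet toy_read still returns the stolen data until toy_reclaim runs, after which the hole reads as zeroes again. prepare_then_install fails without the data ever becoming visible.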
Re: [patch 1/2] splice: dont steal
On Wed, Mar 14 2007, Nick Piggin wrote:
> Here are a couple of splice patches I found when digging in the area.
> I could be wrong, so I'd appreciate confirmation.
>
> Untested other than compile, because I don't have a good splice test
> setup.
>
> Considering these are data corruption / information leak issues, then
> we could do worse than to merge them in 2.6.21 and earlier stable
> trees.
>
> Does anyone really use splice stealing?

That's a damn shame, I'd greatly prefer if we can try and fix it
instead. Splice isn't really all that used yet to my knowledge, but
stealing is one of the niftier features I think. Otherwise you're just
copying data again.

-- 
Jens Axboe