Re: [patch 1/2] splice: dont steal

2007-03-15 Thread Nick Piggin
On Thu, Mar 15, 2007 at 01:54:32PM +0100, Jens Axboe wrote:
> On Thu, Mar 15 2007, Nick Piggin wrote:
> > On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> > > On Thu, Mar 15 2007, Nick Piggin wrote:
> > > > 
> > > > We should be able to allow for it with the new a_ops API I'm working
> > > > on.
> > > 
> > > "Should be" and in progress stuff, is it guaranteed to get there?
> > 
> > Well considering that it is needed in order to solve 3 different deadlock
> > scenarios in the core write(2) path without taking a big performance hit,
> > I'd hope so ;)
> > 
> > It isn't guaranteed, but I have only had positive feedback so far. Would
> > take a while to actually get merged, though.
> 
> It's not that I don't believe you, I'm just a little reluctant to rip
> stuff out with a promise to fix it later when foo and bar are merged,
> since things like that have a tendency not to get done because they are
> forgotten :-)

Fair enough. The API side is trivial: all I need to do is add a single
flag, make splice pass down the page, and set that flag when stealing.
Filesystems might vary from trivial to impossible, but I think most should
be OK. If the flag is there then they at least have the option.


> Do you have a test case for stealing failures? What I'm really asking is
> how critical is this?

I guess you could fill a filesystem completely, and have a sparse file
in it. Then steal a page and splice it in. The prepare_write should fail,
but the page will still be in pagecache, until it gets reclaimed, then
it will go back to zeroes.

(no I don't have a test case ;)).

You could do something like remove the page if prepare_write fails, but
there is still a window where a read can see it. Basically I can't see
a way that it can possibly work within our current prepare_write API,
and it is a data corruption bug, so in my opinion it is a candidate for
2.6.21 + stable.
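
The failure described above can be sketched as a toy model. This is a simulation in Python under stated assumptions, not kernel code: `ToyFile`, `splice_steal`, and the dictionaries standing in for the disk and the pagecache are hypothetical names made up for illustration. Only the ordering follows the description: the stolen page goes into the pagecache first, then prepare_write fails because the filesystem is full.

```python
PAGE_SIZE = 4096
ZERO_PAGE = bytes(PAGE_SIZE)


class NoSpace(Exception):
    """Stands in for ENOSPC from block allocation (toy model only)."""


class ToyFile:
    """A sparse file on a toy filesystem: holes have no block on disk."""

    def __init__(self, free_blocks):
        self.free_blocks = free_blocks  # blocks left on the toy filesystem
        self.disk = {}                  # page index -> contents backed by a block
        self.pagecache = {}             # page index -> cached page contents

    def prepare_write(self, index):
        # Allocate a block for a hole; fails when the filesystem is full.
        if index not in self.disk:
            if self.free_blocks == 0:
                raise NoSpace(index)
            self.free_blocks -= 1
            self.disk[index] = ZERO_PAGE

    def read(self, index):
        # Reads hit the pagecache first; holes read back as zeroes.
        if index in self.pagecache:
            return self.pagecache[index]
        return self.disk.get(index, ZERO_PAGE)

    def reclaim(self):
        # Memory pressure drops cached pages (dirty state is not modelled).
        self.pagecache.clear()


def splice_steal(f, index, page):
    """Problematic ordering: insert the stolen page, then call prepare_write."""
    f.pagecache[index] = page
    try:
        f.prepare_write(index)
    except NoSpace:
        return False  # the write failed, but the page is already visible
    return True


f = ToyFile(free_blocks=0)      # completely full filesystem, sparse file
stolen = b"x" * PAGE_SIZE
ok = splice_steal(f, 0, stolen)
seen_before = f.read(0)         # a reader sees data that was never written
f.reclaim()
seen_after = f.read(0)          # after reclaim the hole reads as zeroes again
print(ok, seen_before == stolen, seen_after == ZERO_PAGE)
```

In this model, removing the page from `pagecache` in the `except` branch would shorten the window but not close it, since a read can still run between the insert and the failed prepare_write.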

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] splice: dont steal

2007-03-15 Thread Jens Axboe
On Thu, Mar 15 2007, Nick Piggin wrote:
> On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> > On Thu, Mar 15 2007, Nick Piggin wrote:
> > > 
> > > We should be able to allow for it with the new a_ops API I'm working
> > > on.
> > 
> > "Should be" and in progress stuff, is it guaranteed to get there?
> 
> Well considering that it is needed in order to solve 3 different deadlock
> scenarios in the core write(2) path without taking a big performance hit,
> I'd hope so ;)
> 
> It isn't guaranteed, but I have only had positive feedback so far. Would
> take a while to actually get merged, though.

It's not that I don't believe you, I'm just a little reluctant to rip
stuff out with a promise to fix it later when foo and bar are merged,
since things like that have a tendency not to get done because they are
forgotten :-)

Do you have a test case for stealing failures? What I'm really asking is
how critical is this?

-- 
Jens Axboe



Re: [patch 1/2] splice: dont steal

2007-03-15 Thread Nick Piggin
On Thu, Mar 15, 2007 at 01:27:23PM +0100, Jens Axboe wrote:
> On Thu, Mar 15 2007, Nick Piggin wrote:
> > 
> > We should be able to allow for it with the new a_ops API I'm working
> > on.
> 
> "Should be" and in progress stuff, is it guaranteed to get there?

Well considering that it is needed in order to solve 3 different deadlock
scenarios in the core write(2) path without taking a big performance hit,
I'd hope so ;)

It isn't guaranteed, but I have only had positive feedback so far. Would
take a while to actually get merged, though.

Nick


Re: [patch 1/2] splice: dont steal

2007-03-15 Thread Jens Axboe
On Thu, Mar 15 2007, Nick Piggin wrote:
> On Thu, Mar 15, 2007 at 12:52:37PM +0100, Jens Axboe wrote:
> > On Wed, Mar 14 2007, Nick Piggin wrote:
> > > Here are a couple of splice patches I found when digging in the area.
> > > I could be wrong, so I'd appreciate confirmation.
> > > 
> > > Untested other than compile, because I don't have a good splice test
> > > setup.
> > > 
> > > Considering these are data corruption / information leak issues, then
> > > we could do worse than to merge them in 2.6.21 and earlier stable
> > > trees.
> > > 
> > > Does anyone really use splice stealing?
> > 
> > That's a damn shame, I'd greatly prefer if we can try and fix it
> > instead. Splice isn't really all that used yet to my knowledge, but
> > stealing is one of the niftier features I think. Otherwise you're just
> > copying data again.
> 
> We should be able to allow for it with the new a_ops API I'm working
> on.

"Should be" and in progress stuff, is it guaranteed to get there?

> Basically we can pass the page down to the filesystem, and tell it to
> attempt to install that page in-place.
> 
> The problem is that we can't just put this page here hoping the fs can
> take it, because it might fail allocating blocks, for example.
> 
> Anyway, we can still copy files with 1 less copy than read/write ;)

It's not about 1 vs 2 copies, it's more about 0 vs 1 copy. But yes, we
can file copy with fewer copies.

> It is a nifty feature, but I think it is more of a niche than simply
> saving that 1 copy, because you have to know that the source isn't
> going to be used again.

Well yes, same as when you free() a page. A little more tricky, but
that's mainly the vm assumptions/requirements for page stealing.

> But I'll try to support it with begin_write.

-- 
Jens Axboe



Re: [patch 1/2] splice: dont steal

2007-03-15 Thread Nick Piggin
On Thu, Mar 15, 2007 at 12:52:37PM +0100, Jens Axboe wrote:
> On Wed, Mar 14 2007, Nick Piggin wrote:
> > Here are a couple of splice patches I found when digging in the area.
> > I could be wrong, so I'd appreciate confirmation.
> > 
> > Untested other than compile, because I don't have a good splice test
> > setup.
> > 
> > Considering these are data corruption / information leak issues, then
> > we could do worse than to merge them in 2.6.21 and earlier stable
> > trees.
> > 
> > Does anyone really use splice stealing?
> 
> That's a damn shame, I'd greatly prefer if we can try and fix it
> instead. Splice isn't really all that used yet to my knowledge, but
> stealing is one of the niftier features I think. Otherwise you're just
> copying data again.

We should be able to allow for it with the new a_ops API I'm working
on.

Basically we can pass the page down to the filesystem, and tell it to
attempt to install that page in-place.

The problem is that we can't just put this page here hoping the fs can
take it, because it might fail allocating blocks, for example.

Anyway, we can still copy files with 1 less copy than read/write ;)

It is a nifty feature, but I think it is more of a niche than simply
saving that 1 copy, because you have to know that the source isn't
going to be used again.

But I'll try to support it with begin_write.
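
One way to read the idea above is that the filesystem reserves blocks first and installs the passed-down page only once that has succeeded. The following is a minimal sketch of that ordering in Python, not the actual a_ops interface: `ToyFS`, the `steal` flag, and the `begin_write` signature used here are hypothetical and chosen only for illustration.

```python
PAGE_SIZE = 4096
ZERO_PAGE = bytes(PAGE_SIZE)


class ToyFS:
    """Toy filesystem holding a single sparse file (illustrative only)."""

    def __init__(self, free_blocks):
        self.free_blocks = free_blocks
        self.blocks = set()   # page indexes that have a block allocated
        self.pagecache = {}   # page index -> page buffer

    def begin_write(self, index, page=None, steal=False):
        """Hypothetical hook: allocate first, then optionally install `page`.

        Returns the page to write into, or None if allocation failed.
        """
        if index not in self.blocks:
            if self.free_blocks == 0:
                return None   # failure: nothing was made visible to readers
            self.free_blocks -= 1
            self.blocks.add(index)
        if steal and page is not None:
            # In-place install of the caller's page, no data copy.
            self.pagecache[index] = page
        else:
            # No stealing: the caller copies into a pagecache page instead.
            self.pagecache.setdefault(index, bytearray(PAGE_SIZE))
        return self.pagecache[index]

    def read(self, index):
        return bytes(self.pagecache.get(index, ZERO_PAGE))


page = bytearray(b"x" * PAGE_SIZE)

full = ToyFS(free_blocks=0)
rejected = full.begin_write(0, page, steal=True)
print(rejected is None, full.read(0) == ZERO_PAGE)

roomy = ToyFS(free_blocks=1)
installed = roomy.begin_write(0, page, steal=True)
print(installed is page, roomy.read(0) == bytes(page))
```

A filesystem that cannot install pages in place could ignore the flag and fall back to the copying branch, which matches the point that the flag only gives filesystems the option.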


Re: [patch 1/2] splice: dont steal

2007-03-15 Thread Jens Axboe
On Wed, Mar 14 2007, Nick Piggin wrote:
> Here are a couple of splice patches I found when digging in the area.
> I could be wrong, so I'd appreciate confirmation.
> 
> Untested other than compile, because I don't have a good splice test
> setup.
> 
> Considering these are data corruption / information leak issues, then
> we could do worse than to merge them in 2.6.21 and earlier stable
> trees.
> 
> Does anyone really use splice stealing?

That's a damn shame, I'd greatly prefer if we can try and fix it
instead. Splice isn't really all that used yet to my knowledge, but
stealing is one of the niftier features I think. Otherwise you're just
copying data again.

-- 
Jens Axboe


