Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 8/6/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
[...]
> > this completely ignores the use case where the swapping was exactly the
> > right thing to do, but memory has been freed up from a program exiting
> > so that you could now fill that empty RAM with data that was swapped
> > out.
>
> Yeah. However, merging patches (especially when changing heuristics,
> especially in page reclaim) is not about just thinking up a use-case that
> it works well for and telling people that they're putting their heads in
> the sand if they say anything against it. Read this thread and you'll
> find other examples of patches that have been around for as long or
> longer and also have some good use-cases and also have not been merged.

What do you think, Andrew? Swap prefetch is not a panacea; it's not going
to solve all the problems, but it seems to improve the "desktop
experience" and it has been discussed and reviewed a lot (it has even been
discussed more than it should have been). Are you going to push the patch
upstream?

Ciao,
--
Paolo
http://paolo.ciarrocchi.googlepages.com/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
--- [EMAIL PROTECTED] wrote:
> On Mon, 6 Aug 2007, Nick Piggin wrote:
> > [EMAIL PROTECTED] wrote:
> > > On Sun, 29 Jul 2007, Rene Herman wrote:
> > > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > > I agree that tinkering with the core VM code should not be done
> > > > > lightly, but this has been put through the proper process and is
> > > > > stalled with no hints on how to move forward.
> > > >
> > > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > > weren't being addressed.
> > >
> > > I may have missed them, but what I saw from him weren't specific
> > > issues, but instead a nebulous 'something better may come along
> > > later'
> >
> > Something better, ie. the problems with page reclaim being fixed.
> > Why is that nebulous?
>
> because that doesn't begin to address all the benefits.

What do you mean "address the benefits"? What I want to address is the
page reclaim problems.

> the approach of fixing page reclaim and updatedb is pretending that if
> you only do everything right pages won't get pushed to swap in the first
> place, and therefore swap prefetch won't be needed.

You should read what I wrote. Anyway, the fact of the matter is that there
are still fairly significant problems with page reclaim in this workload
which I would like to see fixed.

I personally still think some of the low-hanging fruit *might* be better
fixed before swap prefetch gets merged, but I've repeatedly said I'm sick
of getting dragged back into the whole debate, so I'm happy with whatever
Andrew decides to do with it.

I think it is sad to turn it off for laptops, if it really makes the
"desktop" experience so much better. Surely for _most_ workloads we should
be able to manage 1-2GB of RAM reasonably well.
> this completely ignores the use case where the swapping was exactly the
> right thing to do, but memory has been freed up from a program exiting
> so that you could now fill that empty RAM with data that was swapped
> out.

Yeah. However, merging patches (especially when changing heuristics,
especially in page reclaim) is not about just thinking up a use-case that
it works well for and telling people that they're putting their heads in
the sand if they say anything against it. Read this thread and you'll
find other examples of patches that have been around for as long or
longer and also have some good use-cases and also have not been merged.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Mon, 6 Aug 2007, Nick Piggin wrote:
> [EMAIL PROTECTED] wrote:
> > On Sun, 29 Jul 2007, Rene Herman wrote:
> > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > I agree that tinkering with the core VM code should not be done
> > > > lightly, but this has been put through the proper process and is
> > > > stalled with no hints on how to move forward.
> > >
> > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > weren't being addressed.
> >
> > I may have missed them, but what I saw from him weren't specific
> > issues, but instead a nebulous 'something better may come along later'
>
> Something better, ie. the problems with page reclaim being fixed.
> Why is that nebulous?

because that doesn't begin to address all the benefits.

the approach of fixing page reclaim and updatedb is pretending that if
you only do everything right pages won't get pushed to swap in the first
place, and therefore swap prefetch won't be needed.

this completely ignores the use case where the swapping was exactly the
right thing to do, but memory has been freed up from a program exiting so
that you could now fill that empty RAM with data that was swapped out.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
[EMAIL PROTECTED] wrote:
> On Sun, 29 Jul 2007, Rene Herman wrote:
> > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > I agree that tinkering with the core VM code should not be done
> > > lightly, but this has been put through the proper process and is
> > > stalled with no hints on how to move forward.
> >
> > It has not. Concerns that were raised (by specifically Nick Piggin)
> > weren't being addressed.
>
> I may have missed them, but what I saw from him weren't specific issues,
> but instead a nebulous 'something better may come along later'

Something better, ie. the problems with page reclaim being fixed.
Why is that nebulous?

--
SUSE Labs, Novell Inc.
Re: [ck] Re: -mm merge plans for 2.6.23
Matthew Hawkins wrote:
> On 7/25/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
> > I guess /proc/meminfo, /proc/zoneinfo, /proc/vmstat, /proc/slabinfo
> > before and after the updatedb run with the latest kernel would be a
> > first step. top and vmstat output during the run wouldn't hurt either.
>
> Hi Nick,
>
> I've attached two files with this kind of info. Being up at the cron
> hours of the morning meant I got a better picture of what my system is
> doing. Here's a short summary of what I saw in top:
>
> beagleindexer used gobs of ram. 600M or so (I have 1G)

Hmm OK, beagleindexer. I thought beagle didn't need frequent reindexing
because of inotify? Oh well...

> updatedb didn't use much ram, but while it was running kswapd kept on
> frequenting the top 10 cpu hogs - it would stick around for 5 seconds
> or so, then disappear for no more than 10 seconds, then come back again.
> This behaviour persisted during the run. updatedb ran third
> (beagleindexer was first, then update-dlocatedb)

Kswapd will use CPU when memory is low, even if there is no swapping.

Your "buffers" grew by 600% (from 50MB to 350MB), and slab also grew by a
few thousand entries. This is not just a problem when it pushes out swap;
it will also harm the file-backed working set.

This (which Ray's traces also show) is a bit of a problem. As Andrew
noticed, use-once isn't working well for buffer cache, and it doesn't
really for dentry and inode cache either (although those don't seem to be
as much of a problem on your workload). Andrew has done a little test
patch for this in -mm, but it probably wants more work and testing. If
you can test the -mm kernel and see if things are improved, that would
help.

--
SUSE Labs, Novell Inc.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Hi!

> > That would just save reading the directories. Not sure it helps that
> > much. Much better would be actually if it didn't stat the individual
> > files (and force their dentries/inodes in). I bet it does that to find
> > out if they are directories or not. But in a modern system it could
> > just check the type in the dirent on file systems that support that
> > and not do a stat. Then you would get much fewer dentries/inodes.
>
> FWIW, find(1) does *not* stat non-directories (and neither would this
> approach). So it's just dentries for directories, and you can't
> realistically skip those. OK, you could - if you had banned
> cross-directory rename for directories and propagated "dirty since last
> look" towards root (note that it would be a boolean, not a timestamp).
> Then we could skip unchanged subtrees completely...

Could we help it a little from the kernel and set 'dirty since last look'
on directory renames?

I mean, this is not only updatedb. KDE startup is limited by this, too.
It would be nice to have an effective 'what changed in tree' operation.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [ck] Re: -mm merge plans for 2.6.23
On 7/25/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
> I guess /proc/meminfo, /proc/zoneinfo, /proc/vmstat, /proc/slabinfo
> before and after the updatedb run with the latest kernel would be a
> first step. top and vmstat output during the run wouldn't hurt either.

Hi Nick,

I've attached two files with this kind of info. Being up at the cron
hours of the morning meant I got a better picture of what my system is
doing. Here's a short summary of what I saw in top:

beagleindexer used gobs of ram. 600M or so (I have 1G)

updatedb didn't use much ram, but while it was running kswapd kept on
frequenting the top 10 cpu hogs - it would stick around for 5 seconds or
so, then disappear for no more than 10 seconds, then come back again.
This behaviour persisted during the run. updatedb ran third
(beagleindexer was first, then update-dlocatedb)

I'm going to do this again, this time under a CFS kernel & use Ingo's
sched_debug script to see what the scheduler is doing also. Let me know
if there's anything else you wish to see.

The running kernel at the time was 2.6.22.1-ck. There's no slabinfo since
I'm using slub instead (and I don't have slub debug enabled).

Cheers,
--
Matt

beaglecron.ck    Description: Binary data
updatedbcron.ck  Description: Binary data
Re: [ck] Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Matthew Hawkins wrote:
> updatedb by itself doesn't really bug me, it's just that on occasion
> it's still running at 7am

You should start it earlier then - assuming it doesn't already start at
the earliest opportunity?

Helge Hafting
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sun, 29 Jul 2007, Rene Herman wrote:
> > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > I agree that tinkering with the core VM code should not be done
> > > > lightly, but this has been put through the proper process and is
> > > > stalled with no hints on how to move forward.
> > >
> > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > weren't being addressed.
> >
> > I may have missed them, but what I saw from him weren't specific
> > issues, but instead a nebulous 'something better may come along later'
> >
> > forget the nightly cron jobs for the moment. think of this scenario.
> > you have your memory fairly full with apps that you have open
> > (including firefox with many tabs), you receive a spreadsheet you need
> > to look at, so you fire up openoffice to look at it. then you exit
> > openoffice and try to go back to firefox (after a pause while you walk
> > to the printer to get the printout of the spreadsheet)
>
> And swinging a dead rat from its tail facing east-wards while reciting
> Documentation/CodingStyle.
>
> Okay, very very sorry, that was particularly childish, but that "walking
> to the printer" is of course completely constructed and this _is_
> something to take into account.

yes, it was contrived for simplicity. the same effect would happen if,
instead of going back to firefox, the user went to their e-mail software
and read some mail. doing so should still make the machine idle enough to
let prefetch kick in.

> Swap-prefetch wants to be free, which (also again) it is doing a good
> job at it seems, but this also means that it waits for the VM to be
> _very_ idle before it does anything, and as such we cannot just forget
> the "nightly" scenario and pretend it's about something else entirely.
> As long as the machine's being used, swap-prefetch doesn't kick in.

how long does the machine need to be idle? if someone spends 30 seconds
reading an e-mail, that's an incredibly long time for the system, and I
would think it should be enough to let the prefetch kick in.

> > -- 3: no serious consideration of possible alternatives
>
> Tweaking existing use-once logic is one I've heard, but if we consider
> the i/dcache issue dead, I believe that one is as well. Going to
> userspace is another one. Largest theoretical potential. I myself am
> extremely sceptical about the Linux userland, and largely equate it
> with "smallest _practical_ potential" -- but that might just be me.
>
> A larger swap granularity, possibly even a self-training granularity.
> Up to now, seeks only get costlier and costlier with respect to reads
> with every generation of disk (flash would largely overcome it though)
> and doing more in one read/write _greatly_ improves throughput, maybe
> up to the point that swap-prefetch is no longer very useful. I myself
> don't know about the tradeoffs involved.

larger swap granularity may help, but waiting for the user to need the
RAM and having to wait for it to be read back in is always going to be
worse for the user than pre-populating the free memory (for the case
where the pre-population is right; for other cases it's the same). so I
see this as a red herring.

> I saw Chris Snook make a good post here and am going to defer this part
> to that discussion: http://lkml.org/lkml/2007/7/27/421
>
> But no, it's not a red herring if _practically_ speaking the swapin is
> fast enough once started that people don't actually mind anymore, since
> in that case you could simply do without yet more additional VM
> complexity (and kernel daemon).

swapin will always require disk access, and avoiding doing disk access
while the user is waiting for it, by doing it when the system isn't using
the disk, will always be a win (possibly not as large of a win, but still
a win). on slow laptop drives, where you may only get 20MB/second of
reads under optimal situations, it doesn't take much reading to be
noticed by the user.

> > there are fully legitimate situations where this is useful, the
> > 'papering over' effect is not referring to these, it's referring to
> > other possible problems in the future.
>
> No, it's not just future. Just look at the various things under
> discussion now, such as improved use-once and better swapin.

and these things do not conflict with prefetch, they complement it.

improved use-once will avoid pushing things out to swap in the first
place. this will help during normal workloads, so is valuable in any
case.

better swapin (I assume you are talking about things like larger swap
granularity) will also help during normal workloads when you are
thrashing into swap.

prefetch will help when you have pushed things out to swap and now have
free memory and a momentarily idle system.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sunday 29 July 2007 16:00:22 Ray Lee wrote:
> On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> > If the problem is reading stuff back in from swap at the *same time*
> > that the application is reading stuff from some user file system, and
> > if that user file system is on the same drive as the swap partition
> > (typical on laptops), then interleaving the user file system accesses
> > with the swap partition accesses might overwhelm all other performance
> > problems, due to the frequent long seeks between the two.
>
> Ah, so in a normal scenario where a working-set is getting faulted back
> in, we have the swap storage as well as the file-backed stuff that needs
> to be read as well. So even if swap is organized perfectly, we're still
> seeking. Damn.

That is one reason why I try to have swap on a device dedicated just for
it. It helps keep the system from having to seek all over the drive for
data. (I remember that this was recommended years ago with Windows - back
when you could tell Windows where to put the swap file.)

> On the other hand, that explains another thing that swap prefetch could
> be helping with -- if it preemptively faults the swap back in, then the
> file-backed stuff can be faulted back more quickly, just by the virtue
> of not needing to seek back and forth to swap for its stuff. Hadn't
> thought of that.

For it to really help, swap-prefetch would have to be more aggressive. At
the moment (if I'm reading the code correctly) the system has to have
close to zero activity for it to kick in. A tunable knob controlling how
much activity is too much for the prefetch to kick in would help with
finding a sane default. IMHO it should be the one that provides the most
benefit with the least hit to performance.

> That also implies that people running with swap files rather than swap
> partitions will see less of an issue. I should dig out my old compact
> flash card and try putting swap on that for a week.

Maybe. It all depends on how much seeking is needed to track down the
pages in the swapfile and such.

What would really help make the situation even better would be doing the
log-structured swap + cleaner. The log-structured swap + cleaner should
provide a performance boost by itself - add in the prefetch mechanism and
the benefits are even more visible.

Another way to improve performance would require making the page
replacement mechanism more intelligent. There are bounds to what can be
done in the kernel without negatively impacting performance, but, if I've
read the code correctly, there might be a better way to decide which
pages to evict. One way to do this would be to implement some mechanism
that allows the system to choose a single group of contiguous pages (or,
say, a large soft-page) over swapping out a single page at a time. (Some
form of memory defrag would also be nice, but I can't think of a way to
do that without massively breaking everything.)

> > In case Andrew is so bored he read this far -- yes this wake-up sounds
> > like user space code, with minimal kernel changes to support any
> > particular lower level operation that we can't do already.
>
> He'd suggested using, uhm, ptrace_peek or somesuch for just such a
> purpose. The second half of the issue is to know when and what to
> target.

The userspace suggestion that was thrown out earlier would have been as
error-prone and problematic as FUSE. A solution like you suggest would be
workable - it's small and does a task that is best done in userspace
(IMHO). (IIRC, the original suggestion involved merging maps2 and another
patchset into mainline and using that, combined with PEEKTEXT, to provide
for a userspace swap daemon. Swap, IMHO, should never be handled outside
the kernel.)

What might be useful is a userspace daemon that tracks memory pressure
and uses a concise API to trigger various levels of prefetch and/or swap
aggressiveness.

DRH
--
Dialup is like pissing through a pipette. Slow and excruciatingly
painful.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > Ray wrote: > > Ah, so in a normal scenario where a working-set is getting faulted > > back in, we have the swap storage as well as the file-backed stuff > > that needs to be read as well. So even if swap is organized perfectly, > > we're still seeking. Damn. > > Perhaps this applies in some cases ... perhaps. Yeah, point taken: better data would make this a lot easier to figure out and target fixes. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote: > Ah, so in a normal scenario where a working-set is getting faulted > back in, we have the swap storage as well as the file-backed stuff > that needs to be read as well. So even if swap is organized perfectly, > we're still seeking. Damn. Perhaps this applies in some cases ... perhaps. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > If the problem is reading stuff back in from swap at the *same time* > that the application is reading stuff from some user file system, and if > that user file system is on the same drive as the swap partition > (typical on laptops), then interleaving the user file system accesses > with the swap partition accesses might overwhelm all other performance > problems, due to the frequent long seeks between the two. Ah, so in a normal scenario where a working-set is getting faulted back in, we have the swap storage as well as the file-backed stuff that needs to be read as well. So even if swap is organized perfectly, we're still seeking. Damn. On the other hand, that explains another thing that swap prefetch could be helping with -- if it preemptively faults the swap back in, then the file-backed stuff can be faulted back more quickly, just by the virtue of not needing to seek back and forth to swap for its stuff. Hadn't thought of that. That also implies that people running with swap files rather than swap partitions will see less of an issue. I should dig out my old compact flash card and try putting swap on that for a week. > In that case, swap layout and swap i/o block size are secondary. > However, pre-fetching, so that swap read back is not interleaved > with application file accesses, could help dramatically. > Perhaps we could have a 'wake-up' command, analogous to the various sleep > and hibernate commands. [...] > In case Andrew is so bored he read this far -- yes this wake-up sounds > like user space code, with minimal kernel changes to support any > particular lower level operation that we can't do already. He'd suggested using, uhm, ptrace_peek or somesuch for just such a purpose. The second half of the issue is to know when and what to target. 
Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote: > a log structured scheme, where the writeout happens to sequential spaces > on the drive instead of scattered about. If the problem is reading stuff back in from swap quickly when needed, then this likely helps, by reducing the seeks needed. If the problem is reading stuff back in from swap at the *same time* that the application is reading stuff from some user file system, and if that user file system is on the same drive as the swap partition (typical on laptops), then interleaving the user file system accesses with the swap partition accesses might overwhelm all other performance problems, due to the frequent long seeks between the two. In that case, swap layout and swap i/o block size are secondary. However, pre-fetching, so that swap read back is not interleaved with application file accesses, could help dramatically. === Perhaps we could have a 'wake-up' command, analogous to the various sleep and hibernate commands. The 'wake-up' command could do whatever of the following it knew to do, in order to optimize for an anticipated change in usage patterns:
1) pre-fetch swap
2) clean (write out) dirty pages
3) maximize free memory
4) layout swap nicely
5) pre-fetch a favorite set of apps
Stumble out of bed in the morning, press 'wake-up', start boiling the water for your coffee, and in another ten minutes, one is ready to rock and roll. In case Andrew is so bored he read this far -- yes this wake-up sounds like user space code, with minimal kernel changes to support any particular lower level operation that we can't do already. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
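[Editor's note: a minimal user-space sketch of the "wake-up" idea Paul describes above. Items 1 and 5 of his list amount to touching data so it is resident again before the user sits down; reading a file sequentially pulls it back through the page cache. The paths in FAVORITES are purely illustrative, not part of any real tool.]

```python
import os

# Hypothetical list of apps/libraries to warm up; example paths only.
FAVORITES = ["/usr/bin/firefox", "/usr/lib/libgtk-x11-2.0.so"]

def prefault(path, chunk=1 << 20):
    """Read a file sequentially so its pages are faulted back into RAM."""
    try:
        with open(path, "rb") as f:
            while f.read(chunk):
                pass
    except OSError:
        pass  # missing or unreadable file: nothing to warm

for p in FAVORITES:
    prefault(p)
```

This only covers the easy, file-backed half; pre-fetching anonymous pages out of swap (item 1) would need kernel help, which is what the thread is debating.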
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:52 PM, Ray Lee wrote: Well, that doesn't match my systems. My laptop has 400MB in swap: Which in your case is slightly more than 1/3 of available swap space. Quite a lot for a desktop indeed. And if it's more than a few percent fragmented, please fix current swapout instead of log structuring it. Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 07:19 PM, Ray Lee wrote: > For me, it is generally the case yes. We are still discussing this in the > context of desktop machines and their problems with being slow as things > have been swapped out and generally I expect a desktop to have plenty of > swap which it's not regularly going to fillup significantly since then the > machine's unworkably slow as a desktop anyway. Well, that doesn't match my systems. My laptop has 400MB in swap:

[EMAIL PROTECTED]:~$ free
             total       used       free     shared    buffers     cached
Mem:        894208     883920      10288          0       3044     163224
-/+ buffers/cache:     717652     176556
Swap:      1116476     393132     723344

> > And once there's something already in swap, you now have a packing > > problem when you want to swap something else out. > > Once we're crammed, it gets to be a different situation yes. As far as I'm > concerned that's for another thread though. I'm spending too much time on > LKML as it is... No, it's not even when crammed. It's just when there are holes. mm/swapfile.c does try to cluster things, but doesn't work too hard at it as we don't want to spend all our time looking for a perfect fit that may not exist. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> > Is that generally the case on your systems? Every linux system I've > > run, regardless of RAM, has always pushed things out to swap. > > For me, it is generally the case yes. We are still discussing this in the > context of desktop machines and their problems with being slow as things > have been swapped out and generally I expect a desktop to have plenty of > swap which it's not regularly going to fillup significantly since then the > machine's unworkably slow as a desktop anyway. A simple log optimises writeout (which is latency critical) and can otherwise stall an entire system. In a log you can also have multiple copies of the same page on disk easily, some stale - so you can write out chunks of data that are not all of them removed from memory, just so you get them back more easily if you then do (and I guess you'd mark them accordingly) The second element is a cleaner - something to go around removing stuff from the log that is needed when the disks are idle - and also to repack data in nice linear chunks. So instead of using the empty disk time for page-in you use it for packing data and optimising future paging.
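[Editor's note: a toy model of the log + cleaner Alan describes above — not kernel code, just an illustration. Writeout always appends sequentially at the log head; rewriting a page leaves a stale copy behind, and an idle-time cleaner drops stale copies and repacks the live ones linearly.]

```python
class LogSwap:
    def __init__(self, slots):
        self.log = [None] * slots   # disk slots, written strictly in order
        self.head = 0               # next append position
        self.where = {}             # page id -> slot holding its live copy

    def write(self, page):
        """Append a page at the log head; any older copy goes stale."""
        if self.head >= len(self.log):
            raise RuntimeError("log full; cleaner must run first")
        self.log[self.head] = page
        self.where[page] = self.head
        self.head += 1

    def clean(self):
        """Idle-time cleaner: drop stale copies and repack live pages."""
        live = [p for i, p in enumerate(self.log[:self.head])
                if p is not None and self.where[p] == i]
        self.log = [None] * len(self.log)
        for i, p in enumerate(live):
            self.log[i] = p
            self.where[p] = i
        self.head = len(live)

swap = LogSwap(8)
for p in ["a", "b", "c"]:
    swap.write(p)
swap.write("a")   # rewrite: the old copy of "a" at slot 0 is now stale
swap.clean()      # repack; one live copy of each page, laid out linearly
```

After cleaning, the three live pages occupy slots 0-2 contiguously, which is exactly the property that makes later swapin a linear read instead of scattered seeks.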
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:19 PM, Ray Lee wrote: The program is not a real-world issue and if you do not consider it a useful boundary condition either (okay I guess), how would log structured swap help if I just assume I have plenty of free swap to begin with? Is that generally the case on your systems? Every linux system I've run, regardless of RAM, has always pushed things out to swap. For me, it is generally the case yes. We are still discussing this in the context of desktop machines and their problems with being slow as things have been swapped out and generally I expect a desktop to have plenty of swap which it's not regularly going to fillup significantly since then the machine's unworkably slow as a desktop anyway. And once there's something already in swap, you now have a packing problem when you want to swap something else out. Once we're crammed, it gets to be a different situation yes. As far as I'm concerned that's for another thread though. I'm spending too much time on LKML as it is... Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 06:04 PM, Ray Lee wrote: > >> I am very aware of the costs of seeks (on current magnetic media). > > > > Then perhaps you can just take it on faith -- log structured layouts > > are designed to help minimize seeks, read and write. > > I am particularly bad at faith. Let's take that stupid program that I posted: You only think you are :-). I'm sure there are lots of things you have faith in. Gravity, for example :-). > The program is not a real-world issue and if you do not consider it a useful > boundary condition either (okay I guess), how would log structured swap help > if I just assume I have plenty of free swap to begin with? Is that generally the case on your systems? Every linux system I've run, regardless of RAM, has always pushed things out to swap. And once there's something already in swap, you now have a packing problem when you want to swap something else out.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 06:04 PM, Ray Lee wrote: I am very aware of the costs of seeks (on current magnetic media). Then perhaps you can just take it on faith -- log structured layouts are designed to help minimize seeks, read and write. I am particularly bad at faith. Let's take that stupid program that I posted: http://lkml.org/lkml/2007/7/25/85 You push it out before you hit enter, it's written out to swap, at whatever speed. How should it be laid out so that it's swapped in most efficiently after hitting enter? Reading bigger chunks would quite obviously help, but the layout? The program is not a real-world issue and if you do not consider it a useful boundary condition either (okay I guess), how would log structured swap help if I just assume I have plenty of free swap to begin with? Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 05:20 PM, Ray Lee wrote: > This seems to be now fixing the different problem of swap-space filling up. > I'm quite willing to for now assume I've got plenty free. I was trying to point out that currently, as an example, memory that is linear in a process' space could be fragmented on disk when swapped out. That's today. Under a log-structured scheme, one could set it up such that something that was linear in RAM could be swapped out linearly on the drive, minimizing seeks on writeout, which will naturally minimize seeks on swap in of that same data. > > So, at some point when the system needs to fault those blocks that > > back in, it now has a linear span of sectors to read instead of asking > > the drive to bounce over twenty tracks for a hundred blocks. > > Moreover though -- what I know about log structure is that generally it > optimises for write (swapout) and might make read (swapin) worse due to > fragmentation that wouldn't happen with a regular fs structure. It looks like I'm not doing a very good job of explaining this, I'm afraid. Suffice it to say that a log structured swap would give optimization options that we don't have today. > I guess that cleaner that Alan mentioned might be involved there -- I don't > know how/what it would be doing. Then you should google on `log structured filesystem (primer OR introduction)` and read a few of the links that pop up. You might find it interesting. > I am very aware of the costs of seeks (on current magnetic media). Then perhaps you can just take it on faith -- log structured layouts are designed to help minimize seeks, read and write. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 05:20 PM, Ray Lee wrote: I understand what log structure is generally, but how does it help swapin? Look at the swap out case first. Right now, when swapping out the kernel places whatever it can wherever it can inside the swap space. The closer you are to filling your swap space, the more likely that those swapped out blocks will be all over the place, rather than in one nice chunk. Contrast that with a log structured scheme, where the writeout happens to sequential spaces on the drive instead of scattered about. This seems to be now fixing the different problem of swap-space filling up. I'm quite willing to for now assume I've got plenty free. So, at some point when the system needs to fault those blocks that back in, it now has a linear span of sectors to read instead of asking the drive to bounce over twenty tracks for a hundred blocks. Moreover though -- what I know about log structure is that generally it optimises for write (swapout) and might make read (swapin) worse due to fragmentation that wouldn't happen with a regular fs structure. I guess that cleaner that Alan mentioned might be involved there -- I don't know how/what it would be doing. So, it eliminates the seeks. My laptop drive can read (huh, how odd, it got slower, need to retest in single user mode), hmm, let's go with about 25 MB/s. If we ask for a single block from each track, though, that'll drop to 4k * (1 second / seek time) which is about a megabyte a second if we're lucky enough to read from consecutive tracks. Even worse if it's not. Seeks are the enemy any time you need to hit the drive for anything, be it swapping or optimizing a database. I am very aware of the costs of seeks (on current magnetic media). Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 04:58 PM, Ray Lee wrote: > > On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > >> Right over my head. Why does log-structure help anything? > > > > Log structured disk layouts allow for better placement of writeout, so > > that you can eliminate most or all seeks. Seeks are the enemy when > > trying to get full disk bandwidth. > > > > google on log structured disk layout, or somesuch, for details. > > I understand what log structure is generally, but how does it help swapin? Look at the swap out case first. Right now, when swapping out the kernel places whatever it can wherever it can inside the swap space. The closer you are to filling your swap space, the more likely that those swapped out blocks will be all over the place, rather than in one nice chunk. Contrast that with a log structured scheme, where the writeout happens to sequential spaces on the drive instead of scattered about. So, at some point when the system needs to fault those blocks that back in, it now has a linear span of sectors to read instead of asking the drive to bounce over twenty tracks for a hundred blocks. So, it eliminates the seeks. My laptop drive can read (huh, how odd, it got slower, need to retest in single user mode), hmm, let's go with about 25 MB/s. If we ask for a single block from each track, though, that'll drop to 4k * (1 second / seek time) which is about a megabyte a second if we're lucky enough to read from consecutive tracks. Even worse if it's not. Seeks are the enemy any time you need to hit the drive for anything, be it swapping or optimizing a database. Ray
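[Editor's note: Ray's back-of-the-envelope numbers above can be made concrete. The figures here are illustrative assumptions in the spirit of his mail — roughly 25 MB/s sequential bandwidth and a 4 ms track-to-track seek — not measurements.]

```python
PAGE = 4096                # bytes per swapped page
SEQ_BW = 25 * 1024**2      # assumed sequential read bandwidth, bytes/s
SEEK = 0.004               # assumed seek time to an adjacent track, seconds

# Worst case: one 4k page per seek, i.e. blocks scattered over the disk.
scattered_bw = PAGE / (SEEK + PAGE / SEQ_BW)

# Best case: 100 pages laid out linearly -- one seek, then a streaming read.
pages = 100
linear_bw = (pages * PAGE) / (SEEK + pages * PAGE / SEQ_BW)

print(f"scattered: {scattered_bw / 1024**2:.2f} MB/s")
print(f"linear:    {linear_bw / 1024**2:.2f} MB/s")
```

With these assumed numbers the scattered case lands right around the "about a megabyte a second" Ray mentions, while the linear layout recovers most of the drive's sequential bandwidth — which is the whole argument for keeping swapped-out data contiguous.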
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 04:58 PM, Ray Lee wrote: On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: On 07/29/2007 03:12 PM, Alan Cox wrote: More radically if anyone wants to do real researchy type work - how about log structured swap with a cleaner ? Right over my head. Why does log-structure help anything? Log structured disk layouts allow for better placement of writeout, so that you can eliminate most or all seeks. Seeks are the enemy when trying to get full disk bandwidth. google on log structured disk layout, or somesuch, for details. I understand what log structure is generally, but how does it help swapin? Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 03:12 PM, Alan Cox wrote: > > More radically if anyone wants to do real researchy type work - how about > > log structured swap with a cleaner ? > > Right over my head. Why does log-structure help anything? Log structured disk layouts allow for better placement of writeout, so that you can eliminate most or all seeks. Seeks are the enemy when trying to get full disk bandwidth. google on log structured disk layout, or somesuch, for details. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 03:12 PM, Alan Cox wrote: What are the tradeoffs here? What wants small chunks? Also, as far as I'm aware Linux does not do things like up the granularity when it notices it's swapping in heavily? That sounds sort of promising... Small chunks means you get better efficiency of memory use - large chunks mean you may well page in a lot more than you needed to each time (and cause more paging in turn). Your disk would prefer you fed it big linear I/O's - 512KB would probably be my first guess at tuning a large box under load for paging chunk size. That probably kills my momentary hope that I was looking at yet another good use of large soft-pages seeing as how 512K would be going overboard a bit right? :-/ More radically if anyone wants to do real researchy type work - how about log structured swap with a cleaner ? Right over my head. Why does log-structure help anything? Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote: And now you do it again :-) There is no conclusion -- just the inescapable observation that swap-prefetch was (or may have been) masking the problem of GNU locate being a program that no one in their right mind should be using. isn't your conclusion then that if people just stopped using that version of updatedb the problem would be solved and there would be no need for the swap prefetch patch? that seemed to be what you were strongly implying (if not saying outright) No. What I said outright, every single time, is that swap-prefetch in itself seems to make sense. And specifically that even if the _direct_ problem is a crummy program, it _still_ makes sense generally. Every single time. But see -- you failed to notice this because you guys are stuck in this dumb adversary "us against them" thing so inherent of (online) communities, where you sit around your own habitats patting each other on the back for extended periods of time and then every once in a while go out clinging on to each other vigorously and going "boo! hiss!" at the big bad outside world. I already got overly violent at one point in this thread so I'll leave out any further references to sense-deprived fanboy-culture but please, I said every single time that I'm not against swap-prefetch. I cannot communicate when I'm not being read. I agree that tinkering with the core VM code should not be done lightly, but this has been put through the proper process and is stalled with no hints on how to move forward. It has not. Concerns that were raised (by specifically Nick Piggin) weren't being addressed. forget the nightly cron jobs for the moment. think of this scenario. you have your memory fairly full with apps that you have open (including firefox with many tabs), you receive a spreadsheet you need to look at, so you fire up openoffice to look at it.
then you exit openoffice and try to go back to firefox (after a pause while you walk to the printer to get the printout of the spreadsheet) And swinging a dead rat from its tail facing east-wards while reciting Documentation/CodingStyle. Okay, very very sorry, that was particularly childish, but that "walking to the printer" is of course completely constructed and this _is_ something to take into account. Swap-prefetch wants to be free, which (also again) it is doing a good job at it seems, but this also means that it waits for the VM to be _very_ idle before it does anything and as such, we cannot just forget the "nightly" scenario and pretend it's about something else entirely. As long as the machine's being used, swap-prefetch doesn't kick in. Which is a good feature for swap-prefetch, but also something that needs to be weighed alongside its other features in a discussion of alternatives, where for example something like a larger swap granularity would not have anything of the sort to take into account. If it were about walks to the printer, we could shelve the issue as being of too limited practical use for inclusion. -- 2: no serious investigation into possible downsides Swap-prefetch tries hard to be as free as possible and it seems to largely be succeeding at that. Thing that (obviously -- as in I wouldn't want to state it's the only possible worry anyone could have left) remains is the "papering over effect" it has by design that one might not care for. Arjan van de Ven made another point here about seeking away due to swap-prefetch (just) before the next request comes in, but that's probably a bit of a non-issue in practice with the "very idle" precondition. -- 3: no serious consideration of possible alternatives Tweaking existing use-once logic is one I've heard but if we consider the i/dcache issue dead, I believe that one is as well. Going to userspace is another one. Largest theoretical potential.
I myself am extremely sceptical about the Linux userland, and largely equate it with "smallest _practical_ potential" -- but that might just be me. A larger swap granularity, possibly even a self-training granularity. Up to now, seeks only get costlier and costlier with respect to reads with every generation of disk (flash would largely overcome it though) and doing more in one read/write _greatly_ improves throughput, maybe up to the point that swap-prefetch is no longer very useful. I myself don't know about the tradeoffs involved. larger swap granularity may help, but waiting for the user to need the ram and have to wait for it to be read back in is always going to be worse for the user than pre-populating the free memory (for the case where the pre-population is right, for other cases it's the same). so I see this as a red herring I saw Chris Snook make a good post here and am going to defer this part to that discussion: http://lkml.org/lkml/2007/7/27/421 But no, it's not a red herring if _practically_ speaking the swapin is fast enough once started
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> Contrived thing and all, but what it does do is show exactly how bad seeking > all over swap-space is. If you push it out before hitting enter, the time it > takes easily grows past 10 minutes (with my 768M) versus sub-second (!) when > it's all in to start with. Think in "operations/second" and you get a better view of the disk. > What are the tradeoffs here? What wants small chunks? Also, as far as I'm > aware Linux does not do things like up the granularity when it notices it's > swapping in heavily? That sounds sort of promising... Small chunks means you get better efficiency of memory use - large chunks mean you may well page in a lot more than you needed to each time (and cause more paging in turn). Your disk would prefer you fed it big linear I/O's - 512KB would probably be my first guess at tuning a large box under load for paging chunk size. More radically if anyone wants to do real researchy type work - how about log structured swap with a cleaner ? Alan
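[Editor's note: the chunk-size tradeoff Alan states — small chunks waste seeks, large chunks drag in pages you may not need — can be put in numbers. The seek time, bandwidth, and "useful fraction" here are made-up model parameters for illustration, not measurements or tuning advice.]

```python
PAGE = 4096
SEEK = 0.010               # assumed average seek, seconds
SEQ_BW = 40 * 1024**2      # assumed sequential bandwidth, bytes/s

def time_per_useful_page(chunk_bytes, useful_fraction):
    """Cost of one paging I/O, amortized over the pages actually wanted."""
    pages = chunk_bytes // PAGE
    useful = max(1, int(pages * useful_fraction))
    return (SEEK + chunk_bytes / SEQ_BW) / useful

# Assume only half the pages in each chunk turn out to be needed.
for chunk_kb in (4, 64, 512):
    t = time_per_useful_page(chunk_kb * 1024, useful_fraction=0.5)
    print(f"{chunk_kb:>4} kB chunk: {t * 1000:.2f} ms per useful page")
```

Even with half of every chunk wasted, the per-useful-page cost falls steeply as the chunk grows, because the seek dominates — which is why Alan's first guess for a loaded box is as large as 512KB. The memory-efficiency cost (the wasted pages causing more paging in turn) is what this toy model does not capture.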
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andi wrote: > GNU sort uses a merge sort with temporary files on disk. Not sure > how much it keeps in memory during that, but it's probably less > than 150MB. If I'm reading the source code for GNU sort correctly, then the following snippet of shell code displays how much memory it uses for its primary buffer on typical GNU/Linux systems:

head -2 /proc/meminfo | awk '
    NR == 1 { memtotal = $2 }
    NR == 2 { memfree = $2 }
    END {
        if (memfree > memtotal/8)
            m = memfree
        else
            m = memtotal/8
        print "sort size:", m/2, "kB"
    }
'

That is, over simplifying, GNU sort looks at the first two entries in /proc/meminfo, which for example on a machine near me happen to be:

MemTotal:  2336472 kB
MemFree:    110600 kB

and then uses one-half of whichever is -greater- of MemTotal/8 or MemFree. ... However ... for the typical GNU locate updatedb run, it is sorting the list of pathnames for almost all files on the system, which is usually larger than fits in one of these sized buffers. So it ends up using quite a few of the temporary files you mention, which tends to chew up easily freed memory. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sun, 29 Jul 2007, Rene Herman wrote: On 07/28/2007 11:00 PM, [EMAIL PROTECTED] wrote: > many -mm users use it anyway? He himself said he's not convinced of > usefulness having not seen it help for him (and notice that most > developers are also users), turned it off due to it annoying him at some > point and hasn't seen a serious investigation into potential downsides. if that was the case then people should be responding to the request to get it merged with 'but it caused problems for me when I tried it' I haven't seen any comments like that. So you're saying Andrew did not say that? You're jumping to the conclusion that I am saying that it's causing problems. I don't remember anyone saying that it actually caused problems (including both you and Andrew). I (and others) have been trying to learn what problems people believe it has in the hope that they can be addressed one way or another. > > that the only significant con left is the potential to mask other > > problems. > > Which is not a made-up issue, mind you. As an example, I just now tried > GNU locate and saw it's a complete pig and specifically unsuitable for > the low memory boxes under discussion. Upon completion, it actually > frees enough memory that swap-prefetch _could_ help on some boxes, while > the real issue is that they should first and foremost dump GNU locate. I see the conclusion as being exactly the opposite. And now you do it again :-) There is no conclusion -- just the inescapable observation that swap-prefetch was (or may have been) masking the problem of GNU locate being a program that no one in their right mind should be using. isn't your conclusion then that if people just stopped using that version of updatedb the problem would be solved and there would be no need for the swap prefetch patch?
that seemed to be what you were strongly implying (if not saying outright) so there is a legitimate situation where swap-prefetch will help significantly, what is the downside that prevents it from being included? People being unconvinced it helps all that much, no serious investigation into possible downsides and no consideration of alternatives are three I've personally heard. You don't want to merge a conceptually core VM feature if you're not really convinced. It's not a part of the kernel you can throw a feature into like you could some driver saying "ah, heck, if it makes someone happy" since everything in the VM ends up interacting -- that in fact is actually the hard part of VM as far as I've seen it. And in this situation the proposed feature is something that "papers over a problem" by design -- where it could certainly be that the problem is not solvable in another way simply due to the kernel not growing the possibility to read user's minds anytime soon (which some might even like to rephrase as "due to no problem existing") but that this gets people a bit anxious is not surprising. people who have lots of memory and so don't use swap will never see the benefit of this patch. over the years many people have investigated the problem and tried to address it in other ways (the better version of updatedb is an attempt to fix it for that program as an example), but there is still a problem. I agree that tinkering with the core VM code should not be done lightly, but this has been put through the proper process and is stalled with no hints on how to move forward. I've seen it mentioned that there is still a maintainer but I missed who it is, but I haven't seen any concerns that can be addressed, they all seem to be 'this is a core concept, people need to think about it' or 'but someone may find a better answer in the future' type of things. it's impossible to address these concerns directly. So do it indirectly.
But please don't just say "it help some people (not me mind you!) so merge it and if you don't it's all just politics and we can't do anything about it anyway". Because that's mostly what I've been hearing. And no, I'm not subscribed to any ck mailinglists nor do I hang around its IRC community which will can account for part of that. I expect though that the same holds for the people that actually matter in this, such as Andrew Morton and Nick Piggin. -- 1: people being unconvinced it helps all that much At least partly caused by the updatedb i/dcache red herring that infected this issue. Also, at the point VM pressure has mounted high enough to cause enough to be swapped out to give you a bad experience, a lot of other things have been dropped already as well. It's unsurprising though that it would for example help the issue of openoffice with a large open spreadsheet having been thrown out overnight meaning it's a matter of deciding whether or not this is an important enough issue to fix inside the VM with something like swap-prefetch. Personally -- no opinion, I do not experience the problem (I even
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 11:00 PM, [EMAIL PROTECTED] wrote:

>> many -mm users use it anyway? He himself said he's not convinced of
>> usefulness having not seen it help for him (and notice that most
>> developers are also users), turned it off due to it annoying him at
>> some point and hasn't seen a serious investigation into potential
>> downsides.
>
> if that was the case then people should be responding to the request to
> get it merged with 'but it caused problems for me when I tried it'. I
> haven't seen any comments like that.

So you're saying Andrew did not say that? You're jumping to the
conclusion that I am saying that it's causing problems.

>>> that the only significant con left is the potential to mask other
>>> problems.
>>
>> Which is not a made-up issue, mind you. As an example, I just now
>> tried GNU locate and saw it's a complete pig and specifically
>> unsuitable for the low memory boxes under discussion. Upon completion,
>> it actually frees enough memory that swap-prefetch _could_ help on
>> some boxes, while the real issue is that they should first and
>> foremost dump GNU locate.
>
> I see the conclusion as being exactly the opposite.

And now you do it again :-) There is no conclusion -- just the
inescapable observation that swap-prefetch was (or may have been)
masking the problem of GNU locate being a program that no one in their
right mind should be using.

> so there is a legitimate situation where swap-prefetch will help
> significantly, what is the downside that prevents it from being
> included?

People being unconvinced it helps all that much, no serious
investigation into possible downsides and no consideration of
alternatives are three I've personally heard.

You don't want to merge a conceptually core VM feature if you're not
really convinced. It's not a part of the kernel you can throw a feature
into like you could some driver, saying "ah, heck, if it makes someone
happy", since everything in the VM ends up interacting -- that in fact
is actually the hard part of VM as far as I've seen it.

And in this situation the proposed feature is something that "papers
over a problem" by design -- where it could certainly be that the
problem is not solvable in another way, simply due to the kernel not
growing the possibility to read users' minds anytime soon (which some
might even like to rephrase as "due to no problem existing"), but that
this gets people a bit anxious is not surprising.

> I've seen it mentioned that there is still a maintainer but I missed
> who it is, but I haven't seen any concerns that can be addressed, they
> all seem to be 'this is a core concept, people need to think about it'
> or 'but someone may find a better answer in the future' type of things.
> it's impossible to address these concerns directly.

So do it indirectly. But please don't just say "it helps some people
(not me mind you!) so merge it, and if you don't it's all just politics
and we can't do anything about it anyway". Because that's mostly what
I've been hearing.

And no, I'm not subscribed to any ck mailing lists nor do I hang around
its IRC community, which can account for part of that. I expect though
that the same holds for the people that actually matter in this, such
as Andrew Morton and Nick Piggin.

--
1: people being unconvinced it helps all that much

At least partly caused by the updatedb i/dcache red herring that
infected this issue. Also, at the point VM pressure has mounted high
enough to cause enough to be swapped out to give you a bad experience,
a lot of other things have been dropped already as well.

It's unsurprising though that it would for example help the issue of
openoffice with a large open spreadsheet having been thrown out
overnight, meaning it's a matter of deciding whether or not this is an
important enough issue to fix inside the VM with something like
swap-prefetch. Personally -- no opinion, I do not experience the
problem (I even switch off the machine at night and do not run cron at
all).

--
2: no serious investigation into possible downsides

Swap-prefetch tries hard to be as free as possible and it seems to
largely be succeeding at that. Thing that (obviously -- as in I
wouldn't want to state it's the only possible worry anyone could have
left) remains is the "papering over effect" it has by design that one
might not care for.

--
3: no serious consideration of possible alternatives

Tweaking existing use-once logic is one I've heard but if we consider
the i/dcache issue dead, I believe that one is as well.

Going to userspace is another one. Largest theoretical potential. I
myself am extremely sceptical about the Linux userland, and largely
equate it with "smallest _practical_ potential" -- but that might just
be me.

A larger swap granularity, possibly even a self-training granularity.
Up to now, seeks only get costlier and costlier with respect to reads
with every generation of disk (flash would largely overcome it though)
and doing more in one read/write _greatly_ improves throughput, maybe
up to the point that swap-prefetch is no longer very useful. I myself
don't know about the tradeoffs involved.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 01:21 PM, Alan Cox wrote:

>> It is. Prefetched pages can be dropped on the floor without additional
>> I/O.
>
> Which is essentially free for most cases. In addition your disk access
> may well have been in idle time (and should be for this sort of stuff)

Yes. The swap-prefetch patch ensures that the machine (well, the VM) is
very idle before it allows itself to kick in.

> and if it was in the same chunk as something nearby was effectively
> free anyway. Actual physical disk ops are a precious resource and
> anything that mostly reduces the number will be a win - not to say swap
> prefetch is the right answer but accidentally or otherwise there are
> good reasons it may happen to help.
>
> Bigger more linear chunks of writeout/readin is much more important I
> suspect than swap prefetching.

Yes, I believe this might be an important point. Earlier I posted a
dumb little VM thrasher:

http://lkml.org/lkml/2007/7/25/85

Contrived thing and all, but what it does do is show exactly how bad
seeking all over swap-space is. If you push it out before hitting
enter, the time it takes easily grows past 10 minutes (with my 768M)
versus sub-second (!) when it's all in to start with.

What are the tradeoffs here? What wants small chunks? Also, as far as
I'm aware Linux does not do things like up the granularity when it
notices it's swapping in heavily? That sounds sort of promising...

>> good overview of exactly how broken -mm can be at times. How many -mm
>> users use it anyway? He himself said he's not convinced of usefulness
>> having not
>
> I've been using it for months with no noticed problem. I turn it on
> because it might as well get tested. I've not done comparison tests so
> I can't comment on if it's worth it. Lots of -mm testers turn
> *everything* on because it's a test kernel.

Okay.

Rene.
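The thrasher itself is only linked above, but the access-pattern difference it exercises can be sketched with a small, hypothetical Python snippet (not Rene's program): read the same data once in linear page order and once in shuffled page order. On a rotating disk with cold caches the shuffled pass pays roughly one seek per 4 KiB page, which is exactly the "seeking all over swap-space" cost described; on a tiny, cached file like this one both passes are fast, so the sketch only demonstrates the patterns themselves.

```python
# Hypothetical illustration of sequential vs scattered page reads.
import os
import random
import tempfile

PAGE = 4096
PAGES = 256  # 1 MiB; a real thrasher would use much more than RAM

def read_pages(path, order):
    """Read 4 KiB pages in the given order; return total bytes read."""
    total = 0
    with open(path, "rb") as f:
        for i in order:
            f.seek(i * PAGE)
            total += len(f.read(PAGE))
    return total

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(PAGE * PAGES))

seq_order = list(range(PAGES))
rnd_order = seq_order[:]
random.shuffle(rnd_order)  # same pages, worst-case ordering

n_seq = read_pages(path, seq_order)
n_rnd = read_pages(path, rnd_order)
os.unlink(path)

# Both orders transfer identical data; only the head movement differs.
print(n_seq, n_rnd)
```

Both passes read exactly the same bytes; the order of the offsets handed to the disk is the only variable, which is why larger, more linear swap chunks (as discussed below in the thread) help so much.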
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andi wrote:
> GNU sort uses a merge sort with temporary files on disk. Not sure how
> much it keeps in memory during that, but it's probably less than 150MB.

If I'm reading the source code for GNU sort correctly, then the
following snippet of shell code displays how much memory it uses for
its primary buffer on typical GNU/Linux systems:

    head -2 /proc/meminfo | awk '
        NR == 1 { memtotal = $2 }
        NR == 2 { memfree = $2 }
        END {
            if (memfree > memtotal/8)
                m = memfree
            else
                m = memtotal/8
            print "sort size:", m/2, "kB"
        }
    '

That is, over simplifying, GNU sort looks at the first two entries in
/proc/meminfo, which for example on a machine near me happen to be:

    MemTotal:  2336472 kB
    MemFree:    110600 kB

and then uses one-half of whichever is -greater- of MemTotal/8 or
MemFree.

... However ... for the typical GNU locate updatedb run, it is sorting
the list of pathnames for almost all files on the system, which is
usually larger than fits in one of these sized buffers. So it ends up
using quite a few of the temporary files you mention, which tends to
chew up easily freed memory.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
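For readers who don't speak awk, the sizing rule Paul describes can be restated as a small, hypothetical Python helper (the function name and second example values are mine, not from the mail): take whichever is greater of MemTotal/8 and MemFree, both in kB as /proc/meminfo reports them, then halve it.

```python
# Hypothetical helper mirroring the buffer-sizing rule described above.
def sort_buffer_kb(memtotal_kb: int, memfree_kb: int) -> float:
    """Half of the greater of MemTotal/8 and MemFree, in kB."""
    m = max(memtotal_kb / 8, memfree_kb)
    return m / 2

# With the example /proc/meminfo values from the mail
# (MemTotal: 2336472 kB, MemFree: 110600 kB), MemTotal/8 = 292059 kB
# wins, giving a ~146029 kB (~143 MB) buffer -- which matches Andi's
# "probably less than 150MB" guess.
print(sort_buffer_kb(2336472, 110600))  # -> 146029.5
```

On a freshly booted box with most memory free, MemFree wins instead, so the buffer can be considerably larger than MemTotal/16.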
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> Contrived thing and all, but what it does do is show exactly how bad
> seeking all over swap-space is. If you push it out before hitting
> enter, the time it takes easily grows past 10 minutes (with my 768M)
> versus sub-second (!) when it's all in to start with.

Think in operations/second and you get a better view of the disk.

> What are the tradeoffs here? What wants small chunks? Also, as far as
> I'm aware Linux does not do things like up the granularity when it
> notices it's swapping in heavily? That sounds sort of promising...

Small chunks means you get better efficiency of memory use - large
chunks mean you may well page in a lot more than you needed to each
time (and cause more paging in turn).

Your disk would prefer you fed it big linear I/O's - 512KB would
probably be my first guess at tuning a large box under load for paging
chunk size.

More radically if anyone wants to do real researchy type work - how
about log structured swap with a cleaner ?

Alan
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:

>> And now you do it again :-) There is no conclusion -- just the
>> inescapable observation that swap-prefetch was (or may have been)
>> masking the problem of GNU locate being a program that no one in
>> their right mind should be using.
>
> isn't your conclusion then that if people just stopped using that
> version of updatedb the problem would be solved and there would be no
> need for the swap prefetch patch?
>
> that seemed to be what you were strongly implying (if not saying
> outright)

No. What I said outright, every single time, is that swap-prefetch in
itself seems to make sense. And specifically that even if the _direct_
problem is a crummy program, it _still_ makes sense generally. Every
single time.

But see -- you failed to notice this because you guys are stuck in this
dumb adversary "us against them" thing so inherent of (online)
communities, where you sit around your own habitats patting each other
on the back for extended periods of time and then every once in a while
go out clinging on to each other vigorously and going "boo! hiss!" at
the big bad outside world.

I already got overly violent at one point in this thread so I'll leave
out any further references to sense-deprived fanboy-culture but please,
I said every single time that I'm not against swap-prefetch. I cannot
communicate when I'm not being read.

> I agree that tinkering with the core VM code should not be done
> lightly, but this has been put through the proper process and is
> stalled with no hints on how to move forward.

It has not. Concerns that were raised (by specifically Nick Piggin)
weren't being addressed.

> forget the nightly cron jobs for the moment. think of this scenario.
> you have your memory fairly full with apps that you have open
> (including firefox with many tabs), you receive a spreadsheet you need
> to look at, so you fire up openoffice to look at it.
>
> then you exit openoffice and try to go back to firefox (after a pause
> while you walk to the printer to get the printout of the spreadsheet)

And swinging a dead rat from its tail facing east-wards while reciting
Documentation/CodingStyle.

Okay, very very sorry, that was particularly childish, but that walking
to the printer is of course completely constructed and this _is_
something to take into account. Swap-prefetch wants to be free, which
(also again) it is doing a good job at it seems, but this also means
that it waits for the VM to be _very_ idle before it does anything and
as such, we cannot just forget the nightly scenario and pretend it's
about something else entirely. As long as the machine's being used,
swap-prefetch doesn't kick in.

Which is a good feature for swap-prefetch, but also something that
needs to be weighed alongside its other features in a discussion of
alternatives, where for example something like a larger swap
granularity would not have anything of the sort to take into account.
If it were about walks to the printer, we could shelve the issue as
being of too limited practical use for inclusion.

>> -- 2: no serious investigation into possible downsides
>>
>> Swap-prefetch tries hard to be as free as possible and it seems to
>> largely be succeeding at that. Thing that (obviously -- as in I
>> wouldn't want to state it's the only possible worry anyone could have
>> left) remains is the "papering over effect" it has by design that one
>> might not care for.

Arjan van de Ven made another point here about seeking away due to
swap-prefetch (just) before the next request comes in, but that's
probably a bit of a non-issue in practice with the very "idle"
precondition.

>> -- 3: no serious consideration of possible alternatives
>>
>> Tweaking existing use-once logic is one I've heard but if we consider
>> the i/dcache issue dead, I believe that one is as well.
>>
>> Going to userspace is another one. Largest theoretical potential. I
>> myself am extremely sceptical about the Linux userland, and largely
>> equate it with "smallest _practical_ potential" -- but that might
>> just be me.
>>
>> A larger swap granularity, possibly even a self-training granularity.
>> Up to now, seeks only get costlier and costlier with respect to reads
>> with every generation of disk (flash would largely overcome it
>> though) and doing more in one read/write _greatly_ improves
>> throughput, maybe up to the point that swap-prefetch is no longer
>> very useful. I myself don't know about the tradeoffs involved.
>
> larger swap granularity may help, but waiting for the user to need the
> ram and have to wait for it to be read back in is always going to be
> worse for the user than pre-populating the free memory (for the case
> where the pre-population is right, for other cases it's the same). so
> I see this as a red herring

I saw Chris Snook make a good post here and am going to defer this part
to that discussion: http://lkml.org/lkml/2007/7/27/421

But no, it's not a red herring if _practically_ speaking the swapin is
fast enough once started that people
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 03:12 PM, Alan Cox wrote:

>> What are the tradeoffs here? What wants small chunks? Also, as far as
>> I'm aware Linux does not do things like up the granularity when it
>> notices it's swapping in heavily? That sounds sort of promising...
>
> Small chunks means you get better efficiency of memory use - large
> chunks mean you may well page in a lot more than you needed to each
> time (and cause more paging in turn).
>
> Your disk would prefer you fed it big linear I/O's - 512KB would
> probably be my first guess at tuning a large box under load for paging
> chunk size.

That probably kills my momentary hope that I was looking at yet another
good use of large soft-pages, seeing as how 512K would be going
overboard a bit, right? :-/

> More radically if anyone wants to do real researchy type work - how
> about log structured swap with a cleaner ?

Right over my head. Why does log-structure help anything?

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 03:12 PM, Alan Cox wrote:
>> More radically if anyone wants to do real researchy type work - how
>> about log structured swap with a cleaner ?
>
> Right over my head. Why does log-structure help anything?

Log structured disk layouts allow for better placement of writeout, so
that you can eliminate most or all seeks. Seeks are the enemy when
trying to get full disk bandwidth. Google on "log structured disk
layout", or somesuch, for details.

Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 04:58 PM, Ray Lee wrote:
>> On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
>>> Right over my head. Why does log-structure help anything?
>>
>> Log structured disk layouts allow for better placement of writeout,
>> so that you can eliminate most or all seeks. Seeks are the enemy when
>> trying to get full disk bandwidth. Google on "log structured disk
>> layout", or somesuch, for details.
>
> I understand what log structure is generally, but how does it help
> swapin?

Look at the swap out case first. Right now, when swapping out the
kernel places whatever it can wherever it can inside the swap space.
The closer you are to filling your swap space, the more likely that
those swapped out blocks will be all over the place, rather than in one
nice chunk.

Contrast that with a log structured scheme, where the writeout happens
to sequential spaces on the drive instead of scattered about. So, at
some point when the system needs to fault those blocks back in, it now
has a linear span of sectors to read instead of asking the drive to
bounce over twenty tracks for a hundred blocks. So, it eliminates the
seeks.

My laptop drive can read (huh, how odd, it got slower, need to retest
in single user mode), hmm, let's go with about 25 MB/s. If we ask for a
single block from each track, though, that'll drop to 4k * (1 second /
seek time), which is about a megabyte a second if we're lucky enough to
read from consecutive tracks. Even worse if it's not.

Seeks are the enemy any time you need to hit the drive for anything, be
it swapping or optimizing a database.

Ray
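Ray's back-of-the-envelope arithmetic generalizes to a simple model: each chunk read costs one seek plus the time to stream the chunk at the drive's sequential rate. A hypothetical Python sketch (all figures assumed, not measured; the 4 ms seek stands in for Ray's "lucky, consecutive tracks" case):

```python
# Effective throughput when every chunk read pays one seek.
def effective_bw(chunk_bytes: float, seek_s: float, stream_bw: float) -> float:
    """Bytes/second, given one seek of seek_s per chunk_bytes read."""
    return chunk_bytes / (seek_s + chunk_bytes / stream_bw)

STREAM = 25e6   # ~25 MB/s sequential, Ray's laptop figure
SEEK = 0.004    # 4 ms short seek -- an assumed best case

# 4 KiB per seek: roughly the "about a megabyte a second" Ray computes.
print(effective_bw(4096, SEEK, STREAM) / 1e6, "MB/s")

# Alan's 512KB paging-chunk guess recovers most of the streaming rate.
print(effective_bw(512 * 1024, SEEK, STREAM) / 1e6, "MB/s")
```

With these assumed numbers, 4 KiB chunks yield on the order of 1 MB/s while 512 KB chunks get back above 20 MB/s, which is the quantitative argument behind both Alan's chunk-size suggestion and the log-structured layout: either fewer seeks per byte, or fewer seeks outright.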
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 04:58 PM, Ray Lee wrote:

> On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
>> On 07/29/2007 03:12 PM, Alan Cox wrote:
>>> More radically if anyone wants to do real researchy type work - how
>>> about log structured swap with a cleaner ?
>>
>> Right over my head. Why does log-structure help anything?
>
> Log structured disk layouts allow for better placement of writeout, so
> that you can eliminate most or all seeks. Seeks are the enemy when
> trying to get full disk bandwidth. Google on "log structured disk
> layout", or somesuch, for details.

I understand what log structure is generally, but how does it help
swapin?

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 05:20 PM, Ray Lee wrote:
> > I understand what log structure is generally, but how does it help
> > swapin?
>
> Look at the swap out case first. Right now, when swapping out the kernel
> places whatever it can wherever it can inside the swap space. The closer
> you are to filling your swap space, the more likely that those swapped
> out blocks will be all over the place, rather than in one nice chunk.
> Contrast that with a log structured scheme, where the writeout happens to
> sequential spaces on the drive instead of scattered about.

This seems to be fixing the different problem of swap-space filling up. I'm quite willing to for now assume I've got plenty free.

> So, at some point when the system needs to fault those blocks back in, it
> now has a linear span of sectors to read instead of asking the drive to
> bounce over twenty tracks for a hundred blocks.

Moreover though -- what I know about log structure is that generally it optimises for write (swapout) and might make read (swapin) worse due to fragmentation that wouldn't happen with a regular fs structure. I guess that cleaner that Alan mentioned might be involved there -- I don't know how/what it would be doing.

> So, it eliminates the seeks. My laptop drive can read (huh, how odd, it
> got slower, need to retest in single user mode), hmm, let's go with about
> 25 MB/s. If we ask for a single block from each track, though, that'll
> drop to 4k * (1 second / seek time) which is about a megabyte a second if
> we're lucky enough to read from consecutive tracks. Even worse if it's
> not. Seeks are the enemy any time you need to hit the drive for anything,
> be it swapping or optimizing a database.

I am very aware of the costs of seeks (on current magnetic media).

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 05:20 PM, Ray Lee wrote:
>
> This seems to be fixing the different problem of swap-space filling up.
> I'm quite willing to for now assume I've got plenty free.

I was trying to point out that currently, as an example, memory that is linear in a process' space could be fragmented on disk when swapped out. That's today. Under a log-structured scheme, one could set it up such that something that was linear in RAM could be swapped out linearly on the drive, minimizing seeks on writeout, which will naturally minimize seeks on swap in of that same data.

> > So, at some point when the system needs to fault those blocks back in,
> > it now has a linear span of sectors to read instead of asking the drive
> > to bounce over twenty tracks for a hundred blocks.
>
> Moreover though -- what I know about log structure is that generally it
> optimises for write (swapout) and might make read (swapin) worse due to
> fragmentation that wouldn't happen with a regular fs structure.

It looks like I'm not doing a very good job of explaining this, I'm afraid. Suffice it to say that a log structured swap would give optimization options that we don't have today.

> I guess that cleaner that Alan mentioned might be involved there -- I
> don't know how/what it would be doing.

Then you should google on `log structured filesystem (primer OR introduction)` and read a few of the links that pop up. You might find it interesting.

> I am very aware of the costs of seeks (on current magnetic media).

Then perhaps you can just take it on faith -- log structured layouts are designed to help minimize seeks, read and write.

Ray
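The fragmentation point Ray is making can be shown with a toy model (this is not kernel code; the hole pattern and slot counts are invented): first-fit allocation into whatever holes exist versus appending to the head of a log, counting the seeks needed to read one process's pages back in.

```python
# Toy model: scattered first-fit swap-slot allocation vs. log-structured
# append. A 'seek' here is any jump to a non-adjacent disk slot.
def count_seeks(slots):
    """Count non-contiguous transitions in a read-back order of slots."""
    return sum(1 for a, b in zip(slots, slots[1:]) if b != a + 1)

# Fragmented swap: free slots are holes left behind by earlier activity
# (an assumed pattern, purely for illustration).
free_holes = [3, 7, 12, 13, 20, 28, 40, 41]
scattered = free_holes[:5]            # first-fit grabs the first 5 holes

# Log-structured: writeout appends sequentially at the head of the log.
log_head = 100
linear = list(range(log_head, log_head + 5))

print(count_seeks(scattered), count_seeks(linear))  # 3 0
```

Same five pages, but the scattered layout costs three extra seeks on swapin while the log layout costs none, which is the "linear span of sectors" argument from earlier in the thread.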
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 06:04 PM, Ray Lee wrote:
> > I am very aware of the costs of seeks (on current magnetic media).
>
> Then perhaps you can just take it on faith -- log structured layouts are
> designed to help minimize seeks, read and write.

I am particularly bad at faith. Let's take that stupid program that I posted: http://lkml.org/lkml/2007/7/25/85

You push it out before you hit enter, it's written out to swap, at whatever speed. How should it be laid out so that it's swapped in most efficiently after hitting enter? Reading bigger chunks would quite obviously help, but the layout?

The program is not a real-world issue and if you do not consider it a useful boundary condition either (okay I guess), how would log structured swap help if I just assume I have plenty of free swap to begin with?

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 06:04 PM, Ray Lee wrote:
> > > I am very aware of the costs of seeks (on current magnetic media).
> >
> > Then perhaps you can just take it on faith -- log structured layouts
> > are designed to help minimize seeks, read and write.
>
> I am particularly bad at faith. Let's take that stupid program that I
> posted:

You only think you are :-). I'm sure there are lots of things you have faith in. Gravity, for example :-).

> The program is not a real-world issue and if you do not consider it a
> useful boundary condition either (okay I guess), how would log structured
> swap help if I just assume I have plenty of free swap to begin with?

Is that generally the case on your systems? Every linux system I've run, regardless of RAM, has always pushed things out to swap. And once there's something already in swap, you now have a packing problem when you want to swap something else out.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:19 PM, Ray Lee wrote:
> > The program is not a real-world issue and if you do not consider it a
> > useful boundary condition either (okay I guess), how would log
> > structured swap help if I just assume I have plenty of free swap to
> > begin with?
>
> Is that generally the case on your systems? Every linux system I've run,
> regardless of RAM, has always pushed things out to swap.

For me, it is generally the case, yes. We are still discussing this in the context of desktop machines and their problems with being slow as things have been swapped out, and generally I expect a desktop to have plenty of swap which it's not regularly going to fill up significantly, since then the machine's unworkably slow as a desktop anyway.

> And once there's something already in swap, you now have a packing
> problem when you want to swap something else out.

Once we're crammed, it gets to be a different situation, yes. As far as I'm concerned that's for another thread though. I'm spending too much time on LKML as it is...

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> > Is that generally the case on your systems? Every linux system I've
> > run, regardless of RAM, has always pushed things out to swap.
>
> For me, it is generally the case yes. We are still discussing this in the
> context of desktop machines and their problems with being slow as things
> have been swapped out and generally I expect a desktop to have plenty of
> swap which it's not regularly going to fill up significantly since then
> the machine's unworkably slow as a desktop anyway.

A simple log optimises writeout (which is latency critical) and can otherwise stall an entire system. In a log you can also have multiple copies of the same page on disk easily, some stale - so you can write out chunks of data, not all of which have been removed from memory, just so you get them back more easily if you then do (and I guess you'd mark them accordingly).

The second element is a cleaner - something to go around removing stuff from the log that is no longer needed, when the disks are idle - and also to repack data in nice linear chunks. So instead of using the empty disk time for page-in you use it for packing data and optimising future paging.
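The cleaner described above can be sketched as a toy model (purely illustrative; the page ids and the list-of-tuples log representation are invented, and a real cleaner would of course operate on disk blocks, not Python lists): during idle time, copy the still-live pages to the head of the log in order and drop the stale copies, so a future swapin reads one linear run.

```python
# Toy sketch of a log cleaner: drop stale entries, repack live pages
# contiguously at the head of the log.
def clean(log, head):
    """log: list of (page_id, stale) in write order.
    Returns the repacked log as (slot, page_id) pairs starting at `head`."""
    live = [pid for pid, stale in log if not stale]
    # Repack live pages into consecutive slots starting at `head`.
    return [(head + i, pid) for i, pid in enumerate(live)]

# 'b' was written twice; the earlier copy is stale, as is 'd'.
log = [("a", False), ("b", True), ("c", False), ("b", False), ("d", True)]
print(clean(log, head=200))   # [(200, 'a'), (201, 'c'), (202, 'b')]
```

The point of the sketch: after cleaning, the live pages occupy one contiguous span, which is what makes the subsequent page-in cheap.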
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 07:19 PM, Ray Lee wrote:
>
> For me, it is generally the case yes. We are still discussing this in the
> context of desktop machines and their problems with being slow as things
> have been swapped out and generally I expect a desktop to have plenty of
> swap which it's not regularly going to fill up significantly since then
> the machine's unworkably slow as a desktop anyway.

*Shrug* Well, that doesn't match my systems. My laptop has 400MB in swap:

[EMAIL PROTECTED]:~$ free
             total       used       free     shared    buffers     cached
Mem:        894208     883920      10288          0       3044     163224
-/+ buffers/cache:     717652     176556
Swap:      1116476     393132     723344

> > And once there's something already in swap, you now have a packing
> > problem when you want to swap something else out.
>
> Once we're crammed, it gets to be a different situation yes. As far as
> I'm concerned that's for another thread though. I'm spending too much
> time on LKML as it is...

No, it's not even when crammed. It's just when there are holes. mm/swapfile.c does try to cluster things, but doesn't work too hard at it as we don't want to spend all our time looking for a perfect fit that may not exist.

Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:52 PM, Ray Lee wrote:
> *Shrug* Well, that doesn't match my systems. My laptop has 400MB in swap:

Which in your case is slightly more than 1/3 of available swap space. Quite a lot for a desktop indeed. And if it's more than a few percent fragmented, please fix current swapout instead of log structuring it.

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote:
> a log structured scheme, where the writeout happens to sequential spaces
> on the drive instead of scattered about.

If the problem is reading stuff back in from swap quickly when needed, then this likely helps, by reducing the seeks needed.

If the problem is reading stuff back in from swap at the *same time* that the application is reading stuff from some user file system, and if that user file system is on the same drive as the swap partition (typical on laptops), then interleaving the user file system accesses with the swap partition accesses might overwhelm all other performance problems, due to the frequent long seeks between the two.

In that case, swap layout and swap i/o block size are secondary. However, pre-fetching, so that swap read back is not interleaved with application file accesses, could help dramatically.

===

Perhaps we could have a 'wake-up' command, analogous to the various sleep and hibernate commands. The 'wake-up' command could do whatever of the following it knew to do, in order to optimize for an anticipated change in usage patterns:
 1) pre-fetch swap
 2) clean (write out) dirty pages
 3) maximize free memory
 4) layout swap nicely
 5) pre-fetch a favorite set of apps

Stumble out of bed in the morning, press 'wake-up', start boiling the water for your coffee, and in another ten minutes, one is ready to rock and roll.

In case Andrew is so bored he read this far -- yes, this wake-up sounds like user space code, with minimal kernel changes to support any particular lower level operation that we can't do already.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
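The userspace half of Paul's 'wake-up' idea might begin with a decision step like the following sketch. Everything here is hypothetical: the function name, the threshold, and the decision rule are invented for illustration, and the input is /proc/meminfo-style text rather than a real kernel interface.

```python
# Hypothetical 'wake-up' decision step: from meminfo-style data, decide
# whether prefetching swap back in is worth doing. The 100 MB free-RAM
# threshold is an arbitrary assumed default, not a real tunable.
def should_prefetch(meminfo_text, min_free_kb=100_000):
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields[key.strip()] = int(rest.split()[0])   # value in kB
    swapped_kb = fields["SwapTotal"] - fields["SwapFree"]
    # Only worth it if something is in swap AND there is free RAM to land in.
    return swapped_kb > 0 and fields["MemFree"] >= min_free_kb

sample = "MemFree: 750400 kB\nSwapTotal: 4875716 kB\nSwapFree: 4715660 kB\n"
print(should_prefetch(sample))   # True
```

Steps 1-5 of the wake-up list would then hang off this gate; the point of keeping it in userspace, as the mail says, is that only minimal kernel support is needed for the individual operations.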
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> If the problem is reading stuff back in from swap at the *same time* that
> the application is reading stuff from some user file system, and if that
> user file system is on the same drive as the swap partition (typical on
> laptops), then interleaving the user file system accesses with the swap
> partition accesses might overwhelm all other performance problems, due to
> the frequent long seeks between the two.

Ah, so in a normal scenario where a working-set is getting faulted back in, we have the swap storage as well as the file-backed stuff that needs to be read as well. So even if swap is organized perfectly, we're still seeking. Damn.

On the other hand, that explains another thing that swap prefetch could be helping with -- if it preemptively faults the swap back in, then the file-backed stuff can be faulted back more quickly, just by virtue of not needing to seek back and forth to swap for its stuff. Hadn't thought of that.

That also implies that people running with swap files rather than swap partitions will see less of an issue. I should dig out my old compact flash card and try putting swap on that for a week.

> In that case, swap layout and swap i/o block size are secondary. However,
> pre-fetching, so that swap read back is not interleaved with application
> file accesses, could help dramatically.

*Nod*

> Perhaps we could have a 'wake-up' command, analogous to the various sleep
> and hibernate commands. [...] In case Andrew is so bored he read this far
> -- yes this wake-up sounds like user space code, with minimal kernel
> changes to support any particular lower level operation that we can't do
> already.

He'd suggested using, uhm, ptrace_peek or somesuch for just such a purpose. The second half of the issue is to know when and what to target.
Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote:
> Ah, so in a normal scenario where a working-set is getting faulted back
> in, we have the swap storage as well as the file-backed stuff that needs
> to be read as well. So even if swap is organized perfectly, we're still
> seeking. Damn.

Perhaps this applies in some cases ... perhaps.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> Ray wrote:
> > Ah, so in a normal scenario where a working-set is getting faulted back
> > in, we have the swap storage as well as the file-backed stuff that
> > needs to be read as well. So even if swap is organized perfectly, we're
> > still seeking. Damn.
>
> Perhaps this applies in some cases ... perhaps.

Yeah, point taken: better data would make this a lot easier to figure out and target fixes.

Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sunday 29 July 2007 16:00:22 Ray Lee wrote:
> On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> > If the problem is reading stuff back in from swap at the *same time*
> > that the application is reading stuff from some user file system, and
> > if that user file system is on the same drive as the swap partition
> > (typical on laptops), then interleaving the user file system accesses
> > with the swap partition accesses might overwhelm all other performance
> > problems, due to the frequent long seeks between the two.
>
> Ah, so in a normal scenario where a working-set is getting faulted back
> in, we have the swap storage as well as the file-backed stuff that needs
> to be read as well. So even if swap is organized perfectly, we're still
> seeking. Damn.

That is one reason why I try to have swap on a device dedicated just for it. It helps keep the system from having to seek all over the drive for data. (I remember that this was recommended years ago with Windows - back when you could tell Windows where to put the swap file)

> On the other hand, that explains another thing that swap prefetch could
> be helping with -- if it preemptively faults the swap back in, then the
> file-backed stuff can be faulted back more quickly, just by virtue of not
> needing to seek back and forth to swap for its stuff. Hadn't thought of
> that.

For it to really help, swap-prefetch would have to be more aggressive. At the moment (if I'm reading the code correctly) the system has to have close to zero activity for it to kick in. A tunable knob controlling how much activity is too much for the prefetch to kick in would help with finding a sane default. IMHO it should be the one that provides the most benefit with the least hit to performance.

> That also implies that people running with swap files rather than swap
> partitions will see less of an issue. I should dig out my old compact
> flash card and try putting swap on that for a week.

Maybe. It all depends on how much seeking is needed to track down the pages in the swapfile and such.
What would really help make the situation even better would be doing the log structured swap + cleaner. The log structured swap + cleaner should provide a performance boost by itself - add in the prefetch mechanism and the benefits are even more visible.

Another way to improve performance would require making the page replacement mechanism more intelligent. There are bounds to what can be done in the kernel without negatively impacting performance, but, if I've read the code correctly, there might be a better way to decide which pages to evict. One way to do this would be to implement some mechanism that allows the system to choose a single group of contiguous pages (or, say, a large soft-page) over swapping out a single page at a time. (some form of memory defrag would also be nice, but I can't think of a way to do that without massively breaking everything)

<snip>

> > In case Andrew is so bored he read this far -- yes this wake-up sounds
> > like user space code, with minimal kernel changes to support any
> > particular lower level operation that we can't do already.
>
> He'd suggested using, uhm, ptrace_peek or somesuch for just such a
> purpose. The second half of the issue is to know when and what to target.

The userspace suggestion that was thrown out earlier would have been as error-prone and problematic as FUSE. A solution like you suggest would be workable - it's small and does a task that is best done in userspace (IMHO). (IIRC, the original suggestion involved merging maps2 and another patchset into mainline and using that, combined with PEEKTEXT, to provide for a userspace swap daemon. Swap, IMHO, should never be handled outside the kernel)

What might be useful is a userspace daemon that tracks memory pressure and uses a concise API to trigger various levels of prefetch and/or swap aggressiveness.

DRH

--
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
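The tunable knob proposed earlier in this mail could map observed disk busyness to a prefetch aggressiveness level. The sketch below is entirely invented - the knob semantics, the 0-10 scale, and the function name are illustrative, not any real interface:

```python
# Hypothetical mapping from disk busyness to prefetch aggressiveness.
# knob = the maximum fraction of disk busyness at which prefetch may
# still run (so knob=0.10 means "only when the disk is >90% idle").
def aggressiveness(io_busy_fraction, knob=0.10):
    if io_busy_fraction >= knob:
        return 0                  # system busy: prefetch does nothing
    # Scale from 0 (at the knob threshold) up to 10 (fully idle disk).
    return round((1.0 - io_busy_fraction / knob) * 10)

print(aggressiveness(0.50), aggressiveness(0.05), aggressiveness(0.0))  # 0 5 10
```

The appeal of a single knob like this is exactly what the mail argues: the default can be tuned empirically for "most benefit, least hit to performance", and a 30-second e-mail-reading pause would register as near-idle rather than requiring strictly zero activity.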
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sun, 29 Jul 2007, Rene Herman wrote:
> > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > I agree that tinkering with the core VM code should not be done
> > > > lightly, but this has been put through the proper process and is
> > > > stalled with no hints on how to move forward.
> > >
> > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > weren't being addressed.
> >
> > I may have missed them, but what I saw from him weren't specific
> > issues, but instead a nebulous 'something better may come along later'
> >
> > forget the nightly cron jobs for the moment. think of this scenario.
> > you have your memory fairly full with apps that you have open
> > (including firefox with many tabs), you receive a spreadsheet you need
> > to look at, so you fire up openoffice to look at it. then you exit
> > openoffice and try to go back to firefox (after a pause while you walk
> > to the printer to get the printout of the spreadsheet)
>
> And swinging a dead rat from its tail facing east-wards while reciting
> Documentation/CodingStyle. Okay, very very sorry, that was particularly
> childish, but that walking to the printer is of course completely
> constructed and this _is_ something to take into account.

yes it was contrived for simplicity. the same effect would happen if instead of going back to firefox the user instead went to their e-mail software and read some mail. doing so should still make the machine idle enough to let prefetch kick in.

> Swap-prefetch wants to be free, which (also again) it is doing a good job
> at it seems, but this also means that it waits for the VM to be _very_
> idle before it does anything and as such, we cannot just forget the
> nightly scenario and pretend it's about something else entirely. As long
> as the machine's being used, swap-prefetch doesn't kick in.

how long does the machine need to be idle? if someone spends 30 seconds reading an e-mail that's an incredibly long time for the system and I would think it should be enough to let the prefetch kick in.
> > > > 3: no serious consideration of possible alternatives
> > >
> > > Tweaking existing use-once logic is one I've heard but if we consider
> > > the i/dcache issue dead, I believe that one is as well. Going to
> > > userspace is another one. Largest theoretical potential. I myself am
> > > extremely sceptical about the Linux userland, and largely equate it
> > > with smallest _practical_ potential -- but that might just be me.
> > >
> > > A larger swap granularity, possibly even a self-training granularity.
> > > Up to now, seeks only get costlier and costlier with respect to reads
> > > with every generation of disk (flash would largely overcome it
> > > though) and doing more in one read/write _greatly_ improves
> > > throughput, maybe up to the point that swap-prefetch is no longer
> > > very useful. I myself don't know about the tradeoffs involved.
> >
> > larger swap granularity may help, but waiting for the user to need the
> > ram and have to wait for it to be read back in is always going to be
> > worse for the user than pre-populating the free memory (for the case
> > where the pre-population is right, for other cases it's the same). so I
> > see this as a red herring
>
> I saw Chris Snook make a good post here and am going to defer this part
> to that discussion: http://lkml.org/lkml/2007/7/27/421
>
> But no, it's not a red herring if _practically_ speaking the swapin is
> fast enough once started that people don't actually mind anymore since in
> that case you could simply do without yet more additional VM complexity
> (and kernel daemon).

swapin will always require disk access, and avoiding doing disk access while the user is waiting for it by doing it when the system isn't using the disk will always be a win (possibly not as large of a win, but still a win). on slow laptop drives where you may only get 20MB/second of reads under optimal situations it doesn't take much reading to be noticed by the user.

> > there are fully legitimate situations where this is useful, the
> > 'papering over' effect is not referring to these, it's referring to
> > other possible problems in the future.
> No, it's not just future. Just look at the various things under
> discussion now such as improved use-once and better swapin.

and these things do not conflict with prefetch, they complement it.

improved use-once will avoid pushing things out to swap in the first place. this will help during normal workloads so is valuable in any case.

better swapin (I assume you are talking about things like larger swap granularity) will also help during normal workloads when you are thrashing into swap.

prefetch will help when you have pushed things out to swap and now have free memory and a momentarily idle system.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007 21:33:59 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
>
> > What I think is killing us here is the blockdev pagecache: the
> > pagecache which backs those directory entries and inodes. These pages
> > get read multiple times because they hold multiple directory entries
> > and multiple inodes. These multiple touches will put those pages onto
> > the active list so they stick around for a long time and everything
> > else gets evicted.
> >
> > I've never been very sure about this policy for the metadata pagecache.
> > We read the filesystem objects into the dcache and icache and then we
> > won't read from that page again for a long time (I expect). But the
> > page will still hang around for a long time.
> >
> > It could be that we should leave those pages inactive.
>
> Good idea for updatedb.
>
> However, it may be a bad idea for files that are often written to.
> Turning an inode write into a read plus a write does not sound like such
> a hot idea, we really want to keep those in the cache.

Remember that this problem applies to both inode blocks and to directory blocks. Yes, it might be useful to hold onto an inode block for a future write (atime, mtime, usually), but not a directory block.

> I think what you need is to ignore multiple references to the same page
> when they all happen in one time interval, counting them only if they
> happen in multiple time intervals.

Yes, the sudden burst of accesses for adjacent inodes/dirents will be a common pattern, and it'd make heaps of sense to treat that as a single touch. It'd have to be done in the fs I guess, and it might be a bit hard to do.

And it turns out that embedding the touch_buffer() all the way down in __find_get_block() was convenient, but it's going to be tricky to change.

For now I'm fairly inclined to just nuke the touch_buffer() on the read side and maybe add one on the modification codepaths and see what happens. As always, testing is the problem.
> The use-once cleanup (which takes a page flag for PG_new, I know...)
> would solve that problem.
>
> However, it would introduce the problem of having to scan all the pages
> on the list before a page becomes freeable. We would have to add some
> background scanning (or a separate list for PG_new pages) to make the
> initial pageout run use an acceptable amount of CPU time.
>
> Not sure that complexity will be worth it...

I suspect that the situation we have now is so bad that pretty much anything we do will be an improvement.

I've always wondered "ytf is there so much blockdev pagecache?" This machine I'm typing at:

MemTotal:      3975080 kB
MemFree:        750400 kB
Buffers:        547736 kB
Cached:        1299532 kB
SwapCached:      12772 kB
Active:        1789864 kB
Inactive:       861420 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3975080 kB
LowFree:        750400 kB
SwapTotal:     4875716 kB
SwapFree:      4715660 kB
Dirty:              76 kB
Writeback:           0 kB
Mapped:         638036 kB
Slab:           522724 kB
CommitLimit:   6863256 kB
Committed_AS:  1115632 kB
PageTables:      14452 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     36432 kB
VmallocChunk: 34359696379 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

More than a quarter of my RAM in fs metadata! Most of it I'll bet is on the active list. And the fs on which I do most of the work is mounted noatime.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andrew Morton wrote:

> What I think is killing us here is the blockdev pagecache: the pagecache
> which backs those directory entries and inodes. These pages get read
> multiple times because they hold multiple directory entries and multiple
> inodes. These multiple touches will put those pages onto the active list
> so they stick around for a long time and everything else gets evicted.
>
> I've never been very sure about this policy for the metadata pagecache.
> We read the filesystem objects into the dcache and icache and then we
> won't read from that page again for a long time (I expect). But the page
> will still hang around for a long time.
>
> It could be that we should leave those pages inactive.

Good idea for updatedb.

However, it may be a bad idea for files that are often written to. Turning an inode write into a read plus a write does not sound like such a hot idea, we really want to keep those in the cache.

I think what you need is to ignore multiple references to the same page when they all happen in one time interval, counting them only if they happen in multiple time intervals.

The use-once cleanup (which takes a page flag for PG_new, I know...) would solve that problem.

However, it would introduce the problem of having to scan all the pages on the list before a page becomes freeable. We would have to add some background scanning (or a separate list for PG_new pages) to make the initial pageout run use an acceptable amount of CPU time.

Not sure that complexity will be worth it...

--
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
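Rik's once-per-interval reference counting can be illustrated with a toy model. This is purely illustrative - the `Page` class below is not the kernel's `struct page`, and a real implementation would work with jiffies-based intervals in the reclaim code:

```python
# Toy model of interval-deduplicated reference counting: a page counts
# as 'referenced' at most once per time interval, so a burst of
# dirent/inode touches from one updatedb pass looks like a single touch.
class Page:
    def __init__(self):
        self.refs = 0
        self.last_interval = None

    def touch(self, interval):
        if interval != self.last_interval:
            self.refs += 1
            self.last_interval = interval

p = Page()
for _ in range(100):         # updatedb hammers the page within one interval
    p.touch(interval=7)
p.touch(interval=8)          # a genuinely later access
print(p.refs)                # 2
```

A hundred touches in one interval count as one reference, so the metadata page no longer earns a place on the active list just by being hit repeatedly during a single scan of the filesystem.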
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 17:06:50 [EMAIL PROTECTED] wrote:
> On Sat, 28 Jul 2007, Daniel Hazelton wrote:
> > On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote:
> > > On Sat, 28 Jul 2007, Rene Herman wrote:
> > > > On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote:
> > > > > On Fri, 27 Jul 2007, Rene Herman wrote:
> > > > > > On 07/27/2007 07:45 PM, Daniel Hazelton wrote:
> > >
> > > nobody is arguing that swap prefetch helps in the second case.
> >
> > Actually, I made a mistake when tracking the thread and reading the
> > code for the patch and started to argue just that. But I have to admit
> > I made a mistake - the patch's author has stated (as Rene was kind
> > enough to point out) that swap prefetch can't help when memory is
> > filled.
>
> I stand corrected, thanks for speaking up and correcting your position.

If you had made the statement before I decided to speak up you would have been correct :) Anyway, I try to always admit when I've made a mistake - it's part of my philosophy. (There have been times when I haven't done it, but I'm trying to make that stop entirely)

> > > what people are arguing is that there are situations where it helps
> > > for the first case. on some machines and versions of updatedb the
> > > nightly run of updatedb can cause both sets of problems. but the
> > > nightly updatedb run is not the only thing that can cause problems
> >
> > Solving the cache filling memory case is difficult. There have been a
> > number of discussions about it. The simplest solution, IMHO, would be
> > to place a (configurable) hard limit on the maximum size any of the
> > kernel's caches can grow to. (The only solution that was discussed,
> > however, is a complex beast)
>
> limiting the size of the cache is also the wrong thing to do in many
> situations. it's only right if the cache pushes out other data you care
> about. if you are trying to do one thing as fast as you can you really
> do want the system to use all the memory it can for the cache.
After thinking about this, you are partially correct. There are those sorts of situations where you want the system to use all the memory it can for caches. OTOH, if those situations could be described in some sort of simple heuristic, then a soft limit that uses those heuristics to determine when to let the cache expand could exploit the benefits of having both a limited and an unlimited cache. (And, potentially, if the heuristic has allowed a cache to expand beyond the limit, then when the heuristic shows the oversize cache is no longer necessary it could trigger an automatic reclaim of that memory.) (I'm willing to help write and test code to do exactly this. There is no guarantee that I'll be able to help with more than testing - I don't understand the parts of the code involved all that well) DRH -- Dialup is like pissing through a pipette. Slow and excruciatingly painful. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Daniel Hazelton wrote: On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: nobody is arguing that swap prefetch helps in the second case. Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. I stand corrected, thanks for speaking up and correcting your position. what people are arguing is that there are situations where it helps for the first case. on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast) limiting the size of the cache is also the wrong thing to do in many situations. it's only right if the cache pushes out other data you care about, if you are trying to do one thing as fast as you can you really do want the system to use all the memory it can for the cache. David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Alan Cox wrote: It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby was effectively free anyway. as I understand it the swap-prefetch only kicks in if the device is idle Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin are much more important I suspect than swap prefetching. I'm sure this is true while you are doing the swapout or swapin and the system is waiting for it. but with prefetch you may be able to avoid doing the swapin at a time when the system is waiting for it by doing it at a time when the system is otherwise idle. David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. if that was the case then people should be responding to the request to get it merged with 'but it caused problems for me when I tried it' I haven't seen any comments like that. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. I see the conclusion as being exactly the opposite. here is a workload with some badly designed userspace software that the kernel can make much more pleasant for users. arguing that users should never use badly designed software in userspace doesn't seem like an argument that will gain much traction. I'm not saying the kernel needs to fix the software itself (ala the sched_yield issues), but the kernel should try and keep such software from hurting the rest of the system where it can. in this case it can't help while the bad software is running, but it could minimize the impact after it finishes. 
however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. so there is a legitimate situation where swap-prefetch will help significantly, what is the downside that prevents it from being included? (reading this thread it sometimes seems like the downside is that updatedb shouldn't cause this problem and so if you fixed updatedb there would be no legitimate benefit, or alternately this patch doesn't help updatedb so there's no legitimate benefit) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying "the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future" Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want back in the morning weren't reasonable guesses. 
The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out" possibly even with a "and if you do, please pull them back in when possible") or we can let swap-prefetch do the "just in case" thing it is doing. it's not that they shouldn't have been swapped out (they should have been), it's that the reason they were swapped out no longer exists. While swap-prefetch may not be the be-all end-all of solutions I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. I've seen it mentioned that there is still a maintainer but I missed who it is, but I haven't seen any concerns that can be addressed, they all seem to be 'this is a core concept, people need to think about it' or 'but someone may find
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/28/07, Alan Cox <[EMAIL PROTECTED]> wrote: > Actual physical disk ops are a precious resource and anything that mostly > reduces the number will be a win - not to say swap prefetch is the right > answer but accidentally or otherwise there are good reasons it may happen > to help. > > Bigger more linear chunks of writeout/readin are much more important I > suspect than swap prefetching. The larger the chunks are that we swap out, the less it actually hurts to swap, which might make all this a moot point. Not all I/O is created equal... Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: > On Sat, 28 Jul 2007, Rene Herman wrote: > > On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: > >> On Fri, 27 Jul 2007, Rene Herman wrote: > >> > On 07/27/2007 07:45 PM, Daniel Hazelton wrote: > >> > > Questions about it: > >> > > Q) Does swap-prefetch help with this? > >> > > A) [From all reports I've seen (*)] > >> > > Yes, it does. > >> > > >> > No it does not. If updatedb filled memory to the point of causing > >> > swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and > >> > swap-prefetch hasn't any memory to prefetch into -- updatedb itself > >> > doesn't use any significant memory. > >> > >> however there are other programs which are known to take up significant > >> amounts of memory and will cause the issue being described (openoffice > >> for example) > >> > >> please don't get hung up on the text 'updatedb' and accept that there > >> are programs that do run intermittently and do use a significant amount > >> of ram and then free it. > > > > Different issue. One that's worth pursuing perhaps, but a different > > issue from the VFS caches issue that people have been trying to track > > down. > > people are trying to track down the problem of their machine being slow > until enough data is swapped back in to operate normally. > > in some situations swap prefetch can help because something that used > memory freed it so there is free memory that could be filled with data > (which is something that Linux does aggressively in most other situations) > > in some other situations swap prefetch cannot help because useless data is > getting cached at the expense of useful data. > > nobody is arguing that swap prefetch helps in the second case. Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. 
But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. > what people are arguing is that there are situations where it helps for > the first case. on some machines and versions of updatedb the nightly run of > updatedb can cause both sets of problems. but the nightly updatedb run is > not the only thing that can cause problems Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast) > > but let's talk about the concept here for a little bit > > the design is to use CPU and I/O capacity that's otherwise idle to fill > free memory with data from swap. > > pro: >more ram has potentially useful data in it > > con: >it takes a little extra effort to give this memory to another app (the > page must be removed from the list and zeroed at the time it's needed, I > assume that the data is left in swap so that it doesn't have to be written > out again) > >it adds some complexity to the kernel (~500 lines IIRC from this thread) > >by undoing recent swapouts it can potentially mask problems with swapout > > it looks to me like unless the code was really bad (and after 23 months in > -mm it doesn't sound like it is) that the only significant con left is the > potential to mask other problems. I'll second this. But with the swap system itself having seen as heavy testing as it has I don't know if it would be masking other problems. That is why I've been asking "What is so wrong with it?" - while it definitely doesn't help with programs that cause caches to balloon (that problem does need another solution) it does help to speed things up when a memory hog has exited. 
(And since it's a pretty safe assumption that swap is going to be noticeably slower than RAM this patch seems to me to be a rather visible and obvious solution to that problem) > however there are many legitimate cases where it is definitely doing the > right thing (swapout was correct in pushing out the pages, but now the > cause of that pressure is gone). the amount of benefit from this will vary > from situation to situation, but it's not reasonable to claim that this > provides no benefit (you have benchmark numbers that show it in synthetic > benchmarks, and you have user reports that show it in the real world) Exactly. Though I have seen posts which (to me at least) appear to claim exactly that. It was part of the reason why I got a bit incensed. (The other was that it looked like the kernel devs with the ultra-powerful machines were claiming 'I don't see the problem on my machine, so it doesn't exist'. That sort of attitude is fine, in some cases, but not, IMHO, where performance is concerned) > there are lots of things in the kernel whose job is to pre-fill the memory > with data that may (or may not) be useful in
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 03:48:13 Mike Galbraith wrote: > On Fri, 2007-07-27 at 18:51 -0400, Daniel Hazelton wrote: > > Now, once more, I'm going to ask: What is so terribly wrong with swap > > prefetch? Why does it seem that everyone against it says "It's treating a > > symptom, so it can't go in"? > > And once again, I personally have nothing against swap-prefetch, or > something like it. I can see how it or something like it could be made > to improve the lives of people who get up in the morning to find their > apps sitting on disk due to memory pressure generated by over-night > system maintenance operations. > > The author himself however, says his implementation can't help with > updatedb (though people seem to be saying that it does), or anything > else that leaves memory full. That IMHO, makes it of questionable value > toward solving what people are saying they want swap-prefetch for in the > first place. Okay. I have to agree with the author that, in such a situation, it wouldn't help. However there are, without a doubt, other situations where it would help immensely. (memory hogs forcing everything to disk and quitting, one-off tasks that don't balloon the cache (kernel compiles, et al) - in those situations swap prefetch would really shine.) DRH -- Dialup is like pissing through a pipette. Slow and excruciatingly painful.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby was effectively free anyway. Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin are much more important I suspect than swap prefetching. > good overview of exactly how broken -mm can be at times. How many -mm users > use it anyway? He himself said he's not convinced of usefulness having not I've been using it for months with no noticed problem. I turn it on because it might as well get tested. I've not done comparison tests so I can't comment on if it's worth it. Lots of -mm testers turn *everything* on because it's a test kernel.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: in some situations swap prefetch can help because something that used memory freed it so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. Oh yes they are. Daniel for example did twice, telling me to turn my brain on in between (if you read it, you may have noticed I got a little annoyed at that point). but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) It is. Prefetched pages can be dropped on the floor without additional I/O. it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. 
As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying "the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future" Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want back in the morning weren't reasonable guesses. The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out" possibly even with a "and if you do, please pull them back in when possible") or we can let swap-prefetch do the "just in case" thing it is doing. 
While swap-prefetch may not be the be-all end-all of solutions I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. Rene.
Re: -mm merge plans for 2.6.23
Daniel Cheng wrote: > but merging maps2 has higher risk which should be done in a development > branch (er... 2.7, but we don't have it now). This is off-topic and has been discussed to death, but: Rather than one stable branch and one development branch, we have a few stable branches and a lot of development branches. Some are located at git.kernel.org. Among other things, this gives you a predictable release rhythm and very timely updated stable branches. -- Stefan Richter -=-=-=== -=== ===-- http://arcgraph.de/sr/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: > On 07/27/2007 07:45 PM, Daniel Hazelton wrote: > > > Questions about it: > > Q) Does swap-prefetch help with this? > > A) [From all reports I've seen (*)] > > Yes, it does. > > No it does not. If updatedb filled memory to the point of causing > swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and > swap-prefetch hasn't any memory to prefetch into -- updatedb itself > doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. people are trying to track down the problem of their machine being slow until enough data is swapped back in to operate normally. in some situations swap prefetch can help because something that used memory freed it so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. what people are arguing is that there are situations where it helps for the first case. on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. 
pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) that the only significant con left is the potential to mask other problems. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying "the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future" David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 09:35 AM, Rene Herman wrote: By the way -- I'm unable to make my slocate grow substantially here but I'll try what GNU locate does. If it's really as bad as I hear then regardless of anything else it should really be either fixed or dumped... Yes. GNU locate is broken and nobody should be using it. The updatedb from (my distribution standard) "slocate" uses around 2M allocated total during an entire run while GNU locate allocates some 30M to the sort process alone. GNU locate is also close to 4 times as slow (although that of course only matters on cached runs anyway). So, GNU locate is just a pig pushing things out, with or without any added VFS cache pressure from the things it does by design. As such, we can trust people complaining about it but should first tell them to switch to a halfway sane locate implementation. If you run memory hogs on small memory boxes, you're going to suffer. Leaves the fact that swap-prefetch sometimes helps alleviate these and other kinds of memory-hog situations and as such, might not (again) be a bad idea in itself. Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Fri, 2007-07-27 at 18:51 -0400, Daniel Hazelton wrote: > Now, once more, I'm going to ask: What is so terribly wrong with swap > prefetch? Why does it seem that everyone against it says "Its treating a > symptom, so it can't go in"? And once again, I personally have nothing against swap-prefetch, or something like it. I can see how it or something like it could be made to improve the lives of people who get up in the morning to find their apps sitting on disk due to memory pressure generated by over-night system maintenance operations. The author himself however, says his implementation can't help with updatedb (though people seem to be saying that it does), or anything else that leaves memory full. That IMHO, makes it of questionable value toward solving what people are saying they want swap-prefetch for in the first place. I personally don't care if swap-prefetch goes in or not. -Mike
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 01:15 AM, Björn Steinbrink wrote: On 2007.07.27 20:16:32 +0200, Rene Herman wrote: Here's swap-prefetch's author saying the same: http://lkml.org/lkml/2007/2/9/112 | It can't help the updatedb scenario. Updatedb leaves the ram full and | swap prefetch wants to cost as little as possible so it will never | move anything out of ram in preference for the pages it wants to swap | back in. Now please finally either understand this, or tell us how we're wrong. Con might have been wrong there for boxes with really little memory. Note -- with "the updatedb scenario" both he in the above and I are talking about the "VFS caches filling memory cause the problem" not updatedb in particular. My desktop box has not even 300k inodes in use (IIRC someone posted a df -i output showing 1 million inodes in use). Still, the memory footprint of the "sort" process grows up to about 50MB. Assuming that the average filename length stays the same, that would mean 150MB for the 1 million inode case, just for the "sort" process. Even if it's not 150MB, 50MB is already a lot on a 128 or even a 256MB box. So, yes, we're now at the expected scenario of some hog pushing out things and freeing it upon exit again and it's something swap-prefetch definitely has potential to help with. Said early in the thread, it's hard to imagine how it would not help in any such situation, so the discussion may as far as I'm concerned at that point concentrate on whether swap-prefetch hurts anything in others. Some people I believe are not convinced it helps very significantly due to at that point _everything_ having been thrown out, but a copy of openoffice with a large spreadsheet open should come back to life much quicker it would seem. Any faults in that reasoning? No. If the machine goes idle after some memory hog _itself_ pushes things out and then exits, swap-prefetch helps, at the very, very least potentially. 
By the way -- I'm unable to make my slocate grow substantial here but I'll try what GNU locate does. If it's really as bad as I hear then regardless of anything else it should really be either fixed or dumped... Rene. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: Questions about it: Q) Does swap-prefetch help with this? A) [From all reports I've seen (*)] Yes, it does. No it does not. If updatedb filled memory to the point of causing swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and swap-prefetch hasn't any memory to prefetch into -- updatedb itself doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. Rene.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 09:35 AM, Rene Herman wrote: By the way -- I'm unable to make my slocate grow substantially here but I'll try what GNU locate does. If it's really as bad as I hear then regardless of anything else it should really be either fixed or dumped... Yes. GNU locate is broken and nobody should be using it. The updatedb from (my distribution standard) slocate uses around 2M allocated total during an entire run while GNU locate allocates some 30M to the sort process alone. GNU locate is also close to 4 times as slow (although that of course only matters on cached runs anyway). So, GNU locate is just a pig pushing things out, with or without any added VFS cache pressure from the things it does by design. As such, we can trust people complaining about it but should first tell them to switch to a halfway sane locate implementation. If you run memory hogs on small memory boxes, you're going to suffer. Leaves the fact that swap-prefetch sometimes helps alleviate these and other kinds of memory-hog situations and as such, might not (again) be a bad idea in itself. Rene.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: Questions about it: Q) Does swap-prefetch help with this? A) [From all reports I've seen (*)] Yes, it does. No it does not. If updatedb filled memory to the point of causing swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and swap-prefetch hasn't any memory to prefetch into -- updatedb itself doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. people are trying to track down the problem of their machine being slow until enough data is swapped back in to operate normally. in some situations swap prefetch can help because something that used memory freed it, so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. what people are arguing is that there are situations where it helps for the first case. on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. 
pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) that the only significant con left is the potential to mask other problems. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future David Lang
Re: -mm merge plans for 2.6.23
Daniel Cheng wrote: but merging maps2 has higher risk, which should be done in a development branch (er... 2.7, but we don't have it now). This is off-topic and has been discussed to death, but: Rather than one stable branch and one development branch, we have a few stable branches and a lot of development branches. Some are located at git.kernel.org. Among other things, this gives you a predictable release rhythm and very timely updated stable branches. -- Stefan Richter -=-=-=== -=== ===-- http://arcgraph.de/sr/
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: in some situations swap prefetch can help because something that used memory freed it, so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. Oh yes they are. Daniel for example did twice, telling me to turn my brain on in between (if you read it, you may have noticed I got a little annoyed at that point). but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) It is. Prefetched pages can be dropped on the floor without additional I/O. it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. 
As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort, no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want in in the morning weren't reasonable guesses. The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out, and if you do, please pull them back in when possible") or we can let swap-prefetch do the just in case thing it is doing. 
While swap-prefetch may not be the be-all end-all of solutions, I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's it for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. Rene.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby it was effectively free anyway. Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin is much more important I suspect than swap prefetching. good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not I've been using it for months with no noticed problem. I turn it on because it might as well get tested. I've not done comparison tests so I can't comment on if it's worth it. Lots of -mm testers turn *everything* on because it's a test kernel.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 03:48:13 Mike Galbraith wrote: On Fri, 2007-07-27 at 18:51 -0400, Daniel Hazelton wrote: Now, once more, I'm going to ask: What is so terribly wrong with swap prefetch? Why does it seem that everyone against it says "It's treating a symptom, so it can't go in"? And once again, I personally have nothing against swap-prefetch, or something like it. I can see how it or something like it could be made to improve the lives of people who get up in the morning to find their apps sitting on disk due to memory pressure generated by overnight system maintenance operations. The author himself however, says his implementation can't help with updatedb (though people seem to be saying that it does), or anything else that leaves memory full. That, IMHO, makes it of questionable value toward solving what people are saying they want swap-prefetch for in the first place. Okay. I have to agree with the author that, in such a situation, it wouldn't help. However there are, without a doubt, other situations where it would help immensely. (memory hogs forcing everything to disk and quitting, one-off tasks that don't balloon the cache (kernel compiles, et al) - in those situations swap prefetch would really shine.) DRH -- Dialup is like pissing through a pipette. Slow and excruciatingly painful.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: Questions about it: Q) Does swap-prefetch help with this? A) [From all reports I've seen (*)] Yes, it does. No it does not. If updatedb filled memory to the point of causing swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and swap-prefetch hasn't any memory to prefetch into -- updatedb itself doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. people are trying to track down the problem of their machine being slow until enough data is swapped back in to operate normally. in some situations swap prefetch can help because something that used memory freed it, so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. what people are arguing is that there are situations where it helps for the first case. 
on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast) but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) that the only significant con left is the potential to mask other problems. I'll second this. But with the swap system itself having seen as heavy testing as it has I don't know if it would be masking other problems. That is why I've been asking "What is so wrong with it?" - while it definitely doesn't help with programs that cause caches to balloon (that problem does need another solution) it does help to speed things up when a memory hog has exited. (And since it's a pretty safe assumption that swap is going to be noticeably slower than RAM this patch seems to me to be a rather visible and obvious solution to that problem) however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). 
the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) Exactly. Though I have seen posts which (to me at least) appear to claim exactly that. It was part of the reason why I got a bit incensed. (The other was that it looked like the kernel devs with the ultra-powerful machines were claiming 'I don't see the problem on my machine, so it doesn't exist'. That sort of attitude is fine, in some cases, but not, IMHO, where performance is concerned) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On 7/28/07, Alan Cox [EMAIL PROTECTED] wrote: Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin is much more important I suspect than swap prefetching. nod. The larger the chunks are that we swap out, the less it actually hurts to swap, which might make all this a moot point. Not all I/O is created equal... Ray
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. if that was the case then people should be responding to the request to get it merged with 'but it caused problems for me when I tried it' I haven't seen any comments like that. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. I see the conclusion as being exactly the opposite. here is a workload with some badly designed userspace software that the kernel can make much more pleasant for users. arguing that users should never use badly designed software in userspace doesn't seem like an argument that will gain much traction. I'm not saying the kernel needs to fix the software itself (a la the sched_yield issues), but the kernel should try and keep such software from hurting the rest of the system where it can. in this case it can't help it while the bad software is running, but it could minimize the impact after it finishes. 
however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort, no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. so there is a legitimate situation where swap-prefetch will help significantly, what is the downside that prevents it from being included? (reading this thread it sometimes seems like the downside is that updatedb shouldn't cause this problem and so if you fixed updatedb there would be no legitimate benefit, or alternately this patch doesn't help updatedb so there's no legitimate benefit) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want in in the morning weren't reasonable guesses. The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out, and if you do, please pull them back in when possible") or we can let swap-prefetch do the just in case thing it is doing. 
it's not that they shouldn't have been swapped out (they should have been), it's that the reason they were swapped out no longer exists. While swap-prefetch may not be the be-all end-all of solutions, I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's it for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. I've seen it mentioned that there is still a maintainer but I missed who it is, but I haven't seen any concerns that can be addressed, they all seem to be 'this is a core concept, people need to think about it' or 'but someone may find a better
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Alan Cox wrote: It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby it was effectively free anyway. as I understand it, swap-prefetch only kicks in if the device is idle Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin is much more important I suspect than swap prefetching. I'm sure this is true while you are doing the swapout or swapin and the system is waiting for it. but with prefetch you may be able to avoid doing the swapin at a time when the system is waiting for it by doing it at a time when the system is otherwise idle. David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Daniel Hazelton wrote: On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote:

nobody is arguing that swap prefetch helps in the second case.

Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. I stand corrected,

thanks for speaking up and correcting your position.

what people are arguing is that there are situations where it helps for the first case. on some machines and version of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems

Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast)

limiting the size of the cache is also the wrong thing to do in many situations. it's only right if the cache pushes out other data you care about, if you are trying to do one thing as fast as you can you really do want the system to use all the memory it can for the cache.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 17:06:50 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Daniel Hazelton wrote: On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote:

nobody is arguing that swap prefetch helps in the second case.

Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. I stand corrected,

thanks for speaking up and correcting your position.

If you had made the statement before I decided to speak up you would have been correct :) Anyway, I try to always admit when I've made a mistake - it's part of my philosophy. (There have been times when I haven't done it, but I'm trying to make that stop entirely)

what people are arguing is that there are situations where it helps for the first case. on some machines and version of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems

Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast)

limiting the size of the cache is also the wrong thing to do in many situations. it's only right if the cache pushes out other data you care about, if you are trying to do one thing as fast as you can you really do want the system to use all the memory it can for the cache.

After thinking about this you are partially correct.
There are those sorts of situations where you want the system to use all the memory it can for caches. OTOH, if those situations could be described in some sort of simple heuristic, then a soft-limit that uses those heuristics to determine when to let the cache expand could exploit the benefits of having both a limited and an unlimited cache. (And, potentially, if the heuristic has allowed a cache to expand beyond the limit, then, when the heuristic shows the oversize cache is no longer necessary, it could trigger an automatic reclaim of that memory.)

(I'm willing to help write and test code to do this. There is no guarantee that I'll be able to help with more than testing - I don't understand the parts of the code involved all that well)

DRH

--
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
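The soft-limit idea above could look roughly like this as a toy model: the cache may grow past its configured limit while a "one big streaming job" heuristic holds, and the excess is reclaimed once the heuristic stops firing. Everything here — the names and the heuristic flag itself — is hypothetical, not existing kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical soft-limited cache: growth beyond the limit is allowed
 * only while a streaming heuristic fires; afterwards the excess is
 * automatically handed back. */

struct toy_cache {
    long size;        /* current size, in pages */
    long soft_limit;  /* configured soft limit, in pages */
    bool streaming;   /* heuristic: one job wants all the memory */
};

/* May the cache grow by one more page right now? */
static bool cache_may_grow(const struct toy_cache *c)
{
    return c->size < c->soft_limit || c->streaming;
}

/* Called when the heuristic stops firing: shrink back to the limit,
 * returning the number of pages reclaimed. */
static long cache_reclaim_excess(struct toy_cache *c)
{
    long excess = c->size - c->soft_limit;
    if (c->streaming || excess <= 0)
        return 0;
    c->size = c->soft_limit;
    return excess;
}
```

The hard part, as the thread notes, is the heuristic itself, not this bookkeeping.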
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andrew Morton wrote:

> What I think is killing us here is the blockdev pagecache: the
> pagecache which backs those directory entries and inodes. These pages
> get read multiple times because they hold multiple directory entries
> and multiple inodes. These multiple touches will put those pages onto
> the active list so they stick around for a long time and everything
> else gets evicted.
>
> I've never been very sure about this policy for the metadata pagecache.
> We read the filesystem objects into the dcache and icache and then we
> won't read from that page again for a long time (I expect). But the
> page will still hang around for a long time.
>
> It could be that we should leave those pages inactive.

Good idea for updatedb. However, it may be a bad idea for files that are often written to. Turning an inode write into a read plus a write does not sound like such a hot idea, we really want to keep those in the cache.

I think what you need is to ignore multiple references to the same page when they all happen in one time interval, counting them only if they happen in multiple time intervals.

The use-once cleanup (which takes a page flag for PG_new, I know...) would solve that problem. However, it would introduce the problem of having to scan all the pages on the list before a page becomes freeable. We would have to add some background scanning (or a separate list for PG_new pages) to make the initial pageout run use an acceptable amount of CPU time. Not sure that complexity will be worth it...

--
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
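Rik's "count touches only once per time interval" suggestion can be sketched as a toy model (illustrative names only; the real thing would live in the page-reclaim code): a burst of touches within one interval counts as a single reference, and a page is promoted to the active list only after being referenced in two different intervals.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy interval-based reference counting: repeated touches inside the
 * same interval collapse into one reference; activation requires
 * references from at least two distinct intervals. */

struct toy_page {
    long last_ref_interval;  /* interval of the most recent touch */
    int  interval_refs;      /* number of distinct intervals touched */
};

static void page_touch(struct toy_page *p, long now_interval)
{
    if (p->interval_refs == 0 || p->last_ref_interval != now_interval) {
        p->interval_refs++;
        p->last_ref_interval = now_interval;
    }
    /* a touch within the same interval as the last one is ignored */
}

static bool page_should_activate(const struct toy_page *p)
{
    return p->interval_refs >= 2;
}
```

Under this scheme the updatedb-style burst of dirent/inode reads — many touches, one interval — never activates the metadata page, while genuinely reused pages still get promoted.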
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007 21:33:59 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
>> What I think is killing us here is the blockdev pagecache: the
>> pagecache which backs those directory entries and inodes. These pages
>> get read multiple times because they hold multiple directory entries
>> and multiple inodes. These multiple touches will put those pages onto
>> the active list so they stick around for a long time and everything
>> else gets evicted.
>>
>> I've never been very sure about this policy for the metadata
>> pagecache. We read the filesystem objects into the dcache and icache
>> and then we won't read from that page again for a long time (I
>> expect). But the page will still hang around for a long time.
>>
>> It could be that we should leave those pages inactive.
>
> Good idea for updatedb. However, it may be a bad idea for files that
> are often written to. Turning an inode write into a read plus a write
> does not sound like such a hot idea, we really want to keep those in
> the cache.

Remember that this problem applies to both inode blocks and to directory blocks. Yes, it might be useful to hold onto an inode block for a future write (atime, mtime, usually), but not a directory block.

> I think what you need is to ignore multiple references to the same
> page when they all happen in one time interval, counting them only if
> they happen in multiple time intervals.

Yes, the sudden burst of accesses for adjacent inode/dirents will be a common pattern, and it'd make heaps of sense to treat that as a single touch. It'd have to be done in the fs I guess, and it might be a bit hard to do. And it turns out that embedding the touch_buffer() all the way down in __find_get_block() was convenient, but it's going to be tricky to change. For now I'm fairly inclined to just nuke the touch_buffer() on the read side and maybe add one on the modification codepaths and see what happens. As always, testing is the problem.

> The use-once cleanup (which takes a page flag for PG_new, I know...)
> would solve that problem. However, it would introduce the problem of
> having to scan all the pages on the list before a page becomes
> freeable. We would have to add some background scanning (or a separate
> list for PG_new pages) to make the initial pageout run use an
> acceptable amount of CPU time. Not sure that complexity will be worth
> it...

I suspect that the situation we have now is so bad that pretty much anything we do will be an improvement. I've always wondered ytf is there so much blockdev pagecache? This machine I'm typing at:

MemTotal:      3975080 kB
MemFree:        750400 kB
Buffers:        547736 kB
Cached:        1299532 kB
SwapCached:      12772 kB
Active:        1789864 kB
Inactive:       861420 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3975080 kB
LowFree:        750400 kB
SwapTotal:     4875716 kB
SwapFree:      4715660 kB
Dirty:              76 kB
Writeback:           0 kB
Mapped:         638036 kB
Slab:           522724 kB
CommitLimit:   6863256 kB
Committed_AS:  1115632 kB
PageTables:      14452 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     36432 kB
VmallocChunk: 34359696379 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

More than a quarter of my RAM in fs metadata! Most of it I'll bet is on the active list. And the fs on which I do most of the work is mounted noatime..
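The "more than a quarter" figure checks out against the dump: Buffers (the blockdev pagecache) plus Slab (where the dcache and icache live) against MemTotal. A quick arithmetic check, using the numbers from the /proc/meminfo output above:

```c
#include <assert.h>

/* Arithmetic behind "more than a quarter of my RAM in fs metadata":
 * Buffers + Slab as a fraction of MemTotal, expressed in permille to
 * stay in integer arithmetic. Inputs are kB, straight from the dump. */

static int metadata_permille(long buffers_kb, long slab_kb, long total_kb)
{
    return (int)(((buffers_kb + slab_kb) * 1000) / total_kb);
}
```

With Buffers = 547736 kB, Slab = 522724 kB, and MemTotal = 3975080 kB this gives 269 permille, i.e. just under 27% of RAM. (Slab also holds non-filesystem allocations, so this slightly overstates the metadata share; the dcache/icache are typically the dominant slab users after a tree walk.)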
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/27/2007 10:28 PM, Daniel Hazelton wrote:

> Check the attitude at the door then re-read what I actually said:

Attitude? You wanted attitude dear boy?

> Updatedb or another process that uses the FS heavily runs on a user's
> 256MB P3-800 (when it is idle) and the VFS caches grow, causing memory
> pressure that causes other applications to be swapped to disk. In the
> morning the user has to wait for the system to swap those applications
> back in. I never said that it was the *program* itself - or *any*
> specific program (I used "Updatedb" because it has been the big name
> in the discussion) - doing the filling of memory. I actually said that
> the problem is that the kernel's caches - VFS and others - will grow
> *WITHOUT* *LIMIT*, filling all available memory.

WHICH SWAP-PREFETCH DOES NOT HELP WITH.
WHICH SWAP-PREFETCH DOES NOT HELP WITH.
WHICH SWAP-PREFETCH DOES NOT HELP WITH.

And now finally get that through your thick skull or shut up, right fucking now.

> You want to know what causes the problem? The current design of the
> caches. They will extend without much limit, to the point of actually
> pushing pages to disk so they can grow even more.

Due to being a generally nice guy, I am going to try _once_ more to try and make you understand. Not twice, once. So pay attention. Right now.

Those caches are NOT causing any problem under discussion. If any caches grow to the point of causing swap-out, they have filled memory and swap-prefetch cannot and will not do anything since it needs free (as in not occupied by caches) memory. As such, people maintaining that swap-prefetch helps their situation are not being hit by caches. The only way swap-prefetch can (and will) do anything is when something that by itself takes up lots of memory runs and exits.

So can we now please finally drop the fucking red herring and start talking about swap-prefetch?
If we accept that some of the people maintaining that swap-prefetch helps them are not in fact deluded -- a bit of a stretch seeing as how not a single one of them is substantiating anything -- we have a number of slightly different possibilities for "something" in the above.

1) It could be an inefficient updatedb. Although he isn't experiencing the problem, Bjoern Steinbrink is posting numbers that show that at least the GNU version spawns a large-memory "sort" process, meaning that on a low-memory box updatedb itself can be what causes the observed problem. While in this situation switching to a different updatedb (slocate, mlocate) obviously makes sense, it's the kind of situation where swap-prefetch will help.

2) It could be something else entirely such as a backup run. I suppose people would know if they were running anything of the sort though and wouldn't blame anything on updatedb. Other than that, it's again the situation where swap-prefetch would help.

3) The something else entirely can also run _after_ updatedb, kicking out the VFS caches and leaving free memory upon exit. I still suppose the same thing as under (2) but this is the only way updatedb / VFS caches can even be part of any problem, if the _combined_ memory pressure is just enough to make the difference. The direct problem is still just the "something else entirely" and needs someone affected to tell us what it is.

> I already did. You completely ignored it because I happened to use the
> magic words "updatedb" and "swap prefetch".

No I did not. This thread is about swap-prefetch and you used the magic words VFS caches. I don't give a fryin' fuck if their filling is caused by updatedb or the cat sleeping on the "find /" keys on your keyboard, they're still not causing anything swap-prefetch helps with. This thread has seen input from a selection of knowledgeable people and Morton was even running benchmarks to look at this supposed VFS cache problem and not finding it.
The only further input this thread needs is someone affected by the supposed problem. Which I of course notice in a followup of yours you are not either -- you're just here to blabber, not to solve anything.

Rene.
Re: -mm merge plans for 2.6.23
Andrew Morton wrote: [...]

> And userspace can do a much better implementation of this
> how-to-handle-large-load-shifts problem, because it is really quite
> complex. The system needs to be monitored to determine what is the "usual"

[...]

> All this would end up needing runtime configurability and tweakability
> and customisability. All standard fare for userspace stuff - much
> easier than patching the kernel.

But a patch already exists. Which is easier: (1) apply the patch; or (2) write a new patch?

> So. We can
> a) provide a way for userspace to reload pagecache and
> b) merge maps2 (once it's finished) (pokes mpm)
> and we're done?

Might be. But merging maps2 has higher risk, which should be done in a development branch (er... 2.7, but we don't have it now).