Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 8/6/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
[...]
> > this completely ignores the use case where the swapping was exactly the
> > right thing to do, but memory has been freed up from a program exiting
> > so that you could now fill that empty RAM with data that was swapped
> > out.
>
> Yeah. However, merging patches (especially when changing heuristics,
> especially in page reclaim) is not about just thinking up a use-case that
> it works well for and telling people that they're putting their heads in
> the sand if they say anything against it. Read this thread and you'll
> find other examples of patches that have been around for as long or
> longer and also have some good use-cases and also have not been merged.

What do you think, Andrew? Swap prefetch is not a panacea; it's not going
to solve all the problems, but it seems to improve the "desktop
experience" and it has been discussed and reviewed a lot (it has even been
discussed more than it should have been). Are you going to push the patch
upstream?

Ciao,
--
Paolo
http://paolo.ciarrocchi.googlepages.com/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
--- [EMAIL PROTECTED] wrote:
> On Mon, 6 Aug 2007, Nick Piggin wrote:
> > [EMAIL PROTECTED] wrote:
> > > On Sun, 29 Jul 2007, Rene Herman wrote:
> > > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > > I agree that tinkering with the core VM code should not be done
> > > > > lightly, but this has been put through the proper process and is
> > > > > stalled with no hints on how to move forward.
> > > >
> > > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > > weren't being addressed.
> > >
> > > I may have missed them, but what I saw from him weren't specific
> > > issues, but instead a nebulous 'something better may come along
> > > later'
> >
> > Something better, ie. the problems with page reclaim being fixed.
> > Why is that nebulous?
>
> because that doesn't begin to address all the benefits.

What do you mean "address the benefits"? What I want to address is the
page reclaim problems.

> the approach of fixing page reclaim and updatedb is pretending that if
> you only do everything right pages won't get pushed to swap in the first
> place, and therefore swap prefetch won't be needed.

You should read what I wrote. Anyway, the fact of the matter is that there
are still fairly significant problems with page reclaim in this workload
which I would like to see fixed.

I personally still think some of the low-hanging fruit *might* be better
fixed before swap prefetch gets merged, but I've repeatedly said I'm sick
of getting dragged back into the whole debate, so I'm happy with whatever
Andrew decides to do with it.

I think it is sad to turn it off for laptops, if it really makes the
"desktop" experience so much better. Surely for _most_ workloads we should
be able to manage 1-2GB of RAM reasonably well.
> this completely ignores the use case where the swapping was exactly the
> right thing to do, but memory has been freed up from a program exiting
> so that you could now fill that empty RAM with data that was swapped
> out.

Yeah. However, merging patches (especially when changing heuristics,
especially in page reclaim) is not about just thinking up a use-case that
it works well for and telling people that they're putting their heads in
the sand if they say anything against it. Read this thread and you'll
find other examples of patches that have been around for as long or
longer and also have some good use-cases and also have not been merged.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Mon, 6 Aug 2007, Nick Piggin wrote:
> [EMAIL PROTECTED] wrote:
> > On Sun, 29 Jul 2007, Rene Herman wrote:
> > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > I agree that tinkering with the core VM code should not be done
> > > > lightly, but this has been put through the proper process and is
> > > > stalled with no hints on how to move forward.
> > >
> > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > weren't being addressed.
> >
> > I may have missed them, but what I saw from him weren't specific
> > issues, but instead a nebulous 'something better may come along later'
>
> Something better, ie. the problems with page reclaim being fixed.
> Why is that nebulous?

because that doesn't begin to address all the benefits.

the approach of fixing page reclaim and updatedb is pretending that if
you only do everything right pages won't get pushed to swap in the first
place, and therefore swap prefetch won't be needed.

this completely ignores the use case where the swapping was exactly the
right thing to do, but memory has been freed up from a program exiting so
that you could now fill that empty RAM with data that was swapped out.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
[EMAIL PROTECTED] wrote:
> On Sun, 29 Jul 2007, Rene Herman wrote:
> > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > I agree that tinkering with the core VM code should not be done
> > > lightly, but this has been put through the proper process and is
> > > stalled with no hints on how to move forward.
> >
> > It has not. Concerns that were raised (by specifically Nick Piggin)
> > weren't being addressed.
>
> I may have missed them, but what I saw from him weren't specific issues,
> but instead a nebulous 'something better may come along later'

Something better, ie. the problems with page reclaim being fixed.
Why is that nebulous?

--
SUSE Labs, Novell Inc.
Re: [ck] Re: -mm merge plans for 2.6.23
Matthew Hawkins wrote:
> On 7/25/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
> > I guess /proc/meminfo, /proc/zoneinfo, /proc/vmstat, /proc/slabinfo
> > before and after the updatedb run with the latest kernel would be a
> > first step. top and vmstat output during the run wouldn't hurt either.
>
> Hi Nick,
>
> I've attached two files with this kind of info. Being up at the cron
> hours of the morning meant I got a better picture of what my system is
> doing. Here's a short summary of what I saw in top:
>
> beagleindexer used gobs of ram. 600M or so (I have 1G)

Hmm OK, beagleindexer. I thought beagle didn't need frequent reindexing
because of inotify? Oh well...

> updatedb didn't use much ram, but while it was running kswapd kept on
> frequenting the top 10 cpu hogs - it would stick around for 5 seconds
> or so, then disappear for no more than 10 seconds, then come back again.
> This behaviour persisted during the run. updatedb ran third
> (beagleindexer was first, then update-dlocatedb)

Kswapd will use CPU when memory is low, even if there is no swapping.

Your "buffers" grew by 600% (from 50MB to 350MB), and slab also grew by a
few thousand entries. This is not just a problem when it pushes out swap;
it will also harm the file-backed working set.

This (which Ray's traces also show) is a bit of a problem. As Andrew
noticed, use-once isn't working well for buffer cache, and it doesn't
really for dentry and inode cache either (although those don't seem to be
as much of a problem on your workload). Andrew has done a little test
patch for this in -mm, but it probably wants more work and testing. If
you can test the -mm kernel and see if things are improved, that would
help.

--
SUSE Labs, Novell Inc.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Hi!

> > That would just save reading the directories. Not sure it helps that
> > much. Much better would be actually if it didn't stat the individual
> > files (and force their dentries/inodes in). I bet it does that to find
> > out if they are directories or not. But in a modern system it could
> > just check the type in the dirent on file systems that support that
> > and not do a stat. Then you would get much fewer dentries/inodes.
>
> FWIW, find(1) does *not* stat non-directories (and neither would this
> approach). So it's just dentries for directories, and you can't
> realistically skip those. OK, you could - if you had banned
> cross-directory rename for directories and propagated "dirty since last
> look" towards root (note that it would be a boolean, not a timestamp).
> Then we could skip unchanged subtrees completely...

Could we help it a little from the kernel and set 'dirty since last look'
on directory renames?

I mean, this is not only updatedb. KDE startup is limited by this, too.
It would be nice to have an effective 'what changed in tree' operation.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [ck] Re: -mm merge plans for 2.6.23
On 7/25/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
> I guess /proc/meminfo, /proc/zoneinfo, /proc/vmstat, /proc/slabinfo
> before and after the updatedb run with the latest kernel would be a
> first step. top and vmstat output during the run wouldn't hurt either.

Hi Nick,

I've attached two files with this kind of info. Being up at the cron
hours of the morning meant I got a better picture of what my system is
doing. Here's a short summary of what I saw in top:

beagleindexer used gobs of ram. 600M or so (I have 1G)

updatedb didn't use much ram, but while it was running kswapd kept on
frequenting the top 10 cpu hogs - it would stick around for 5 seconds or
so, then disappear for no more than 10 seconds, then come back again.
This behaviour persisted during the run. updatedb ran third
(beagleindexer was first, then update-dlocatedb)

I'm going to do this again, this time under a CFS kernel & use Ingo's
sched_debug script to see what the scheduler is doing also. Let me know
if there's anything else you wish to see.

The running kernel at the time was 2.6.22.1-ck. There's no slabinfo since
I'm using slub instead (and I don't have slub debug enabled).

Cheers,
--
Matt

beaglecron.ck    Description: Binary data
updatedbcron.ck  Description: Binary data
Re: [ck] Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Matthew Hawkins wrote:
> updatedb by itself doesn't really bug me, it's just that on occasion
> it's still running at 7am

You should start it earlier then - assuming it doesn't already start at
the earliest opportunity?

Helge Hafting
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sun, 29 Jul 2007, Rene Herman wrote:
> > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > I agree that tinkering with the core VM code should not be done
> > > > lightly, but this has been put through the proper process and is
> > > > stalled with no hints on how to move forward.
> > >
> > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > weren't being addressed.
> >
> > I may have missed them, but what I saw from him weren't specific
> > issues, but instead a nebulous 'something better may come along later'
> >
> > forget the nightly cron jobs for the moment. think of this scenario.
> > you have your memory fairly full with apps that you have open
> > (including firefox with many tabs), you receive a spreadsheet you need
> > to look at, so you fire up openoffice to look at it. then you exit
> > openoffice and try to go back to firefox (after a pause while you walk
> > to the printer to get the printout of the spreadsheet)
>
> And swinging a dead rat from its tail facing east-wards while reciting
> Documentation/CodingStyle.
>
> Okay, very very sorry, that was particularly childish, but that "walking
> to the printer" is of course completely constructed and this _is_
> something to take into account.

yes, it was contrived for simplicity. the same effect would happen if,
instead of going back to firefox, the user went to their e-mail software
and read some mail. doing so should still make the machine idle enough to
let prefetch kick in.

> Swap-prefetch wants to be free, which (also again) it is doing a good
> job at it seems, but this also means that it waits for the VM to be
> _very_ idle before it does anything, and as such we cannot just forget
> the "nightly" scenario and pretend it's about something else entirely.
> As long as the machine's being used, swap-prefetch doesn't kick in.

how long does the machine need to be idle? if someone spends 30 seconds
reading an e-mail, that's an incredibly long time for the system, and I
would think it should be enough to let the prefetch kick in.

> > -- 3: no serious consideration of possible alternatives
>
> Tweaking existing use-once logic is one I've heard, but if we consider
> the i/dcache issue dead, I believe that one is as well. Going to
> userspace is another one. Largest theoretical potential. I myself am
> extremely sceptical about the Linux userland, and largely equate it
> with "smallest _practical_ potential" -- but that might just be me.
>
> A larger swap granularity, possibly even a self-training granularity.
> Up to now, seeks only get costlier and costlier with respect to reads
> with every generation of disk (flash would largely overcome it though)
> and doing more in one read/write _greatly_ improves throughput, maybe
> up to the point that swap-prefetch is no longer very useful. I myself
> don't know about the tradeoffs involved.

larger swap granularity may help, but waiting for the user to need the
RAM and having to wait for it to be read back in is always going to be
worse for the user than pre-populating the free memory (for the case
where the pre-population is right; for other cases it's the same). so I
see this as a red herring.

> I saw Chris Snook make a good post here and am going to defer this part
> to that discussion: http://lkml.org/lkml/2007/7/27/421
>
> But no, it's not a red herring if _practically_ speaking the swapin is
> fast enough once started that people don't actually mind anymore, since
> in that case you could simply do without yet more additional VM
> complexity (and kernel daemon).

swapin will always require disk access, and avoiding doing disk access
while the user is waiting for it, by doing it when the system isn't using
the disk, will always be a win (possibly not as large of a win, but still
a win). on slow laptop drives, where you may only get 20MB/second of
reads under optimal situations, it doesn't take much reading to be
noticed by the user.

> > there are fully legitimate situations where this is useful, the
> > 'papering over' effect is not referring to these, it's referring to
> > other possible problems in the future.
>
> No, it's not just future. Just look at the various things under
> discussion now, such as improved use-once and better swapin.

and these things do not conflict with prefetch, they complement it.

improved use-once will avoid pushing things out to swap in the first
place. this will help during normal workloads, so is valuable in any
case.

better swapin (I assume you are talking about things like larger swap
granularity) will also help during normal workloads when you are
thrashing into swap.

prefetch will help when you have pushed things out to swap and now have
free memory and a momentarily idle system.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sunday 29 July 2007 16:00:22 Ray Lee wrote:
> On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> > If the problem is reading stuff back in from swap at the *same time*
> > that the application is reading stuff from some user file system, and
> > if that user file system is on the same drive as the swap partition
> > (typical on laptops), then interleaving the user file system accesses
> > with the swap partition accesses might overwhelm all other performance
> > problems, due to the frequent long seeks between the two.
>
> Ah, so in a normal scenario where a working-set is getting faulted back
> in, we have the swap storage as well as the file-backed stuff that needs
> to be read as well. So even if swap is organized perfectly, we're still
> seeking. Damn.

That is one reason why I try to have swap on a device dedicated just for
it. It helps keep the system from having to seek all over the drive for
data. (I remember that this was recommended years ago with Windows - back
when you could tell Windows where to put the swap file.)

> On the other hand, that explains another thing that swap prefetch could
> be helping with -- if it preemptively faults the swap back in, then the
> file-backed stuff can be faulted back more quickly, just by the virtue
> of not needing to seek back and forth to swap for its stuff. Hadn't
> thought of that.

For it to really help, swap-prefetch would have to be more aggressive. At
the moment (if I'm reading the code correctly) the system has to have
close to zero activity for it to kick in. A tunable knob controlling how
much activity is too much for the prefetch to kick in would help with
finding a sane default. IMHO it should be the one that provides the most
benefit with the least hit to performance.

> That also implies that people running with swap files rather than swap
> partitions will see less of an issue. I should dig out my old compact
> flash card and try putting swap on that for a week.

Maybe. It all depends on how much seeking is needed to track down the
pages in the swapfile and such.

What would really help make the situation even better would be doing the
log-structured swap + cleaner. The log-structured swap + cleaner should
provide a performance boost by itself - add in the prefetch mechanism and
the benefits are even more visible.

Another way to improve performance would require making the page
replacement mechanism more intelligent. There are bounds to what can be
done in the kernel without negatively impacting performance, but, if I've
read the code correctly, there might be a better way to decide which
pages to evict. One way to do this would be to implement some mechanism
that allows the system to choose a single group of contiguous pages (or,
say, a large soft-page) over swapping out a single page at a time. (Some
form of memory defrag would also be nice, but I can't think of a way to
do that without massively breaking everything.)

> > In case Andrew is so bored he read this far -- yes this wake-up sounds
> > like user space code, with minimal kernel changes to support any
> > particular lower level operation that we can't do already.
>
> He'd suggested using, uhm, ptrace_peek or somesuch for just such a
> purpose. The second half of the issue is to know when and what to
> target.

The userspace suggestion that was thrown out earlier would have been as
error-prone and problematic as FUSE. A solution like you suggest would be
workable - it's small and does a task that is best done in userspace
(IMHO). (IIRC, the original suggestion involved merging maps2 and another
patchset into mainline and using that, combined with PEEKTEXT, to provide
for a userspace swap daemon. Swap, IMHO, should never be handled outside
the kernel.)

What might be useful is a userspace daemon that tracks memory pressure
and uses a concise API to trigger various levels of prefetch and/or swap
aggressiveness.

DRH
--
Dialup is like pissing through a pipette. Slow and excruciatingly
painful.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > Ray wrote: > > Ah, so in a normal scenario where a working-set is getting faulted > > back in, we have the swap storage as well as the file-backed stuff > > that needs to be read as well. So even if swap is organized perfectly, > > we're still seeking. Damn. > > Perhaps this applies in some cases ... perhaps. Yeah, point taken: better data would make this a lot easier to figure out and target fixes. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote: > Ah, so in a normal scenario where a working-set is getting faulted > back in, we have the swap storage as well as the file-backed stuff > that needs to be read as well. So even if swap is organized perfectly, > we're still seeking. Damn. Perhaps this applies in some cases ... perhaps. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote: > If the problem is reading stuff back in from swap at the *same time* > that the application is reading stuff from some user file system, and if > that user file system is on the same drive as the swap partition > (typical on laptops), then interleaving the user file system accesses > with the swap partition accesses might overwhelm all other performance > problems, due to the frequent long seeks between the two. Ah, so in a normal scenario where a working-set is getting faulted back in, we have the swap storage as well as the file-backed stuff that needs to be read as well. So even if swap is organized perfectly, we're still seeking. Damn. On the other hand, that explains another thing that swap prefetch could be helping with -- if it preemptively faults the swap back in, then the file-backed stuff can be faulted back more quickly, just by the virtue of not needing to seek back and forth to swap for its stuff. Hadn't thought of that. That also implies that people running with swap files rather than swap partitions will see less of an issue. I should dig out my old compact flash card and try putting swap on that for a week. > In that case, swap layout and swap i/o block size are secondary. > However, pre-fetching, so that swap read back is not interleaved > with application file accesses, could help dramatically. > Perhaps we could have a 'wake-up' command, analogous to the various sleep > and hibernate commands. [...] > In case Andrew is so bored he read this far -- yes this wake-up sounds > like user space code, with minimal kernel changes to support any > particular lower level operation that we can't do already. He'd suggested using, uhm, ptrace_peek or somesuch for just such a purpose. The second half of the issue is to know when and what to target. 
Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote: > a log structured scheme, where the writeout happens to sequential spaces > on the drive instead of scattered about. If the problem is reading stuff back in from swap quickly when needed, then this likely helps, by reducing the seeks needed. If the problem is reading stuff back in from swap at the *same time* that the application is reading stuff from some user file system, and if that user file system is on the same drive as the swap partition (typical on laptops), then interleaving the user file system accesses with the swap partition accesses might overwhelm all other performance problems, due to the frequent long seeks between the two. In that case, swap layout and swap i/o block size are secondary. However, pre-fetching, so that swap read back is not interleaved with application file accesses, could help dramatically. === Perhaps we could have a 'wake-up' command, analogous to the various sleep and hibernate commands. The 'wake-up' command could do whatever of the following it knew to do, in order to optimize for an anticipated change in usage patterns:
1) pre-fetch swap
2) clean (write out) dirty pages
3) maximize free memory
4) layout swap nicely
5) pre-fetch a favorite set of apps
Stumble out of bed in the morning, press 'wake-up', start boiling the water for your coffee, and in another ten minutes, one is ready to rock and roll. In case Andrew is so bored he read this far -- yes this wake-up sounds like user space code, with minimal kernel changes to support any particular lower level operation that we can't do already. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
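[Editor's note: a minimal user-space sketch of the "wake-up" idea Paul describes above. Items 1 and 5 of his list amount to touching data so it is resident again before the user sits down; reading a file sequentially pulls it back through the page cache. The paths in FAVORITES are purely illustrative, not part of any real tool.]

```python
import os

# Hypothetical list of apps/libraries to warm up; example paths only.
FAVORITES = ["/usr/bin/firefox", "/usr/lib/libgtk-x11-2.0.so"]

def prefault(path, chunk=1 << 20):
    """Read a file sequentially so its pages are faulted back into RAM."""
    try:
        with open(path, "rb") as f:
            while f.read(chunk):
                pass
    except OSError:
        pass  # missing or unreadable file: nothing to warm

for p in FAVORITES:
    prefault(p)
```

This only covers the easy, file-backed half; pre-fetching anonymous pages out of swap (item 1) would need kernel help, which is what the thread is debating.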
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:52 PM, Ray Lee wrote: Well, that doesn't match my systems. My laptop has 400MB in swap: Which in your case is slightly more than 1/3 of available swap space. Quite a lot for a desktop indeed. And if it's more than a few percent fragmented, please fix current swapout instead of log structuring it. Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 07:19 PM, Ray Lee wrote: > For me, it is generally the case yes. We are still discussing this in the > context of desktop machines and their problems with being slow as things > have been swapped out and generally I expect a desktop to have plenty of > swap which it's not regularly going to fillup significantly since then the > machine's unworkably slow as a desktop anyway. Well, that doesn't match my systems. My laptop has 400MB in swap:

[EMAIL PROTECTED]:~$ free
             total       used       free     shared    buffers     cached
Mem:        894208     883920      10288          0       3044     163224
-/+ buffers/cache:     717652     176556
Swap:      1116476     393132     723344

> > And once there's something already in swap, you now have a packing > > problem when you want to swap something else out. > > Once we're crammed, it gets to be a different situation yes. As far as I'm > concerned that's for another thread though. I'm spending too much time on > LKML as it is... No, it's not even when crammed. It's just when there are holes. mm/swapfile.c does try to cluster things, but doesn't work too hard at it as we don't want to spend all our time looking for a perfect fit that may not exist. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> > Is that generally the case on your systems? Every linux system I've > > run, regardless of RAM, has always pushed things out to swap. > > For me, it is generally the case yes. We are still discussing this in the > context of desktop machines and their problems with being slow as things > have been swapped out and generally I expect a desktop to have plenty of > swap which it's not regularly going to fillup significantly since then the > machine's unworkably slow as a desktop anyway. A simple log optimises writeout (which is latency critical) and can otherwise stall an entire system. In a log you can also have multiple copies of the same page on disk easily, some stale - so you can write out chunks of data that are not all of them removed from memory, just so you get them back more easily if you then do (and I guess you'd mark them accordingly) The second element is a cleaner - something to go around removing stuff from the log that is needed when the disks are idle - and also to repack data in nice linear chunks. So instead of using the empty disk time for page-in you use it for packing data and optimising future paging.
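[Editor's note: a toy model of the log + cleaner Alan describes above — not kernel code, just an illustration. Writeout always appends sequentially at the log head; rewriting a page leaves a stale copy behind, and an idle-time cleaner drops stale copies and repacks the live ones linearly.]

```python
class LogSwap:
    def __init__(self, slots):
        self.log = [None] * slots   # disk slots, written strictly in order
        self.head = 0               # next append position
        self.where = {}             # page id -> slot holding its live copy

    def write(self, page):
        """Append a page at the log head; any older copy goes stale."""
        if self.head >= len(self.log):
            raise RuntimeError("log full; cleaner must run first")
        self.log[self.head] = page
        self.where[page] = self.head
        self.head += 1

    def clean(self):
        """Idle-time cleaner: drop stale copies and repack live pages."""
        live = [p for i, p in enumerate(self.log[:self.head])
                if p is not None and self.where[p] == i]
        self.log = [None] * len(self.log)
        for i, p in enumerate(live):
            self.log[i] = p
            self.where[p] = i
        self.head = len(live)

swap = LogSwap(8)
for p in ["a", "b", "c"]:
    swap.write(p)
swap.write("a")   # rewrite: the old copy of "a" at slot 0 is now stale
swap.clean()      # repack; one live copy of each page, laid out linearly
```

After cleaning, the three live pages occupy slots 0-2 contiguously, which is exactly the property that makes later swapin a linear read instead of scattered seeks.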
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:19 PM, Ray Lee wrote: The program is not a real-world issue and if you do not consider it a useful boundary condition either (okay I guess), how would log structured swap help if I just assume I have plenty of free swap to begin with? Is that generally the case on your systems? Every linux system I've run, regardless of RAM, has always pushed things out to swap. For me, it is generally the case yes. We are still discussing this in the context of desktop machines and their problems with being slow as things have been swapped out and generally I expect a desktop to have plenty of swap which it's not regularly going to fillup significantly since then the machine's unworkably slow as a desktop anyway. And once there's something already in swap, you now have a packing problem when you want to swap something else out. Once we're crammed, it gets to be a different situation yes. As far as I'm concerned that's for another thread though. I'm spending too much time on LKML as it is... Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 06:04 PM, Ray Lee wrote: > >> I am very aware of the costs of seeks (on current magnetic media). > > > > Then perhaps you can just take it on faith -- log structured layouts > > are designed to help minimize seeks, read and write. > > I am particularly bad at faith. Let's take that stupid program that I posted: You only think you are :-). I'm sure there are lots of things you have faith in. Gravity, for example :-). > The program is not a real-world issue and if you do not consider it a useful > boundary condition either (okay I guess), how would log structured swap help > if I just assume I have plenty of free swap to begin with? Is that generally the case on your systems? Every linux system I've run, regardless of RAM, has always pushed things out to swap. And once there's something already in swap, you now have a packing problem when you want to swap something else out.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 06:04 PM, Ray Lee wrote: I am very aware of the costs of seeks (on current magnetic media). Then perhaps you can just take it on faith -- log structured layouts are designed to help minimize seeks, read and write. I am particularly bad at faith. Let's take that stupid program that I posted: http://lkml.org/lkml/2007/7/25/85 You push it out before you hit enter, it's written out to swap, at whatever speed. How should it be laid out so that it's swapped in most efficiently after hitting enter? Reading bigger chunks would quite obviously help, but the layout? The program is not a real-world issue and if you do not consider it a useful boundary condition either (okay I guess), how would log structured swap help if I just assume I have plenty of free swap to begin with? Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 05:20 PM, Ray Lee wrote: > This seems to be now fixing the different problem of swap-space filling up. > I'm quite willing to for now assume I've got plenty free. I was trying to point out that currently, as an example, memory that is linear in a process' space could be fragmented on disk when swapped out. That's today. Under a log-structured scheme, one could set it up such that something that was linear in RAM could be swapped out linearly on the drive, minimizing seeks on writeout, which will naturally minimize seeks on swap in of that same data. > > So, at some point when the system needs to fault those blocks that > > back in, it now has a linear span of sectors to read instead of asking > > the drive to bounce over twenty tracks for a hundred blocks. > > Moreover though -- what I know about log structure is that generally it > optimises for write (swapout) and might make read (swapin) worse due to > fragmentation that wouldn't happen with a regular fs structure. It looks like I'm not doing a very good job of explaining this, I'm afraid. Suffice it to say that a log structured swap would give optimization options that we don't have today. > I guess that cleaner that Alan mentioned might be involved there -- I don't > know how/what it would be doing. Then you should google on `log structured filesystem (primer OR introduction)` and read a few of the links that pop up. You might find it interesting. > I am very aware of the costs of seeks (on current magnetic media). Then perhaps you can just take it on faith -- log structured layouts are designed to help minimize seeks, read and write. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 05:20 PM, Ray Lee wrote: I understand what log structure is generally, but how does it help swapin? Look at the swap out case first. Right now, when swapping out the kernel places whatever it can wherever it can inside the swap space. The closer you are to filling your swap space, the more likely that those swapped out blocks will be all over the place, rather than in one nice chunk. Contrast that with a log structured scheme, where the writeout happens to sequential spaces on the drive instead of scattered about. This seems to be now fixing the different problem of swap-space filling up. I'm quite willing to for now assume I've got plenty free. So, at some point when the system needs to fault those blocks that back in, it now has a linear span of sectors to read instead of asking the drive to bounce over twenty tracks for a hundred blocks. Moreover though -- what I know about log structure is that generally it optimises for write (swapout) and might make read (swapin) worse due to fragmentation that wouldn't happen with a regular fs structure. I guess that cleaner that Alan mentioned might be involved there -- I don't know how/what it would be doing. So, it eliminates the seeks. My laptop drive can read (huh, how odd, it got slower, need to retest in single user mode), hmm, let's go with about 25 MB/s. If we ask for a single block from each track, though, that'll drop to 4k * (1 second / seek time) which is about a megabyte a second if we're lucky enough to read from consecutive tracks. Even worse if it's not. Seeks are the enemy any time you need to hit the drive for anything, be it swapping or optimizing a database. I am very aware of the costs of seeks (on current magnetic media). Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 04:58 PM, Ray Lee wrote: > > On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > >> Right over my head. Why does log-structure help anything? > > > > Log structured disk layouts allow for better placement of writeout, so > > that you can eliminate most or all seeks. Seeks are the enemy when > > trying to get full disk bandwidth. > > > > google on log structured disk layout, or somesuch, for details. > > I understand what log structure is generally, but how does it help swapin? Look at the swap out case first. Right now, when swapping out the kernel places whatever it can wherever it can inside the swap space. The closer you are to filling your swap space, the more likely that those swapped out blocks will be all over the place, rather than in one nice chunk. Contrast that with a log structured scheme, where the writeout happens to sequential spaces on the drive instead of scattered about. So, at some point when the system needs to fault those blocks that back in, it now has a linear span of sectors to read instead of asking the drive to bounce over twenty tracks for a hundred blocks. So, it eliminates the seeks. My laptop drive can read (huh, how odd, it got slower, need to retest in single user mode), hmm, let's go with about 25 MB/s. If we ask for a single block from each track, though, that'll drop to 4k * (1 second / seek time) which is about a megabyte a second if we're lucky enough to read from consecutive tracks. Even worse if it's not. Seeks are the enemy any time you need to hit the drive for anything, be it swapping or optimizing a database. Ray
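[Editor's note: Ray's back-of-the-envelope numbers above can be made concrete. The figures here are illustrative assumptions in the spirit of his mail — roughly 25 MB/s sequential bandwidth and a 4 ms track-to-track seek — not measurements.]

```python
PAGE = 4096                # bytes per swapped page
SEQ_BW = 25 * 1024**2      # assumed sequential read bandwidth, bytes/s
SEEK = 0.004               # assumed seek time to an adjacent track, seconds

# Worst case: one 4k page per seek, i.e. blocks scattered over the disk.
scattered_bw = PAGE / (SEEK + PAGE / SEQ_BW)

# Best case: 100 pages laid out linearly -- one seek, then a streaming read.
pages = 100
linear_bw = (pages * PAGE) / (SEEK + pages * PAGE / SEQ_BW)

print(f"scattered: {scattered_bw / 1024**2:.2f} MB/s")
print(f"linear:    {linear_bw / 1024**2:.2f} MB/s")
```

With these assumed numbers the scattered case lands right around the "about a megabyte a second" Ray mentions, while the linear layout recovers most of the drive's sequential bandwidth — which is the whole argument for keeping swapped-out data contiguous.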
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 04:58 PM, Ray Lee wrote: On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: On 07/29/2007 03:12 PM, Alan Cox wrote: More radically if anyone wants to do real researchy type work - how about log structured swap with a cleaner ? Right over my head. Why does log-structure help anything? Log structured disk layouts allow for better placement of writeout, so that you can eliminate most or all seeks. Seeks are the enemy when trying to get full disk bandwidth. google on log structured disk layout, or somesuch, for details. I understand what log structure is generally, but how does it help swapin? Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote: > On 07/29/2007 03:12 PM, Alan Cox wrote: > > More radically if anyone wants to do real researchy type work - how about > > log structured swap with a cleaner ? > > Right over my head. Why does log-structure help anything? Log structured disk layouts allow for better placement of writeout, so that you can eliminate most or all seeks. Seeks are the enemy when trying to get full disk bandwidth. google on log structured disk layout, or somesuch, for details. Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 03:12 PM, Alan Cox wrote: What are the tradeoffs here? What wants small chunks? Also, as far as I'm aware Linux does not do things like up the granularity when it notices it's swapping in heavily? That sounds sort of promising... Small chunks means you get better efficiency of memory use - large chunks mean you may well page in a lot more than you needed to each time (and cause more paging in turn). Your disk would prefer you fed it big linear I/O's - 512KB would probably be my first guess at tuning a large box under load for paging chunk size. That probably kills my momentary hope that I was looking at yet another good use of large soft-pages seeing as how 512K would be going overboard a bit right? :-/ More radically if anyone wants to do real researchy type work - how about log structured swap with a cleaner ? Right over my head. Why does log-structure help anything? Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote: And now you do it again :-) There is no conclusion -- just the inescapable observation that swap-prefetch was (or may have been) masking the problem of GNU locate being a program that no one in their right mind should be using. isn't your conclusion then that if people just stopped using that version of updatedb the problem would be solved and there would be no need for the swap prefetch patch? that seemed to be what you were strongly implying (if not saying outright) No. What I said outright, every single time, is that swap-prefetch in itself seems to make sense. And specifically that even if the _direct_ problem is a crummy program, it _still_ makes sense generally. Every single time. But see -- you failed to notice this because you guys are stuck in this dumb adversary "us against them" thing so inherent of (online) communities, where you sit around your own habitats patting each other on the back for extended periods of time and then every once in a while go out clinging on to each other vigorously and going "boo! hiss!" at the big bad outside world. I already got overly violent at one point in this thread so I'll leave out any further references to sense-deprived fanboy-culture but please, I said every single time that I'm not against swap-prefetch. I cannot communicate when I'm not being read. I agree that tinkering with the core VM code should not be done lightly, but this has been put through the proper process and is stalled with no hints on how to move forward. It has not. Concerns that were raised (by specifically Nick Piggin) weren't being addressed. forget the nightly cron jobs for the moment. think of this scenario. you have your memory fairly full with apps that you have open (including firefox with many tabs), you receive a spreadsheet you need to look at, so you fire up openoffice to look at it.
then you exit openoffice and try to go back to firefox (after a pause while you walk to the printer to get the printout of the spreadsheet) And swinging a dead rat from its tail facing east-wards while reciting Documentation/CodingStyle. Okay, very very sorry, that was particularly childish, but that "walking to the printer" is of course completely constructed and this _is_ something to take into account. Swap-prefetch wants to be free, which (also again) it is doing a good job at it seems, but this also means that it waits for the VM to be _very_ idle before it does anything and as such, we cannot just forget the "nightly" scenario and pretend it's about something else entirely. As long as the machine's being used, swap-prefetch doesn't kick in. Which is a good feature for swap-prefetch, but also something that needs to be weighed alongside its other features in a discussion of alternatives, where for example something like a larger swap granularity would not have anything of the sort to take into account. If it were about walks to the printer, we could shelve the issue as being of too limited practical use for inclusion. -- 2: no serious investigation into possible downsides Swap-prefetch tries hard to be as free as possible and it seems to largely be succeeding at that. Thing that (obviously -- as in I wouldn't want to state it's the only possible worry anyone could have left) remains is the "papering over effect" it has by design that one might not care for. Arjan van de Ven made another point here about seeking away due to swap-prefetch (just) before the next request comes in, but that's probably a bit of a non-issue in practice with the "very idle" precondition. -- 3: no serious consideration of possible alternatives Tweaking existing use-once logic is one I've heard but if we consider the i/dcache issue dead, I believe that one is as well. Going to userspace is another one. Largest theoretical potential.
I myself am extremely sceptical about the Linux userland, and largely equate it with "smallest _practical_ potential" -- but that might just be me. A larger swap granularity, possibly even a self-training granularity. Up to now, seeks only get costlier and costlier with respect to reads with every generation of disk (flash would largely overcome it though) and doing more in one read/write _greatly_ improves throughput, maybe up to the point that swap-prefetch is no longer very useful. I myself don't know about the tradeoffs involved. larger swap granularity may help, but waiting for the user to need the ram and have to wait for it to be read back in is always going to be worse for the user than pre-populating the free memory (for the case where the pre-population is right, for other cases it's the same). so I see this as a red herring I saw Chris Snook make a good post here and am going to defer this part to that discussion: http://lkml.org/lkml/2007/7/27/421 But no, it's not a red herring if _practically_ speaking the swapin is fast enough once started
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> Contrived thing and all, but what it does do is show exactly how bad seeking > all over swap-space is. If you push it out before hitting enter, the time it > takes easily grows past 10 minutes (with my 768M) versus sub-second (!) when > it's all in to start with. Think in "operations/second" and you get a better view of the disk. > What are the tradeoffs here? What wants small chunks? Also, as far as I'm > aware Linux does not do things like up the granularity when it notices it's > swapping in heavily? That sounds sort of promising... Small chunks means you get better efficiency of memory use - large chunks mean you may well page in a lot more than you needed to each time (and cause more paging in turn). Your disk would prefer you fed it big linear I/O's - 512KB would probably be my first guess at tuning a large box under load for paging chunk size. More radically if anyone wants to do real researchy type work - how about log structured swap with a cleaner ? Alan
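[Editor's note: the chunk-size tradeoff Alan states — small chunks waste seeks, large chunks drag in pages you may not need — can be put in numbers. The seek time, bandwidth, and "useful fraction" here are made-up model parameters for illustration, not measurements or tuning advice.]

```python
PAGE = 4096
SEEK = 0.010               # assumed average seek, seconds
SEQ_BW = 40 * 1024**2      # assumed sequential bandwidth, bytes/s

def time_per_useful_page(chunk_bytes, useful_fraction):
    """Cost of one paging I/O, amortized over the pages actually wanted."""
    pages = chunk_bytes // PAGE
    useful = max(1, int(pages * useful_fraction))
    return (SEEK + chunk_bytes / SEQ_BW) / useful

# Assume only half the pages in each chunk turn out to be needed.
for chunk_kb in (4, 64, 512):
    t = time_per_useful_page(chunk_kb * 1024, useful_fraction=0.5)
    print(f"{chunk_kb:>4} kB chunk: {t * 1000:.2f} ms per useful page")
```

Even with half of every chunk wasted, the per-useful-page cost falls steeply as the chunk grows, because the seek dominates — which is why Alan's first guess for a loaded box is as large as 512KB. The memory-efficiency cost (the wasted pages causing more paging in turn) is what this toy model does not capture.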
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andi wrote: > GNU sort uses a merge sort with temporary files on disk. Not sure > how much it keeps in memory during that, but it's probably less > than 150MB. If I'm reading the source code for GNU sort correctly, then the following snippet of shell code displays how much memory it uses for its primary buffer on typical GNU/Linux systems:

head -2 /proc/meminfo | awk '
    NR == 1 { memtotal = $2 }
    NR == 2 { memfree = $2 }
    END {
        if (memfree > memtotal/8)
            m = memfree
        else
            m = memtotal/8
        print "sort size:", m/2, "kB"
    }
'

That is, over simplifying, GNU sort looks at the first two entries in /proc/meminfo, which for example on a machine near me happen to be:

MemTotal:  2336472 kB
MemFree:    110600 kB

and then uses one-half of whichever is -greater- of MemTotal/8 or MemFree. ... However ... for the typical GNU locate updatedb run, it is sorting the list of pathnames for almost all files on the system, which is usually larger than fits in one of these sized buffers. So it ends up using quite a few of the temporary files you mention, which tends to chew up easily freed memory. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sun, 29 Jul 2007, Rene Herman wrote: On 07/28/2007 11:00 PM, [EMAIL PROTECTED] wrote: > many -mm users use it anyway? He himself said he's not convinced of > usefulness having not seen it help for him (and notice that most > developers are also users), turned it off due to it annoying him at some > point and hasn't seen a serious investigation into potential downsides. if that was the case then people should be responding to the request to get it merged with 'but it caused problems for me when I tried it' I haven't seen any comments like that. So you're saying Andrew did not say that? You're jumping to the conclusion that I am saying that it's causing problems. I don't remember anyone saying that it actually caused problems (including both you and Andrew). I (and others) have been trying to learn what problems people believe it has in the hope that they can be addressed one way or another. > > that the only significant con left is the potential to mask other > > problems. > > Which is not a made-up issue, mind you. As an example, I just now tried > GNU locate and saw it's a complete pig and specifically unsuitable for > the low memory boxes under discussion. Upon completion, it actually > frees enough memory that swap-prefetch _could_ help on some boxes, while > the real issue is that they should first and foremost dump GNU locate. I see the conclusion as being exactly the opposite. And now you do it again :-) There is no conclusion -- just the inescapable observation that swap-prefetch was (or may have been) masking the problem of GNU locate being a program that no one in their right mind should be using. isn't your conclusion then that if people just stopped using that version of updatedb the problem would be solved and there would be no need for the swap prefetch patch?
that seemed to be what you were strongly implying (if not saying outright) so there is a legitimate situation where swap-prefetch will help significantly, what is the downside that prevents it from being included? People being unconvinced it helps all that much, no serious investigation into possible downsides and no consideration of alternatives are three I've personally heard. You don't want to merge a conceptually core VM feature if you're not really convinced. It's not a part of the kernel you can throw a feature into like you could some driver saying "ah, heck, if it makes someone happy" since everything in the VM ends up interacting -- that in fact is actually the hard part of VM as far as I've seen it. And in this situation the proposed feature is something that "papers over a problem" by design -- where it could certainly be that the problem is not solvable in another way simply due to the kernel not growing the possibility to read user's minds anytime soon (which some might even like to rephrase as "due to no problem existing") but that this gets people a bit anxious is not surprising. people who have lots of memory and so don't use swap will never see the benefit of this patch. over the years many people have investigated the problem and tried to address it in other ways (the better version of updatedb is an attempt to fix it for that program as an example), but there is still a problem. I agree that tinkering with the core VM code should not be done lightly, but this has been put through the proper process and is stalled with no hints on how to move forward. I've seen it mentioned that there is still a maintainer but I missed who it is, but I haven't seen any concerns that can be addressed, they all seem to be 'this is a core concept, people need to think about it' or 'but someone may find a better answer in the future' type of things. it's impossible to address these concerns directly. So do it indirectly.
But please don't just say "it help some people (not me mind you!) so merge it and if you don't it's all just politics and we can't do anything about it anyway". Because that's mostly what I've been hearing. And no, I'm not subscribed to any ck mailinglists nor do I hang around its IRC community which will can account for part of that. I expect though that the same holds for the people that actually matter in this, such as Andrew Morton and Nick Piggin. -- 1: people being unconvinced it helps all that much At least partly caused by the updatedb i/dcache red herring that infected this issue. Also, at the point VM pressure has mounted high enough to cause enough to be swapped out to give you a bad experience, a lot of other things have been dropped already as well. It's unsurprising though that it would for example help the issue of openoffice with a large open spreadsheet having been thrown out overnight meaning it's a matter of deciding whether or not this is an important enough issue to fix inside the VM with something like swap-prefetch. Personally -- no opinion, I do not experience the problem (I even
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 11:00 PM, [EMAIL PROTECTED] wrote:

>> many -mm users use it anyway? He himself said he's not convinced of
>> usefulness having not seen it help for him (and notice that most
>> developers are also users), turned it off due to it annoying him at
>> some point and hasn't seen a serious investigation into potential
>> downsides.
>
> if that was the case then people should be responding to the request to
> get it merged with 'but it caused problems for me when I tried it'. I
> haven't seen any comments like that.

So you're saying Andrew did not say that? You're jumping to the
conclusion that I am saying that it's causing problems.

>>> that the only significant con left is the potential to mask other
>>> problems.
>>
>> Which is not a made-up issue, mind you. As an example, I just now
>> tried GNU locate and saw it's a complete pig and specifically
>> unsuitable for the low memory boxes under discussion. Upon completion,
>> it actually frees enough memory that swap-prefetch _could_ help on
>> some boxes, while the real issue is that they should first and
>> foremost dump GNU locate.
>
> I see the conclusion as being exactly the opposite.

And now you do it again :-) There is no conclusion -- just the
inescapable observation that swap-prefetch was (or may have been)
masking the problem of GNU locate being a program that no one in their
right mind should be using.

> so there is a legitimate situation where swap-prefetch will help
> significantly, what is the downside that prevents it from being
> included?

People being unconvinced it helps all that much, no serious
investigation into possible downsides and no consideration of
alternatives are three I've personally heard.

You don't want to merge a conceptually core VM feature if you're not
really convinced. It's not a part of the kernel you can throw a feature
into like you could some driver, saying "ah, heck, if it makes someone
happy", since everything in the VM ends up interacting -- that in fact
is actually the hard part of VM as far as I've seen it.

And in this situation the proposed feature is something that "papers
over a problem" by design -- where it could certainly be that the
problem is not solvable in another way, simply due to the kernel not
growing the possibility to read users' minds anytime soon (which some
might even like to rephrase as "due to no problem existing"), but that
this gets people a bit anxious is not surprising.

> I've seen it mentioned that there is still a maintainer but I missed
> who it is, but I haven't seen any concerns that can be addressed, they
> all seem to be 'this is a core concept, people need to think about it'
> or 'but someone may find a better answer in the future' type of things.
> it's impossible to address these concerns directly.

So do it indirectly. But please don't just say "it helps some people
(not me mind you!) so merge it, and if you don't it's all just politics
and we can't do anything about it anyway". Because that's mostly what
I've been hearing.

And no, I'm not subscribed to any ck mailing lists nor do I hang around
its IRC community, which can account for part of that. I expect though
that the same holds for the people that actually matter in this, such
as Andrew Morton and Nick Piggin.

--
1: people being unconvinced it helps all that much

At least partly caused by the updatedb i/dcache red herring that
infected this issue. Also, at the point VM pressure has mounted high
enough to cause enough to be swapped out to give you a bad experience,
a lot of other things have been dropped already as well.

It's unsurprising though that it would for example help the issue of
openoffice with a large open spreadsheet having been thrown out
overnight, meaning it's a matter of deciding whether or not this is an
important enough issue to fix inside the VM with something like
swap-prefetch. Personally -- no opinion, I do not experience the
problem (I even switch off the machine at night and do not run cron at
all).

--
2: no serious investigation into possible downsides

Swap-prefetch tries hard to be as free as possible and it seems to
largely be succeeding at that. Thing that (obviously -- as in I
wouldn't want to state it's the only possible worry anyone could have
left) remains is the "papering over effect" it has by design that one
might not care for.

--
3: no serious consideration of possible alternatives

Tweaking existing use-once logic is one I've heard but if we consider
the i/dcache issue dead, I believe that one is as well.

Going to userspace is another one. Largest theoretical potential. I
myself am extremely sceptical about the Linux userland, and largely
equate it with "smallest _practical_ potential" -- but that might just
be me.

A larger swap granularity, possibly even a self-training granularity.
Up to now, seeks only get costlier and costlier with respect to reads
with every generation of disk (flash would largely overcome it though)
and doing more in one read/write _greatly_ improves throughput, maybe
up to the point that swap-prefetch is no longer very useful. I myself
don't know about the tradeoffs involved.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 01:21 PM, Alan Cox wrote:

>> It is. Prefetched pages can be dropped on the floor without additional
>> I/O.
>
> Which is essentially free for most cases. In addition your disk access
> may well have been in idle time (and should be for this sort of stuff)

Yes. The swap-prefetch patch ensures that the machine (well, the VM) is
very idle before it allows itself to kick in.

> and if it was in the same chunk as something nearby was effectively
> free anyway. Actual physical disk ops are a precious resource and
> anything that mostly reduces the number will be a win - not to say swap
> prefetch is the right answer but accidentally or otherwise there are
> good reasons it may happen to help.
>
> Bigger more linear chunks of writeout/readin is much more important I
> suspect than swap prefetching.

Yes, I believe this might be an important point. Earlier I posted a
dumb little VM thrasher:

http://lkml.org/lkml/2007/7/25/85

Contrived thing and all, but what it does do is show exactly how bad
seeking all over swap-space is. If you push it out before hitting
enter, the time it takes easily grows past 10 minutes (with my 768M)
versus sub-second (!) when it's all in to start with.

What are the tradeoffs here? What wants small chunks? Also, as far as
I'm aware Linux does not do things like up the granularity when it
notices it's swapping in heavily? That sounds sort of promising...

>> good overview of exactly how broken -mm can be at times. How many -mm
>> users use it anyway? He himself said he's not convinced of usefulness
>> having not
>
> I've been using it for months with no noticed problem. I turn it on
> because it might as well get tested. I've not done comparison tests so
> I can't comment on if it's worth it. Lots of -mm testers turn
> *everything* on because it's a test kernel.

Okay.

Rene.
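The thrasher itself is only linked above, but the access-pattern difference it exercises can be sketched with a small, hypothetical Python snippet (not Rene's program): read the same data once in linear page order and once in shuffled page order. On a rotating disk with cold caches the shuffled pass pays roughly one seek per 4 KiB page, which is exactly the "seeking all over swap-space" cost described; on a tiny, cached file like this one both passes are fast, so the sketch only demonstrates the patterns themselves.

```python
# Hypothetical illustration of sequential vs scattered page reads.
import os
import random
import tempfile

PAGE = 4096
PAGES = 256  # 1 MiB; a real thrasher would use much more than RAM

def read_pages(path, order):
    """Read 4 KiB pages in the given order; return total bytes read."""
    total = 0
    with open(path, "rb") as f:
        for i in order:
            f.seek(i * PAGE)
            total += len(f.read(PAGE))
    return total

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(PAGE * PAGES))

seq_order = list(range(PAGES))
rnd_order = seq_order[:]
random.shuffle(rnd_order)  # same pages, worst-case ordering

n_seq = read_pages(path, seq_order)
n_rnd = read_pages(path, rnd_order)
os.unlink(path)

# Both orders transfer identical data; only the head movement differs.
print(n_seq, n_rnd)
```

Both passes read exactly the same bytes; the order of the offsets handed to the disk is the only variable, which is why larger, more linear swap chunks (as discussed below in the thread) help so much.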
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andi wrote:
> GNU sort uses a merge sort with temporary files on disk. Not sure how
> much it keeps in memory during that, but it's probably less than 150MB.

If I'm reading the source code for GNU sort correctly, then the
following snippet of shell code displays how much memory it uses for
its primary buffer on typical GNU/Linux systems:

    head -2 /proc/meminfo | awk '
        NR == 1 { memtotal = $2 }
        NR == 2 { memfree = $2 }
        END {
            if (memfree > memtotal/8)
                m = memfree
            else
                m = memtotal/8
            print "sort size:", m/2, "kB"
        }
    '

That is, over simplifying, GNU sort looks at the first two entries in
/proc/meminfo, which for example on a machine near me happen to be:

    MemTotal:  2336472 kB
    MemFree:    110600 kB

and then uses one-half of whichever is -greater- of MemTotal/8 or
MemFree.

... However ... for the typical GNU locate updatedb run, it is sorting
the list of pathnames for almost all files on the system, which is
usually larger than fits in one of these sized buffers. So it ends up
using quite a few of the temporary files you mention, which tends to
chew up easily freed memory.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
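For readers who don't speak awk, the sizing rule Paul describes can be restated as a small, hypothetical Python helper (the function name and second example values are mine, not from the mail): take whichever is greater of MemTotal/8 and MemFree, both in kB as /proc/meminfo reports them, then halve it.

```python
# Hypothetical helper mirroring the buffer-sizing rule described above.
def sort_buffer_kb(memtotal_kb: int, memfree_kb: int) -> float:
    """Half of the greater of MemTotal/8 and MemFree, in kB."""
    m = max(memtotal_kb / 8, memfree_kb)
    return m / 2

# With the example /proc/meminfo values from the mail
# (MemTotal: 2336472 kB, MemFree: 110600 kB), MemTotal/8 = 292059 kB
# wins, giving a ~146029 kB (~143 MB) buffer -- which matches Andi's
# "probably less than 150MB" guess.
print(sort_buffer_kb(2336472, 110600))  # -> 146029.5
```

On a freshly booted box with most memory free, MemFree wins instead, so the buffer can be considerably larger than MemTotal/16.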
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> Contrived thing and all, but what it does do is show exactly how bad
> seeking all over swap-space is. If you push it out before hitting
> enter, the time it takes easily grows past 10 minutes (with my 768M)
> versus sub-second (!) when it's all in to start with.

Think in operations/second and you get a better view of the disk.

> What are the tradeoffs here? What wants small chunks? Also, as far as
> I'm aware Linux does not do things like up the granularity when it
> notices it's swapping in heavily? That sounds sort of promising...

Small chunks means you get better efficiency of memory use - large
chunks mean you may well page in a lot more than you needed to each
time (and cause more paging in turn).

Your disk would prefer you fed it big linear I/O's - 512KB would
probably be my first guess at tuning a large box under load for paging
chunk size.

More radically if anyone wants to do real researchy type work - how
about log structured swap with a cleaner ?

Alan
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:

>> And now you do it again :-) There is no conclusion -- just the
>> inescapable observation that swap-prefetch was (or may have been)
>> masking the problem of GNU locate being a program that no one in
>> their right mind should be using.
>
> isn't your conclusion then that if people just stopped using that
> version of updatedb the problem would be solved and there would be no
> need for the swap prefetch patch?
>
> that seemed to be what you were strongly implying (if not saying
> outright)

No. What I said outright, every single time, is that swap-prefetch in
itself seems to make sense. And specifically that even if the _direct_
problem is a crummy program, it _still_ makes sense generally. Every
single time.

But see -- you failed to notice this because you guys are stuck in this
dumb adversary "us against them" thing so inherent of (online)
communities, where you sit around your own habitats patting each other
on the back for extended periods of time and then every once in a while
go out clinging on to each other vigorously and going "boo! hiss!" at
the big bad outside world.

I already got overly violent at one point in this thread so I'll leave
out any further references to sense-deprived fanboy-culture but please,
I said every single time that I'm not against swap-prefetch. I cannot
communicate when I'm not being read.

> I agree that tinkering with the core VM code should not be done
> lightly, but this has been put through the proper process and is
> stalled with no hints on how to move forward.

It has not. Concerns that were raised (by specifically Nick Piggin)
weren't being addressed.

> forget the nightly cron jobs for the moment. think of this scenario.
> you have your memory fairly full with apps that you have open
> (including firefox with many tabs), you receive a spreadsheet you need
> to look at, so you fire up openoffice to look at it.
>
> then you exit openoffice and try to go back to firefox (after a pause
> while you walk to the printer to get the printout of the spreadsheet)

And swinging a dead rat from its tail facing east-wards while reciting
Documentation/CodingStyle.

Okay, very very sorry, that was particularly childish, but that walking
to the printer is of course completely constructed and this _is_
something to take into account. Swap-prefetch wants to be free, which
(also again) it is doing a good job at it seems, but this also means
that it waits for the VM to be _very_ idle before it does anything and
as such, we cannot just forget the nightly scenario and pretend it's
about something else entirely. As long as the machine's being used,
swap-prefetch doesn't kick in.

Which is a good feature for swap-prefetch, but also something that
needs to be weighed alongside its other features in a discussion of
alternatives, where for example something like a larger swap
granularity would not have anything of the sort to take into account.
If it were about walks to the printer, we could shelve the issue as
being of too limited practical use for inclusion.

>> -- 2: no serious investigation into possible downsides
>>
>> Swap-prefetch tries hard to be as free as possible and it seems to
>> largely be succeeding at that. Thing that (obviously -- as in I
>> wouldn't want to state it's the only possible worry anyone could have
>> left) remains is the "papering over effect" it has by design that one
>> might not care for.

Arjan van de Ven made another point here about seeking away due to
swap-prefetch (just) before the next request comes in, but that's
probably a bit of a non-issue in practice with the very "idle"
precondition.

>> -- 3: no serious consideration of possible alternatives
>>
>> Tweaking existing use-once logic is one I've heard but if we consider
>> the i/dcache issue dead, I believe that one is as well.
>>
>> Going to userspace is another one. Largest theoretical potential. I
>> myself am extremely sceptical about the Linux userland, and largely
>> equate it with "smallest _practical_ potential" -- but that might
>> just be me.
>>
>> A larger swap granularity, possibly even a self-training granularity.
>> Up to now, seeks only get costlier and costlier with respect to reads
>> with every generation of disk (flash would largely overcome it
>> though) and doing more in one read/write _greatly_ improves
>> throughput, maybe up to the point that swap-prefetch is no longer
>> very useful. I myself don't know about the tradeoffs involved.
>
> larger swap granularity may help, but waiting for the user to need the
> ram and have to wait for it to be read back in is always going to be
> worse for the user than pre-populating the free memory (for the case
> where the pre-population is right, for other cases it's the same). so
> I see this as a red herring

I saw Chris Snook make a good post here and am going to defer this part
to that discussion: http://lkml.org/lkml/2007/7/27/421

But no, it's not a red herring if _practically_ speaking the swapin is
fast enough once started that people
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 03:12 PM, Alan Cox wrote:

>> What are the tradeoffs here? What wants small chunks? Also, as far as
>> I'm aware Linux does not do things like up the granularity when it
>> notices it's swapping in heavily? That sounds sort of promising...
>
> Small chunks means you get better efficiency of memory use - large
> chunks mean you may well page in a lot more than you needed to each
> time (and cause more paging in turn).
>
> Your disk would prefer you fed it big linear I/O's - 512KB would
> probably be my first guess at tuning a large box under load for paging
> chunk size.

That probably kills my momentary hope that I was looking at yet another
good use of large soft-pages, seeing as how 512K would be going
overboard a bit, right? :-/

> More radically if anyone wants to do real researchy type work - how
> about log structured swap with a cleaner ?

Right over my head. Why does log-structure help anything?

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 03:12 PM, Alan Cox wrote:
>> More radically if anyone wants to do real researchy type work - how
>> about log structured swap with a cleaner ?
>
> Right over my head. Why does log-structure help anything?

Log structured disk layouts allow for better placement of writeout, so
that you can eliminate most or all seeks. Seeks are the enemy when
trying to get full disk bandwidth. Google on "log structured disk
layout", or somesuch, for details.

Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 04:58 PM, Ray Lee wrote:
>> On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
>>> Right over my head. Why does log-structure help anything?
>>
>> Log structured disk layouts allow for better placement of writeout,
>> so that you can eliminate most or all seeks. Seeks are the enemy when
>> trying to get full disk bandwidth. Google on "log structured disk
>> layout", or somesuch, for details.
>
> I understand what log structure is generally, but how does it help
> swapin?

Look at the swap out case first. Right now, when swapping out the
kernel places whatever it can wherever it can inside the swap space.
The closer you are to filling your swap space, the more likely that
those swapped out blocks will be all over the place, rather than in one
nice chunk.

Contrast that with a log structured scheme, where the writeout happens
to sequential spaces on the drive instead of scattered about. So, at
some point when the system needs to fault those blocks back in, it now
has a linear span of sectors to read instead of asking the drive to
bounce over twenty tracks for a hundred blocks. So, it eliminates the
seeks.

My laptop drive can read (huh, how odd, it got slower, need to retest
in single user mode), hmm, let's go with about 25 MB/s. If we ask for a
single block from each track, though, that'll drop to 4k * (1 second /
seek time), which is about a megabyte a second if we're lucky enough to
read from consecutive tracks. Even worse if it's not.

Seeks are the enemy any time you need to hit the drive for anything, be
it swapping or optimizing a database.

Ray
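Ray's back-of-the-envelope arithmetic generalizes to a simple model: each chunk read costs one seek plus the time to stream the chunk at the drive's sequential rate. A hypothetical Python sketch (all figures assumed, not measured; the 4 ms seek stands in for Ray's "lucky, consecutive tracks" case):

```python
# Effective throughput when every chunk read pays one seek.
def effective_bw(chunk_bytes: float, seek_s: float, stream_bw: float) -> float:
    """Bytes/second, given one seek of seek_s per chunk_bytes read."""
    return chunk_bytes / (seek_s + chunk_bytes / stream_bw)

STREAM = 25e6   # ~25 MB/s sequential, Ray's laptop figure
SEEK = 0.004    # 4 ms short seek -- an assumed best case

# 4 KiB per seek: roughly the "about a megabyte a second" Ray computes.
print(effective_bw(4096, SEEK, STREAM) / 1e6, "MB/s")

# Alan's 512KB paging-chunk guess recovers most of the streaming rate.
print(effective_bw(512 * 1024, SEEK, STREAM) / 1e6, "MB/s")
```

With these assumed numbers, 4 KiB chunks yield on the order of 1 MB/s while 512 KB chunks get back above 20 MB/s, which is the quantitative argument behind both Alan's chunk-size suggestion and the log-structured layout: either fewer seeks per byte, or fewer seeks outright.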
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 04:58 PM, Ray Lee wrote:

> On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
>> On 07/29/2007 03:12 PM, Alan Cox wrote:
>>> More radically if anyone wants to do real researchy type work - how
>>> about log structured swap with a cleaner ?
>>
>> Right over my head. Why does log-structure help anything?
>
> Log structured disk layouts allow for better placement of writeout, so
> that you can eliminate most or all seeks. Seeks are the enemy when
> trying to get full disk bandwidth. Google on "log structured disk
> layout", or somesuch, for details.

I understand what log structure is generally, but how does it help
swapin?

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 05:20 PM, Ray Lee wrote:
> > I understand what log structure is generally, but how does it help
> > swapin?
>
> Look at the swap out case first. Right now, when swapping out the kernel
> places whatever it can wherever it can inside the swap space. The closer
> you are to filling your swap space, the more likely that those swapped
> out blocks will be all over the place, rather than in one nice chunk.
> Contrast that with a log structured scheme, where the writeout happens to
> sequential spaces on the drive instead of scattered about.

This seems to be fixing the different problem of swap-space filling up. I'm quite willing to for now assume I've got plenty free.

> So, at some point when the system needs to fault those blocks back in, it
> now has a linear span of sectors to read instead of asking the drive to
> bounce over twenty tracks for a hundred blocks.

Moreover though -- what I know about log structure is that generally it optimises for write (swapout) and might make read (swapin) worse due to fragmentation that wouldn't happen with a regular fs structure. I guess that cleaner that Alan mentioned might be involved there -- I don't know how/what it would be doing.

> So, it eliminates the seeks. My laptop drive can read (huh, how odd, it
> got slower, need to retest in single user mode), hmm, let's go with about
> 25 MB/s. If we ask for a single block from each track, though, that'll
> drop to 4k * (1 second / seek time) which is about a megabyte a second if
> we're lucky enough to read from consecutive tracks. Even worse if it's
> not. Seeks are the enemy any time you need to hit the drive for anything,
> be it swapping or optimizing a database.

I am very aware of the costs of seeks (on current magnetic media).

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 05:20 PM, Ray Lee wrote:
>
> This seems to be fixing the different problem of swap-space filling up.
> I'm quite willing to for now assume I've got plenty free.

I was trying to point out that currently, as an example, memory that is linear in a process' space could be fragmented on disk when swapped out. That's today. Under a log-structured scheme, one could set it up such that something that was linear in RAM could be swapped out linearly on the drive, minimizing seeks on writeout, which will naturally minimize seeks on swap in of that same data.

> > So, at some point when the system needs to fault those blocks back in,
> > it now has a linear span of sectors to read instead of asking the drive
> > to bounce over twenty tracks for a hundred blocks.
>
> Moreover though -- what I know about log structure is that generally it
> optimises for write (swapout) and might make read (swapin) worse due to
> fragmentation that wouldn't happen with a regular fs structure.

It looks like I'm not doing a very good job of explaining this, I'm afraid. Suffice it to say that a log structured swap would give optimization options that we don't have today.

> I guess that cleaner that Alan mentioned might be involved there -- I
> don't know how/what it would be doing.

Then you should google on `log structured filesystem (primer OR introduction)` and read a few of the links that pop up. You might find it interesting.

> I am very aware of the costs of seeks (on current magnetic media).

Then perhaps you can just take it on faith -- log structured layouts are designed to help minimize seeks, read and write.

Ray
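The fragmentation point Ray is making can be shown with a toy model (this is not kernel code; the hole pattern and slot counts are invented): first-fit allocation into whatever holes exist versus appending to the head of a log, counting the seeks needed to read one process's pages back in.

```python
# Toy model: scattered first-fit swap-slot allocation vs. log-structured
# append. A 'seek' here is any jump to a non-adjacent disk slot.
def count_seeks(slots):
    """Count non-contiguous transitions in a read-back order of slots."""
    return sum(1 for a, b in zip(slots, slots[1:]) if b != a + 1)

# Fragmented swap: free slots are holes left behind by earlier activity
# (an assumed pattern, purely for illustration).
free_holes = [3, 7, 12, 13, 20, 28, 40, 41]
scattered = free_holes[:5]            # first-fit grabs the first 5 holes

# Log-structured: writeout appends sequentially at the head of the log.
log_head = 100
linear = list(range(log_head, log_head + 5))

print(count_seeks(scattered), count_seeks(linear))  # 3 0
```

Same five pages, but the scattered layout costs three extra seeks on swapin while the log layout costs none, which is the "linear span of sectors" argument from earlier in the thread.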
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 06:04 PM, Ray Lee wrote:
> > I am very aware of the costs of seeks (on current magnetic media).
>
> Then perhaps you can just take it on faith -- log structured layouts are
> designed to help minimize seeks, read and write.

I am particularly bad at faith. Let's take that stupid program that I posted: http://lkml.org/lkml/2007/7/25/85

You push it out before you hit enter, it's written out to swap, at whatever speed. How should it be laid out so that it's swapped in most efficiently after hitting enter? Reading bigger chunks would quite obviously help, but the layout?

The program is not a real-world issue and if you do not consider it a useful boundary condition either (okay I guess), how would log structured swap help if I just assume I have plenty of free swap to begin with?

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 06:04 PM, Ray Lee wrote:
> > > I am very aware of the costs of seeks (on current magnetic media).
> >
> > Then perhaps you can just take it on faith -- log structured layouts
> > are designed to help minimize seeks, read and write.
>
> I am particularly bad at faith. Let's take that stupid program that I
> posted:

You only think you are :-). I'm sure there are lots of things you have faith in. Gravity, for example :-).

> The program is not a real-world issue and if you do not consider it a
> useful boundary condition either (okay I guess), how would log structured
> swap help if I just assume I have plenty of free swap to begin with?

Is that generally the case on your systems? Every linux system I've run, regardless of RAM, has always pushed things out to swap. And once there's something already in swap, you now have a packing problem when you want to swap something else out.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:19 PM, Ray Lee wrote:
> > The program is not a real-world issue and if you do not consider it a
> > useful boundary condition either (okay I guess), how would log
> > structured swap help if I just assume I have plenty of free swap to
> > begin with?
>
> Is that generally the case on your systems? Every linux system I've run,
> regardless of RAM, has always pushed things out to swap.

For me, it is generally the case, yes. We are still discussing this in the context of desktop machines and their problems with being slow as things have been swapped out, and generally I expect a desktop to have plenty of swap which it's not regularly going to fill up significantly, since then the machine's unworkably slow as a desktop anyway.

> And once there's something already in swap, you now have a packing
> problem when you want to swap something else out.

Once we're crammed, it gets to be a different situation, yes. As far as I'm concerned that's for another thread though. I'm spending too much time on LKML as it is...

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> > Is that generally the case on your systems? Every linux system I've
> > run, regardless of RAM, has always pushed things out to swap.
>
> For me, it is generally the case yes. We are still discussing this in the
> context of desktop machines and their problems with being slow as things
> have been swapped out and generally I expect a desktop to have plenty of
> swap which it's not regularly going to fill up significantly since then
> the machine's unworkably slow as a desktop anyway.

A simple log optimises writeout (which is latency critical) and can otherwise stall an entire system. In a log you can also have multiple copies of the same page on disk easily, some stale - so you can write out chunks of data, not all of which have been removed from memory, just so you get them back more easily if you then do (and I guess you'd mark them accordingly).

The second element is a cleaner - something to go around removing stuff from the log that is no longer needed, when the disks are idle - and also to repack data in nice linear chunks. So instead of using the empty disk time for page-in you use it for packing data and optimising future paging.
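The cleaner described above can be sketched as a toy model (purely illustrative; the page ids and the list-of-tuples log representation are invented, and a real cleaner would of course operate on disk blocks, not Python lists): during idle time, copy the still-live pages to the head of the log in order and drop the stale copies, so a future swapin reads one linear run.

```python
# Toy sketch of a log cleaner: drop stale entries, repack live pages
# contiguously at the head of the log.
def clean(log, head):
    """log: list of (page_id, stale) in write order.
    Returns the repacked log as (slot, page_id) pairs starting at `head`."""
    live = [pid for pid, stale in log if not stale]
    # Repack live pages into consecutive slots starting at `head`.
    return [(head + i, pid) for i, pid in enumerate(live)]

# 'b' was written twice; the earlier copy is stale, as is 'd'.
log = [("a", False), ("b", True), ("c", False), ("b", False), ("d", True)]
print(clean(log, head=200))   # [(200, 'a'), (201, 'c'), (202, 'b')]
```

The point of the sketch: after cleaning, the live pages occupy one contiguous span, which is what makes the subsequent page-in cheap.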
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 07/29/2007 07:19 PM, Ray Lee wrote:
>
> For me, it is generally the case yes. We are still discussing this in the
> context of desktop machines and their problems with being slow as things
> have been swapped out and generally I expect a desktop to have plenty of
> swap which it's not regularly going to fill up significantly since then
> the machine's unworkably slow as a desktop anyway.

*Shrug* Well, that doesn't match my systems. My laptop has 400MB in swap:

[EMAIL PROTECTED]:~$ free
             total       used       free     shared    buffers     cached
Mem:        894208     883920      10288          0       3044     163224
-/+ buffers/cache:     717652     176556
Swap:      1116476     393132     723344

> > And once there's something already in swap, you now have a packing
> > problem when you want to swap something else out.
>
> Once we're crammed, it gets to be a different situation yes. As far as
> I'm concerned that's for another thread though. I'm spending too much
> time on LKML as it is...

No, it's not even when crammed. It's just when there are holes. mm/swapfile.c does try to cluster things, but doesn't work too hard at it as we don't want to spend all our time looking for a perfect fit that may not exist.

Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/29/2007 07:52 PM, Ray Lee wrote:
> *Shrug* Well, that doesn't match my systems. My laptop has 400MB in swap:

Which in your case is slightly more than 1/3 of available swap space. Quite a lot for a desktop indeed. And if it's more than a few percent fragmented, please fix current swapout instead of log structuring it.

Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote:
> a log structured scheme, where the writeout happens to sequential spaces
> on the drive instead of scattered about.

If the problem is reading stuff back in from swap quickly when needed, then this likely helps, by reducing the seeks needed.

If the problem is reading stuff back in from swap at the *same time* that the application is reading stuff from some user file system, and if that user file system is on the same drive as the swap partition (typical on laptops), then interleaving the user file system accesses with the swap partition accesses might overwhelm all other performance problems, due to the frequent long seeks between the two.

In that case, swap layout and swap i/o block size are secondary. However, pre-fetching, so that swap read back is not interleaved with application file accesses, could help dramatically.

===

Perhaps we could have a 'wake-up' command, analogous to the various sleep and hibernate commands. The 'wake-up' command could do whatever of the following it knew to do, in order to optimize for an anticipated change in usage patterns:
 1) pre-fetch swap
 2) clean (write out) dirty pages
 3) maximize free memory
 4) layout swap nicely
 5) pre-fetch a favorite set of apps

Stumble out of bed in the morning, press 'wake-up', start boiling the water for your coffee, and in another ten minutes, one is ready to rock and roll.

In case Andrew is so bored he read this far -- yes, this wake-up sounds like user space code, with minimal kernel changes to support any particular lower level operation that we can't do already.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
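The userspace half of Paul's 'wake-up' idea might begin with a decision step like the following sketch. Everything here is hypothetical: the function name, the threshold, and the decision rule are invented for illustration, and the input is /proc/meminfo-style text rather than a real kernel interface.

```python
# Hypothetical 'wake-up' decision step: from meminfo-style data, decide
# whether prefetching swap back in is worth doing. The 100 MB free-RAM
# threshold is an arbitrary assumed default, not a real tunable.
def should_prefetch(meminfo_text, min_free_kb=100_000):
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields[key.strip()] = int(rest.split()[0])   # value in kB
    swapped_kb = fields["SwapTotal"] - fields["SwapFree"]
    # Only worth it if something is in swap AND there is free RAM to land in.
    return swapped_kb > 0 and fields["MemFree"] >= min_free_kb

sample = "MemFree: 750400 kB\nSwapTotal: 4875716 kB\nSwapFree: 4715660 kB\n"
print(should_prefetch(sample))   # True
```

Steps 1-5 of the wake-up list would then hang off this gate; the point of keeping it in userspace, as the mail says, is that only minimal kernel support is needed for the individual operations.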
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> If the problem is reading stuff back in from swap at the *same time* that
> the application is reading stuff from some user file system, and if that
> user file system is on the same drive as the swap partition (typical on
> laptops), then interleaving the user file system accesses with the swap
> partition accesses might overwhelm all other performance problems, due to
> the frequent long seeks between the two.

Ah, so in a normal scenario where a working-set is getting faulted back in, we have the swap storage as well as the file-backed stuff that needs to be read as well. So even if swap is organized perfectly, we're still seeking. Damn.

On the other hand, that explains another thing that swap prefetch could be helping with -- if it preemptively faults the swap back in, then the file-backed stuff can be faulted back more quickly, just by virtue of not needing to seek back and forth to swap for its stuff. Hadn't thought of that.

That also implies that people running with swap files rather than swap partitions will see less of an issue. I should dig out my old compact flash card and try putting swap on that for a week.

> In that case, swap layout and swap i/o block size are secondary. However,
> pre-fetching, so that swap read back is not interleaved with application
> file accesses, could help dramatically.

*Nod*

> Perhaps we could have a 'wake-up' command, analogous to the various sleep
> and hibernate commands. [...] In case Andrew is so bored he read this far
> -- yes this wake-up sounds like user space code, with minimal kernel
> changes to support any particular lower level operation that we can't do
> already.

He'd suggested using, uhm, ptrace_peek or somesuch for just such a purpose. The second half of the issue is to know when and what to target.
Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Ray wrote:
> Ah, so in a normal scenario where a working-set is getting faulted back
> in, we have the swap storage as well as the file-backed stuff that needs
> to be read as well. So even if swap is organized perfectly, we're still
> seeking. Damn.

Perhaps this applies in some cases ... perhaps.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> Ray wrote:
> > Ah, so in a normal scenario where a working-set is getting faulted back
> > in, we have the swap storage as well as the file-backed stuff that
> > needs to be read as well. So even if swap is organized perfectly, we're
> > still seeking. Damn.
>
> Perhaps this applies in some cases ... perhaps.

Yeah, point taken: better data would make this a lot easier to figure out and target fixes.

Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sunday 29 July 2007 16:00:22 Ray Lee wrote:
> On 7/29/07, Paul Jackson <[EMAIL PROTECTED]> wrote:
> > If the problem is reading stuff back in from swap at the *same time*
> > that the application is reading stuff from some user file system, and
> > if that user file system is on the same drive as the swap partition
> > (typical on laptops), then interleaving the user file system accesses
> > with the swap partition accesses might overwhelm all other performance
> > problems, due to the frequent long seeks between the two.
>
> Ah, so in a normal scenario where a working-set is getting faulted back
> in, we have the swap storage as well as the file-backed stuff that needs
> to be read as well. So even if swap is organized perfectly, we're still
> seeking. Damn.

That is one reason why I try to have swap on a device dedicated just for it. It helps keep the system from having to seek all over the drive for data. (I remember that this was recommended years ago with Windows - back when you could tell Windows where to put the swap file)

> On the other hand, that explains another thing that swap prefetch could
> be helping with -- if it preemptively faults the swap back in, then the
> file-backed stuff can be faulted back more quickly, just by virtue of not
> needing to seek back and forth to swap for its stuff. Hadn't thought of
> that.

For it to really help, swap-prefetch would have to be more aggressive. At the moment (if I'm reading the code correctly) the system has to have close to zero activity for it to kick in. A tunable knob controlling how much activity is too much for the prefetch to kick in would help with finding a sane default. IMHO it should be the one that provides the most benefit with the least hit to performance.

> That also implies that people running with swap files rather than swap
> partitions will see less of an issue. I should dig out my old compact
> flash card and try putting swap on that for a week.

Maybe. It all depends on how much seeking is needed to track down the pages in the swapfile and such.
What would really help make the situation even better would be doing the log structured swap + cleaner. The log structured swap + cleaner should provide a performance boost by itself - add in the prefetch mechanism and the benefits are even more visible.

Another way to improve performance would require making the page replacement mechanism more intelligent. There are bounds to what can be done in the kernel without negatively impacting performance, but, if I've read the code correctly, there might be a better way to decide which pages to evict. One way to do this would be to implement some mechanism that allows the system to choose a single group of contiguous pages (or, say, a large soft-page) over swapping out a single page at a time. (some form of memory defrag would also be nice, but I can't think of a way to do that without massively breaking everything)

<snip>

> > In case Andrew is so bored he read this far -- yes this wake-up sounds
> > like user space code, with minimal kernel changes to support any
> > particular lower level operation that we can't do already.
>
> He'd suggested using, uhm, ptrace_peek or somesuch for just such a
> purpose. The second half of the issue is to know when and what to target.

The userspace suggestion that was thrown out earlier would have been as error-prone and problematic as FUSE. A solution like you suggest would be workable - it's small and does a task that is best done in userspace (IMHO). (IIRC, the original suggestion involved merging maps2 and another patchset into mainline and using that, combined with PEEKTEXT, to provide for a userspace swap daemon. Swap, IMHO, should never be handled outside the kernel)

What might be useful is a userspace daemon that tracks memory pressure and uses a concise API to trigger various levels of prefetch and/or swap aggressiveness.

DRH

--
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
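The tunable knob proposed earlier in this mail could map observed disk busyness to a prefetch aggressiveness level. The sketch below is entirely invented - the knob semantics, the 0-10 scale, and the function name are illustrative, not any real interface:

```python
# Hypothetical mapping from disk busyness to prefetch aggressiveness.
# knob = the maximum fraction of disk busyness at which prefetch may
# still run (so knob=0.10 means "only when the disk is >90% idle").
def aggressiveness(io_busy_fraction, knob=0.10):
    if io_busy_fraction >= knob:
        return 0                  # system busy: prefetch does nothing
    # Scale from 0 (at the knob threshold) up to 10 (fully idle disk).
    return round((1.0 - io_busy_fraction / knob) * 10)

print(aggressiveness(0.50), aggressiveness(0.05), aggressiveness(0.0))  # 0 5 10
```

The appeal of a single knob like this is exactly what the mail argues: the default can be tuned empirically for "most benefit, least hit to performance", and a 30-second e-mail-reading pause would register as near-idle rather than requiring strictly zero activity.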
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sun, 29 Jul 2007, Rene Herman wrote:
> > > On 07/29/2007 01:41 PM, [EMAIL PROTECTED] wrote:
> > > > I agree that tinkering with the core VM code should not be done
> > > > lightly, but this has been put through the proper process and is
> > > > stalled with no hints on how to move forward.
> > >
> > > It has not. Concerns that were raised (by specifically Nick Piggin)
> > > weren't being addressed.
> >
> > I may have missed them, but what I saw from him weren't specific
> > issues, but instead a nebulous 'something better may come along later'
> >
> > forget the nightly cron jobs for the moment. think of this scenario.
> > you have your memory fairly full with apps that you have open
> > (including firefox with many tabs), you receive a spreadsheet you need
> > to look at, so you fire up openoffice to look at it. then you exit
> > openoffice and try to go back to firefox (after a pause while you walk
> > to the printer to get the printout of the spreadsheet)
>
> And swinging a dead rat from its tail facing east-wards while reciting
> Documentation/CodingStyle. Okay, very very sorry, that was particularly
> childish, but that walking to the printer is of course completely
> constructed and this _is_ something to take into account.

yes it was contrived for simplicity. the same effect would happen if instead of going back to firefox the user instead went to their e-mail software and read some mail. doing so should still make the machine idle enough to let prefetch kick in.

> Swap-prefetch wants to be free, which (also again) it is doing a good job
> at it seems, but this also means that it waits for the VM to be _very_
> idle before it does anything and as such, we cannot just forget the
> nightly scenario and pretend it's about something else entirely. As long
> as the machine's being used, swap-prefetch doesn't kick in.

how long does the machine need to be idle? if someone spends 30 seconds reading an e-mail that's an incredibly long time for the system and I would think it should be enough to let the prefetch kick in.
> > > > 3: no serious consideration of possible alternatives
> > >
> > > Tweaking existing use-once logic is one I've heard but if we consider
> > > the i/dcache issue dead, I believe that one is as well. Going to
> > > userspace is another one. Largest theoretical potential. I myself am
> > > extremely sceptical about the Linux userland, and largely equate it
> > > with smallest _practical_ potential -- but that might just be me.
> > >
> > > A larger swap granularity, possibly even a self-training granularity.
> > > Up to now, seeks only get costlier and costlier with respect to reads
> > > with every generation of disk (flash would largely overcome it
> > > though) and doing more in one read/write _greatly_ improves
> > > throughput, maybe up to the point that swap-prefetch is no longer
> > > very useful. I myself don't know about the tradeoffs involved.
> >
> > larger swap granularity may help, but waiting for the user to need the
> > ram and have to wait for it to be read back in is always going to be
> > worse for the user than pre-populating the free memory (for the case
> > where the pre-population is right, for other cases it's the same). so I
> > see this as a red herring
>
> I saw Chris Snook make a good post here and am going to defer this part
> to that discussion: http://lkml.org/lkml/2007/7/27/421
>
> But no, it's not a red herring if _practically_ speaking the swapin is
> fast enough once started that people don't actually mind anymore since in
> that case you could simply do without yet more additional VM complexity
> (and kernel daemon).

swapin will always require disk access, and avoiding doing disk access while the user is waiting for it by doing it when the system isn't using the disk will always be a win (possibly not as large of a win, but still a win). on slow laptop drives where you may only get 20MB/second of reads under optimal situations it doesn't take much reading to be noticed by the user.

> > there are fully legitimate situations where this is useful, the
> > 'papering over' effect is not referring to these, it's referring to
> > other possible problems in the future.
> No, it's not just future. Just look at the various things under
> discussion now such as improved use-once and better swapin.

and these things do not conflict with prefetch, they complement it.

improved use-once will avoid pushing things out to swap in the first place. this will help during normal workloads so is valuable in any case.

better swapin (I assume you are talking about things like larger swap granularity) will also help during normal workloads when you are thrashing into swap.

prefetch will help when you have pushed things out to swap and now have free memory and a momentarily idle system.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007 21:33:59 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
>
> > What I think is killing us here is the blockdev pagecache: the
> > pagecache which backs those directory entries and inodes. These pages
> > get read multiple times because they hold multiple directory entries
> > and multiple inodes. These multiple touches will put those pages onto
> > the active list so they stick around for a long time and everything
> > else gets evicted.
> >
> > I've never been very sure about this policy for the metadata pagecache.
> > We read the filesystem objects into the dcache and icache and then we
> > won't read from that page again for a long time (I expect). But the
> > page will still hang around for a long time.
> >
> > It could be that we should leave those pages inactive.
>
> Good idea for updatedb.
>
> However, it may be a bad idea for files that are often written to.
> Turning an inode write into a read plus a write does not sound like such
> a hot idea, we really want to keep those in the cache.

Remember that this problem applies to both inode blocks and to directory blocks. Yes, it might be useful to hold onto an inode block for a future write (atime, mtime, usually), but not a directory block.

> I think what you need is to ignore multiple references to the same page
> when they all happen in one time interval, counting them only if they
> happen in multiple time intervals.

Yes, the sudden burst of accesses for adjacent inodes/dirents will be a common pattern, and it'd make heaps of sense to treat that as a single touch. It'd have to be done in the fs I guess, and it might be a bit hard to do.

And it turns out that embedding the touch_buffer() all the way down in __find_get_block() was convenient, but it's going to be tricky to change.

For now I'm fairly inclined to just nuke the touch_buffer() on the read side and maybe add one on the modification codepaths and see what happens. As always, testing is the problem.
> The use-once cleanup (which takes a page flag for PG_new, I know...)
> would solve that problem.
>
> However, it would introduce the problem of having to scan all the pages
> on the list before a page becomes freeable. We would have to add some
> background scanning (or a separate list for PG_new pages) to make the
> initial pageout run use an acceptable amount of CPU time.
>
> Not sure that complexity will be worth it...

I suspect that the situation we have now is so bad that pretty much anything we do will be an improvement.

I've always wondered "ytf is there so much blockdev pagecache?" This machine I'm typing at:

MemTotal:      3975080 kB
MemFree:        750400 kB
Buffers:        547736 kB
Cached:        1299532 kB
SwapCached:      12772 kB
Active:        1789864 kB
Inactive:       861420 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3975080 kB
LowFree:        750400 kB
SwapTotal:     4875716 kB
SwapFree:      4715660 kB
Dirty:              76 kB
Writeback:           0 kB
Mapped:         638036 kB
Slab:           522724 kB
CommitLimit:   6863256 kB
Committed_AS:  1115632 kB
PageTables:      14452 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     36432 kB
VmallocChunk: 34359696379 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

More than a quarter of my RAM in fs metadata! Most of it I'll bet is on the active list. And the fs on which I do most of the work is mounted noatime.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andrew Morton wrote:

> What I think is killing us here is the blockdev pagecache: the pagecache
> which backs those directory entries and inodes. These pages get read
> multiple times because they hold multiple directory entries and multiple
> inodes. These multiple touches will put those pages onto the active list
> so they stick around for a long time and everything else gets evicted.
>
> I've never been very sure about this policy for the metadata pagecache.
> We read the filesystem objects into the dcache and icache and then we
> won't read from that page again for a long time (I expect). But the page
> will still hang around for a long time.
>
> It could be that we should leave those pages inactive.

Good idea for updatedb.

However, it may be a bad idea for files that are often written to. Turning an inode write into a read plus a write does not sound like such a hot idea, we really want to keep those in the cache.

I think what you need is to ignore multiple references to the same page when they all happen in one time interval, counting them only if they happen in multiple time intervals.

The use-once cleanup (which takes a page flag for PG_new, I know...) would solve that problem.

However, it would introduce the problem of having to scan all the pages on the list before a page becomes freeable. We would have to add some background scanning (or a separate list for PG_new pages) to make the initial pageout run use an acceptable amount of CPU time.

Not sure that complexity will be worth it...

--
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
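Rik's once-per-interval reference counting can be illustrated with a toy model. This is purely illustrative - the `Page` class below is not the kernel's `struct page`, and a real implementation would work with jiffies-based intervals in the reclaim code:

```python
# Toy model of interval-deduplicated reference counting: a page counts
# as 'referenced' at most once per time interval, so a burst of
# dirent/inode touches from one updatedb pass looks like a single touch.
class Page:
    def __init__(self):
        self.refs = 0
        self.last_interval = None

    def touch(self, interval):
        if interval != self.last_interval:
            self.refs += 1
            self.last_interval = interval

p = Page()
for _ in range(100):         # updatedb hammers the page within one interval
    p.touch(interval=7)
p.touch(interval=8)          # a genuinely later access
print(p.refs)                # 2
```

A hundred touches in one interval count as one reference, so the metadata page no longer earns a place on the active list just by being hit repeatedly during a single scan of the filesystem.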
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 17:06:50 [EMAIL PROTECTED] wrote:
> On Sat, 28 Jul 2007, Daniel Hazelton wrote:
> > On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote:
> > > On Sat, 28 Jul 2007, Rene Herman wrote:
> > > > On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote:
> > > > > On Fri, 27 Jul 2007, Rene Herman wrote:
> > > > > > On 07/27/2007 07:45 PM, Daniel Hazelton wrote:
> > >
> > > nobody is arguing that swap prefetch helps in the second case.
> >
> > Actually, I made a mistake when tracking the thread and reading the
> > code for the patch and started to argue just that. But I have to admit
> > I made a mistake - the patch's author has stated (as Rene was kind
> > enough to point out) that swap prefetch can't help when memory is
> > filled.
>
> I stand corrected, thanks for speaking up and correcting your position.

If you had made the statement before I decided to speak up you would have been correct :) Anyway, I try to always admit when I've made a mistake - it's part of my philosophy. (There have been times when I haven't done it, but I'm trying to make that stop entirely)

> > > what people are arguing is that there are situations where it helps
> > > for the first case. on some machines and versions of updatedb the
> > > nightly run of updatedb can cause both sets of problems. but the
> > > nightly updatedb run is not the only thing that can cause problems
> >
> > Solving the cache filling memory case is difficult. There have been a
> > number of discussions about it. The simplest solution, IMHO, would be
> > to place a (configurable) hard limit on the maximum size any of the
> > kernel's caches can grow to. (The only solution that was discussed,
> > however, is a complex beast)
>
> limiting the size of the cache is also the wrong thing to do in many
> situations. it's only right if the cache pushes out other data you care
> about. if you are trying to do one thing as fast as you can you really
> do want the system to use all the memory it can for the cache.
After thinking about this, you are partially correct. There are those sorts of situations where you want the system to use all the memory it can for caches. OTOH, if those situations could be described in some sort of simple heuristic, then a soft limit that uses those heuristics to determine when to let the cache expand could exploit the benefits of having both a limited and an unlimited cache. (And, potentially, if the heuristic has allowed a cache to expand beyond the limit, then when the heuristic shows the oversize cache is no longer necessary it could trigger an automatic reclaim of that memory.) (I'm willing to help write and test code to do exactly this. There is no guarantee that I'll be able to help with more than testing - I don't understand the parts of the code involved all that well) DRH -- Dialup is like pissing through a pipette. Slow and excruciatingly painful. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Daniel Hazelton wrote: On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: nobody is arguing that swap prefetch helps in the second case. Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. I stand corrected, thanks for speaking up and correcting your position. what people are arguing is that there are situations where it helps for the first case. on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast) limiting the size of the cache is also the wrong thing to do in many situations. it's only right if the cache pushes out other data you care about, if you are trying to do one thing as fast as you can you really do want the system to use all the memory it can for the cache. David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Alan Cox wrote: It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby was effectively free anyway. as I understand it the swap-prefetch only kicks in if the device is idle Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin are much more important I suspect than swap prefetching. I'm sure this is true while you are doing the swapout or swapin and the system is waiting for it. but with prefetch you may be able to avoid doing the swapin at a time when the system is waiting for it by doing it at a time when the system is otherwise idle. David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. if that was the case then people should be responding to the request to get it merged with 'but it caused problems for me when I tried it' I haven't seen any comments like that. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. I see the conclusion as being exactly the opposite. here is a workload with some badly designed userspace software that the kernel can make much more pleasant for users. arguing that users should never use badly designed software in userspace doesn't seem like an argument that will gain much traction. I'm not saying the kernel needs to fix the software itself (ala the sched_yield issues), but the kernel should try and keep such software from hurting the rest of the system where it can. in this case it can't help while the bad software is running, but it could minimize the impact after it finishes. 
however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. so there is a legitimate situation where swap-prefetch will help significantly, what is the downside that prevents it from being included? (reading this thread it sometimes seems like the downside is that updatedb shouldn't cause this problem and so if you fixed updatedb there would be no legitimate benefit, or alternately this patch doesn't help updatedb so there's no legitimate benefit) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying "the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future" Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want back in the morning weren't reasonable guesses. 
The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out" possibly even with a "and if you do, please pull them back in when possible") or we can let swap-prefetch do the "just in case" thing it is doing. it's not that they shouldn't have been swapped out (they should have been), it's that the reason they were swapped out no longer exists. While swap-prefetch may not be the be-all end-all of solutions I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. I've seen it mentioned that there is still a maintainer but I missed who it is, but I haven't seen any concerns that can be addressed, they all seem to be 'this is a core concept, people need to think about it' or 'but someone may find
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 7/28/07, Alan Cox <[EMAIL PROTECTED]> wrote: > Actual physical disk ops are a precious resource and anything that mostly > reduces the number will be a win - not to say swap prefetch is the right > answer but accidentally or otherwise there are good reasons it may happen > to help. > > Bigger more linear chunks of writeout/readin are much more important I > suspect than swap prefetching. The larger the chunks are that we swap out, the less it actually hurts to swap, which might make all this a moot point. Not all I/O is created equal... Ray
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: > On Sat, 28 Jul 2007, Rene Herman wrote: > > On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: > >> On Fri, 27 Jul 2007, Rene Herman wrote: > >> > On 07/27/2007 07:45 PM, Daniel Hazelton wrote: > >> > > Questions about it: > >> > > Q) Does swap-prefetch help with this? > >> > > A) [From all reports I've seen (*)] > >> > > Yes, it does. > >> > > >> > No it does not. If updatedb filled memory to the point of causing > >> > swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and > >> > swap-prefetch hasn't any memory to prefetch into -- updatedb itself > >> > doesn't use any significant memory. > >> > >> however there are other programs which are known to take up significant > >> amounts of memory and will cause the issue being described (openoffice > >> for example) > >> > >> please don't get hung up on the text 'updatedb' and accept that there > >> are programs that do run intermittently and do use a significant amount > >> of ram and then free it. > > > > Different issue. One that's worth pursuing perhaps, but a different > > issue from the VFS caches issue that people have been trying to track > > down. > > people are trying to track down the problem of their machine being slow > until enough data is swapped back in to operate normally. > > in some situations swap prefetch can help because something that used > memory freed it so there is free memory that could be filled with data > (which is something that Linux does aggressively in most other situations) > > in some other situations swap prefetch cannot help because useless data is > getting cached at the expense of useful data. > > nobody is arguing that swap prefetch helps in the second case. Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. 
But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. > what people are arguing is that there are situations where it helps for > the first case. on some machines and versions of updatedb the nightly run of > updatedb can cause both sets of problems. but the nightly updatedb run is > not the only thing that can cause problems Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast) > > but let's talk about the concept here for a little bit > > the design is to use CPU and I/O capacity that's otherwise idle to fill > free memory with data from swap. > > pro: >more ram has potentially useful data in it > > con: >it takes a little extra effort to give this memory to another app (the > page must be removed from the list and zeroed at the time it's needed, I > assume that the data is left in swap so that it doesn't have to be written > out again) > >it adds some complexity to the kernel (~500 lines IIRC from this thread) > >by undoing recent swapouts it can potentially mask problems with swapout > > it looks to me like unless the code was really bad (and after 23 months in > -mm it doesn't sound like it is) that the only significant con left is the > potential to mask other problems. I'll second this. But with the swap system itself having seen as heavy testing as it has I don't know if it would be masking other problems. That is why I've been asking "What is so wrong with it?" - while it definitely doesn't help with programs that cause caches to balloon (that problem does need another solution) it does help to speed things up when a memory hog has exited. 
(And since it's a pretty safe assumption that swap is going to be noticeably slower than RAM this patch seems to me to be a rather visible and obvious solution to that problem) > however there are many legitimate cases where it is definitely doing the > right thing (swapout was correct in pushing out the pages, but now the > cause of that pressure is gone). the amount of benefit from this will vary > from situation to situation, but it's not reasonable to claim that this > provides no benefit (you have benchmark numbers that show it in synthetic > benchmarks, and you have user reports that show it in the real world) Exactly. Though I have seen posts which (to me at least) appear to claim exactly that. It was part of the reason why I got a bit incensed. (The other was that it looked like the kernel devs with the ultra-powerful machines were claiming 'I don't see the problem on my machine, so it doesn't exist'. That sort of attitude is fine, in some cases, but not, IMHO, where performance is concerned) > there are lots of things in the kernel whose job is to pre-fill the memory > with data that may (or may not) be useful in
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 03:48:13 Mike Galbraith wrote: > On Fri, 2007-07-27 at 18:51 -0400, Daniel Hazelton wrote: > > Now, once more, I'm going to ask: What is so terribly wrong with swap > > prefetch? Why does it seem that everyone against it says "It's treating a > > symptom, so it can't go in"? > > And once again, I personally have nothing against swap-prefetch, or > something like it. I can see how it or something like it could be made > to improve the lives of people who get up in the morning to find their > apps sitting on disk due to memory pressure generated by over-night > system maintenance operations. > > The author himself however, says his implementation can't help with > updatedb (though people seem to be saying that it does), or anything > else that leaves memory full. That IMHO, makes it of questionable value > toward solving what people are saying they want swap-prefetch for in the > first place. Okay. I have to agree with the author that, in such a situation, it wouldn't help. However there are, without a doubt, other situations where it would help immensely. (memory hogs forcing everything to disk and quitting, one-off tasks that don't balloon the cache (kernel compiles, et al) - in those situations swap prefetch would really shine.) DRH -- Dialup is like pissing through a pipette. Slow and excruciatingly painful.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
> It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby was effectively free anyway. Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin are much more important I suspect than swap prefetching. > good overview of exactly how broken -mm can be at times. How many -mm users > use it anyway? He himself said he's not convinced of usefulness having not I've been using it for months with no noticed problem. I turn it on because it might as well get tested. I've not done comparison tests so I can't comment on if it's worth it. Lots of -mm testers turn *everything* on because it's a test kernel.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: in some situations swap prefetch can help because something that used memory freed it so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. Oh yes they are. Daniel for example did twice, telling me to turn my brain on in between (if you read it, you may have noticed I got a little annoyed at that point). but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) It is. Prefetched pages can be dropped on the floor without additional I/O. it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. 
As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying "the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future" Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want back in the morning weren't reasonable guesses. The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out" possibly even with a "and if you do, please pull them back in when possible") or we can let swap-prefetch do the "just in case" thing it is doing. 
While swap-prefetch may not be the be-all end-all of solutions I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. Rene.
Re: -mm merge plans for 2.6.23
Daniel Cheng wrote: > but merging maps2 has higher risk which should be done in a development > branch (er... 2.7, but we don't have it now). This is off-topic and has been discussed to death, but: Rather than one stable branch and one development branch, we have a few stable branches and a lot of development branches. Some are located at git.kernel.org. Among other things, this gives you a predictable release rhythm and very timely updated stable branches. -- Stefan Richter -=-=-=== -=== ===-- http://arcgraph.de/sr/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: > On 07/27/2007 07:45 PM, Daniel Hazelton wrote: > > > Questions about it: > > Q) Does swap-prefetch help with this? > > A) [From all reports I've seen (*)] > > Yes, it does. > > No it does not. If updatedb filled memory to the point of causing > swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and > swap-prefetch hasn't any memory to prefetch into -- updatedb itself > doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. people are trying to track down the problem of their machine being slow until enough data is swapped back in to operate normally. in some situations swap prefetch can help because something that used memory freed it so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. what people are arguing is that there are situations where it helps for the first case. on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. 
pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) that the only significant con left is the potential to mask other problems. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying "the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future" David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 09:35 AM, Rene Herman wrote: By the way -- I'm unable to make my slocate grow substantially here but I'll try what GNU locate does. If it's really as bad as I hear then regardless of anything else it should really be either fixed or dumped... Yes. GNU locate is broken and nobody should be using it. The updatedb from (my distribution standard) "slocate" uses around 2M allocated total during an entire run while GNU locate allocates some 30M to the sort process alone. GNU locate is also close to 4 times as slow (although that of course only matters on cached runs anyway). So, GNU locate is just a pig pushing things out, with or without any added VFS cache pressure from the things it does by design. As such, we can trust people complaining about it but should first tell them to switch to a halfway sane locate implementation. If you run memory hogs on small memory boxes, you're going to suffer. Leaves the fact that swap-prefetch sometimes helps alleviate these and other kinds of memory-hog situations and as such, might not (again) be a bad idea in itself. Rene.
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Fri, 2007-07-27 at 18:51 -0400, Daniel Hazelton wrote: > Now, once more, I'm going to ask: What is so terribly wrong with swap > prefetch? Why does it seem that everyone against it says "Its treating a > symptom, so it can't go in"? And once again, I personally have nothing against swap-prefetch, or something like it. I can see how it or something like it could be made to improve the lives of people who get up in the morning to find their apps sitting on disk due to memory pressure generated by over-night system maintenance operations. The author himself however, says his implementation can't help with updatedb (though people seem to be saying that it does), or anything else that leaves memory full. That IMHO, makes it of questionable value toward solving what people are saying they want swap-prefetch for in the first place. I personally don't care if swap-prefetch goes in or not. -Mike
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 01:15 AM, Björn Steinbrink wrote: On 2007.07.27 20:16:32 +0200, Rene Herman wrote: Here's swap-prefetch's author saying the same: http://lkml.org/lkml/2007/2/9/112 | It can't help the updatedb scenario. Updatedb leaves the ram full and | swap prefetch wants to cost as little as possible so it will never | move anything out of ram in preference for the pages it wants to swap | back in. Now please finally either understand this, or tell us how we're wrong. Con might have been wrong there for boxes with really little memory. Note -- with "the updatedb scenario" both he in the above and I are talking about the "VFS caches filling memory cause the problem" not updatedb in particular. My desktop box has not even 300k inodes in use (IIRC someone posted a df -i output showing 1 million inodes in use). Still, the memory footprint of the "sort" process grows up to about 50MB. Assuming that the average filename length stays the same, that would mean 150MB for the 1 million inode case, just for the "sort" process. Even if it's not 150MB, 50MB is already a lot on a 128 or even a 256MB box. So, yes, we're now at the expected scenario of some hog pushing out things and freeing it upon exit again and it's something swap-prefetch definitely has potential to help with. Said early in the thread, it's hard to imagine how it would not help in any such situation, so the discussion may as far as I'm concerned at that point concentrate on whether swap-prefetch hurts anything in others. Some people I believe are not convinced it helps very significantly due to at that point _everything_ having been thrown out, but a copy of openoffice with a large spreadsheet open should come back to life much quicker it would seem. Any faults in that reasoning? No. If the machine goes idle after some memory hog _itself_ pushes things out and then exits, swap-prefetch helps, at the very, very least potentially. 
By the way -- I'm unable to make my slocate grow substantial here but I'll try what GNU locate does. If it's really as bad as I hear then regardless of anything else it should really be either fixed or dumped... Rene. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: Questions about it: Q) Does swap-prefetch help with this? A) [From all reports I've seen (*)] Yes, it does. No it does not. If updatedb filled memory to the point of causing swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and swap-prefetch hasn't any memory to prefetch into -- updatedb itself doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. Rene.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 09:35 AM, Rene Herman wrote: By the way -- I'm unable to make my slocate grow substantially here but I'll try what GNU locate does. If it's really as bad as I hear then regardless of anything else it should really be either fixed or dumped... Yes. GNU locate is broken and nobody should be using it. The updatedb from (my distribution standard) slocate uses around 2M allocated total during an entire run while GNU locate allocates some 30M to the sort process alone. GNU locate is also close to 4 times as slow (although that of course only matters on cached runs anyway). So, GNU locate is just a pig pushing things out, with or without any added VFS cache pressure from the things it does by design. As such, we can trust people complaining about it but should first tell them to switch to a halfway sane locate implementation. If you run memory hogs on small memory boxes, you're going to suffer. Leaves the fact that swap-prefetch sometimes helps alleviate these and other kinds of memory-hog situations and as such, might not (again) be a bad idea in itself. Rene.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: Questions about it: Q) Does swap-prefetch help with this? A) [From all reports I've seen (*)] Yes, it does. No it does not. If updatedb filled memory to the point of causing swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and swap-prefetch hasn't any memory to prefetch into -- updatedb itself doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. people are trying to track down the problem of their machine being slow until enough data is swapped back in to operate normally. in some situations swap prefetch can help because something that used memory freed it, so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. what people are arguing is that there are situations where it helps for the first case. on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. 
pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) that the only significant con left is the potential to mask other problems. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future David Lang
Re: -mm merge plans for 2.6.23
Daniel Cheng wrote: but merging maps2 has higher risk, which should be done in a development branch (er... 2.7, but we don't have it now). This is off-topic and has been discussed to death, but: Rather than one stable branch and one development branch, we have a few stable branches and a lot of development branches. Some are located at git.kernel.org. Among other things, this gives you a predictable release rhythm and very timely updated stable branches. -- Stefan Richter -=-=-=== -=== ===-- http://arcgraph.de/sr/
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: in some situations swap prefetch can help because something that used memory freed it, so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. Oh yes they are. Daniel for example did twice, telling me to turn my brain on in between (if you read it, you may have noticed I got a little annoyed at that point). but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) It is. Prefetched pages can be dropped on the floor without additional I/O. it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. 
As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort, no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want in in the morning weren't reasonable guesses. The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out, and if you do, please pull them back in when possible") or we can let swap-prefetch do the just in case thing it is doing. 
While swap-prefetch may not be the be-all end-all of solutions, I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's it for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. Rene.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby it was effectively free anyway. Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin is much more important I suspect than swap prefetching. good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not I've been using it for months with no noticed problem. I turn it on because it might as well get tested. I've not done comparison tests so I can't comment on if it's worth it. Lots of -mm testers turn *everything* on because it's a test kernel.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 03:48:13 Mike Galbraith wrote: On Fri, 2007-07-27 at 18:51 -0400, Daniel Hazelton wrote: Now, once more, I'm going to ask: What is so terribly wrong with swap prefetch? Why does it seem that everyone against it says "It's treating a symptom, so it can't go in"? And once again, I personally have nothing against swap-prefetch, or something like it. I can see how it or something like it could be made to improve the lives of people who get up in the morning to find their apps sitting on disk due to memory pressure generated by overnight system maintenance operations. The author himself however, says his implementation can't help with updatedb (though people seem to be saying that it does), or anything else that leaves memory full. That, IMHO, makes it of questionable value toward solving what people are saying they want swap-prefetch for in the first place. Okay. I have to agree with the author that, in such a situation, it wouldn't help. However there are, without a doubt, other situations where it would help immensely. (memory hogs forcing everything to disk and quitting, one-off tasks that don't balloon the cache (kernel compiles, et al) - in those situations swap prefetch would really shine.) DRH -- Dialup is like pissing through a pipette. Slow and excruciatingly painful.
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote: Questions about it: Q) Does swap-prefetch help with this? A) [From all reports I've seen (*)] Yes, it does. No it does not. If updatedb filled memory to the point of causing swapping (which no one is reproducing anyway) it HAS FILLED MEMORY and swap-prefetch hasn't any memory to prefetch into -- updatedb itself doesn't use any significant memory. however there are other programs which are known to take up significant amounts of memory and will cause the issue being described (openoffice for example) please don't get hung up on the text 'updatedb' and accept that there are programs that do run intermittently and do use a significant amount of ram and then free it. Different issue. One that's worth pursuing perhaps, but a different issue from the VFS caches issue that people have been trying to track down. people are trying to track down the problem of their machine being slow until enough data is swapped back in to operate normally. in some situations swap prefetch can help because something that used memory freed it, so there is free memory that could be filled with data (which is something that Linux does aggressively in most other situations) in some other situations swap prefetch cannot help because useless data is getting cached at the expense of useful data. nobody is arguing that swap prefetch helps in the second case. Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. what people are arguing is that there are situations where it helps for the first case. 
on some machines and versions of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast) but let's talk about the concept here for a little bit the design is to use CPU and I/O capacity that's otherwise idle to fill free memory with data from swap. pro: more ram has potentially useful data in it con: it takes a little extra effort to give this memory to another app (the page must be removed from the list and zeroed at the time it's needed, I assume that the data is left in swap so that it doesn't have to be written out again) it adds some complexity to the kernel (~500 lines IIRC from this thread) by undoing recent swapouts it can potentially mask problems with swapout it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) that the only significant con left is the potential to mask other problems. I'll second this. But with the swap system itself having seen as heavy testing as it has I don't know if it would be masking other problems. That is why I've been asking "What is so wrong with it?" - while it definitely doesn't help with programs that cause caches to balloon (that problem does need another solution) it does help to speed things up when a memory hog has exited. (And since it's a pretty safe assumption that swap is going to be noticeably slower than RAM this patch seems to me to be a rather visible and obvious solution to that problem) however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). 
the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) Exactly. Though I have seen posts which (to me at least) appear to claim exactly that. It was part of the reason why I got a bit incensed. (The other was that it looked like the kernel devs with the ultra-powerful machines were claiming 'I don't see the problem on my machine, so it doesn't exist'. That sort of attitude is fine, in some cases, but not, IMHO, where performance is concerned) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On 7/28/07, Alan Cox [EMAIL PROTECTED] wrote: Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin is much more important I suspect than swap prefetching. nod. The larger the chunks are that we swap out, the less it actually hurts to swap, which might make all this a moot point. Not all I/O is created equal... Ray
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Rene Herman wrote: On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote: it looks to me like unless the code was really bad (and after 23 months in -mm it doesn't sound like it is) Not to sound pretentious or anything but I assume that Andrew has a fairly good overview of exactly how broken -mm can be at times. How many -mm users use it anyway? He himself said he's not convinced of usefulness having not seen it help for him (and notice that most developers are also users), turned it off due to it annoying him at some point and hasn't seen a serious investigation into potential downsides. if that was the case then people should be responding to the request to get it merged with 'but it caused problems for me when I tried it' I haven't seen any comments like that. that the only significant con left is the potential to mask other problems. Which is not a made-up issue, mind you. As an example, I just now tried GNU locate and saw it's a complete pig and specifically unsuitable for the low memory boxes under discussion. Upon completion, it actually frees enough memory that swap-prefetch _could_ help on some boxes, while the real issue is that they should first and foremost dump GNU locate. I see the conclusion as being exactly the opposite. here is a workload with some badly designed userspace software that the kernel can make much more pleasant for users. arguing that users should never use badly designed software in userspace doesn't seem like an argument that will gain much traction. I'm not saying the kernel needs to fix the software itself (a la the sched_yield issues), but the kernel should try and keep such software from hurting the rest of the system where it can. in this case it can't help it while the bad software is running, but it could minimize the impact after it finishes. 
however there are many legitimate cases where it is definitely doing the right thing (swapout was correct in pushing out the pages, but now the cause of that pressure is gone). the amount of benefit from this will vary from situation to situation, but it's not reasonable to claim that this provides no benefit (you have benchmark numbers that show it in synthetic benchmarks, and you have user reports that show it in the real world) I certainly would not want to argue anything of the sort, no. As said a few times, I agree that swap-prefetch makes sense and has at least the potential to help some situations that you really wouldn't even want to try and fix any other way, simply because nothing's broken. so there is a legitimate situation where swap-prefetch will help significantly, what is the downside that prevents it from being included? (reading this thread it sometimes seems like the downside is that updatedb shouldn't cause this problem and so if you fixed updatedb there would be no legitimate benefit, or alternately this patch doesn't help updatedb so there's no legitimate benefit) there are lots of things in the kernel whose job is to pre-fill the memory with data that may (or may not) be useful in the future. this is just another method of filling the cache. it does so by saying the user wanted these pages in the recent past, so it's a reasonable guess to say that the user will want them again in the future Well, _that_ is what the kernel is already going to great lengths at doing, and it decided that those pages us poor overnight OO.o users want in in the morning weren't reasonable guesses. The kernel also won't any time soon be reading our minds, so any solution would need either user intervention (we could devise a way to tell the kernel "hey ho, I consider these pages to be very important -- try not to swap them out, and if you do, please pull them back in when possible") or we can let swap-prefetch do the just in case thing it is doing. 
it's not that they shouldn't have been swapped out (they should have been), it's that the reason they were swapped out no longer exists. While swap-prefetch may not be the be-all end-all of solutions, I agree that having a machine sit around with free memory and applications in swap seems not too useful if (as is the case) fetched pages can be dropped immediately when it turns out swap-prefetch made the wrong decision. So that's it for the concept. As to implementation, if I try and look at the code, it seems to be trying hard to really be free and as such, potential downsides seem limited. It's a rather core concept though and as such needs someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's maintaining/submitting the thing now that Con's not? He or she should preferably address any concerns it seems. I've seen it mentioned that there is still a maintainer but I missed who it is, but I haven't seen any concerns that can be addressed, they all seem to be 'this is a core concept, people need to think about it' or 'but someone may find a better
Re: RFT: updatedb morning after problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Alan Cox wrote: It is. Prefetched pages can be dropped on the floor without additional I/O. Which is essentially free for most cases. In addition your disk access may well have been in idle time (and should be for this sort of stuff) and if it was in the same chunk as something nearby it was effectively free anyway. as I understand it, swap-prefetch only kicks in if the device is idle Actual physical disk ops are a precious resource and anything that mostly reduces the number will be a win - not to say swap prefetch is the right answer but accidentally or otherwise there are good reasons it may happen to help. Bigger more linear chunks of writeout/readin is much more important I suspect than swap prefetching. I'm sure this is true while you are doing the swapout or swapin and the system is waiting for it. but with prefetch you may be able to avoid doing the swapin at a time when the system is waiting for it by doing it at a time when the system is otherwise idle. David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007, Daniel Hazelton wrote: On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote:

nobody is arguing that swap prefetch helps in the second case.

Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. I stand corrected,

thanks for speaking up and correcting your position.

what people are arguing is that there are situations where it helps for the first case. on some machines and version of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems

Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast)

limiting the size of the cache is also the wrong thing to do in many situations. it's only right if the cache pushes out other data you care about, if you are trying to do one thing as fast as you can you really do want the system to use all the memory it can for the cache.

David Lang
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Saturday 28 July 2007 17:06:50 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Daniel Hazelton wrote: On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote: On Sat, 28 Jul 2007, Rene Herman wrote: On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote: On Fri, 27 Jul 2007, Rene Herman wrote: On 07/27/2007 07:45 PM, Daniel Hazelton wrote:

nobody is arguing that swap prefetch helps in the second case.

Actually, I made a mistake when tracking the thread and reading the code for the patch and started to argue just that. But I have to admit I made a mistake - the patch's author has stated (as Rene was kind enough to point out) that swap prefetch can't help when memory is filled. I stand corrected,

thanks for speaking up and correcting your position.

If you had made the statement before I decided to speak up you would have been correct :) Anyway, I try to always admit when I've made a mistake - it's part of my philosophy. (There have been times when I haven't done it, but I'm trying to make that stop entirely)

what people are arguing is that there are situations where it helps for the first case. on some machines and version of updatedb the nightly run of updatedb can cause both sets of problems. but the nightly updatedb run is not the only thing that can cause problems

Solving the cache filling memory case is difficult. There have been a number of discussions about it. The simplest solution, IMHO, would be to place a (configurable) hard limit on the maximum size any of the kernel's caches can grow to. (The only solution that was discussed, however, is a complex beast)

limiting the size of the cache is also the wrong thing to do in many situations. it's only right if the cache pushes out other data you care about, if you are trying to do one thing as fast as you can you really do want the system to use all the memory it can for the cache.

After thinking about this you are partially correct.
There are those sorts of situations where you want the system to use all the memory it can for caches. OTOH, if those situations could be described in some sort of simple heuristic, then a soft-limit that uses those heuristics to determine when to let the cache expand could exploit the benefits of having both a limited and an unlimited cache. (And, potentially, if the heuristic has allowed a cache to expand beyond the limit, then, when the heuristic shows the oversize cache is no longer necessary, it could trigger an automatic reclaim of that memory.)

(I'm willing to help write and test code to do this. There is no guarantee that I'll be able to help with more than testing - I don't understand the parts of the code involved all that well)

DRH

--
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
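The soft-limit idea above could look roughly like this as a toy model: the cache may grow past its configured limit while a "one big streaming job" heuristic holds, and the excess is reclaimed once the heuristic stops firing. Everything here — the names and the heuristic flag itself — is hypothetical, not existing kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical soft-limited cache: growth beyond the limit is allowed
 * only while a streaming heuristic fires; afterwards the excess is
 * automatically handed back. */

struct toy_cache {
    long size;        /* current size, in pages */
    long soft_limit;  /* configured soft limit, in pages */
    bool streaming;   /* heuristic: one job wants all the memory */
};

/* May the cache grow by one more page right now? */
static bool cache_may_grow(const struct toy_cache *c)
{
    return c->size < c->soft_limit || c->streaming;
}

/* Called when the heuristic stops firing: shrink back to the limit,
 * returning the number of pages reclaimed. */
static long cache_reclaim_excess(struct toy_cache *c)
{
    long excess = c->size - c->soft_limit;
    if (c->streaming || excess <= 0)
        return 0;
    c->size = c->soft_limit;
    return excess;
}
```

The hard part, as the thread notes, is the heuristic itself, not this bookkeeping.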
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
Andrew Morton wrote:

> What I think is killing us here is the blockdev pagecache: the
> pagecache which backs those directory entries and inodes. These pages
> get read multiple times because they hold multiple directory entries
> and multiple inodes. These multiple touches will put those pages onto
> the active list so they stick around for a long time and everything
> else gets evicted.
>
> I've never been very sure about this policy for the metadata pagecache.
> We read the filesystem objects into the dcache and icache and then we
> won't read from that page again for a long time (I expect). But the
> page will still hang around for a long time.
>
> It could be that we should leave those pages inactive.

Good idea for updatedb. However, it may be a bad idea for files that are often written to. Turning an inode write into a read plus a write does not sound like such a hot idea, we really want to keep those in the cache.

I think what you need is to ignore multiple references to the same page when they all happen in one time interval, counting them only if they happen in multiple time intervals.

The use-once cleanup (which takes a page flag for PG_new, I know...) would solve that problem. However, it would introduce the problem of having to scan all the pages on the list before a page becomes freeable. We would have to add some background scanning (or a separate list for PG_new pages) to make the initial pageout run use an acceptable amount of CPU time. Not sure that complexity will be worth it...

--
Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic.
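Rik's "count touches only once per time interval" suggestion can be sketched as a toy model (illustrative names only; the real thing would live in the page-reclaim code): a burst of touches within one interval counts as a single reference, and a page is promoted to the active list only after being referenced in two different intervals.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy interval-based reference counting: repeated touches inside the
 * same interval collapse into one reference; activation requires
 * references from at least two distinct intervals. */

struct toy_page {
    long last_ref_interval;  /* interval of the most recent touch */
    int  interval_refs;      /* number of distinct intervals touched */
};

static void page_touch(struct toy_page *p, long now_interval)
{
    if (p->interval_refs == 0 || p->last_ref_interval != now_interval) {
        p->interval_refs++;
        p->last_ref_interval = now_interval;
    }
    /* a touch within the same interval as the last one is ignored */
}

static bool page_should_activate(const struct toy_page *p)
{
    return p->interval_refs >= 2;
}
```

Under this scheme the updatedb-style burst of dirent/inode reads — many touches, one interval — never activates the metadata page, while genuinely reused pages still get promoted.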
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On Sat, 28 Jul 2007 21:33:59 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
>> What I think is killing us here is the blockdev pagecache: the
>> pagecache which backs those directory entries and inodes. These pages
>> get read multiple times because they hold multiple directory entries
>> and multiple inodes. These multiple touches will put those pages onto
>> the active list so they stick around for a long time and everything
>> else gets evicted.
>>
>> I've never been very sure about this policy for the metadata
>> pagecache. We read the filesystem objects into the dcache and icache
>> and then we won't read from that page again for a long time (I
>> expect). But the page will still hang around for a long time.
>>
>> It could be that we should leave those pages inactive.
>
> Good idea for updatedb. However, it may be a bad idea for files that
> are often written to. Turning an inode write into a read plus a write
> does not sound like such a hot idea, we really want to keep those in
> the cache.

Remember that this problem applies to both inode blocks and to directory blocks. Yes, it might be useful to hold onto an inode block for a future write (atime, mtime, usually), but not a directory block.

> I think what you need is to ignore multiple references to the same
> page when they all happen in one time interval, counting them only if
> they happen in multiple time intervals.

Yes, the sudden burst of accesses for adjacent inode/dirents will be a common pattern, and it'd make heaps of sense to treat that as a single touch. It'd have to be done in the fs I guess, and it might be a bit hard to do. And it turns out that embedding the touch_buffer() all the way down in __find_get_block() was convenient, but it's going to be tricky to change. For now I'm fairly inclined to just nuke the touch_buffer() on the read side and maybe add one on the modification codepaths and see what happens. As always, testing is the problem.

> The use-once cleanup (which takes a page flag for PG_new, I know...)
> would solve that problem. However, it would introduce the problem of
> having to scan all the pages on the list before a page becomes
> freeable. We would have to add some background scanning (or a separate
> list for PG_new pages) to make the initial pageout run use an
> acceptable amount of CPU time. Not sure that complexity will be worth
> it...

I suspect that the situation we have now is so bad that pretty much anything we do will be an improvement. I've always wondered ytf is there so much blockdev pagecache? This machine I'm typing at:

MemTotal:      3975080 kB
MemFree:        750400 kB
Buffers:        547736 kB
Cached:        1299532 kB
SwapCached:      12772 kB
Active:        1789864 kB
Inactive:       861420 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3975080 kB
LowFree:        750400 kB
SwapTotal:     4875716 kB
SwapFree:      4715660 kB
Dirty:              76 kB
Writeback:           0 kB
Mapped:         638036 kB
Slab:           522724 kB
CommitLimit:   6863256 kB
Committed_AS:  1115632 kB
PageTables:      14452 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     36432 kB
VmallocChunk: 34359696379 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

More than a quarter of my RAM in fs metadata! Most of it I'll bet is on the active list. And the fs on which I do most of the work is mounted noatime..
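The "more than a quarter" figure checks out against the dump: Buffers (the blockdev pagecache) plus Slab (where the dcache and icache live) against MemTotal. A quick arithmetic check, using the numbers from the /proc/meminfo output above:

```c
#include <assert.h>

/* Arithmetic behind "more than a quarter of my RAM in fs metadata":
 * Buffers + Slab as a fraction of MemTotal, expressed in permille to
 * stay in integer arithmetic. Inputs are kB, straight from the dump. */

static int metadata_permille(long buffers_kb, long slab_kb, long total_kb)
{
    return (int)(((buffers_kb + slab_kb) * 1000) / total_kb);
}
```

With Buffers = 547736 kB, Slab = 522724 kB, and MemTotal = 3975080 kB this gives 269 permille, i.e. just under 27% of RAM. (Slab also holds non-filesystem allocations, so this slightly overstates the metadata share; the dcache/icache are typically the dominant slab users after a tree walk.)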
Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]
On 07/27/2007 10:28 PM, Daniel Hazelton wrote:

> Check the attitude at the door then re-read what I actually said:

Attitude? You wanted attitude dear boy?

> Updatedb or another process that uses the FS heavily runs on a user's
> 256MB P3-800 (when it is idle) and the VFS caches grow, causing memory
> pressure that causes other applications to be swapped to disk. In the
> morning the user has to wait for the system to swap those applications
> back in. I never said that it was the *program* itself - or *any*
> specific program (I used "Updatedb" because it has been the big name
> in the discussion) - doing the filling of memory. I actually said that
> the problem is that the kernel's caches - VFS and others - will grow
> *WITHOUT* *LIMIT*, filling all available memory.

WHICH SWAP-PREFETCH DOES NOT HELP WITH.
WHICH SWAP-PREFETCH DOES NOT HELP WITH.
WHICH SWAP-PREFETCH DOES NOT HELP WITH.

And now finally get that through your thick skull or shut up, right fucking now.

> You want to know what causes the problem? The current design of the
> caches. They will extend without much limit, to the point of actually
> pushing pages to disk so they can grow even more.

Due to being a generally nice guy, I am going to try _once_ more to try and make you understand. Not twice, once. So pay attention. Right now.

Those caches are NOT causing any problem under discussion. If any caches grow to the point of causing swap-out, they have filled memory and swap-prefetch cannot and will not do anything since it needs free (as in not occupied by caches) memory. As such, people maintaining that swap-prefetch helps their situation are not being hit by caches. The only way swap-prefetch can (and will) do anything is when something that by itself takes up lots of memory runs and exits.

So can we now please finally drop the fucking red herring and start talking about swap-prefetch?
If we accept that some of the people maintaining that swap-prefetch helps them are not in fact deluded -- a bit of a stretch seeing as how not a single one of them is substantiating anything -- we have a number of slightly different possibilities for "something" in the above.

1) It could be an inefficient updatedb. Although he isn't experiencing the problem, Bjoern Steinbrink is posting numbers that show that at least the GNU version spawns a large-memory "sort" process, meaning that on a low-memory box updatedb itself can be what causes the observed problem. While in this situation switching to a different updatedb (slocate, mlocate) obviously makes sense, it's the kind of situation where swap-prefetch will help.

2) It could be something else entirely such as a backup run. I suppose people would know if they were running anything of the sort though and wouldn't blame anything on updatedb. Other than that, it's again the situation where swap-prefetch would help.

3) The something else entirely can also run _after_ updatedb, kicking out the VFS caches and leaving free memory upon exit. I still suppose the same thing as under (2) but this is the only way updatedb / VFS caches can even be part of any problem, if the _combined_ memory pressure is just enough to make the difference. The direct problem is still just the "something else entirely" and needs someone affected to tell us what it is.

> I already did. You completely ignored it because I happened to use the
> magic words "updatedb" and "swap prefetch".

No I did not. This thread is about swap-prefetch and you used the magic words VFS caches. I don't give a fryin' fuck if their filling is caused by updatedb or the cat sleeping on the "find /" keys on your keyboard, they're still not causing anything swap-prefetch helps with. This thread has seen input from a selection of knowledgeable people and Morton was even running benchmarks to look at this supposed VFS cache problem and not finding it.
The only further input this thread needs is someone affected by the supposed problem. Which I of course notice in a followup of yours you are not either -- you're just here to blabber, not to solve anything.

Rene.
Re: -mm merge plans for 2.6.23
Andrew Morton wrote: [...]

> And userspace can do a much better implementation of this
> how-to-handle-large-load-shifts problem, because it is really quite
> complex. The system needs to be monitored to determine what is the "usual"

[...]

> All this would end up needing runtime configurability and tweakability
> and customisability. All standard fare for userspace stuff - much
> easier than patching the kernel.

But a patch already exists. Which is easier: (1) apply the patch; or (2) write a new patch?

> So. We can
> a) provide a way for userspace to reload pagecache and
> b) merge maps2 (once it's finished) (pokes mpm)
> and we're done?

Might be. But merging maps2 has higher risk, which should be done in a development branch (er... 2.7, but we don't have it now).