Re: unfair stress on non memory allocating apps while swapout (in 2.4)

2000-10-22 Thread Nick Piggin
Why are programs which do not allocate memory delayed while one program is eating up all memory? This clearly means they are not delayed in the malloc call; the kernel simply will not schedule them while it is busy paging out processes. Bernd, The reason why programs not allocating

Topic for discussion: OS Design

2000-10-22 Thread Nick Piggin
So what we really need to do is get some custom "RAM blitter" into our hardware to do the memory copies needed for fast context switching and message passing. don't you think you should quit while you're behind? Too bad nobody on this list works at an electronics design company... ;-P you

Re: Linux's implementation of poll() not scalable?

2000-10-23 Thread Nick Piggin
I'm trying to write a server that handles 1 clients. On 2.4.x, the RT signal queue stuff looks like the way to achieve that. I would suggest you try multiple polling threads. Not only will you get better SMP scalability, if you have say 16 threads, each one only has to handle ~ 600 fds.
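A minimal sketch of the "multiple polling threads" suggestion above, in Python rather than the C a real server would likely use. The helper names (`partition`, `poll_worker`, `run_demo`) are hypothetical, and socket pairs stand in for client connections. With 16 threads at roughly 600 descriptors each the total would be about 9,600; that figure is only an illustration, since the client count in the preview is garbled.

```python
import select
import socket
import threading


def partition(fds, n_threads):
    """Split a list into n_threads roughly equal slices (round-robin)."""
    return [fds[i::n_threads] for i in range(n_threads)]


def poll_worker(socks, results, index):
    """Poll one slice of sockets until each has delivered one byte."""
    poller = select.poll()
    by_fd = {s.fileno(): s for s in socks}
    for fd in by_fd:
        poller.register(fd, select.POLLIN)
    seen = 0
    while seen < len(socks):
        # Each thread only scans its own slice, not the full descriptor set.
        for fd, event in poller.poll(1000):  # timeout in milliseconds
            if event & select.POLLIN:
                by_fd[fd].recv(1)
                poller.unregister(fd)
                seen += 1
    results[index] = seen


def run_demo(n_socks=32, n_threads=4):
    """Create n_socks socket pairs and service them from n_threads pollers."""
    pairs = [socket.socketpair() for _ in range(n_socks)]
    readers = [r for r, _ in pairs]
    results = [0] * n_threads
    threads = [
        threading.Thread(target=poll_worker, args=(chunk, results, i))
        for i, chunk in enumerate(partition(readers, n_threads))
    ]
    for t in threads:
        t.start()
    for _, writer in pairs:
        writer.send(b"x")  # make every reader become readable
    for t in threads:
        t.join()
    for r, w in pairs:
        r.close()
        w.close()
    return results
```

Each worker's poll set stays small, so the per-call scanning cost drops with the thread count, and the threads can run on separate CPUs.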

[patch] BSD process accounting: new locking

2000-10-29 Thread Nick Piggin
I have attached a very small patch (test9) to remove the kernel lock from kernel/acct.c. If I am missing something major (a brain?), I apologise in advance. I have tested this on my UP x86 with spinlock debugging. I would appreciate comments or an explanation of why this can't be done if you have

test9 oops (in block_read_full_page)

2000-10-29 Thread Nick Piggin
I apologise if this oops has already been fixed: it has happened twice but I can't find the exact way to trigger it, I just want to make sure it is reported ;) Nick oops

Re: 2.4.0-test9 Oopses

2000-11-02 Thread Nick Piggin
== '/') { in fs/namei.c: __vfs_follow_link to oops. The oops is due to trying to follow an sg? link in /dev. Nick. - Original Message - From: "Nick Piggin" [EMAIL PROTECTED] To: "Linux-Kernel" [EMAIL PROTECTED] Sent: Wednesday, October 25, 2000 9:16 PM Subject: 2.4.0-test9 Oopses

2.4.0-test10 oopses (bug in devfs)

2000-11-04 Thread Nick Piggin
), seems to return invalid or incorrect devfs entries whose .u.symlink.linkname is null which causes the line: if (*link == '/') { in fs/namei.c: __vfs_follow_link to oops. The oops is due to trying to follow an sg? link in /dev. Nick. - Original Message - From: "Nick P

bkl usage

2000-11-12 Thread Nick Piggin
Hi. In my efforts to understand the linux kernel v2.4 I found the bkl being used in kernel/acct.c to lock seemingly local data. Would someone please explain what races this prevents vs. say: --- linux/kernel/acct.c Mon Oct 30 01:02:56 2000 +++ linux-2.4.0-test10/kernel/acct.c Mon Oct 30

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Mon, Apr 16, 2007 at 08:52:33AM +1000, Con Kolivas wrote: On Monday 16 April 2007 05:00, Jonathan Lundell wrote: On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote: It's a really good thing, and it means that if somebody shows that your code is flawed in some way (by, for example,

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Sun, Apr 15, 2007 at 04:31:54PM -0500, Matt Mackall wrote: On Sun, Apr 15, 2007 at 10:48:24PM +0200, Ingo Molnar wrote: 4) the good thing that happened to I/O, after years of stagnation isnt I/O schedulers. The good thing that happened to I/O is called Jens Axboe. If you care

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Nick Piggin
On Mon, Apr 16, 2007 at 01:15:27PM +1000, Con Kolivas wrote: On Monday 16 April 2007 12:28, Nick Piggin wrote: So, on to something productive, we have 3 candidates for a new scheduler so far. How do we decide which way to go? (and yes, I still think switchable schedulers is wrong

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Sun, Apr 15, 2007 at 10:36:29PM -0700, Bill Huey wrote: On Sun, Apr 15, 2007 at 09:25:07AM -0700, Arjan van de Ven wrote: Now this doesn't mean that people shouldn't be nice to each other, not cooperate or steal credits, but I don't get the impression that that is happening here. Ingo is

Re: [ck] Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Mon, Apr 16, 2007 at 03:57:54PM +1000, Matthew Hawkins wrote: On 4/16/07, Nick Piggin [EMAIL PROTECTED] wrote: So, on to something productive, we have 3 candidates for a new scheduler so far. How do we decide which way to go? (and yes, I still think switchable schedulers is wrong

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Mon, Apr 16, 2007 at 09:28:24AM -0500, Matt Mackall wrote: On Mon, Apr 16, 2007 at 05:03:49AM +0200, Nick Piggin wrote: I'd prefer if we kept a single CPU scheduler in mainline, because I think that simplifies analysis and focuses testing. I think you'll find something like 80-90

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Tue, Apr 17, 2007 at 04:29:01AM +0200, Mike Galbraith wrote: On Tue, 2007-04-17 at 10:06 +1000, Peter Williams wrote: Mike Galbraith wrote: Demystify what? The casual observer need only read either your attempt at writing a scheduler, or my attempts at fixing the one we have, to

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Mon, Apr 16, 2007 at 04:10:59PM -0700, Michael K. Edwards wrote: On 4/16/07, Peter Williams [EMAIL PROTECTED] wrote: Note that I talk of run queues not CPUs as I think a shift to multiple CPUs per run queue may be a good idea. This observation of Peter's is the best thing to come out of

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Tue, Apr 17, 2007 at 06:01:29AM +0200, Mike Galbraith wrote: On Tue, 2007-04-17 at 05:40 +0200, Nick Piggin wrote: On Tue, Apr 17, 2007 at 04:29:01AM +0200, Mike Galbraith wrote: Yup, and progress _is_ happening now, quite rapidly. Progress as in progress on Ingo's scheduler. I

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Tue, Apr 17, 2007 at 02:17:22PM +1000, Peter Williams wrote: Nick Piggin wrote: On Tue, Apr 17, 2007 at 04:29:01AM +0200, Mike Galbraith wrote: On Tue, 2007-04-17 at 10:06 +1000, Peter Williams wrote: Mike Galbraith wrote: Demystify what? The casual observer need only read either your

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-16 Thread Nick Piggin
On Tue, Apr 17, 2007 at 02:25:39PM +1000, Peter Williams wrote: Nick Piggin wrote: On Mon, Apr 16, 2007 at 04:10:59PM -0700, Michael K. Edwards wrote: On 4/16/07, Peter Williams [EMAIL PROTECTED] wrote: Note that I talk of run queues not CPUs as I think a shift to multiple CPUs per run queue

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 07:53:55AM +0200, Willy Tarreau wrote: Hi Nick, On Tue, Apr 17, 2007 at 06:29:54AM +0200, Nick Piggin wrote: (...) And my scheduler for example cuts down the amount of policy code and code size significantly. I haven't looked at Con's ones for a while, but I

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
+0200, Nick Piggin wrote: I don't know why. The problem is that you can't really evaluate good proposals by looking at the code (you can say that one is bad, ie. the current one, which has a huge amount of temporal complexity and is explicitly unfair), but it is pretty hard to say one

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 04:03:41PM +1000, Peter Williams wrote: Nick Piggin wrote: But you add extra code for that on top of what we have, and are also prevented from making per-cpu assumptions. And you can get N CPUs per runqueue behaviour by having them in a domain with no restrictions

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 04:23:37PM +1000, Peter Williams wrote: Nick Piggin wrote: And my scheduler for example cuts down the amount of policy code and code size significantly. Yours is one of the smaller patches mainly because you perpetuate (or you did in the last one I looked

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Mon, Apr 16, 2007 at 11:26:21PM -0700, William Lee Irwin III wrote: On Mon, Apr 16, 2007 at 11:09:55PM -0700, William Lee Irwin III wrote: All things are not equal; they all have different properties. I like On Tue, Apr 17, 2007 at 08:15:03AM +0200, Nick Piggin wrote: Exactly. So we

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 12:09:49AM -0700, William Lee Irwin III wrote: The trouble with thorough testing right now is that no one agrees on what the tests should be and a number of the testcases are not in great shape. An agreed-upon set of testcases for basic correctness should be devised

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 12:27:28AM -0700, Davide Libenzi wrote: On Tue, 17 Apr 2007, William Lee Irwin III wrote: On Mon, Apr 16, 2007 at 11:50:03PM -0700, Davide Libenzi wrote: I would suggest to thoroughly test all your alternatives before deciding. Some code and design may look very

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 09:33:08AM +0200, Ingo Molnar wrote: * William Lee Irwin III [EMAIL PROTECTED] wrote: On Mon, Apr 16, 2007 at 11:50:03PM -0700, Davide Libenzi wrote: I had a quick look at Ingo's code yesterday. Ingo is always smart to prepare a main dish (feature) with a nice

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 05:48:55PM +1000, Peter Williams wrote: Nick Piggin wrote: Other hints that it was a bad idea was the need to transfer time slices between children and parents during fork() and exit(). I don't see how that has anything to do with dual arrays. It's totally to do

Re: [patch] CFS (Completely Fair Scheduler), v2

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 01:03:46AM -0700, Davide Libenzi wrote: On Tue, 17 Apr 2007, Ingo Molnar wrote: ok - fortunately the delta between -v2-rc0 and -v2-final is pretty small. One difference is the child-runs-first fix. To restore the parent-runs-first logic, do this: echo 0

Re: [patch] CFS (Completely Fair Scheduler), v2

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 10:26:31AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: Actually I think this is something that makes sense to add, even if just for debugging, but maybe also for production, depending on how much it impacts things. Child runs first

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 08:56:27AM +0100, Andy Whitcroft wrote: as usual, any sort of feedback, bugreports, fixes and suggestions are more than welcome, Pushed this through the test.kernel.org and nothing new blew up. Notably the kernbench figures are within expectations even on the

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 11:59:00AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: 2.6.21-rc7-cfs-v2 534.80user 30.92system 2:23.64elapsed 393%CPU 534.75user 31.01system 2:23.70elapsed 393%CPU 534.66user 31.07system 2:23.76elapsed 393%CPU 534.56user 30.91system 2

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 04:39:54PM -0500, Matt Mackall wrote: On Tue, Apr 17, 2007 at 09:01:55AM +0200, Nick Piggin wrote: On Mon, Apr 16, 2007 at 11:26:21PM -0700, William Lee Irwin III wrote: On Mon, Apr 16, 2007 at 11:09:55PM -0700, William Lee Irwin III wrote: All things

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Wed, Apr 18, 2007 at 05:45:20AM +0200, Mike Galbraith wrote: On Wed, 2007-04-18 at 05:15 +0200, Nick Piggin wrote: On Tue, Apr 17, 2007 at 04:39:54PM -0500, Matt Mackall wrote: I'm a big fan of fairness, but I think it's a bit early to declare it a mandatory feature. Bounded

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 11:16:54PM +1000, Peter Williams wrote: Nick Piggin wrote: I don't like the timeslice based nice in mainline. It's too nasty with latencies. nicksched is far better in that regard IMO. But I don't know how you can assert a particular way is the best way to do

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-17 Thread Nick Piggin
On Tue, Apr 17, 2007 at 11:38:31PM -0500, Matt Mackall wrote: On Wed, Apr 18, 2007 at 05:15:11AM +0200, Nick Piggin wrote: I don't know why this would be a useful feature (of course I'm talking about processes at the same nice level). One of the big problems with the current scheduler

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Wed, Apr 18, 2007 at 12:55:25AM -0500, Matt Mackall wrote: On Wed, Apr 18, 2007 at 07:00:24AM +0200, Nick Piggin wrote: It's also not yet clear that a scheduler can't be taught to do the right thing with X without fiddling with nice levels. Being fair doesn't prevent that. Implicit

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Wed, Apr 18, 2007 at 01:55:34AM -0500, Matt Mackall wrote: On Wed, Apr 18, 2007 at 08:37:11AM +0200, Nick Piggin wrote: I don't know how that supports your argument for unfairness, I never had such an argument. I like fairness. My argument is that -you- don't have an argument

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Tue, Apr 17, 2007 at 11:59:00AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: 2.6.21-rc7-cfs-v2 534.80user 30.92system 2:23.64elapsed 393%CPU 534.75user 31.01system 2:23.70elapsed 393%CPU 534.66user 31.07system 2:23.76elapsed 393%CPU 534.56user 30.91system 2

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Wed, Apr 18, 2007 at 11:53:34AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: So looking at elapsed time, a granularity of 100ms is just behind the mainline score. However it is using slightly less user time and slightly more idle time, which indicates

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Wed, Apr 18, 2007 at 07:33:56PM +1000, Con Kolivas wrote: On Wednesday 18 April 2007 18:55, Nick Piggin wrote: Again, for comparison 2.6.21-rc7 mainline: 508.87user 32.47system 2:17.82elapsed 392%CPU 509.05user 32.25system 2:17.84elapsed 392%CPU 508.75user 32.26system 2:17.83elapsed
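For comparing the kernbench-style figures quoted in this thread, a small sketch that parses a `time` summary line and computes the relative slowdown. The function names are hypothetical, and the parser assumes the `m:ss.ss` elapsed format seen in the previews (no hours field).

```python
import re

TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)user (?P<system>[\d.]+)system "
    r"(?P<min>\d+):(?P<sec>[\d.]+)elapsed (?P<cpu>\d+)%CPU"
)


def parse_time_line(line):
    """Return user, system and elapsed seconds plus CPU percentage."""
    m = TIME_LINE.search(line)
    if m is None:
        raise ValueError("not a recognised time summary: %r" % line)
    return {
        "user": float(m.group("user")),
        "system": float(m.group("system")),
        "elapsed": int(m.group("min")) * 60 + float(m.group("sec")),
        "cpu": int(m.group("cpu")),
    }


def slowdown(baseline, candidate):
    """Fractional increase in elapsed time of candidate over baseline."""
    return candidate["elapsed"] / baseline["elapsed"] - 1.0


# First run of each kernel, as quoted in the thread.
mainline = parse_time_line("508.87user 32.47system 2:17.82elapsed 392%CPU")
cfs_v2 = parse_time_line("534.80user 30.92system 2:23.64elapsed 393%CPU")
print("cfs-v2 elapsed time vs mainline: %+.1f%%" % (100 * slowdown(mainline, cfs_v2)))
```

On those two lines the elapsed-time gap works out to a little over 4%, with most of the difference in user time rather than system time.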

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Wed, Apr 18, 2007 at 07:48:21AM -0700, Linus Torvalds wrote: On Wed, 18 Apr 2007, Matt Mackall wrote: Why is X special? Because it does work on behalf of other processes? Lots of things do this. Perhaps a scheduler should focus entirely on the implicit and directed wakeup matrix

Re: Announce - Staircase Deadline cpu scheduler v0.42

2007-04-18 Thread Nick Piggin
On Thu, Apr 19, 2007 at 12:12:14PM +1000, Con Kolivas wrote: On Thursday 19 April 2007 10:41, Con Kolivas wrote: On Thursday 19 April 2007 09:59, Con Kolivas wrote: Since there is so much work currently ongoing with alternative cpu schedulers, as a standard for comparison with the

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-18 Thread Nick Piggin
On Wed, Apr 18, 2007 at 10:49:45PM +1000, Con Kolivas wrote: On Wednesday 18 April 2007 22:13, Nick Piggin wrote: The kernel compile (make -j8 on 4 thread system) is doing 1800 total context switches per second (450/s per runqueue) for cfs, and 670 for mainline. Going up to 20ms

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-19 Thread Nick Piggin
On Thu, Apr 19, 2007 at 08:38:10AM +0200, Ingo Molnar wrote: * Andrew Morton [EMAIL PROTECTED] wrote: And yes, by fairly, I mean fairly among all threads as a base resource class, because that's what Linux has always done Yes, there are potential compatibility problems. Example:

Re: Announce - Staircase Deadline cpu scheduler v0.42

2007-04-19 Thread Nick Piggin
On Thu, Apr 19, 2007 at 07:40:04PM +1000, Con Kolivas wrote: On Thursday 19 April 2007 13:22, Nick Piggin wrote: On Thu, Apr 19, 2007 at 12:12:14PM +1000, Con Kolivas wrote: Version 0.42 http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.42.patch OK, I run some tests

Re: rr_interval experiments

2007-04-19 Thread Nick Piggin
On Fri, Apr 20, 2007 at 10:47:57AM +1000, Con Kolivas wrote: On Friday 20 April 2007 01:01, Con Kolivas wrote: This then allows the maximum rr_interval to be as large as 5000 milliseconds. Just for fun, on a core2duo make allnoconfig make -j8 here are the build time differences (on a

Re: Renice X for cpu schedulers

2007-04-19 Thread Nick Piggin
On Thu, Apr 19, 2007 at 09:17:25AM -0400, Mark Lord wrote: Con Kolivas wrote: s go ahead and think up great ideas for other ways of metering out cpu bandwidth for different purposes, but for X, given the absurd simplicity of renicing, why keep fighting it? Again I reiterate that most users

Re: Renice X for cpu schedulers

2007-04-19 Thread Nick Piggin
On Thu, Apr 19, 2007 at 12:26:03PM -0700, Ray Lee wrote: On 4/19/07, Con Kolivas [EMAIL PROTECTED] wrote: The one fly in the ointment for linux remains X. I am still, to this moment, completely and utterly stunned at why everyone is trying to find increasingly complex unique ways to manage

Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-21 Thread Nick Piggin
On Fri, Apr 20, 2007 at 04:47:27PM -0400, Bill Davidsen wrote: Ingo Molnar wrote: ( Lets be cautious though: the jury is still out whether people actually like this more than the current approach. While CFS feedback looks promising after a whopping 3 days of it being released [ ;-) ],

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-21 Thread Nick Piggin
Rik van Riel wrote: Andrew Morton wrote: On Fri, 20 Apr 2007 17:38:06 -0400 Rik van Riel [EMAIL PROTECTED] wrote: Andrew Morton wrote: I've also merged Nick's mm: madvise avoid exclusive mmap_sem. - Nick's patch also will help this problem. It could be that your patch no longer

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-21 Thread Nick Piggin
Nick Piggin wrote: Rik van Riel wrote: Andrew Morton wrote: On Fri, 20 Apr 2007 17:38:06 -0400 Rik van Riel [EMAIL PROTECTED] wrote: Andrew Morton wrote: I've also merged Nick's mm: madvise avoid exclusive mmap_sem. - Nick's patch also will help this problem. It could be that your

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-22 Thread Nick Piggin
Rik van Riel wrote: Nick Piggin wrote: Rik van Riel wrote: Here are the transactions/seconds for each combination (columns: vanilla | new glibc | madv_free kernel | madv_free + mmap_sem): 1 thread: 610 | 609 | 596 | 545; 2 threads: 1032 | 1136 | 1196

Re: [patch] CFS scheduler, -v5

2007-04-22 Thread Nick Piggin
On Mon, Apr 23, 2007 at 03:12:29AM +0200, Ingo Molnar wrote: i'm pleased to announce release -v5 of the CFS scheduler patchset. The patch against v2.6.21-rc7 and v2.6.20.7 can be downloaded from: http://redhat.com/~mingo/cfs-scheduler/ this CFS release mainly fixes regressions and

Re: [REPORT] cfs-v4 vs sd-0.44

2007-04-22 Thread Nick Piggin
On Sun, Apr 22, 2007 at 04:24:47PM -0700, Linus Torvalds wrote: On Sun, 22 Apr 2007, Juliusz Chroboczek wrote: Why not do it in the X server itself? This will avoid controversial policy in the kernel, and have the added advantage of working with X servers that don't directly access

Re: [patch] CFS scheduler, -v5

2007-04-22 Thread Nick Piggin
On Mon, Apr 23, 2007 at 04:55:53AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: the biggest user-visible change in -v5 are various interactivity improvements (especially under higher load) to fix reported regressions, and an improved way of handling nice levels

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-22 Thread Nick Piggin
Rik van Riel wrote: I've added a 5th column, with just your mmap_sem patch and without my madv_free patch. It is run with the glibc patch, which should make it fall back to MADV_DONTNEED after the first MADV_FREE call fails. Thanks! (I edited slightly so it doesn't wrap) vanilla new
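The fallback behaviour described above (try MADV_FREE, and switch to MADV_DONTNEED once a MADV_FREE call fails) can be sketched from userspace as follows. This is not the glibc patch itself, only an illustration of the same idea; the `LazyFreer` class and `demo` function are hypothetical names, and it assumes Python 3.8+ on a Unix system where `mmap.madvise` is available.

```python
import mmap


class LazyFreer:
    """Prefer MADV_FREE, but remember a failure and use MADV_DONTNEED after."""

    def __init__(self):
        # MADV_FREE is only exposed when the platform headers define it.
        self.free_advice = getattr(mmap, "MADV_FREE", None)

    def release(self, region):
        """Tell the kernel the contents of region are no longer needed."""
        if self.free_advice is not None:
            try:
                region.madvise(self.free_advice)
                return "MADV_FREE"
            except OSError:
                # Running kernel rejects MADV_FREE: stop trying it.
                self.free_advice = None
        region.madvise(mmap.MADV_DONTNEED)
        return "MADV_DONTNEED"


def demo(size=16 * mmap.PAGESIZE):
    """Release a private anonymous mapping twice and report the advice used."""
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[:5] = b"hello"
    freer = LazyFreer()
    used = [freer.release(region) for _ in range(2)]
    region.close()
    return used
```

Caching the failure means an unpatched kernel pays for the rejected call only once, which is what makes a column with "new glibc, old kernel" meaningful in the comparison.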

Re: [patch] CFS scheduler, -v5

2007-04-22 Thread Nick Piggin
On Mon, Apr 23, 2007 at 05:43:10AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: note that CFS's granularity value is not directly comparable to timeslice length: Right, but it does introduce the kbuild regression, [...] Note that i increased the granularity

Re: How to make mmap'ed kernel buffer non-cacheable

2007-04-22 Thread Nick Piggin
Bhuvan Kumar MITTAL wrote: Hi Alan, I believe that dma_alloc_coherent will mark the kernel buffer as uncached at allocation time. But that is not my intention. I have mapped some user space memory to the kernel buffer and I wish to ensure that the contents of both are coherent and correctly

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-22 Thread Nick Piggin
Jakub Jelinek wrote: On Fri, Apr 20, 2007 at 07:52:44PM -0400, Rik van Riel wrote: It turns out that Nick's patch does not improve peak performance much, but it does prevent the decline when running with 16 threads on my quad core CPU! We _definitely_ want both patches, there's a huge benefit

Re: [patch] CFS scheduler, -v5

2007-04-23 Thread Nick Piggin
On Mon, Apr 23, 2007 at 09:10:50AM +0200, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: yeah - but they'll all be quad core, so the SMP timeslice multiplicator should do the trick. Most of the CFS testers use single-CPU systems. But desktop users could have

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-23 Thread Nick Piggin
Nick Piggin wrote: Rik van Riel wrote: I've added a 5th column, with just your mmap_sem patch and without my madv_free patch. It is run with the glibc patch, which should make it fall back to MADV_DONTNEED after the first MADV_FREE call fails. Thanks! (I edited slightly so it doesn't wrap

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-23 Thread Nick Piggin
Rik van Riel wrote: Use TLB batching for MADV_FREE. Adds another 10-15% extra performance to the MySQL sysbench results on my quad core system. Signed-off-by: Rik van Riel [EMAIL PROTECTED] --- Rik van Riel wrote: I've added a 5th column, with just your mmap_sem patch and without my

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-23 Thread Nick Piggin
Rik van Riel wrote: Nick Piggin wrote: It looks like the tlb flushes (and IPIs) from zap_pte_range() could have been the problem. They're gone now. I guess it is a good idea to batch these things. But can you do that on all architectures? What happens if your tlb flush happens after

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-23 Thread Nick Piggin
Rik van Riel wrote: Use TLB batching for MADV_FREE. Adds another 10-15% extra performance to the MySQL sysbench results on my quad core system. Signed-off-by: Rik van Riel [EMAIL PROTECTED] --- Nick Piggin wrote: 3) because of this, we can treat any such accesses as happening

Re: [PATCH] lazy freeing of memory through MADV_FREE

2007-04-23 Thread Nick Piggin
Rik van Riel wrote: This should fix the MADV_FREE code for PPC's hashed tlb. Signed-off-by: Rik van Riel [EMAIL PROTECTED] --- Nick Piggin wrote: Nick Piggin wrote: 3) because of this, we can treat any such accesses as happening simultaneously with the MADV_FREE and as illegal, aka

Re: [PATCH] mm: PageLRU can be non-atomic bit operation

2007-04-23 Thread Nick Piggin
Hisashi Hifumi wrote: At 22:42 07/04/23, Hugh Dickins wrote: On Mon, 23 Apr 2007, Hisashi Hifumi wrote: No. The PG_lru flag bit is just one bit amongst many others: what of concurrent operations changing other bits in that same unsigned long e.g. trying to lock the page by setting
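To make the objection above concrete: a non-atomic set of one bit is a load, an OR and a store, so a concurrent change to a different bit in the same word can be overwritten. The sketch below replays that interleaving deterministically in Python. The bit positions are made up for illustration, and pairing PG_lru with the page lock bit is an assumption based on the truncated preview.

```python
PG_LOCKED = 1 << 0  # illustrative bit positions, not the kernel's values
PG_LRU = 1 << 5


def nonatomic_set_lru_interleaved():
    """CPU A sets PG_LRU non-atomically while CPU B sets PG_LOCKED."""
    flags = 0
    stale = flags              # CPU A: load the whole flags word
    flags |= PG_LOCKED         # CPU B: sets the lock bit in between
    flags = stale | PG_LRU     # CPU A: stores its stale copy plus PG_LRU
    return flags


def atomic_set_lru_interleaved():
    """Same sequence, but CPU A's update is one indivisible read-modify-write."""
    flags = 0
    flags |= PG_LOCKED         # CPU B: sets the lock bit
    flags |= PG_LRU            # CPU A: atomic OR sees B's update
    return flags


lost = nonatomic_set_lru_interleaved()
kept = atomic_set_lru_interleaved()
print("non-atomic: locked=%s lru=%s" % (bool(lost & PG_LOCKED), bool(lost & PG_LRU)))
print("atomic:     locked=%s lru=%s" % (bool(kept & PG_LOCKED), bool(kept & PG_LRU)))
```

In the first sequence the lock bit set by the other CPU is silently lost, even though neither side touched the other's bit on purpose.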

Re: [PATCH] mm: PageLRU can be non-atomic bit operation

2007-04-24 Thread Nick Piggin
Hisashi Hifumi wrote: At 11:47 07/04/24, Nick Piggin wrote: As Hugh points out, we must have atomic ops here, so changing the generic code to use the __ version is wrong. However if there is a faster way that i386 can perform the atomic variant, then doing so will speed up the generic

Re: [rfc][patch] futex: restartable futex_wait?

2007-03-08 Thread Nick Piggin
On Thu, Mar 08, 2007 at 06:29:02PM +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: Hi Ingo, I'm seeing an LTP test fail for ltp test sigaction_16_24. Basically, it tests whether the SA_RESTART flag works for the sem_wait operation. I see sem_wait is implemented

Re: [rfc][patch] futex: restartable futex_wait?

2007-03-08 Thread Nick Piggin
On Fri, Mar 09, 2007 at 12:02:31AM +0100, Thomas Gleixner wrote: On Thu, 2007-03-08 at 18:29 +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: Hi Ingo, I'm seeing an LTP test fail for ltp test sigaction_16_24. Basically, it tests whether the SA_RESTART flag works

Re: [rfc][patch] futex: restartable futex_wait?

2007-03-09 Thread Nick Piggin
On Fri, Mar 09, 2007 at 10:38:35AM +0100, Thomas Gleixner wrote: On Fri, 2007-03-09 at 06:10 +0100, Nick Piggin wrote: i think that's quite right. I'm wondering why this never came up before? But your fix is not complete i think: + restart->arg2 = time

Re: [patch 2/3] fs: introduce perform_write aop

2007-03-09 Thread Nick Piggin
for taking a look. On Thu, Feb 08, 2007 at 02:07:36PM +0100, Nick Piggin wrote: as a single call to copy a given amount of userdata at the given offset. This is more flexible, because the implementation can determine how to best handle errors, or multi-page ranges (eg. it may use a gang

[patch] futex: restartable futex_wait

2007-03-09 Thread Nick Piggin
timeout, and allow restarts. Signed-off-by: Nick Piggin [EMAIL PROTECTED] Index: linux-2.6/kernel/futex.c === --- linux-2.6.orig/kernel/futex.c +++ linux-2.6/kernel/futex.c @@ -978,6 +978,7 @@ static void unqueue_me_pi(struct futex_q

Re: [patch] change futex_wait() to hrtimers

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 10:10:06AM +0100, Ingo Molnar wrote: * Roland McGrath [EMAIL PROTECTED] wrote: I agree it should restart. But I don't think this is quite right in the timeout case. It will increase the total maximum real time spent arbitrarily by the amount of time elapsed
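The concern raised above is that restarting a wait with its original relative timeout lets every interruption extend the total wait. One common way around this, sketched here in Python with a simulated clock, is to compute an absolute deadline once and pass only the remaining time on each restart. This is an illustration of the general technique, not the futex patch; `restartable_wait` and `simulate` are hypothetical names.

```python
def restartable_wait(wait_once, timeout_ms, clock):
    """Wait until woken or until timeout_ms has elapsed in total.

    wait_once(remaining_ms) returns True (woken), False (timed out) or
    None (interrupted, for example by a signal, and must be restarted).
    """
    deadline = clock() + timeout_ms  # fixed once, never pushed back
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        outcome = wait_once(remaining)
        if outcome is not None:
            return outcome


def simulate(timeout_ms, interrupt_every_ms):
    """Drive restartable_wait with a fake clock and periodic interruptions."""
    now = [0]
    asked = []

    def clock():
        return now[0]

    def wait_once(remaining):
        asked.append(remaining)
        if remaining <= interrupt_every_ms:
            now[0] += remaining
            return False          # slept the full remaining time: timed out
        now[0] += interrupt_every_ms
        return None               # interrupted before the timeout expired

    woken = restartable_wait(wait_once, timeout_ms, clock)
    return woken, now[0], asked
```

With a 1000 ms timeout and an interruption every 400 ms, the restarts ask for 1000, 600 and 200 ms and the wait ends at exactly 1000 ms. Reusing the full 1000 ms on each restart would never time out under the same interruptions.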

Re: [patch] change futex_wait() to hrtimers

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 12:02:04PM +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: i dont think we should try to do this. We should not and cannot do anything about all of the artifacts that comes with the use of relative timeouts and schedule_timeout

Re: [patch] change futex_wait() to hrtimers

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 12:19:58PM +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: even if this means more work for you (i'm sorry about that!) i'm quite sure we should take Sebastien's hrtimers based implementation of futex_wait(), and use the nanosleep method

Re: [patch] change futex_wait() to hrtimers

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 12:38:29PM +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: the issue is this: your fix reduces the effects of the bug but it is still fundamentally incomplete because of the use of timer_list. So But using schedule_timeout is not a bug

Re: [PATCH 2/2] mm: incorrect direct io error handling (v6)

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 10:58:10AM +0300, Dmitriy Monakhov wrote: I really don't want to be annoying by sending this patchset over and over again, I just want the issue to be solved. If anyone thinks this solution is really crappy, please comment on what exactly is bad. Thank you. If you don't get

Re: [PATCH 1/2] mm: move common segment checks to separate helper function (v6)

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 10:57:53AM +0300, Dmitriy Monakhov wrote: I really don't want to be annoying by sending this patchset over and over again. If anyone thinks this patch is really crappy, please comment on what exactly is bad. Thank you. Doesn't seem like a bad idea. Changes: - patch

Re: [PATCH 2/2] mm: incorrect direct io error handling (v6)

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 11:55:30AM +0300, Dmitriy Monakhov wrote: Nick Piggin [EMAIL PROTECTED] writes: On Mon, Mar 12, 2007 at 10:58:10AM +0300, Dmitriy Monakhov wrote: @@ -2240,6 +2241,29 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov, mutex_lock

Re: [PATCH 2/2] mm: incorrect direct io error handling (v6)

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 12:23:00PM +0300, Dmitriy Monakhov wrote: Nick Piggin [EMAIL PROTECTED] writes: On Mon, Mar 12, 2007 at 11:55:30AM +0300, Dmitriy Monakhov wrote: Nick Piggin [EMAIL PROTECTED] writes: On Mon, Mar 12, 2007 at 10:58:10AM +0300, Dmitriy Monakhov wrote

Re: do_generic_mapping_read performance issue

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 03:20:12PM +0100, Jan Kara wrote: Hi, Hi, I am encountering a performance problem, which I have tracked into the Linux kernel. The problem occurs with my experimental web server that uses sendfile to repeatedly transmit files. The files are based on the static

Re: [patch] change futex_wait() to hrtimers

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 01:21:03PM +0100, Ingo Molnar wrote: * Nick Piggin [EMAIL PROTECTED] wrote: the issue is this: your fix reduces the effects of the bug but it is still fundamentally incomplete because of the use of timer_list. So But using schedule_timeout

Re: [patch] change futex_wait() to hrtimers

2007-03-12 Thread Nick Piggin
On Mon, Mar 12, 2007 at 10:12:14AM -0400, Theodore Tso wrote: On Mon, Mar 12, 2007 at 11:58:26AM +0100, Andi Kleen wrote: On Mon, Mar 12, 2007 at 12:00:20PM +0100, Thomas Gleixner wrote: On Mon, 2007-03-12 at 12:27 +0100, Andi Kleen wrote: Ingo Molnar [EMAIL PROTECTED] writes:

Re: [patch 4/6] mm: merge populate and nopage into fault (fixes nonlinear)

2007-03-12 Thread Nick Piggin
On Tue, Mar 13, 2007 at 12:01:13AM +0100, Blaisorblade wrote: On Wednesday 07 March 2007 11:02, Nick Piggin wrote: Yeah, tmpfs/shm segs are what I was thinking about. If UML can live with that as well, then I think it might be a good option. Oh, hmm if you can truncate

Re: SMP performance degradation with sysbench

2007-03-12 Thread Nick Piggin
Anton Blanchard wrote: Hi Nick, Anyway, I'll keep experimenting. If anyone from MySQL wants to help look at this, send me a mail (eg. especially with the sched_setscheduler issue, you might be able to do something better). I took a look at this today and figured I'd document it:

Re: RSDL-mm 0.28

2007-03-13 Thread Nick Piggin
David Schwartz wrote: There's a substantial performance hit for not yield, so we probably want to investigate alternate semantics for it. It seems reasonable for apps to say let me not hog the CPU without completely expiring them. Imagine you're in the front of the line (aka queue) and you spend

Re: [QUICKLIST 0/4] Arch independent quicklists V2

2007-03-13 Thread Nick Piggin
Andrew Morton wrote: On Tue, 13 Mar 2007 00:13:25 -0700 (PDT) Christoph Lameter [EMAIL PROTECTED] wrote: Page table pages have the characteristics that they are typically zero or in a known state when they are freed. Well if they're zero then perhaps they should be released to the page

Re: SMP performance degradation with sysbench

2007-03-13 Thread Nick Piggin
Andrea Arcangeli wrote: On Tue, Mar 13, 2007 at 04:11:02PM +1100, Nick Piggin wrote: Hi Anton, Very cool. Yeah I had come to the conclusion that it wasn't a kernel issue, and basically was afraid to look into userspace ;) btw, regardless of what glibc is doing, still the cpu shouldn't go

Re: [RFC][PATCH 4/7] RSS accounting hooks over the code

2007-03-13 Thread Nick Piggin
Eric W. Biederman wrote: Herbert Poetzl [EMAIL PROTECTED] writes: On Mon, Mar 12, 2007 at 09:50:08AM -0700, Dave Hansen wrote: On Mon, 2007-03-12 at 19:23 +0300, Kirill Korotaev wrote: For these you essentially need per-container page->_mapcount counter, otherwise you can't detect whether

Re: SMP performance degradation with sysbench

2007-03-13 Thread Nick Piggin
Andrea Arcangeli wrote: On Tue, Mar 13, 2007 at 09:06:14PM +1100, Nick Piggin wrote: Well ignoring the HT issue, I was seeing lots of idle time simply because userspace could not keep up enough load to the scheduler. There simply were fewer runnable tasks than CPU cores. When you said idle

Re: [QUICKLIST 0/4] Arch independent quicklists V2

2007-03-13 Thread Nick Piggin
Andrew Morton wrote: On Tue, 13 Mar 2007 19:03:38 +1100 Nick Piggin [EMAIL PROTECTED] wrote: Page allocator still requires interrupts to be disabled, which this doesn't. Bah. How many cli/sti statements fit into a single cachemiss? On a Pentium 4? ;) Sure, that is a minor detail

Re: SMP performance degradation with sysbench

2007-03-13 Thread Nick Piggin
Andrea Arcangeli wrote: On Tue, Mar 13, 2007 at 09:37:54PM +1100, Nick Piggin wrote: Well it wasn't iowait time. From Anton's analysis, I would probably say it was time waiting for either the glibc malloc mutex or MySQL heap mutex. So it again makes little sense to me that this is idle time

Re: [QUICKLIST 0/4] Arch independent quicklists V2

2007-03-13 Thread Nick Piggin
Andrew Morton wrote: On Tue, 13 Mar 2007 22:06:46 +1100 Nick Piggin [EMAIL PROTECTED] wrote: Andrew Morton wrote: On Tue, 13 Mar 2007 19:03:38 +1100 Nick Piggin [EMAIL PROTECTED] wrote: ... Page allocator still requires interrupts to be disabled, which this doesn't. it is worthwhile

Re: SMP performance degradation with sysbench

2007-03-13 Thread Nick Piggin
Eric Dumazet wrote: On Tuesday 13 March 2007 12:12, Nick Piggin wrote: I guess googlemalloc (tcmalloc?) isn't suitable for a general purpose glibc allocator. But I wonder if there are other improvements that glibc can do here? I cooked a patch some time ago to speedup threaded apps and got

Re: [QUICKLIST 0/4] Arch independent quicklists V2

2007-03-13 Thread Nick Piggin
Andrew Morton wrote: On Tue, 13 Mar 2007 22:30:19 +1100 Nick Piggin [EMAIL PROTECTED] wrote: We don't actually have to zap_pte_range the entire page table in order to free it (IIRC we used to have to, before the 4lpt patches). I'm trying to remember why we ever would have needed to zero out

Re: SMP performance degradation with sysbench

2007-03-13 Thread Nick Piggin
Andrea Arcangeli wrote: On Tue, Mar 13, 2007 at 10:12:19PM +1100, Nick Piggin wrote: They'll be sleeping in futex_wait in the kernel, I think. One thread will hold the critical mutex, some will be off doing their own thing, but importantly there will be many sleeping for the mutex to become

Re: [QUICKLIST 0/4] Arch independent quicklists V2

2007-03-13 Thread Nick Piggin
Andrew Morton wrote: On Tue, 13 Mar 2007 23:01:11 +1100 Nick Piggin [EMAIL PROTECTED] wrote: Andrew Morton wrote: It would be interesting to look at a) leave the page full of random garbage if we're releasing the whole mm and b) return it straight to the page allocator. Well we have

Re: [RFC][PATCH 4/7] RSS accounting hooks over the code

2007-03-13 Thread Nick Piggin
Eric W. Biederman wrote: Nick Piggin [EMAIL PROTECTED] writes: Eric W. Biederman wrote: First touch page ownership does not guarantee give me anything useful for knowing if I can run my application or not. Because of page sharing my application might run inside the rss limit only because I

Re: [RFC][PATCH 4/7] RSS accounting hooks over the code

2007-03-13 Thread Nick Piggin
Balbir Singh wrote: Nick Piggin wrote: And strangely, this example does not go outside the parameters of what you asked for AFAIKS. In the worst case of one container getting _all_ the shared pages, they will still remain inside their maximum rss limit. When that does happen
