Srivatsa Vaddagiri wrote:
I think a potential area which VST may need to address is
scheduler load balance. If idle CPUs stop taking local timer ticks for
some time, then during that period it could cause the various runqueues to
go out of balance, since the idle CPUs will no longer pull tasks f
Srivatsa Vaddagiri wrote:
Hmm... I guess we could restrict the max time an idle CPU will sleep, taking
into account its balance interval. But whatever heuristics we follow to
maximize balance_interval of about-to-sleep idle CPU, don't we still run the
risk of idle cpu being woken up and going immedi
David Howells wrote:
Hugh Dickins <[EMAIL PROTECTED]> wrote:
Remove use of FIRST_USER_PGD_NR from sys_mincore: it's inconsistent (no
other syscall refers to it), unnecessary (sys_mincore loops over vmas
further down) and incorrect (misses user addresses in ARM's first pgd).
You should make it use
On Thu, 2005-04-07 at 18:08 -0700, Siddha, Suresh B wrote:
> On Thu, Apr 07, 2005 at 03:11:12AM +1000, Nick Piggin wrote:
> > Using the attached patch, a puny dual PIII-650 with ~400MB RAM swapped
> > itself to death after 2 infinite loop tasks had been pinned to one
> >
Srivatsa Vaddagiri wrote:
On Thu, Apr 07, 2005 at 05:10:24PM +0200, Ingo Molnar wrote:
Interaction with VST is not a big issue right now because this only matters
on SMP boxes which is a rare (but not unprecedented) target for embedded
platforms.
Well, I don't think VST is targeting just powe
Jens Axboe wrote:
On Wed, Mar 30 2005, Nick Piggin wrote:
So Kenneth if you could look into this one as well, to see if
it is worthwhile, that would be great.
For that to work, you have to change the get_io_context() allocation to
be GFP_ATOMIC.
Yes of course, thanks for picking that up.
I guess
Jens Axboe wrote:
On Fri, Apr 08 2005, Nick Piggin wrote:
I guess this isn't a problem, as io contexts should be allocated
comparatively rarely. It would be possible to move it out of the
lock though if we really want to.
Let's just keep it inside the lock, for the fast case it should just
Ingo Molnar wrote:
* Luck, Tony <[EMAIL PROTECTED]> wrote:
tested on x86, and all other arches should work as well, but if an
architecture has irqs-off assumptions in its switch_to() logic
it might break. (I haven't found any, but there may be such assumptions.)
The ia64_switch_to() code includes a s
Ingo Molnar wrote:
* Nick Piggin <[EMAIL PROTECTED]> wrote:
I did propose doing unconditionally unlocked switches a while back
when my patch first popped up - you were against it then, but I guess
you've had second thoughts?
the reordering of switch_to() and the switch_mm()-relate
Claudio Martins wrote:
On Tuesday 05 April 2005 03:12, Andrew Morton wrote:
Claudio Martins <[EMAIL PROTECTED]> wrote:
While stress testing 2.6.12-rc2 on an HP DL145 I get processes stuck
in D state after some time.
This machine is a dual Opteron 248 with 2GB (ECC) on one node (the
other node h
or the tip. I booted with nmi_watchdog=0 and was able to get a full
sysrq-t as well as a sysrq-m. Since it might be a little too big for the
list, I've put it on a text file at:
http://193.136.132.235/dl145/dump1-2.6.12-rc2.txt
I also made a run with the mempool-can-fail patch from Nick Pi
the lower zone
protection for DMA ends up as), however you are well above all the
"emergency watermarks" in ZONE_NORMAL. Also:
I also made a run with the mempool-can-fail patch from Nick Piggin. With this
I got some nice memory allocation errors from the md threads when the trouble
started. T
Nick Piggin wrote:
The common theme seems to be: try_to_free_pages, swap_writepage,
mempool_alloc, down/down_failed in .text.lock.md. Next I would suspect
md/raid1 - maybe some deadlock in an uncommon memory allocation
failure path?
I'll see if I can reproduce it here.
No luck yet (on SMP
Claudio Martins wrote:
Right. I'm using two Seagate ATA133 disks (IDE controller is AMD-8111) each
with 4 partitions, so I get 4 md Raid1 devices. The first one, md0, is for
swap. The rest are
~$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/md1    4.6G  1.9G  2.6
Paul E. McKenney wrote:
On Thu, Apr 07, 2005 at 05:58:40PM +1000, Nick Piggin wrote:
OK thanks for the good explanation. So I'll keep it as is for now,
and whatever needs cleaning up later can be worked out as it comes
up.
Looking forward to the split of synchronize_kernel() into synchroniz
On Tue, 2005-04-12 at 01:22 +0100, Claudio Martins wrote:
> On Monday 11 April 2005 23:59, Nick Piggin wrote:
> >
> > > OK, I'll try them in a few minutes and report back.
> >
> > I'm not overly hopeful. If they fix the problem, then it's likely
>
On Mon, 2005-04-11 at 18:06 -0700, David Mosberger wrote:
> I had to refresh my memory with a quick Google search that netted [1]
> (look for "Disable interrupts during context switch"). Actually, it
> wasn't really a deadlock, but rather a livelock, since a CPU got stuck
> on an infinite page-no
Andrew Morton wrote:
So it turns out that patch was broken. I've fixed it locally and the
results are good, but odd.
The machine is a 4GB x86_64 with aic79xx controllers and MAXTOR
ATLAS10K4_73WLS disks. ext2 filesystem.
The workload is continuous pagecache writeback versus
read-lots-of-little-fi
On Mon, 2005-04-11 at 23:19 -0700, Andrew Morton wrote:
> Nick Piggin <[EMAIL PROTECTED]> wrote:
> >
> > >- The effects of tcq on AS are much less disastrous than I thought they
> > > were. Do I have the wrong workload? Memory fails me. Or did we fix
>
Nick Piggin wrote:
Chen, Kenneth W wrote:
I like the patch a lot and already did bench it on our db setup. However,
I'm seeing a negative regression compared to a very very crappy patch (see
attached, you can laugh at me for doing things like that :-).
OK - if we go that way, perhap
Nick Piggin wrote:
Nick Piggin wrote:
Chen, Kenneth W wrote:
I like the patch a lot and already did bench it on our db setup. However,
I'm seeing a negative regression compared to a very very crappy patch (see
attached, you can laugh at me for doing things like that :-).
OK - if we go tha
Andrew, please consider.
Ken, you'll probably have something similar to this if you
were following various random threads closely and picking
out my various random patches ;)
--
SUSE Labs, Novell Inc.
1/9
__GFP_ZERO really shouldn't tempt fate.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/include/linux/gfp.h
===
--- linux-2.6.orig/include/linux/gfp.h 2005-04-12 22:05:4
6/9
get_request_wait needn't unplug the device immediately.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/block/ll_rw_blk.c
===
--- linux-2.6.orig/drivers/block/ll_rw_bl
ue. This is reported to help efficiency.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/block/ll_rw_blk.c
===
--- linux-2.6.orig/drivers/block/ll_rw_blk.c 2005-04-12 22:26:14.0 +1000
+
_harder' if it hits the page allocator.
So if allocation still fails, then we can probably afford to hit the
pool->lock - and what's the alternative? Try page reclaim and hit
zone->lru_lock?
Signed-off-by: Nick Piggin <[EMAIL
2/9
Mempools have 2 problems.
The first is that mempool_alloc can possibly get stuck in __alloc_pages
when it should opt to fail, and take an element from its reserved pool.
The second is that it will happily eat emergency PF_MEMALLOC reserves
instead of going to the
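A minimal user-space sketch of the first point, assuming a pool that tries a fallible allocator first and only then dips into its reserve. The names mempool_sketch, mempool_alloc_sketch, mempool_free_sketch and always_fail are hypothetical and are not the kernel's mempool API; the real mempool_alloc would also sleep until an element is returned rather than give up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN 2                     /* hypothetical reserve size */

struct mempool_sketch {
    void  *reserve[POOL_MIN];          /* preallocated emergency elements */
    int    nr_free;                    /* how many reserve slots are filled */
    void *(*alloc)(size_t size);       /* may return NULL, must not block forever */
    size_t elem_size;
};

/* An allocator that always fails, to model memory pressure. */
void *always_fail(size_t size)
{
    (void)size;
    return NULL;
}

/* Try the underlying allocator; on failure take a reserved element. */
void *mempool_alloc_sketch(struct mempool_sketch *pool)
{
    assert(pool != NULL);
    void *elem = pool->alloc(pool->elem_size);
    if (elem)
        return elem;
    if (pool->nr_free > 0)
        return pool->reserve[--pool->nr_free];
    return NULL;                       /* a real mempool would wait here */
}

/* Refill the reserve first; only free to the system once it is full. */
void mempool_free_sketch(struct mempool_sketch *pool, void *elem)
{
    assert(pool != NULL);
    if (pool->nr_free < POOL_MIN) {
        pool->reserve[pool->nr_free++] = elem;
        return;
    }
    free(elem);
}
```

The point of the ordering is that the allocator call is allowed to fail quickly, so the reserve is reached instead of the caller getting stuck in the allocator.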
sts in flight. They
will wake up waiters when they are retired.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/block/ll_rw_blk.c
===
--- linux-2.6.orig/drivers/block/ll_rw_blk.c 2005-04-12 22:05:
5/9
Sprinkle around a few branch hints in the block layer.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/block/ll_rw_blk.c
===
--- linux-2.6.orig/drivers/block/ll_rw_blk.c
8/9
Change around locking a bit for a result of 1-2 fewer spin lock/unlock
pairs in request submission paths.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/block/ll_rw
.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/mm/swap_state.c
===
--- linux-2.6.orig/mm/swap_state.c 2005-04-12 22:05:44.0 +1000
+++ linux-2.6/mm/swap_state.c 2005-04-12 22:26:12.0
Chen, Kenneth W wrote:
On Tue, Apr 12 2005, Nick Piggin wrote:
Actually the patches I have sent you do fix real bugs, but they also
make the block layer less likely to recurse into page reclaim, so it
may be eg. hiding the problem that Neil's patch fixes.
Jens Axboe wrote on Tuesday, Apr
Andrew Morton wrote:
Nick Piggin <[EMAIL PROTECTED]> wrote:
#define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
- __GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
- __GFP_NOFAIL|__GFP_NORETRY|__GFP_NO_GROW|__GF
Andrew Morton wrote:
Index: linux-2.6/include/linux/gfp.h
===
--- linux-2.6.orig/include/linux/gfp.h 2005-04-12 22:26:10.0 +1000
+++ linux-2.6/include/linux/gfp.h 2005-04-12 22:26:11.0 +1000
@@ -38,14 +38,16 @@ s
Andrew Morton wrote:
Nick Piggin <[EMAIL PROTECTED]> wrote:
PF_MEMALLOC is really not a tool for tinkering. It is pretty specifically
used to prevent recursion into page reclaim, and to prevent low memory
deadlocks.
The mm/swap_state.c code was the only legitimate tinkerer. Its conce
Andrew Morton wrote:
Nick Piggin <[EMAIL PROTECTED]> wrote:
get_request_wait needn't unplug the device immediately.
Probably. But what if the get_request(q, rw, GFP_NOIO); did
some sleeping?
It can't sleep unless it returns the request, because it
is using mempool allocs. So any
Chen, Kenneth W wrote:
Nick Piggin wrote on Tuesday, April 12, 2005 4:09 AM
Chen, Kenneth W wrote:
I like the patch a lot and already did bench it on our db setup. However,
I'm seeing a negative regression compared to a very very crappy patch (see
attached, you can laugh at me for doing t
Claudio Martins wrote:
On Tuesday 12 April 2005 01:46, Andrew Morton wrote:
Claudio Martins <[EMAIL PROTECTED]> wrote:
I think I'm going to give a try to Neil's patch, but I'll have to apply
some patches from -mm.
Just this one if you're using 2.6.12-rc2:
--- 25/drivers/md/md.c~avoid-deadlock-in-s
David Mosberger wrote:
On Tue, 12 Apr 2005 12:12:45 +1000, Nick Piggin <[EMAIL PROTECTED]> said:
>> Now, Ingo says that the order is reversed with his patch, i.e.,
>> switch_mm() happens after switch_to(). That means flush_tlb_mm()
>> may now see a current->ac
Siddha, Suresh B wrote:
On Wed, Apr 13, 2005 at 10:08:28PM +0200, Ingo Molnar wrote:
* Siddha, Suresh B <[EMAIL PROTECTED]> wrote:
- for_each_domain(target_cpu, sd) {
+ for_each_domain(target_cpu, sd)
if ((sd->flags & SD_LOAD_BALANCE) &&
- cpu_isse
Jesper Juhl wrote:
There are two expressions in kernel/sched.c that are always false since
they test for <0 but the result of the expression is unsigned so they will
never be less than zero. This patch implements the logic that I believe is
intended without the signedness issue and without the na
Jesper Juhl wrote:
On Fri, 15 Apr 2005, Nick Piggin wrote:
Jesper Juhl wrote:
There are two expressions in kernel/sched.c that are always false since they
test for <0 but the result of the expression is unsigned so they will never
be less than zero. This patch implements the logic that I beli
Jesper Juhl wrote:
As per this patch perhaps? :
Thanks. I'll make sure it gets to the right place if nobody picks it up.
Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
--- linux-2.6.12-rc2-mm3-orig/kernel/sched.c 2005-04-11 21:20:56.0 +0200
+++ linux-2.6.12-rc2-mm3/kernel/sched.c 2005-04-
On Fri, 2005-04-15 at 12:59 +1000, Herbert Xu wrote:
> Jesper Juhl <[EMAIL PROTECTED]> wrote:
> >
> > - if (unlikely((long long)now - prev->timestamp < 0))
> > + if (unlikely(((long long)now - (long long)prev->timestamp)
> > < 0))
>
> You can write this as
>
> (long l
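The signedness issue in this thread can be reproduced in a small stand-alone sketch. It assumes both timestamps are 64-bit unsigned values; the function names are hypothetical, and the signed form mirrors the "cast both operands" variant quoted above rather than any final version of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned difference: wraps modulo 2^64, so it can never be negative. */
uint64_t delta_unsigned(uint64_t now, uint64_t prev)
{
    return now - prev;
}

/* Signed difference: cast both operands before subtracting.
 * Assumes both values fit in a signed 64-bit integer. */
long long delta_signed(uint64_t now, uint64_t prev)
{
    assert(now <= (uint64_t)INT64_MAX && prev <= (uint64_t)INT64_MAX);
    return (long long)now - (long long)prev;
}

/* Returns 1 if the clock appears to have gone backwards. */
int clock_went_backwards(uint64_t now, uint64_t prev)
{
    return delta_signed(now, prev) < 0;
}
```

With now = 5 and prev = 10, delta_unsigned() yields UINT64_MAX - 4 while delta_signed() yields -5, which is why a "< 0" test on the unsigned expression is always false.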
On Fri, 2005-04-15 at 12:48 +0800, Michael Deegan wrote:
> Hi folks,
>
> I noticed something unusual on my home desktop machine (K6II, 448M RAM, runs
> KDE, samba, nfsd. 2.6.12-rc2 on Debian sarge). The machine seems to feel
> slightly sluggish; it seems to swap a fair bit more than it did under
>
On Thu, 2005-04-14 at 22:20 -0700, Randy.Dunlap wrote:
> On Fri, 15 Apr 2005 14:59:05 +1000 Nick Piggin wrote:
>
> | On Fri, 2005-04-15 at 12:48 +0800, Michael Deegan wrote:
> | > Hi folks,
> | >
> | > I noticed something unusual on my home desktop machine (K6II,
esh Siddha <[EMAIL PROTECTED]>
Catch more (hopefully all) cases.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/kernel/sched.c
===
--- linux-2.6.orig/kernel/sched.c 2005-04-15 22:52:25.0 +1000
Siddha, Suresh B wrote:
On Fri, Apr 15, 2005 at 11:03:20PM +1000, Nick Piggin wrote:
Index: linux-2.6/kernel/sched.c
===
--- linux-2.6.orig/kernel/sched.c 2005-04-15 22:52:25.0 +1000
+++ linux-2.6/kernel/sched.c 2005
the mapping it needs for good cache performance,
and as well do_wp_page is now able to always correctly detect and
optimise zero page COW faults.
This change is required in order to be able to detect whether a pte
points to a ZERO_PAGE using only its (pte, vaddr) pair.
Signed-off-by: Nick Piggin
Hi,
I'll be looking to send these off to Andrew after 2.6.14 opens,
with the aim of having them merged by 2.6.15 hopefully.
It doesn't look like they'll be able to easily free up a page
flag for 2 reasons. First, PageReserved will probably be kept
around for at least one release. Second, swsusp
wsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not. This
still needs to be addressed.
Many thanks to Hugh Dickins for input.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/include
Daniel Phillips wrote:
On Sunday 07 August 2005 13:28, Nick Piggin wrote:
If anyone has an issue with the patches or my merge plan, let's
get some discussion going.
You forgot to mention what replaces PageReserved: the VM_RESERVED vma flag,
which is now added to the whole zap_pte
Nigel Cunningham wrote:
Hi.
On Tue, 2005-08-09 at 07:09, Daniel Phillips wrote:
It doesn't look like they'll be able to easily free up a page
flag for 2 reasons. First, PageReserved will probably be kept
around for at least one release. Second, swsusp and some arch
code (ioremap) wants to know
Nigel Cunningham wrote:
Hi Nick et al.
On Tue, 2005-08-09 at 14:59, Nick Piggin wrote:
Changing the e820 code so it sets PageNosave instead of PageReserved,
along with a couple of modifications in swsusp itself should get rid of
the swsusp dependency.
That would work for swsusp, but there
Russell King wrote:
On Tue, Aug 09, 2005 at 02:59:53PM +1000, Nick Piggin wrote:
That would work for swsusp, but there are other users that want to
know if a struct page is valid ram (eg. ioremap), so in that case
swsusp would not be able to mess with the flag.
The usage of "valid ram&
Arjan van de Ven wrote:
On Tue, 2005-08-09 at 08:08 +0100, Russell King wrote:
Can we straighten out the terminology so it's less confusing please?
And can we make a general page_is_ram() function that does what it
says? on x86 it can go via the e820 table, other architectures can do
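As a rough illustration of the page_is_ram() idea, here is a sketch that walks a firmware-style memory map. Everything in it is an assumption made for the example: the mem_region layout, the region types, the 4K page size and the name page_is_ram_sketch are hypothetical and do not reflect the real e820 structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12           /* assume 4K pages */

enum region_type { REGION_RAM = 1, REGION_RESERVED = 2 };

struct mem_region {
    uint64_t start;                    /* physical start address */
    uint64_t size;                     /* length in bytes */
    enum region_type type;
};

/* Return 1 if the whole page for this pfn lies inside a RAM region. */
int page_is_ram_sketch(const struct mem_region *map, size_t n, uint64_t pfn)
{
    assert(map != NULL || n == 0);
    uint64_t addr = pfn << PAGE_SHIFT_SKETCH;
    uint64_t page_end = addr + (1ULL << PAGE_SHIFT_SKETCH);

    for (size_t i = 0; i < n; i++) {
        uint64_t end = map[i].start + map[i].size;
        if (map[i].type == REGION_RAM &&
            addr >= map[i].start && page_end <= end)
            return 1;
    }
    return 0;
}
```

A lookup like this answers "is this RAM?" from the memory map alone, without consulting a per-page flag such as PageReserved.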
Benjamin Herrenschmidt wrote:
I have no problem keeping PG_reserved for that, and _ONLY_ for that.
(though i'd rather see it renamed then). I'm just afraid by doing so,
some drivers will jump in the gap and abuse it again...
Sure it would be renamed (better yet may be a slower page_is_valid()
Arjan van de Ven wrote:
On Tue, 2005-08-09 at 19:31 +1000, Nick Piggin wrote:
Arjan van de Ven wrote:
And can we make a general page_is_ram() function that does what it
says? on x86 it can go via the e820 table, other architectures can do
whatever they need
That would be very
Hugh Dickins wrote:
You're right (though I imagine might sometimes be holes rather than RAM).
Yep. These holes are what I have in mind, and random other things
like the !(bad_ppro && page_kills_ppro(pfn)) check.
[...]
I think Nick is treating the "use" of PageReserved in ioremap much too
Hugh Dickins wrote:
On Tue, 9 Aug 2005, Nick Piggin wrote:
But in either case: I agree that it is probably not a great loss
to remove the check, although considering it will be needed for
swsusp anyway...
swsusp (and I think crashdump has a similar need) is a very different
case: it
Siddha, Suresh B wrote:
For example, let's take two nodes each having two physical packages. And
assume that there are two tasks and both of them are on (may or may not be
pinned) two packages in node-0.
Today's load balance will detect that there is an imbalance between the
two nodes and will try
Siddha, Suresh B wrote:
On Tue, Aug 09, 2005 at 03:19:58PM -0700, Martin J. Bligh wrote:
--On Tuesday, August 09, 2005 15:03:32 -0700 "Siddha, Suresh B" <[EMAIL PROTECTED]> wrote:
Balance on clone makes some sort of sense, since you know they're not
going to exec afterwards. We've thrashed t
On Tue, 2005-08-09 at 19:03 -0700, Siddha, Suresh B wrote:
> On Wed, Aug 10, 2005 at 10:27:44AM +1000, Nick Piggin wrote:
> > Yeah this makes sense. Thanks.
> >
> > I think we'll only need your first line change to fix this, though.
> >
> > Your second chang
Benjamin Herrenschmidt wrote:
On Tue, 2005-08-09 at 20:41 +0100, Russell King wrote:
On Tue, Aug 09, 2005 at 07:38:52AM -0700, Martin J. Bligh wrote:
pfn_valid() doesn't tell you it's RAM or not - it tells you whether you
have a backing struct page for that address. Could be an IO mapped devi
This is my second attempt at a lockless pagecache.
Patches are against 2.6.13-rc6, and have had reasonable
stressing (albeit on small SMPs).
Main changes since last seen:
* Code clarity and commenting improvement.
* Fix race where multiple concurrent failed speculative
reference takers could
1/7
This rollup is a patchset all on its own. There is
a recent thread on linux-kernel if it interests you.
Required by lockless pagecache for consistent page
refcounting
Index: linux-2.6/mm/rmap.c
===
--
3/7
If we can be sure that elevating the page_count on a pagecache
page will pin it, we can speculatively run this operation, and
subsequently check to see if we hit the right page rather than
relying on holding a lock or otherwise pinning a reference to
the page.
This
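The speculative reference described in 3/7 can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: struct page_sketch, slot_t and the three function names are hypothetical stand-ins, a single atomic slot replaces the radix tree, and the page structures are assumed never to be unmapped while a lookup runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct page_sketch {
    atomic_int count;                  /* 0 means the page is free */
};

/* One pagecache slot; stands in for a radix tree slot. */
typedef _Atomic(struct page_sketch *) slot_t;

/* Take a reference only if the count is already non-zero. */
int get_page_unless_zero_sketch(struct page_sketch *page)
{
    int old = atomic_load(&page->count);
    while (old != 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
            return 1;
    }
    return 0;
}

void put_page_sketch(struct page_sketch *page)
{
    int old = atomic_fetch_sub(&page->count, 1);
    assert(old > 0);
    (void)old;
}

/* Lockless lookup: pin first, then check we pinned the right page. */
struct page_sketch *find_get_page_sketch(slot_t *slot)
{
    for (;;) {
        struct page_sketch *page = atomic_load(slot);
        if (page == NULL)
            return NULL;
        if (!get_page_unless_zero_sketch(page))
            continue;                  /* page is being freed: retry */
        if (atomic_load(slot) == page)
            return page;               /* still in the slot: reference is valid */
        put_page_sketch(page);         /* slot changed under us: drop and retry */
    }
}
```

The re-check after taking the reference is what replaces holding a lock: once the count is elevated the page cannot be freed, so a matching slot means the lookup hit the right page.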
2/7
In a future patch we can no longer rely on page_count being stable at any
time, so we can no longer overload PagePrivate && page_count == 0 to mean
the page is free and on the buddy lists.
Index: linux-2.6/include/linux/page-flags.h
4/7
Required by lockless pagecache in order to get a pointer
to a pagecache struct page.
From: Hans Reiser <[EMAIL PROTECTED]>
Reiser4 uses radix trees to solve a trouble reiser4_readdir has serving nfs
requests.
Unfortunately, radix tree api lacks an operation suita
5/7
Make radix tree lookups safe to be performed without locks.
Readers are protected against nodes being deleted by using RCU
based freeing. Readers are protected against new node insertion
by using memory barriers to ensure the node itself will be
properly written bef
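The insertion side of 5/7 (making sure a node is fully written before readers can see it) can be illustrated with C11 release/acquire ordering. This is a sketch under assumptions, not the radix tree code: node_sketch, node_slot_t and both function names are hypothetical, and the RCU-deferred freeing that protects readers against deletion is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node_sketch {
    int key;
    int value;
};

typedef _Atomic(struct node_sketch *) node_slot_t;

/* Writer: initialise the node fully, then publish the pointer.
 * The release store orders the field writes before the pointer write. */
void publish_node_sketch(node_slot_t *slot, struct node_sketch *n,
                         int key, int value)
{
    assert(slot != NULL && n != NULL);
    n->key = key;
    n->value = value;
    atomic_store_explicit(slot, n, memory_order_release);
}

/* Reader: lockless lookup. The acquire load pairs with the release
 * store, so a non-NULL pointer implies initialised fields. */
int lookup_value_sketch(node_slot_t *slot, int key, int *out)
{
    struct node_sketch *n = atomic_load_explicit(slot, memory_order_acquire);
    if (n == NULL || n->key != key)
        return 0;
    *out = n->value;
    return 1;
}
```

A reader either sees no node or a completely initialised one; it never observes the pointer before the fields it points to.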
6/7
Use the speculative get_page and the lockless radix tree lookups
to introduce lockless page cache lookups (ie. no mapping->tree_lock).
The only atomicity change this should introduce is the use of a
non-atomic pagevec lookup for truncate, however what atomicity
gu
7/7
With practically all the read locks gone from mapping->tree_lock,
convert the lock from an rwlock back to a spinlock.
The remaining locks, including the read locks, mainly deal with IO
submission and not the lookup fastpaths.
Index: linux-2.6/fs/buffer.c
Hi Pekka,
Pekka Enberg wrote:
Hi Nick,
On 8/11/05, Nick Piggin <[EMAIL PROTECTED]> wrote:
+unsigned find_get_pages_nonatomic(struct address_space *mapping, pgoff_t start,
+ unsigned int nr_pages, struct page **pages)
+{
+ unsigned int i;
+ unsign
Siddha, Suresh B wrote:
On Thu, Aug 11, 2005 at 01:09:10PM +1000, Nick Piggin wrote:
I have a variation on the 2nd part of your patch which I think
I would prefer. IMO it kind of generalises the current imbalance
calculation to handle this case rather than introducing a new
special case
Siddha, Suresh B wrote:
On Fri, Aug 12, 2005 at 09:49:36AM +1000, Nick Piggin wrote:
Well, it is a departure from our current idea of balancing.
That idea is already changing from the first line of the patch.
And the change is "allowing the load to grow up to the sched
group's
On Thu, 2005-08-11 at 18:37 -0700, Paul E. McKenney wrote:
> On Thu, Aug 11, 2005 at 10:25:47PM +1000, Nick Piggin wrote:
> > 5/7
> >
> > --
> > SUSE Labs, Novell Inc.
> >
>
> > Make radix tree lookups safe to be performed without locks.
> > Read
On Thu, 2005-08-11 at 18:49 -0700, Paul E. McKenney wrote:
> On Thu, Aug 11, 2005 at 10:28:04PM +1000, Nick Piggin wrote:
> > 6/7
> >
> > --
> > SUSE Labs, Novell Inc.
> >
>
> > Use the speculative get_page and the lockless radix tree lookups
> &
Nick Piggin wrote:
With the above, we can meet the same requirements of the current
find_get_page. Which basically are:
x) If the page was ever[1] in pagecache, it may be returned
y) If the pagecache was ever[2] empty, NULL may be returned
Oh, I missed a couple of "obvious" o
George Anzinger wrote:
The NMI entry and exit code fiddles with bits in the preempt count. If
an NMI happens while some other code is doing the same, bits will be
lost. This patch removes this modify code from the NMI path till we can
come up with something better.
Humour me for a minute
Russell King wrote:
On Fri, Aug 12, 2005 at 08:21:45PM +0200, [EMAIL PROTECTED] wrote:
@@ -632,10 +632,11 @@ static inline int page_mapped(struct pag
* Used to decide whether a process gets delivered SIGBUS or
* just gets major/minor fault counters bumped up.
*/
-#define VM_FAULT_OOM (-1)
Dinakar Guniguntala wrote:
Here's an attempt at dynamic sched domains aka isolated cpusets
Very good, I was wondering when someone would try to implement this ;)
It needs some work. A few initial comments on the kernel/sched.c change
- sorry, don't have too much time right now...
--- linux-2.6.12-r
On Tue, 2005-04-19 at 11:07 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2005-04-18 at 14:38 -0500, Linas Vepstas wrote:
> >
> > Hi,
> >
> > The patch below appears to fix a problem where a number of dead processes
> > linger on the system. On a highly loaded system, dozens of processes
> > w
On Mon, 2005-04-18 at 22:54 -0700, Paul Jackson wrote:
> Now, onto the real stuff.
>
> This same issue, in a strange way, comes up on the memory side,
> as well as on the cpu side.
>
> First, let me verify one thing. I understand that the _key_
> purpose of your patch is not so much to isolate
On Mon, 2005-04-18 at 23:59 -0700, Paul Jackson wrote:
> Nick wrote:
> > Basically you just have to know that it has the
> > capability to partition the system in an arbitrary disjoint set
> > of sets of cpus.
> >
> > If you can make use of that, then we're in business ;)
>
> You read fast ;)
>
On Tue, 2005-04-19 at 00:19 -0700, Paul Jackson wrote:
> Nick wrote:
> > It doesn't work if you have *most* jobs bound to either
> > {0, 1, 2, 3} or {4, 5, 6, 7} but one which should be allowed
> > to use any CPU from 0-7.
>
> How bad does it not work?
>
> My understanding is that Dinakar's patch
On Tue, 2005-04-19 at 09:23 +0200, Yann Dupont wrote:
> Lukas Hejtmanek wrote:
> >Btw, are you using some TCP tweaks? E.g. I have default TCP window size 1MB.
> >
> >
> >
> Do you have turned NAPI on ??? I tried without it off on e1000 and ...
> surprise !
> Don't have any messages since 12H
On Tue, 2005-04-19 at 10:15 +0200, Yann Dupont wrote:
> Nick Piggin wrote:
>
> >
> >>Do you have turned NAPI on ??? I tried without it off on e1000 and ...
> >>surprise !
> >>Don't have any messages since 12H now (usually I got those in less than 1
On Wed, 2005-04-20 at 16:40 +0900, Tejun Heo wrote:
> Hello, Jens.
>
> On Wed, Apr 20, 2005 at 08:30:10AM +0200, Jens Axboe wrote:
> > Do it on requeue, please - not on the initial spotting of the request.
>
> This is the reworked version of the patch. It sets REQ_SOFTBARRIER
> in two places -
Jens Axboe wrote:
On Wed, Apr 20 2005, Tejun Heo wrote:
Well, yeah, all schedulers have dispatch queue (noop has only the
dispatch queue) and use them to defer/requeue, so no reordering will
happen, but I'm not sure they are required to be like this or just
happen to be implemented so.
Precisely,
Jens Axboe wrote:
On Wed, Apr 20 2005, Nick Piggin wrote:
I guess this could be one use of 'reordering' after a requeue.
Yeah, or perhaps the io scheduler might determine that a request has
higher prio than a requeued one. I'm not sure what semantics to place
I guess this is
On Wed, 2005-04-20 at 10:55 -0600, jmerkey wrote:
>
> For 3Ware, you need to change the queue depths, and you will see
> dramatically improved performance. 3Ware can take requests
> a lot faster than Linux pushes them out. Try changing this instead, you
> won't be going to sleep all the time wait
condition.
BUG sighted on a 2-way Itanium2 system with 16K PAGE_SIZE running
fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2
where $DIR is a new ext2 filesystem with 4K blocks that is quite
small (causing get_block to fail often with -ENOSPC).
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
light. __mpage_writepage BUGs on this condition.
BUG sighted on a 2-way Itanium2 system with 16K PAGE_SIZE running
fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2
where $DIR is a new ext2 filesystem with 4K blocks that is quite
small (causing get_block to fail often with -ENOSPC).
Signed-off-by: Nick P
On Thu, 2005-04-21 at 08:01 +0100, Anton Altaparmakov wrote:
> Any reason why you left the goto out? It would be IMO much cleaner to
> remove the label "out" altogether and replace the single "goto out" with a
> "break" (which is fine since the goto happens inside the for loop
> immediately af
On Thu, 2005-04-21 at 08:10 +0100, Anton Altaparmakov wrote:
> And one more thing...
>
> On Thu, 2005-04-21 at 08:01 +0100, Anton Altaparmakov wrote:
> > On Thu, 21 Apr 2005, Nick Piggin wrote:
> > > ... I somehow didn't send it to Andrew last time.
Hi Andrew,
If you're feeling like -mm is getting too stable, then you might
consider giving these patches a spin? (unless anyone else raises
an objection).
Ben thought I should get moving with them soon.
Not much change from last time. A bit of ppc64 input from Ben,
and some rmap.c input from H
1/6
Just be clear that VM_RESERVED pages here are a bug, and the test
is not there because they are expected.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/mm/rmap.c
===
--- linux-2.6.orig/mm/rmap.c
+++ lin
2/6
Microoptimise page_add_anon_rmap. Although these expressions are used only
in the taken branch of the if() statement, the compiler can't reorder them
inside because atomic_inc_and_test is a barrier.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.
eing freed).
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -329,7 +329,7 @@ static inline void free_pages_check(cons