On Tue, Nov 18, 2008 at 10:03:25AM +1100, Benjamin Herrenschmidt wrote:
On Mon, 2008-11-17 at 15:34 -0500, Steven Rostedt wrote:
Note, I was using a default config that had CONFIG_IRQSTACKS off and
CONFIG_PPC_64K_PAGES on.
For one, we definitely need to turn IRQSTACKS on by default ... In
On Tuesday 18 November 2008 09:53, Paul Mackerras wrote:
I'd love to be able to use a 4k base page size if I could still get
the reduction in page faults and the expanded TLB reach that we get
now with 64k pages. If we could allocate the page cache for large
files with order-4 allocations
On Tuesday 18 November 2008 13:08, Linus Torvalds wrote:
On Tue, 18 Nov 2008, Paul Mackerras wrote:
Also, you didn't respond to my comments about the purely software
benefits of a larger page size.
I realize that there are benefits. It's just that the downsides tend to
swamp the upsides.
* Christoph Hellwig [EMAIL PROTECTED] wrote:
On Tue, Nov 18, 2008 at 10:03:25AM +1100, Benjamin Herrenschmidt wrote:
On Mon, 2008-11-17 at 15:34 -0500, Steven Rostedt wrote:
Note, I was using a default config that had CONFIG_IRQSTACKS off and
CONFIG_PPC_64K_PAGES on.
For one, we
Nick Piggin writes:
It's much harder to do this with powerpc I think because they would need
to calculate 8 hashes and touch 8 cachelines to prefill 8 translations,
wouldn't they?
Yeah, the hashed page table sucks. Film at 11, as they say. :)
Paul.
On Tue, 18 Nov 2008, Nick Piggin wrote:
The fact is, Intel (and to a lesser degree, AMD) has shown how hardware
can do good TLB's with essentially gang lookups, giving almost effective
page sizes of 32kB with hardly any of the downsides. Couple that with
It's much harder to do this
I've been hitting stack overflows on a PPC64 box, so I ran the ftrace
stack_tracer and part of the problem with that box is that it can nest
interrupts too deep. But what also worries me is that there's some heavy
hitters of stacks in generic code. Namely the fs directory has some.
Here's the
On Mon, 17 Nov 2008, Steven Rostedt wrote:
Note, I was using a default config that had CONFIG_IRQSTACKS off and
CONFIG_PPC_64K_PAGES on.
Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that
softirqs still use the same stack as the process.
[EMAIL PROTECTED] ~ cat
On Mon, 17 Nov 2008, Steven Rostedt wrote:
45) 4992 1280 .block_read_full_page+0x23c/0x430
46) 3712 1280 .do_mpage_readpage+0x43c/0x740
Ouch.
Notice at lines 45 and 46 the stack usage of block_read_full_page and
do_mpage_readpage. They each use 1280 bytes of stack!
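
For readers following the numbers, here is a minimal sketch of how rows of the stack_tracer output could be read mechanically. It assumes each row has the shape "index) depth size location", as in the i386 listing elsewhere in this thread, and that the two rows quoted above carry depths of 4992 and 3712 with a size of 1280 each (inferred from the "1280 bytes" remark). The names parse_row and heavy_frames are hypothetical helpers, not part of ftrace.

```python
import re

# One stack_trace row: "<index>) <depth> <size> <location>".
# Assumption: the columns are separated by whitespace.
ROW = re.compile(r"^\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_row(line):
    """Return (index, depth, size, location), or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    idx, depth, size, loc = m.groups()
    return int(idx), int(depth), int(size), loc


def heavy_frames(lines, threshold=1024):
    """List (location, size) for every frame whose size is at least threshold bytes."""
    rows = [r for r in map(parse_row, lines) if r is not None]
    return [(loc, size) for _, _, size, loc in rows if size >= threshold]


sample = [
    "45) 4992 1280 .block_read_full_page+0x23c/0x430",
    "46) 3712 1280 .do_mpage_readpage+0x43c/0x740",
]

for loc, size in heavy_frames(sample):
    print(f"{size:5d} bytes  {loc}")
```

The depth column is cumulative, so the difference between the depths of two adjacent rows equals the size of the upper frame: 4992 - 3712 = 1280 here.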
On Mon, Nov 17, 2008 at 11:18 PM, Linus Torvalds
[EMAIL PROTECTED] wrote:
I do wonder just _what_ it is that causes the stack frames to be so
horrid. For example, you have
18) 8896 160 .kmem_cache_alloc+0xfc/0x140
and I'm looking at my x86-64 compile, and it has a stack
On Mon, 17 Nov 2008 13:23:23 -0800 (PST)
Linus Torvalds [EMAIL PROTECTED] wrote:
On Mon, 17 Nov 2008, Andrew Morton wrote:
Far be it from me to apportion blame, but THIS IS ALL LINUS'S FAULT! :)
I fixed this six years ago. See http://lkml.org/lkml/2002/6/17/68
Btw, in that
On Mon, 17 Nov 2008, Andrew Morton wrote:
Yup. That being said, the younger me did assert that this is a neater
implementation anyway. If we can implement those loops without
needing those on-stack temporary arrays then things probably are better
overall.
Sure, if it actually ends up
Steven Rostedt writes:
Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that
softirqs still use the same stack as the process.
They shouldn't. I don't see do_softirq in the trace, though. Which
functions did you think would be run in a softirq? It looks to me
like the deepest
Linus Torvalds writes:
The ppc people run databases, and they don't care about sane people
And HPC apps, and all sorts of other things...
telling them the big pages suck. It's made worse by the fact that they
also have horribly bad TLB fills on their broken CPU's, and years and
Taking
On Mon, 2008-11-17 at 15:34 -0500, Steven Rostedt wrote:
Note, I was using a default config that had CONFIG_IRQSTACKS off and
CONFIG_PPC_64K_PAGES on.
For one, we definitely need to turn IRQSTACKS on by default ... In fact,
I'm pondering just removing the option.
Cheers,
Ben.
On Mon, 2008-11-17 at 15:59 -0500, Steven Rostedt wrote:
On Mon, 17 Nov 2008, Steven Rostedt wrote:
Note, I was using a default config that had CONFIG_IRQSTACKS off and
CONFIG_PPC_64K_PAGES on.
Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that
softirqs still use the
Well, it's not unacceptable on good CPU's with 4kB blocks (just an 8-entry
array), but as you say:
On PPC64 I'm told that the page size is 64K, which makes the above equal
to: 64K / 512 = 128 multiply that by 8 byte words, we have 1024 bytes.
Yeah. Not good. I think 64kB pages are
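
The arithmetic quoted above can be checked with a short sketch. It assumes, as the message does, one array entry per 512-byte block in a page and 8 bytes per entry; the function name on_stack_array_bytes is hypothetical and only illustrates the calculation.

```python
SECTOR_SIZE = 512  # smallest block size, per the "/ 512" in the message
WORD_SIZE = 8      # bytes per array entry on a 64-bit machine


def on_stack_array_bytes(page_size, sector_size=SECTOR_SIZE, word_size=WORD_SIZE):
    """Return (entries, bytes) for an array with one word per block in a page."""
    entries = page_size // sector_size
    return entries, entries * word_size


for page_size in (4 * 1024, 64 * 1024):
    entries, nbytes = on_stack_array_bytes(page_size)
    print(f"{page_size // 1024}K pages: {entries} entries, {nbytes} bytes")
```

With 4K pages this gives the 8-entry array mentioned above (64 bytes); with 64K pages it gives 128 entries and 1024 bytes, which accounts for most of a 1280-byte frame.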
On Tue, 18 Nov 2008 10:13:16 +1100
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
Well, it's not unacceptable on good CPU's with 4kB blocks (just an 8-entry
array), but as you say:
On PPC64 I'm told that the page size is 64K, which makes the above equal
to: 64K / 512 = 128
On Tue, 18 Nov 2008, Benjamin Herrenschmidt wrote:
Guess who is pushing for larger page sizes nowadays ? Embedded
people :-) In fact, we have patches submitted on the list to offer the
option for ... 256K pages on some 44x embedded CPUs :-)
It makes some sort of sense I suppose on very
On Tue, 18 Nov 2008, Paul Mackerras wrote:
Steven Rostedt writes:
Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that
softirqs still use the same stack as the process.
They shouldn't. I don't see do_softirq in the trace, though. Which
functions did you think would
On Mon, 2008-11-17 at 15:34 -0500, Steven Rostedt wrote:
I've been hitting stack overflows on a PPC64 box, so I ran the ftrace
stack_tracer and part of the problem with that box is that it can nest
interrupts too deep. But what also worries me is that there's some heavy
hitters of stacks
Linus Torvalds writes:
It's made worse by the fact that they
also have horribly bad TLB fills on their broken CPU's, and years and
years of telling people that the MMU on ppc's are sh*t has only been
reacted to with "talk to the hand, we know better".
Who are you talking about
On Mon, 17 Nov 2008, Linus Torvalds wrote:
I do wonder just _what_ it is that causes the stack frames to be so
horrid. For example, you have
18) 8896 160 .kmem_cache_alloc+0xfc/0x140
and I'm looking at my x86-64 compile, and it has a stack frame of just 8
bytes (!)
Steve,
On Mon, 17 Nov 2008, Linus Torvalds wrote:
I do wonder just _what_ it is that causes the stack frames to be so
horrid. For example, you have
18) 8896 160 .kmem_cache_alloc+0xfc/0x140
and I'm looking at my x86-64 compile, and it has a stack frame of just 8
On Tue, 18 Nov 2008, Paul Mackerras wrote:
64 bytes, still much lower than the 160 of PPC64.
The ppc64 ABI has a minimum stack frame of 112 bytes, due to having an
area for called functions to store their parameters (64 bytes) plus 6
slots for saving stuff and for the compiler and
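
As a rough check of the 112-byte figure, the sum can be sketched as follows. The 64-byte parameter area and the count of 6 slots come from the message; the slot size of 8 bytes (one 64-bit doubleword each) is an assumption, and min_frame_bytes is a hypothetical name.

```python
PARAM_SAVE_AREA = 64  # bytes reserved for called functions' parameters (from the message)
HEADER_SLOTS = 6      # save/reserved slots (from the message)
SLOT_SIZE = 8         # assumption: each slot is one 64-bit doubleword


def min_frame_bytes(param_save_area=PARAM_SAVE_AREA,
                    header_slots=HEADER_SLOTS,
                    slot_size=SLOT_SIZE):
    """Smallest frame: the parameter save area plus the fixed header slots."""
    return param_save_area + header_slots * slot_size


minimum = min_frame_bytes()
print(f"minimum frame: {minimum} bytes")
print(f"kmem_cache_alloc beyond the minimum: {160 - minimum} bytes")
```

Under these assumptions the 160-byte kmem_cache_alloc frame is 48 bytes above the ABI minimum.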
On Mon, 17 Nov 2008, Steven Rostedt wrote:
And here's my i386 max stack:
[EMAIL PROTECTED] ~]# cat /debug/tracing/stack_trace
Depth Size Location (47 entries)
----- ---- --------
0) 2216 240 blk_recount_segments+0x39/0x51
1) 1976
On Tue, 18 Nov 2008, Paul Mackerras wrote:
Also, you didn't respond to my comments about the purely software
benefits of a larger page size.
I realize that there are benefits. It's just that the downsides tend to
swamp the upsides.
The fact is, Intel (and to a lesser degree, AMD) has
On Mon, 17 Nov 2008, Steven Rostedt wrote:
On Mon, 17 Nov 2008, Steven Rostedt wrote:
Note, I was using a default config that had CONFIG_IRQSTACKS off and
CONFIG_PPC_64K_PAGES on.
Here's my stack after boot up with CONFIG_IRQSTACKS set. Seems that
softirqs still use the same stack
Steven Rostedt writes:
By-the-way, my box has been running stable ever since I switched to
CONFIG_IRQSTACKS.
Great. We probably should remove the config option and just always
use irq stacks.
Paul.
From: Paul Mackerras [EMAIL PROTECTED]
Date: Tue, 18 Nov 2008 13:36:16 +1100
Steven Rostedt writes:
By-the-way, my box has been running stable ever since I switched to
CONFIG_IRQSTACKS.
Great. We probably should remove the config option and just always
use irq stacks.
That's what I
It makes some sort of sense I suppose on very static embedded workloads
with no swap nor demand paging.
It makes perfect sense for anything that doesn't use any MMU.
To a certain extent. There's two different aspects to having an MMU and
in embedded space it's useful to have one and not