Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-16 Thread Dave Hansen
Hey Nathan, Could you post your boot timing patches? My machines are much smaller than yours, but I'm curious how things behave here as well. I did some very imprecise timings (strace -t on a telnet attached to the serial console). The 'struct page' initializations take about a minute of boot

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-14 Thread Nathan Zimmer
On Wed, Aug 14, 2013 at 01:05:56PM +0200, Ingo Molnar wrote: > > * Linus Torvalds wrote: > > > [...] > > > > Ok, so I don't know all the issues, and in many ways I don't even really > > care. You could do it other ways, I don't think this is a big deal. The > > part I hate is the runtime

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-14 Thread Ingo Molnar
* Linus Torvalds wrote: > On Tue, Aug 13, 2013 at 4:10 PM, Nathan Zimmer wrote: > > > > The only mm structure we are adding to is a new flag in page->flags. > > That didn't seem too much. > > I don't agree. > > I see only downsides, and no upsides. Doing the same thing *without* the >

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-14 Thread Ingo Molnar
* Linus Torvalds wrote: > [...] > > Ok, so I don't know all the issues, and in many ways I don't even really > care. You could do it other ways, I don't think this is a big deal. The > part I hate is the runtime hook into the core MM page allocation code, > so I'm just throwing out any

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-14 Thread Ingo Molnar
* Linus Torvalds torva...@linux-foundation.org wrote: [...] Ok, so I don't know all the issues, and in many ways I don't even really care. You could do it other ways, I don't think this is a big deal. The part I hate is the runtime hook into the core MM page allocation code, so I'm

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-14 Thread Ingo Molnar
* Linus Torvalds torva...@linux-foundation.org wrote: On Tue, Aug 13, 2013 at 4:10 PM, Nathan Zimmer nzim...@sgi.com wrote: The only mm structure we are adding to is a new flag in page->flags. That didn't seem too much. I don't agree. I see only downsides, and no upsides. Doing the

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-14 Thread Nathan Zimmer
On Wed, Aug 14, 2013 at 01:05:56PM +0200, Ingo Molnar wrote: * Linus Torvalds torva...@linux-foundation.org wrote: [...] Ok, so I don't know all the issues, and in many ways I don't even really care. You could do it other ways, I don't think this is a big deal. The part I hate is

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Tue, Aug 13, 2013 at 4:10 PM, Nathan Zimmer wrote: > > The only mm structure we are adding to is a new flag in page->flags. > That didn't seem too much. I don't agree. I see only downsides, and no upsides. Doing the same thing *without* the downsides seems straightforward, so I simply see no

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Nathan Zimmer
On Tue, Aug 13, 2013 at 10:51:37AM -0700, Linus Torvalds wrote: > I realize that benchmarking cares, and yes, I also realize that some > benchmarks actually want to reboot the machine between some runs just > to get repeatability, but if you're benchmarking a 16TB machine I'm > guessing any

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Nathan Zimmer
On 08/13/2013 01:04 PM, Mike Travis wrote: On 8/13/2013 10:51 AM, Linus Torvalds wrote: by the time you can log in. And if it then takes another ten minutes until you have the full 16TB initialized, and some things might be a tad slower early on, does anybody really care? The machine will be

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 1:24 PM, Yinghai Lu wrote: >> > FYI, the system at this time had 128 nodes each with 256GB of memory. >> > About 252GB was inserted into the absent list from nodes 1 .. 126. >> > Memory on nodes 0 and 128 was left fully present. Actually, I was corrected, it was 256 nodes with

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Yinghai Lu
On Tue, Aug 13, 2013 at 12:06 PM, Mike Travis wrote: > > > On 8/13/2013 11:04 AM, Mike Travis wrote: >> >> >> On 8/13/2013 10:51 AM, Linus Torvalds wrote: >>> by the time you can log in. And if it then takes another ten minutes >>> until you have the full 16TB initialized, and some things might

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 11:04 AM, Mike Travis wrote: > > > On 8/13/2013 10:51 AM, Linus Torvalds wrote: >> by the time you can log in. And if it then takes another ten minutes >> until you have the full 16TB initialized, and some things might be a >> tad slower early on, does anybody really care? The

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 10:51 AM, Linus Torvalds wrote: > by the time you can log in. And if it then takes another ten minutes > until you have the full 16TB initialized, and some things might be a > tad slower early on, does anybody really care? The machine will be up > and running with plenty of memory,

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Tue, Aug 13, 2013 at 10:33 AM, Mike Travis wrote: > > Initially this patch set consisted of diverting a major portion of the > memory to an "absent" list during e820 processing. A very late initcall > was then used to dispatch a cpu per node to add that node's absent > memory. By nature

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 10:09 AM, Linus Torvalds wrote: > On Mon, Aug 12, 2013 at 2:54 PM, Nathan Zimmer wrote: >> >> As far as extra overhead. We incur an extra function call to >> ensure_page_is_initialized but that is only really expensive when we find >> uninitialized pages, otherwise it is a flag

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread H. Peter Anvin
On 08/13/2013 10:09 AM, Linus Torvalds wrote: > > I really really dislike this "let's check if memory is initialized at > runtime" approach. > It does seem to be getting messy, doesn't it... The one potential serious concern is if that will end up mucking with NUMA affinity in a way that has

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Mon, Aug 12, 2013 at 2:54 PM, Nathan Zimmer wrote: > > As far as extra overhead. We incur an extra function call to > ensure_page_is_initialized but that is only really expensive when we find > uninitialized pages, otherwise it is a flag check once every PTRS_PER_PMD. > To get a better feel

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Ingo Molnar
* Nathan Zimmer wrote: > We are still restricting ourselves to 2MiB initialization. > This was initially to keep the patch set a little smaller and more > clear. However given how well it is currently performing I don't see > how much better it could be with 2GiB chunks. > >

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Ingo Molnar
* Nathan Zimmer nzim...@sgi.com wrote: We are still restricting ourselves to 2MiB initialization. This was initially to keep the patch set a little smaller and more clear. However given how well it is currently performing I don't see how much better it could be with 2GiB

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Mon, Aug 12, 2013 at 2:54 PM, Nathan Zimmer nzim...@sgi.com wrote: As far as extra overhead. We incur an extra function call to ensure_page_is_initialized but that is only really expensive when we find uninitialized pages, otherwise it is a flag check once every PTRS_PER_PMD. To get a

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread H. Peter Anvin
On 08/13/2013 10:09 AM, Linus Torvalds wrote: I really really dislike this "let's check if memory is initialized at runtime" approach. It does seem to be getting messy, doesn't it... The one potential serious concern is if that will end up mucking with NUMA affinity in a way that has lasting

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 10:09 AM, Linus Torvalds wrote: On Mon, Aug 12, 2013 at 2:54 PM, Nathan Zimmer nzim...@sgi.com wrote: As far as extra overhead. We incur an extra function call to ensure_page_is_initialized but that is only really expensive when we find uninitialized pages, otherwise it is a

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Tue, Aug 13, 2013 at 10:33 AM, Mike Travis tra...@sgi.com wrote: Initially this patch set consisted of diverting a major portion of the memory to an "absent" list during e820 processing. A very late initcall was then used to dispatch a cpu per node to add that node's absent memory. By

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 10:51 AM, Linus Torvalds wrote: by the time you can log in. And if it then takes another ten minutes until you have the full 16TB initialized, and some things might be a tad slower early on, does anybody really care? The machine will be up and running with plenty of memory,

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 11:04 AM, Mike Travis wrote: On 8/13/2013 10:51 AM, Linus Torvalds wrote: by the time you can log in. And if it then takes another ten minutes until you have the full 16TB initialized, and some things might be a tad slower early on, does anybody really care? The machine

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Yinghai Lu
On Tue, Aug 13, 2013 at 12:06 PM, Mike Travis tra...@sgi.com wrote: On 8/13/2013 11:04 AM, Mike Travis wrote: On 8/13/2013 10:51 AM, Linus Torvalds wrote: by the time you can log in. And if it then takes another ten minutes until you have the full 16TB initialized, and some things might be

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Mike Travis
On 8/13/2013 1:24 PM, Yinghai Lu wrote: FYI, the system at this time had 128 nodes each with 256GB of memory. About 252GB was inserted into the absent list from nodes 1 .. 126. Memory on nodes 0 and 128 was left fully present. Actually, I was corrected, it was 256 nodes with 128GB (8 *

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Nathan Zimmer
On 08/13/2013 01:04 PM, Mike Travis wrote: On 8/13/2013 10:51 AM, Linus Torvalds wrote: by the time you can log in. And if it then takes another ten minutes until you have the full 16TB initialized, and some things might be a tad slower early on, does anybody really care? The machine will be

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Nathan Zimmer
On Tue, Aug 13, 2013 at 10:51:37AM -0700, Linus Torvalds wrote: I realize that benchmarking cares, and yes, I also realize that some benchmarks actually want to reboot the machine between some runs just to get repeatability, but if you're benchmarking a 16TB machine I'm guessing any serious

Re: [RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-13 Thread Linus Torvalds
On Tue, Aug 13, 2013 at 4:10 PM, Nathan Zimmer nzim...@sgi.com wrote: The only mm structure we are adding to is a new flag in page->flags. That didn't seem too much. I don't agree. I see only downsides, and no upsides. Doing the same thing *without* the downsides seems straightforward, so I

[RFC v3 0/5] Transparent on-demand struct page initialization embedded in the buddy allocator

2013-08-12 Thread Nathan Zimmer
We are still restricting ourselves to 2MiB initialization. This was initially to keep the patch set a little smaller and more clear. However given how well it is currently performing I don't see how much better it could be with 2GiB chunks. As far as extra overhead. We incur an
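
For readers skimming the thread, the idea under discussion (initialize 'struct page' entries lazily in 2MiB chunks, guarded by one flag check per chunk) can be modeled roughly in user space as below. This is only a sketch under stated assumptions, not the code from the patch set: the `toy_*` names, the `uninitialized_chunk` field, the `slow_path_hits` counter and the four-chunk array are all hypothetical, and 4KiB base pages are assumed, which gives 512 pages per 2MiB chunk (the value of PTRS_PER_PMD on x86-64).

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE       4096UL                    /* assumed 4KiB base pages */
#define CHUNK_SIZE      (2UL * 1024 * 1024)       /* 2MiB initialization unit */
#define PAGES_PER_CHUNK (CHUNK_SIZE / PAGE_SIZE)  /* 512 pages per chunk */
#define NR_CHUNKS       4UL                       /* tiny toy machine */
#define NR_PAGES        (NR_CHUNKS * PAGES_PER_CHUNK)

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct toy_page {
    bool uninitialized_chunk;  /* only meaningful on the first page of a chunk */
    int refcount;
};

static struct toy_page pages[NR_PAGES];
static unsigned long slow_path_hits;  /* counts chunks initialized on demand */

/* "Boot": defer all work by flagging the first page of every chunk. */
static void toy_mark_all_deferred(void)
{
    for (unsigned long c = 0; c < NR_CHUNKS; c++)
        pages[c * PAGES_PER_CHUNK].uninitialized_chunk = true;
}

/* Allocation-time hook: a single flag test unless the chunk is still deferred. */
static void toy_ensure_page_is_initialized(unsigned long pfn)
{
    assert(pfn < NR_PAGES);

    unsigned long head = pfn - (pfn % PAGES_PER_CHUNK);

    if (!pages[head].uninitialized_chunk)
        return;  /* fast path: chunk was already initialized */

    /* Slow path: initialize every page in this 2MiB chunk exactly once. */
    for (unsigned long i = 0; i < PAGES_PER_CHUNK; i++)
        pages[head + i].refcount = 1;

    pages[head].uninitialized_chunk = false;
    slow_path_hits++;
}
```

In this model, calling `toy_mark_all_deferred()` and then touching two page frames in the same chunk takes the slow path only once; the second call returns after the flag test. That per-allocation test is the runtime hook that the replies in this thread object to.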
