Re: [PATCH v6 0/9] latched RB-trees and __module_address()

2015-05-12 Thread Peter Zijlstra
On Sat, May 09, 2015 at 03:12:32AM +0930, Rusty Russell wrote:
> Ingo Molnar  writes:
> > * Rusty Russell  wrote:
> >
> >> Peter Zijlstra  writes:
> >> > This series is aimed at making __module_address() go fast(er).
> >> 
> >> Acked-by: Rusty Russell  (module parts)
> >> 
> >> Since all the interesting stuff is not module-specific, assume this 
> >> is via Ingo?  Otherwise, I'll take it...
> >
> > I can certainly take them, but since I think that the _breakages_ are 
> > going to be in module land foremost, it should be rather under your 
> > watchful eyes? :-)
> 
> Ingo, I feel like you just gave me a free puppy...

Hehe, thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v6 0/9] latched RB-trees and __module_address()

2015-05-08 Thread Rusty Russell
Ingo Molnar  writes:
> * Rusty Russell  wrote:
>
>> Peter Zijlstra  writes:
>> > This series is aimed at making __module_address() go fast(er).
>> 
>> Acked-by: Rusty Russell  (module parts)
>> 
>> Since all the interesting stuff is not module-specific, assume this 
>> is via Ingo?  Otherwise, I'll take it...
>
> I can certainly take them, but since I think that the _breakages_ are 
> going to be in module land foremost, it should be rather under your 
> watchful eyes? :-)

Ingo, I feel like you just gave me a free puppy...

Applied,
Rusty.


Re: [PATCH v6 0/9] latched RB-trees and __module_address()

2015-05-07 Thread Ingo Molnar

* Rusty Russell  wrote:

> Peter Zijlstra  writes:
> > This series is aimed at making __module_address() go fast(er).
> 
> Acked-by: Rusty Russell  (module parts)
> 
> Since all the interesting stuff is not module-specific, assume this 
> is via Ingo?  Otherwise, I'll take it...

I can certainly take them, but since I think that the _breakages_ are 
going to be in module land foremost, it should be rather under your 
watchful eyes? :-)

Thanks,

Ingo


Re: [PATCH v6 0/9] latched RB-trees and __module_address()

2015-05-07 Thread Rusty Russell
Peter Zijlstra  writes:
> This series is aimed at making __module_address() go fast(er).

Acked-by: Rusty Russell  (module parts)

Since all the interesting stuff is not module-specific, assume
this is via Ingo?  Otherwise, I'll take it...

Thanks,
Rusty.

>
> The reason for doing so is that most stack unwinders use kernel_text_address()
> to validate each frame. Perf and ftrace (can) end up doing a lot of stack
> traces from performance sensitive code.
>
> On the way there it:
>  - annotates and sanitizes module locking
>  - introduces the latched RB-tree
>  - employs it to make __module_address() go fast.
>
> I've built and boot tested this on x86_64 with modules and lockdep
> enabled.  Performance numbers (below) are done with lockdep disabled.
>
> As previously mentioned, the reason for writing the latched RB-tree as generic
> code is mostly for clarity/documentation purposes, as there are a number of
> separate and non-trivial bits to the complete solution.
>
> As measured on my ivb-ep system with 84 modules loaded, the test module reports
> (cache hot, performance cpufreq):
>
>           avg +- stdev
> Before:   611 +- 10 [ns] per __module_address() call
> After:     17 +-  5 [ns] per __module_address() call
>
> PMI measurements for a cpu running loops in a module (also [ns]):
>
> Before: Mean: 2719 +- 1, Stdev: 214, Samples: 40036
> After:  Mean:  947 +- 0, Stdev: 132, Samples: 40037
>
> Note: I have also tested things like: perf record -a -g modprobe
> mod_test, to make 'sure' to hit some of the more interesting paths.
>
> Changes since last time:
>
>  - rebased against Rusty's tree
>  - raw_read_seqcount_latch() -- (mingo)
>
> Based on rusty/linux.git/pending-rebases; please consider for 4.2
>
> Thanks!


[PATCH v6 0/9] latched RB-trees and __module_address()

2015-05-06 Thread Peter Zijlstra
This series is aimed at making __module_address() go fast(er).

The reason for doing so is that most stack unwinders use kernel_text_address()
to validate each frame. Perf and ftrace (can) end up doing a lot of stack
traces from performance sensitive code.

On the way there it:
 - annotates and sanitizes module locking
 - introduces the latched RB-tree
 - employs it to make __module_address() go fast.
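
For readers unfamiliar with the latch idea, here is a minimal user-space
sketch of the general technique. It is not the code from this series. It
assumes C11 atomics with the default sequentially consistent ordering in
place of the kernel's barriers, a single address range as the payload
instead of an RB-tree, and externally serialized writers. All names
(latched_range, latch_init, latch_write, latch_contains, latch_demo) are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical payload: one [start, end) address range. */
struct range {
    atomic_uintptr_t start;
    atomic_uintptr_t end;
};

/* Two copies of the data plus a sequence count that selects one. */
struct latched_range {
    atomic_uint seq;        /* even: readers use copy[0]; odd: copy[1] */
    struct range copy[2];
};

static void latch_init(struct latched_range *l, uintptr_t start, uintptr_t end)
{
    atomic_init(&l->seq, 0);
    for (int i = 0; i < 2; i++) {
        atomic_init(&l->copy[i].start, start);
        atomic_init(&l->copy[i].end, end);
    }
}

/*
 * Writer side; callers must serialize writers themselves (assumption).
 * Each increment steers readers to the copy that is NOT about to change:
 * seq goes odd, copy[0] is updated; seq goes even, copy[1] is updated.
 */
static void latch_write(struct latched_range *l, uintptr_t start, uintptr_t end)
{
    for (int i = 0; i < 2; i++) {
        atomic_fetch_add(&l->seq, 1);
        atomic_store(&l->copy[i].start, start);
        atomic_store(&l->copy[i].end, end);
    }
}

/*
 * Reader side: never blocks on the writer. It reads the copy selected by
 * the low bit of seq and retries only if seq changed in the meantime.
 */
static int latch_contains(struct latched_range *l, uintptr_t addr)
{
    unsigned int seq;
    uintptr_t start, end;

    do {
        seq = atomic_load(&l->seq);
        start = atomic_load(&l->copy[seq & 1].start);
        end = atomic_load(&l->copy[seq & 1].end);
    } while (atomic_load(&l->seq) != seq);

    return addr >= start && addr < end;
}

/* Single-threaded usage check; returns the final sequence count. */
static unsigned int latch_demo(void)
{
    struct latched_range l;

    latch_init(&l, 0, 0);
    assert(!latch_contains(&l, 15));
    latch_write(&l, 10, 20);
    assert(latch_contains(&l, 15));
    return atomic_load(&l.seq);
}
```

The point of keeping two copies is that a reader which interrupts the
writer on the same CPU (an NMI, say) still finds one consistent copy and
does not spin forever, which a plain seqcount cannot offer.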

I've built and boot tested this on x86_64 with modules and lockdep
enabled.  Performance numbers (below) are done with lockdep disabled.

As previously mentioned, the reason for writing the latched RB-tree as generic
code is mostly for clarity/documentation purposes, as there are a number of
separate and non-trivial bits to the complete solution.

As measured on my ivb-ep system with 84 modules loaded, the test module reports
(cache hot, performance cpufreq):

          avg +- stdev
Before:   611 +- 10 [ns] per __module_address() call
After:     17 +-  5 [ns] per __module_address() call

PMI measurements for a cpu running loops in a module (also [ns]):

Before: Mean: 2719 +- 1, Stdev: 214, Samples: 40036
After:  Mean:  947 +- 0, Stdev: 132, Samples: 40037
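
Taken as plain ratios of the means above, that is roughly a 36x reduction
per __module_address() call and roughly 2.9x for the PMI measurement. A
small sketch of the arithmetic, where speedup() and print_speedups() are
hypothetical helper names and only the four mean values come from the
figures above:

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of the "before" cost to the "after" cost, both in nanoseconds. */
static double speedup(double before_ns, double after_ns)
{
    assert(after_ns > 0.0);
    return before_ns / after_ns;
}

/* Prints both ratios using the mean values quoted above. */
static void print_speedups(void)
{
    printf("__module_address(): %.1fx\n", speedup(611.0, 17.0));
    printf("PMI measurement:    %.1fx\n", speedup(2719.0, 947.0));
}
```

The ratios ignore the stated standard deviations, so they are only a
rough guide to the size of the improvement.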

Note: I have also tested things like: perf record -a -g modprobe
mod_test, to make 'sure' to hit some of the more interesting paths.

Changes since last time:

 - rebased against Rusty's tree
 - raw_read_seqcount_latch() -- (mingo)

Based on rusty/linux.git/pending-rebases; please consider for 4.2

Thanks!


