Re: [PATCH] Optimize int_sqrt for small values for faster idle

2017-07-24 Thread Eric Dumazet
On Thu, 2017-07-20 at 12:10 +0200, Peter Zijlstra wrote: > On Mon, Feb 01, 2016 at 04:36:38PM -0800, Eric Dumazet wrote: > > On Tue, 2016-02-02 at 00:08 +0100, Rasmus Villemoes wrote: > > > > > Thanks. (Is there a good way to tell gcc that avg*avg is actually a > > > 32x32->64 multiplication?) >

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2017-07-20 Thread Peter Zijlstra
On Mon, Feb 01, 2016 at 04:36:38PM -0800, Eric Dumazet wrote: > On Tue, 2016-02-02 at 00:08 +0100, Rasmus Villemoes wrote: > > > Thanks. (Is there a good way to tell gcc that avg*avg is actually a > > 32x32->64 multiplication?) > > If avg is 32bit, compiler does that for you. > > u32 avg = ...

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-10 Thread Fengguang Wu
On Tue, Feb 09, 2016 at 12:44:00PM -0800, Andi Kleen wrote: > On Sun, Feb 07, 2016 at 10:32:26PM +0100, Rasmus Villemoes wrote: > > On Mon, Feb 01 2016, Andi Kleen wrote: > > > > > On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote: > > >> On Thu, Jan 28 2016, Andi Kleen wrote: >

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-09 Thread Andi Kleen
On Sun, Feb 07, 2016 at 10:32:26PM +0100, Rasmus Villemoes wrote: > On Mon, Feb 01 2016, Andi Kleen wrote: > > > On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote: > >> On Thu, Jan 28 2016, Andi Kleen wrote: > >> > >> > From: Andi Kleen > >> > > >> > The menu cpuidle governor

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-07 Thread Rasmus Villemoes
On Mon, Feb 01 2016, Andi Kleen wrote: > On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote: >> On Thu, Jan 28 2016, Andi Kleen wrote: >> >> > From: Andi Kleen >> > >> > The menu cpuidle governor does at least two int_sqrt() each time >> > we go into idle in get_typical_interval

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-02 Thread Eric Dumazet
On Tue, 2016-02-02 at 21:46 +0100, Rasmus Villemoes wrote: > On Tue, Feb 02 2016, Eric Dumazet wrote: > > > On Tue, 2016-02-02 at 00:08 +0100, Rasmus Villemoes wrote: > > > >> Thanks. (Is there a good way to tell gcc that avg*avg is actually a > >> 32x32->64 multiplication?) > > > > If avg is

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-02 Thread Rasmus Villemoes
On Tue, Feb 02 2016, Eric Dumazet wrote: > On Tue, 2016-02-02 at 00:08 +0100, Rasmus Villemoes wrote: > >> Thanks. (Is there a good way to tell gcc that avg*avg is actually a >> 32x32->64 multiplication?) > > If avg is 32bit, compiler does that for you. > > u32 avg = ... > > u64 result =

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-01 Thread Eric Dumazet
On Tue, 2016-02-02 at 00:08 +0100, Rasmus Villemoes wrote:
> Thanks. (Is there a good way to tell gcc that avg*avg is actually a
> 32x32->64 multiplication?)

If avg is 32bit, compiler does that for you.

u32 avg = ...
u64 result = (u64)avg * avg;
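
A minimal sketch of the cast idiom in the reply above, written as standalone C with the <stdint.h> types standing in for the kernel's u32/u64. The helper names square_wide and square_narrow are hypothetical and exist only for illustration. The cast widens one operand, so the multiplication is carried out in 64 bits; a compiler is then free to emit a single 32x32->64 multiply, though whether it does depends on the target.

```c
#include <stdint.h>

/* Hypothetical helper: widen one operand first, so the product is
 * computed in 64 bits and cannot wrap for any 32-bit input. */
static uint64_t square_wide(uint32_t avg)
{
	return (uint64_t)avg * avg;
}

/* Hypothetical counter-example: the product is computed in 32 bits
 * (wrapping modulo 2^32) and only widened afterwards. The inner cast
 * keeps the arithmetic unsigned 32-bit on any platform. */
static uint64_t square_narrow(uint32_t avg)
{
	return (uint64_t)(uint32_t)(avg * avg);
}
```

For avg = 100000, square_wide() yields 10000000000, while square_narrow() yields 1410065408 because the 32-bit product has already wrapped.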

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-01 Thread Andi Kleen
On Tue, Feb 02, 2016 at 12:08:46AM +0100, Rasmus Villemoes wrote: > On Mon, Feb 01 2016, Andi Kleen wrote: > > > On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote: > >> On Thu, Jan 28 2016, Andi Kleen wrote: > >> > >> > From: Andi Kleen > >> > > >> > The menu cpuidle governor

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-01 Thread Rasmus Villemoes
On Mon, Feb 01 2016, Andi Kleen wrote: > On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote: >> On Thu, Jan 28 2016, Andi Kleen wrote: >> >> > From: Andi Kleen >> > >> > The menu cpuidle governor does at least two int_sqrt() each time >> > we go into idle in get_typical_interval

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-01 Thread Andi Kleen
On Mon, Feb 01, 2016 at 10:25:17PM +0100, Rasmus Villemoes wrote: > On Thu, Jan 28 2016, Andi Kleen wrote: > > > From: Andi Kleen > > > > The menu cpuidle governor does at least two int_sqrt() each time > > we go into idle in get_typical_interval to compute stddev > > > > int_sqrts take 100-120

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-02-01 Thread Rasmus Villemoes
On Thu, Jan 28 2016, Andi Kleen wrote: > From: Andi Kleen > > The menu cpuidle governor does at least two int_sqrt() each time > we go into idle in get_typical_interval to compute stddev > > int_sqrts take 100-120 cycles each. Short idle latency is important > for many workloads. > If you want

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-30 Thread Thomas Rohwer
Hello,

> -	m = 1UL << (BITS_PER_LONG - 2);
> +	if (x <= 0x) {
> +		if (m <= 0xff)
> +			m = 1UL << (8 - 2);
> +		else
> +			m = 1UL << (16 - 2);
> +	} else if (x <= 0x)
> +		m = 1UL << (32 - 2);
> +	else
> +		m = 1UL
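
A rough sketch of the idea behind the quoted hunk: pick the starting bit of the shift-and-subtract square root from the magnitude of x, so small arguments need fewer loop iterations. This is not the patch itself. The function name isqrt_ranged is hypothetical, it uses uint64_t rather than unsigned long, and the thresholds 0xff, 0xffff and 0xffffffff are assumptions chosen to match the shift widths, since the constants in the quoted text are cut off in the archive.

```c
#include <stdint.h>

/* Hypothetical sketch: integer square root (floor) with a
 * range-dependent starting bit. Thresholds are assumptions. */
static uint64_t isqrt_ranged(uint64_t x)
{
	uint64_t b, m, y = 0;

	if (x <= 1)
		return x;

	/* Start at the largest power of four that can fit in the range. */
	if (x <= 0xffULL)
		m = 1ULL << (8 - 2);
	else if (x <= 0xffffULL)
		m = 1ULL << (16 - 2);
	else if (x <= 0xffffffffULL)
		m = 1ULL << (32 - 2);
	else
		m = 1ULL << (64 - 2);

	/* Classic digit-by-digit (base 4) square root. */
	while (m != 0) {
		b = y + m;
		y >>= 1;
		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}
	return y;
}
```

With these thresholds an argument below 256 takes 4 loop iterations instead of 32, which is the kind of saving the patch is after for small values.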

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread Rafael J. Wysocki
On Thursday, January 28, 2016 01:42:45 PM Andi Kleen wrote: > From: Andi Kleen > > The menu cpuidle governor does at least two int_sqrt() each time > we go into idle in get_typical_interval to compute stddev > > int_sqrts take 100-120 cycles each. Short idle latency is important > for many

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread Andi Kleen
> This thread might be relevant:
>
> https://lkml.org/lkml/2015/2/2/600
>
> and perhaps using fls might still be a good approach.

Linus wrote:

>>> We *probably* have some argument range that we care more about, which is why I'd like to know what the profile is that triggered this

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread Andi Kleen
Andi Kleen writes:
> From: Andi Kleen
>
> The menu cpuidle governor does at least two int_sqrt() each time
> we go into idle in get_typical_interval to compute stddev

Added a stupid typo in the last minute. I'll post a new version.

-Andi

--
a...@linux.intel.com -- Speaking for myself only

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread Eric Dumazet
On Thu, 2016-01-28 at 13:42 -0800, Andi Kleen wrote: > From: Andi Kleen > > The menu cpuidle governor does at least two int_sqrt() each time > we go into idle in get_typical_interval to compute stddev > > int_sqrts take 100-120 cycles each. Short idle latency is important > for many workloads.

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread Joe Perches
(resending with email addresses that shouldn't bounce) (adding Anshul Garg) (fixed Davidlohr Bueso's address) On Thu, 2016-01-28 at 13:42 -0800, Andi Kleen wrote: > From: Andi Kleen > > The menu cpuidle governor does at least two int_sqrt() each time > we go into idle in get_typical_interval to

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread Joe Perches
(adding Anshul Garg) On Thu, 2016-01-28 at 13:42 -0800, Andi Kleen wrote: > From: Andi Kleen > > The menu cpuidle governor does at least two int_sqrt() each time > we go into idle in get_typical_interval to compute stddev > > int_sqrts take 100-120 cycles each. Short idle latency is important

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread kbuild test robot
Hi Andi, [auto build test WARNING on v4.5-rc1] [also build test WARNING on next-20160128] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url:

Re: [PATCH] Optimize int_sqrt for small values for faster idle

2016-01-28 Thread kbuild test robot
Hi Andi, [auto build test WARNING on v4.5-rc1] [also build test WARNING on next-20160128] [if your patch is applied to the wrong git tree, please drop us a note to help improving the system] url:
