Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-18 Thread John Stultz
On Fri, Dec 5, 2014 at 1:03 PM, Arnd Bergmann wrote:
> On Friday 05 December 2014 13:00:22 Nicolas Pitre wrote:
>>
>> BTW this is worth applying despite the on-going discussion with Arnd
>> on a separate optimization.
>
> Agreed
>
>> On Wed, 3 Dec 2014, Nicolas Pitre wrote:
>>
>> > At least on

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-05 Thread Arnd Bergmann
On Friday 05 December 2014 13:00:22 Nicolas Pitre wrote:
>
> BTW this is worth applying despite the on-going discussion with Arnd
> on a separate optimization.

Agreed

> On Wed, 3 Dec 2014, Nicolas Pitre wrote:
>
> > At least on ARM, do_div() is optimized to turn constant divisors into
> > an

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-05 Thread Nicolas Pitre
BTW this is worth applying despite the on-going discussion with Arnd
on a separate optimization.

On Wed, 3 Dec 2014, Nicolas Pitre wrote:

> At least on ARM, do_div() is optimized to turn constant divisors into
> an inline multiplication by the reciprocal value at compile time.
> However this

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-05 Thread Nicolas Pitre
On Fri, 5 Dec 2014, Arnd Bergmann wrote:

> >
> > That, too, risks overflowing.
> >
> > Let's say x_lo = 0x and x_hi = 0x. You get:
> >
> > 0x * 0x83126e97 -> 0x83126e967ced9169
> > 0x * 0x8d4fdf3b -> 0x8d4fdf3a72b020c5

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-05 Thread Arnd Bergmann
On Thursday 04 December 2014 23:30:08 Nicolas Pitre wrote:
> > res += (u64)x_lo * y_hi + (u64)x_hi * y_lo;
>
> That, too, risks overflowing.
>
> Let's say x_lo = 0x and x_hi = 0x. You get:
>
> 0x * 0x83126e97 -> 0x83126e967ced9169
> 0x *
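
The overflow concern quoted above can be illustrated with a small user-space sketch. It is not the code from the thread: `mulhi_u64` and `sum_wraps` are hypothetical names, and the carry handling shown is the textbook way of taking the high 64 bits of a 64x64 product from four 32x32=64 partial products. The point is that the two cross products cannot simply be added into one 64-bit accumulator; only their low halves are summed in a middle column, whose carry is then folded into the high word.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when a + b does not fit in 64 bits. */
static int sum_wraps(uint64_t a, uint64_t b)
{
    return (uint64_t)(a + b) < a;
}

/*
 * Hypothetical helper: high 64 bits of the 128-bit product x * y,
 * built from four 32x32=64 multiplies.
 */
static uint64_t mulhi_u64(uint64_t x, uint64_t y)
{
    uint32_t x_lo = (uint32_t)x, x_hi = (uint32_t)(x >> 32);
    uint32_t y_lo = (uint32_t)y, y_hi = (uint32_t)(y >> 32);

    uint64_t lo_lo = (uint64_t)x_lo * y_lo;
    uint64_t lo_hi = (uint64_t)x_lo * y_hi;
    uint64_t hi_lo = (uint64_t)x_hi * y_lo;
    uint64_t hi_hi = (uint64_t)x_hi * y_hi;

    /*
     * Middle column: three values below 2^32 each, so the sum
     * always fits in 64 bits and its upper half is the carry.
     */
    uint64_t mid = (lo_lo >> 32) + (uint32_t)lo_hi + (uint32_t)hi_lo;

    return hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (mid >> 32);
}
```

With the two partial products quoted in the message, `sum_wraps(0x83126e967ced9169ULL, 0x8d4fdf3a72b020c5ULL)` reports a wrap, which is the case a plain `res += a + b` would silently get wrong.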

Re: Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-04 Thread Nicolas Pitre
On Fri, 5 Dec 2014, pang.xun...@zte.com.cn wrote:

> Nicolas,
>
> On Thursday 04 December 2014 15:23:37: Nicolas Pitre wrote:
> > Nicolas Pitre
> >
> > u64 ktime_to_us(ktime_t kt)
> > {
> >     u64 ns = ktime_to_ns(kt);
> >     u32 x_lo, x_hi, y_lo, y_hi;
> >     u64 res, carry;
> >
> >     x_hi

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-04 Thread Nicolas Pitre
On Thu, 4 Dec 2014, Arnd Bergmann wrote:

> On Thursday 04 December 2014 08:46:27 Nicolas Pitre wrote:
> > On Thu, 4 Dec 2014, Arnd Bergmann wrote:
> > Note the above code is for 32-bit architectures that support a 32x32=64
> > bit multiply instruction. And even then, what kills performance is

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-04 Thread Arnd Bergmann
On Thursday 04 December 2014 08:46:27 Nicolas Pitre wrote:
> On Thu, 4 Dec 2014, Arnd Bergmann wrote:
> Note the above code is for 32-bit architectures that support a 32x32=64
> bit multiply instruction. And even then, what kills performance is the
> inability to efficiently deal with carry

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-04 Thread Nicolas Pitre
On Thu, 4 Dec 2014, Arnd Bergmann wrote:

> On Thursday 04 December 2014 02:23:37 Nicolas Pitre wrote:
> > On Wed, 3 Dec 2014, Arnd Bergmann wrote:
> >
> > > On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> > > > At least on ARM, do_div() is optimized to turn constant divisors into

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-04 Thread Arnd Bergmann
On Thursday 04 December 2014 02:23:37 Nicolas Pitre wrote:
> On Wed, 3 Dec 2014, Arnd Bergmann wrote:
>
> > On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> > > At least on ARM, do_div() is optimized to turn constant divisors into
> > > an inline multiplication by the reciprocal

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-03 Thread Nicolas Pitre
On Wed, 3 Dec 2014, Arnd Bergmann wrote:

> On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> > At least on ARM, do_div() is optimized to turn constant divisors into
> > an inline multiplication by the reciprocal value at compile time.
> > However this optimization is missed entirely

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-03 Thread Nicolas Pitre
On Wed, 3 Dec 2014, Robert Jarzmik wrote:

> Nicolas Pitre writes:
>
> > Let ktime_divns() use do_div() inline whenever the divisor is constant
> > and small enough. This will make things like ktime_to_us() and
> > ktime_to_ms() much faster.
>
> Hi Nicolas,
>
> I suppose the "small enough"

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-03 Thread Robert Jarzmik
Nicolas Pitre writes:

> Let ktime_divns() use do_div() inline whenever the divisor is constant
> and small enough. This will make things like ktime_to_us() and
> ktime_to_ms() much faster.

Hi Nicolas,

I suppose the "small enough" is linked to the "!(div >> 32)" in your patch.
Can I have the
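
The "constant and small enough" test discussed here can be sketched in user space as follows. This is only an illustration under stated assumptions, not the patch itself: `sketch_divns` and `slow_divns` are hypothetical names, plain C division stands in for the kernel's do_div() (which takes a 64-bit dividend and a 32-bit divisor, hence the "!(div >> 32)" check), and the fast path assumes a non-negative `ns`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the out-of-line generic 64-bit division. */
static int64_t slow_divns(int64_t ns, int64_t div)
{
    return ns / div;
}

/*
 * Hypothetical sketch of the dispatch: take the inline path only when
 * the compiler can prove the divisor is a constant that fits in 32 bits.
 * Assumes ns >= 0 on the fast path.
 */
static inline int64_t sketch_divns(int64_t ns, int64_t div)
{
    if (__builtin_constant_p(div) && !(div >> 32)) {
        /* Divisor is a compile-time constant below 2^32. */
        return (int64_t)((uint64_t)ns / (uint32_t)div);
    }
    return slow_divns(ns, div);
}
```

`__builtin_constant_p()` is a GCC/Clang builtin; in an inline function it typically only evaluates to true once the call has been inlined with optimization enabled, so unoptimized builds would take the slow path and still return the same quotient.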

Re: [PATCH] optimize ktime_divns for constant divisors

2014-12-03 Thread Arnd Bergmann
On Wednesday 03 December 2014 14:43:06 Nicolas Pitre wrote:
> At least on ARM, do_div() is optimized to turn constant divisors into
> an inline multiplication by the reciprocal value at compile time.
> However this optimization is missed entirely whenever ktime_divns() is
> used and the slow

[PATCH] optimize ktime_divns for constant divisors

2014-12-03 Thread Nicolas Pitre
At least on ARM, do_div() is optimized to turn constant divisors into an inline multiplication by the reciprocal value at compile time. However this optimization is missed entirely whenever ktime_divns() is used and the slow out-of-line division code is used all the time. Let ktime_divns() use
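
As a rough illustration of what "multiplication by the reciprocal" means, the sketch below divides by 1000 with one multiply and one shift. It is not the ARM do_div() implementation: `div1000_by_reciprocal` is a hypothetical name, the reciprocal is taken as ceil(2^73 / 1000), and the 128-bit product relies on the `unsigned __int128` extension of GCC and Clang. With this particular reciprocal the result matches a real division for inputs below 2^63, which covers non-negative nanosecond counts held in a signed 64-bit value; larger inputs are not guaranteed to be exact.

```c
#include <stdint.h>

/*
 * Hypothetical helper: n / 1000 computed as (n * m) >> 73, where
 * m = ceil(2^73 / 1000). Exact for n < 2^63 (assumption stated above).
 */
static inline uint64_t div1000_by_reciprocal(uint64_t n)
{
    /* 2^73 is not a multiple of 1000, so floor + 1 is the ceiling. */
    const unsigned __int128 m = (((unsigned __int128)1 << 73) / 1000) + 1;

    /* n < 2^64 and m < 2^64, so the product fits in 128 bits. */
    return (uint64_t)((n * m) >> 73);
}
```

Because the divisor is a constant, the compiler can fold `m` at build time, leaving only the multiply and the shift in the generated code.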
