Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-20 Thread Thomas Gleixner
On Fri, 20 Sep 2013, Uwe Kleine-König wrote:
> On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote:
> > On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> > > > +* from nsec to device ticks will be correct.
> > > > +*
> > > > +* For mult > (1 << shift), i.e. device frequency is > 1GHz we
> > > > +* need to be careful. Adding mult - 1 will result in a value
> > > > +* which when converted back to device ticks will be larger
> > > s/will/can/
> > 
> > No, it will always be larger.
> Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then
> ns2clc(clc2ns(1000)) = 1000. So it's not always larger!
> In the fast-clock-case we have:
> With x << shift = n * mult - k for k in [0 .. mult-1] and an integer n:
> 
> ns2clc(clc2ns(x))
>   = ns2clc(((x << shift) + mult - 1) / mult)
>   = x << shift) + mult - 1) / mult) * mult) >> shift
>   = n * mult >> shift
>   = ((x << shift) + k) >> shift
>   = x + (k >> shift)
> 
> So ns2clc(clc2ns(x)) = x for all x > 0 that have
> 
>   k = mult - ((x << shift) - 1) % mult - 1 < 1 << shift
> 
> So my correction still stands.

Fair enough.  

> > 1) We cannot add if we'd overflow
> > 
> > 2) For mult <= 1 << shift it's always correct
> > 
> > 3) for mult > 1 << shift we only apply it to the min value not the max
> 
> Yeah, I didn't say your code is wrong *here*. I just think that my
> easier (and so probably faster) code is good enough.

Granted. I was stuck in the correctness discussion. So yeah, it does
not matter if we steal 30 usec of maximum idle sleep time from a 32kHz
clock. OTOH it does not matter much in the setup slow path to take
another conditional. :)

> Best regards and thanks for the nice discussion,
  
Ditto! You saved me from actually sitting down and using the pencil to
do the proper math.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-20 Thread Uwe Kleine-König
Hi Thomas,

On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote:
> On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> > > +  * For mult <= (1 << shift) we can safely add mult - 1 to
> > > +  * prevent integer rounding loss. So the backwards conversion
> > It doesn't prevent inexactness to add mult - 1. It (only) asserts that
> > the ns2delta(delta2ns(latch)) >= latch instead of ... <= latch when not
> > doing it.
> 
> For mult <= 1 << shift the conversion is always ending up with the
> same latch value.
Ah right, I missed that we're in the slow-clock-case.

> > > +  * from nsec to device ticks will be correct.
> > > +  *
> > > +  * For mult > (1 << shift), i.e. device frequency is > 1GHz we
> > > +  * need to be careful. Adding mult - 1 will result in a value
> > > +  * which when converted back to device ticks will be larger
> > s/will/can/
> 
> No, it will always be larger.
Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then
ns2clc(clc2ns(1000)) = 1000. So it's not always larger!
In the fast-clock-case we have:
With x << shift = n * mult - k for k in [0 .. mult-1] and an integer n:

  ns2clc(clc2ns(x))
= ns2clc(((x << shift) + mult - 1) / mult)
= x << shift) + mult - 1) / mult) * mult) >> shift
= n * mult >> shift
= ((x << shift) + k) >> shift
= x + (k >> shift)

So ns2clc(clc2ns(x)) = x for all x > 0 that have

k = mult - ((x << shift) - 1) % mult - 1 < 1 << shift

So my correction still stands.
 
> > > +  * than latch by (mult / (1 << shift)) - 1. For the min_delta
> > s/by/by up to/
>From the calculation above you can also see that this term is wrong. k
is smaller than mult (and there are values that realize k = mult - 1).
So the converted back value can be larger than latch by up to
(mult - 1) >> shift. This is zero for the slow-clock-case.

In the 1.25 GHz example above that means that the difference is up to 1,
not 0 as your term would imply. 1004 is an example where the conversion
to nano seconds and back to ticks results in a difference of 1.

> > > +  * calculation we still want to apply this in order to stay
> > > +  * above the minimum device ticks limit. For the upper limit
> > > +  * we would end up with a latch value larger than the upper
> > > +  * limit of the device, so we omit the add to stay below the
> > > +  * device upper boundary.
> > > +  *
> > > +  * Also omit the add if it would overflow the u64 boundary.
> > > +  */
> > > + if ((~0ULL - clc > rnd) &&
> > > + (!ismax || evt->mult <= (1U << evt->shift)))
> > > + clc += rnd;
> > I would expect that
> > 
> > if (!ismax)
> > if (~0ULL - clc > rnd)
> > clc += rnd;
> > else
> > clc = ~0ULL;
> > 
> > is enough (and a tad more exact in the presence of an overflow). I have
> > to think about that though.
> 
> Errm.
> 
> 1) We cannot add if we'd overflow
> 
> 2) For mult <= 1 << shift it's always correct
> 
> 3) for mult > 1 << shift we only apply it to the min value not the max
Yeah, I didn't say your code is wrong *here*. I just think that my
easier (and so probably faster) code is good enough.
 
> > >   clockevents_calc_mult_shift(dev, freq, sec);
> > > - dev->min_delta_ns = clockevent_delta2ns(dev->min_delta_ticks, dev);
> > > - dev->max_delta_ns = clockevent_delta2ns(dev->max_delta_ticks, dev);
> > > + dev->min_delta_ns = cev_delta2ns(dev->min_delta_ticks, dev, false);
> > > + dev->max_delta_ns = cev_delta2ns(dev->max_delta_ticks, dev, true);
> > Another improvement that came to my mind just now. For min_delta_ns you
> > want to assert that it results in a value >= min_delta_ticks when
> > converted back. For max_delta_ns you want ... value <= max_delta_ticks.
> > What about the values in between? They for sure should land in
> > [min_delta_ticks ... max_delta_ticks] when converted back and ideally
> > should be most exact. The latter part would mean to add (rnd / 2)
> > instead of rnd. I don't know yet how that would behave at the borders of
> > the [min_delta_ns ... max_delta_ns] interval, but I think you still need
> > to special-case that.
> 
> Again:
> 
> 1) For mult <= 1 << shift the backwards conversion is always the same as
>the input value.
> 
> 2) For mult > 1 << shift the backwards conversion of the min value is
>always > than the input value. And the backwards conversion of the
>max value is always < than the input value.
> 
> The values between that are completely uninteresting as the
> program_events code always converts from nsec to device ticks.
> 
> We clamp the delta between min_ns and max_ns. So due to the above any
> 
>min_ns <= delta <= max_ns
> 
> will after conversion fulfil 
> 
>min_tick <= delta_tick <= max_tick
> 
> So what are you going to improve? Either the math works or it does not.
Right, my idea is nice, but useless.

So I suggest you resend your patch with the compile fix and the
corrected comment and I will think about my 

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-20 Thread Thomas Gleixner
On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> > + u64 rnd = (u64) evt->mult - 1;
> >  
> >   if (unlikely(!evt->mult)) {
> >   evt->mult = 1;
> >   WARN_ON(1);
> >   }
> I suggest to move the assignment to rnd below this if block as it
> changes mult. 

True.

>  
> + /*
> +  * Upper bound sanity check. If the backwards conversion is
> +  * not equal latch, we know that the above shift overflowed.
> +  */
> + if (clc >> evt->shift) != (u64)latch)
You didn't compile test, did you? Also the cast on the rhs isn't needed.

I did. I just missed to refresh the patch before sending it :)

> > +* For mult <= (1 << shift) we can safely add mult - 1 to
> > +* prevent integer rounding loss. So the backwards conversion
> It doesn't prevent inexactness to add mult - 1. It (only) asserts that
> the ns2delta(delta2ns(latch)) >= latch instead of ... <= latch when not
> doing it.

For mult <= 1 << shift the conversion is always ending up with the
same latch value.
 
> > +* from nsec to device ticks will be correct.
> > +*
> > +* For mult > (1 << shift), i.e. device frequency is > 1GHz we
> > +* need to be careful. Adding mult - 1 will result in a value
> > +* which when converted back to device ticks will be larger
> s/will/can/

No, it will always be larger.

> > +* than latch by (mult / (1 << shift)) - 1. For the min_delta
> s/by/by up to/
> 
> > +* calculation we still want to apply this in order to stay
> > +* above the minimum device ticks limit. For the upper limit
> > +* we would end up with a latch value larger than the upper
> > +* limit of the device, so we omit the add to stay below the
> > +* device upper boundary.
> > +*
> > +* Also omit the add if it would overflow the u64 boundary.
> > +*/
> > +   if ((~0ULL - clc > rnd) &&
> > +   (!ismax || evt->mult <= (1U << evt->shift)))
> > +   clc += rnd;
> I would expect that
> 
>   if (!ismax)
>   if (~0ULL - clc > rnd)
>   clc += rnd;
>   else
>   clc = ~0ULL;
> 
> is enough (and a tad more exact in the presence of an overflow). I have
> to think about that though.

Errm.

1) We cannot add if we'd overflow

2) For mult <= 1 << shift it's always correct

3) for mult > 1 << shift we only apply it to the min value not the max
 
> > clockevents_calc_mult_shift(dev, freq, sec);
> > -   dev->min_delta_ns = clockevent_delta2ns(dev->min_delta_ticks, dev);
> > -   dev->max_delta_ns = clockevent_delta2ns(dev->max_delta_ticks, dev);
> > +   dev->min_delta_ns = cev_delta2ns(dev->min_delta_ticks, dev, false);
> > +   dev->max_delta_ns = cev_delta2ns(dev->max_delta_ticks, dev, true);
> Another improvement that came to my mind just now. For min_delta_ns you
> want to assert that it results in a value >= min_delta_ticks when
> converted back. For max_delta_ns you want ... value <= max_delta_ticks.
> What about the values in between? They for sure should land in
> [min_delta_ticks ... max_delta_ticks] when converted back and ideally
> should be most exact. The latter part would mean to add (rnd / 2)
> instead of rnd. I don't know yet how that would behave at the borders of
> the [min_delta_ns ... max_delta_ns] interval, but I think you still need
> to special-case that.

Again:

1) For mult <= 1 << shift the backwards conversion is always the same as
   the input value.

2) For mult > 1 << shift the backwards conversion of the min value is
   always > than the input value. And the backwards conversion of the
   max value is always < than the input value.

The values between that are completely uninteresting as the
program_events code always converts from nsec to device ticks.

We clamp the delta between min_ns and max_ns. So due to the above any

   min_ns <= delta <= max_ns

will after conversion fulfil 

   min_tick <= delta_tick <= max_tick

So what are you going to improve? Either the math works or it does not.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-20 Thread Thomas Gleixner
On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
  + u64 rnd = (u64) evt-mult - 1;
   
if (unlikely(!evt-mult)) {
evt-mult = 1;
WARN_ON(1);
}
 I suggest to move the assignment to rnd below this if block as it
 changes mult. 

True.

  
 + /*
 +  * Upper bound sanity check. If the backwards conversion is
 +  * not equal latch, we know that the above shift overflowed.
 +  */
 + if (clc  evt-shift) != (u64)latch)
You didn't compile test, did you? Also the cast on the rhs isn't needed.

I did. I just missed to refresh the patch before sending it :)

  +* For mult = (1  shift) we can safely add mult - 1 to
  +* prevent integer rounding loss. So the backwards conversion
 It doesn't prevent inexactness to add mult - 1. It (only) asserts that
 the ns2delta(delta2ns(latch)) = latch instead of ... = latch when not
 doing it.

For mult = 1  shift the conversion is always ending up with the
same latch value.
 
  +* from nsec to device ticks will be correct.
  +*
  +* For mult  (1  shift), i.e. device frequency is  1GHz we
  +* need to be careful. Adding mult - 1 will result in a value
  +* which when converted back to device ticks will be larger
 s/will/can/

No, it will always be larger.

  +* than latch by (mult / (1  shift)) - 1. For the min_delta
 s/by/by up to/
 
  +* calculation we still want to apply this in order to stay
  +* above the minimum device ticks limit. For the upper limit
  +* we would end up with a latch value larger than the upper
  +* limit of the device, so we omit the add to stay below the
  +* device upper boundary.
  +*
  +* Also omit the add if it would overflow the u64 boundary.
  +*/
  +   if ((~0ULL - clc  rnd) 
  +   (!ismax || evt-mult = (1U  evt-shift)))
  +   clc += rnd;
 I would expect that
 
   if (!ismax)
   if (~0ULL - clc  rnd)
   clc += rnd;
   else
   clc = ~0ULL;
 
 is enough (and a tad more exact in the presence of an overflow). I have
 to think about that though.

Errm.

1) We cannot add if we'd overflow

2) For mult = 1  shift it's always correct

3) for mult  1  shift we only apply it to the min value not the max
 
  clockevents_calc_mult_shift(dev, freq, sec);
  -   dev-min_delta_ns = clockevent_delta2ns(dev-min_delta_ticks, dev);
  -   dev-max_delta_ns = clockevent_delta2ns(dev-max_delta_ticks, dev);
  +   dev-min_delta_ns = cev_delta2ns(dev-min_delta_ticks, dev, false);
  +   dev-max_delta_ns = cev_delta2ns(dev-max_delta_ticks, dev, true);
 Another improvement that came to my mind just now. For min_delta_ns you
 want to assert that it results in a value = min_delta_ticks when
 converted back. For max_delta_ns you want ... value = max_delta_ticks.
 What about the values in between? They for sure should land in
 [min_delta_ticks ... max_delta_ticks] when converted back and ideally
 should be most exact. The latter part would mean to add (rnd / 2)
 instead of rnd. I don't know yet how that would behave at the borders of
 the [min_delta_ns ... max_delta_ns] interval, but I think you still need
 to special-case that.

Again:

1) For mult = 1  shift the backwards conversion is always the same as
   the input value.

2) For mult  1  shift the backwards conversion of the min value is
   always  than the input value. And the backwards conversion of the
   max value is always  than the input value.

The values between that are completely uninteresting as the
program_events code always converts from nsec to device ticks.

We clamp the delta between min_ns and max_ns. So due to the above any

   min_ns = delta = max_ns

will after conversion fulfil 

   min_tick = delta_tick = max_tick

So what are you going to improve? Either the math works or it does not.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-20 Thread Uwe Kleine-König
Hi Thomas,

On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote:
 On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
   +  * For mult = (1  shift) we can safely add mult - 1 to
   +  * prevent integer rounding loss. So the backwards conversion
  It doesn't prevent inexactness to add mult - 1. It (only) asserts that
  the ns2delta(delta2ns(latch)) = latch instead of ... = latch when not
  doing it.
 
 For mult = 1  shift the conversion is always ending up with the
 same latch value.
Ah right, I missed that we're in the slow-clock-case.

   +  * from nsec to device ticks will be correct.
   +  *
   +  * For mult  (1  shift), i.e. device frequency is  1GHz we
   +  * need to be careful. Adding mult - 1 will result in a value
   +  * which when converted back to device ticks will be larger
  s/will/can/
 
 No, it will always be larger.
Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then
ns2clc(clc2ns(1000)) = 1000. So it's not always larger!
In the fast-clock-case we have:
With x  shift = n * mult - k for k in [0 .. mult-1] and an integer n:

  ns2clc(clc2ns(x))
= ns2clc(((x  shift) + mult - 1) / mult)
= x  shift) + mult - 1) / mult) * mult)  shift
= n * mult  shift
= ((x  shift) + k)  shift
= x + (k  shift)

So ns2clc(clc2ns(x)) = x for all x  0 that have

k = mult - ((x  shift) - 1) % mult - 1  1  shift

So my correction still stands.
 
   +  * than latch by (mult / (1  shift)) - 1. For the min_delta
  s/by/by up to/
From the calculation above you can also see that this term is wrong. k
is smaller than mult (and there are values that realize k = mult - 1).
So the converted back value can be larger than latch by up to
(mult - 1)  shift. This is zero for the slow-clock-case.

In the 1.25 GHz example above that means that the difference is up to 1,
not 0 as your term would imply. 1004 is an example where the conversion
to nano seconds and back to ticks results in a difference of 1.

   +  * calculation we still want to apply this in order to stay
   +  * above the minimum device ticks limit. For the upper limit
   +  * we would end up with a latch value larger than the upper
   +  * limit of the device, so we omit the add to stay below the
   +  * device upper boundary.
   +  *
   +  * Also omit the add if it would overflow the u64 boundary.
   +  */
   + if ((~0ULL - clc  rnd) 
   + (!ismax || evt-mult = (1U  evt-shift)))
   + clc += rnd;
  I would expect that
  
  if (!ismax)
  if (~0ULL - clc  rnd)
  clc += rnd;
  else
  clc = ~0ULL;
  
  is enough (and a tad more exact in the presence of an overflow). I have
  to think about that though.
 
 Errm.
 
 1) We cannot add if we'd overflow
 
 2) For mult = 1  shift it's always correct
 
 3) for mult  1  shift we only apply it to the min value not the max
Yeah, I didn't say your code is wrong *here*. I just think that my
easier (and so probably faster) code is good enough.
 
 clockevents_calc_mult_shift(dev, freq, sec);
   - dev-min_delta_ns = clockevent_delta2ns(dev-min_delta_ticks, dev);
   - dev-max_delta_ns = clockevent_delta2ns(dev-max_delta_ticks, dev);
   + dev-min_delta_ns = cev_delta2ns(dev-min_delta_ticks, dev, false);
   + dev-max_delta_ns = cev_delta2ns(dev-max_delta_ticks, dev, true);
  Another improvement that came to my mind just now. For min_delta_ns you
  want to assert that it results in a value = min_delta_ticks when
  converted back. For max_delta_ns you want ... value = max_delta_ticks.
  What about the values in between? They for sure should land in
  [min_delta_ticks ... max_delta_ticks] when converted back and ideally
  should be most exact. The latter part would mean to add (rnd / 2)
  instead of rnd. I don't know yet how that would behave at the borders of
  the [min_delta_ns ... max_delta_ns] interval, but I think you still need
  to special-case that.
 
 Again:
 
 1) For mult = 1  shift the backwards conversion is always the same as
the input value.
 
 2) For mult  1  shift the backwards conversion of the min value is
always  than the input value. And the backwards conversion of the
max value is always  than the input value.
 
 The values between that are completely uninteresting as the
 program_events code always converts from nsec to device ticks.
 
 We clamp the delta between min_ns and max_ns. So due to the above any
 
min_ns = delta = max_ns
 
 will after conversion fulfil 
 
min_tick = delta_tick = max_tick
 
 So what are you going to improve? Either the math works or it does not.
Right, my idea is nice, but useless.

So I suggest you resend your patch with the compile fix and the
corrected comment and I will think about my suggestion to simplify the
if condition independently as it's only a small runtime improvent and so
not important enough to stop the correctness issue your patch is fixing.

Best regards and thanks for the nice discussion,
Uwe

-- 

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-20 Thread Thomas Gleixner
On Fri, 20 Sep 2013, Uwe Kleine-König wrote:
 On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote:
  On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
+* from nsec to device ticks will be correct.
+*
+* For mult  (1  shift), i.e. device frequency is  1GHz we
+* need to be careful. Adding mult - 1 will result in a value
+* which when converted back to device ticks will be larger
   s/will/can/
  
  No, it will always be larger.
 Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then
 ns2clc(clc2ns(1000)) = 1000. So it's not always larger!
 In the fast-clock-case we have:
 With x  shift = n * mult - k for k in [0 .. mult-1] and an integer n:
 
 ns2clc(clc2ns(x))
   = ns2clc(((x  shift) + mult - 1) / mult)
   = x  shift) + mult - 1) / mult) * mult)  shift
   = n * mult  shift
   = ((x  shift) + k)  shift
   = x + (k  shift)
 
 So ns2clc(clc2ns(x)) = x for all x  0 that have
 
   k = mult - ((x  shift) - 1) % mult - 1  1  shift
 
 So my correction still stands.

Fair enough.  

  1) We cannot add if we'd overflow
  
  2) For mult = 1  shift it's always correct
  
  3) for mult  1  shift we only apply it to the min value not the max
 
 Yeah, I didn't say your code is wrong *here*. I just think that my
 easier (and so probably faster) code is good enough.

Granted. I was stuck in the correctness discussion. So yeah, it does
not matter if we steal 30 usec of maximum idle sleep time from a 32kHz
clock. OTOH it does not matter much in the setup slow path to take
another conditional. :)

 Best regards and thanks for the nice discussion,
  
Ditto! You saved me from actually sitting down and using the pencil to
do the proper math.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Uwe Kleine-König
Hi Thomas,

On Thu, Sep 19, 2013 at 04:30:37PM +0200, Thomas Gleixner wrote:
> Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
> clockevents_config_and_register() where possible" caused a regression
> for some of the converted subarchs.
> 
> The reason is, that the clockevents core code converts the minimal
> hardware tick delta to a nanosecond value for core internal
> usage. This conversion is affected by integer math rounding loss, so
> the backwards conversion to hardware ticks will likely result in a
> value which is less than the configured hardware limitation. The
> affected subarchs used their own workaround (SIGH!) which got lost in
> the conversion.
> 
> The solution for the issue at hand is simple: adding evt->mult - 1 to
> the shifted value before the integer divison in the core conversion
> function takes care of it. But this only works for the case where for
> the scaled math mult/shift pair "mult <= 1 << shift" is true. For the
> case where "mult > 1 << shift" we can apply the rounding add only for
> the minimum delta value to make sure that the backward conversion is
> not less than the given hardware limit. For the upper bound we need to
> omit the rounding add, because the backwards conversion is always
> larger than the original latch value. That would violate the upper
> bound of the hardware device.
> 
> Though looking closer at the details of that function reveals another
> bogosity: The upper bounds check is broken as well. Checking for a
> resulting "clc" value greater than KTIME_MAX after the conversion is
> pointless. The conversion does:
> 
>   u64 clc = (latch << evt->shift) / evt->mult;
> 
> So there is no sanity check for (latch << evt->shift) exceeding the
> 64bit boundary. The latch argument is "unsigned long", so on a 64bit
> arch the handed in argument could easily lead to an unnoticed shift
> overflow. With the above rounding fix applied the calculation before
> the divison is:
> 
>u64 clc = (latch << evt->shift) + evt->mult - 1;
> 
> So we need to make sure, that neither the shift nor the rounding add
> is overflowing the u64 boundary.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Russell King - ARM Linux 
> Cc: Marc Kleine-Budde 
> Cc: nicolas.fe...@atmel.com
> Cc: Marc Pignat 
> Cc: john.stu...@linaro.org
> Cc: ker...@pengutronix.de
> Cc: Ronald Wahl 
> Cc: LAK 
> Cc: u.kleine-koe...@pengutronix.de
> Cc: Ludovic Desroches 
> 
> ---
>  kernel/time/clockevents.c |   64 
> +++---
>  1 file changed, 49 insertions(+), 15 deletions(-)
> 
> Index: linux-2.6/kernel/time/clockevents.c
> ===
> --- linux-2.6.orig/kernel/time/clockevents.c
> +++ linux-2.6/kernel/time/clockevents.c
> @@ -33,29 +33,63 @@ struct ce_unbind {
>   int res;
>  };
>  
> -/**
> - * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds
> - * @latch:   value to convert
> - * @evt: pointer to clock event device descriptor
> - *
> - * Math helper, returns latch value converted to nanoseconds (bound checked)
> - */
> -u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
> +static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt,
> + bool ismax)
>  {
>   u64 clc = (u64) latch << evt->shift;
> + u64 rnd = (u64) evt->mult - 1;
>  
>   if (unlikely(!evt->mult)) {
>   evt->mult = 1;
>   WARN_ON(1);
>   }
I suggest to move the assignment to rnd below this if block as it
changes mult.

>  
> + /*
> +  * Upper bound sanity check. If the backwards conversion is
> +  * not equal latch, we know that the above shift overflowed.
> +  */
> + if (clc >> evt->shift) != (u64)latch)
You didn't compile test, did you? Also the cast on the rhs isn't needed.

> + clc = ~0ULL;
> +
> + /*
> +  * Scaled math oddities:
> +  *
> +  * For mult <= (1 << shift) we can safely add mult - 1 to
> +  * prevent integer rounding loss. So the backwards conversion
It doesn't prevent inexactness to add mult - 1. It (only) asserts that
the ns2delta(delta2ns(latch)) >= latch instead of ... <= latch when not
doing it.

> +  * from nsec to device ticks will be correct.
> +  *
> +  * For mult > (1 << shift), i.e. device frequency is > 1GHz we
> +  * need to be careful. Adding mult - 1 will result in a value
> +  * which when converted back to device ticks will be larger
s/will/can/

> +  * than latch by (mult / (1 << shift)) - 1. For the min_delta
s/by/by up to/

> +  * calculation we still want to apply this in order to stay
> +  * above the minimum device ticks limit. For the upper limit
> +  * we would end up with a latch value larger than the upper
> +  * limit of the device, so we omit the add to stay below the
> +  * device upper boundary.
> +  *
> +  * Also omit the add if it would 

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Thomas Gleixner
Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
clockevents_config_and_register() where possible" caused a regression
for some of the converted subarchs.

The reason is, that the clockevents core code converts the minimal
hardware tick delta to a nanosecond value for core internal
usage. This conversion is affected by integer math rounding loss, so
the backwards conversion to hardware ticks will likely result in a
value which is less than the configured hardware limitation. The
affected subarchs used their own workaround (SIGH!) which got lost in
the conversion.

The solution for the issue at hand is simple: adding evt->mult - 1 to
the shifted value before the integer divison in the core conversion
function takes care of it. But this only works for the case where for
the scaled math mult/shift pair "mult <= 1 << shift" is true. For the
case where "mult > 1 << shift" we can apply the rounding add only for
the minimum delta value to make sure that the backward conversion is
not less than the given hardware limit. For the upper bound we need to
omit the rounding add, because the backwards conversion is always
larger than the original latch value. That would violate the upper
bound of the hardware device.

Though looking closer at the details of that function reveals another
bogosity: The upper bounds check is broken as well. Checking for a
resulting "clc" value greater than KTIME_MAX after the conversion is
pointless. The conversion does:

  u64 clc = (latch << evt->shift) / evt->mult;

So there is no sanity check for (latch << evt->shift) exceeding the
64bit boundary. The latch argument is "unsigned long", so on a 64bit
arch the handed in argument could easily lead to an unnoticed shift
overflow. With the above rounding fix applied the calculation before
the divison is:

   u64 clc = (latch << evt->shift) + evt->mult - 1;

So we need to make sure, that neither the shift nor the rounding add
is overflowing the u64 boundary.

Signed-off-by: Thomas Gleixner 
Cc: Russell King - ARM Linux 
Cc: Marc Kleine-Budde 
Cc: nicolas.fe...@atmel.com
Cc: Marc Pignat 
Cc: john.stu...@linaro.org
Cc: ker...@pengutronix.de
Cc: Ronald Wahl 
Cc: LAK 
Cc: u.kleine-koe...@pengutronix.de
Cc: Ludovic Desroches 

---
 kernel/time/clockevents.c |   64 +++---
 1 file changed, 49 insertions(+), 15 deletions(-)

Index: linux-2.6/kernel/time/clockevents.c
===
--- linux-2.6.orig/kernel/time/clockevents.c
+++ linux-2.6/kernel/time/clockevents.c
@@ -33,29 +33,63 @@ struct ce_unbind {
int res;
 };
 
-/**
- * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds
- * @latch: value to convert
- * @evt:   pointer to clock event device descriptor
- *
- * Math helper, returns latch value converted to nanoseconds (bound checked)
- */
-u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
+static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt,
+   bool ismax)
 {
u64 clc = (u64) latch << evt->shift;
+   u64 rnd = (u64) evt->mult - 1;
 
if (unlikely(!evt->mult)) {
evt->mult = 1;
WARN_ON(1);
}
 
+   /*
+* Upper bound sanity check. If the backwards conversion is
+* not equal latch, we know that the above shift overflowed.
+*/
+   if (clc >> evt->shift) != (u64)latch)
+   clc = ~0ULL;
+
+   /*
+* Scaled math oddities:
+*
+* For mult <= (1 << shift) we can safely add mult - 1 to
+* prevent integer rounding loss. So the backwards conversion
+* from nsec to device ticks will be correct.
+*
+* For mult > (1 << shift), i.e. device frequency is > 1GHz we
+* need to be careful. Adding mult - 1 will result in a value
+* which when converted back to device ticks will be larger
+* than latch by (mult / (1 << shift)) - 1. For the min_delta
+* calculation we still want to apply this in order to stay
+* above the minimum device ticks limit. For the upper limit
+* we would end up with a latch value larger than the upper
+* limit of the device, so we omit the add to stay below the
+* device upper boundary.
+*
+* Also omit the add if it would overflow the u64 boundary.
+*/
+   if ((~0ULL - clc > rnd) &&
+   (!ismax || evt->mult <= (1U << evt->shift)))
+   clc += rnd;
+
do_div(clc, evt->mult);
-   if (clc < 1000)
-   clc = 1000;
-   if (clc > KTIME_MAX)
-   clc = KTIME_MAX;
 
-   return clc;
+   /* Deltas less than 1usec are pointless noise */
+   return clc > 1000 ? clc : 1000;
+}
+
+/**
+ * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds
+ * @latch: value to convert
+ * @evt:   

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Thomas Gleixner
On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote:
> > On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> > > On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
> > > > Versus the 64bit overflow check, we need to be even more careful. We
> > > > need to check for overflowing (1 << 63) - 1 (i.e. the max positive
> > > > value which fits into a s64). See clockevents_program_event().
> > > 
> > > That is because you interpret times < 0 as in the past, right? But note
> > > that the interim result we're talking about here is still to be divided
> > > by evt->mult. So assuming mult > 1, that check is too strict unless you
> > > move it below the do_div in clockevent_delta2ns. For sure it makes sense
> > > to use the same value for a and b in the handling:
> > 
> > No, it's not too strict.
> > 
> > nsec = (latch << shift) / mult;
> > 
> > Now the backwards conversion does:
> > 
> > latch = (nsec * mult) >> shift;
> >
> > So we want nsec * mult to be in the positive range of s64. Which
> > means, that latch << shift must be in that range as well.
> The backwards conversion is in clockevents_program_event(), right? There
> is:
> 
>   clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> 
> So I don't see a problem if nsec * mult overflows (1 << 63) - 1 as long
> as it still fits into an unsigned long long (i.e. a 64 bit value).

Right. It doesn't matter.


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Uwe Kleine-König
Hello Thomas,

On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote:
> On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> > On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
> > > Versus the 64bit overflow check, we need to be even more careful. We
> > > need to check for overflowing (1 << 63) - 1 (i.e. the max positive
> > > value which fits into a s64). See clockevents_program_event().
> > 
> > That is because you interpret times < 0 as in the past, right? But note
> > that the interim result we're talking about here is still to be divided
> > by evt->mult. So assuming mult > 1, that check is too strict unless you
> > move it below the do_div in clockevent_delta2ns. For sure it makes sense
> > to use the same value for a and b in the handling:
> 
> No, it's not too strict.
> 
> nsec = (latch << shift) / mult;
> 
> Now the backwards conversion does:
> 
> latch = (nsec * mult) >> shift;
>
> So we want nsec * mult to be in the positive range of s64. Which
> means, that latch << shift must be in that range as well.
The backwards conversion is in clockevents_program_event(), right? There
is:

clc = ((unsigned long long) delta * dev->mult) >> dev->shift;

So I don't see a problem if nsec * mult overflows (1 << 63) - 1 as long
as it still fits into an unsigned long long (i.e. a 64 bit value).

What am I missing?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Thomas Gleixner
On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
> On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
> > Versus the 64bit overflow check, we need to be even more careful. We
> > need to check for overflowing (1 << 63) - 1 (i.e. the max positive
> > value which fits into a s64). See clockevents_program_event().
> 
> That is because you interpret times < 0 as in the past, right? But note
> that the interim result we're talking about here is still to be divided
> by evt->mult. So assuming mult > 1, that check is too strict unless you
> move it below the do_div in clockevent_delta2ns. For sure it makes sense
> to use the same value for a and b in the handling:

No, it's not too strict.

nsec = (latch << shift) / mult;

Now the backwards conversion does:

latch = (nsec * mult) >> shift;
 
So we want nsec * mult to be in the positive range of s64. Which
means, that latch << shift must be in that range as well.

Thanks,

tglx



Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Uwe Kleine-König
Hello Thomas,

On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
> On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
> > On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote:
> > > On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
> > > > Another doubt I have is: You changed clockevent_delta2ns to round up now
> > > > unconditionally. For the numbers on at91 that doesn't matter, but I
> > > > wonder if there are situations that make the timer core violate the
> > > > max_delta_ticks condition now.
> > > 
> > > And how so? The + (mult - 1) ensures, that the conversion back to
> > > ticks results in the same value as latch. So how should it violate
> > > the max boundary?
> > That is wrong:
> > With max_delta_ticks << shift = n * mult - k for k in [0 .. mult-1] and
> > an integer n:
> > 
> >   (max_delta_ns * mult) >> shift
> > = max_delta_ticks << shift) + mult - 1) / mult) * mult) >> shift
> > = (((n * mult - k + mult - 1) / mult) * mult) >> shift
> > = n * mult >> shift
> > = ((max_delta_ticks << shift) + k) >> shift
> > = max_delta_ticks + (k >> shift)
> > 
> > k >> shift is only known to be zero if mult <= 1 << shift (i.e. the same
> > condition as above where your 64bit overflow detection is wrong). So
> > this can result in values > max_delta_ticks.
> 
> Right, should have waited for coffee actually activating the right
> brain cells.
> 
> So the rounding thing works nicely for mult <= (1 << sft). For the
> other cases the conversion is wrong by (mult / (1 << sft)) - 1. So for
> a 3 GHz clock it's off by 2. That's not an issue for the min value,
> but for the max value it might overflow the hardware limit. Not a real
> functional issue as we'd just get a pointless short interrupt, but for
> correctness reasons and for the sake of NOHZ we need to fix that.
Well, you're assuming here that this is how the hardware reacts. Might
be worse. But we're on the same side here, it should be fixed.
 
> There is no real correct solution for these cases, which will result
> in a correct backwards conversion. We could validate against the max
> delta ticks in program_event, but that adds another boundary check to
> the hotpath so I rather want to avoid it.
> 
> The simple solution is to check for the following in the conversion
> routine:
> 
> if (latch == dev->min_delta_ticks || dev->mult <= (1 << dev->sft))
>clc += mult - 1
I didn't do the necessary math, but I wouldn't be surprised if you'd
need to check for

latch <= dev->min_delta_ticks + n

for some non-negative n.

Another simple solution is to apply the uprounding only for
min_delta_ticks -> min_delta_ns conversion, but not for the
max_delta_ticks -> max_delta_ns conversion. For the first you have to
guarantee not to yield lower values so round up, for the latter you must
not pass on bigger values so round down.
 
> That ensures, that we never violate the lower boundary independent of
> the mult/sft relation. For the max_delta_ticks conversion in the "mult
> > (1 << sft)" case we end up with a slightly smaller value, but well
> inside the boundaries and close enough to the hardware limitations.
Yeah, probably it's not worth to calculate the optimal value (actually
for both limits). Anyhow, I will check that on my next shopping tour :-)
 
> Versus the 64bit overflow check, we need to be even more careful. We
> need to check for overflowing (1 << 63) - 1 (i.e. the max positive
> value which fits into a s64). See clockevents_program_event().
That is because you interpret times < 0 as in the past, right? But note
that the interim result we're talking about here is still to be divided
by evt->mult. So assuming mult > 1, that check is too strict unless you
move it below the do_div in clockevent_delta2ns. For sure it makes sense
to use the same value for a and b in the handling:

if (calculatedval overflows a)
calculatedval = b

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Uwe Kleine-König
Hello Thomas,

On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
 On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
  On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote:
   On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
Another doubt I have is: You changed clockevent_delta2ns to round up now
unconditionally. For the numbers on at91 that doesn't matter, but I
wonder if there are situations that make the timer core violate the
max_delta_ticks condition now.
   
   And how so? The + (mult - 1) ensures, that the conversion back to
   ticks results in the same value as latch. So how should it violate
   the max boundary?
  That is wrong:
  With max_delta_ticks  shift = n * mult - k for k in [0 .. mult-1] and
  an integer n:
  
(max_delta_ns * mult)  shift
  = max_delta_ticks  shift) + mult - 1) / mult) * mult)  shift
  = (((n * mult - k + mult - 1) / mult) * mult)  shift
  = n * mult  shift
  = ((max_delta_ticks  shift) + k)  shift
  = max_delta_ticks + (k  shift)
  
  k  shift is only known to be zero if mult = 1  shift (i.e. the same
  condition as above where your 64bit overflow detection is wrong). So
  this can result in values  max_delta_ticks.
 
 Right, should have waited for coffee actually activating the right
 brain cells.
 
 So the rounding thing works nicely for mult = (1  sft). For the
 other cases the conversion is wrong by (mult / (1  sft)) - 1. So for
 a 3 GHz clock it's off by 2. That's not an issue for the min value,
 but for the max value it might overflow the hardware limit. Not a real
 functional issue as we'd just get a pointless short interrupt, but for
 correctness reasons and for the sake of NOHZ we need to fix that.
Well, you're assuming here that this is how the hardware reacts. Might
be worse. But we're on the same side here, it should be fixed.
 
 There is no real correct solution for these cases, which will result
 in a correct backwards conversion. We could validate against the max
 delta ticks in program_event, but that adds another boundary check to
 the hotpath so I rather want to avoid it.
 
 The simple solution is to check for the following in the conversion
 routine:
 
 if (latch == dev-min_delta_ticks || dev-mult = (1  dev-sft))
clc += mult - 1
I didn't do the necessary math, but I wouldn't be surprised if you'd
need to check for

latch = dev-min_delta_ticks + n

for some non-negative n.

Another simple solution is to apply the uprounding only for
min_delta_ticks - min_delta_ns conversion, but not for the
max_delta_ticks - max_delta_ns conversion. For the first you have to
guarantee not to yield lower values so round up, for the latter you must
not pass on bigger values so round down.
 
 That ensures, that we never violate the lower boundary independent of
 the mult/sft relation. For the max_delta_ticks conversion in the mult
  (1  sft) case we end up with a slightly smaller value, but well
 inside the boundaries and close enough to the hardware limitations.
Yeah, probably it's not worth to calculate the optimal value (actually
for both limits). Anyhow, I will check that on my next shopping tour :-)
 
 Versus the 64bit overflow check, we need to be even more careful. We
 need to check for overflowing (1  63) - 1 (i.e. the max positive
 value which fits into a s64). See clockevents_program_event().
That is because you interpret times  0 as in the past, right? But note
that the interim result we're talking about here is still to be divided
by evt-mult. So assuming mult  1, that check is too strict unless you
move it below the do_div in clockevent_delta2ns. For sure it makes sense
to use the same value for a and b in the handling:

if (calculatedval overflows a)
calculatedval = b

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Thomas Gleixner
On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
 On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
  Versus the 64bit overflow check, we need to be even more careful. We
  need to check for overflowing (1  63) - 1 (i.e. the max positive
  value which fits into a s64). See clockevents_program_event().
 
 That is because you interpret times  0 as in the past, right? But note
 that the interim result we're talking about here is still to be divided
 by evt-mult. So assuming mult  1, that check is too strict unless you
 move it below the do_div in clockevent_delta2ns. For sure it makes sense
 to use the same value for a and b in the handling:

No, it's not too strict.

nsec = (latch  shift) / mult;

Now the backwards conversion does:

latch = (nsec * mult)  shift;
 
So we want nsec * mult to be in the positive range of s64. Which
means, that latch  shift must be in that range as well.

Thanks,

tglx



Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Uwe Kleine-König
Hello Thomas,

On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote:
 On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
  On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
   Versus the 64bit overflow check, we need to be even more careful. We
   need to check for overflowing (1  63) - 1 (i.e. the max positive
   value which fits into a s64). See clockevents_program_event().
  
  That is because you interpret times  0 as in the past, right? But note
  that the interim result we're talking about here is still to be divided
  by evt-mult. So assuming mult  1, that check is too strict unless you
  move it below the do_div in clockevent_delta2ns. For sure it makes sense
  to use the same value for a and b in the handling:
 
 No, it's not too strict.
 
 nsec = (latch  shift) / mult;
 
 Now the backwards conversion does:
 
 latch = (nsec * mult)  shift;

 So we want nsec * mult to be in the positive range of s64. Which
 means, that latch  shift must be in that range as well.
The backwards conversion is in clockevents_program_event(), right? There
is:

clc = ((unsigned long long) delta * dev-mult)  dev-shift;

So I don't see a problem if nsec * mult overflows (1  63) - 1 as long
as it still fits into an unsigned long long (i.e. a 64 bit value).

What am I missing?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Thomas Gleixner
On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
 On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote:
  On Thu, 19 Sep 2013, Uwe Kleine-König wrote:
   On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote:
Versus the 64bit overflow check, we need to be even more careful. We
need to check for overflowing (1  63) - 1 (i.e. the max positive
value which fits into a s64). See clockevents_program_event().
   
   That is because you interpret times  0 as in the past, right? But note
   that the interim result we're talking about here is still to be divided
   by evt-mult. So assuming mult  1, that check is too strict unless you
   move it below the do_div in clockevent_delta2ns. For sure it makes sense
   to use the same value for a and b in the handling:
  
  No, it's not too strict.
  
  nsec = (latch  shift) / mult;
  
  Now the backwards conversion does:
  
  latch = (nsec * mult)  shift;
 
  So we want nsec * mult to be in the positive range of s64. Which
  means, that latch  shift must be in that range as well.
 The backwards conversion is in clockevents_program_event(), right? There
 is:
 
   clc = ((unsigned long long) delta * dev-mult)  dev-shift;
 
 So I don't see a problem if nsec * mult overflows (1  63) - 1 as long
 as it still fits into an unsigned long long (i.e. a 64 bit value).

Right. It doesn't matter.


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Thomas Gleixner
Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use
clockevents_config_and_register() where possible caused a regression
for some of the converted subarchs.

The reason is, that the clockevents core code converts the minimal
hardware tick delta to a nanosecond value for core internal
usage. This conversion is affected by integer math rounding loss, so
the backwards conversion to hardware ticks will likely result in a
value which is less than the configured hardware limitation. The
affected subarchs used their own workaround (SIGH!) which got lost in
the conversion.

The solution for the issue at hand is simple: adding evt-mult - 1 to
the shifted value before the integer divison in the core conversion
function takes care of it. But this only works for the case where for
the scaled math mult/shift pair mult = 1  shift is true. For the
case where mult  1  shift we can apply the rounding add only for
the minimum delta value to make sure that the backward conversion is
not less than the given hardware limit. For the upper bound we need to
omit the rounding add, because the backwards conversion is always
larger than the original latch value. That would violate the upper
bound of the hardware device.

Though looking closer at the details of that function reveals another
bogosity: The upper bounds check is broken as well. Checking for a
resulting clc value greater than KTIME_MAX after the conversion is
pointless. The conversion does:

  u64 clc = (latch  evt-shift) / evt-mult;

So there is no sanity check for (latch  evt-shift) exceeding the
64bit boundary. The latch argument is unsigned long, so on a 64bit
arch the handed in argument could easily lead to an unnoticed shift
overflow. With the above rounding fix applied the calculation before
the divison is:

   u64 clc = (latch  evt-shift) + evt-mult - 1;

So we need to make sure, that neither the shift nor the rounding add
is overflowing the u64 boundary.

Signed-off-by: Thomas Gleixner t...@linutronix.de
Cc: Russell King - ARM Linux li...@arm.linux.org.uk
Cc: Marc Kleine-Budde m...@pengutronix.de
Cc: nicolas.fe...@atmel.com
Cc: Marc Pignat marc.pig...@hevs.ch
Cc: john.stu...@linaro.org
Cc: ker...@pengutronix.de
Cc: Ronald Wahl ronald.w...@raritan.com
Cc: LAK linux-arm-ker...@lists.infradead.org
Cc: u.kleine-koe...@pengutronix.de
Cc: Ludovic Desroches ludovic.desroc...@atmel.com

---
 kernel/time/clockevents.c |   64 +++---
 1 file changed, 49 insertions(+), 15 deletions(-)

Index: linux-2.6/kernel/time/clockevents.c
===
--- linux-2.6.orig/kernel/time/clockevents.c
+++ linux-2.6/kernel/time/clockevents.c
@@ -33,29 +33,63 @@ struct ce_unbind {
int res;
 };
 
-/**
- * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds
- * @latch: value to convert
- * @evt:   pointer to clock event device descriptor
- *
- * Math helper, returns latch value converted to nanoseconds (bound checked)
- */
-u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
+static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt,
+   bool ismax)
 {
u64 clc = (u64) latch  evt-shift;
+   u64 rnd = (u64) evt-mult - 1;
 
if (unlikely(!evt-mult)) {
evt-mult = 1;
WARN_ON(1);
}
 
+   /*
+* Upper bound sanity check. If the backwards conversion is
+* not equal latch, we know that the above shift overflowed.
+*/
+   if (clc  evt-shift) != (u64)latch)
+   clc = ~0ULL;
+
+   /*
+* Scaled math oddities:
+*
+* For mult = (1  shift) we can safely add mult - 1 to
+* prevent integer rounding loss. So the backwards conversion
+* from nsec to device ticks will be correct.
+*
+* For mult  (1  shift), i.e. device frequency is  1GHz we
+* need to be careful. Adding mult - 1 will result in a value
+* which when converted back to device ticks will be larger
+* than latch by (mult / (1  shift)) - 1. For the min_delta
+* calculation we still want to apply this in order to stay
+* above the minimum device ticks limit. For the upper limit
+* we would end up with a latch value larger than the upper
+* limit of the device, so we omit the add to stay below the
+* device upper boundary.
+*
+* Also omit the add if it would overflow the u64 boundary.
+*/
+   if ((~0ULL - clc  rnd) 
+   (!ismax || evt-mult = (1U  evt-shift)))
+   clc += rnd;
+
do_div(clc, evt-mult);
-   if (clc  1000)
-   clc = 1000;
-   if (clc  KTIME_MAX)
-   clc = KTIME_MAX;
 
-   return clc;
+   /* Deltas less than 1usec are pointless noise */
+   return clc  1000 ? clc : 1000;
+}
+
+/**
+ * clockevents_delta2ns - 

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-19 Thread Uwe Kleine-König
Hi Thomas,

On Thu, Sep 19, 2013 at 04:30:37PM +0200, Thomas Gleixner wrote:
 Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use
 clockevents_config_and_register() where possible caused a regression
 for some of the converted subarchs.
 
 The reason is, that the clockevents core code converts the minimal
 hardware tick delta to a nanosecond value for core internal
 usage. This conversion is affected by integer math rounding loss, so
 the backwards conversion to hardware ticks will likely result in a
 value which is less than the configured hardware limitation. The
 affected subarchs used their own workaround (SIGH!) which got lost in
 the conversion.
 
 The solution for the issue at hand is simple: adding evt-mult - 1 to
 the shifted value before the integer divison in the core conversion
 function takes care of it. But this only works for the case where for
 the scaled math mult/shift pair mult = 1  shift is true. For the
 case where mult  1  shift we can apply the rounding add only for
 the minimum delta value to make sure that the backward conversion is
 not less than the given hardware limit. For the upper bound we need to
 omit the rounding add, because the backwards conversion is always
 larger than the original latch value. That would violate the upper
 bound of the hardware device.
 
 Though looking closer at the details of that function reveals another
 bogosity: The upper bounds check is broken as well. Checking for a
 resulting clc value greater than KTIME_MAX after the conversion is
 pointless. The conversion does:
 
   u64 clc = (latch  evt-shift) / evt-mult;
 
 So there is no sanity check for (latch  evt-shift) exceeding the
 64bit boundary. The latch argument is unsigned long, so on a 64bit
 arch the handed in argument could easily lead to an unnoticed shift
 overflow. With the above rounding fix applied the calculation before
 the divison is:
 
u64 clc = (latch  evt-shift) + evt-mult - 1;
 
 So we need to make sure, that neither the shift nor the rounding add
 is overflowing the u64 boundary.
 
 Signed-off-by: Thomas Gleixner t...@linutronix.de
 Cc: Russell King - ARM Linux li...@arm.linux.org.uk
 Cc: Marc Kleine-Budde m...@pengutronix.de
 Cc: nicolas.fe...@atmel.com
 Cc: Marc Pignat marc.pig...@hevs.ch
 Cc: john.stu...@linaro.org
 Cc: ker...@pengutronix.de
 Cc: Ronald Wahl ronald.w...@raritan.com
 Cc: LAK linux-arm-ker...@lists.infradead.org
 Cc: u.kleine-koe...@pengutronix.de
 Cc: Ludovic Desroches ludovic.desroc...@atmel.com
 
 ---
  kernel/time/clockevents.c |   64 
 +++---
  1 file changed, 49 insertions(+), 15 deletions(-)
 
 Index: linux-2.6/kernel/time/clockevents.c
 ===
 --- linux-2.6.orig/kernel/time/clockevents.c
 +++ linux-2.6/kernel/time/clockevents.c
 @@ -33,29 +33,63 @@ struct ce_unbind {
   int res;
  };
  
 -/**
 - * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds
 - * @latch:   value to convert
 - * @evt: pointer to clock event device descriptor
 - *
 - * Math helper, returns latch value converted to nanoseconds (bound checked)
 - */
 -u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
 +static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt,
 + bool ismax)
  {
   u64 clc = (u64) latch  evt-shift;
 + u64 rnd = (u64) evt-mult - 1;
  
   if (unlikely(!evt-mult)) {
   evt-mult = 1;
   WARN_ON(1);
   }
I suggest to move the assignment to rnd below this if block as it
changes mult.

  
 + /*
 +  * Upper bound sanity check. If the backwards conversion is
 +  * not equal latch, we know that the above shift overflowed.
 +  */
 + if (clc  evt-shift) != (u64)latch)
You didn't compile test, did you? Also the cast on the rhs isn't needed.

 + clc = ~0ULL;
 +
 + /*
 +  * Scaled math oddities:
 +  *
 +  * For mult = (1  shift) we can safely add mult - 1 to
 +  * prevent integer rounding loss. So the backwards conversion
It doesn't prevent inexactness to add mult - 1. It (only) asserts that
the ns2delta(delta2ns(latch)) = latch instead of ... = latch when not
doing it.

 +  * from nsec to device ticks will be correct.
 +  *
 +  * For mult  (1  shift), i.e. device frequency is  1GHz we
 +  * need to be careful. Adding mult - 1 will result in a value
 +  * which when converted back to device ticks will be larger
s/will/can/

 +  * than latch by (mult / (1  shift)) - 1. For the min_delta
s/by/by up to/

 +  * calculation we still want to apply this in order to stay
 +  * above the minimum device ticks limit. For the upper limit
 +  * we would end up with a latch value larger than the upper
 +  * limit of the device, so we omit the add to stay below the
 +  * device upper boundary.
 +  *
 +  * Also omit the add if it 

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Thomas Gleixner
On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
> On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote:
> > On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
> > > Another doubt I have is: You changed clockevent_delta2ns to round up now
> > > unconditionally. For the numbers on at91 that doesn't matter, but I
> > > wonder if there are situations that make the timer core violate the
> > > max_delta_ticks condition now.
> > 
> > And how so? The + (mult - 1) ensures, that the conversion back to
> > ticks results in the same value as latch. So how should it violate
> > the max boundary?
> That is wrong:
> With max_delta_ticks << shift = n * mult - k for k in [0 .. mult-1] and
> an integer n:
> 
> (max_delta_ns * mult) >> shift
>   = max_delta_ticks << shift) + mult - 1) / mult) * mult) >> shift
>   = (((n * mult - k + mult - 1) / mult) * mult) >> shift
>   = n * mult >> shift
>   = ((max_delta_ticks << shift) + k) >> shift
>   = max_delta_ticks + (k >> shift)
> 
> k >> shift is only known to be zero if mult <= 1 << shift (i.e. the same
> condition as above where your 64bit overflow detection is wrong). So
> this can result in values > max_delta_ticks.

Right, should have waited for coffee actually activating the right
brain cells.

So the rounding thing works nicely for mult <= (1 << sft). For the
other cases the conversion is wrong by (mult / (1 << sft)) - 1. So for
a 3 GHz clock it's off by 2. That's not an issue for the min value,
but for the max value it might overflow the hardware limit. Not a real
functional issue as we'd just get a pointless short interrupt, but for
correctness reasons and for the sake of NOHZ we need to fix that.

There is no real correct solution for these cases, which will result
in a correct backwards conversion. We could validate against the max
delta ticks in program_event, but that adds another boundary check to
the hotpath so I rather want to avoid it.

The simple solution is to check for the following in the conversion
routine:

if (latch == dev->min_delta_ticks || dev->mult <= (1 << dev->sft))
   clc += mult - 1

That ensures, that we never violate the lower boundary independent of
the mult/sft relation. For the max_delta_ticks conversion in the "mult
> (1 << sft)" case we end up with a slightly smaller value, but well
inside the boundaries and close enough to the hardware limitations.

Versus the 64bit overflow check, we need to be even more careful. We
need to check for overflowing (1 << 63) - 1 (i.e. the max positive
value which fits into a s64). See clockevents_program_event().

So this check needs to be:

   u64 clc = (u64)latch << sft;

   if ((s64)clc < 0 || (clc >> sft ) != latch)
  clc = (1 << 63) - 1;

   And the above rounding magic, which has to be applied after this
   needs to take the following condition into account as well:

(1 << 63) - 1 - clc >= mult - 1

If that's not true, we can't do the rounding add, but we know that
we are well at the upper limit of the hardware, so no issue
either.

> > > > for boundary violation and can limit "clc" to (1 << 63) - 1 before the
> > > Where does this magic constant come from?
> > 
> > Rolling my magic hex dice gave me that.
> Wow, how many sides does your dice have? Couldn't it have choosen
> (u64)-1 for improved results?

No it's restricted to s64 positiv values due to software
limitations. See above.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Uwe Kleine-König
Hello Thomas,

On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote:
> On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
> > > Now we can easily verify whether the whole equation fits into the
> > > 64bit boundary. Shifting the "clc" result back by evt->shift MUST
> > > result in "latch". If that's not the case, we have a clear indicator
> > But this is only the case if evt->mult is <= (1 << evt->shift). Is this
> > always given?
> 
> Crap, no. It's only true for device frequency <= 1GHz. Good catch!
> 
> > Is it more sensible to adjust dev->max_delta_ns once at register time
> > and so save the often recurrent overflow check in
> > clockevents_program_event?
> 
> Which overflow check are you talking about?
> 
> There is only the boundary check:
> 
> delta = min(delta, (int64_t) dev->max_delta_ns);
> delta = max(delta, (int64_t) dev->min_delta_ns);
> 
> Which sensible adjustment at register time is going to remove that?
My idea was that wouldn't need to add

if ((clc >> evt->shift) != (u64)latch)
...

to clockevent_delta2ns (not clockevents_program_event as I wrote) if
dev->max_delta_ns was small enough. So max_delta_ns would be the minimum
of the hardware limit and the value to prevent an overflow. Not sure any
more that this works though.

> > Another doubt I have is: You changed clockevent_delta2ns to round up now
> > unconditionally. For the numbers on at91 that doesn't matter, but I
> > wonder if there are situations that make the timer core violate the
> > max_delta_ticks condition now.
> 
> And how so? The + (mult - 1) ensures, that the conversion back to
> ticks results in the same value as latch. So how should it violate
> the max boundary?
That is wrong:
With max_delta_ticks << shift = n * mult - k for k in [0 .. mult-1] and
an integer n:

  (max_delta_ns * mult) >> shift
= max_delta_ticks << shift) + mult - 1) / mult) * mult) >> shift
= (((n * mult - k + mult - 1) / mult) * mult) >> shift
= n * mult >> shift
= ((max_delta_ticks << shift) + k) >> shift
= max_delta_ticks + (k >> shift)

k >> shift is only known to be zero if mult <= 1 << shift (i.e. the same
condition as above where your 64bit overflow detection is wrong). So
this can result in values > max_delta_ticks.

> Math is hard, right?
Yes, if it involves integer division and overflow handling it's hard to
come up with correct solutions during shopping. ;-)
 
> > > for boundary violation and can limit "clc" to (1 << 63) - 1 before the
> > Where does this magic constant come from?
> 
> Rolling my magic hex dice gave me that.
Wow, how many sides does your dice have? Couldn't it have choosen
(u64)-1 for improved results?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Thomas Gleixner
On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
> > Now we can easily verify whether the whole equation fits into the
> > 64bit boundary. Shifting the "clc" result back by evt->shift MUST
> > result in "latch". If that's not the case, we have a clear indicator
> But this is only the case if evt->mult is <= (1 << evt->shift). Is this
> always given?

Crap, no. It's only true for device frequency <= 1GHz. Good catch!

> Is it more sensible to adjust dev->max_delta_ns once at register time
> and so save the often recurrent overflow check in
> clockevents_program_event?

Which overflow check are you talking about?

There is only the boundary check:

delta = min(delta, (int64_t) dev->max_delta_ns);
delta = max(delta, (int64_t) dev->min_delta_ns);

Which sensible adjustment at register time is going to remove that?

> Another doubt I have is: You changed clockevent_delta2ns to round up now
> unconditionally. For the numbers on at91 that doesn't matter, but I
> wonder if there are situations that make the timer core violate the
> max_delta_ticks condition now.

And how so? The + (mult - 1) ensures, that the conversion back to
ticks results in the same value as latch. So how should it violate
the max boundary?

Math is hard, right?
 
> > for boundary violation and can limit "clc" to (1 << 63) - 1 before the
> Where does this magic constant come from?

Rolling my magic hex dice gave me that.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Uwe Kleine-König
Hello Thomas,

On Tue, Sep 17, 2013 at 11:15:20PM +0200, Thomas Gleixner wrote:
> Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
> clockevents_config_and_register() where possible" caused a regression
> for some of the converted subarchs.
> 
> The reason is, that the clockevents core code converts the minimal
> hardware tick delta to a nanosecond value for core internal
> usage. This conversion is affected by integer math rounding loss, so
> the backwards conversion to hardware ticks will likely result in a
> value which is less than the configured hardware limitation. The
> affected subarchs used their own workaround (SIGH!) which got lost in
> the conversion.
> 
> Now instead of fixing the underlying core code problem, Marcs patch
s/Marcs/Marc's/

> tried to work around the core code issue by increasing the minimal
> tick delta at clockevents registration time so the resulting limit in
> the core code backwards conversion did not violate the hardware
> limits. More SIGH!
> 
> The solution for the issue at hand is simple: adding evt->mult - 1 to
> the shifted value before the integer divison in the core conversion
> function takes care of it.
> 
> Though looking closer at the details of that function reveals another
> bogosity: The upper bounds check is broken as well. Checking for a
> resulting "clc" value greater than KTIME_MAX after the conversion is
> pointless. The conversion does:
> 
>   u64 clc = (latch << evt->shift) / evt->mult;
> 
> So there is no sanity check for (latch << evt->shift) exceeding the
> 64bit boundary. The latch argument is "unsigned long", so on a 64bit
> arch the handed in argument could easily lead to an unnoticed shift
> overflow. With the above rounding fix applied the calculation before
> the divison is:
> 
>u64 clc = (latch << evt->shift) + evt->mult - 1;
> 
> Now we can easily verify whether the whole equation fits into the
> 64bit boundary. Shifting the "clc" result back by evt->shift MUST
> result in "latch". If that's not the case, we have a clear indicator
But this is only the case if evt->mult is <= (1 << evt->shift). Is this
always given?
Is it more sensible to adjust dev->max_delta_ns once at register time
and so save the often recurrent overflow check in
clockevents_program_event?

Another doubt I have is: You changed clockevent_delta2ns to round up now
unconditionally. For the numbers on at91 that doesn't matter, but I
wonder if there are situations that make the timer core violate the
max_delta_ticks condition now.

> for boundary violation and can limit "clc" to (1 << 63) - 1 before the
Where does this magic constant come from?

Best regards
Uwe

> divison by evt->mult. The resulting nsec * evt->mult in the
> programming path will therefor always be in the 64bit boundary.
> 
> Signed-off-by: Thomas Gleixner 
> ---
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 38959c8..4fc4826 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct 
> clock_event_device *evt)
>   WARN_ON(1);
>   }
>  
> + /*
> +  * Prevent integer rounding loss, otherwise the backward
> +  * conversion from nsec to ticks could result in a value less
> +  * than evt->min_delta_ticks.
> +  */
> + clc += evt->mult - 1;
> +
> + /*
> +  * Upper bound sanity check. If the backwards conversion is
> +  * not equal latch, we know that the above (shift + rounding
> +  * correction) exceeded the 64 bit boundary.
> +  */
> + if ((clc >> evt->shift) != (u64)latch)
> + clc = ((u64)1 << 63) - 1;
> +
>   do_div(clc, evt->mult);
> - if (clc < 1000)
> - clc = 1000;
> - if (clc > KTIME_MAX)
> - clc = KTIME_MAX;
>  
> - return clc;
> + /* Deltas less than 1usec are pointless noise */
> + return clc > 1000 ? clc : 1000;
>  }
>  EXPORT_SYMBOL_GPL(clockevent_delta2ns);
> 

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Uwe Kleine-König
Hello Thomas,

On Tue, Sep 17, 2013 at 11:15:20PM +0200, Thomas Gleixner wrote:
 Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use
 clockevents_config_and_register() where possible caused a regression
 for some of the converted subarchs.
 
 The reason is, that the clockevents core code converts the minimal
 hardware tick delta to a nanosecond value for core internal
 usage. This conversion is affected by integer math rounding loss, so
 the backwards conversion to hardware ticks will likely result in a
 value which is less than the configured hardware limitation. The
 affected subarchs used their own workaround (SIGH!) which got lost in
 the conversion.
 
 Now instead of fixing the underlying core code problem, Marcs patch
s/Marcs/Marc's/

 tried to work around the core code issue by increasing the minimal
 tick delta at clockevents registration time so the resulting limit in
 the core code backwards conversion did not violate the hardware
 limits. More SIGH!
 
 The solution for the issue at hand is simple: adding evt-mult - 1 to
 the shifted value before the integer divison in the core conversion
 function takes care of it.
 
 Though looking closer at the details of that function reveals another
 bogosity: The upper bounds check is broken as well. Checking for a
 resulting clc value greater than KTIME_MAX after the conversion is
 pointless. The conversion does:
 
   u64 clc = (latch  evt-shift) / evt-mult;
 
 So there is no sanity check for (latch  evt-shift) exceeding the
 64bit boundary. The latch argument is unsigned long, so on a 64bit
 arch the handed in argument could easily lead to an unnoticed shift
 overflow. With the above rounding fix applied the calculation before
 the divison is:
 
u64 clc = (latch  evt-shift) + evt-mult - 1;
 
 Now we can easily verify whether the whole equation fits into the
 64bit boundary. Shifting the clc result back by evt-shift MUST
 result in latch. If that's not the case, we have a clear indicator
But this is only the case if evt-mult is = (1  evt-shift). Is this
always given?
Is it more sensible to adjust dev-max_delta_ns once at register time
and so save the often recurrent overflow check in
clockevents_program_event?

Another doubt I have is: You changed clockevent_delta2ns to round up now
unconditionally. For the numbers on at91 that doesn't matter, but I
wonder if there are situations that make the timer core violate the
max_delta_ticks condition now.

 for boundary violation and can limit clc to (1  63) - 1 before the
Where does this magic constant come from?

Best regards
Uwe

 divison by evt-mult. The resulting nsec * evt-mult in the
 programming path will therefor always be in the 64bit boundary.
 
 Signed-off-by: Thomas Gleixner t...@linutronix.de
 ---
 diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
 index 38959c8..4fc4826 100644
 --- a/kernel/time/clockevents.c
 +++ b/kernel/time/clockevents.c
 @@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct 
 clock_event_device *evt)
   WARN_ON(1);
   }
  
 + /*
 +  * Prevent integer rounding loss, otherwise the backward
 +  * conversion from nsec to ticks could result in a value less
 +  * than evt-min_delta_ticks.
 +  */
 + clc += evt-mult - 1;
 +
 + /*
 +  * Upper bound sanity check. If the backwards conversion is
 +  * not equal latch, we know that the above (shift + rounding
 +  * correction) exceeded the 64 bit boundary.
 +  */
 + if ((clc  evt-shift) != (u64)latch)
 + clc = ((u64)1  63) - 1;
 +
   do_div(clc, evt-mult);
 - if (clc  1000)
 - clc = 1000;
 - if (clc  KTIME_MAX)
 - clc = KTIME_MAX;
  
 - return clc;
 + /* Deltas less than 1usec are pointless noise */
 + return clc  1000 ? clc : 1000;
  }
  EXPORT_SYMBOL_GPL(clockevent_delta2ns);
 

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Thomas Gleixner
On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
  Now we can easily verify whether the whole equation fits into the
  64bit boundary. Shifting the clc result back by evt-shift MUST
  result in latch. If that's not the case, we have a clear indicator
 But this is only the case if evt-mult is = (1  evt-shift). Is this
 always given?

Crap, no. It's only true for device frequency = 1GHz. Good catch!

 Is it more sensible to adjust dev-max_delta_ns once at register time
 and so save the often recurrent overflow check in
 clockevents_program_event?

Which overflow check are you talking about?

There is only the boundary check:

delta = min(delta, (int64_t) dev-max_delta_ns);
delta = max(delta, (int64_t) dev-min_delta_ns);

Which sensible adjustment at register time is going to remove that?

 Another doubt I have is: You changed clockevent_delta2ns to round up now
 unconditionally. For the numbers on at91 that doesn't matter, but I
 wonder if there are situations that make the timer core violate the
 max_delta_ticks condition now.

And how so? The + (mult - 1) ensures, that the conversion back to
ticks results in the same value as latch. So how should it violate
the max boundary?

Math is hard, right?
 
  for boundary violation and can limit clc to (1  63) - 1 before the
 Where does this magic constant come from?

Rolling my magic hex dice gave me that.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Uwe Kleine-König
Hello Thomas,

On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote:
 On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
   Now we can easily verify whether the whole equation fits into the
   64bit boundary. Shifting the clc result back by evt-shift MUST
   result in latch. If that's not the case, we have a clear indicator
  But this is only the case if evt-mult is = (1  evt-shift). Is this
  always given?
 
 Crap, no. It's only true for device frequency = 1GHz. Good catch!
 
  Is it more sensible to adjust dev-max_delta_ns once at register time
  and so save the often recurrent overflow check in
  clockevents_program_event?
 
 Which overflow check are you talking about?
 
 There is only the boundary check:
 
 delta = min(delta, (int64_t) dev-max_delta_ns);
 delta = max(delta, (int64_t) dev-min_delta_ns);
 
 Which sensible adjustment at register time is going to remove that?
My idea was that wouldn't need to add

if ((clc  evt-shift) != (u64)latch)
...

to clockevent_delta2ns (not clockevents_program_event as I wrote) if
dev-max_delta_ns was small enough. So max_delta_ns would be the minimum
of the hardware limit and the value to prevent an overflow. Not sure any
more that this works though.

  Another doubt I have is: You changed clockevent_delta2ns to round up now
  unconditionally. For the numbers on at91 that doesn't matter, but I
  wonder if there are situations that make the timer core violate the
  max_delta_ticks condition now.
 
 And how so? The + (mult - 1) ensures, that the conversion back to
 ticks results in the same value as latch. So how should it violate
 the max boundary?
That is wrong:
With max_delta_ticks  shift = n * mult - k for k in [0 .. mult-1] and
an integer n:

  (max_delta_ns * mult)  shift
= max_delta_ticks  shift) + mult - 1) / mult) * mult)  shift
= (((n * mult - k + mult - 1) / mult) * mult)  shift
= n * mult  shift
= ((max_delta_ticks  shift) + k)  shift
= max_delta_ticks + (k  shift)

k  shift is only known to be zero if mult = 1  shift (i.e. the same
condition as above where your 64bit overflow detection is wrong). So
this can result in values  max_delta_ticks.

 Math is hard, right?
Yes, if it involves integer division and overflow handling it's hard to
come up with correct solutions during shopping. ;-)
 
   for boundary violation and can limit clc to (1  63) - 1 before the
  Where does this magic constant come from?
 
 Rolling my magic hex dice gave me that.
Wow, how many sides does your dice have? Couldn't it have choosen
(u64)-1 for improved results?

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-18 Thread Thomas Gleixner
On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
 On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote:
  On Wed, 18 Sep 2013, Uwe Kleine-König wrote:
   Another doubt I have is: You changed clockevent_delta2ns to round up now
   unconditionally. For the numbers on at91 that doesn't matter, but I
   wonder if there are situations that make the timer core violate the
   max_delta_ticks condition now.
  
  And how so? The + (mult - 1) ensures, that the conversion back to
  ticks results in the same value as latch. So how should it violate
  the max boundary?
 That is wrong:
 With max_delta_ticks  shift = n * mult - k for k in [0 .. mult-1] and
 an integer n:
 
 (max_delta_ns * mult)  shift
   = max_delta_ticks  shift) + mult - 1) / mult) * mult)  shift
   = (((n * mult - k + mult - 1) / mult) * mult)  shift
   = n * mult  shift
   = ((max_delta_ticks  shift) + k)  shift
   = max_delta_ticks + (k  shift)
 
 k  shift is only known to be zero if mult = 1  shift (i.e. the same
 condition as above where your 64bit overflow detection is wrong). So
 this can result in values  max_delta_ticks.

Right, should have waited for coffee actually activating the right
brain cells.

So the rounding thing works nicely for mult = (1  sft). For the
other cases the conversion is wrong by (mult / (1  sft)) - 1. So for
a 3 GHz clock it's off by 2. That's not an issue for the min value,
but for the max value it might overflow the hardware limit. Not a real
functional issue as we'd just get a pointless short interrupt, but for
correctness reasons and for the sake of NOHZ we need to fix that.

There is no real correct solution for these cases, which will result
in a correct backwards conversion. We could validate against the max
delta ticks in program_event, but that adds another boundary check to
the hotpath so I rather want to avoid it.

The simple solution is to check for the following in the conversion
routine:

if (latch == dev-min_delta_ticks || dev-mult = (1  dev-sft))
   clc += mult - 1

That ensures, that we never violate the lower boundary independent of
the mult/sft relation. For the max_delta_ticks conversion in the mult
 (1  sft) case we end up with a slightly smaller value, but well
inside the boundaries and close enough to the hardware limitations.

Versus the 64bit overflow check, we need to be even more careful. We
need to check for overflowing (1  63) - 1 (i.e. the max positive
value which fits into a s64). See clockevents_program_event().

So this check needs to be:

   u64 clc = (u64)latch  sft;

   if ((s64)clc  0 || (clc  sft ) != latch)
  clc = (1  63) - 1;

   And the above rounding magic, which has to be applied after this
   needs to take the following condition into account as well:

(1  63) - 1 - clc = mult - 1

If that's not true, we can't do the rounding add, but we know that
we are well at the upper limit of the hardware, so no issue
either.

for boundary violation and can limit clc to (1  63) - 1 before the
   Where does this magic constant come from?
  
  Rolling my magic hex dice gave me that.
 Wow, how many sides does your dice have? Couldn't it have choosen
 (u64)-1 for improved results?

No it's restricted to s64 positiv values due to software
limitations. See above.

Thanks,

tglx

Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-17 Thread Thomas Gleixner
On Wed, 18 Sep 2013, Marc Kleine-Budde wrote:
> On 09/17/2013 11:15 PM, Thomas Gleixner wrote:
> > Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
> > clockevents_config_and_register() where possible" caused a regression
> > for some of the converted subarchs.
> > 
> > The reason is, that the clockevents core code converts the minimal
> > hardware tick delta to a nanosecond value for core internal
> > usage. This conversion is affected by integer math rounding loss, so
> > the backwards conversion to hardware ticks will likely result in a
> > value which is less than the configured hardware limitation. The
> > affected subarchs used their own workaround (SIGH!) which got lost in
> > the conversion.
> > 
> > Now instead of fixing the underlying core code problem, Marcs patch
> > tried to work around the core code issue by increasing the minimal
> > tick delta at clockevents registration time so the resulting limit in
> > the core code backwards conversion did not violate the hardware
> > limits. More SIGH!
> 
> It was not easy getting your attention with this problem :)

A really hard to understand and solve problem! Obviously you are a
great fan of John Stultz famous "Math is hard, lets go shopping!"
line.

Why on earth do you need my attention to fix at least the simple
rounding issue in the core code?

You could have sent the "clk += evt->mult - 1:" patch directly to
Linus without even cc'ing me.

It's not rocket science, really. When I responded with the while(1)
loop patch today in the morning it was just because I did not have a
second to think about the correct fix for this, but the reason WHY
this happened was entirely clear in a split second after reading
Russells reply.

The upper bounds issue tooks a few minutes more, but it's not rocket
science either.

Sorry, I can understand the need for a second opinion on a weird race
condition problem, but tracking down an obvious simple math issue to
the point and fixing it... SIGH!

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-17 Thread Marc Kleine-Budde
On 09/17/2013 11:15 PM, Thomas Gleixner wrote:
> Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
> clockevents_config_and_register() where possible" caused a regression
> for some of the converted subarchs.
> 
> The reason is, that the clockevents core code converts the minimal
> hardware tick delta to a nanosecond value for core internal
> usage. This conversion is affected by integer math rounding loss, so
> the backwards conversion to hardware ticks will likely result in a
> value which is less than the configured hardware limitation. The
> affected subarchs used their own workaround (SIGH!) which got lost in
> the conversion.
> 
> Now instead of fixing the underlying core code problem, Marcs patch
> tried to work around the core code issue by increasing the minimal
> tick delta at clockevents registration time so the resulting limit in
> the core code backwards conversion did not violate the hardware
> limits. More SIGH!

It was not easy getting your attention with this problem :)

Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



signature.asc
Description: OpenPGP digital signature


[PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-17 Thread Thomas Gleixner
Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use
clockevents_config_and_register() where possible" caused a regression
for some of the converted subarchs.

The reason is, that the clockevents core code converts the minimal
hardware tick delta to a nanosecond value for core internal
usage. This conversion is affected by integer math rounding loss, so
the backwards conversion to hardware ticks will likely result in a
value which is less than the configured hardware limitation. The
affected subarchs used their own workaround (SIGH!) which got lost in
the conversion.

Now instead of fixing the underlying core code problem, Marcs patch
tried to work around the core code issue by increasing the minimal
tick delta at clockevents registration time so the resulting limit in
the core code backwards conversion did not violate the hardware
limits. More SIGH!

The solution for the issue at hand is simple: adding evt->mult - 1 to
the shifted value before the integer divison in the core conversion
function takes care of it.

Though looking closer at the details of that function reveals another
bogosity: The upper bounds check is broken as well. Checking for a
resulting "clc" value greater than KTIME_MAX after the conversion is
pointless. The conversion does:

  u64 clc = (latch << evt->shift) / evt->mult;

So there is no sanity check for (latch << evt->shift) exceeding the
64bit boundary. The latch argument is "unsigned long", so on a 64bit
arch the handed in argument could easily lead to an unnoticed shift
overflow. With the above rounding fix applied the calculation before
the divison is:

   u64 clc = (latch << evt->shift) + evt->mult - 1;

Now we can easily verify whether the whole equation fits into the
64bit boundary. Shifting the "clc" result back by evt->shift MUST
result in "latch". If that's not the case, we have a clear indicator
for boundary violation and can limit "clc" to (1 << 63) - 1 before the
divison by evt->mult. The resulting nsec * evt->mult in the
programming path will therefor always be in the 64bit boundary.

Signed-off-by: Thomas Gleixner 
---
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 38959c8..4fc4826 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct 
clock_event_device *evt)
WARN_ON(1);
}
 
+   /*
+* Prevent integer rounding loss, otherwise the backward
+* conversion from nsec to ticks could result in a value less
+* than evt->min_delta_ticks.
+*/
+   clc += evt->mult - 1;
+
+   /*
+* Upper bound sanity check. If the backwards conversion is
+* not equal latch, we know that the above (shift + rounding
+* correction) exceeded the 64 bit boundary.
+*/
+   if ((clc >> evt->shift) != (u64)latch)
+   clc = ((u64)1 << 63) - 1;
+
do_div(clc, evt->mult);
-   if (clc < 1000)
-   clc = 1000;
-   if (clc > KTIME_MAX)
-   clc = KTIME_MAX;
 
-   return clc;
+   /* Deltas less than 1usec are pointless noise */
+   return clc > 1000 ? clc : 1000;
 }
 EXPORT_SYMBOL_GPL(clockevent_delta2ns);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-17 Thread Thomas Gleixner
Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use
clockevents_config_and_register() where possible caused a regression
for some of the converted subarchs.

The reason is, that the clockevents core code converts the minimal
hardware tick delta to a nanosecond value for core internal
usage. This conversion is affected by integer math rounding loss, so
the backwards conversion to hardware ticks will likely result in a
value which is less than the configured hardware limitation. The
affected subarchs used their own workaround (SIGH!) which got lost in
the conversion.

Now instead of fixing the underlying core code problem, Marcs patch
tried to work around the core code issue by increasing the minimal
tick delta at clockevents registration time so the resulting limit in
the core code backwards conversion did not violate the hardware
limits. More SIGH!

The solution for the issue at hand is simple: adding evt-mult - 1 to
the shifted value before the integer divison in the core conversion
function takes care of it.

Though looking closer at the details of that function reveals another
bogosity: The upper bounds check is broken as well. Checking for a
resulting clc value greater than KTIME_MAX after the conversion is
pointless. The conversion does:

  u64 clc = (latch  evt-shift) / evt-mult;

So there is no sanity check for (latch  evt-shift) exceeding the
64bit boundary. The latch argument is unsigned long, so on a 64bit
arch the handed in argument could easily lead to an unnoticed shift
overflow. With the above rounding fix applied the calculation before
the divison is:

   u64 clc = (latch  evt-shift) + evt-mult - 1;

Now we can easily verify whether the whole equation fits into the
64bit boundary. Shifting the clc result back by evt-shift MUST
result in latch. If that's not the case, we have a clear indicator
for boundary violation and can limit clc to (1  63) - 1 before the
divison by evt-mult. The resulting nsec * evt-mult in the
programming path will therefor always be in the 64bit boundary.

Signed-off-by: Thomas Gleixner t...@linutronix.de
---
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 38959c8..4fc4826 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct 
clock_event_device *evt)
WARN_ON(1);
}
 
+   /*
+* Prevent integer rounding loss, otherwise the backward
+* conversion from nsec to ticks could result in a value less
+* than evt-min_delta_ticks.
+*/
+   clc += evt-mult - 1;
+
+   /*
+* Upper bound sanity check. If the backwards conversion is
+* not equal latch, we know that the above (shift + rounding
+* correction) exceeded the 64 bit boundary.
+*/
+   if ((clc  evt-shift) != (u64)latch)
+   clc = ((u64)1  63) - 1;
+
do_div(clc, evt-mult);
-   if (clc  1000)
-   clc = 1000;
-   if (clc  KTIME_MAX)
-   clc = KTIME_MAX;
 
-   return clc;
+   /* Deltas less than 1usec are pointless noise */
+   return clc  1000 ? clc : 1000;
 }
 EXPORT_SYMBOL_GPL(clockevent_delta2ns);
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-17 Thread Marc Kleine-Budde
On 09/17/2013 11:15 PM, Thomas Gleixner wrote:
 Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use
 clockevents_config_and_register() where possible caused a regression
 for some of the converted subarchs.
 
 The reason is, that the clockevents core code converts the minimal
 hardware tick delta to a nanosecond value for core internal
 usage. This conversion is affected by integer math rounding loss, so
 the backwards conversion to hardware ticks will likely result in a
 value which is less than the configured hardware limitation. The
 affected subarchs used their own workaround (SIGH!) which got lost in
 the conversion.
 
 Now instead of fixing the underlying core code problem, Marcs patch
 tried to work around the core code issue by increasing the minimal
 tick delta at clockevents registration time so the resulting limit in
 the core code backwards conversion did not violate the hardware
 limits. More SIGH!

It was not easy getting your attention with this problem :)

Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] clockevents: Sanitize ticks to nsec conversion

2013-09-17 Thread Thomas Gleixner
On Wed, 18 Sep 2013, Marc Kleine-Budde wrote:
 On 09/17/2013 11:15 PM, Thomas Gleixner wrote:
  Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use
  clockevents_config_and_register() where possible caused a regression
  for some of the converted subarchs.
  
  The reason is, that the clockevents core code converts the minimal
  hardware tick delta to a nanosecond value for core internal
  usage. This conversion is affected by integer math rounding loss, so
  the backwards conversion to hardware ticks will likely result in a
  value which is less than the configured hardware limitation. The
  affected subarchs used their own workaround (SIGH!) which got lost in
  the conversion.
  
  Now instead of fixing the underlying core code problem, Marcs patch
  tried to work around the core code issue by increasing the minimal
  tick delta at clockevents registration time so the resulting limit in
  the core code backwards conversion did not violate the hardware
  limits. More SIGH!
 
 It was not easy getting your attention with this problem :)

A really hard to understand and solve problem! Obviously you are a
great fan of John Stultz famous Math is hard, lets go shopping!
line.

Why on earth do you need my attention to fix at least the simple
rounding issue in the core code?

You could have sent the clk += evt-mult - 1: patch directly to
Linus without even cc'ing me.

It's not rocket science, really. When I responded with the while(1)
loop patch today in the morning it was just because I did not have a
second to think about the correct fix for this, but the reason WHY
this happened was entirely clear in a split second after reading
Russells reply.

The upper bounds issue tooks a few minutes more, but it's not rocket
science either.

Sorry, I can understand the need for a second opinion on a weird race
condition problem, but tracking down an obvious simple math issue to
the point and fixing it... SIGH!

Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/