Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Fri, 20 Sep 2013, Uwe Kleine-König wrote: > On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote: > > On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > > > > +* from nsec to device ticks will be correct. > > > > +* > > > > +* For mult > (1 << shift), i.e. device frequency is > 1GHz we > > > > +* need to be careful. Adding mult - 1 will result in a value > > > > +* which when converted back to device ticks will be larger > > > s/will/can/ > > > > No, it will always be larger. > Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then > ns2clc(clc2ns(1000)) = 1000. So it's not always larger! > In the fast-clock-case we have: > With x << shift = n * mult - k for k in [0 .. mult-1] and an integer n: > > ns2clc(clc2ns(x)) > = ns2clc(((x << shift) + mult - 1) / mult) > = x << shift) + mult - 1) / mult) * mult) >> shift > = n * mult >> shift > = ((x << shift) + k) >> shift > = x + (k >> shift) > > So ns2clc(clc2ns(x)) = x for all x > 0 that have > > k = mult - ((x << shift) - 1) % mult - 1 < 1 << shift > > So my correction still stands. Fair enough. > > 1) We cannot add if we'd overflow > > > > 2) For mult <= 1 << shift it's always correct > > > > 3) for mult > 1 << shift we only apply it to the min value not the max > > Yeah, I didn't say your code is wrong *here*. I just think that my > easier (and so probably faster) code is good enough. Granted. I was stuck in the correctness discussion. So yeah, it does not matter if we steal 30 usec of maximum idle sleep time from a 32kHz clock. OTOH it does not matter much in the setup slow path to take another conditional. :) > Best regards and thanks for the nice discussion, Ditto! You saved me from actually sitting down and using the pencil to do the proper math. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hi Thomas, On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote: > On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > > > + * For mult <= (1 << shift) we can safely add mult - 1 to > > > + * prevent integer rounding loss. So the backwards conversion > > It doesn't prevent inexactness to add mult - 1. It (only) asserts that > > the ns2delta(delta2ns(latch)) >= latch instead of ... <= latch when not > > doing it. > > For mult <= 1 << shift the conversion is always ending up with the > same latch value. Ah right, I missed that we're in the slow-clock-case. > > > + * from nsec to device ticks will be correct. > > > + * > > > + * For mult > (1 << shift), i.e. device frequency is > 1GHz we > > > + * need to be careful. Adding mult - 1 will result in a value > > > + * which when converted back to device ticks will be larger > > s/will/can/ > > No, it will always be larger. Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then ns2clc(clc2ns(1000)) = 1000. So it's not always larger! In the fast-clock-case we have: With x << shift = n * mult - k for k in [0 .. mult-1] and an integer n: ns2clc(clc2ns(x)) = ns2clc(((x << shift) + mult - 1) / mult) = x << shift) + mult - 1) / mult) * mult) >> shift = n * mult >> shift = ((x << shift) + k) >> shift = x + (k >> shift) So ns2clc(clc2ns(x)) = x for all x > 0 that have k = mult - ((x << shift) - 1) % mult - 1 < 1 << shift So my correction still stands. > > > + * than latch by (mult / (1 << shift)) - 1. For the min_delta > > s/by/by up to/ >From the calculation above you can also see that this term is wrong. k is smaller than mult (and there are values that realize k = mult - 1). So the converted back value can be larger than latch by up to (mult - 1) >> shift. This is zero for the slow-clock-case. In the 1.25 GHz example above that means that the difference is up to 1, not 0 as your term would imply. 1004 is an example where the conversion to nano seconds and back to ticks results in a difference of 1. > > > + * calculation we still want to apply this in order to stay > > > + * above the minimum device ticks limit. For the upper limit > > > + * we would end up with a latch value larger than the upper > > > + * limit of the device, so we omit the add to stay below the > > > + * device upper boundary. > > > + * > > > + * Also omit the add if it would overflow the u64 boundary. > > > + */ > > > + if ((~0ULL - clc > rnd) && > > > + (!ismax || evt->mult <= (1U << evt->shift))) > > > + clc += rnd; > > I would expect that > > > > if (!ismax) > > if (~0ULL - clc > rnd) > > clc += rnd; > > else > > clc = ~0ULL; > > > > is enough (and a tad more exact in the presence of an overflow). I have > > to think about that though. > > Errm. > > 1) We cannot add if we'd overflow > > 2) For mult <= 1 << shift it's always correct > > 3) for mult > 1 << shift we only apply it to the min value not the max Yeah, I didn't say your code is wrong *here*. I just think that my easier (and so probably faster) code is good enough. > > > clockevents_calc_mult_shift(dev, freq, sec); > > > - dev->min_delta_ns = clockevent_delta2ns(dev->min_delta_ticks, dev); > > > - dev->max_delta_ns = clockevent_delta2ns(dev->max_delta_ticks, dev); > > > + dev->min_delta_ns = cev_delta2ns(dev->min_delta_ticks, dev, false); > > > + dev->max_delta_ns = cev_delta2ns(dev->max_delta_ticks, dev, true); > > Another improvement that came to my mind just now. For min_delta_ns you > > want to assert that it results in a value >= min_delta_ticks when > > converted back. For max_delta_ns you want ... value <= max_delta_ticks. > > What about the values in between? They for sure should land in > > [min_delta_ticks ... max_delta_ticks] when converted back and ideally > > should be most exact. The latter part would mean to add (rnd / 2) > > instead of rnd. I don't know yet how that would behave at the borders of > > the [min_delta_ns ... max_delta_ns] interval, but I think you still need > > to special-case that. > > Again: > > 1) For mult <= 1 << shift the backwards conversion is always the same as >the input value. > > 2) For mult > 1 << shift the backwards conversion of the min value is >always > than the input value. And the backwards conversion of the >max value is always < than the input value. > > The values between that are completely uninteresting as the > program_events code always converts from nsec to device ticks. > > We clamp the delta between min_ns and max_ns. So due to the above any > >min_ns <= delta <= max_ns > > will after conversion fulfil > >min_tick <= delta_tick <= max_tick > > So what are you going to improve? Either the math works or it does not. Right, my idea is nice, but useless. So I suggest you resend your patch with the compile fix and the corrected comment and I will think about my
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > > + u64 rnd = (u64) evt->mult - 1; > > > > if (unlikely(!evt->mult)) { > > evt->mult = 1; > > WARN_ON(1); > > } > I suggest to move the assignment to rnd below this if block as it > changes mult. True. > > + /* > + * Upper bound sanity check. If the backwards conversion is > + * not equal latch, we know that the above shift overflowed. > + */ > + if (clc >> evt->shift) != (u64)latch) You didn't compile test, did you? Also the cast on the rhs isn't needed. I did. I just missed to refresh the patch before sending it :) > > +* For mult <= (1 << shift) we can safely add mult - 1 to > > +* prevent integer rounding loss. So the backwards conversion > It doesn't prevent inexactness to add mult - 1. It (only) asserts that > the ns2delta(delta2ns(latch)) >= latch instead of ... <= latch when not > doing it. For mult <= 1 << shift the conversion is always ending up with the same latch value. > > +* from nsec to device ticks will be correct. > > +* > > +* For mult > (1 << shift), i.e. device frequency is > 1GHz we > > +* need to be careful. Adding mult - 1 will result in a value > > +* which when converted back to device ticks will be larger > s/will/can/ No, it will always be larger. > > +* than latch by (mult / (1 << shift)) - 1. For the min_delta > s/by/by up to/ > > > +* calculation we still want to apply this in order to stay > > +* above the minimum device ticks limit. For the upper limit > > +* we would end up with a latch value larger than the upper > > +* limit of the device, so we omit the add to stay below the > > +* device upper boundary. > > +* > > +* Also omit the add if it would overflow the u64 boundary. > > +*/ > > + if ((~0ULL - clc > rnd) && > > + (!ismax || evt->mult <= (1U << evt->shift))) > > + clc += rnd; > I would expect that > > if (!ismax) > if (~0ULL - clc > rnd) > clc += rnd; > else > clc = ~0ULL; > > is enough (and a tad more exact in the presence of an overflow). I have > to think about that though. Errm. 1) We cannot add if we'd overflow 2) For mult <= 1 << shift it's always correct 3) for mult > 1 << shift we only apply it to the min value not the max > > clockevents_calc_mult_shift(dev, freq, sec); > > - dev->min_delta_ns = clockevent_delta2ns(dev->min_delta_ticks, dev); > > - dev->max_delta_ns = clockevent_delta2ns(dev->max_delta_ticks, dev); > > + dev->min_delta_ns = cev_delta2ns(dev->min_delta_ticks, dev, false); > > + dev->max_delta_ns = cev_delta2ns(dev->max_delta_ticks, dev, true); > Another improvement that came to my mind just now. For min_delta_ns you > want to assert that it results in a value >= min_delta_ticks when > converted back. For max_delta_ns you want ... value <= max_delta_ticks. > What about the values in between? They for sure should land in > [min_delta_ticks ... max_delta_ticks] when converted back and ideally > should be most exact. The latter part would mean to add (rnd / 2) > instead of rnd. I don't know yet how that would behave at the borders of > the [min_delta_ns ... max_delta_ns] interval, but I think you still need > to special-case that. Again: 1) For mult <= 1 << shift the backwards conversion is always the same as the input value. 2) For mult > 1 << shift the backwards conversion of the min value is always > than the input value. And the backwards conversion of the max value is always < than the input value. The values between that are completely uninteresting as the program_events code always converts from nsec to device ticks. We clamp the delta between min_ns and max_ns. So due to the above any min_ns <= delta <= max_ns will after conversion fulfil min_tick <= delta_tick <= max_tick So what are you going to improve? Either the math works or it does not. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Thu, 19 Sep 2013, Uwe Kleine-König wrote: + u64 rnd = (u64) evt-mult - 1; if (unlikely(!evt-mult)) { evt-mult = 1; WARN_ON(1); } I suggest to move the assignment to rnd below this if block as it changes mult. True. + /* + * Upper bound sanity check. If the backwards conversion is + * not equal latch, we know that the above shift overflowed. + */ + if (clc evt-shift) != (u64)latch) You didn't compile test, did you? Also the cast on the rhs isn't needed. I did. I just missed to refresh the patch before sending it :) +* For mult = (1 shift) we can safely add mult - 1 to +* prevent integer rounding loss. So the backwards conversion It doesn't prevent inexactness to add mult - 1. It (only) asserts that the ns2delta(delta2ns(latch)) = latch instead of ... = latch when not doing it. For mult = 1 shift the conversion is always ending up with the same latch value. +* from nsec to device ticks will be correct. +* +* For mult (1 shift), i.e. device frequency is 1GHz we +* need to be careful. Adding mult - 1 will result in a value +* which when converted back to device ticks will be larger s/will/can/ No, it will always be larger. +* than latch by (mult / (1 shift)) - 1. For the min_delta s/by/by up to/ +* calculation we still want to apply this in order to stay +* above the minimum device ticks limit. For the upper limit +* we would end up with a latch value larger than the upper +* limit of the device, so we omit the add to stay below the +* device upper boundary. +* +* Also omit the add if it would overflow the u64 boundary. +*/ + if ((~0ULL - clc rnd) + (!ismax || evt-mult = (1U evt-shift))) + clc += rnd; I would expect that if (!ismax) if (~0ULL - clc rnd) clc += rnd; else clc = ~0ULL; is enough (and a tad more exact in the presence of an overflow). I have to think about that though. Errm. 1) We cannot add if we'd overflow 2) For mult = 1 shift it's always correct 3) for mult 1 shift we only apply it to the min value not the max clockevents_calc_mult_shift(dev, freq, sec); - dev-min_delta_ns = clockevent_delta2ns(dev-min_delta_ticks, dev); - dev-max_delta_ns = clockevent_delta2ns(dev-max_delta_ticks, dev); + dev-min_delta_ns = cev_delta2ns(dev-min_delta_ticks, dev, false); + dev-max_delta_ns = cev_delta2ns(dev-max_delta_ticks, dev, true); Another improvement that came to my mind just now. For min_delta_ns you want to assert that it results in a value = min_delta_ticks when converted back. For max_delta_ns you want ... value = max_delta_ticks. What about the values in between? They for sure should land in [min_delta_ticks ... max_delta_ticks] when converted back and ideally should be most exact. The latter part would mean to add (rnd / 2) instead of rnd. I don't know yet how that would behave at the borders of the [min_delta_ns ... max_delta_ns] interval, but I think you still need to special-case that. Again: 1) For mult = 1 shift the backwards conversion is always the same as the input value. 2) For mult 1 shift the backwards conversion of the min value is always than the input value. And the backwards conversion of the max value is always than the input value. The values between that are completely uninteresting as the program_events code always converts from nsec to device ticks. We clamp the delta between min_ns and max_ns. So due to the above any min_ns = delta = max_ns will after conversion fulfil min_tick = delta_tick = max_tick So what are you going to improve? Either the math works or it does not. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hi Thomas, On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote: On Thu, 19 Sep 2013, Uwe Kleine-König wrote: + * For mult = (1 shift) we can safely add mult - 1 to + * prevent integer rounding loss. So the backwards conversion It doesn't prevent inexactness to add mult - 1. It (only) asserts that the ns2delta(delta2ns(latch)) = latch instead of ... = latch when not doing it. For mult = 1 shift the conversion is always ending up with the same latch value. Ah right, I missed that we're in the slow-clock-case. + * from nsec to device ticks will be correct. + * + * For mult (1 shift), i.e. device frequency is 1GHz we + * need to be careful. Adding mult - 1 will result in a value + * which when converted back to device ticks will be larger s/will/can/ No, it will always be larger. Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then ns2clc(clc2ns(1000)) = 1000. So it's not always larger! In the fast-clock-case we have: With x shift = n * mult - k for k in [0 .. mult-1] and an integer n: ns2clc(clc2ns(x)) = ns2clc(((x shift) + mult - 1) / mult) = x shift) + mult - 1) / mult) * mult) shift = n * mult shift = ((x shift) + k) shift = x + (k shift) So ns2clc(clc2ns(x)) = x for all x 0 that have k = mult - ((x shift) - 1) % mult - 1 1 shift So my correction still stands. + * than latch by (mult / (1 shift)) - 1. For the min_delta s/by/by up to/ From the calculation above you can also see that this term is wrong. k is smaller than mult (and there are values that realize k = mult - 1). So the converted back value can be larger than latch by up to (mult - 1) shift. This is zero for the slow-clock-case. In the 1.25 GHz example above that means that the difference is up to 1, not 0 as your term would imply. 1004 is an example where the conversion to nano seconds and back to ticks results in a difference of 1. + * calculation we still want to apply this in order to stay + * above the minimum device ticks limit. For the upper limit + * we would end up with a latch value larger than the upper + * limit of the device, so we omit the add to stay below the + * device upper boundary. + * + * Also omit the add if it would overflow the u64 boundary. + */ + if ((~0ULL - clc rnd) + (!ismax || evt-mult = (1U evt-shift))) + clc += rnd; I would expect that if (!ismax) if (~0ULL - clc rnd) clc += rnd; else clc = ~0ULL; is enough (and a tad more exact in the presence of an overflow). I have to think about that though. Errm. 1) We cannot add if we'd overflow 2) For mult = 1 shift it's always correct 3) for mult 1 shift we only apply it to the min value not the max Yeah, I didn't say your code is wrong *here*. I just think that my easier (and so probably faster) code is good enough. clockevents_calc_mult_shift(dev, freq, sec); - dev-min_delta_ns = clockevent_delta2ns(dev-min_delta_ticks, dev); - dev-max_delta_ns = clockevent_delta2ns(dev-max_delta_ticks, dev); + dev-min_delta_ns = cev_delta2ns(dev-min_delta_ticks, dev, false); + dev-max_delta_ns = cev_delta2ns(dev-max_delta_ticks, dev, true); Another improvement that came to my mind just now. For min_delta_ns you want to assert that it results in a value = min_delta_ticks when converted back. For max_delta_ns you want ... value = max_delta_ticks. What about the values in between? They for sure should land in [min_delta_ticks ... max_delta_ticks] when converted back and ideally should be most exact. The latter part would mean to add (rnd / 2) instead of rnd. I don't know yet how that would behave at the borders of the [min_delta_ns ... max_delta_ns] interval, but I think you still need to special-case that. Again: 1) For mult = 1 shift the backwards conversion is always the same as the input value. 2) For mult 1 shift the backwards conversion of the min value is always than the input value. And the backwards conversion of the max value is always than the input value. The values between that are completely uninteresting as the program_events code always converts from nsec to device ticks. We clamp the delta between min_ns and max_ns. So due to the above any min_ns = delta = max_ns will after conversion fulfil min_tick = delta_tick = max_tick So what are you going to improve? Either the math works or it does not. Right, my idea is nice, but useless. So I suggest you resend your patch with the compile fix and the corrected comment and I will think about my suggestion to simplify the if condition independently as it's only a small runtime improvent and so not important enough to stop the correctness issue your patch is fixing. Best regards and thanks for the nice discussion, Uwe --
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Fri, 20 Sep 2013, Uwe Kleine-König wrote: On Fri, Sep 20, 2013 at 11:56:27AM +0200, Thomas Gleixner wrote: On Thu, 19 Sep 2013, Uwe Kleine-König wrote: +* from nsec to device ticks will be correct. +* +* For mult (1 shift), i.e. device frequency is 1GHz we +* need to be careful. Adding mult - 1 will result in a value +* which when converted back to device ticks will be larger s/will/can/ No, it will always be larger. Hmm, consider a 1.25 GHz clock with shift = 2 and mult = 5. Then ns2clc(clc2ns(1000)) = 1000. So it's not always larger! In the fast-clock-case we have: With x shift = n * mult - k for k in [0 .. mult-1] and an integer n: ns2clc(clc2ns(x)) = ns2clc(((x shift) + mult - 1) / mult) = x shift) + mult - 1) / mult) * mult) shift = n * mult shift = ((x shift) + k) shift = x + (k shift) So ns2clc(clc2ns(x)) = x for all x 0 that have k = mult - ((x shift) - 1) % mult - 1 1 shift So my correction still stands. Fair enough. 1) We cannot add if we'd overflow 2) For mult = 1 shift it's always correct 3) for mult 1 shift we only apply it to the min value not the max Yeah, I didn't say your code is wrong *here*. I just think that my easier (and so probably faster) code is good enough. Granted. I was stuck in the correctness discussion. So yeah, it does not matter if we steal 30 usec of maximum idle sleep time from a 32kHz clock. OTOH it does not matter much in the setup slow path to take another conditional. :) Best regards and thanks for the nice discussion, Ditto! You saved me from actually sitting down and using the pencil to do the proper math. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hi Thomas, On Thu, Sep 19, 2013 at 04:30:37PM +0200, Thomas Gleixner wrote: > Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use > clockevents_config_and_register() where possible" caused a regression > for some of the converted subarchs. > > The reason is, that the clockevents core code converts the minimal > hardware tick delta to a nanosecond value for core internal > usage. This conversion is affected by integer math rounding loss, so > the backwards conversion to hardware ticks will likely result in a > value which is less than the configured hardware limitation. The > affected subarchs used their own workaround (SIGH!) which got lost in > the conversion. > > The solution for the issue at hand is simple: adding evt->mult - 1 to > the shifted value before the integer divison in the core conversion > function takes care of it. But this only works for the case where for > the scaled math mult/shift pair "mult <= 1 << shift" is true. For the > case where "mult > 1 << shift" we can apply the rounding add only for > the minimum delta value to make sure that the backward conversion is > not less than the given hardware limit. For the upper bound we need to > omit the rounding add, because the backwards conversion is always > larger than the original latch value. That would violate the upper > bound of the hardware device. > > Though looking closer at the details of that function reveals another > bogosity: The upper bounds check is broken as well. Checking for a > resulting "clc" value greater than KTIME_MAX after the conversion is > pointless. The conversion does: > > u64 clc = (latch << evt->shift) / evt->mult; > > So there is no sanity check for (latch << evt->shift) exceeding the > 64bit boundary. The latch argument is "unsigned long", so on a 64bit > arch the handed in argument could easily lead to an unnoticed shift > overflow. With the above rounding fix applied the calculation before > the divison is: > >u64 clc = (latch << evt->shift) + evt->mult - 1; > > So we need to make sure, that neither the shift nor the rounding add > is overflowing the u64 boundary. > > Signed-off-by: Thomas Gleixner > Cc: Russell King - ARM Linux > Cc: Marc Kleine-Budde > Cc: nicolas.fe...@atmel.com > Cc: Marc Pignat > Cc: john.stu...@linaro.org > Cc: ker...@pengutronix.de > Cc: Ronald Wahl > Cc: LAK > Cc: u.kleine-koe...@pengutronix.de > Cc: Ludovic Desroches > > --- > kernel/time/clockevents.c | 64 > +++--- > 1 file changed, 49 insertions(+), 15 deletions(-) > > Index: linux-2.6/kernel/time/clockevents.c > === > --- linux-2.6.orig/kernel/time/clockevents.c > +++ linux-2.6/kernel/time/clockevents.c > @@ -33,29 +33,63 @@ struct ce_unbind { > int res; > }; > > -/** > - * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds > - * @latch: value to convert > - * @evt: pointer to clock event device descriptor > - * > - * Math helper, returns latch value converted to nanoseconds (bound checked) > - */ > -u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) > +static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt, > + bool ismax) > { > u64 clc = (u64) latch << evt->shift; > + u64 rnd = (u64) evt->mult - 1; > > if (unlikely(!evt->mult)) { > evt->mult = 1; > WARN_ON(1); > } I suggest to move the assignment to rnd below this if block as it changes mult. > > + /* > + * Upper bound sanity check. If the backwards conversion is > + * not equal latch, we know that the above shift overflowed. > + */ > + if (clc >> evt->shift) != (u64)latch) You didn't compile test, did you? Also the cast on the rhs isn't needed. > + clc = ~0ULL; > + > + /* > + * Scaled math oddities: > + * > + * For mult <= (1 << shift) we can safely add mult - 1 to > + * prevent integer rounding loss. So the backwards conversion It doesn't prevent inexactness to add mult - 1. It (only) asserts that the ns2delta(delta2ns(latch)) >= latch instead of ... <= latch when not doing it. > + * from nsec to device ticks will be correct. > + * > + * For mult > (1 << shift), i.e. device frequency is > 1GHz we > + * need to be careful. Adding mult - 1 will result in a value > + * which when converted back to device ticks will be larger s/will/can/ > + * than latch by (mult / (1 << shift)) - 1. For the min_delta s/by/by up to/ > + * calculation we still want to apply this in order to stay > + * above the minimum device ticks limit. For the upper limit > + * we would end up with a latch value larger than the upper > + * limit of the device, so we omit the add to stay below the > + * device upper boundary. > + * > + * Also omit the add if it would
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair "mult <= 1 << shift" is true. For the case where "mult > 1 << shift" we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. Signed-off-by: Thomas Gleixner Cc: Russell King - ARM Linux Cc: Marc Kleine-Budde Cc: nicolas.fe...@atmel.com Cc: Marc Pignat Cc: john.stu...@linaro.org Cc: ker...@pengutronix.de Cc: Ronald Wahl Cc: LAK Cc: u.kleine-koe...@pengutronix.de Cc: Ludovic Desroches --- kernel/time/clockevents.c | 64 +++--- 1 file changed, 49 insertions(+), 15 deletions(-) Index: linux-2.6/kernel/time/clockevents.c === --- linux-2.6.orig/kernel/time/clockevents.c +++ linux-2.6/kernel/time/clockevents.c @@ -33,29 +33,63 @@ struct ce_unbind { int res; }; -/** - * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds - * @latch: value to convert - * @evt: pointer to clock event device descriptor - * - * Math helper, returns latch value converted to nanoseconds (bound checked) - */ -u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) +static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt, + bool ismax) { u64 clc = (u64) latch << evt->shift; + u64 rnd = (u64) evt->mult - 1; if (unlikely(!evt->mult)) { evt->mult = 1; WARN_ON(1); } + /* +* Upper bound sanity check. If the backwards conversion is +* not equal latch, we know that the above shift overflowed. +*/ + if (clc >> evt->shift) != (u64)latch) + clc = ~0ULL; + + /* +* Scaled math oddities: +* +* For mult <= (1 << shift) we can safely add mult - 1 to +* prevent integer rounding loss. So the backwards conversion +* from nsec to device ticks will be correct. +* +* For mult > (1 << shift), i.e. device frequency is > 1GHz we +* need to be careful. Adding mult - 1 will result in a value +* which when converted back to device ticks will be larger +* than latch by (mult / (1 << shift)) - 1. For the min_delta +* calculation we still want to apply this in order to stay +* above the minimum device ticks limit. For the upper limit +* we would end up with a latch value larger than the upper +* limit of the device, so we omit the add to stay below the +* device upper boundary. +* +* Also omit the add if it would overflow the u64 boundary. +*/ + if ((~0ULL - clc > rnd) && + (!ismax || evt->mult <= (1U << evt->shift))) + clc += rnd; + do_div(clc, evt->mult); - if (clc < 1000) - clc = 1000; - if (clc > KTIME_MAX) - clc = KTIME_MAX; - return clc; + /* Deltas less than 1usec are pointless noise */ + return clc > 1000 ? clc : 1000; +} + +/** + * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds + * @latch: value to convert + * @evt:
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote: > > On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > > > On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: > > > > Versus the 64bit overflow check, we need to be even more careful. We > > > > need to check for overflowing (1 << 63) - 1 (i.e. the max positive > > > > value which fits into a s64). See clockevents_program_event(). > > > > > > That is because you interpret times < 0 as in the past, right? But note > > > that the interim result we're talking about here is still to be divided > > > by evt->mult. So assuming mult > 1, that check is too strict unless you > > > move it below the do_div in clockevent_delta2ns. For sure it makes sense > > > to use the same value for a and b in the handling: > > > > No, it's not too strict. > > > > nsec = (latch << shift) / mult; > > > > Now the backwards conversion does: > > > > latch = (nsec * mult) >> shift; > > > > So we want nsec * mult to be in the positive range of s64. Which > > means, that latch << shift must be in that range as well. > The backwards conversion is in clockevents_program_event(), right? There > is: > > clc = ((unsigned long long) delta * dev->mult) >> dev->shift; > > So I don't see a problem if nsec * mult overflows (1 << 63) - 1 as long > as it still fits into an unsigned long long (i.e. a 64 bit value). Right. It doesn't matter.
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote: > On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > > On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: > > > Versus the 64bit overflow check, we need to be even more careful. We > > > need to check for overflowing (1 << 63) - 1 (i.e. the max positive > > > value which fits into a s64). See clockevents_program_event(). > > > > That is because you interpret times < 0 as in the past, right? But note > > that the interim result we're talking about here is still to be divided > > by evt->mult. So assuming mult > 1, that check is too strict unless you > > move it below the do_div in clockevent_delta2ns. For sure it makes sense > > to use the same value for a and b in the handling: > > No, it's not too strict. > > nsec = (latch << shift) / mult; > > Now the backwards conversion does: > > latch = (nsec * mult) >> shift; > > So we want nsec * mult to be in the positive range of s64. Which > means, that latch << shift must be in that range as well. The backwards conversion is in clockevents_program_event(), right? There is: clc = ((unsigned long long) delta * dev->mult) >> dev->shift; So I don't see a problem if nsec * mult overflows (1 << 63) - 1 as long as it still fits into an unsigned long long (i.e. a 64 bit value). What am I missing? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Thu, 19 Sep 2013, Uwe Kleine-König wrote: > On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: > > Versus the 64bit overflow check, we need to be even more careful. We > > need to check for overflowing (1 << 63) - 1 (i.e. the max positive > > value which fits into a s64). See clockevents_program_event(). > > That is because you interpret times < 0 as in the past, right? But note > that the interim result we're talking about here is still to be divided > by evt->mult. So assuming mult > 1, that check is too strict unless you > move it below the do_div in clockevent_delta2ns. For sure it makes sense > to use the same value for a and b in the handling: No, it's not too strict. nsec = (latch << shift) / mult; Now the backwards conversion does: latch = (nsec * mult) >> shift; So we want nsec * mult to be in the positive range of s64. Which means, that latch << shift must be in that range as well. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: > On Wed, 18 Sep 2013, Uwe Kleine-König wrote: > > On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote: > > > On Wed, 18 Sep 2013, Uwe Kleine-König wrote: > > > > Another doubt I have is: You changed clockevent_delta2ns to round up now > > > > unconditionally. For the numbers on at91 that doesn't matter, but I > > > > wonder if there are situations that make the timer core violate the > > > > max_delta_ticks condition now. > > > > > > And how so? The + (mult - 1) ensures, that the conversion back to > > > ticks results in the same value as latch. So how should it violate > > > the max boundary? > > That is wrong: > > With max_delta_ticks << shift = n * mult - k for k in [0 .. mult-1] and > > an integer n: > > > > (max_delta_ns * mult) >> shift > > = max_delta_ticks << shift) + mult - 1) / mult) * mult) >> shift > > = (((n * mult - k + mult - 1) / mult) * mult) >> shift > > = n * mult >> shift > > = ((max_delta_ticks << shift) + k) >> shift > > = max_delta_ticks + (k >> shift) > > > > k >> shift is only known to be zero if mult <= 1 << shift (i.e. the same > > condition as above where your 64bit overflow detection is wrong). So > > this can result in values > max_delta_ticks. > > Right, should have waited for coffee actually activating the right > brain cells. > > So the rounding thing works nicely for mult <= (1 << sft). For the > other cases the conversion is wrong by (mult / (1 << sft)) - 1. So for > a 3 GHz clock it's off by 2. That's not an issue for the min value, > but for the max value it might overflow the hardware limit. Not a real > functional issue as we'd just get a pointless short interrupt, but for > correctness reasons and for the sake of NOHZ we need to fix that. Well, you're assuming here that this is how the hardware reacts. Might be worse. But we're on the same side here, it should be fixed. > There is no real correct solution for these cases, which will result > in a correct backwards conversion. We could validate against the max > delta ticks in program_event, but that adds another boundary check to > the hotpath so I rather want to avoid it. > > The simple solution is to check for the following in the conversion > routine: > > if (latch == dev->min_delta_ticks || dev->mult <= (1 << dev->sft)) >clc += mult - 1 I didn't do the necessary math, but I wouldn't be surprised if you'd need to check for latch <= dev->min_delta_ticks + n for some non-negative n. Another simple solution is to apply the uprounding only for min_delta_ticks -> min_delta_ns conversion, but not for the max_delta_ticks -> max_delta_ns conversion. For the first you have to guarantee not to yield lower values so round up, for the latter you must not pass on bigger values so round down. > That ensures, that we never violate the lower boundary independent of > the mult/sft relation. For the max_delta_ticks conversion in the "mult > > (1 << sft)" case we end up with a slightly smaller value, but well > inside the boundaries and close enough to the hardware limitations. Yeah, probably it's not worth to calculate the optimal value (actually for both limits). Anyhow, I will check that on my next shopping tour :-) > Versus the 64bit overflow check, we need to be even more careful. We > need to check for overflowing (1 << 63) - 1 (i.e. the max positive > value which fits into a s64). See clockevents_program_event(). That is because you interpret times < 0 as in the past, right? But note that the interim result we're talking about here is still to be divided by evt->mult. So assuming mult > 1, that check is too strict unless you move it below the do_div in clockevent_delta2ns. For sure it makes sense to use the same value for a and b in the handling: if (calculatedval overflows a) calculatedval = b Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: On Wed, 18 Sep 2013, Uwe Kleine-König wrote: On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote: On Wed, 18 Sep 2013, Uwe Kleine-König wrote: Another doubt I have is: You changed clockevent_delta2ns to round up now unconditionally. For the numbers on at91 that doesn't matter, but I wonder if there are situations that make the timer core violate the max_delta_ticks condition now. And how so? The + (mult - 1) ensures, that the conversion back to ticks results in the same value as latch. So how should it violate the max boundary? That is wrong: With max_delta_ticks shift = n * mult - k for k in [0 .. mult-1] and an integer n: (max_delta_ns * mult) shift = max_delta_ticks shift) + mult - 1) / mult) * mult) shift = (((n * mult - k + mult - 1) / mult) * mult) shift = n * mult shift = ((max_delta_ticks shift) + k) shift = max_delta_ticks + (k shift) k shift is only known to be zero if mult = 1 shift (i.e. the same condition as above where your 64bit overflow detection is wrong). So this can result in values max_delta_ticks. Right, should have waited for coffee actually activating the right brain cells. So the rounding thing works nicely for mult = (1 sft). For the other cases the conversion is wrong by (mult / (1 sft)) - 1. So for a 3 GHz clock it's off by 2. That's not an issue for the min value, but for the max value it might overflow the hardware limit. Not a real functional issue as we'd just get a pointless short interrupt, but for correctness reasons and for the sake of NOHZ we need to fix that. Well, you're assuming here that this is how the hardware reacts. Might be worse. But we're on the same side here, it should be fixed. There is no real correct solution for these cases, which will result in a correct backwards conversion. We could validate against the max delta ticks in program_event, but that adds another boundary check to the hotpath so I rather want to avoid it. The simple solution is to check for the following in the conversion routine: if (latch == dev-min_delta_ticks || dev-mult = (1 dev-sft)) clc += mult - 1 I didn't do the necessary math, but I wouldn't be surprised if you'd need to check for latch = dev-min_delta_ticks + n for some non-negative n. Another simple solution is to apply the uprounding only for min_delta_ticks - min_delta_ns conversion, but not for the max_delta_ticks - max_delta_ns conversion. For the first you have to guarantee not to yield lower values so round up, for the latter you must not pass on bigger values so round down. That ensures, that we never violate the lower boundary independent of the mult/sft relation. For the max_delta_ticks conversion in the mult (1 sft) case we end up with a slightly smaller value, but well inside the boundaries and close enough to the hardware limitations. Yeah, probably it's not worth to calculate the optimal value (actually for both limits). Anyhow, I will check that on my next shopping tour :-) Versus the 64bit overflow check, we need to be even more careful. We need to check for overflowing (1 63) - 1 (i.e. the max positive value which fits into a s64). See clockevents_program_event(). That is because you interpret times 0 as in the past, right? But note that the interim result we're talking about here is still to be divided by evt-mult. So assuming mult 1, that check is too strict unless you move it below the do_div in clockevent_delta2ns. For sure it makes sense to use the same value for a and b in the handling: if (calculatedval overflows a) calculatedval = b Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Thu, 19 Sep 2013, Uwe Kleine-König wrote: On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: Versus the 64bit overflow check, we need to be even more careful. We need to check for overflowing (1 63) - 1 (i.e. the max positive value which fits into a s64). See clockevents_program_event(). That is because you interpret times 0 as in the past, right? But note that the interim result we're talking about here is still to be divided by evt-mult. So assuming mult 1, that check is too strict unless you move it below the do_div in clockevent_delta2ns. For sure it makes sense to use the same value for a and b in the handling: No, it's not too strict. nsec = (latch shift) / mult; Now the backwards conversion does: latch = (nsec * mult) shift; So we want nsec * mult to be in the positive range of s64. Which means, that latch shift must be in that range as well. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote: On Thu, 19 Sep 2013, Uwe Kleine-König wrote: On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: Versus the 64bit overflow check, we need to be even more careful. We need to check for overflowing (1 63) - 1 (i.e. the max positive value which fits into a s64). See clockevents_program_event(). That is because you interpret times 0 as in the past, right? But note that the interim result we're talking about here is still to be divided by evt-mult. So assuming mult 1, that check is too strict unless you move it below the do_div in clockevent_delta2ns. For sure it makes sense to use the same value for a and b in the handling: No, it's not too strict. nsec = (latch shift) / mult; Now the backwards conversion does: latch = (nsec * mult) shift; So we want nsec * mult to be in the positive range of s64. Which means, that latch shift must be in that range as well. The backwards conversion is in clockevents_program_event(), right? There is: clc = ((unsigned long long) delta * dev-mult) dev-shift; So I don't see a problem if nsec * mult overflows (1 63) - 1 as long as it still fits into an unsigned long long (i.e. a 64 bit value). What am I missing? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Thu, 19 Sep 2013, Uwe Kleine-König wrote: On Thu, Sep 19, 2013 at 12:15:10PM +0200, Thomas Gleixner wrote: On Thu, 19 Sep 2013, Uwe Kleine-König wrote: On Thu, Sep 19, 2013 at 12:01:25AM +0200, Thomas Gleixner wrote: Versus the 64bit overflow check, we need to be even more careful. We need to check for overflowing (1 63) - 1 (i.e. the max positive value which fits into a s64). See clockevents_program_event(). That is because you interpret times 0 as in the past, right? But note that the interim result we're talking about here is still to be divided by evt-mult. So assuming mult 1, that check is too strict unless you move it below the do_div in clockevent_delta2ns. For sure it makes sense to use the same value for a and b in the handling: No, it's not too strict. nsec = (latch shift) / mult; Now the backwards conversion does: latch = (nsec * mult) shift; So we want nsec * mult to be in the positive range of s64. Which means, that latch shift must be in that range as well. The backwards conversion is in clockevents_program_event(), right? There is: clc = ((unsigned long long) delta * dev-mult) dev-shift; So I don't see a problem if nsec * mult overflows (1 63) - 1 as long as it still fits into an unsigned long long (i.e. a 64 bit value). Right. It doesn't matter.
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use clockevents_config_and_register() where possible caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt-mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair mult = 1 shift is true. For the case where mult 1 shift we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting clc value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch evt-shift) / evt-mult; So there is no sanity check for (latch evt-shift) exceeding the 64bit boundary. The latch argument is unsigned long, so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch evt-shift) + evt-mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. Signed-off-by: Thomas Gleixner t...@linutronix.de Cc: Russell King - ARM Linux li...@arm.linux.org.uk Cc: Marc Kleine-Budde m...@pengutronix.de Cc: nicolas.fe...@atmel.com Cc: Marc Pignat marc.pig...@hevs.ch Cc: john.stu...@linaro.org Cc: ker...@pengutronix.de Cc: Ronald Wahl ronald.w...@raritan.com Cc: LAK linux-arm-ker...@lists.infradead.org Cc: u.kleine-koe...@pengutronix.de Cc: Ludovic Desroches ludovic.desroc...@atmel.com --- kernel/time/clockevents.c | 64 +++--- 1 file changed, 49 insertions(+), 15 deletions(-) Index: linux-2.6/kernel/time/clockevents.c === --- linux-2.6.orig/kernel/time/clockevents.c +++ linux-2.6/kernel/time/clockevents.c @@ -33,29 +33,63 @@ struct ce_unbind { int res; }; -/** - * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds - * @latch: value to convert - * @evt: pointer to clock event device descriptor - * - * Math helper, returns latch value converted to nanoseconds (bound checked) - */ -u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) +static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt, + bool ismax) { u64 clc = (u64) latch evt-shift; + u64 rnd = (u64) evt-mult - 1; if (unlikely(!evt-mult)) { evt-mult = 1; WARN_ON(1); } + /* +* Upper bound sanity check. If the backwards conversion is +* not equal latch, we know that the above shift overflowed. +*/ + if (clc evt-shift) != (u64)latch) + clc = ~0ULL; + + /* +* Scaled math oddities: +* +* For mult = (1 shift) we can safely add mult - 1 to +* prevent integer rounding loss. So the backwards conversion +* from nsec to device ticks will be correct. +* +* For mult (1 shift), i.e. device frequency is 1GHz we +* need to be careful. Adding mult - 1 will result in a value +* which when converted back to device ticks will be larger +* than latch by (mult / (1 shift)) - 1. For the min_delta +* calculation we still want to apply this in order to stay +* above the minimum device ticks limit. For the upper limit +* we would end up with a latch value larger than the upper +* limit of the device, so we omit the add to stay below the +* device upper boundary. +* +* Also omit the add if it would overflow the u64 boundary. +*/ + if ((~0ULL - clc rnd) + (!ismax || evt-mult = (1U evt-shift))) + clc += rnd; + do_div(clc, evt-mult); - if (clc 1000) - clc = 1000; - if (clc KTIME_MAX) - clc = KTIME_MAX; - return clc; + /* Deltas less than 1usec are pointless noise */ + return clc 1000 ? clc : 1000; +} + +/** + * clockevents_delta2ns -
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hi Thomas, On Thu, Sep 19, 2013 at 04:30:37PM +0200, Thomas Gleixner wrote: Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use clockevents_config_and_register() where possible caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt-mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair mult = 1 shift is true. For the case where mult 1 shift we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting clc value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch evt-shift) / evt-mult; So there is no sanity check for (latch evt-shift) exceeding the 64bit boundary. The latch argument is unsigned long, so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch evt-shift) + evt-mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. Signed-off-by: Thomas Gleixner t...@linutronix.de Cc: Russell King - ARM Linux li...@arm.linux.org.uk Cc: Marc Kleine-Budde m...@pengutronix.de Cc: nicolas.fe...@atmel.com Cc: Marc Pignat marc.pig...@hevs.ch Cc: john.stu...@linaro.org Cc: ker...@pengutronix.de Cc: Ronald Wahl ronald.w...@raritan.com Cc: LAK linux-arm-ker...@lists.infradead.org Cc: u.kleine-koe...@pengutronix.de Cc: Ludovic Desroches ludovic.desroc...@atmel.com --- kernel/time/clockevents.c | 64 +++--- 1 file changed, 49 insertions(+), 15 deletions(-) Index: linux-2.6/kernel/time/clockevents.c === --- linux-2.6.orig/kernel/time/clockevents.c +++ linux-2.6/kernel/time/clockevents.c @@ -33,29 +33,63 @@ struct ce_unbind { int res; }; -/** - * clockevents_delta2ns - Convert a latch value (device ticks) to nanoseconds - * @latch: value to convert - * @evt: pointer to clock event device descriptor - * - * Math helper, returns latch value converted to nanoseconds (bound checked) - */ -u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) +static u64 cev_delta2ns(unsigned long latch, struct clock_event_device *evt, + bool ismax) { u64 clc = (u64) latch evt-shift; + u64 rnd = (u64) evt-mult - 1; if (unlikely(!evt-mult)) { evt-mult = 1; WARN_ON(1); } I suggest to move the assignment to rnd below this if block as it changes mult. + /* + * Upper bound sanity check. If the backwards conversion is + * not equal latch, we know that the above shift overflowed. + */ + if (clc evt-shift) != (u64)latch) You didn't compile test, did you? Also the cast on the rhs isn't needed. + clc = ~0ULL; + + /* + * Scaled math oddities: + * + * For mult = (1 shift) we can safely add mult - 1 to + * prevent integer rounding loss. So the backwards conversion It doesn't prevent inexactness to add mult - 1. It (only) asserts that the ns2delta(delta2ns(latch)) = latch instead of ... = latch when not doing it. + * from nsec to device ticks will be correct. + * + * For mult (1 shift), i.e. device frequency is 1GHz we + * need to be careful. Adding mult - 1 will result in a value + * which when converted back to device ticks will be larger s/will/can/ + * than latch by (mult / (1 shift)) - 1. For the min_delta s/by/by up to/ + * calculation we still want to apply this in order to stay + * above the minimum device ticks limit. For the upper limit + * we would end up with a latch value larger than the upper + * limit of the device, so we omit the add to stay below the + * device upper boundary. + * + * Also omit the add if it
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Wed, 18 Sep 2013, Uwe Kleine-König wrote: > On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote: > > On Wed, 18 Sep 2013, Uwe Kleine-König wrote: > > > Another doubt I have is: You changed clockevent_delta2ns to round up now > > > unconditionally. For the numbers on at91 that doesn't matter, but I > > > wonder if there are situations that make the timer core violate the > > > max_delta_ticks condition now. > > > > And how so? The + (mult - 1) ensures, that the conversion back to > > ticks results in the same value as latch. So how should it violate > > the max boundary? > That is wrong: > With max_delta_ticks << shift = n * mult - k for k in [0 .. mult-1] and > an integer n: > > (max_delta_ns * mult) >> shift > = max_delta_ticks << shift) + mult - 1) / mult) * mult) >> shift > = (((n * mult - k + mult - 1) / mult) * mult) >> shift > = n * mult >> shift > = ((max_delta_ticks << shift) + k) >> shift > = max_delta_ticks + (k >> shift) > > k >> shift is only known to be zero if mult <= 1 << shift (i.e. the same > condition as above where your 64bit overflow detection is wrong). So > this can result in values > max_delta_ticks. Right, should have waited for coffee actually activating the right brain cells. So the rounding thing works nicely for mult <= (1 << sft). For the other cases the conversion is wrong by (mult / (1 << sft)) - 1. So for a 3 GHz clock it's off by 2. That's not an issue for the min value, but for the max value it might overflow the hardware limit. Not a real functional issue as we'd just get a pointless short interrupt, but for correctness reasons and for the sake of NOHZ we need to fix that. There is no real correct solution for these cases, which will result in a correct backwards conversion. We could validate against the max delta ticks in program_event, but that adds another boundary check to the hotpath so I rather want to avoid it. The simple solution is to check for the following in the conversion routine: if (latch == dev->min_delta_ticks || dev->mult <= (1 << dev->sft)) clc += mult - 1 That ensures, that we never violate the lower boundary independent of the mult/sft relation. For the max_delta_ticks conversion in the "mult > (1 << sft)" case we end up with a slightly smaller value, but well inside the boundaries and close enough to the hardware limitations. Versus the 64bit overflow check, we need to be even more careful. We need to check for overflowing (1 << 63) - 1 (i.e. the max positive value which fits into a s64). See clockevents_program_event(). So this check needs to be: u64 clc = (u64)latch << sft; if ((s64)clc < 0 || (clc >> sft ) != latch) clc = (1 << 63) - 1; And the above rounding magic, which has to be applied after this needs to take the following condition into account as well: (1 << 63) - 1 - clc >= mult - 1 If that's not true, we can't do the rounding add, but we know that we are well at the upper limit of the hardware, so no issue either. > > > > for boundary violation and can limit "clc" to (1 << 63) - 1 before the > > > Where does this magic constant come from? > > > > Rolling my magic hex dice gave me that. > Wow, how many sides does your dice have? Couldn't it have choosen > (u64)-1 for improved results? No it's restricted to s64 positiv values due to software limitations. See above. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote: > On Wed, 18 Sep 2013, Uwe Kleine-König wrote: > > > Now we can easily verify whether the whole equation fits into the > > > 64bit boundary. Shifting the "clc" result back by evt->shift MUST > > > result in "latch". If that's not the case, we have a clear indicator > > But this is only the case if evt->mult is <= (1 << evt->shift). Is this > > always given? > > Crap, no. It's only true for device frequency <= 1GHz. Good catch! > > > Is it more sensible to adjust dev->max_delta_ns once at register time > > and so save the often recurrent overflow check in > > clockevents_program_event? > > Which overflow check are you talking about? > > There is only the boundary check: > > delta = min(delta, (int64_t) dev->max_delta_ns); > delta = max(delta, (int64_t) dev->min_delta_ns); > > Which sensible adjustment at register time is going to remove that? My idea was that wouldn't need to add if ((clc >> evt->shift) != (u64)latch) ... to clockevent_delta2ns (not clockevents_program_event as I wrote) if dev->max_delta_ns was small enough. So max_delta_ns would be the minimum of the hardware limit and the value to prevent an overflow. Not sure any more that this works though. > > Another doubt I have is: You changed clockevent_delta2ns to round up now > > unconditionally. For the numbers on at91 that doesn't matter, but I > > wonder if there are situations that make the timer core violate the > > max_delta_ticks condition now. > > And how so? The + (mult - 1) ensures, that the conversion back to > ticks results in the same value as latch. So how should it violate > the max boundary? That is wrong: With max_delta_ticks << shift = n * mult - k for k in [0 .. mult-1] and an integer n: (max_delta_ns * mult) >> shift = max_delta_ticks << shift) + mult - 1) / mult) * mult) >> shift = (((n * mult - k + mult - 1) / mult) * mult) >> shift = n * mult >> shift = ((max_delta_ticks << shift) + k) >> shift = max_delta_ticks + (k >> shift) k >> shift is only known to be zero if mult <= 1 << shift (i.e. the same condition as above where your 64bit overflow detection is wrong). So this can result in values > max_delta_ticks. > Math is hard, right? Yes, if it involves integer division and overflow handling it's hard to come up with correct solutions during shopping. ;-) > > > for boundary violation and can limit "clc" to (1 << 63) - 1 before the > > Where does this magic constant come from? > > Rolling my magic hex dice gave me that. Wow, how many sides does your dice have? Couldn't it have choosen (u64)-1 for improved results? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Wed, 18 Sep 2013, Uwe Kleine-König wrote: > > Now we can easily verify whether the whole equation fits into the > > 64bit boundary. Shifting the "clc" result back by evt->shift MUST > > result in "latch". If that's not the case, we have a clear indicator > But this is only the case if evt->mult is <= (1 << evt->shift). Is this > always given? Crap, no. It's only true for device frequency <= 1GHz. Good catch! > Is it more sensible to adjust dev->max_delta_ns once at register time > and so save the often recurrent overflow check in > clockevents_program_event? Which overflow check are you talking about? There is only the boundary check: delta = min(delta, (int64_t) dev->max_delta_ns); delta = max(delta, (int64_t) dev->min_delta_ns); Which sensible adjustment at register time is going to remove that? > Another doubt I have is: You changed clockevent_delta2ns to round up now > unconditionally. For the numbers on at91 that doesn't matter, but I > wonder if there are situations that make the timer core violate the > max_delta_ticks condition now. And how so? The + (mult - 1) ensures, that the conversion back to ticks results in the same value as latch. So how should it violate the max boundary? Math is hard, right? > > for boundary violation and can limit "clc" to (1 << 63) - 1 before the > Where does this magic constant come from? Rolling my magic hex dice gave me that. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Tue, Sep 17, 2013 at 11:15:20PM +0200, Thomas Gleixner wrote: > Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use > clockevents_config_and_register() where possible" caused a regression > for some of the converted subarchs. > > The reason is, that the clockevents core code converts the minimal > hardware tick delta to a nanosecond value for core internal > usage. This conversion is affected by integer math rounding loss, so > the backwards conversion to hardware ticks will likely result in a > value which is less than the configured hardware limitation. The > affected subarchs used their own workaround (SIGH!) which got lost in > the conversion. > > Now instead of fixing the underlying core code problem, Marcs patch s/Marcs/Marc's/ > tried to work around the core code issue by increasing the minimal > tick delta at clockevents registration time so the resulting limit in > the core code backwards conversion did not violate the hardware > limits. More SIGH! > > The solution for the issue at hand is simple: adding evt->mult - 1 to > the shifted value before the integer divison in the core conversion > function takes care of it. > > Though looking closer at the details of that function reveals another > bogosity: The upper bounds check is broken as well. Checking for a > resulting "clc" value greater than KTIME_MAX after the conversion is > pointless. The conversion does: > > u64 clc = (latch << evt->shift) / evt->mult; > > So there is no sanity check for (latch << evt->shift) exceeding the > 64bit boundary. The latch argument is "unsigned long", so on a 64bit > arch the handed in argument could easily lead to an unnoticed shift > overflow. With the above rounding fix applied the calculation before > the divison is: > >u64 clc = (latch << evt->shift) + evt->mult - 1; > > Now we can easily verify whether the whole equation fits into the > 64bit boundary. Shifting the "clc" result back by evt->shift MUST > result in "latch". If that's not the case, we have a clear indicator But this is only the case if evt->mult is <= (1 << evt->shift). Is this always given? Is it more sensible to adjust dev->max_delta_ns once at register time and so save the often recurrent overflow check in clockevents_program_event? Another doubt I have is: You changed clockevent_delta2ns to round up now unconditionally. For the numbers on at91 that doesn't matter, but I wonder if there are situations that make the timer core violate the max_delta_ticks condition now. > for boundary violation and can limit "clc" to (1 << 63) - 1 before the Where does this magic constant come from? Best regards Uwe > divison by evt->mult. The resulting nsec * evt->mult in the > programming path will therefor always be in the 64bit boundary. > > Signed-off-by: Thomas Gleixner > --- > diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c > index 38959c8..4fc4826 100644 > --- a/kernel/time/clockevents.c > +++ b/kernel/time/clockevents.c > @@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct > clock_event_device *evt) > WARN_ON(1); > } > > + /* > + * Prevent integer rounding loss, otherwise the backward > + * conversion from nsec to ticks could result in a value less > + * than evt->min_delta_ticks. > + */ > + clc += evt->mult - 1; > + > + /* > + * Upper bound sanity check. If the backwards conversion is > + * not equal latch, we know that the above (shift + rounding > + * correction) exceeded the 64 bit boundary. > + */ > + if ((clc >> evt->shift) != (u64)latch) > + clc = ((u64)1 << 63) - 1; > + > do_div(clc, evt->mult); > - if (clc < 1000) > - clc = 1000; > - if (clc > KTIME_MAX) > - clc = KTIME_MAX; > > - return clc; > + /* Deltas less than 1usec are pointless noise */ > + return clc > 1000 ? clc : 1000; > } > EXPORT_SYMBOL_GPL(clockevent_delta2ns); > -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Tue, Sep 17, 2013 at 11:15:20PM +0200, Thomas Gleixner wrote: Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use clockevents_config_and_register() where possible caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. Now instead of fixing the underlying core code problem, Marcs patch s/Marcs/Marc's/ tried to work around the core code issue by increasing the minimal tick delta at clockevents registration time so the resulting limit in the core code backwards conversion did not violate the hardware limits. More SIGH! The solution for the issue at hand is simple: adding evt-mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting clc value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch evt-shift) / evt-mult; So there is no sanity check for (latch evt-shift) exceeding the 64bit boundary. The latch argument is unsigned long, so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch evt-shift) + evt-mult - 1; Now we can easily verify whether the whole equation fits into the 64bit boundary. Shifting the clc result back by evt-shift MUST result in latch. If that's not the case, we have a clear indicator But this is only the case if evt-mult is = (1 evt-shift). Is this always given? Is it more sensible to adjust dev-max_delta_ns once at register time and so save the often recurrent overflow check in clockevents_program_event? Another doubt I have is: You changed clockevent_delta2ns to round up now unconditionally. For the numbers on at91 that doesn't matter, but I wonder if there are situations that make the timer core violate the max_delta_ticks condition now. for boundary violation and can limit clc to (1 63) - 1 before the Where does this magic constant come from? Best regards Uwe divison by evt-mult. The resulting nsec * evt-mult in the programming path will therefor always be in the 64bit boundary. Signed-off-by: Thomas Gleixner t...@linutronix.de --- diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index 38959c8..4fc4826 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) WARN_ON(1); } + /* + * Prevent integer rounding loss, otherwise the backward + * conversion from nsec to ticks could result in a value less + * than evt-min_delta_ticks. + */ + clc += evt-mult - 1; + + /* + * Upper bound sanity check. If the backwards conversion is + * not equal latch, we know that the above (shift + rounding + * correction) exceeded the 64 bit boundary. + */ + if ((clc evt-shift) != (u64)latch) + clc = ((u64)1 63) - 1; + do_div(clc, evt-mult); - if (clc 1000) - clc = 1000; - if (clc KTIME_MAX) - clc = KTIME_MAX; - return clc; + /* Deltas less than 1usec are pointless noise */ + return clc 1000 ? clc : 1000; } EXPORT_SYMBOL_GPL(clockevent_delta2ns); -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Wed, 18 Sep 2013, Uwe Kleine-König wrote: Now we can easily verify whether the whole equation fits into the 64bit boundary. Shifting the clc result back by evt-shift MUST result in latch. If that's not the case, we have a clear indicator But this is only the case if evt-mult is = (1 evt-shift). Is this always given? Crap, no. It's only true for device frequency = 1GHz. Good catch! Is it more sensible to adjust dev-max_delta_ns once at register time and so save the often recurrent overflow check in clockevents_program_event? Which overflow check are you talking about? There is only the boundary check: delta = min(delta, (int64_t) dev-max_delta_ns); delta = max(delta, (int64_t) dev-min_delta_ns); Which sensible adjustment at register time is going to remove that? Another doubt I have is: You changed clockevent_delta2ns to round up now unconditionally. For the numbers on at91 that doesn't matter, but I wonder if there are situations that make the timer core violate the max_delta_ticks condition now. And how so? The + (mult - 1) ensures, that the conversion back to ticks results in the same value as latch. So how should it violate the max boundary? Math is hard, right? for boundary violation and can limit clc to (1 63) - 1 before the Where does this magic constant come from? Rolling my magic hex dice gave me that. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
Hello Thomas, On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote: On Wed, 18 Sep 2013, Uwe Kleine-König wrote: Now we can easily verify whether the whole equation fits into the 64bit boundary. Shifting the clc result back by evt-shift MUST result in latch. If that's not the case, we have a clear indicator But this is only the case if evt-mult is = (1 evt-shift). Is this always given? Crap, no. It's only true for device frequency = 1GHz. Good catch! Is it more sensible to adjust dev-max_delta_ns once at register time and so save the often recurrent overflow check in clockevents_program_event? Which overflow check are you talking about? There is only the boundary check: delta = min(delta, (int64_t) dev-max_delta_ns); delta = max(delta, (int64_t) dev-min_delta_ns); Which sensible adjustment at register time is going to remove that? My idea was that wouldn't need to add if ((clc evt-shift) != (u64)latch) ... to clockevent_delta2ns (not clockevents_program_event as I wrote) if dev-max_delta_ns was small enough. So max_delta_ns would be the minimum of the hardware limit and the value to prevent an overflow. Not sure any more that this works though. Another doubt I have is: You changed clockevent_delta2ns to round up now unconditionally. For the numbers on at91 that doesn't matter, but I wonder if there are situations that make the timer core violate the max_delta_ticks condition now. And how so? The + (mult - 1) ensures, that the conversion back to ticks results in the same value as latch. So how should it violate the max boundary? That is wrong: With max_delta_ticks shift = n * mult - k for k in [0 .. mult-1] and an integer n: (max_delta_ns * mult) shift = max_delta_ticks shift) + mult - 1) / mult) * mult) shift = (((n * mult - k + mult - 1) / mult) * mult) shift = n * mult shift = ((max_delta_ticks shift) + k) shift = max_delta_ticks + (k shift) k shift is only known to be zero if mult = 1 shift (i.e. the same condition as above where your 64bit overflow detection is wrong). So this can result in values max_delta_ticks. Math is hard, right? Yes, if it involves integer division and overflow handling it's hard to come up with correct solutions during shopping. ;-) for boundary violation and can limit clc to (1 63) - 1 before the Where does this magic constant come from? Rolling my magic hex dice gave me that. Wow, how many sides does your dice have? Couldn't it have choosen (u64)-1 for improved results? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König| Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Wed, 18 Sep 2013, Uwe Kleine-König wrote: On Wed, Sep 18, 2013 at 11:38:07AM +0200, Thomas Gleixner wrote: On Wed, 18 Sep 2013, Uwe Kleine-König wrote: Another doubt I have is: You changed clockevent_delta2ns to round up now unconditionally. For the numbers on at91 that doesn't matter, but I wonder if there are situations that make the timer core violate the max_delta_ticks condition now. And how so? The + (mult - 1) ensures, that the conversion back to ticks results in the same value as latch. So how should it violate the max boundary? That is wrong: With max_delta_ticks shift = n * mult - k for k in [0 .. mult-1] and an integer n: (max_delta_ns * mult) shift = max_delta_ticks shift) + mult - 1) / mult) * mult) shift = (((n * mult - k + mult - 1) / mult) * mult) shift = n * mult shift = ((max_delta_ticks shift) + k) shift = max_delta_ticks + (k shift) k shift is only known to be zero if mult = 1 shift (i.e. the same condition as above where your 64bit overflow detection is wrong). So this can result in values max_delta_ticks. Right, should have waited for coffee actually activating the right brain cells. So the rounding thing works nicely for mult = (1 sft). For the other cases the conversion is wrong by (mult / (1 sft)) - 1. So for a 3 GHz clock it's off by 2. That's not an issue for the min value, but for the max value it might overflow the hardware limit. Not a real functional issue as we'd just get a pointless short interrupt, but for correctness reasons and for the sake of NOHZ we need to fix that. There is no real correct solution for these cases, which will result in a correct backwards conversion. We could validate against the max delta ticks in program_event, but that adds another boundary check to the hotpath so I rather want to avoid it. The simple solution is to check for the following in the conversion routine: if (latch == dev-min_delta_ticks || dev-mult = (1 dev-sft)) clc += mult - 1 That ensures, that we never violate the lower boundary independent of the mult/sft relation. For the max_delta_ticks conversion in the mult (1 sft) case we end up with a slightly smaller value, but well inside the boundaries and close enough to the hardware limitations. Versus the 64bit overflow check, we need to be even more careful. We need to check for overflowing (1 63) - 1 (i.e. the max positive value which fits into a s64). See clockevents_program_event(). So this check needs to be: u64 clc = (u64)latch sft; if ((s64)clc 0 || (clc sft ) != latch) clc = (1 63) - 1; And the above rounding magic, which has to be applied after this needs to take the following condition into account as well: (1 63) - 1 - clc = mult - 1 If that's not true, we can't do the rounding add, but we know that we are well at the upper limit of the hardware, so no issue either. for boundary violation and can limit clc to (1 63) - 1 before the Where does this magic constant come from? Rolling my magic hex dice gave me that. Wow, how many sides does your dice have? Couldn't it have choosen (u64)-1 for improved results? No it's restricted to s64 positiv values due to software limitations. See above. Thanks, tglx
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Wed, 18 Sep 2013, Marc Kleine-Budde wrote: > On 09/17/2013 11:15 PM, Thomas Gleixner wrote: > > Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use > > clockevents_config_and_register() where possible" caused a regression > > for some of the converted subarchs. > > > > The reason is, that the clockevents core code converts the minimal > > hardware tick delta to a nanosecond value for core internal > > usage. This conversion is affected by integer math rounding loss, so > > the backwards conversion to hardware ticks will likely result in a > > value which is less than the configured hardware limitation. The > > affected subarchs used their own workaround (SIGH!) which got lost in > > the conversion. > > > > Now instead of fixing the underlying core code problem, Marcs patch > > tried to work around the core code issue by increasing the minimal > > tick delta at clockevents registration time so the resulting limit in > > the core code backwards conversion did not violate the hardware > > limits. More SIGH! > > It was not easy getting your attention with this problem :) A really hard to understand and solve problem! Obviously you are a great fan of John Stultz famous "Math is hard, lets go shopping!" line. Why on earth do you need my attention to fix at least the simple rounding issue in the core code? You could have sent the "clk += evt->mult - 1:" patch directly to Linus without even cc'ing me. It's not rocket science, really. When I responded with the while(1) loop patch today in the morning it was just because I did not have a second to think about the correct fix for this, but the reason WHY this happened was entirely clear in a split second after reading Russells reply. The upper bounds issue tooks a few minutes more, but it's not rocket science either. Sorry, I can understand the need for a second opinion on a weird race condition problem, but tracking down an obvious simple math issue to the point and fixing it... SIGH! Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On 09/17/2013 11:15 PM, Thomas Gleixner wrote: > Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use > clockevents_config_and_register() where possible" caused a regression > for some of the converted subarchs. > > The reason is, that the clockevents core code converts the minimal > hardware tick delta to a nanosecond value for core internal > usage. This conversion is affected by integer math rounding loss, so > the backwards conversion to hardware ticks will likely result in a > value which is less than the configured hardware limitation. The > affected subarchs used their own workaround (SIGH!) which got lost in > the conversion. > > Now instead of fixing the underlying core code problem, Marcs patch > tried to work around the core code issue by increasing the minimal > tick delta at clockevents registration time so the resulting limit in > the core code backwards conversion did not violate the hardware > limits. More SIGH! It was not easy getting your attention with this problem :) Marc -- Pengutronix e.K. | Marc Kleine-Budde | Industrial Linux Solutions| Phone: +49-231-2826-924 | Vertretung West/Dortmund | Fax: +49-5121-206917- | Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de | signature.asc Description: OpenPGP digital signature
[PATCH] clockevents: Sanitize ticks to nsec conversion
Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. Now instead of fixing the underlying core code problem, Marcs patch tried to work around the core code issue by increasing the minimal tick delta at clockevents registration time so the resulting limit in the core code backwards conversion did not violate the hardware limits. More SIGH! The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; Now we can easily verify whether the whole equation fits into the 64bit boundary. Shifting the "clc" result back by evt->shift MUST result in "latch". If that's not the case, we have a clear indicator for boundary violation and can limit "clc" to (1 << 63) - 1 before the divison by evt->mult. The resulting nsec * evt->mult in the programming path will therefor always be in the 64bit boundary. Signed-off-by: Thomas Gleixner --- diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index 38959c8..4fc4826 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) WARN_ON(1); } + /* +* Prevent integer rounding loss, otherwise the backward +* conversion from nsec to ticks could result in a value less +* than evt->min_delta_ticks. +*/ + clc += evt->mult - 1; + + /* +* Upper bound sanity check. If the backwards conversion is +* not equal latch, we know that the above (shift + rounding +* correction) exceeded the 64 bit boundary. +*/ + if ((clc >> evt->shift) != (u64)latch) + clc = ((u64)1 << 63) - 1; + do_div(clc, evt->mult); - if (clc < 1000) - clc = 1000; - if (clc > KTIME_MAX) - clc = KTIME_MAX; - return clc; + /* Deltas less than 1usec are pointless noise */ + return clc > 1000 ? clc : 1000; } EXPORT_SYMBOL_GPL(clockevent_delta2ns); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] clockevents: Sanitize ticks to nsec conversion
Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use clockevents_config_and_register() where possible caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. Now instead of fixing the underlying core code problem, Marcs patch tried to work around the core code issue by increasing the minimal tick delta at clockevents registration time so the resulting limit in the core code backwards conversion did not violate the hardware limits. More SIGH! The solution for the issue at hand is simple: adding evt-mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting clc value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch evt-shift) / evt-mult; So there is no sanity check for (latch evt-shift) exceeding the 64bit boundary. The latch argument is unsigned long, so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch evt-shift) + evt-mult - 1; Now we can easily verify whether the whole equation fits into the 64bit boundary. Shifting the clc result back by evt-shift MUST result in latch. If that's not the case, we have a clear indicator for boundary violation and can limit clc to (1 63) - 1 before the divison by evt-mult. The resulting nsec * evt-mult in the programming path will therefor always be in the 64bit boundary. Signed-off-by: Thomas Gleixner t...@linutronix.de --- diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c index 38959c8..4fc4826 100644 --- a/kernel/time/clockevents.c +++ b/kernel/time/clockevents.c @@ -49,13 +49,25 @@ u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt) WARN_ON(1); } + /* +* Prevent integer rounding loss, otherwise the backward +* conversion from nsec to ticks could result in a value less +* than evt-min_delta_ticks. +*/ + clc += evt-mult - 1; + + /* +* Upper bound sanity check. If the backwards conversion is +* not equal latch, we know that the above (shift + rounding +* correction) exceeded the 64 bit boundary. +*/ + if ((clc evt-shift) != (u64)latch) + clc = ((u64)1 63) - 1; + do_div(clc, evt-mult); - if (clc 1000) - clc = 1000; - if (clc KTIME_MAX) - clc = KTIME_MAX; - return clc; + /* Deltas less than 1usec are pointless noise */ + return clc 1000 ? clc : 1000; } EXPORT_SYMBOL_GPL(clockevent_delta2ns); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On 09/17/2013 11:15 PM, Thomas Gleixner wrote: Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use clockevents_config_and_register() where possible caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. Now instead of fixing the underlying core code problem, Marcs patch tried to work around the core code issue by increasing the minimal tick delta at clockevents registration time so the resulting limit in the core code backwards conversion did not violate the hardware limits. More SIGH! It was not easy getting your attention with this problem :) Marc -- Pengutronix e.K. | Marc Kleine-Budde | Industrial Linux Solutions| Phone: +49-231-2826-924 | Vertretung West/Dortmund | Fax: +49-5121-206917- | Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de | signature.asc Description: OpenPGP digital signature
Re: [PATCH] clockevents: Sanitize ticks to nsec conversion
On Wed, 18 Sep 2013, Marc Kleine-Budde wrote: On 09/17/2013 11:15 PM, Thomas Gleixner wrote: Marc Kleine-Budde pointed out, that commit 77cc982 clocksource: use clockevents_config_and_register() where possible caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. Now instead of fixing the underlying core code problem, Marcs patch tried to work around the core code issue by increasing the minimal tick delta at clockevents registration time so the resulting limit in the core code backwards conversion did not violate the hardware limits. More SIGH! It was not easy getting your attention with this problem :) A really hard to understand and solve problem! Obviously you are a great fan of John Stultz famous Math is hard, lets go shopping! line. Why on earth do you need my attention to fix at least the simple rounding issue in the core code? You could have sent the clk += evt-mult - 1: patch directly to Linus without even cc'ing me. It's not rocket science, really. When I responded with the while(1) loop patch today in the morning it was just because I did not have a second to think about the correct fix for this, but the reason WHY this happened was entirely clear in a split second after reading Russells reply. The upper bounds issue tooks a few minutes more, but it's not rocket science either. Sorry, I can understand the need for a second opinion on a weird race condition problem, but tracking down an obvious simple math issue to the point and fixing it... SIGH! Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/