Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Andrew Hunter
On Thu, Sep 4, 2014 at 2:36 PM, Paul Turner  wrote:
> On Thu, Sep 4, 2014 at 2:30 PM, John Stultz  wrote:
>> This seems to be a quite old bug.. Do you think this is needed for -stable?
>
> Seems reasonable to me.
>

I have no opinion: backport or don't at your preference.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Paul Turner
On Thu, Sep 4, 2014 at 2:30 PM, John Stultz  wrote:
> On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter  wrote:
>> On Wed, Sep 3, 2014 at 5:06 PM, John Stultz  wrote:
>>> Maybe with the next version of the patch, before you get into the
>>> unwinding the math, you might practically describe what is broken,
>>> then explain how its broken.
>>>
>>> My quick read here is that we're converting a timespec -> jiffies, and
>>> in doing so we round up by one jiffy.
>>>
>>> This seems actually perfectly normal, as we usually end up rounding up
>>> by a jiffy in many cases since we don't want to undershoot any
>>> timeout, and we're always allowed return later then specified.
>>
>> Well, yes, timeouts can be longer than specified, but what you said
>> technically applies just as much to code that arbitrarily multiplies
>> jiffies by 10 before returning, no? :)
>>
>> The problem isn't the rounding, it's that the rounding is _wrong_: I'm
>> fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current
>> code rounds 1usec / (1000 usec/jiffies) to 11.  I've rewritten the
>> description to make this clearer.
>
> Ok. Very much appreciated!
>
 In particular, with HZ=1000, we consistently computed that 1 usec
 was 11 jiffies; the same was true for any exact multiple of
 TICK_NSEC. This is obviously bad as a general rule, and caused
 observable user problems with setitimer() at the very least:

 setitimer(ITIMER_PROF, , NULL);
 setitimer(ITIMER_PROF, NULL, );

 would actually add a tick to val!
>>>
>>> So this looks like an issue. Since we convert and store the internal
>>> representation in jiffies, when we pull it back out, we get the
>>> rounded up value, which is larger then the timespec value originally
>>> submitted.  This is really the core issue, correct?
>>
>> For the particular user bug reported to me, yes, this was the core
>> issue: some code that stopped and restarted an itimer found the
>> interval growing by 1ms each time.  But again, it wasn't that it was
>> rounded:  if we initially passed e.g. 10500 usec and got back 11000,
>> that'd be annoying but workable, because if we then went through
>> another cycle of enabling/disabling itimer, we'd set it to 11000 usec
>> and get back 11000 again.  What we have now instead adds a full jiffie
>> _every time_.
>
> Ah, ok. This part is key to understanding the problem. Thanks for
> clarifying this.
>
> This seems to be a quite old bug.. Do you think this is needed for -stable?

Seems reasonable to me.

>
> thanks
> -john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread John Stultz
On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter  wrote:
> On Wed, Sep 3, 2014 at 5:06 PM, John Stultz  wrote:
>> Maybe with the next version of the patch, before you get into the
>> unwinding the math, you might practically describe what is broken,
>> then explain how its broken.
>>
>> My quick read here is that we're converting a timespec -> jiffies, and
>> in doing so we round up by one jiffy.
>>
>> This seems actually perfectly normal, as we usually end up rounding up
>> by a jiffy in many cases since we don't want to undershoot any
>> timeout, and we're always allowed return later then specified.
>
> Well, yes, timeouts can be longer than specified, but what you said
> technically applies just as much to code that arbitrarily multiplies
> jiffies by 10 before returning, no? :)
>
> The problem isn't the rounding, it's that the rounding is _wrong_: I'm
> fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current
> code rounds 1usec / (1000 usec/jiffies) to 11.  I've rewritten the
> description to make this clearer.

Ok. Very much appreciated!

>>> In particular, with HZ=1000, we consistently computed that 1 usec
>>> was 11 jiffies; the same was true for any exact multiple of
>>> TICK_NSEC. This is obviously bad as a general rule, and caused
>>> observable user problems with setitimer() at the very least:
>>>
>>> setitimer(ITIMER_PROF, , NULL);
>>> setitimer(ITIMER_PROF, NULL, );
>>>
>>> would actually add a tick to val!
>>
>> So this looks like an issue. Since we convert and store the internal
>> representation in jiffies, when we pull it back out, we get the
>> rounded up value, which is larger then the timespec value originally
>> submitted.  This is really the core issue, correct?
>
> For the particular user bug reported to me, yes, this was the core
> issue: some code that stopped and restarted an itimer found the
> interval growing by 1ms each time.  But again, it wasn't that it was
> rounded:  if we initially passed e.g. 10500 usec and got back 11000,
> that'd be annoying but workable, because if we then went through
> another cycle of enabling/disabling itimer, we'd set it to 11000 usec
> and get back 11000 again.  What we have now instead adds a full jiffie
> _every time_.

Ah, ok. This part is key to understanding the problem. Thanks for
clarifying this.

This seems to be a quite old bug.. Do you think this is needed for -stable?

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Andrew Hunter
timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly repeatedly stopping/starting an itimer:

setitimer(ITIMER_PROF, , NULL);
setitimer(ITIMER_PROF, NULL, );

would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.)  Doing
this repeatedly would cause unbounded growth in val.  So fix the math.

Here's what was wrong with the conversion: we essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x) >> USEC_JIFFIE_SC

and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)

In particular, with HZ=1000, we consistently computed that 1 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 1;
  setitimer(ITIMER_PROF, , NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i < 10; ++i) {
struct itimerval prev;
setitimer(ITIMER_PROF, , );
/* on old kernels, this goes up by TICK_USEC every iteration */
printf("previous value: %ld %ld %ld %ld\n",
   prev.it_interval.tv_sec, prev.it_interval.tv_usec,
   prev.it_value.tv_sec, prev.it_value.tv_usec);
setitimer(ITIMER_PROF, , NULL);
  }
return 0;
}

Signed-off-by: Andrew Hunter 
Reviewed-by: Paul Turner 
Reported-by: Aaron Jacobs 
Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
---
 include/linux/jiffies.h | 12 ---
 kernel/time.c   | 54 +++--
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 1f44466..c367cbd 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
 #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
 #endif
 #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
-#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
 #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << SEC_JIFFIE_SC) 
+\
 TICK_NSEC -1) / (u64)TICK_NSEC))
 
 #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\
 TICK_NSEC -1) / (u64)TICK_NSEC))
-#define USEC_CONVERSION  \
-((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) +\
-TICK_NSEC -1) / (u64)TICK_NSEC))
-/*
- * USEC_ROUND is used in the timeval to jiffie conversion.  See there
- * for more details.  It is the scaled resolution rounding value.  Note
- * that it is a 64-bit value.  Since, when it is applied, we are already
- * in jiffies (albit scaled), it is nothing but the bits we will shift
- * off.
- */
-#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1)
 /*
  * The maximum jiffie value is (MAX_INT >> 1).  Here we translate that
  * into seconds.  The 64-bit case will overflow if we are not careful,
diff --git a/kernel/time.c b/kernel/time.c
index 7c7964c..3c49ab4 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
  * that a remainder subtract here would not do the right thing as the
  * resolution values don't fall on second boundries.  I.e. the line:
  * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
+ * Note that due to the small error in the multiplier here, this
+ * rounding is incorrect for sufficiently large values of tv_nsec, but
+ * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're
+ * OK.
  *
  * Rather, we just shift the bits off the right.
  *
  * The >> (NSEC_JIFFIE_SC - 

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Andrew Hunter
On Wed, Sep 3, 2014 at 5:06 PM, John Stultz  wrote:
> Maybe with the next version of the patch, before you get into the
> unwinding the math, you might practically describe what is broken,
> then explain how its broken.
>

Done.

> My quick read here is that we're converting a timespec -> jiffies, and
> in doing so we round up by one jiffy.
>
> This seems actually perfectly normal, as we usually end up rounding up
> by a jiffy in many cases since we don't want to undershoot any
> timeout, and we're always allowed return later then specified.

Well, yes, timeouts can be longer than specified, but what you said
technically applies just as much to code that arbitrarily multiplies
jiffies by 10 before returning, no? :)

The problem isn't the rounding, it's that the rounding is _wrong_: I'm
fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current
code rounds 1usec / (1000 usec/jiffies) to 11.  I've rewritten the
description to make this clearer.

>
>> In particular, with HZ=1000, we consistently computed that 1 usec
>> was 11 jiffies; the same was true for any exact multiple of
>> TICK_NSEC. This is obviously bad as a general rule, and caused
>> observable user problems with setitimer() at the very least:
>>
>> setitimer(ITIMER_PROF, , NULL);
>> setitimer(ITIMER_PROF, NULL, );
>>
>> would actually add a tick to val!
>
> So this looks like an issue. Since we convert and store the internal
> representation in jiffies, when we pull it back out, we get the
> rounded up value, which is larger then the timespec value originally
> submitted.  This is really the core issue, correct?

For the particular user bug reported to me, yes, this was the core
issue: some code that stopped and restarted an itimer found the
interval growing by 1ms each time.  But again, it wasn't that it was
rounded:  if we initially passed e.g. 10500 usec and got back 11000,
that'd be annoying but workable, because if we then went through
another cycle of enabling/disabling itimer, we'd set it to 11000 usec
and get back 11000 again.  What we have now instead adds a full jiffie
_every time_.

> Or does this
> manifest in other ways that are problematic (the patch seems to hint
> that it does)?
>

Um: hard to be sure. I can tell you that for PROF itimers we track the
difference between the requested timeval in ns and the given number of
jiffies (see the logic with it->error in check_cpu_itimer() in
kernel/posix-cpu-timers.c), so over long periods we'll get the right
sample rate (the only problem is the timeval reported back to
userspace, and the samples might not quite be uniform.)  It appears
there are no other uses of timeval_to_jiffies, so we're probably safe.
But in any case it's wrong as currently coded so we should fix it.

(Actually, one could use the same code tracking it->error to always
return the exact--unrounded--interval to userspace, but that's minor
compared to a constantly growing interval, and I'd rather not
complicate this patch any more.)

>> +   nsec = nsec + TICK_NSEC - 1;
>
>
> If we're still rounding up one tick here, does the same issue not persist?
>

Nope (at least not in tested configurations.) Since jiffie is
effectively defined as some integral number of nsec, this rounding
DTRT to round up to an integral number of jiffies. (The scaled
arithmetic might, for sufficiently large intervals/right values of
timeval produce some error, but I haven't been able to make it do so.)

> Overall the patch doesn't look too bad and does reduce the amount of
> ugly math with simpler code reuse.  But I'd like the patch description
> to be more clear about why this is needed and the impact.
>

Rewritten; sending v2.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Andrew Hunter
On Wed, Sep 3, 2014 at 5:06 PM, John Stultz john.stu...@linaro.org wrote:
 Maybe with the next version of the patch, before you get into the
 unwinding the math, you might practically describe what is broken,
 then explain how its broken.


Done.

 My quick read here is that we're converting a timespec - jiffies, and
 in doing so we round up by one jiffy.

 This seems actually perfectly normal, as we usually end up rounding up
 by a jiffy in many cases since we don't want to undershoot any
 timeout, and we're always allowed return later then specified.

Well, yes, timeouts can be longer than specified, but what you said
technically applies just as much to code that arbitrarily multiplies
jiffies by 10 before returning, no? :)

The problem isn't the rounding, it's that the rounding is _wrong_: I'm
fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current
code rounds 1usec / (1000 usec/jiffies) to 11.  I've rewritten the
description to make this clearer.


 In particular, with HZ=1000, we consistently computed that 1 usec
 was 11 jiffies; the same was true for any exact multiple of
 TICK_NSEC. This is obviously bad as a general rule, and caused
 observable user problems with setitimer() at the very least:

 setitimer(ITIMER_PROF, val, NULL);
 setitimer(ITIMER_PROF, NULL, val);

 would actually add a tick to val!

 So this looks like an issue. Since we convert and store the internal
 representation in jiffies, when we pull it back out, we get the
 rounded up value, which is larger then the timespec value originally
 submitted.  This is really the core issue, correct?

For the particular user bug reported to me, yes, this was the core
issue: some code that stopped and restarted an itimer found the
interval growing by 1ms each time.  But again, it wasn't that it was
rounded:  if we initially passed e.g. 10500 usec and got back 11000,
that'd be annoying but workable, because if we then went through
another cycle of enabling/disabling itimer, we'd set it to 11000 usec
and get back 11000 again.  What we have now instead adds a full jiffie
_every time_.

 Or does this
 manifest in other ways that are problematic (the patch seems to hint
 that it does)?


Um: hard to be sure. I can tell you that for PROF itimers we track the
difference between the requested timeval in ns and the given number of
jiffies (see the logic with it-error in check_cpu_itimer() in
kernel/posix-cpu-timers.c), so over long periods we'll get the right
sample rate (the only problem is the timeval reported back to
userspace, and the samples might not quite be uniform.)  It appears
there are no other uses of timeval_to_jiffies, so we're probably safe.
But in any case it's wrong as currently coded so we should fix it.

(Actually, one could use the same code tracking it-error to always
return the exact--unrounded--interval to userspace, but that's minor
compared to a constantly growing interval, and I'd rather not
complicate this patch any more.)

 +   nsec = nsec + TICK_NSEC - 1;


 If we're still rounding up one tick here, does the same issue not persist?


Nope (at least not in tested configurations.) Since jiffie is
effectively defined as some integral number of nsec, this rounding
DTRT to round up to an integral number of jiffies. (The scaled
arithmetic might, for sufficiently large intervals/right values of
timeval produce some error, but I haven't been able to make it do so.)

 Overall the patch doesn't look too bad and does reduce the amount of
 ugly math with simpler code reuse.  But I'd like the patch description
 to be more clear about why this is needed and the impact.


Rewritten; sending v2.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Andrew Hunter
timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly repeatedly stopping/starting an itimer:

setitimer(ITIMER_PROF, val, NULL);
setitimer(ITIMER_PROF, NULL, val);

would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.)  Doing
this repeatedly would cause unbounded growth in val.  So fix the math.

Here's what was wrong with the conversion: we essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x)  USEC_JIFFIE_SC

and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)

In particular, with HZ=1000, we consistently computed that 1 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec-nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 1;
  setitimer(ITIMER_PROF, initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i  10; ++i) {
struct itimerval prev;
setitimer(ITIMER_PROF, zero, prev);
/* on old kernels, this goes up by TICK_USEC every iteration */
printf(previous value: %ld %ld %ld %ld\n,
   prev.it_interval.tv_sec, prev.it_interval.tv_usec,
   prev.it_value.tv_sec, prev.it_value.tv_usec);
setitimer(ITIMER_PROF, prev, NULL);
  }
return 0;
}

Signed-off-by: Andrew Hunter a...@google.com
Reviewed-by: Paul Turner p...@google.com
Reported-by: Aaron Jacobs jaco...@google.com
Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
---
 include/linux/jiffies.h | 12 ---
 kernel/time.c   | 54 +++--
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 1f44466..c367cbd 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
 #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
 #endif
 #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
-#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
 #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC  SEC_JIFFIE_SC) 
+\
 TICK_NSEC -1) / (u64)TICK_NSEC))
 
 #define NSEC_CONVERSION ((unsigned long)u64)1  NSEC_JIFFIE_SC) +\
 TICK_NSEC -1) / (u64)TICK_NSEC))
-#define USEC_CONVERSION  \
-((unsigned long)u64)NSEC_PER_USEC  USEC_JIFFIE_SC) +\
-TICK_NSEC -1) / (u64)TICK_NSEC))
-/*
- * USEC_ROUND is used in the timeval to jiffie conversion.  See there
- * for more details.  It is the scaled resolution rounding value.  Note
- * that it is a 64-bit value.  Since, when it is applied, we are already
- * in jiffies (albit scaled), it is nothing but the bits we will shift
- * off.
- */
-#define USEC_ROUND (u64)(((u64)1  USEC_JIFFIE_SC) - 1)
 /*
  * The maximum jiffie value is (MAX_INT  1).  Here we translate that
  * into seconds.  The 64-bit case will overflow if we are not careful,
diff --git a/kernel/time.c b/kernel/time.c
index 7c7964c..3c49ab4 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
  * that a remainder subtract here would not do the right thing as the
  * resolution values don't fall on second boundries.  I.e. the line:
  * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
+ * Note that due to the small error in the multiplier here, this
+ * rounding is incorrect for sufficiently large values of tv_nsec, but
+ * well formed timespecs should have tv_nsec  NSEC_PER_SEC, so we're
+ * OK.
  *
  * Rather, we just shift the bits 

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread John Stultz
On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter a...@google.com wrote:
 On Wed, Sep 3, 2014 at 5:06 PM, John Stultz john.stu...@linaro.org wrote:
 Maybe with the next version of the patch, before you get into the
 unwinding the math, you might practically describe what is broken,
 then explain how its broken.

 My quick read here is that we're converting a timespec - jiffies, and
 in doing so we round up by one jiffy.

 This seems actually perfectly normal, as we usually end up rounding up
 by a jiffy in many cases since we don't want to undershoot any
 timeout, and we're always allowed return later then specified.

 Well, yes, timeouts can be longer than specified, but what you said
 technically applies just as much to code that arbitrarily multiplies
 jiffies by 10 before returning, no? :)

 The problem isn't the rounding, it's that the rounding is _wrong_: I'm
 fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current
 code rounds 1usec / (1000 usec/jiffies) to 11.  I've rewritten the
 description to make this clearer.

Ok. Very much appreciated!

 In particular, with HZ=1000, we consistently computed that 1 usec
 was 11 jiffies; the same was true for any exact multiple of
 TICK_NSEC. This is obviously bad as a general rule, and caused
 observable user problems with setitimer() at the very least:

 setitimer(ITIMER_PROF, val, NULL);
 setitimer(ITIMER_PROF, NULL, val);

 would actually add a tick to val!

 So this looks like an issue. Since we convert and store the internal
 representation in jiffies, when we pull it back out, we get the
 rounded up value, which is larger then the timespec value originally
 submitted.  This is really the core issue, correct?

 For the particular user bug reported to me, yes, this was the core
 issue: some code that stopped and restarted an itimer found the
 interval growing by 1ms each time.  But again, it wasn't that it was
 rounded:  if we initially passed e.g. 10500 usec and got back 11000,
 that'd be annoying but workable, because if we then went through
 another cycle of enabling/disabling itimer, we'd set it to 11000 usec
 and get back 11000 again.  What we have now instead adds a full jiffie
 _every time_.

Ah, ok. This part is key to understanding the problem. Thanks for
clarifying this.

This seems to be a quite old bug.. Do you think this is needed for -stable?

thanks
-john
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Paul Turner
On Thu, Sep 4, 2014 at 2:30 PM, John Stultz john.stu...@linaro.org wrote:
 On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter a...@google.com wrote:
 On Wed, Sep 3, 2014 at 5:06 PM, John Stultz john.stu...@linaro.org wrote:
 Maybe with the next version of the patch, before you get into the
 unwinding the math, you might practically describe what is broken,
 then explain how its broken.

 My quick read here is that we're converting a timespec - jiffies, and
 in doing so we round up by one jiffy.

 This seems actually perfectly normal, as we usually end up rounding up
 by a jiffy in many cases since we don't want to undershoot any
 timeout, and we're always allowed return later then specified.

 Well, yes, timeouts can be longer than specified, but what you said
 technically applies just as much to code that arbitrarily multiplies
 jiffies by 10 before returning, no? :)

 The problem isn't the rounding, it's that the rounding is _wrong_: I'm
 fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current
 code rounds 1usec / (1000 usec/jiffies) to 11.  I've rewritten the
 description to make this clearer.

 Ok. Very much appreciated!

 In particular, with HZ=1000, we consistently computed that 1 usec
 was 11 jiffies; the same was true for any exact multiple of
 TICK_NSEC. This is obviously bad as a general rule, and caused
 observable user problems with setitimer() at the very least:

 setitimer(ITIMER_PROF, val, NULL);
 setitimer(ITIMER_PROF, NULL, val);

 would actually add a tick to val!

 So this looks like an issue. Since we convert and store the internal
 representation in jiffies, when we pull it back out, we get the
 rounded up value, which is larger then the timespec value originally
 submitted.  This is really the core issue, correct?

 For the particular user bug reported to me, yes, this was the core
 issue: some code that stopped and restarted an itimer found the
 interval growing by 1ms each time.  But again, it wasn't that it was
 rounded:  if we initially passed e.g. 10500 usec and got back 11000,
 that'd be annoying but workable, because if we then went through
 another cycle of enabling/disabling itimer, we'd set it to 11000 usec
 and get back 11000 again.  What we have now instead adds a full jiffie
 _every time_.

 Ah, ok. This part is key to understanding the problem. Thanks for
 clarifying this.

 This seems to be a quite old bug.. Do you think this is needed for -stable?

Seems reasonable to me.


 thanks
 -john
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Andrew Hunter
On Thu, Sep 4, 2014 at 2:36 PM, Paul Turner p...@google.com wrote:
 On Thu, Sep 4, 2014 at 2:30 PM, John Stultz john.stu...@linaro.org wrote:
 This seems to be a quite old bug.. Do you think this is needed for -stable?

 Seems reasonable to me.


I have no opinion: backport or don't at your preference.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-03 Thread John Stultz
On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter  wrote:
> timeval_to_jiffies rounding was broken.  It essentially computed
> (eliding seconds)
>
> jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)
>
> by using scaling arithmetic, which took the best approximation of
> NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
> x/(2^USEC_JIFFIE_SC), and computed:
>
> jiffies = (usec * x) >> USEC_JIFFIE_SC
>
> and it rounded this calculation up in the intermediate form (since we
> can't necessarily exactly represent TICK_NSEC in usec.) But the
> scaling arithmetic is a (very slight) *over*approximation of the true
> value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
> the error in scaling was added in, was sufficient to add one jiffie to
> the final result.

Apologies for the slow response here. I've never really touched any of
the code here, and parsing the above made me want to go do other
things. :)

Maybe with the next version of the patch, before you get into the
unwinding the math, you might practically describe what is broken,
then explain how its broken.

My quick read here is that we're converting a timespec -> jiffies, and
in doing so we round up by one jiffy.

This seems actually perfectly normal, as we usually end up rounding up
by a jiffy in many cases since we don't want to undershoot any
timeout, and we're always allowed return later then specified. That
said...

> In particular, with HZ=1000, we consistently computed that 1 usec
> was 11 jiffies; the same was true for any exact multiple of
> TICK_NSEC. This is obviously bad as a general rule, and caused
> observable user problems with setitimer() at the very least:
>
> setitimer(ITIMER_PROF, , NULL);
> setitimer(ITIMER_PROF, NULL, );
>
> would actually add a tick to val!

So this looks like an issue. Since we convert and store the internal
representation in jiffies, when we pull it back out, we get the
rounded up value, which is larger then the timespec value originally
submitted.  This is really the core issue, correct? Or does this
manifest in other ways that are problematic (the patch seems to hint
that it does)?


> We could possibly still round in the intermediate form, adding
> something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
> convert usec->nsec, round in nanoseconds, and then convert using
> time*spec*_to_jiffies.  This adds one constant multiplication, and is
> not observably slower in microbenchmarks on recent x86 hardware.
>
> Tested: the following program:
>
> int main() {
>   struct itimerval zero = {{0, 0}, {0, 0}};
>   /* Initially set to 10 ms. */
>   struct itimerval initial = zero;
>   initial.it_interval.tv_usec = 1;
>   setitimer(ITIMER_PROF, , NULL);
>   /* Save and restore several times. */
>   for (size_t i = 0; i < 10; ++i) {
> struct itimerval prev;
> setitimer(ITIMER_PROF, , );
> /* on old kernels, this goes up by TICK_USEC every iteration */
> printf("previous value: %ld %ld %ld %ld\n",
>prev.it_interval.tv_sec, prev.it_interval.tv_usec,
>prev.it_value.tv_sec, prev.it_value.tv_usec);
> setitimer(ITIMER_PROF, , NULL);
>   }
> return 0;
> }
>
> Signed-off-by: Andrew Hunter 
> Reviewed-by: Paul Turner 
> Reported-by: Aaron Jacobs 
> Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
> ---
>  include/linux/jiffies.h | 12 ---
>  kernel/time.c   | 54 
> +++--
>  2 files changed, 30 insertions(+), 36 deletions(-)
>
> diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
> index 1f44466..c367cbd 100644
> --- a/include/linux/jiffies.h
> +++ b/include/linux/jiffies.h
> @@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
>  #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
>  #endif
>  #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
> -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
>  #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << 
> SEC_JIFFIE_SC) +\
>  TICK_NSEC -1) / (u64)TICK_NSEC))
>
>  #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\
>  TICK_NSEC -1) / (u64)TICK_NSEC))
> -#define USEC_CONVERSION  \
> -((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) 
> +\
> -TICK_NSEC -1) / (u64)TICK_NSEC))
> -/*
> - * USEC_ROUND is used in the timeval to jiffie conversion.  See there
> - * for more details.  It is the scaled resolution rounding value.  Note
> - * that it is a 64-bit value.  Since, when it is applied, we are already
> - * in jiffies (albit scaled), it is nothing but the bits we will shift
> - * off.
> - */
> -#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1)
>  /*
>   * The maximum jiffie value is (MAX_INT >> 1).  Here we translate that
>   * into seconds.  The 64-bit case will overflow if we are not careful,
> diff --git a/kernel/time.c b/kernel/time.c
> index 7c7964c..3c49ab4 100644
> --- 

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-03 Thread Paul Turner
John -- Any chance to take a look at this?  (The dilation is pretty
unfortunate when profiling reprograms a timer with it.)

On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter  wrote:
> timeval_to_jiffies rounding was broken.  It essentially computed
> (eliding seconds)
>
> jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)
>
> by using scaling arithmetic, which took the best approximation of
> NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
> x/(2^USEC_JIFFIE_SC), and computed:
>
> jiffies = (usec * x) >> USEC_JIFFIE_SC
>
> and it rounded this calculation up in the intermediate form (since we
> can't necessarily exactly represent TICK_NSEC in usec.) But the
> scaling arithmetic is a (very slight) *over*approximation of the true
> value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
> the error in scaling was added in, was sufficient to add one jiffie to
> the final result.
>
> In particular, with HZ=1000, we consistently computed that 1 usec
> was 11 jiffies; the same was true for any exact multiple of
> TICK_NSEC. This is obviously bad as a general rule, and caused
> observable user problems with setitimer() at the very least:
>
> setitimer(ITIMER_PROF, , NULL);
> setitimer(ITIMER_PROF, NULL, );
>
> would actually add a tick to val!
>
> We could possibly still round in the intermediate form, adding
> something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
> convert usec->nsec, round in nanoseconds, and then convert using
> time*spec*_to_jiffies.  This adds one constant multiplication, and is
> not observably slower in microbenchmarks on recent x86 hardware.
>
> Tested: the following program:
>
> int main() {
>   struct itimerval zero = {{0, 0}, {0, 0}};
>   /* Initially set to 10 ms. */
>   struct itimerval initial = zero;
>   initial.it_interval.tv_usec = 1;
>   setitimer(ITIMER_PROF, , NULL);
>   /* Save and restore several times. */
>   for (size_t i = 0; i < 10; ++i) {
> struct itimerval prev;
> setitimer(ITIMER_PROF, , );
> /* on old kernels, this goes up by TICK_USEC every iteration */
> printf("previous value: %ld %ld %ld %ld\n",
>prev.it_interval.tv_sec, prev.it_interval.tv_usec,
>prev.it_value.tv_sec, prev.it_value.tv_usec);
> setitimer(ITIMER_PROF, , NULL);
>   }
> return 0;
> }
>
> Signed-off-by: Andrew Hunter 
> Reviewed-by: Paul Turner 
> Reported-by: Aaron Jacobs 
> Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
> ---
>  include/linux/jiffies.h | 12 ---
>  kernel/time.c   | 54 
> +++--
>  2 files changed, 30 insertions(+), 36 deletions(-)
>
> diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
> index 1f44466..c367cbd 100644
> --- a/include/linux/jiffies.h
> +++ b/include/linux/jiffies.h
> @@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
>  #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
>  #endif
>  #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
> -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
>  #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << 
> SEC_JIFFIE_SC) +\
>  TICK_NSEC -1) / (u64)TICK_NSEC))
>
>  #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\
>  TICK_NSEC -1) / (u64)TICK_NSEC))
> -#define USEC_CONVERSION  \
> -((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) 
> +\
> -TICK_NSEC -1) / (u64)TICK_NSEC))
> -/*
> - * USEC_ROUND is used in the timeval to jiffie conversion.  See there
> - * for more details.  It is the scaled resolution rounding value.  Note
> - * that it is a 64-bit value.  Since, when it is applied, we are already
> - * in jiffies (albit scaled), it is nothing but the bits we will shift
> - * off.
> - */
> -#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1)
>  /*
>   * The maximum jiffie value is (MAX_INT >> 1).  Here we translate that
>   * into seconds.  The 64-bit case will overflow if we are not careful,
> diff --git a/kernel/time.c b/kernel/time.c
> index 7c7964c..3c49ab4 100644
> --- a/kernel/time.c
> +++ b/kernel/time.c
> @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
>   * that a remainder subtract here would not do the right thing as the
>   * resolution values don't fall on second boundries.  I.e. the line:
>   * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
> + * Note that due to the small error in the multiplier here, this
> + * rounding is incorrect for sufficiently large values of tv_nsec, but
> + * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're
> + * OK.
>   *
>   * Rather, we just shift the bits off the right.
>   *
>   * The >> (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec
>   * value to a scaled second value.
>   */
> -unsigned long
> -timespec_to_jiffies(const struct timespec *value)
> +static unsigned long
> +__timespec_to_jiffies(unsigned long sec, long nsec)
> 

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-03 Thread Paul Turner
John -- Any chance to take a look at this?  (The dilation is pretty
unfortunate when profiling reprograms a timer with it.)

On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter a...@google.com wrote:
 timeval_to_jiffies rounding was broken.  It essentially computed
 (eliding seconds)

 jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

 by using scaling arithmetic, which took the best approximation of
 NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
 x/(2^USEC_JIFFIE_SC), and computed:

 jiffies = (usec * x)  USEC_JIFFIE_SC

 and it rounded this calculation up in the intermediate form (since we
 can't necessarily exactly represent TICK_NSEC in usec.) But the
 scaling arithmetic is a (very slight) *over*approximation of the true
 value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
 the error in scaling was added in, was sufficient to add one jiffie to
 the final result.

 In particular, with HZ=1000, we consistently computed that 1 usec
 was 11 jiffies; the same was true for any exact multiple of
 TICK_NSEC. This is obviously bad as a general rule, and caused
 observable user problems with setitimer() at the very least:

 setitimer(ITIMER_PROF, val, NULL);
 setitimer(ITIMER_PROF, NULL, val);

 would actually add a tick to val!

 We could possibly still round in the intermediate form, adding
 something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
 convert usec-nsec, round in nanoseconds, and then convert using
 time*spec*_to_jiffies.  This adds one constant multiplication, and is
 not observably slower in microbenchmarks on recent x86 hardware.

 Tested: the following program:

 int main() {
   struct itimerval zero = {{0, 0}, {0, 0}};
   /* Initially set to 10 ms. */
   struct itimerval initial = zero;
   initial.it_interval.tv_usec = 1;
   setitimer(ITIMER_PROF, initial, NULL);
   /* Save and restore several times. */
   for (size_t i = 0; i  10; ++i) {
 struct itimerval prev;
 setitimer(ITIMER_PROF, zero, prev);
 /* on old kernels, this goes up by TICK_USEC every iteration */
 printf(previous value: %ld %ld %ld %ld\n,
prev.it_interval.tv_sec, prev.it_interval.tv_usec,
prev.it_value.tv_sec, prev.it_value.tv_usec);
 setitimer(ITIMER_PROF, prev, NULL);
   }
 return 0;
 }

 Signed-off-by: Andrew Hunter a...@google.com
 Reviewed-by: Paul Turner p...@google.com
 Reported-by: Aaron Jacobs jaco...@google.com
 Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
 ---
  include/linux/jiffies.h | 12 ---
  kernel/time.c   | 54 
 +++--
  2 files changed, 30 insertions(+), 36 deletions(-)

 diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
 index 1f44466..c367cbd 100644
 --- a/include/linux/jiffies.h
 +++ b/include/linux/jiffies.h
 @@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
  #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
  #endif
  #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
 -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
  #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC  
 SEC_JIFFIE_SC) +\
  TICK_NSEC -1) / (u64)TICK_NSEC))

  #define NSEC_CONVERSION ((unsigned long)u64)1  NSEC_JIFFIE_SC) +\
  TICK_NSEC -1) / (u64)TICK_NSEC))
 -#define USEC_CONVERSION  \
 -((unsigned long)u64)NSEC_PER_USEC  USEC_JIFFIE_SC) 
 +\
 -TICK_NSEC -1) / (u64)TICK_NSEC))
 -/*
 - * USEC_ROUND is used in the timeval to jiffie conversion.  See there
 - * for more details.  It is the scaled resolution rounding value.  Note
 - * that it is a 64-bit value.  Since, when it is applied, we are already
 - * in jiffies (albit scaled), it is nothing but the bits we will shift
 - * off.
 - */
 -#define USEC_ROUND (u64)(((u64)1  USEC_JIFFIE_SC) - 1)
  /*
   * The maximum jiffie value is (MAX_INT  1).  Here we translate that
   * into seconds.  The 64-bit case will overflow if we are not careful,
 diff --git a/kernel/time.c b/kernel/time.c
 index 7c7964c..3c49ab4 100644
 --- a/kernel/time.c
 +++ b/kernel/time.c
 @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
   * that a remainder subtract here would not do the right thing as the
   * resolution values don't fall on second boundries.  I.e. the line:
   * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
 + * Note that due to the small error in the multiplier here, this
 + * rounding is incorrect for sufficiently large values of tv_nsec, but
 + * well formed timespecs should have tv_nsec  NSEC_PER_SEC, so we're
 + * OK.
   *
   * Rather, we just shift the bits off the right.
   *
   * The  (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec
   * value to a scaled second value.
   */
 -unsigned long
 -timespec_to_jiffies(const struct timespec *value)
 +static unsigned long
 +__timespec_to_jiffies(unsigned long sec, long nsec)
  {
 -   unsigned long sec = value-tv_sec;
 - 

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-03 Thread John Stultz
On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter a...@google.com wrote:
 timeval_to_jiffies rounding was broken.  It essentially computed
 (eliding seconds)

 jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

 by using scaling arithmetic, which took the best approximation of
 NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
 x/(2^USEC_JIFFIE_SC), and computed:

 jiffies = (usec * x)  USEC_JIFFIE_SC

 and it rounded this calculation up in the intermediate form (since we
 can't necessarily exactly represent TICK_NSEC in usec.) But the
 scaling arithmetic is a (very slight) *over*approximation of the true
 value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
 the error in scaling was added in, was sufficient to add one jiffie to
 the final result.

Apologies for the slow response here. I've never really touched any of
the code here, and parsing the above made me want to go do other
things. :)

Maybe with the next version of the patch, before you get into the
unwinding the math, you might practically describe what is broken,
then explain how its broken.

My quick read here is that we're converting a timespec - jiffies, and
in doing so we round up by one jiffy.

This seems actually perfectly normal, as we usually end up rounding up
by a jiffy in many cases since we don't want to undershoot any
timeout, and we're always allowed return later then specified. That
said...

 In particular, with HZ=1000, we consistently computed that 1 usec
 was 11 jiffies; the same was true for any exact multiple of
 TICK_NSEC. This is obviously bad as a general rule, and caused
 observable user problems with setitimer() at the very least:

 setitimer(ITIMER_PROF, val, NULL);
 setitimer(ITIMER_PROF, NULL, val);

 would actually add a tick to val!

So this looks like an issue. Since we convert and store the internal
representation in jiffies, when we pull it back out, we get the
rounded up value, which is larger then the timespec value originally
submitted.  This is really the core issue, correct? Or does this
manifest in other ways that are problematic (the patch seems to hint
that it does)?


 We could possibly still round in the intermediate form, adding
 something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
 convert usec-nsec, round in nanoseconds, and then convert using
 time*spec*_to_jiffies.  This adds one constant multiplication, and is
 not observably slower in microbenchmarks on recent x86 hardware.

 Tested: the following program:

 int main() {
   struct itimerval zero = {{0, 0}, {0, 0}};
   /* Initially set to 10 ms. */
   struct itimerval initial = zero;
   initial.it_interval.tv_usec = 1;
   setitimer(ITIMER_PROF, initial, NULL);
   /* Save and restore several times. */
   for (size_t i = 0; i  10; ++i) {
 struct itimerval prev;
 setitimer(ITIMER_PROF, zero, prev);
 /* on old kernels, this goes up by TICK_USEC every iteration */
 printf(previous value: %ld %ld %ld %ld\n,
prev.it_interval.tv_sec, prev.it_interval.tv_usec,
prev.it_value.tv_sec, prev.it_value.tv_usec);
 setitimer(ITIMER_PROF, prev, NULL);
   }
 return 0;
 }

 Signed-off-by: Andrew Hunter a...@google.com
 Reviewed-by: Paul Turner p...@google.com
 Reported-by: Aaron Jacobs jaco...@google.com
 Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
 ---
  include/linux/jiffies.h | 12 ---
  kernel/time.c   | 54 
 +++--
  2 files changed, 30 insertions(+), 36 deletions(-)

 diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
 index 1f44466..c367cbd 100644
 --- a/include/linux/jiffies.h
 +++ b/include/linux/jiffies.h
 @@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
  #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
  #endif
  #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
 -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
  #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC  
 SEC_JIFFIE_SC) +\
  TICK_NSEC -1) / (u64)TICK_NSEC))

  #define NSEC_CONVERSION ((unsigned long)u64)1  NSEC_JIFFIE_SC) +\
  TICK_NSEC -1) / (u64)TICK_NSEC))
 -#define USEC_CONVERSION  \
 -((unsigned long)u64)NSEC_PER_USEC  USEC_JIFFIE_SC) 
 +\
 -TICK_NSEC -1) / (u64)TICK_NSEC))
 -/*
 - * USEC_ROUND is used in the timeval to jiffie conversion.  See there
 - * for more details.  It is the scaled resolution rounding value.  Note
 - * that it is a 64-bit value.  Since, when it is applied, we are already
 - * in jiffies (albit scaled), it is nothing but the bits we will shift
 - * off.
 - */
 -#define USEC_ROUND (u64)(((u64)1  USEC_JIFFIE_SC) - 1)
  /*
   * The maximum jiffie value is (MAX_INT  1).  Here we translate that
   * into seconds.  The 64-bit case will overflow if we are not careful,
 diff --git a/kernel/time.c b/kernel/time.c
 index 7c7964c..3c49ab4 100644
 --- a/kernel/time.c
 +++ 

[PATCH] sched: fix timeval conversion to jiffies

2014-08-07 Thread Andrew Hunter
timeval_to_jiffies rounding was broken.  It essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x) >> USEC_JIFFIE_SC

and it rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
the error in scaling was added in, was sufficient to add one jiffie to
the final result.

In particular, with HZ=1000, we consistently computed that 1 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC. This is obviously bad as a general rule, and caused
observable user problems with setitimer() at the very least:

setitimer(ITIMER_PROF, , NULL);
setitimer(ITIMER_PROF, NULL, );

would actually add a tick to val!

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 1;
  setitimer(ITIMER_PROF, , NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i < 10; ++i) {
struct itimerval prev;
setitimer(ITIMER_PROF, , );
/* on old kernels, this goes up by TICK_USEC every iteration */
printf("previous value: %ld %ld %ld %ld\n",
   prev.it_interval.tv_sec, prev.it_interval.tv_usec,
   prev.it_value.tv_sec, prev.it_value.tv_usec);
setitimer(ITIMER_PROF, , NULL);
  }
return 0;
}

Signed-off-by: Andrew Hunter 
Reviewed-by: Paul Turner 
Reported-by: Aaron Jacobs 
Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
---
 include/linux/jiffies.h | 12 ---
 kernel/time.c   | 54 +++--
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 1f44466..c367cbd 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
 #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
 #endif
 #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
-#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
 #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << SEC_JIFFIE_SC) 
+\
 TICK_NSEC -1) / (u64)TICK_NSEC))
 
 #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\
 TICK_NSEC -1) / (u64)TICK_NSEC))
-#define USEC_CONVERSION  \
-((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) +\
-TICK_NSEC -1) / (u64)TICK_NSEC))
-/*
- * USEC_ROUND is used in the timeval to jiffie conversion.  See there
- * for more details.  It is the scaled resolution rounding value.  Note
- * that it is a 64-bit value.  Since, when it is applied, we are already
- * in jiffies (albit scaled), it is nothing but the bits we will shift
- * off.
- */
-#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1)
 /*
  * The maximum jiffie value is (MAX_INT >> 1).  Here we translate that
  * into seconds.  The 64-bit case will overflow if we are not careful,
diff --git a/kernel/time.c b/kernel/time.c
index 7c7964c..3c49ab4 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
  * that a remainder subtract here would not do the right thing as the
  * resolution values don't fall on second boundries.  I.e. the line:
  * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
+ * Note that due to the small error in the multiplier here, this
+ * rounding is incorrect for sufficiently large values of tv_nsec, but
+ * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're
+ * OK.
  *
  * Rather, we just shift the bits off the right.
  *
  * The >> (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec
  * value to a scaled second value.
  */
-unsigned long
-timespec_to_jiffies(const struct timespec *value)
+static unsigned long
+__timespec_to_jiffies(unsigned long sec, long nsec)
 {
-   unsigned long sec = value->tv_sec;
-   long nsec = value->tv_nsec + TICK_NSEC - 1;
+   nsec = nsec + TICK_NSEC - 1;
 
if (sec >= MAX_SEC_IN_JIFFIES){
sec = MAX_SEC_IN_JIFFIES;
@@ -517,6 +520,13 @@ timespec_to_jiffies(const struct timespec *value)
 (NSEC_JIFFIE_SC - SEC_JIFFIE_SC))) >> SEC_JIFFIE_SC;
 
 }
+
+unsigned long

[PATCH] sched: fix timeval conversion to jiffies

2014-08-07 Thread Andrew Hunter
timeval_to_jiffies rounding was broken.  It essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x)  USEC_JIFFIE_SC

and it rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
the error in scaling was added in, was sufficient to add one jiffie to
the final result.

In particular, with HZ=1000, we consistently computed that 1 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC. This is obviously bad as a general rule, and caused
observable user problems with setitimer() at the very least:

setitimer(ITIMER_PROF, val, NULL);
setitimer(ITIMER_PROF, NULL, val);

would actually add a tick to val!

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec-nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.

Tested: the following program:

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 1;
  setitimer(ITIMER_PROF, initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i  10; ++i) {
struct itimerval prev;
setitimer(ITIMER_PROF, zero, prev);
/* on old kernels, this goes up by TICK_USEC every iteration */
printf(previous value: %ld %ld %ld %ld\n,
   prev.it_interval.tv_sec, prev.it_interval.tv_usec,
   prev.it_value.tv_sec, prev.it_value.tv_usec);
setitimer(ITIMER_PROF, prev, NULL);
  }
return 0;
}

Signed-off-by: Andrew Hunter a...@google.com
Reviewed-by: Paul Turner p...@google.com
Reported-by: Aaron Jacobs jaco...@google.com
Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
---
 include/linux/jiffies.h | 12 ---
 kernel/time.c   | 54 +++--
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 1f44466..c367cbd 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
 #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
 #endif
 #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
-#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
 #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC  SEC_JIFFIE_SC) 
+\
 TICK_NSEC -1) / (u64)TICK_NSEC))
 
 #define NSEC_CONVERSION ((unsigned long)u64)1  NSEC_JIFFIE_SC) +\
 TICK_NSEC -1) / (u64)TICK_NSEC))
-#define USEC_CONVERSION  \
-((unsigned long)u64)NSEC_PER_USEC  USEC_JIFFIE_SC) +\
-TICK_NSEC -1) / (u64)TICK_NSEC))
-/*
- * USEC_ROUND is used in the timeval to jiffie conversion.  See there
- * for more details.  It is the scaled resolution rounding value.  Note
- * that it is a 64-bit value.  Since, when it is applied, we are already
- * in jiffies (albit scaled), it is nothing but the bits we will shift
- * off.
- */
-#define USEC_ROUND (u64)(((u64)1  USEC_JIFFIE_SC) - 1)
 /*
  * The maximum jiffie value is (MAX_INT  1).  Here we translate that
  * into seconds.  The 64-bit case will overflow if we are not careful,
diff --git a/kernel/time.c b/kernel/time.c
index 7c7964c..3c49ab4 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
  * that a remainder subtract here would not do the right thing as the
  * resolution values don't fall on second boundries.  I.e. the line:
  * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
+ * Note that due to the small error in the multiplier here, this
+ * rounding is incorrect for sufficiently large values of tv_nsec, but
+ * well formed timespecs should have tv_nsec  NSEC_PER_SEC, so we're
+ * OK.
  *
  * Rather, we just shift the bits off the right.
  *
  * The  (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec
  * value to a scaled second value.
  */
-unsigned long
-timespec_to_jiffies(const struct timespec *value)
+static unsigned long
+__timespec_to_jiffies(unsigned long sec, long nsec)
 {
-   unsigned long sec = value-tv_sec;
-   long nsec = value-tv_nsec + TICK_NSEC - 1;
+   nsec = nsec + TICK_NSEC - 1;
 
if (sec = MAX_SEC_IN_JIFFIES){
sec = MAX_SEC_IN_JIFFIES;
@@ -517,6 +520,13 @@ timespec_to_jiffies(const struct timespec *value)
 (NSEC_JIFFIE_SC - SEC_JIFFIE_SC)))