Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Sep 4, 2014 at 2:36 PM, Paul Turner wrote: > On Thu, Sep 4, 2014 at 2:30 PM, John Stultz wrote: >> This seems to be a quite old bug.. Do you think this is needed for -stable? > > Seems reasonable to me. > I have no opinion: backport or don't at your preference. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Sep 4, 2014 at 2:30 PM, John Stultz wrote: > On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter wrote: >> On Wed, Sep 3, 2014 at 5:06 PM, John Stultz wrote: >>> Maybe with the next version of the patch, before you get into the >>> unwinding the math, you might practically describe what is broken, >>> then explain how its broken. >>> >>> My quick read here is that we're converting a timespec -> jiffies, and >>> in doing so we round up by one jiffy. >>> >>> This seems actually perfectly normal, as we usually end up rounding up >>> by a jiffy in many cases since we don't want to undershoot any >>> timeout, and we're always allowed return later then specified. >> >> Well, yes, timeouts can be longer than specified, but what you said >> technically applies just as much to code that arbitrarily multiplies >> jiffies by 10 before returning, no? :) >> >> The problem isn't the rounding, it's that the rounding is _wrong_: I'm >> fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current >> code rounds 1usec / (1000 usec/jiffies) to 11. I've rewritten the >> description to make this clearer. > > Ok. Very much appreciated! > In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, , NULL); setitimer(ITIMER_PROF, NULL, ); would actually add a tick to val! >>> >>> So this looks like an issue. Since we convert and store the internal >>> representation in jiffies, when we pull it back out, we get the >>> rounded up value, which is larger then the timespec value originally >>> submitted. This is really the core issue, correct? >> >> For the particular user bug reported to me, yes, this was the core >> issue: some code that stopped and restarted an itimer found the >> interval growing by 1ms each time. But again, it wasn't that it was >> rounded: if we initially passed e.g. 10500 usec and got back 11000, >> that'd be annoying but workable, because if we then went through >> another cycle of enabling/disabling itimer, we'd set it to 11000 usec >> and get back 11000 again. What we have now instead adds a full jiffie >> _every time_. > > Ah, ok. This part is key to understanding the problem. Thanks for > clarifying this. > > This seems to be a quite old bug.. Do you think this is needed for -stable? Seems reasonable to me. > > thanks > -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter wrote: > On Wed, Sep 3, 2014 at 5:06 PM, John Stultz wrote: >> Maybe with the next version of the patch, before you get into the >> unwinding the math, you might practically describe what is broken, >> then explain how its broken. >> >> My quick read here is that we're converting a timespec -> jiffies, and >> in doing so we round up by one jiffy. >> >> This seems actually perfectly normal, as we usually end up rounding up >> by a jiffy in many cases since we don't want to undershoot any >> timeout, and we're always allowed return later then specified. > > Well, yes, timeouts can be longer than specified, but what you said > technically applies just as much to code that arbitrarily multiplies > jiffies by 10 before returning, no? :) > > The problem isn't the rounding, it's that the rounding is _wrong_: I'm > fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current > code rounds 1usec / (1000 usec/jiffies) to 11. I've rewritten the > description to make this clearer. Ok. Very much appreciated! >>> In particular, with HZ=1000, we consistently computed that 1 usec >>> was 11 jiffies; the same was true for any exact multiple of >>> TICK_NSEC. This is obviously bad as a general rule, and caused >>> observable user problems with setitimer() at the very least: >>> >>> setitimer(ITIMER_PROF, , NULL); >>> setitimer(ITIMER_PROF, NULL, ); >>> >>> would actually add a tick to val! >> >> So this looks like an issue. Since we convert and store the internal >> representation in jiffies, when we pull it back out, we get the >> rounded up value, which is larger then the timespec value originally >> submitted. This is really the core issue, correct? > > For the particular user bug reported to me, yes, this was the core > issue: some code that stopped and restarted an itimer found the > interval growing by 1ms each time. But again, it wasn't that it was > rounded: if we initially passed e.g. 10500 usec and got back 11000, > that'd be annoying but workable, because if we then went through > another cycle of enabling/disabling itimer, we'd set it to 11000 usec > and get back 11000 again. What we have now instead adds a full jiffie > _every time_. Ah, ok. This part is key to understanding the problem. Thanks for clarifying this. This seems to be a quite old bug.. Do you think this is needed for -stable? thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sched: fix timeval conversion to jiffies
timeval_to_jiffies tried to round a timeval up to an integral number of jiffies, but the logic for doing so was incorrect: intervals corresponding to exactly N jiffies would become N+1. This manifested itself particularly repeatedly stopping/starting an itimer: setitimer(ITIMER_PROF, , NULL); setitimer(ITIMER_PROF, NULL, ); would add a full tick to val, _even if it was exactly representable in terms of jiffies_ (say, the result of a previous rounding.) Doing this repeatedly would cause unbounded growth in val. So fix the math. Here's what was wrong with the conversion: we essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) >> USEC_JIFFIE_SC and rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value; that is, instead of dividing by (1 usec/ 1 jiffie), we effectively divided by (1 usec/1 jiffie)-epsilon (rounding down). This would normally be fine, but we want to round timeouts up, and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this would be fine if our division was exact, but dividing this by the slightly smaller factor was equivalent to adding just _over_ 1 to the final result (instead of just _under_ 1, as desired.) In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec->nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 1; setitimer(ITIMER_PROF, , NULL); /* Save and restore several times. */ for (size_t i = 0; i < 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, , ); /* on old kernels, this goes up by TICK_USEC every iteration */ printf("previous value: %ld %ld %ld %ld\n", prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, , NULL); } return 0; } Signed-off-by: Andrew Hunter Reviewed-by: Paul Turner Reported-by: Aaron Jacobs Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 --- include/linux/jiffies.h | 12 --- kernel/time.c | 54 +++-- 2 files changed, 30 insertions(+), 36 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index 1f44466..c367cbd 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; #define SEC_JIFFIE_SC (32 - SHIFT_HZ) #endif #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << SEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) -#define USEC_CONVERSION \ -((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) +\ -TICK_NSEC -1) / (u64)TICK_NSEC)) -/* - * USEC_ROUND is used in the timeval to jiffie conversion. See there - * for more details. It is the scaled resolution rounding value. Note - * that it is a 64-bit value. Since, when it is applied, we are already - * in jiffies (albit scaled), it is nothing but the bits we will shift - * off. - */ -#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1) /* * The maximum jiffie value is (MAX_INT >> 1). Here we translate that * into seconds. The 64-bit case will overflow if we are not careful, diff --git a/kernel/time.c b/kernel/time.c index 7c7964c..3c49ab4 100644 --- a/kernel/time.c +++ b/kernel/time.c @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies); * that a remainder subtract here would not do the right thing as the * resolution values don't fall on second boundries. I.e. the line: * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding. + * Note that due to the small error in the multiplier here, this + * rounding is incorrect for sufficiently large values of tv_nsec, but + * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're + * OK. * * Rather, we just shift the bits off the right. * * The >> (NSEC_JIFFIE_SC -
Re: [PATCH] sched: fix timeval conversion to jiffies
On Wed, Sep 3, 2014 at 5:06 PM, John Stultz wrote: > Maybe with the next version of the patch, before you get into the > unwinding the math, you might practically describe what is broken, > then explain how its broken. > Done. > My quick read here is that we're converting a timespec -> jiffies, and > in doing so we round up by one jiffy. > > This seems actually perfectly normal, as we usually end up rounding up > by a jiffy in many cases since we don't want to undershoot any > timeout, and we're always allowed return later then specified. Well, yes, timeouts can be longer than specified, but what you said technically applies just as much to code that arbitrarily multiplies jiffies by 10 before returning, no? :) The problem isn't the rounding, it's that the rounding is _wrong_: I'm fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current code rounds 1usec / (1000 usec/jiffies) to 11. I've rewritten the description to make this clearer. > >> In particular, with HZ=1000, we consistently computed that 1 usec >> was 11 jiffies; the same was true for any exact multiple of >> TICK_NSEC. This is obviously bad as a general rule, and caused >> observable user problems with setitimer() at the very least: >> >> setitimer(ITIMER_PROF, , NULL); >> setitimer(ITIMER_PROF, NULL, ); >> >> would actually add a tick to val! > > So this looks like an issue. Since we convert and store the internal > representation in jiffies, when we pull it back out, we get the > rounded up value, which is larger then the timespec value originally > submitted. This is really the core issue, correct? For the particular user bug reported to me, yes, this was the core issue: some code that stopped and restarted an itimer found the interval growing by 1ms each time. But again, it wasn't that it was rounded: if we initially passed e.g. 10500 usec and got back 11000, that'd be annoying but workable, because if we then went through another cycle of enabling/disabling itimer, we'd set it to 11000 usec and get back 11000 again. What we have now instead adds a full jiffie _every time_. > Or does this > manifest in other ways that are problematic (the patch seems to hint > that it does)? > Um: hard to be sure. I can tell you that for PROF itimers we track the difference between the requested timeval in ns and the given number of jiffies (see the logic with it->error in check_cpu_itimer() in kernel/posix-cpu-timers.c), so over long periods we'll get the right sample rate (the only problem is the timeval reported back to userspace, and the samples might not quite be uniform.) It appears there are no other uses of timeval_to_jiffies, so we're probably safe. But in any case it's wrong as currently coded so we should fix it. (Actually, one could use the same code tracking it->error to always return the exact--unrounded--interval to userspace, but that's minor compared to a constantly growing interval, and I'd rather not complicate this patch any more.) >> + nsec = nsec + TICK_NSEC - 1; > > > If we're still rounding up one tick here, does the same issue not persist? > Nope (at least not in tested configurations.) Since jiffie is effectively defined as some integral number of nsec, this rounding DTRT to round up to an integral number of jiffies. (The scaled arithmetic might, for sufficiently large intervals/right values of timeval produce some error, but I haven't been able to make it do so.) > Overall the patch doesn't look too bad and does reduce the amount of > ugly math with simpler code reuse. But I'd like the patch description > to be more clear about why this is needed and the impact. > Rewritten; sending v2. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: fix timeval conversion to jiffies
On Wed, Sep 3, 2014 at 5:06 PM, John Stultz john.stu...@linaro.org wrote: Maybe with the next version of the patch, before you get into the unwinding the math, you might practically describe what is broken, then explain how its broken. Done. My quick read here is that we're converting a timespec - jiffies, and in doing so we round up by one jiffy. This seems actually perfectly normal, as we usually end up rounding up by a jiffy in many cases since we don't want to undershoot any timeout, and we're always allowed return later then specified. Well, yes, timeouts can be longer than specified, but what you said technically applies just as much to code that arbitrarily multiplies jiffies by 10 before returning, no? :) The problem isn't the rounding, it's that the rounding is _wrong_: I'm fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current code rounds 1usec / (1000 usec/jiffies) to 11. I've rewritten the description to make this clearer. In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would actually add a tick to val! So this looks like an issue. Since we convert and store the internal representation in jiffies, when we pull it back out, we get the rounded up value, which is larger then the timespec value originally submitted. This is really the core issue, correct? For the particular user bug reported to me, yes, this was the core issue: some code that stopped and restarted an itimer found the interval growing by 1ms each time. But again, it wasn't that it was rounded: if we initially passed e.g. 10500 usec and got back 11000, that'd be annoying but workable, because if we then went through another cycle of enabling/disabling itimer, we'd set it to 11000 usec and get back 11000 again. What we have now instead adds a full jiffie _every time_. Or does this manifest in other ways that are problematic (the patch seems to hint that it does)? Um: hard to be sure. I can tell you that for PROF itimers we track the difference between the requested timeval in ns and the given number of jiffies (see the logic with it-error in check_cpu_itimer() in kernel/posix-cpu-timers.c), so over long periods we'll get the right sample rate (the only problem is the timeval reported back to userspace, and the samples might not quite be uniform.) It appears there are no other uses of timeval_to_jiffies, so we're probably safe. But in any case it's wrong as currently coded so we should fix it. (Actually, one could use the same code tracking it-error to always return the exact--unrounded--interval to userspace, but that's minor compared to a constantly growing interval, and I'd rather not complicate this patch any more.) + nsec = nsec + TICK_NSEC - 1; If we're still rounding up one tick here, does the same issue not persist? Nope (at least not in tested configurations.) Since jiffie is effectively defined as some integral number of nsec, this rounding DTRT to round up to an integral number of jiffies. (The scaled arithmetic might, for sufficiently large intervals/right values of timeval produce some error, but I haven't been able to make it do so.) Overall the patch doesn't look too bad and does reduce the amount of ugly math with simpler code reuse. But I'd like the patch description to be more clear about why this is needed and the impact. Rewritten; sending v2. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sched: fix timeval conversion to jiffies
timeval_to_jiffies tried to round a timeval up to an integral number of jiffies, but the logic for doing so was incorrect: intervals corresponding to exactly N jiffies would become N+1. This manifested itself particularly repeatedly stopping/starting an itimer: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would add a full tick to val, _even if it was exactly representable in terms of jiffies_ (say, the result of a previous rounding.) Doing this repeatedly would cause unbounded growth in val. So fix the math. Here's what was wrong with the conversion: we essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) USEC_JIFFIE_SC and rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value; that is, instead of dividing by (1 usec/ 1 jiffie), we effectively divided by (1 usec/1 jiffie)-epsilon (rounding down). This would normally be fine, but we want to round timeouts up, and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this would be fine if our division was exact, but dividing this by the slightly smaller factor was equivalent to adding just _over_ 1 to the final result (instead of just _under_ 1, as desired.) In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec-nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 1; setitimer(ITIMER_PROF, initial, NULL); /* Save and restore several times. */ for (size_t i = 0; i 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, zero, prev); /* on old kernels, this goes up by TICK_USEC every iteration */ printf(previous value: %ld %ld %ld %ld\n, prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, prev, NULL); } return 0; } Signed-off-by: Andrew Hunter a...@google.com Reviewed-by: Paul Turner p...@google.com Reported-by: Aaron Jacobs jaco...@google.com Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 --- include/linux/jiffies.h | 12 --- kernel/time.c | 54 +++-- 2 files changed, 30 insertions(+), 36 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index 1f44466..c367cbd 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; #define SEC_JIFFIE_SC (32 - SHIFT_HZ) #endif #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC SEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) #define NSEC_CONVERSION ((unsigned long)u64)1 NSEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) -#define USEC_CONVERSION \ -((unsigned long)u64)NSEC_PER_USEC USEC_JIFFIE_SC) +\ -TICK_NSEC -1) / (u64)TICK_NSEC)) -/* - * USEC_ROUND is used in the timeval to jiffie conversion. See there - * for more details. It is the scaled resolution rounding value. Note - * that it is a 64-bit value. Since, when it is applied, we are already - * in jiffies (albit scaled), it is nothing but the bits we will shift - * off. - */ -#define USEC_ROUND (u64)(((u64)1 USEC_JIFFIE_SC) - 1) /* * The maximum jiffie value is (MAX_INT 1). Here we translate that * into seconds. The 64-bit case will overflow if we are not careful, diff --git a/kernel/time.c b/kernel/time.c index 7c7964c..3c49ab4 100644 --- a/kernel/time.c +++ b/kernel/time.c @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies); * that a remainder subtract here would not do the right thing as the * resolution values don't fall on second boundries. I.e. the line: * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding. + * Note that due to the small error in the multiplier here, this + * rounding is incorrect for sufficiently large values of tv_nsec, but + * well formed timespecs should have tv_nsec NSEC_PER_SEC, so we're + * OK. * * Rather, we just shift the bits
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter a...@google.com wrote: On Wed, Sep 3, 2014 at 5:06 PM, John Stultz john.stu...@linaro.org wrote: Maybe with the next version of the patch, before you get into the unwinding the math, you might practically describe what is broken, then explain how its broken. My quick read here is that we're converting a timespec - jiffies, and in doing so we round up by one jiffy. This seems actually perfectly normal, as we usually end up rounding up by a jiffy in many cases since we don't want to undershoot any timeout, and we're always allowed return later then specified. Well, yes, timeouts can be longer than specified, but what you said technically applies just as much to code that arbitrarily multiplies jiffies by 10 before returning, no? :) The problem isn't the rounding, it's that the rounding is _wrong_: I'm fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current code rounds 1usec / (1000 usec/jiffies) to 11. I've rewritten the description to make this clearer. Ok. Very much appreciated! In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would actually add a tick to val! So this looks like an issue. Since we convert and store the internal representation in jiffies, when we pull it back out, we get the rounded up value, which is larger then the timespec value originally submitted. This is really the core issue, correct? For the particular user bug reported to me, yes, this was the core issue: some code that stopped and restarted an itimer found the interval growing by 1ms each time. But again, it wasn't that it was rounded: if we initially passed e.g. 10500 usec and got back 11000, that'd be annoying but workable, because if we then went through another cycle of enabling/disabling itimer, we'd set it to 11000 usec and get back 11000 again. What we have now instead adds a full jiffie _every time_. Ah, ok. This part is key to understanding the problem. Thanks for clarifying this. This seems to be a quite old bug.. Do you think this is needed for -stable? thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Sep 4, 2014 at 2:30 PM, John Stultz john.stu...@linaro.org wrote: On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter a...@google.com wrote: On Wed, Sep 3, 2014 at 5:06 PM, John Stultz john.stu...@linaro.org wrote: Maybe with the next version of the patch, before you get into the unwinding the math, you might practically describe what is broken, then explain how its broken. My quick read here is that we're converting a timespec - jiffies, and in doing so we round up by one jiffy. This seems actually perfectly normal, as we usually end up rounding up by a jiffy in many cases since we don't want to undershoot any timeout, and we're always allowed return later then specified. Well, yes, timeouts can be longer than specified, but what you said technically applies just as much to code that arbitrarily multiplies jiffies by 10 before returning, no? :) The problem isn't the rounding, it's that the rounding is _wrong_: I'm fine rounding 10100usec / (1000 usec/jiffie) = 11 jiffies. The current code rounds 1usec / (1000 usec/jiffies) to 11. I've rewritten the description to make this clearer. Ok. Very much appreciated! In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would actually add a tick to val! So this looks like an issue. Since we convert and store the internal representation in jiffies, when we pull it back out, we get the rounded up value, which is larger then the timespec value originally submitted. This is really the core issue, correct? For the particular user bug reported to me, yes, this was the core issue: some code that stopped and restarted an itimer found the interval growing by 1ms each time. But again, it wasn't that it was rounded: if we initially passed e.g. 10500 usec and got back 11000, that'd be annoying but workable, because if we then went through another cycle of enabling/disabling itimer, we'd set it to 11000 usec and get back 11000 again. What we have now instead adds a full jiffie _every time_. Ah, ok. This part is key to understanding the problem. Thanks for clarifying this. This seems to be a quite old bug.. Do you think this is needed for -stable? Seems reasonable to me. thanks -john -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Sep 4, 2014 at 2:36 PM, Paul Turner p...@google.com wrote: On Thu, Sep 4, 2014 at 2:30 PM, John Stultz john.stu...@linaro.org wrote: This seems to be a quite old bug.. Do you think this is needed for -stable? Seems reasonable to me. I have no opinion: backport or don't at your preference. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter wrote: > timeval_to_jiffies rounding was broken. It essentially computed > (eliding seconds) > > jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) > > by using scaling arithmetic, which took the best approximation of > NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = > x/(2^USEC_JIFFIE_SC), and computed: > > jiffies = (usec * x) >> USEC_JIFFIE_SC > > and it rounded this calculation up in the intermediate form (since we > can't necessarily exactly represent TICK_NSEC in usec.) But the > scaling arithmetic is a (very slight) *over*approximation of the true > value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when > the error in scaling was added in, was sufficient to add one jiffie to > the final result. Apologies for the slow response here. I've never really touched any of the code here, and parsing the above made me want to go do other things. :) Maybe with the next version of the patch, before you get into the unwinding the math, you might practically describe what is broken, then explain how its broken. My quick read here is that we're converting a timespec -> jiffies, and in doing so we round up by one jiffy. This seems actually perfectly normal, as we usually end up rounding up by a jiffy in many cases since we don't want to undershoot any timeout, and we're always allowed return later then specified. That said... > In particular, with HZ=1000, we consistently computed that 1 usec > was 11 jiffies; the same was true for any exact multiple of > TICK_NSEC. This is obviously bad as a general rule, and caused > observable user problems with setitimer() at the very least: > > setitimer(ITIMER_PROF, , NULL); > setitimer(ITIMER_PROF, NULL, ); > > would actually add a tick to val! So this looks like an issue. Since we convert and store the internal representation in jiffies, when we pull it back out, we get the rounded up value, which is larger then the timespec value originally submitted. This is really the core issue, correct? Or does this manifest in other ways that are problematic (the patch seems to hint that it does)? > We could possibly still round in the intermediate form, adding > something less than 2^USEC_JIFFIE_SC - 1, but easier still is to > convert usec->nsec, round in nanoseconds, and then convert using > time*spec*_to_jiffies. This adds one constant multiplication, and is > not observably slower in microbenchmarks on recent x86 hardware. > > Tested: the following program: > > int main() { > struct itimerval zero = {{0, 0}, {0, 0}}; > /* Initially set to 10 ms. */ > struct itimerval initial = zero; > initial.it_interval.tv_usec = 1; > setitimer(ITIMER_PROF, , NULL); > /* Save and restore several times. */ > for (size_t i = 0; i < 10; ++i) { > struct itimerval prev; > setitimer(ITIMER_PROF, , ); > /* on old kernels, this goes up by TICK_USEC every iteration */ > printf("previous value: %ld %ld %ld %ld\n", >prev.it_interval.tv_sec, prev.it_interval.tv_usec, >prev.it_value.tv_sec, prev.it_value.tv_usec); > setitimer(ITIMER_PROF, , NULL); > } > return 0; > } > > Signed-off-by: Andrew Hunter > Reviewed-by: Paul Turner > Reported-by: Aaron Jacobs > Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 > --- > include/linux/jiffies.h | 12 --- > kernel/time.c | 54 > +++-- > 2 files changed, 30 insertions(+), 36 deletions(-) > > diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h > index 1f44466..c367cbd 100644 > --- a/include/linux/jiffies.h > +++ b/include/linux/jiffies.h > @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; > #define SEC_JIFFIE_SC (32 - SHIFT_HZ) > #endif > #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) > -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) > #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << > SEC_JIFFIE_SC) +\ > TICK_NSEC -1) / (u64)TICK_NSEC)) > > #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\ > TICK_NSEC -1) / (u64)TICK_NSEC)) > -#define USEC_CONVERSION \ > -((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) > +\ > -TICK_NSEC -1) / (u64)TICK_NSEC)) > -/* > - * USEC_ROUND is used in the timeval to jiffie conversion. See there > - * for more details. It is the scaled resolution rounding value. Note > - * that it is a 64-bit value. Since, when it is applied, we are already > - * in jiffies (albit scaled), it is nothing but the bits we will shift > - * off. > - */ > -#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1) > /* > * The maximum jiffie value is (MAX_INT >> 1). Here we translate that > * into seconds. The 64-bit case will overflow if we are not careful, > diff --git a/kernel/time.c b/kernel/time.c > index 7c7964c..3c49ab4 100644 > ---
Re: [PATCH] sched: fix timeval conversion to jiffies
John -- Any chance to take a look at this? (The dilation is pretty unfortunate when profiling reprograms a timer with it.) On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter wrote: > timeval_to_jiffies rounding was broken. It essentially computed > (eliding seconds) > > jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) > > by using scaling arithmetic, which took the best approximation of > NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = > x/(2^USEC_JIFFIE_SC), and computed: > > jiffies = (usec * x) >> USEC_JIFFIE_SC > > and it rounded this calculation up in the intermediate form (since we > can't necessarily exactly represent TICK_NSEC in usec.) But the > scaling arithmetic is a (very slight) *over*approximation of the true > value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when > the error in scaling was added in, was sufficient to add one jiffie to > the final result. > > In particular, with HZ=1000, we consistently computed that 1 usec > was 11 jiffies; the same was true for any exact multiple of > TICK_NSEC. This is obviously bad as a general rule, and caused > observable user problems with setitimer() at the very least: > > setitimer(ITIMER_PROF, , NULL); > setitimer(ITIMER_PROF, NULL, ); > > would actually add a tick to val! > > We could possibly still round in the intermediate form, adding > something less than 2^USEC_JIFFIE_SC - 1, but easier still is to > convert usec->nsec, round in nanoseconds, and then convert using > time*spec*_to_jiffies. This adds one constant multiplication, and is > not observably slower in microbenchmarks on recent x86 hardware. > > Tested: the following program: > > int main() { > struct itimerval zero = {{0, 0}, {0, 0}}; > /* Initially set to 10 ms. */ > struct itimerval initial = zero; > initial.it_interval.tv_usec = 1; > setitimer(ITIMER_PROF, , NULL); > /* Save and restore several times. */ > for (size_t i = 0; i < 10; ++i) { > struct itimerval prev; > setitimer(ITIMER_PROF, , ); > /* on old kernels, this goes up by TICK_USEC every iteration */ > printf("previous value: %ld %ld %ld %ld\n", >prev.it_interval.tv_sec, prev.it_interval.tv_usec, >prev.it_value.tv_sec, prev.it_value.tv_usec); > setitimer(ITIMER_PROF, , NULL); > } > return 0; > } > > Signed-off-by: Andrew Hunter > Reviewed-by: Paul Turner > Reported-by: Aaron Jacobs > Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 > --- > include/linux/jiffies.h | 12 --- > kernel/time.c | 54 > +++-- > 2 files changed, 30 insertions(+), 36 deletions(-) > > diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h > index 1f44466..c367cbd 100644 > --- a/include/linux/jiffies.h > +++ b/include/linux/jiffies.h > @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; > #define SEC_JIFFIE_SC (32 - SHIFT_HZ) > #endif > #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) > -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) > #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << > SEC_JIFFIE_SC) +\ > TICK_NSEC -1) / (u64)TICK_NSEC)) > > #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\ > TICK_NSEC -1) / (u64)TICK_NSEC)) > -#define USEC_CONVERSION \ > -((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) > +\ > -TICK_NSEC -1) / (u64)TICK_NSEC)) > -/* > - * USEC_ROUND is used in the timeval to jiffie conversion. See there > - * for more details. It is the scaled resolution rounding value. Note > - * that it is a 64-bit value. Since, when it is applied, we are already > - * in jiffies (albit scaled), it is nothing but the bits we will shift > - * off. > - */ > -#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1) > /* > * The maximum jiffie value is (MAX_INT >> 1). Here we translate that > * into seconds. The 64-bit case will overflow if we are not careful, > diff --git a/kernel/time.c b/kernel/time.c > index 7c7964c..3c49ab4 100644 > --- a/kernel/time.c > +++ b/kernel/time.c > @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies); > * that a remainder subtract here would not do the right thing as the > * resolution values don't fall on second boundries. I.e. the line: > * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding. > + * Note that due to the small error in the multiplier here, this > + * rounding is incorrect for sufficiently large values of tv_nsec, but > + * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're > + * OK. > * > * Rather, we just shift the bits off the right. > * > * The >> (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec > * value to a scaled second value. > */ > -unsigned long > -timespec_to_jiffies(const struct timespec *value) > +static unsigned long > +__timespec_to_jiffies(unsigned long sec, long nsec) >
Re: [PATCH] sched: fix timeval conversion to jiffies
John -- Any chance to take a look at this? (The dilation is pretty unfortunate when profiling reprograms a timer with it.) On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter a...@google.com wrote: timeval_to_jiffies rounding was broken. It essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) USEC_JIFFIE_SC and it rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when the error in scaling was added in, was sufficient to add one jiffie to the final result. In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would actually add a tick to val! We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec-nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 1; setitimer(ITIMER_PROF, initial, NULL); /* Save and restore several times. */ for (size_t i = 0; i 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, zero, prev); /* on old kernels, this goes up by TICK_USEC every iteration */ printf(previous value: %ld %ld %ld %ld\n, prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, prev, NULL); } return 0; } Signed-off-by: Andrew Hunter a...@google.com Reviewed-by: Paul Turner p...@google.com Reported-by: Aaron Jacobs jaco...@google.com Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 --- include/linux/jiffies.h | 12 --- kernel/time.c | 54 +++-- 2 files changed, 30 insertions(+), 36 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index 1f44466..c367cbd 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; #define SEC_JIFFIE_SC (32 - SHIFT_HZ) #endif #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC SEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) #define NSEC_CONVERSION ((unsigned long)u64)1 NSEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) -#define USEC_CONVERSION \ -((unsigned long)u64)NSEC_PER_USEC USEC_JIFFIE_SC) +\ -TICK_NSEC -1) / (u64)TICK_NSEC)) -/* - * USEC_ROUND is used in the timeval to jiffie conversion. See there - * for more details. It is the scaled resolution rounding value. Note - * that it is a 64-bit value. Since, when it is applied, we are already - * in jiffies (albit scaled), it is nothing but the bits we will shift - * off. - */ -#define USEC_ROUND (u64)(((u64)1 USEC_JIFFIE_SC) - 1) /* * The maximum jiffie value is (MAX_INT 1). Here we translate that * into seconds. The 64-bit case will overflow if we are not careful, diff --git a/kernel/time.c b/kernel/time.c index 7c7964c..3c49ab4 100644 --- a/kernel/time.c +++ b/kernel/time.c @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies); * that a remainder subtract here would not do the right thing as the * resolution values don't fall on second boundries. I.e. the line: * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding. + * Note that due to the small error in the multiplier here, this + * rounding is incorrect for sufficiently large values of tv_nsec, but + * well formed timespecs should have tv_nsec NSEC_PER_SEC, so we're + * OK. * * Rather, we just shift the bits off the right. * * The (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec * value to a scaled second value. */ -unsigned long -timespec_to_jiffies(const struct timespec *value) +static unsigned long +__timespec_to_jiffies(unsigned long sec, long nsec) { - unsigned long sec = value-tv_sec; -
Re: [PATCH] sched: fix timeval conversion to jiffies
On Thu, Aug 7, 2014 at 5:09 PM, Andrew Hunter a...@google.com wrote: timeval_to_jiffies rounding was broken. It essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) USEC_JIFFIE_SC and it rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when the error in scaling was added in, was sufficient to add one jiffie to the final result. Apologies for the slow response here. I've never really touched any of the code here, and parsing the above made me want to go do other things. :) Maybe with the next version of the patch, before you get into the unwinding the math, you might practically describe what is broken, then explain how its broken. My quick read here is that we're converting a timespec - jiffies, and in doing so we round up by one jiffy. This seems actually perfectly normal, as we usually end up rounding up by a jiffy in many cases since we don't want to undershoot any timeout, and we're always allowed return later then specified. That said... In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would actually add a tick to val! So this looks like an issue. Since we convert and store the internal representation in jiffies, when we pull it back out, we get the rounded up value, which is larger then the timespec value originally submitted. This is really the core issue, correct? Or does this manifest in other ways that are problematic (the patch seems to hint that it does)? We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec-nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 1; setitimer(ITIMER_PROF, initial, NULL); /* Save and restore several times. */ for (size_t i = 0; i 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, zero, prev); /* on old kernels, this goes up by TICK_USEC every iteration */ printf(previous value: %ld %ld %ld %ld\n, prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, prev, NULL); } return 0; } Signed-off-by: Andrew Hunter a...@google.com Reviewed-by: Paul Turner p...@google.com Reported-by: Aaron Jacobs jaco...@google.com Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 --- include/linux/jiffies.h | 12 --- kernel/time.c | 54 +++-- 2 files changed, 30 insertions(+), 36 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index 1f44466..c367cbd 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; #define SEC_JIFFIE_SC (32 - SHIFT_HZ) #endif #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC SEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) #define NSEC_CONVERSION ((unsigned long)u64)1 NSEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) -#define USEC_CONVERSION \ -((unsigned long)u64)NSEC_PER_USEC USEC_JIFFIE_SC) +\ -TICK_NSEC -1) / (u64)TICK_NSEC)) -/* - * USEC_ROUND is used in the timeval to jiffie conversion. See there - * for more details. It is the scaled resolution rounding value. Note - * that it is a 64-bit value. Since, when it is applied, we are already - * in jiffies (albit scaled), it is nothing but the bits we will shift - * off. - */ -#define USEC_ROUND (u64)(((u64)1 USEC_JIFFIE_SC) - 1) /* * The maximum jiffie value is (MAX_INT 1). Here we translate that * into seconds. The 64-bit case will overflow if we are not careful, diff --git a/kernel/time.c b/kernel/time.c index 7c7964c..3c49ab4 100644 --- a/kernel/time.c +++
[PATCH] sched: fix timeval conversion to jiffies
timeval_to_jiffies rounding was broken. It essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) >> USEC_JIFFIE_SC and it rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when the error in scaling was added in, was sufficient to add one jiffie to the final result. In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, , NULL); setitimer(ITIMER_PROF, NULL, ); would actually add a tick to val! We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec->nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 1; setitimer(ITIMER_PROF, , NULL); /* Save and restore several times. */ for (size_t i = 0; i < 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, , ); /* on old kernels, this goes up by TICK_USEC every iteration */ printf("previous value: %ld %ld %ld %ld\n", prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, , NULL); } return 0; } Signed-off-by: Andrew Hunter Reviewed-by: Paul Turner Reported-by: Aaron Jacobs Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 --- include/linux/jiffies.h | 12 --- kernel/time.c | 54 +++-- 2 files changed, 30 insertions(+), 36 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index 1f44466..c367cbd 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; #define SEC_JIFFIE_SC (32 - SHIFT_HZ) #endif #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC << SEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) #define NSEC_CONVERSION ((unsigned long)u64)1 << NSEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) -#define USEC_CONVERSION \ -((unsigned long)u64)NSEC_PER_USEC << USEC_JIFFIE_SC) +\ -TICK_NSEC -1) / (u64)TICK_NSEC)) -/* - * USEC_ROUND is used in the timeval to jiffie conversion. See there - * for more details. It is the scaled resolution rounding value. Note - * that it is a 64-bit value. Since, when it is applied, we are already - * in jiffies (albit scaled), it is nothing but the bits we will shift - * off. - */ -#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1) /* * The maximum jiffie value is (MAX_INT >> 1). Here we translate that * into seconds. The 64-bit case will overflow if we are not careful, diff --git a/kernel/time.c b/kernel/time.c index 7c7964c..3c49ab4 100644 --- a/kernel/time.c +++ b/kernel/time.c @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies); * that a remainder subtract here would not do the right thing as the * resolution values don't fall on second boundries. I.e. the line: * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding. + * Note that due to the small error in the multiplier here, this + * rounding is incorrect for sufficiently large values of tv_nsec, but + * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're + * OK. * * Rather, we just shift the bits off the right. * * The >> (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec * value to a scaled second value. */ -unsigned long -timespec_to_jiffies(const struct timespec *value) +static unsigned long +__timespec_to_jiffies(unsigned long sec, long nsec) { - unsigned long sec = value->tv_sec; - long nsec = value->tv_nsec + TICK_NSEC - 1; + nsec = nsec + TICK_NSEC - 1; if (sec >= MAX_SEC_IN_JIFFIES){ sec = MAX_SEC_IN_JIFFIES; @@ -517,6 +520,13 @@ timespec_to_jiffies(const struct timespec *value) (NSEC_JIFFIE_SC - SEC_JIFFIE_SC))) >> SEC_JIFFIE_SC; } + +unsigned long
[PATCH] sched: fix timeval conversion to jiffies
timeval_to_jiffies rounding was broken. It essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) USEC_JIFFIE_SC and it rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when the error in scaling was added in, was sufficient to add one jiffie to the final result. In particular, with HZ=1000, we consistently computed that 1 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. This is obviously bad as a general rule, and caused observable user problems with setitimer() at the very least: setitimer(ITIMER_PROF, val, NULL); setitimer(ITIMER_PROF, NULL, val); would actually add a tick to val! We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec-nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 1; setitimer(ITIMER_PROF, initial, NULL); /* Save and restore several times. */ for (size_t i = 0; i 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, zero, prev); /* on old kernels, this goes up by TICK_USEC every iteration */ printf(previous value: %ld %ld %ld %ld\n, prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, prev, NULL); } return 0; } Signed-off-by: Andrew Hunter a...@google.com Reviewed-by: Paul Turner p...@google.com Reported-by: Aaron Jacobs jaco...@google.com Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453 --- include/linux/jiffies.h | 12 --- kernel/time.c | 54 +++-- 2 files changed, 30 insertions(+), 36 deletions(-) diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h index 1f44466..c367cbd 100644 --- a/include/linux/jiffies.h +++ b/include/linux/jiffies.h @@ -258,23 +258,11 @@ extern unsigned long preset_lpj; #define SEC_JIFFIE_SC (32 - SHIFT_HZ) #endif #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29) -#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19) #define SEC_CONVERSION ((unsigned long)u64)NSEC_PER_SEC SEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) #define NSEC_CONVERSION ((unsigned long)u64)1 NSEC_JIFFIE_SC) +\ TICK_NSEC -1) / (u64)TICK_NSEC)) -#define USEC_CONVERSION \ -((unsigned long)u64)NSEC_PER_USEC USEC_JIFFIE_SC) +\ -TICK_NSEC -1) / (u64)TICK_NSEC)) -/* - * USEC_ROUND is used in the timeval to jiffie conversion. See there - * for more details. It is the scaled resolution rounding value. Note - * that it is a 64-bit value. Since, when it is applied, we are already - * in jiffies (albit scaled), it is nothing but the bits we will shift - * off. - */ -#define USEC_ROUND (u64)(((u64)1 USEC_JIFFIE_SC) - 1) /* * The maximum jiffie value is (MAX_INT 1). Here we translate that * into seconds. The 64-bit case will overflow if we are not careful, diff --git a/kernel/time.c b/kernel/time.c index 7c7964c..3c49ab4 100644 --- a/kernel/time.c +++ b/kernel/time.c @@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies); * that a remainder subtract here would not do the right thing as the * resolution values don't fall on second boundries. I.e. the line: * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding. + * Note that due to the small error in the multiplier here, this + * rounding is incorrect for sufficiently large values of tv_nsec, but + * well formed timespecs should have tv_nsec NSEC_PER_SEC, so we're + * OK. * * Rather, we just shift the bits off the right. * * The (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec * value to a scaled second value. */ -unsigned long -timespec_to_jiffies(const struct timespec *value) +static unsigned long +__timespec_to_jiffies(unsigned long sec, long nsec) { - unsigned long sec = value-tv_sec; - long nsec = value-tv_nsec + TICK_NSEC - 1; + nsec = nsec + TICK_NSEC - 1; if (sec = MAX_SEC_IN_JIFFIES){ sec = MAX_SEC_IN_JIFFIES; @@ -517,6 +520,13 @@ timespec_to_jiffies(const struct timespec *value) (NSEC_JIFFIE_SC - SEC_JIFFIE_SC)))