Re: [tip:sched/core] sched: Lower chances of cputime scaling overflow

Peter Zijlstra Thu, 11 Apr 2013 10:32:15 -0700

On Thu, 2013-04-11 at 16:50 +0200, Stanislaw Gruszka wrote:

> > +   /*
> > +    * Since the stime:utime ratio is already an approximation through
> > +    * the sampling, reducing its resolution isn't too big a deal.
> > +    * And since total = stime+utime; the total_fls will be the biggest
> > +    * of the two;
> > +    */
> > +   if (total_fls > 32) {
> > +           shift = total_fls - 32; /* a = 2^shift */
> > +           stime >>= shift;
> > +           total >>= shift;
> > +           stime_fls -= shift;
> > +           total_fls -= shift;
> > +   }
> > +
> > +   /*
> > +    * Since we limited stime to 32bits the multiplication reduced to 96bit.
> > +    *   stime * rtime = stime * (rl + rh * 2^32) = 
> > +    *                   stime * rl + stime * rh * 2^32
> > +    */
> > +   lo = stime * rtime_lo;
> > +   hi = stime * rtime_hi;
> > +   t = hi << 32;
> > +   lo += t;
> > +   if (lo < t) /* overflow */
> > +           hi += 0x100000000L;
> > +   hi >>= 32;
> 
> I do not understand why we shift hi value here, is that correct?


Yes.. remember that we have:

  stime * rl + stime * rh * 2^32

How we get this 96bit value but our two 64bit values overlap:

 | w3 | w2 | w1 | w0 |
 +----+----+----+----+
           |    lo   |
       |   hi   |

So what I do is I add the low word of hi to lo and shift the high word
of hi to get:

 |     hi  |    lo   |

Two non-overlapping 64bit values where the high word of hi is always 0.

> > +   /*
> > +    * Pick the 64 most significant bits for division into @lo.
> > +    * 
> > +    * NOTE: res_fls is an approximation (upper-bound) do we want to
> > +    *       properly calculate?
> > +    */
> > +   shift = 0;
> > +   res_fls = stime_fls + rtime_fls;
> > +   if (res_fls > 64) {
> > +           shift = res_fls - 64; /* b = 2^shift */
> > +           lo >>= shift;
> > +           hi <<= 64 - shift;
> > +           lo |= hi;
> >     }
> > -   return (__force cputime_t) scaled;
> > +   /*
> > +    * So here we do:
> > +    *
> > +    *    ((stime / a) * rtime / b)
> > +    *    --------------------------- / b
> > +    *           (total / a)
> > +    */
> > +   return div_u64(lo, total) >> shift;
> 
> I think it should be:
> 
>  ((stime / a) * rtime / b)
> --------------------------- * b
>         (total / a)
> 
> return div_u64(lo, total) << shift;

I think you're very right about that.. got my head twisted by staring
at this stuff too long.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [tip:sched/core] sched: Lower chances of cputime scaling overflow

Reply via email to