Hi Thomas,

Thanks for the reviews.

On Wed, Mar 06, 2013 at 03:09:26PM +0100, Thomas Gleixner wrote:
> On Wed, 6 Mar 2013, Feng Tang wrote:
> 
> > Current clocksource_cyc2ns() has a implicit limit that the (cycles * mult)
> > can not exceed 64 bits limit. Jason Gunthorpe proposed a way to
> > handle this big cycles case, and this patch put the handling into
> > clocksource_cyc2ns() so that it could be used unconditionally.
> 
> Could be used if it wouldn't break the world and some more.

Exactly.

One excuse I can think of is that clocksource_cyc2ns() is usually called
for cycle deltas covering less than 600 seconds, which is the range the
"mult" and "shift" of a clocksource are calculated for.
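
To make the implicit limit concrete, here is a rough userspace sketch of
the bound. The helper names max_safe_cycles() and cyc2ns_would_overflow()
are made up for illustration and are not existing kernel functions:

```c
#include <stdint.h>

/*
 * Hypothetical helper: the largest cycle count whose product with
 * "mult" still fits in 64 bits. Assumes mult != 0.
 */
static uint64_t max_safe_cycles(uint32_t mult)
{
	return UINT64_MAX / mult;
}

/* Returns 1 if (cycles * mult) would overflow 64 bits, 0 otherwise. */
static int cyc2ns_would_overflow(uint64_t cycles, uint32_t mult)
{
	return cycles > max_safe_cycles(mult);
}
```

Anything below that bound is safe with the plain multiply-and-shift; only
deltas above it (for example a long suspend) need special handling.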

> 
> > Suggested-by: Jason Gunthorpe <[email protected]>
> > Signed-off-by: Feng Tang <[email protected]>
> > ---
> >  include/linux/clocksource.h |   11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> > index aa7032c..1ecc872 100644
> > --- a/include/linux/clocksource.h
> > +++ b/include/linux/clocksource.h
> > @@ -274,7 +274,16 @@ static inline u32 clocksource_hz2mult(u32 hz, u32 shift_constant)
> >   */
> >  static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
> >  {
> > -   return ((u64) cycles * mult) >> shift;
> > +   u64 max = ULLONG_MAX / mult;
> 
> This breaks everything which does not have a 64/32bit divide
> instruction. And you can't replace it with do_div() as that would
> impose massive overhead on those architectures in the fast path.

I thought about this before, and in my v2 patch I used code like this:

+               /*
+                * The system suspended time and the delta cycles may be very
+                * long, so we can't call clocksource_cyc2ns() directly with
+                * clocksource's default mult and shift to avoid overflow.
+                */
+               max_cycles = 1ULL << (63 - (ilog2(mult) + 1));
+               while (cycle_delta > max_cycles) {
+                       max_cycles <<= 1;
+                       mult >>= 1;
+                       shift--;
+               }
+

to avoid the expensive math. But as Jason pointed out, some accuracy is
lost.
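
For illustration only, a userspace sketch of that v2 scaling loop shows
where the accuracy goes. The names cyc2ns_scaled() and ilog2_u32() are
invented here, and the extra "mult > 1 && shift > 0" guard is my addition
for the sketch, not part of the v2 patch:

```c
#include <stdint.h>

/* Integer log2 for a nonzero 32-bit value (stand-in for the kernel's ilog2()). */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical helper: halve mult (and drop one bit of shift) until
 * cycle_delta * mult fits in 63 bits. Every halving discards the low
 * bit of mult, which is the source of the accuracy loss.
 */
static uint64_t cyc2ns_scaled(uint64_t cycle_delta, uint32_t mult, uint32_t shift)
{
	uint64_t max_cycles = 1ULL << (63 - (ilog2_u32(mult) + 1));

	/* Guard added for this sketch so mult and shift never reach zero. */
	while (cycle_delta > max_cycles && mult > 1 && shift > 0) {
		max_cycles <<= 1;
		mult >>= 1;
		shift--;
	}
	return (cycle_delta * mult) >> shift;
}
```

With mult = 3 and shift = 1 (1.5 ns per cycle), a delta of 2^62 cycles
forces one halving, so the conversion effectively runs with mult = 1 and
shift = 0 and the result comes out a third too small.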

> 
> The max value can be precalculated and stored in the timekeeper
> struct. We really do not want expensive calculations in the fast path.

Yeah, just like the max_idle_ns.

> > +   s64 nsec = 0;
> > +
> > +   /* The (mult * cycles) may overflow 64 bits, so add a max check */
> > +   if (cycles > max) {
> > +           nsec = ((max * mult) >> shift) * (cycles / max);
> 
> This breaks everything which does not have a 64/64bit divide instruction.
>  
> > +           cycles %= max;
> 
> Ditto.
> 
> As this is the slow path on resume you can use the 64bit functions
> declared in math64.h. And you want to put the slow path out of line.

So should I leave clocksource_cyc2ns() untouched and only add these
64-bit do_div() operations inside the timekeeping_resume() slow path?
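
Roughly what I have in mind, as a userspace sketch only. The struct
cs_sketch and both function names are hypothetical, and in the kernel the
plain "/" and "%" below would have to become the math64.h helpers (e.g.
div64_u64_rem(), assuming it is available):

```c
#include <stdint.h>

/* Hypothetical container for the values that would be precalculated once. */
struct cs_sketch {
	uint32_t mult;
	uint32_t shift;
	uint64_t max_cycles;	/* largest delta that cannot overflow */
};

/* Do the expensive division once, outside of any fast path. mult != 0. */
static void cs_sketch_init(struct cs_sketch *cs, uint32_t mult, uint32_t shift)
{
	cs->mult = mult;
	cs->shift = shift;
	cs->max_cycles = UINT64_MAX / mult;
}

/*
 * Out-of-line slow path: convert in chunks of max_cycles so that no
 * single multiplication overflows 64 bits. The final product can still
 * overflow for absurdly large deltas; this sketch does not handle that.
 */
static uint64_t cyc2ns_slow(const struct cs_sketch *cs, uint64_t cycles)
{
	uint64_t nsec = 0;

	if (cycles > cs->max_cycles) {
		uint64_t chunks = cycles / cs->max_cycles;
		uint64_t chunk_ns = (cs->max_cycles * cs->mult) >> cs->shift;

		nsec = chunk_ns * chunks;
		cycles %= cs->max_cycles;
	}
	nsec += (cycles * cs->mult) >> cs->shift;
	return nsec;
}
```

The fast path would keep the existing multiply-and-shift, and only
timekeeping_resume() would call the slow variant.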

Thanks,
Feng


 
