Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-26 Thread Peter Zijlstra
On Fri, Jan 23, 2015 at 03:45:55PM -0800, Jason Low wrote:
> On a side note, if we just move the cputimer->running = 1 to after the
> call to update_gt_cputime in thread_group_cputimer(), then we don't have
> to worry about concurrent adds occurring in this function?

Yeah, maybe.. There are a few races there, but I figure that because we
already test cputimer->running outside of cputimer->lock, they're already
possible.
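
For reference, the reordering suggested above would look roughly like this
in thread_group_cputimer() (a sketch against this RFC's atomic fields, not
a tested patch):

void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times)
{
	struct thread_group_cputimer *cputimer = &tsk->signal->cputimer;
	struct task_cputime sum;

	if (!atomic_read(&cputimer->running)) {
		/*
		 * Take the initial snapshot first and only then set
		 * running, so nobody can race an atomic64_add() against
		 * update_gt_cputime()'s read-then-set sequence.
		 */
		thread_group_cputime(tsk, &sum);
		update_gt_cputime(cputimer, &sum);
		atomic_set(&cputimer->running, 1);
	}
	times->utime = atomic64_read(&cputimer->utime);
	times->stime = atomic64_read(&cputimer->stime);
	times->sum_exec_runtime = atomic64_read(&cputimer->sum_exec_runtime);
}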


Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Jason Low
On Fri, 2015-01-23 at 21:08 +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2015 at 11:23:36AM -0800, Jason Low wrote:
> > On Fri, 2015-01-23 at 10:25 +0100, Peter Zijlstra wrote:
> > > On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> > > > +static void update_gt_cputime(struct thread_group_cputimer *a, struct task_cputime *b)
> > > >  {
> > > > +	if (b->utime > atomic64_read(&a->utime))
> > > > +		atomic64_set(&a->utime, b->utime);
> > > >  
> > > > +	if (b->stime > atomic64_read(&a->stime))
> > > > +		atomic64_set(&a->stime, b->stime);
> > > >  
> > > > +	if (b->sum_exec_runtime > atomic64_read(&a->sum_exec_runtime))
> > > > +		atomic64_set(&a->sum_exec_runtime, b->sum_exec_runtime);
> > > >  }
> > > 
> > > See something like this is not safe against concurrent adds.
> > 
> > How about something like:
> > 
> > u64 a_utime, a_stime, a_sum_exec_runtime;
> > 
> > retry_utime:
> > 	a_utime = atomic64_read(&a->utime);
> > 	if (b->utime > a_utime) {
> > 		if (atomic64_cmpxchg(&a->utime, a_utime, b->utime) != a_utime)
> > 			goto retry_utime;
> > 	}
> > 
> > retry_stime:
> > 	a_stime = atomic64_read(&a->stime);
> > 	if (b->stime > a_stime) {
> > 		if (atomic64_cmpxchg(&a->stime, a_stime, b->stime) != a_stime)
> > 			goto retry_stime;
> > 	}
> > 
> > retry_sum_exec_runtime:
> > 	a_sum_exec_runtime = atomic64_read(&a->sum_exec_runtime);
> > 	if (b->sum_exec_runtime > a_sum_exec_runtime) {
> > 		if (atomic64_cmpxchg(&a->sum_exec_runtime, a_sum_exec_runtime,
> > 				     b->sum_exec_runtime) != a_sum_exec_runtime)
> > 			goto retry_sum_exec_runtime;
> > 	}
> 
> Disgusting, at least use an inline or macro to avoid repeating it :-)

Okay, let me see if I can make that a bit more readable  :)
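
For instance, the three retry loops could be folded into a single helper;
a sketch of that kind of cleanup (the __update_gt_cputime() name is
illustrative, not final):

static void __update_gt_cputime(atomic64_t *cputime, u64 sum_cputime)
{
	u64 curr_cputime;
	/*
	 * Only advance the stored value; retry if a concurrent
	 * cmpxchg or add moved it underneath us.
	 */
retry:
	curr_cputime = atomic64_read(cputime);
	if (sum_cputime > curr_cputime) {
		if (atomic64_cmpxchg(cputime, curr_cputime, sum_cputime) != curr_cputime)
			goto retry;
	}
}

static void update_gt_cputime(struct thread_group_cputimer *a, struct task_cputime *b)
{
	__update_gt_cputime(&a->utime, b->utime);
	__update_gt_cputime(&a->stime, b->stime);
	__update_gt_cputime(&a->sum_exec_runtime, b->sum_exec_runtime);
}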

On a side note, if we just move the cputimer->running = 1 to after the
call to update_gt_cputime in thread_group_cputimer(), then we don't have
to worry about concurrent adds occurring in this function?



Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Jason Low
On Fri, 2015-01-23 at 21:08 +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2015 at 11:23:36AM -0800, Jason Low wrote:
> > On Fri, 2015-01-23 at 10:25 +0100, Peter Zijlstra wrote:
> > > On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> > > > +static void update_gt_cputime(struct thread_group_cputimer *a, struct task_cputime *b)
> > > >  {
> > > > +	if (b->utime > atomic64_read(&a->utime))
> > > > +		atomic64_set(&a->utime, b->utime);
> > > >  
> > > > +	if (b->stime > atomic64_read(&a->stime))
> > > > +		atomic64_set(&a->stime, b->stime);
> > > >  
> > > > +	if (b->sum_exec_runtime > atomic64_read(&a->sum_exec_runtime))
> > > > +		atomic64_set(&a->sum_exec_runtime, b->sum_exec_runtime);
> > > >  }
> > > 
> > > See something like this is not safe against concurrent adds.
> > 
> > How about something like:
> > 
> > u64 a_utime, a_stime, a_sum_exec_runtime;
> > 
> > retry_utime:
> > 	a_utime = atomic64_read(&a->utime);
> > 	if (b->utime > a_utime) {
> > 		if (atomic64_cmpxchg(&a->utime, a_utime, b->utime) != a_utime)
> > 			goto retry_utime;
> > 	}
> > 
> > retry_stime:
> > 	a_stime = atomic64_read(&a->stime);
> > 	if (b->stime > a_stime) {
> > 		if (atomic64_cmpxchg(&a->stime, a_stime, b->stime) != a_stime)
> > 			goto retry_stime;
> > 	}
> > 
> > retry_sum_exec_runtime:
> > 	a_sum_exec_runtime = atomic64_read(&a->sum_exec_runtime);
> > 	if (b->sum_exec_runtime > a_sum_exec_runtime) {
> > 		if (atomic64_cmpxchg(&a->sum_exec_runtime, a_sum_exec_runtime,
> > 				     b->sum_exec_runtime) != a_sum_exec_runtime)
> > 			goto retry_sum_exec_runtime;
> > 	}
> 
> Disgusting, at least use an inline or macro to avoid repeating it :-)
> 
> Also, does anyone care about performance on 32bit systems? There's a few
> where atomic64 is abysmal.

Yeah, though we're also avoiding the spin lock/unlock calls each time, so
I'm not sure we're really adding anything of significance to the "overall
cost" on 32-bit systems. And update_gt_cputime() wouldn't get called too
frequently.
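
(For context: on 32-bit targets without native 64-bit atomics, the generic
fallback in lib/atomic64.c already wraps every atomic64 op in a hashed
spinlock, so the per-update locking cost is in the same ballpark either
way. A simplified sketch of that fallback:)

static union {
	raw_spinlock_t lock;
	char pad[L1_CACHE_BYTES];
} atomic64_lock[NR_LOCKS] __cacheline_aligned_in_smp;

void atomic64_add(long long a, atomic64_t *v)
{
	unsigned long flags;
	raw_spinlock_t *lock = lock_addr(v);	/* spinlock hashed from v's address */

	raw_spin_lock_irqsave(lock, flags);
	v->counter += a;
	raw_spin_unlock_irqrestore(lock, flags);
}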



Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Peter Zijlstra
On Fri, Jan 23, 2015 at 10:07:31AM -0800, Jason Low wrote:
> On Fri, 2015-01-23 at 10:33 +0100, Peter Zijlstra wrote:
> > > +	.running = ATOMIC_INIT(0),	\
> > > +	atomic_t running;
> > > +	atomic_set(&sig->cputimer.running, 1);
> > > @@ -174,7 +174,7 @@ static inline bool cputimer_running(struct task_struct *tsk)
> > > +	if (!atomic_read(&cputimer->running))
> > > +	if (!atomic_read(&cputimer->running)) {
> > > +	atomic_set(&cputimer->running, 1);
> > > +	if (atomic_read(&tsk->signal->cputimer.running))
> > > +	atomic_set(&cputimer->running, 0);
> > > +	if (atomic_read(&sig->cputimer.running)) {
> > > +	if (atomic_read(&tsk->signal->cputimer.running))
> > 
> > That doesn't really need an atomic_t.
> 
> Yeah, I was wondering about that, and made it atomic since we had:
> 
> raw_spin_lock_irqsave(&cputimer->lock, flags);
> cputimer->running = 0;
> raw_spin_unlock_irqrestore(&cputimer->lock, flags);
> 
> in stop_process_timers().

Yeah, that could've been ACCESS_ONCE(cputimer->running) = 0. FWIW
atomic_set() seems to not actually include the needed volatile cast.
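
(ACCESS_ONCE() being the usual volatile cast from compiler.h, the lockless
clear would be simply:)

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

/* instead of taking cputimer->lock around the store: */
ACCESS_ONCE(cputimer->running) = 0;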


Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Peter Zijlstra
On Fri, Jan 23, 2015 at 11:23:36AM -0800, Jason Low wrote:
> On Fri, 2015-01-23 at 10:25 +0100, Peter Zijlstra wrote:
> > On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> > > +static void update_gt_cputime(struct thread_group_cputimer *a, struct task_cputime *b)
> > >  {
> > > +	if (b->utime > atomic64_read(&a->utime))
> > > +		atomic64_set(&a->utime, b->utime);
> > >  
> > > +	if (b->stime > atomic64_read(&a->stime))
> > > +		atomic64_set(&a->stime, b->stime);
> > >  
> > > +	if (b->sum_exec_runtime > atomic64_read(&a->sum_exec_runtime))
> > > +		atomic64_set(&a->sum_exec_runtime, b->sum_exec_runtime);
> > >  }
> > 
> > See something like this is not safe against concurrent adds.
> 
> How about something like:
> 
> u64 a_utime, a_stime, a_sum_exec_runtime;
> 
> retry_utime:
> 	a_utime = atomic64_read(&a->utime);
> 	if (b->utime > a_utime) {
> 		if (atomic64_cmpxchg(&a->utime, a_utime, b->utime) != a_utime)
> 			goto retry_utime;
> 	}
> 
> retry_stime:
> 	a_stime = atomic64_read(&a->stime);
> 	if (b->stime > a_stime) {
> 		if (atomic64_cmpxchg(&a->stime, a_stime, b->stime) != a_stime)
> 			goto retry_stime;
> 	}
> 
> retry_sum_exec_runtime:
> 	a_sum_exec_runtime = atomic64_read(&a->sum_exec_runtime);
> 	if (b->sum_exec_runtime > a_sum_exec_runtime) {
> 		if (atomic64_cmpxchg(&a->sum_exec_runtime, a_sum_exec_runtime,
> 				     b->sum_exec_runtime) != a_sum_exec_runtime)
> 			goto retry_sum_exec_runtime;
> 	}

Disgusting, at least use an inline or macro to avoid repeating it :-)

Also, does anyone care about performance on 32bit systems? There's a few
where atomic64 is abysmal.


Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Jason Low
On Fri, 2015-01-23 at 10:25 +0100, Peter Zijlstra wrote:
> On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> > +static void update_gt_cputime(struct thread_group_cputimer *a, struct task_cputime *b)
> >  {
> > +	if (b->utime > atomic64_read(&a->utime))
> > +		atomic64_set(&a->utime, b->utime);
> >  
> > +	if (b->stime > atomic64_read(&a->stime))
> > +		atomic64_set(&a->stime, b->stime);
> >  
> > +	if (b->sum_exec_runtime > atomic64_read(&a->sum_exec_runtime))
> > +		atomic64_set(&a->sum_exec_runtime, b->sum_exec_runtime);
> >  }
> 
> See something like this is not safe against concurrent adds.

How about something like:

u64 a_utime, a_stime, a_sum_exec_runtime;

retry_utime:
	a_utime = atomic64_read(&a->utime);
	if (b->utime > a_utime) {
		if (atomic64_cmpxchg(&a->utime, a_utime, b->utime) != a_utime)
			goto retry_utime;
	}

retry_stime:
	a_stime = atomic64_read(&a->stime);
	if (b->stime > a_stime) {
		if (atomic64_cmpxchg(&a->stime, a_stime, b->stime) != a_stime)
			goto retry_stime;
	}

retry_sum_exec_runtime:
	a_sum_exec_runtime = atomic64_read(&a->sum_exec_runtime);
	if (b->sum_exec_runtime > a_sum_exec_runtime) {
		if (atomic64_cmpxchg(&a->sum_exec_runtime, a_sum_exec_runtime,
				     b->sum_exec_runtime) != a_sum_exec_runtime)
			goto retry_sum_exec_runtime;
	}



Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Jason Low
On Fri, 2015-01-23 at 10:33 +0100, Peter Zijlstra wrote:
> > +	.running = ATOMIC_INIT(0),	\
> > +	atomic_t running;
> > +	atomic_set(&sig->cputimer.running, 1);
> > @@ -174,7 +174,7 @@ static inline bool cputimer_running(struct task_struct *tsk)
> > +	if (!atomic_read(&cputimer->running))
> > +	if (!atomic_read(&cputimer->running)) {
> > +	atomic_set(&cputimer->running, 1);
> > +	if (atomic_read(&tsk->signal->cputimer.running))
> > +	atomic_set(&cputimer->running, 0);
> > +	if (atomic_read(&sig->cputimer.running)) {
> > +	if (atomic_read(&tsk->signal->cputimer.running))
> 
> That doesn't really need an atomic_t.

Yeah, I was wondering about that, and made it atomic since we had:

raw_spin_lock_irqsave(&cputimer->lock, flags);
cputimer->running = 0;
raw_spin_unlock_irqrestore(&cputimer->lock, flags);

in stop_process_timers().



Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Peter Zijlstra
> +	.running = ATOMIC_INIT(0),	\
> +	atomic_t running;
> +	atomic_set(&sig->cputimer.running, 1);
> @@ -174,7 +174,7 @@ static inline bool cputimer_running(struct task_struct *tsk)
> +	if (!atomic_read(&cputimer->running))
> +	if (!atomic_read(&cputimer->running)) {
> +	atomic_set(&cputimer->running, 1);
> +	if (atomic_read(&tsk->signal->cputimer.running))
> +	atomic_set(&cputimer->running, 0);
> +	if (atomic_read(&sig->cputimer.running)) {
> +	if (atomic_read(&tsk->signal->cputimer.running))

That doesn't really need an atomic_t.


Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Peter Zijlstra
On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> +static void update_gt_cputime(struct thread_group_cputimer *a, struct task_cputime *b)
>  {
> +	if (b->utime > atomic64_read(&a->utime))
> +		atomic64_set(&a->utime, b->utime);
>  
> +	if (b->stime > atomic64_read(&a->stime))
> +		atomic64_set(&a->stime, b->stime);
>  
> +	if (b->sum_exec_runtime > atomic64_read(&a->sum_exec_runtime))
> +		atomic64_set(&a->sum_exec_runtime, b->sum_exec_runtime);
>  }

See something like this is not safe against concurrent adds.
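
A hypothetical interleaving showing the lost update:

	CPU0: update_gt_cputime()		CPU1: account_group_user_time()

	t = atomic64_read(&a->utime);
						atomic64_add(delta, &a->utime);
	atomic64_set(&a->utime, b->utime);	/* CPU1's delta is lost */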


Re: [RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-23 Thread Peter Zijlstra
On Thu, Jan 22, 2015 at 07:31:53PM -0800, Jason Low wrote:
> When running a database workload, we found a scalability issue
> with itimers.
> 
> Much of the problem was caused by the thread_group_cputimer spinlock.
> Each time we account for group system/user time, we need to obtain a
> thread_group_cputimer's spinlock to update the timers. On larger
> systems (such as a 16-socket machine), more than 30% of total time
> was spent trying to obtain the kernel lock to update these group
> timer stats.
> 
> This patch converts the timers to 64-bit atomic variables and uses
> atomic adds to update them without a lock. With this patch, the percent
> of total time spent updating thread group cputimer timers was reduced
> from 30% down to less than 1%.

I'll have to look; I worry about consistency between the values. But why
would any self-respecting piece of software use this crap stuff? It's a
guaranteed scalability fail.


[RFC PATCH] sched, timer: Use atomics for thread_group_cputimer stats

2015-01-22 Thread Jason Low
When running a database workload, we found a scalability issue
with itimers.

Much of the problem was caused by the thread_group_cputimer spinlock.
Each time we account for group system/user time, we need to obtain a
thread_group_cputimer's spinlock to update the timers. On larger
systems (such as a 16-socket machine), more than 30% of total time
was spent trying to obtain the kernel lock to update these group
timer stats.

This patch converts the timers to 64-bit atomic variables and uses
atomic adds to update them without a lock. With this patch, the percent
of total time spent updating thread group cputimer timers was reduced
from 30% down to less than 1%.

Signed-off-by: Jason Low <jason.l...@hp.com>
---
 include/linux/init_task.h  |7 +++--
 include/linux/sched.h  |   12 +++--
 kernel/fork.c  |5 +---
 kernel/sched/stats.h   |   14 +++
 kernel/time/posix-cpu-timers.c |   48 ++-
 5 files changed, 35 insertions(+), 51 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 3037fc0..f593b38 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -50,9 +50,10 @@ extern struct fs_struct init_fs;
.cpu_timers = INIT_CPU_TIMERS(sig.cpu_timers),  \
.rlim   = INIT_RLIMITS, \
.cputimer   = { \
-   .cputime = INIT_CPUTIME,\
-   .running = 0,   \
-   .lock = __RAW_SPIN_LOCK_UNLOCKED(sig.cputimer.lock),\
+   .utime = ATOMIC64_INIT(0),  \
+   .stime = ATOMIC64_INIT(0),  \
+   .sum_exec_runtime = ATOMIC64_INIT(0),   \
+   .running = ATOMIC_INIT(0),  \
},  \
.cred_guard_mutex = \
 __MUTEX_INITIALIZER(sig.cred_guard_mutex), \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8db31ef..0d73fd4 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -588,9 +588,10 @@ struct task_cputime {
  * used for thread group CPU timer calculations.
  */
 struct thread_group_cputimer {
-   struct task_cputime cputime;
-   int running;
-   raw_spinlock_t lock;
+   atomic64_t utime;
+   atomic64_t stime;
+   atomic64_t sum_exec_runtime;
+   atomic_t running;
 };
 
 #include <linux/rwsem.h>
@@ -2942,11 +2943,6 @@ static __always_inline bool need_resched(void)
 void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times);
void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times);
 
-static inline void thread_group_cputime_init(struct signal_struct *sig)
-{
-   raw_spin_lock_init(&sig->cputimer.lock);
-}
-
 /*
  * Reevaluate whether the task has signals pending delivery.
  * Wake the task if so.
diff --git a/kernel/fork.c b/kernel/fork.c
index 4dc2dda..d511f99 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1037,13 +1037,10 @@ static void posix_cpu_timers_init_group(struct signal_struct *sig)
 {
unsigned long cpu_limit;
 
-   /* Thread group counters. */
-   thread_group_cputime_init(sig);
-
cpu_limit = ACCESS_ONCE(sig->rlim[RLIMIT_CPU].rlim_cur);
if (cpu_limit != RLIM_INFINITY) {
sig->cputime_expires.prof_exp = secs_to_cputime(cpu_limit);
-   sig->cputimer.running = 1;
+   atomic_set(&sig->cputimer.running, 1);
}
 
/* The timer lists. */
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 4ab7043..caeab5f 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -174,7 +174,7 @@ static inline bool cputimer_running(struct task_struct *tsk)
 {
struct thread_group_cputimer *cputimer = &tsk->signal->cputimer;
 
-   if (!cputimer->running)
+   if (!atomic_read(&cputimer->running))
return false;
 
/*
@@ -215,9 +215,7 @@ static inline void account_group_user_time(struct task_struct *tsk,
if (!cputimer_running(tsk))
return;
 
-   raw_spin_lock(&cputimer->lock);
-   cputimer->cputime.utime += cputime;
-   raw_spin_unlock(&cputimer->lock);
+   atomic64_add(cputime, &cputimer->utime);
 }
 
 /**
@@ -238,9 +236,7 @@ static inline void account_group_system_time(struct task_struct *tsk,
if (!cputimer_running(tsk))
return;
 
-   raw_spin_lock(&cputimer->lock);
-   cputimer->cputime.stime += cputime;
-   raw_spin_unlock(&cputimer->lock);
+   atomic64_add(cputime, &cputimer->stime);
 }
 
 /**
@@ -261,7 +257,5 @@ static inline void account_group_exec_runtime(struct task_struct *tsk,
if (!cputimer_running(tsk))
return;
 
-   raw_spin_lock(&cputimer->lock);
-  
