Re: [PATCH v2 0/4] timer: Improve itimers scalability
Removing the dependency on the database code is trivial. It's just 100
lines that launch lots of threads and do NUMA-aware memory accesses, so
that remote NUMA access cost does not skew the benchmark. It's just a
bit tedious to convert the C++11 code to C/pthreads; C++11 really
spoiled me. Still, not much work. Let me know where to post the code.

On 10/16/2015 10:34 AM, Jason Low wrote:
> > Mind posting it, so that people can stick it into a new 'perf bench
> > timer' subcommand, and/or reproduce your results with it?
>
> Yes, sure. At the moment, this micro benchmark is written in C++ and
> integrated with the database code. We can look into rewriting it into
> a more general program so that it can be included in perf.
>
> Thanks,
> Jason

--
Hideaki Kimura
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 0/4] timer: Improve itimers scalability
On Fri, 2015-10-16 at 09:12 +0200, Ingo Molnar wrote:
> * Jason Low wrote:
> > > > With this patch set (along with commit 1018016c706f mentioned above),
> > > > the performance hit of itimers almost completely goes away on the
> > > > 16 socket system.
> > > >
> > > > Jason Low (4):
> > > >   timer: Optimize fastpath_timer_check()
> > > >   timer: Check thread timers only when there are active thread timers
> > > >   timer: Convert cputimer->running to bool
> > > >   timer: Reduce unnecessary sighand lock contention
> > > >
> > > >  include/linux/init_task.h      |  3 +-
> > > >  include/linux/sched.h          |  9 --
> > > >  kernel/fork.c                  |  2 +-
> > > >  kernel/time/posix-cpu-timers.c | 63 ---
> > > >  4 files changed, 54 insertions(+), 23 deletions(-)
> > >
> > > Is there some itimers benchmark that can be used to measure the effects
> > > of these changes?
> >
> > Yes, we also wrote a micro benchmark which generates cache misses and
> > measures the average cost of each cache miss (with itimers enabled).
> > We used this while writing and testing patches, since it takes a bit
> > longer to set up and run the database.
>
> Mind posting it, so that people can stick it into a new 'perf bench timer'
> subcommand, and/or reproduce your results with it?

Yes, sure. At the moment, this micro benchmark is written in C++ and
integrated with the database code. We can look into rewriting it into a
more general program so that it can be included in perf.

Thanks,
Jason
Re: [PATCH v2 0/4] timer: Improve itimers scalability
* Jason Low wrote:
> > > With this patch set (along with commit 1018016c706f mentioned above),
> > > the performance hit of itimers almost completely goes away on the
> > > 16 socket system.
> > >
> > > Jason Low (4):
> > >   timer: Optimize fastpath_timer_check()
> > >   timer: Check thread timers only when there are active thread timers
> > >   timer: Convert cputimer->running to bool
> > >   timer: Reduce unnecessary sighand lock contention
> > >
> > >  include/linux/init_task.h      |  3 +-
> > >  include/linux/sched.h          |  9 --
> > >  kernel/fork.c                  |  2 +-
> > >  kernel/time/posix-cpu-timers.c | 63 ---
> > >  4 files changed, 54 insertions(+), 23 deletions(-)
> >
> > Is there some itimers benchmark that can be used to measure the effects
> > of these changes?
>
> Yes, we also wrote a micro benchmark which generates cache misses and
> measures the average cost of each cache miss (with itimers enabled). We
> used this while writing and testing patches, since it takes a bit longer
> to set up and run the database.

Mind posting it, so that people can stick it into a new 'perf bench timer'
subcommand, and/or reproduce your results with it?

Thanks,

	Ingo
Re: [PATCH v2 0/4] timer: Improve itimers scalability
On Thu, 2015-10-15 at 10:47 +0200, Ingo Molnar wrote:
> * Jason Low wrote:
> > While running a database workload on a 16 socket machine, there were
> > scalability issues related to itimers. The following link contains a
> > more detailed summary of the issues at the application level.
> >
> > https://lkml.org/lkml/2015/8/26/737
> >
> > Commit 1018016c706f addressed the issue with the thread_group_cputimer
> > spinlock taking up a significant portion of total run time.
> > This patch series addresses the secondary issue where a lot of time is
> > spent trying to acquire the sighand lock. It was found in some cases
> > that 200+ threads were simultaneously contending for the same sighand
> > lock, reducing throughput by more than 30%.
> >
> > With this patch set (along with commit 1018016c706f mentioned above),
> > the performance hit of itimers almost completely goes away on the
> > 16 socket system.
> >
> > Jason Low (4):
> >   timer: Optimize fastpath_timer_check()
> >   timer: Check thread timers only when there are active thread timers
> >   timer: Convert cputimer->running to bool
> >   timer: Reduce unnecessary sighand lock contention
> >
> >  include/linux/init_task.h      |  3 +-
> >  include/linux/sched.h          |  9 --
> >  kernel/fork.c                  |  2 +-
> >  kernel/time/posix-cpu-timers.c | 63 ---
> >  4 files changed, 54 insertions(+), 23 deletions(-)
>
> Is there some itimers benchmark that can be used to measure the effects
> of these changes?

Yes, we also wrote a micro benchmark which generates cache misses and
measures the average cost of each cache miss (with itimers enabled). We
used this while writing and testing patches, since it takes a bit longer
to set up and run the database.

Jason
Re: [PATCH v2 0/4] timer: Improve itimers scalability
On Wed, 2015-10-14 at 17:18 -0400, George Spelvin wrote:
> I'm going to give 4/4 a closer look to see if the races with timer
> expiration make more sense to me than last time around.
> (E.g. do CPU time signals even work in CONFIG_NO_HZ_FULL?)
>
> But although I haven't yet convinced myself the current code is right,
> the changes don't seem to make it any worse. So consider all four
>
> Reviewed-by: George Spelvin

Thanks George!
Re: [PATCH v2 0/4] timer: Improve itimers scalability
On Wed, Oct 14, 2015 at 05:18:27PM -0400, George Spelvin wrote:
> I'm going to give 4/4 a closer look to see if the races with timer
> expiration make more sense to me than last time around.
> (E.g. do CPU time signals even work in CONFIG_NO_HZ_FULL?)

Those enqueued with timer_settime() do work. But itimers and rlimits
(RLIMIT_RTTIME, RLIMIT_CPU) aren't well supported. I need to rework that.

> But although I haven't yet convinced myself the current code is right,
> the changes don't seem to make it any worse. So consider all four
>
> Reviewed-by: George Spelvin

Thank you!
Re: [PATCH v2 0/4] timer: Improve itimers scalability
* Jason Low wrote:
> While running a database workload on a 16 socket machine, there were
> scalability issues related to itimers. The following link contains a
> more detailed summary of the issues at the application level.
>
> https://lkml.org/lkml/2015/8/26/737
>
> Commit 1018016c706f addressed the issue with the thread_group_cputimer
> spinlock taking up a significant portion of total run time.
> This patch series addresses the secondary issue where a lot of time is
> spent trying to acquire the sighand lock. It was found in some cases
> that 200+ threads were simultaneously contending for the same sighand
> lock, reducing throughput by more than 30%.
>
> With this patch set (along with commit 1018016c706f mentioned above),
> the performance hit of itimers almost completely goes away on the
> 16 socket system.
>
> Jason Low (4):
>   timer: Optimize fastpath_timer_check()
>   timer: Check thread timers only when there are active thread timers
>   timer: Convert cputimer->running to bool
>   timer: Reduce unnecessary sighand lock contention
>
>  include/linux/init_task.h      |  3 +-
>  include/linux/sched.h          |  9 --
>  kernel/fork.c                  |  2 +-
>  kernel/time/posix-cpu-timers.c | 63 ---
>  4 files changed, 54 insertions(+), 23 deletions(-)

Is there some itimers benchmark that can be used to measure the effects of
these changes?

Thanks,

	Ingo
Re: [PATCH v2 0/4] timer: Improve itimers scalability
I'm going to give 4/4 a closer look to see if the races with timer
expiration make more sense to me than last time around.
(E.g. do CPU time signals even work in CONFIG_NO_HZ_FULL?)

But although I haven't yet convinced myself the current code is right,
the changes don't seem to make it any worse. So consider all four

Reviewed-by: George Spelvin

Thank you!
[PATCH v2 0/4] timer: Improve itimers scalability
While running a database workload on a 16 socket machine, there were
scalability issues related to itimers. The following link contains a
more detailed summary of the issues at the application level.

https://lkml.org/lkml/2015/8/26/737

Commit 1018016c706f addressed the issue with the thread_group_cputimer
spinlock taking up a significant portion of total run time.
This patch series addresses the secondary issue where a lot of time is
spent trying to acquire the sighand lock. It was found in some cases
that 200+ threads were simultaneously contending for the same sighand
lock, reducing throughput by more than 30%.

With this patch set (along with commit 1018016c706f mentioned above),
the performance hit of itimers almost completely goes away on the
16 socket system.

Jason Low (4):
  timer: Optimize fastpath_timer_check()
  timer: Check thread timers only when there are active thread timers
  timer: Convert cputimer->running to bool
  timer: Reduce unnecessary sighand lock contention

 include/linux/init_task.h      |  3 +-
 include/linux/sched.h          |  9 --
 kernel/fork.c                  |  2 +-
 kernel/time/posix-cpu-timers.c | 63 ---
 4 files changed, 54 insertions(+), 23 deletions(-)

--
1.7.2.5