Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-25 Thread Khalid Aziz
On 03/25/2014 05:01 PM, Davidlohr Bueso wrote: Good timing! The topic came up just yesterday in LSF/MM. This functionality is on the wish list for both facebook and postgres. Thanks for letting me know. I am glad to hear of others who need this functionality. Did you happen to catch the

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-25 Thread Davidlohr Bueso
On Mon, 2014-03-03 at 11:07 -0700, Khalid Aziz wrote: > I am working on a feature that has been requested by database folks that > helps with performance. Some of the oft executed database code uses > mutexes to lock other threads out of a critical section. They often see > a situation where a

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-25 Thread Davidlohr Bueso
On Mon, 2014-03-03 at 11:07 -0700, Khalid Aziz wrote: I am working on a feature that has been requested by database folks that helps with performance. Some of the oft executed database code uses mutexes to lock other threads out of a critical section. They often see a situation where a thread

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Andi Kleen
On Thu, Mar 06, 2014 at 02:59:46PM +0100, Peter Zijlstra wrote: > On Thu, Mar 06, 2014 at 11:13:33PM +1100, Kevin Easton wrote: > > On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: > > > Anything else? > > > > If it was possible to make the time remaining in the current timeslice > >

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Khalid Aziz
On 03/06/2014 04:14 AM, Thomas Gleixner wrote: We understand that you want to avoid preemption in the first place and not getting into the contention handling case. But, what you're trying to do is essentially creating an ABI which we have to support and maintain forever. And that definitely is

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Khalid Aziz
On 03/06/2014 07:25 AM, David Lang wrote: On Thu, 6 Mar 2014, Kevin Easton wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? If it was possible to make the time remaining in the current timeslice available to userspace through the vdso, the thread could do

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Khalid Aziz
On 03/06/2014 02:57 AM, Peter Zijlstra wrote: On Wed, Mar 05, 2014 at 12:58:29PM -0700, Khalid Aziz wrote: Looking at the current problem I am trying to solve with databases and JVM, I run into the same issue I described in my earlier email. Proxy execution is a post-contention solution. By the

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread H. Peter Anvin
The no checking is omitting access_ok(), no? Either way, disabling page faults have to be done explicitly. On March 6, 2014 6:33:04 AM PST, Thomas Gleixner wrote: > > >On Thu, 6 Mar 2014, Peter Zijlstra wrote: > >> On Thu, Mar 06, 2014 at 02:45:00PM +0100, Rasmus Villemoes wrote: >> > Peter

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Thomas Gleixner
On Thu, 6 Mar 2014, Peter Zijlstra wrote: > On Thu, Mar 06, 2014 at 02:45:00PM +0100, Rasmus Villemoes wrote: > > Peter Zijlstra writes: > > > > > On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: > > >> Is it possible to implement non-sleeping versions of {get,put}_user()? >

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread David Lang
On Thu, 6 Mar 2014, Kevin Easton wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? If it was possible to make the time remaining in the current timeslice available to userspace through the vdso, the thread could do something like: if (sys_timeleft() <

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread David Lang
On Wed, 5 Mar 2014, Khalid Aziz wrote: On 03/05/2014 05:36 PM, David Lang wrote: Yes, you pay for two context switches, but you don't pay for threads B..ZZZ all running (and potentially spinning) trying to acquire the lock before thread A is able to complete its work. Ah, great. We are

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Thomas Gleixner
On Thu, 6 Mar 2014, Rasmus Villemoes wrote: > Peter Zijlstra writes: > > > On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: > >> Is it possible to implement non-sleeping versions of {get,put}_user()? > > > > __{get,put}_user() > > Huh? > > arch/x86/include/asm/uaccess.h: > >

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Thu, Mar 06, 2014 at 02:45:00PM +0100, Rasmus Villemoes wrote: > Peter Zijlstra writes: > > > On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: > >> Is it possible to implement non-sleeping versions of {get,put}_user()? > > > > __{get,put}_user() > > Huh? > >

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Thu, Mar 06, 2014 at 11:13:33PM +1100, Kevin Easton wrote: > On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: > > Anything else? > > If it was possible to make the time remaining in the current timeslice > available to userspace through the vdso, the thread could do something like:

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Rasmus Villemoes
Peter Zijlstra writes: > On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: >> Is it possible to implement non-sleeping versions of {get,put}_user()? > > __{get,put}_user() Huh? arch/x86/include/asm/uaccess.h: /** * __get_user: - Get a simple variable from user space, with

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: > Is it possible to implement non-sleeping versions of {get,put}_user()? __{get,put}_user()

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Rasmus Villemoes
"H. Peter Anvin" writes: > I have several issues with this interface: > > 1. First, a process needs to know if it *should* have been preempted > before it calls sched_yield(). So there needs to be a second flag set > by the scheduler when granting amnesty. > > 2. A process which fails to call
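
The first point quoted above (a second flag, set by the scheduler when it grants amnesty, so the thread knows it owes a yield) could be sketched roughly as follows. This is only an illustration under stated assumptions: `struct preempt_ctl`, its field names, and `fake_scheduler_tick()` are hypothetical and are not taken from the patch; the "scheduler" here is a plain function standing in for the kernel side.

```c
#include <sched.h>

/* Hypothetical layout of the shared control area (not from the patch). */
struct preempt_ctl {
    volatile unsigned char request;  /* set by the thread: "please delay preemption" */
    volatile unsigned char granted;  /* set by the scheduler: "I delayed it, you owe a yield" */
};

/* Stand-in for the kernel: if immunity was requested when the timeslice
 * ran out, grant it and record that fact in the second flag. */
static void fake_scheduler_tick(struct preempt_ctl *c)
{
    if (c->request)
        c->granted = 1;
}

/* Called by the thread when it leaves its critical section.
 * Returns 1 if it yielded because amnesty had been granted, else 0. */
static int leave_critical_section(struct preempt_ctl *c)
{
    c->request = 0;
    if (c->granted) {
        c->granted = 0;
        sched_yield();   /* give back the borrowed time */
        return 1;
    }
    return 0;
}
```

With two flags the thread only pays for `sched_yield()` when the scheduler actually extended its timeslice, which is the case the first point is concerned with.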

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Kevin Easton
On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: > Anything else? If it was possible to make the time remaining in the current timeslice available to userspace through the vdso, the thread could do something like: if (sys_timeleft() < CRITICAL_SECTION_SIZE) yield(); lock(); to
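
A slightly fuller sketch of the pattern in the message above, for illustration only. `sys_timeleft()` does not exist in any kernel or vDSO; it is the hypothetical call from the message, stubbed here with a variable so the sketch compiles. The value of `CRITICAL_SECTION_SIZE` and the helper name `locked_increment()` are likewise assumptions.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical: stands in for "time left in the current timeslice" (ns).
 * A real version would have to be provided by the kernel via the vDSO. */
static long fake_timeleft_ns;
static long sys_timeleft(void) { return fake_timeleft_ns; }

/* Assumed upper bound on how long the critical section runs, in ns. */
#define CRITICAL_SECTION_SIZE 50000L

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Returns 1 if the thread yielded before taking the lock, else 0. */
static int locked_increment(void)
{
    int yielded = 0;

    /* Not enough time left to finish: give up the CPU now, while no lock
     * is held, and start the critical section on a fresh timeslice. */
    if (sys_timeleft() < CRITICAL_SECTION_SIZE) {
        sched_yield();
        yielded = 1;
    }

    pthread_mutex_lock(&lock);
    shared_counter++;            /* the critical section */
    pthread_mutex_unlock(&lock);
    return yielded;
}
```

The check and the lock are not atomic with respect to the scheduler, so this only lowers the chance of being preempted while holding the lock; it does not rule it out.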

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Thomas Gleixner
On Wed, 5 Mar 2014, Khalid Aziz wrote: > On 03/05/2014 04:10 AM, Peter Zijlstra wrote: > > On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: > > > Anything else? > > > > Proxy execution; its a form of PI that works for arbitrary scheduling > > policies (thus also very much including

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Wed, Mar 05, 2014 at 12:58:29PM -0700, Khalid Aziz wrote: > On 03/05/2014 04:10 AM, Peter Zijlstra wrote: > >On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: > >>Anything else? > > > >Proxy execution; its a form of PI that works for arbitrary scheduling > >policies (thus also very

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Wed, Mar 05, 2014 at 12:58:29PM -0700, Khalid Aziz wrote: On 03/05/2014 04:10 AM, Peter Zijlstra wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? Proxy execution; its a form of PI that works for arbitrary scheduling policies (thus also very much including

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Thomas Gleixner
On Wed, 5 Mar 2014, Khalid Aziz wrote: On 03/05/2014 04:10 AM, Peter Zijlstra wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? Proxy execution; its a form of PI that works for arbitrary scheduling policies (thus also very much including fair).

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Kevin Easton
On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? If it was possible to make the time remaining in the current timeslice available to userspace through the vdso, the thread could do something like: if (sys_timeleft() < CRITICAL_SECTION_SIZE) yield(); lock(); to

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Rasmus Villemoes
H. Peter Anvin h...@zytor.com writes: I have several issues with this interface: 1. First, a process needs to know if it *should* have been preempted before it calls sched_yield(). So there needs to be a second flag set by the scheduler when granting amnesty. 2. A process which fails to

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: Is it possible to implement non-sleeping versions of {get,put}_user()? __{get,put}_user()

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Rasmus Villemoes
Peter Zijlstra pet...@infradead.org writes: On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: Is it possible to implement non-sleeping versions of {get,put}_user()? __{get,put}_user() Huh? arch/x86/include/asm/uaccess.h: /** * __get_user: - Get a simple variable from user

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Thu, Mar 06, 2014 at 11:13:33PM +1100, Kevin Easton wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? If it was possible to make the time remaining in the current timeslice available to userspace through the vdso, the thread could do something like:

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Peter Zijlstra
On Thu, Mar 06, 2014 at 02:45:00PM +0100, Rasmus Villemoes wrote: Peter Zijlstra pet...@infradead.org writes: On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: Is it possible to implement non-sleeping versions of {get,put}_user()? __{get,put}_user() Huh?

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Thomas Gleixner
On Thu, 6 Mar 2014, Rasmus Villemoes wrote: Peter Zijlstra pet...@infradead.org writes: On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: Is it possible to implement non-sleeping versions of {get,put}_user()? __{get,put}_user() Huh? arch/x86/include/asm/uaccess.h:

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread David Lang
On Thu, 6 Mar 2014, Kevin Easton wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? If it was possible to make the time remaining in the current timeslice available to userspace through the vdso, the thread could do something like: if (sys_timeleft()

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Thomas Gleixner
On Thu, 6 Mar 2014, Peter Zijlstra wrote: On Thu, Mar 06, 2014 at 02:45:00PM +0100, Rasmus Villemoes wrote: Peter Zijlstra pet...@infradead.org writes: On Thu, Mar 06, 2014 at 02:24:43PM +0100, Rasmus Villemoes wrote: Is it possible to implement non-sleeping versions of

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread H. Peter Anvin
The no checking is omitting access_ok(), no? Either way, disabling page faults have to be done explicitly. On March 6, 2014 6:33:04 AM PST, Thomas Gleixner t...@linutronix.de wrote: On Thu, 6 Mar 2014, Peter Zijlstra wrote: On Thu, Mar 06, 2014 at 02:45:00PM +0100, Rasmus Villemoes wrote:

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-06 Thread Andi Kleen
On Thu, Mar 06, 2014 at 02:59:46PM +0100, Peter Zijlstra wrote: On Thu, Mar 06, 2014 at 11:13:33PM +1100, Kevin Easton wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? If it was possible to make the time remaining in the current timeslice available to

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 05:36 PM, David Lang wrote: Yes, you pay for two context switches, but you don't pay for threads B..ZZZ all running (and potentially spinning) trying to acquire the lock before thread A is able to complete its work. Ah, great. We are converging now. As soon as a second thread

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread David Lang
On Wed, 5 Mar 2014, Khalid Aziz wrote: On 03/05/2014 04:59 PM, David Lang wrote: what's the cost to setup mmap of this file in /proc. this is sounding like a lot of work. That is a one time cost paid when a thread initializes itself. is this gain from not giving up the CPU at all? or is

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 04:59 PM, David Lang wrote: what's the cost to setup mmap of this file in /proc. this is sounding like a lot of work. That is a one time cost paid when a thread initializes itself. is this gain from not giving up the CPU at all? or is it from avoiding all the delays due to

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread H. Peter Anvin
On 03/05/2014 04:02 PM, Khalid Aziz wrote: > > Yes, you had made that suggestion earlier and I like it. It will be in > v2 patch. I am thinking of making the penalty be denial of next > preemption immunity request if a process fails to yield when it should > have. Sounds good? > That is the

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 04:56 PM, H. Peter Anvin wrote: On 03/05/2014 03:48 PM, Khalid Aziz wrote: Cost is writing to a memory location since thread is using mmap, not insignificant but hardly expensive. Thread does not need to know how much time it has left in current timeslice. It always sets the flag

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread David Lang
On Wed, 5 Mar 2014, Khalid Aziz wrote: On 03/05/2014 04:13 PM, David Lang wrote: Yes, you have paid the cost of the context switch, but your original problem description talked about having multiple other threads trying to get the lock, then spinning trying to get the lock (wasting time if the

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread H. Peter Anvin
On 03/05/2014 03:48 PM, Khalid Aziz wrote: > > Cost is writing to a memory location since thread is using mmap, not > insignificant but hardly expensive. Thread does not need to know how > much time it has left in current timeslice. It always sets the flag to > request pre-emption immunity before

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 04:13 PM, David Lang wrote: Yes, you have paid the cost of the context switch, but your original problem description talked about having multiple other threads trying to get the lock, then spinning trying to get the lock (wasting time if the process holding it is asleep, but not if

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread David Lang
On Wed, 5 Mar 2014, Khalid Aziz wrote: On 03/05/2014 09:36 AM, Oleg Nesterov wrote: On 03/05, Andi Kleen wrote: On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote: On 03/04, Andi Kleen wrote: Anything else? Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help.

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 04:10 AM, Peter Zijlstra wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? Proxy execution; its a form of PI that works for arbitrary scheduling policies (thus also very much including fair). With that what you effectively end up with is the lock

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 04:10 AM, Peter Zijlstra wrote: On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? Proxy execution; its a form of PI that works for arbitrary scheduling policies (thus also very much including fair). With that what you effectively end up with is the lock

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 09:36 AM, Oleg Nesterov wrote: On 03/05, Andi Kleen wrote: On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote: On 03/04, Andi Kleen wrote: Anything else? Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. Or perhaps sys_futex() can do this if it

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Khalid Aziz
On 03/05/2014 09:12 AM, Oleg Nesterov wrote: On 03/05, Oleg Nesterov wrote: You added /proc/sched_preempt_delay to avoid the syscall. I think it would be better to simply add vdso_sched_preempt_delay() instead. I am stupid. vdso_sched_preempt_delay() obviously can't write to, say,

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/05, Oleg Nesterov wrote: > > You added /proc/sched_preempt_delay to avoid the syscall. I think it > would be better to simply add vdso_sched_preempt_delay() instead. I am stupid. vdso_sched_preempt_delay() obviously can't write to, say, task_struct. Oleg.

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/05, Andi Kleen wrote: > > On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote: > > On 03/04, Andi Kleen wrote: > > > > > > Anything else? > > > > Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. > > Or perhaps sys_futex() can do this if it knows the owner. Don't

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Andi Kleen
On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote: > On 03/04, Andi Kleen wrote: > > > > Anything else? > > Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. > Or perhaps sys_futex() can do this if it knows the owner. Don't ask > me what exactly I mean though ;)

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/04, Andi Kleen wrote: > > Anything else? Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. Or perhaps sys_futex() can do this if it knows the owner. Don't ask me what exactly I mean though ;) Oleg.

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/04, Khalid Aziz wrote: > > On 03/04/2014 12:03 PM, Oleg Nesterov wrote: >> >> 1. mremap() can move this vma, so do_exit() can't trust ->uaddr >> >> 2. Even worse, mremap() itself is not safe. It can do ->close() >> too and create the new vma with the same vm_ops. Another >>

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Peter Zijlstra
On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: > Anything else? Proxy execution; its a form of PI that works for arbitrary scheduling policies (thus also very much including fair). With that what you effectively end up with is the lock holder running 'boosted' by the runtime of its

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Peter Zijlstra
On Tue, Mar 04, 2014 at 04:51:15PM -0800, Andi Kleen wrote: Anything else? Proxy execution; its a form of PI that works for arbitrary scheduling policies (thus also very much including fair). With that what you effectively end up with is the lock holder running 'boosted' by the runtime of its

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/04, Khalid Aziz wrote: On 03/04/2014 12:03 PM, Oleg Nesterov wrote: 1. mremap() can move this vma, so do_exit() can't trust ->uaddr 2. Even worse, mremap() itself is not safe. It can do ->close() too and create the new vma with the same vm_ops. Another unmap

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/04, Andi Kleen wrote: Anything else? Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. Or perhaps sys_futex() can do this if it knows the owner. Don't ask me what exactly I mean though ;) Oleg.

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Andi Kleen
On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote: On 03/04, Andi Kleen wrote: Anything else? Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. Or perhaps sys_futex() can do this if it knows the owner. Don't ask me what exactly I mean though ;) You mean

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/05, Andi Kleen wrote: On Wed, Mar 05, 2014 at 03:54:20PM +0100, Oleg Nesterov wrote: On 03/04, Andi Kleen wrote: Anything else? Well, we have yield_to(). Perhaps sys_yield_to(lock_owner) can help. Or perhaps sys_futex() can do this if it knows the owner. Don't ask me what

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread Oleg Nesterov
On 03/05, Oleg Nesterov wrote: You added /proc/sched_preempt_delay to avoid the syscall. I think it would be better to simply add vdso_sched_preempt_delay() instead. I am stupid. vdso_sched_preempt_delay() obviously can't write to, say, task_struct. Oleg.

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread H. Peter Anvin
On 03/05/2014 03:48 PM, Khalid Aziz wrote: Cost is writing to a memory location since thread is using mmap, not insignificant but hardly expensive. Thread does not need to know how much time it has left in current timeslice. It always sets the flag to request pre-emption immunity before

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-05 Thread H. Peter Anvin
On 03/05/2014 04:02 PM, Khalid Aziz wrote: Yes, you had made that suggestion earlier and I like it. It will be in v2 patch. I am thinking of making the penalty be denial of next preemption immunity request if a process fails to yield when it should have. Sounds good? That is the minimum

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Andi Kleen
Thomas Gleixner writes: > On Tue, 4 Mar 2014, Khalid Aziz wrote: >> be in the right control group. Besides they want to use a common mechanism >> across multiple OSs and pre-emption delay is already in use on other OSs. >> Good >> idea though. > > Well, just because preemption delay is a

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Thomas Gleixner
On Tue, 4 Mar 2014, Khalid Aziz wrote: > be in the right control group. Besides they want to use a common mechanism > across multiple OSs and pre-emption delay is already in use on other OSs. Good > idea though. Well, just because preemption delay is a mechanism exposed by some other OS does not

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On 03/04/2014 03:23 PM, One Thousand Gnomes wrote: Obvious bug | Usage model is a thread mmaps this file during initialization. It then | writes a 1 to the mmap'd file after it grabs the lock in its critical | section where it wants immunity from pre-emption. You need to write it first or you

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread One Thousand Gnomes
Obvious bug | Usage model is a thread mmaps this file during initialization. It then | writes a 1 to the mmap'd file after it grabs the lock in its critical | section where it wants immunity from pre-emption. You need to write it first or you can be pre-empted taking the lock before asking for
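
The ordering point above can be illustrated with a small sketch. It assumes nothing about the real interface beyond what the message says: a thread writes 1 to a flag to ask for immunity. The flag here is an ordinary variable standing in for the byte a thread would obtain by mmap()ing the proposed /proc file, and `update_row()` and `db_lock` are hypothetical names.

```c
#include <pthread.h>

/* Stand-in for the mmap'd flag byte; a plain variable so the sketch
 * compiles and runs without the proposed kernel patch. */
static volatile unsigned char preempt_delay_flag;

static pthread_mutex_t db_lock = PTHREAD_MUTEX_INITIALIZER;
static int flag_seen_inside;   /* records the flag value observed under the lock */
static int rows_updated;

static void update_row(void)
{
    /* Ask for immunity BEFORE taking the lock. If the flag were written
     * after pthread_mutex_lock(), the thread could be preempted in the
     * window between acquiring the lock and setting the flag. */
    preempt_delay_flag = 1;
    pthread_mutex_lock(&db_lock);

    flag_seen_inside = preempt_delay_flag;
    rows_updated++;              /* the critical section */

    pthread_mutex_unlock(&db_lock);
    preempt_delay_flag = 0;      /* drop the request once the lock is released */
}
```

The cost of this ordering is that the request may already be active while the thread is still waiting for a contended lock.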

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On 03/04/2014 02:12 PM, H. Peter Anvin wrote: Shades of the Android wakelocks, no? This seems to effectively give userspace an option to turn preemptive multitasking into cooperative multitasking, which of course is unacceptable for a privileged process (the same reason why unprivileged

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread H. Peter Anvin
On 03/03/2014 10:07 AM, Khalid Aziz wrote: > > I am working on a feature that has been requested by database folks that > helps with performance. Some of the oft executed database code uses > mutexes to lock other threads out of a critical section. They often see > a situation where a thread

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On 03/04/2014 12:03 PM, Oleg Nesterov wrote: On 03/04, Khalid Aziz wrote: On 03/04/2014 06:56 AM, Oleg Nesterov wrote: Hmm. In fact I think do_exit() should crash after munmap? ->mmap_state should be NULL ?? Perhaps I misread this patch completely... do_exit() unmaps mmap_state->uaddr, and

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Oleg Nesterov
On 03/04, Khalid Aziz wrote: > > On 03/04/2014 06:56 AM, Oleg Nesterov wrote: >> Hmm. In fact I think do_exit() should crash after munmap? ->mmap_state >> should be NULL ?? Perhaps I misread this patch completely... > > do_exit() unmaps mmap_state->uaddr, and frees up mmap_state->kaddr and >

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On Tue, 2014-03-04 at 18:38 +, Al Viro wrote: > On Tue, Mar 04, 2014 at 10:44:54AM -0700, Khalid Aziz wrote: > > > do_exit() unmaps mmap_state->uaddr, and frees up mmap_state->kaddr > > and mmap_state. mmap_state should not be NULL after unmap. vfree() > > and kfree() are tolerant of pointers

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Al Viro
On Tue, Mar 04, 2014 at 10:44:54AM -0700, Khalid Aziz wrote: > do_exit() unmaps mmap_state->uaddr, and frees up mmap_state->kaddr > and mmap_state. mmap_state should not be NULL after unmap. vfree() > and kfree() are tolerant of pointers that have already been freed. Huh? Double free() is a
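
The distinction at issue can be sketched in userspace C, as an illustration rather than the kernel code under review. Freeing a NULL pointer is a harmless no-op (true of `free()` here, and of `kfree()` and `vfree()` in the kernel), but freeing the same non-NULL pointer twice is a bug. Clearing the pointer after the free makes a repeated release safe. `struct state_sketch` and `release_buffer()` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a structure that owns one buffer. */
struct state_sketch {
    void *kaddr;
};

/* Free the buffer and clear the pointer, so that a second call sees
 * NULL and does nothing instead of freeing the same memory twice. */
static void release_buffer(void **p)
{
    free(*p);     /* free(NULL) is defined to do nothing */
    *p = NULL;
}
```

A caller would use `release_buffer(&s.kaddr)`; calling it again on the same field is then harmless because the pointer has already been reset.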

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
Thanks for the review. Please see my comments inline below. On 03/04/2014 06:56 AM, Oleg Nesterov wrote: On 03/03, Khalid Aziz wrote: kernel/sched/preempt_delay.c| 39 ++ Why? This can go into proc/ as well. Sure. No strong reason to keep these functions in

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Oleg Nesterov
On 03/03, Khalid Aziz wrote: > > This queueing > and subsequent CPU cycle wastage can be avoided if the locking thread > could request to be granted an additional timeslice if its current > timeslice runs out before it gives up the lock. Well. I am in no position to discuss the changes in

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Oleg Nesterov
On 03/03, Khalid Aziz wrote: This queueing and subsequent CPU cycle wastage can be avoided if the locking thread could request to be granted an additional timeslice if its current timeslice runs out before it gives up the lock. Well. I am in no position to discuss the changes in

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Al Viro
On Tue, Mar 04, 2014 at 10:44:54AM -0700, Khalid Aziz wrote: do_exit() unmaps mmap_state->uaddr, and frees up mmap_state->kaddr and mmap_state. mmap_state should not be NULL after unmap. vfree() and kfree() are tolerant of pointers that have already been freed. Huh? Double free() is a bug,

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On Tue, 2014-03-04 at 18:38 +, Al Viro wrote: On Tue, Mar 04, 2014 at 10:44:54AM -0700, Khalid Aziz wrote: do_exit() unmaps mmap_state->uaddr, and frees up mmap_state->kaddr and mmap_state. mmap_state should not be NULL after unmap. vfree() and kfree() are tolerant of pointers that have

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Oleg Nesterov
On 03/04, Khalid Aziz wrote: On 03/04/2014 06:56 AM, Oleg Nesterov wrote: Hmm. In fact I think do_exit() should crash after munmap? ->mmap_state should be NULL ?? Perhaps I misread this patch completely... do_exit() unmaps mmap_state->uaddr, and frees up mmap_state->kaddr and mmap_state.

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On 03/04/2014 12:03 PM, Oleg Nesterov wrote: On 03/04, Khalid Aziz wrote: On 03/04/2014 06:56 AM, Oleg Nesterov wrote: Hmm. In fact I think do_exit() should crash after munmap? ->mmap_state should be NULL ?? Perhaps I misread this patch completely... do_exit() unmaps mmap_state->uaddr, and

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread H. Peter Anvin
On 03/03/2014 10:07 AM, Khalid Aziz wrote: I am working on a feature that has been requested by database folks that helps with performance. Some of the oft executed database code uses mutexes to lock other threads out of a critical section. They often see a situation where a thread grabs the

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread One Thousand Gnomes
Obvious bug | Usage model is a thread mmaps this file during initialization. It then | writes a 1 to the mmap'd file after it grabs the lock in its critical | section where it wants immunity from pre-emption. You need to write it first or you can be pre-empted taking the lock before asking for

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Khalid Aziz
On 03/04/2014 03:23 PM, One Thousand Gnomes wrote: Obvious bug | Usage model is a thread mmaps this file during initialization. It then | writes a 1 to the mmap'd file after it grabs the lock in its critical | section where it wants immunity from pre-emption. You need to write it first or you

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Thomas Gleixner
On Tue, 4 Mar 2014, Khalid Aziz wrote: be in the right control group. Besides they want to use a common mechanism across multiple OSs and pre-emption delay is already in use on other OSs. Good idea though. Well, just because preemption delay is a mechanism exposed by some other OS does not

Re: [RFC] [PATCH] Pre-emption control for userspace

2014-03-04 Thread Andi Kleen
Thomas Gleixner t...@linutronix.de writes: On Tue, 4 Mar 2014, Khalid Aziz wrote: be in the right control group. Besides they want to use a common mechanism across multiple OSs and pre-emption delay is already in use on other OSs. Good idea though. Well, just because preemption delay is a
