Re: [PATCH] Add a text_poke syscall v2

2013-11-30 Thread Oleg Nesterov
On 11/30, Oleg Nesterov wrote: > > First of all, this smp_wmb() is not clear. Yes, but... > However, do_int3() does preempt_conditional_sti() and this looks > as if it can be called with irqs enabled? Ah, please ignore, I misread preempt_conditional_sti(). It enables irqs if they were enabled be

Re: [PATCH] Add a text_poke syscall v2

2013-11-30 Thread Oleg Nesterov
Sorry for completely offtopic question, but while we are here... On 11/30, Jiri Kosina wrote: > > We have moved from using stop_machine() to int3-based patching exactly > because it's much more lightweight. I don't really understans the barriers in poke_int3_handler() and text_poke_bp(). To the p

Re: [PATCH] Add a text_poke syscall v2

2013-11-30 Thread Oleg Nesterov
On 11/30, Jiri Kosina wrote: > > On Fri, 29 Nov 2013, Oleg Nesterov wrote: > > > > > Andi, et al. I am going to discuss the things I do not really > > > > understand, probably this can't make any sense, but... > > > > > > I think it's enough to set the dirty bit in the underlying > > > struct page,

Re: [PATCH] Add a text_poke syscall v2

2013-11-30 Thread Oleg Nesterov
On 11/29, H. Peter Anvin wrote: > > On 11/29/2013 12:35 PM, Oleg Nesterov wrote: > > On 11/29, H. Peter Anvin wrote: > >> > >> On 11/29/2013 12:05 PM, Oleg Nesterov wrote: > >>> > >>> Can't we invalidate pte (so that any user will stuck in page fault), > >>> update the page(s), restore the pte and

Re: [PATCH] Add a text_poke syscall v2

2013-11-30 Thread Oleg Nesterov
On 11/29, H. Peter Anvin wrote: > > On 11/29/2013 12:05 PM, Oleg Nesterov wrote: > > > > Can't we invalidate pte (so that any user will stuck in page fault), > > update the page(s), restore the pte and drop the locks? > > > > This way sys_text_poke() won't be x86-specific, and it will be per-mm. >

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread H. Peter Anvin
On 11/29/2013 12:05 PM, Oleg Nesterov wrote: > On 11/29, Andi Kleen wrote: >> >>> Andi, et al. I am going to discuss the things I do not really >>> understand, probably this can't make any sense, but... >> >> I think it's enough to set the dirty bit in the underlying >> struct page, no need to play

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread Linus Torvalds
On Fri, Nov 29, 2013 at 3:24 PM, Jiri Kosina wrote: > > Do you think this'd be faster than the int3-based aproach? Unlikely to be faster, but perhaps more robust and more portable. Maybe. And the reason we use the int3-based approach is that doing TLB shootdowns of kernel mappings is completely

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread Jiri Kosina
On Fri, 29 Nov 2013, Oleg Nesterov wrote: > > > Andi, et al. I am going to discuss the things I do not really > > > understand, probably this can't make any sense, but... > > > > I think it's enough to set the dirty bit in the underlying > > struct page, no need to play games with the PTE. > > Ah

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread H. Peter Anvin
On 11/29/2013 12:35 PM, Oleg Nesterov wrote: > On 11/29, H. Peter Anvin wrote: >> >> On 11/29/2013 12:05 PM, Oleg Nesterov wrote: >>> >>> Can't we invalidate pte (so that any user will stuck in page fault), >>> update the page(s), restore the pte and drop the locks? >> >> That would require a globa

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread Oleg Nesterov
On 11/29, H. Peter Anvin wrote: > > On 11/29/2013 12:05 PM, Oleg Nesterov wrote: > > > > Can't we invalidate pte (so that any user will stuck in page fault), > > update the page(s), restore the pte and drop the locks? > > > > That would require a global TLB shootdown Well, it is not really global,

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread H. Peter Anvin
On 11/29/2013 12:05 PM, Oleg Nesterov wrote: > > Can't we invalidate pte (so that any user will stuck in page fault), > update the page(s), restore the pte and drop the locks? > That would require a global TLB shootdown (and wouldn't help shared-memory code segments, if we care about that at all

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread Oleg Nesterov
On 11/29, Andi Kleen wrote: > > > Andi, et al. I am going to discuss the things I do not really > > understand, probably this can't make any sense, but... > > I think it's enough to set the dirty bit in the underlying > struct page, no need to play games with the PTE. Ah, sorry for confusion, I gu

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread Andi Kleen
Hi Oleg, Thanks for finding all these problems. > We are going to change this user page, it seems that we need > set_page_dirty() ? That's true. Will add it next rev. > > Andi, et al. I am going to discuss the things I do not really > understand, probably this can't make any sense, but... I t

Re: [PATCH] Add a text_poke syscall v2

2013-11-29 Thread Oleg Nesterov
On 11/25, Andi Kleen wrote: > > + err = get_user_pages_fast((unsigned long)addr, npages, 1, pages); > + if (err < 0) > + return err; > + if (err != npages) { > + err = -EFAULT; > + goto out; > + } > + err = 0; > + mutex_lock(&text_mutex);

Re: [PATCH] Add a text_poke syscall v2

2013-11-28 Thread Jiri Kosina
On Wed, 27 Nov 2013, Linus Torvalds wrote: > But is such batching really even worth it? If' it's not *that* much more > effort, maybe it's worth it, but do we have known users that really > would have thousands and thousands of cases all at once? If you want to base tracing infrastructure on it

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
ftrace is the flagship example. And yes, agreed about timeouts. Linus Torvalds wrote: >On Wed, Nov 27, 2013 at 3:28 PM, H. Peter Anvin wrote: >> >> The timeout bit was an acknowledgment that some kind of batching >> interface is necessary. > >That's just moronic. People would make up totally ra

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Linus Torvalds
On Wed, Nov 27, 2013 at 3:28 PM, H. Peter Anvin wrote: > > The timeout bit was an acknowledgment that some kind of batching > interface is necessary. That's just moronic. People would make up totally random timeouts, so from an interface standpoint it's just horrid, horrid. Giving user space ran

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
On 11/27/2013 03:40 PM, Borislav Petkov wrote: > > But I'd guess that depends on the uarch because Bulldozer, reportedly, > implements MFENCE by causing a pipeline stall which controls the > prefetches too: > An implementation can always make the memory model stronger, but not weaker. -

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Andi Kleen
One reason why I like having a syscall is that if there is some CPU which needs additional workarounds here it could be all done in a central place (I don't know of any that does right now but traditionally it has been a bug intensive area) > Hmm? It doesn't sound too bad. And I really don't see

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Borislav Petkov
On Wed, Nov 27, 2013 at 03:20:07PM -0800, H. Peter Anvin wrote: > No. MFENCE doesn't serialize the front end. So my manual says "The MFENCE instruction is weakly-ordered with respect to data and instruction prefetches. Speculative loads initiated by the processor, or specified explicitly using c

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
On 11/27/2013 03:15 PM, Linus Torvalds wrote: > > Oh, I agree. The interface of the original patch was just inane/insane. > > The timeout and the callback is pointless. The only thing the system > call should get as an argument is the address and the replacement > instruction. So > > int text

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
On 11/27/2013 03:10 PM, Borislav Petkov wrote: > On Wed, Nov 27, 2013 at 02:40:04PM -0800, H. Peter Anvin wrote: >> Also don't forget we need the IPIs, too... > > Yeah, I was simply looking at whether we could get away with executing > an empty syscall, i.e. save us the CPUID and rely only on the

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Linus Torvalds
On Wed, Nov 27, 2013 at 2:53 PM, H. Peter Anvin wrote: > > If we are going to go down that route, I would like to see a list of > patch sites, not just one with a "timeout" that won't get used. Oh, I agree. The interface of the original patch was just inane/insane. The timeout and the callback i

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Borislav Petkov
On Wed, Nov 27, 2013 at 03:04:41PM -0800, Linus Torvalds wrote: > So sysexit/sysret doesn't count as a serializing instruction, no. > But it doesn't need to, because *self*-modifying code doesn't need a > serializing instruction, only a branch. It's only *cross*-modifying > code that needs a serial

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Borislav Petkov
On Wed, Nov 27, 2013 at 02:40:04PM -0800, H. Peter Anvin wrote: > Also don't forget we need the IPIs, too... Yeah, I was simply looking at whether we could get away with executing an empty syscall, i.e. save us the CPUID and rely only on the IPIs and IRET. But we don't IPI ourselves in smp_call_f

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Linus Torvalds
On Wed, Nov 27, 2013 at 2:31 PM, H. Peter Anvin wrote: > No, we're not... sysexit/sysret doesn't count. So sysexit/sysret doesn't count as a serializing instruction, no. But it doesn't need to, because *self*-modifying code doesn't need a serializing instruction, only a branch. It's only *cross*-

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
On 11/27/2013 02:41 PM, Linus Torvalds wrote: > On Wed, Nov 27, 2013 at 2:02 PM, H. Peter Anvin wrote: >> For the record, this is the entire patch necessary to do the >> sync_cores() system call -- and no potential interactions with security >> frameworks or whatnot, simply because no security-sen

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Linus Torvalds
On Wed, Nov 27, 2013 at 2:02 PM, H. Peter Anvin wrote: > For the record, this is the entire patch necessary to do the > sync_cores() system call -- and no potential interactions with security > frameworks or whatnot, simply because no security-sensitive operations > are performed of any kind. > >

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
On 11/27/2013 02:29 PM, Borislav Petkov wrote: > On Wed, Nov 27, 2013 at 02:25:29PM -0800, H. Peter Anvin wrote: >> Actually the CPUID is superfluous on anything than the executing CPU >> since IRET is serializing. > > Why not on the executing core too? We're IRET-ting there too. > > Which would

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
No, we're not... sysexit/sysret doesn't count. Borislav Petkov wrote: >On Wed, Nov 27, 2013 at 02:25:29PM -0800, H. Peter Anvin wrote: >> Actually the CPUID is superfluous on anything than the executing CPU >> since IRET is serializing. > >Why not on the executing core too? We're IRET-ting there

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Borislav Petkov
On Wed, Nov 27, 2013 at 02:25:29PM -0800, H. Peter Anvin wrote: > Actually the CPUID is superfluous on anything than the executing CPU > since IRET is serializing. Why not on the executing core too? We're IRET-ting there too. Which would mean that a dummy empty syscall would be enough too since w

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
Correct. Borislav Petkov wrote: >On Wed, Nov 27, 2013 at 02:02:50PM -0800, H. Peter Anvin wrote: >> For the record, this is the entire patch necessary to do the >> sync_cores() system call -- and no potential interactions with >security >> frameworks or whatnot, simply because no security-sensiti

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
Actually the CPUID is superfluous on anything than the executing CPU since IRET is serializing. Borislav Petkov wrote: >On Wed, Nov 27, 2013 at 02:02:50PM -0800, H. Peter Anvin wrote: >> For the record, this is the entire patch necessary to do the >> sync_cores() system call -- and no potential

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Andy Lutomirski
On Wed, Nov 27, 2013 at 2:02 PM, H. Peter Anvin wrote: > For the record, this is the entire patch necessary to do the > sync_cores() system call -- and no potential interactions with security > frameworks or whatnot, simply because no security-sensitive operations > are performed of any kind. > >

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread Borislav Petkov
On Wed, Nov 27, 2013 at 02:02:50PM -0800, H. Peter Anvin wrote: > For the record, this is the entire patch necessary to do the > sync_cores() system call -- and no potential interactions with security > frameworks or whatnot, simply because no security-sensitive operations > are performed of any ki

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
For the record, this is the entire patch necessary to do the sync_cores() system call -- and no potential interactions with security frameworks or whatnot, simply because no security-sensitive operations are performed of any kind. Comments/opinions appreciated. -hpa >From c0246c43c30453e

Re: [PATCH] Add a text_poke syscall v2

2013-11-27 Thread H. Peter Anvin
On 11/26/2013 12:03 PM, Linus Torvalds wrote: > On Tue, Nov 26, 2013 at 11:05 AM, Andy Lutomirski wrote: >> >> IIRC someone proposed that, rather than specifying a "handler", that any >> user thread that traps just wait until the poke completes. This would >> complicate the kernel implementation

Re: [PATCH] Add a text_poke syscall v2

2013-11-26 Thread Linus Torvalds
On Tue, Nov 26, 2013 at 11:05 AM, Andy Lutomirski wrote: > > IIRC someone proposed that, rather than specifying a "handler", that any > user thread that traps just wait until the poke completes. This would > complicate the kernel implementation a bit, but it would make the user > code a good deal

Re: [PATCH] Add a text_poke syscall v2

2013-11-26 Thread Andi Kleen
> IIRC someone proposed that, rather than specifying a "handler", that any > user thread that traps just wait until the poke completes. This would > complicate the kernel implementation a bit, but it would make the user > code a good deal simpler. Is there any reason that this is a bad idea? Sho

Re: [PATCH] Add a text_poke syscall v2

2013-11-26 Thread Andy Lutomirski
On 11/25/2013 04:37 PM, Andi Kleen wrote: > From: Andi Kleen > > [Addressed all addressable review feedback in v2] > > Properly patching running code ("cross modification") > is a quite complicated business on x86. > > The CPU has specific rules that need to be followed, including > multiple gl

[PATCH] Add a text_poke syscall v2

2013-11-25 Thread Andi Kleen
From: Andi Kleen [Addressed all addressable review feedback in v2] Properly patching running code ("cross modification") is a quite complicated business on x86. The CPU has specific rules that need to be followed, including multiple global barriers. Self modifying code is getting more popular,