Re: Dealing with the NMI mess

2015-09-08 Thread Maciej W. Rozycki
On Mon, 7 Sep 2015, Andy Lutomirski wrote: > > It does not have to be mentioned, because it's implied by how the #DB > > exception is propagated: regardless of its origin it never checks the DPL. > > And user-mode software may well use POPF at any time to set the TF bit in > > the flags register

Re: Dealing with the NMI mess

2015-09-07 Thread Andy Lutomirski
On Mon, Sep 7, 2015 at 12:30 PM, Maciej W. Rozycki wrote: > On Mon, 7 Sep 2015, Andy Lutomirski wrote: > >> > These are all implementation-specific details, including the INT1 >> > instruction, which is why I am not at all surprised that they are omitted >> > from architecture manuals. >> >>

Re: Dealing with the NMI mess

2015-09-07 Thread Maciej W. Rozycki
On Mon, 7 Sep 2015, Andy Lutomirski wrote: > > These are all implementation-specific details, including the INT1 > > instruction, which is why I am not at all surprised that they are omitted > > from architecture manuals. > > That bit is BS, though. The INT1 instruction, executed in user mode

Re: Dealing with the NMI mess

2015-09-07 Thread Andy Lutomirski
On Mon, Sep 7, 2015 at 10:01 AM, Maciej W. Rozycki wrote: > These are all implementation-specific details, including the INT1 > instruction, which is why I am not at all surprised that they are omitted > from architecture manuals. That bit is BS, though. The INT1 instruction, executed in user

Re: Dealing with the NMI mess

2015-09-07 Thread Maciej W. Rozycki
On Mon, 7 Sep 2015, Paolo Bonzini wrote: > > I didn't do stuff at the probe firmware level so I can't say for sure, > > but my gut feeling is the debug mode is indeed very close if not the same > > as SMM. I think duplicating the logic would be an unnecessary waste of > > silicon. > > I

Re: Dealing with the NMI mess

2015-09-07 Thread Paolo Bonzini
On 07/09/2015 10:19, Maciej W. Rozycki wrote: >> > Essentially the ICE breakpoint instruction enters SMM mode? > I didn't do stuff at the probe firmware level so I can't say for sure, > but my gut feeling is the debug mode is indeed very close if not the same > as SMM. I think duplicating

Re: Dealing with the NMI mess

2015-09-07 Thread Maciej W. Rozycki
On Mon, 7 Sep 2015, Ingo Molnar wrote: > > I did some work on this a few years ago, including emulating DR0-7 > > accesses in > > software down the JTAG handler upon a General Detect fault to keep the > > kernel > > both happy and away from real debug registers. ;) Yes, you can debug any >

Re: Dealing with the NMI mess

2015-09-07 Thread Ingo Molnar
* Maciej W. Rozycki wrote: > I did some work on this a few years ago, including emulating DR0-7 accesses > in > software down the JTAG handler upon a General Detect fault to keep the kernel > both happy and away from real debug registers. ;) Yes, you can debug any > software with this

Re: Dealing with the NMI mess

2015-09-06 Thread Maciej W. Rozycki
On Fri, 31 Jul 2015, Borislav Petkov wrote: > Yeah, INT 1. I wonder whether INT 1, i.e. CD imm8 does the same thing. > > But why do you say it is special - it simply raises #DB, i.e. vector 1. > Web page seems to say so when interrupt redirection is disabled. It > sounds like a nice and quick

Re: Dealing with the NMI mess

2015-07-31 Thread Borislav Petkov
On Fri, Jul 31, 2015 at 12:26:34PM +0200, Paolo Bonzini wrote: > > > On 31/07/2015 12:25, Borislav Petkov wrote: > >> > The reason why it isn't documented is probably hidden within Intel. > >> > Besides ICEBP, which is a bit fringe, there's no reason not to document > >> > SALC which Thomas

Re: Dealing with the NMI mess

2015-07-31 Thread Borislav Petkov
On Fri, Jul 31, 2015 at 11:27:13AM +0200, Paolo Bonzini wrote: > Is the strace different between KVM and baremetal? Yes, the signal part is missing from kvm: $ strace ./icebp execve("./icebp", ["./icebp"], [/* 20 vars */]) = 0 brk(0) = 0x601000

Re: Dealing with the NMI mess

2015-07-31 Thread Paolo Bonzini
On 31/07/2015 12:25, Borislav Petkov wrote: >> > The reason why it isn't documented is probably hidden within Intel. >> > Besides ICEBP, which is a bit fringe, there's no reason not to document >> > SALC which Thomas mentioned. SALC all has been there since the 8086, >> > and has been

Re: Dealing with the NMI mess

2015-07-31 Thread Paolo Bonzini
On 31/07/2015 10:03, Borislav Petkov wrote: > $ ./icebp > Trace/breakpoint trap > > ^ this in qemu. Is the strace different between KVM and baremetal? QEMU dynamic translation is broken I think, but KVM should be the same as baremetal. >> Fortunately, it looks like the vm86 case is correct

Re: Dealing with the NMI mess

2015-07-31 Thread Borislav Petkov
On Thu, Jul 30, 2015 at 10:11:40PM -0700, Andy Lutomirski wrote: > This instruction is awesome. Binutils can disassemble it (it's called > "icebp") but it can't assemble it. KVM has special handling for it on > VMX and actually reports it to QEMU on SVM (complete with a defined > ABI). Fun. >

Re: Dealing with the NMI mess

2015-07-31 Thread Paolo Bonzini
On 31/07/2015 07:11, Andy Lutomirski wrote: > This instruction is awesome. Binutils can disassemble it (it's called > "icebp") but it can't assemble it. KVM has special handling for it on > VMX and actually reports it to QEMU on SVM (complete with a defined > ABI). FWIW it's not reported to

Re: Dealing with the NMI mess

2015-07-30 Thread Andy Lutomirski
On Thu, Jul 30, 2015 at 9:22 PM, Borislav Petkov wrote: > On Thu, Jul 30, 2015 at 02:22:06PM -0700, Andy Lutomirski wrote: >> Great. There's an opcode that invokes an interrupt gate that's not >> marked as allowing unprivileged access, and that opcode doesn't appear >> in the SDM. It appears in

Re: Dealing with the NMI mess

2015-07-30 Thread Borislav Petkov
On Thu, Jul 30, 2015 at 02:22:06PM -0700, Andy Lutomirski wrote: > Great. There's an opcode that invokes an interrupt gate that's not > marked as allowing unprivileged access, and that opcode doesn't appear > in the SDM. It appears in the APM opcode map with no explanation at > all. > > Thanks,

Re: Dealing with the NMI mess

2015-07-30 Thread Thomas Gleixner
On Thu, 30 Jul 2015, Andy Lutomirski wrote: > On Thu, Jul 30, 2015 at 8:41 AM, Paolo Bonzini wrote: > > > > > > On 24/07/2015 23:08, Andy Lutomirski wrote: > >> user_icebp is set if int $0x01 happens, except it isn't because user > >> code can't actually do that -- it'll cause #GP instead. >

Re: Dealing with the NMI mess

2015-07-30 Thread Brian Gerst
On Thu, Jul 30, 2015 at 5:22 PM, Andy Lutomirski wrote: > On Thu, Jul 30, 2015 at 8:41 AM, Paolo Bonzini wrote: >> >> >> On 24/07/2015 23:08, Andy Lutomirski wrote: >>> user_icebp is set if int $0x01 happens, except it isn't because user >>> code can't actually do that -- it'll cause #GP

Re: Dealing with the NMI mess

2015-07-30 Thread Andy Lutomirski
On Thu, Jul 30, 2015 at 8:41 AM, Paolo Bonzini wrote: > > > On 24/07/2015 23:08, Andy Lutomirski wrote: >> user_icebp is set if int $0x01 happens, except it isn't because user >> code can't actually do that -- it'll cause #GP instead. >> >> user_icebp is also set if the user has a bloody

Re: Dealing with the NMI mess

2015-07-30 Thread Paolo Bonzini
On 24/07/2015 19:20, Andy Lutomirski wrote: > > Andy, section 5.8 of the SDM makes me think we could possibly abuse SYSRET > > to emulate IRET, and then possibly simplify the flags processing. It says > > that it takes the CPL3 code segment but nowhere it says that the target is > > validated

Re: Dealing with the NMI mess

2015-07-30 Thread Paolo Bonzini
On 24/07/2015 23:08, Andy Lutomirski wrote: > user_icebp is set if int $0x01 happens, except it isn't because user > code can't actually do that -- it'll cause #GP instead. > > user_icebp is also set if the user has a bloody in-circuit emulator, > given the name. But who on Earth has one of

Re: Dealing with the NMI mess

2015-07-24 Thread Linus Torvalds
On Fri, Jul 24, 2015 at 1:51 PM, Peter Zijlstra wrote: > > do_debug() > send_sigtrap() > force_sig_info() > spin_lock_irqsave() > > Now, I don't pretend to understand the condition before send_sigtrap(), > so it _might_ be ok, but it sure as heck could do with a comment. Ugh. As Andy

Re: Dealing with the NMI mess

2015-07-24 Thread Andy Lutomirski
On Fri, Jul 24, 2015 at 1:51 PM, Peter Zijlstra wrote: > On Fri, Jul 24, 2015 at 01:22:11PM -0700, Linus Torvalds wrote: >> On Fri, Jul 24, 2015 at 12:55 PM, Peter Zijlstra >> wrote: >> > >> > I worry that we'll end up running the do_debug() handlers from effective >> > NMI context. >> > >> >

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 22:51:19 +0200 Peter Zijlstra wrote: > > Do we really take locks in the #DB handler? > > do_debug() > send_sigtrap() > force_sig_info() > spin_lock_irqsave() > > Now, I don't pretend to understand the condition before send_sigtrap(), > so it _might_ be ok, but

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Fri, Jul 24, 2015 at 01:22:11PM -0700, Linus Torvalds wrote: > On Fri, Jul 24, 2015 at 12:55 PM, Peter Zijlstra wrote: > > > > I worry that we'll end up running the do_debug() handlers from effective > > NMI context. > > > > The NMI might have preempted locks which these handlers require etc..

Re: Dealing with the NMI mess

2015-07-24 Thread Linus Torvalds
On Fri, Jul 24, 2015 at 12:55 PM, Peter Zijlstra wrote: > > I worry that we'll end up running the do_debug() handlers from effective > NMI context. > > The NMI might have preempted locks which these handlers require etc.. If #DB takes any locks like that, then #DB is broken. Pretty much by

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Fri, Jul 24, 2015 at 11:29:29AM -0700, Linus Torvalds wrote: > On Fri, Jul 24, 2015 at 8:30 AM, Peter Zijlstra wrote: > > On Fri, Jul 24, 2015 at 05:26:37PM +0200, Willy Tarreau wrote: > >> > > >> > The point is, if we trigger a #DB on an instruction breakpoint > >> > while !IF, then we simply

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 11:41:55 -0700 Linus Torvalds wrote: > On Fri, Jul 24, 2015 at 11:29 AM, Linus Torvalds > wrote: > > > > So in the #DB handler, we would basically only clear instruction > > breakpoints, and only when they trigger. If we have a data breakpoint > > that triggers (even in

Re: Dealing with the NMI mess

2015-07-24 Thread Linus Torvalds
On Fri, Jul 24, 2015 at 11:29 AM, Linus Torvalds wrote: > > So in the #DB handler, we would basically only clear instruction > breakpoints, and only when they trigger. If we have a data breakpoint > that triggers (even in kernel mode, and with interrupts disabled), let > it trigger and return

Re: Dealing with the NMI mess

2015-07-24 Thread Linus Torvalds
On Fri, Jul 24, 2015 at 8:30 AM, Peter Zijlstra wrote: > On Fri, Jul 24, 2015 at 05:26:37PM +0200, Willy Tarreau wrote: >> > >> > The point is, if we trigger a #DB on an instruction breakpoint >> > while !IF, then we simply disable that breakpoint and do the RET. >> >> Yes but the breakpoint

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 07:10:18PM +0200, Willy Tarreau wrote: > The OS has to set the RSP by itself before doing SYSRET, which opens a > race between "mov rsp" and "sysret", but if we only take that path once > we figure we come from NMI (using just IF+RSP), we know that IRQs and > NMIs are still

Re: Dealing with the NMI mess

2015-07-24 Thread Andy Lutomirski
On Fri, Jul 24, 2015 at 9:25 AM, Willy Tarreau wrote: > On Fri, Jul 24, 2015 at 08:48:57AM -0700, Andy Lutomirski wrote: >> So by the time we detect that we've hit a watchpoint, the instruction >> that tripped it is done and we don't need RF. Furthermore, after >> reading 17.3.1.1: I *think*

Re: Dealing with the NMI mess

2015-07-24 Thread Andy Lutomirski
On Fri, Jul 24, 2015 at 10:10 AM, Willy Tarreau wrote: > On Fri, Jul 24, 2015 at 08:48:57AM -0700, Andy Lutomirski wrote: >> So by the time we detect that we've hit a watchpoint, the instruction >> that tripped it is done and we don't need RF. Furthermore, after >> reading 17.3.1.1: I *think*

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 08:48:57AM -0700, Andy Lutomirski wrote: > So by the time we detect that we've hit a watchpoint, the instruction > that tripped it is done and we don't need RF. Furthermore, after > reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if > we hit a

Re: Dealing with the NMI mess

2015-07-24 Thread Raymond Jennings
On Thu, 2015-07-23 at 13:21 -0700, Andy Lutomirski wrote: > [moved to a new thread, cc list trimmed] > > Hi all- > > We've considered two approaches to dealing with NMIs: > > 1. Allow nesting. We know quite well how messy that is. This might be a stupid question, but 1. What exactly does

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 18:08:06 +0200 Willy Tarreau wrote: > > Um, isn't 0xf * DR_TRAP0 same as a constant "true"? > > For me it's a typo, it should have been : > > if ((dr6 & (0xf * DR_TRAP0) && (regs->flags & (X86_EFLAGS_RF | > > (the purpose was to read all 4 lower bits at once). I figured

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 08:48:57AM -0700, Andy Lutomirski wrote: > So by the time we detect that we've hit a watchpoint, the instruction > that tripped it is done and we don't need RF. Furthermore, after > reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if > we hit a

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 12:02:49PM -0400, Steven Rostedt wrote: > On Fri, 24 Jul 2015 08:48:57 -0700 > Andy Lutomirski wrote: > > > So by the time we detect that we've hit a watchpoint, the instruction > > that tripped it is done and we don't need RF. Furthermore, after > > reading 17.3.1.1: I

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 08:48:57 -0700 Andy Lutomirski wrote: > So by the time we detect that we've hit a watchpoint, the instruction > that tripped it is done and we don't need RF. Furthermore, after > reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if > we hit a watchpoint. So

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 08:48:57 -0700 Andy Lutomirski wrote: > So by the time we detect that we've hit a watchpoint, the instruction > that tripped it is done and we don't need RF. Furthermore, after > reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if > we hit a watchpoint. So

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 11:34:26AM -0400, Steven Rostedt wrote: > On Fri, 24 Jul 2015 17:26:37 +0200 > Willy Tarreau wrote: > > > > > The point is, if we trigger a #DB on an instruction breakpoint > > > while !IF, then we simply disable that breakpoint and do the RET. > > > > Yes but the

Re: Dealing with the NMI mess

2015-07-24 Thread Andy Lutomirski
On Fri, Jul 24, 2015 at 1:13 AM, Peter Zijlstra wrote: > On Thu, Jul 23, 2015 at 02:59:56PM -0700, Linus Torvalds wrote: >> Hmmm. I thought watchpoints were "before the instruction" too, but >> that's just because I haven't used them in ages, and I didn't remember >> the details. I just looked it

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 05:30:54PM +0200, Peter Zijlstra wrote: > On Fri, Jul 24, 2015 at 05:26:37PM +0200, Willy Tarreau wrote: > > > > > > The point is, if we trigger a #DB on an instruction breakpoint > > > while !IF, then we simply disable that breakpoint and do the RET. > > > > Yes but the

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 17:26:37 +0200 Willy Tarreau wrote: > > The point is, if we trigger a #DB on an instruction breakpoint > > while !IF, then we simply disable that breakpoint and do the RET. > > Yes but the breakpoint remains disabled then. Or I'm missing > something. Do we care? If it was

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Fri, Jul 24, 2015 at 05:26:37PM +0200, Willy Tarreau wrote: > > > > The point is, if we trigger a #DB on an instruction breakpoint > > while !IF, then we simply disable that breakpoint and do the RET. > > Yes but the breakpoint remains disabled then. Or I'm missing > something.

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 11:16:21AM -0400, Steven Rostedt wrote: > On Fri, 24 Jul 2015 16:59:01 +0200 > Willy Tarreau wrote: > > > On Fri, Jul 24, 2015 at 10:31:27AM -0400, Steven Rostedt wrote: > > > On Fri, 24 Jul 2015 15:21:28 +0200 > > > Willy Tarreau wrote: > > > > > > > My understanding

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 16:59:01 +0200 Willy Tarreau wrote: > On Fri, Jul 24, 2015 at 10:31:27AM -0400, Steven Rostedt wrote: > > On Fri, 24 Jul 2015 15:21:28 +0200 > > Willy Tarreau wrote: > > > > > My understanding is that by using RET we can't set the RF flag and #DB > > > > But the RF flag is

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 10:31:27AM -0400, Steven Rostedt wrote: > On Fri, 24 Jul 2015 15:21:28 +0200 > Willy Tarreau wrote: > > > My understanding is that by using RET we can't set the RF flag and #DB > > But the RF flag is only set for instruction (executing) breakpoints. It > is not set for

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 15:21:28 +0200 Willy Tarreau wrote: > My understanding is that by using RET we can't set the RF flag and #DB But the RF flag is only set for instruction (executing) breakpoints. It is not set for data (RW) ones. -- Steve > will immediately strike again when the operation

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Fri, Jul 24, 2015 at 03:30:13PM +0200, Peter Zijlstra wrote: > > > But do we care if we do hit one? The return from the #DB handler can > > > use a RET. Right? > > Look at do_debug(), it has lovely bits like: > > preempt_conditional_sti(); > > in it, we do _NOT_ want to be re-enabling

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Fri, Jul 24, 2015 at 03:21:28PM +0200, Willy Tarreau wrote: > On Fri, Jul 24, 2015 at 09:03:42AM -0400, Steven Rostedt wrote: > > On Fri, 24 Jul 2015 14:43:04 +0200 > > Peter Zijlstra wrote: > > > > > > > > I'm not too familiar with how to use hw breakpoints, but I'm guessing > > > >

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 09:03:42AM -0400, Steven Rostedt wrote: > On Fri, 24 Jul 2015 14:43:04 +0200 > Peter Zijlstra wrote: > > > > > I'm not too familiar with how to use hw breakpoints, but I'm guessing > > > (correct me if I'm wrong) that breakpoints on code that trigger when > > >

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 14:43:04 +0200 Peter Zijlstra wrote: > > I'm not too familiar with how to use hw breakpoints, but I'm guessing > > (correct me if I'm wrong) that breakpoints on code that trigger when > > executed, but watchpoints on data trigger when accessed. Then > >

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Fri, Jul 24, 2015 at 07:58:41AM -0400, Steven Rostedt wrote: > On Fri, 24 Jul 2015 10:13:26 +0200 > Peter Zijlstra wrote: > > > On Thu, Jul 23, 2015 at 02:59:56PM -0700, Linus Torvalds wrote: > > > Hmmm. I thought watchpoints were "before the instruction" too, but > > > that's just because I

Re: Dealing with the NMI mess

2015-07-24 Thread Steven Rostedt
On Fri, 24 Jul 2015 10:13:26 +0200 Peter Zijlstra wrote: > On Thu, Jul 23, 2015 at 02:59:56PM -0700, Linus Torvalds wrote: > > Hmmm. I thought watchpoints were "before the instruction" too, but > > that's just because I haven't used them in ages, and I didn't remember > > the details. I just

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Thu, Jul 23, 2015 at 02:54:54PM -0700, Linus Torvalds wrote: > On Thu, Jul 23, 2015 at 2:45 PM, Andy Lutomirski wrote: > > > > Or we just re-enable them on the way out of NMI (i.e. the very last > > thing we do in the NMI handler). I don't want to break regular > > userspace gdb when perf is

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Thu, Jul 23, 2015 at 02:59:46PM -0700, Andy Lutomirski wrote: > OK, new proposal: > > In do_debug, if we trip an instruction breakpoint while > !user_mode(regs) && ((regs->flags & X86_EFLAGS_IF) == 0), then disarm > *that breakpoint*. Doesn't !IF already imply that it must be kernel space?

Re: Dealing with the NMI mess

2015-07-24 Thread Willy Tarreau
On Fri, Jul 24, 2015 at 10:13:26AM +0200, Peter Zijlstra wrote: > On Thu, Jul 23, 2015 at 02:59:56PM -0700, Linus Torvalds wrote: > > Hmmm. I thought watchpoints were "before the instruction" too, but > > that's just because I haven't used them in ages, and I didn't remember > > the details. I

Re: Dealing with the NMI mess

2015-07-24 Thread Peter Zijlstra
On Thu, Jul 23, 2015 at 02:59:56PM -0700, Linus Torvalds wrote: > Hmmm. I thought watchpoints were "before the instruction" too, but > that's just because I haven't used them in ages, and I didn't remember > the details. I just looked it up. > > You're right - the memory watchpoints trigger after
