On Fri, Jul 28, 2017 at 10:54:37PM -0500, Josh Poimboeuf wrote:
> On Fri, Jul 28, 2017 at 07:59:12PM +0000, Levin, Alexander (Sasha Levin)
> > On Fri, Jul 28, 2017 at 01:57:20PM -0500, Josh Poimboeuf wrote:
> > >Thanks, that's much better. I'm relieved the unwinder didn't screw that
> > >up, at least.
> > >
> > >This looks like a tricky one. Is it easily recreatable?
> > Yeah, I just hit it again with slightly different initial calls:
> Sasha sent me some data privately. As I suspected, the cause is some
> bad ORC data. Objtool incorrectly assumes that once the frame pointer
> is set up, it no longer gets touched.
So as it turns out, my pre-vacation self was completely wrong. The
problem is actually some runtime instruction patching which affects the
accuracy of the ORC data.
Take for example the lock_is_held_type() function. In vmlinux, it has
the following instruction:
callq *0xffffffff85a94880 (pv_irq_ops.save_fl)
At runtime, that instruction is patched and replaced with a fast inline
version of arch_local_save_flags() which eliminates the call:
The problem is when an interrupt hits after the push:
--- irq ---
The push offsets the stack pointer by 8 bytes, confusing the ORC
unwinder when it tries to unwind from the IRQ.
The race should be somewhat rare, though Sasha has no problems hitting
it with syzkaller.
I'm not sure what the solution should be. It will probably need to be
one of the following:
a) either don't allow runtime "alternative" patches to mess with the
stack pointer (objtool could enforce this); or
b) come up with some way to register such patches with the ORC
unwinder at runtime.
I haven't looked much at either option to see how feasible they would