Hi,

On Mon, Apr 18, 2011 at 10:00:02PM -0700, Philip Guenther wrote:
> > The sti was introduced in revision 1.97 of locore.s in March 2006 by
> > mickey@. Commit message:
> >
> > | prevent the faults on iret to run w/ disabled intrs and cause
> > | deadlocks; niklas toby tom ok
> >
> > Maybe mickey or one of the people giving oks back then want to comment?
> 
> That predates my involvement with OpenBSD, but I think get the gist.
> iret can fault if, for example, the user corrupts certain members of
> the sigcontext passed to a signal handler.  If that happens the the
> CPU generates an exception on the iret and you end up in trap() where
> it peeks at the instruction that triggered the trap and, seeing it was
> iret, arranges to 'return' to the resume_iret label to generate what
> looks like a normal fault against the process.  The goal of that sti
> is to keep that 'normal' fault from being processed with interrupts
> blocked.

Point taken, I missed the sigreturn path. However, note that iret can
only fault if the values of the segment registers stored on the kernel
stack are bogus. In particular iret will not fault on the new EIP address.
(The address can page fault but the fault will be on the "instruction"
pointed to by the new CS:EIP and not in iret).

With that in mind, the question is: Is it really a problem if the trap
handler runs with interrupts disabled? See below.

> So, possibility one: sti's effect is delayed to the end of next
> instruction.

This is true, the effect of sti is delayed by one instruction. This is
documented in the intel manual.

> What happens if you put it immediately before an iret
> that faults?  If the flags pushed on the stack by the fault have
> interrupts unblocked, then simply moving the sti down a line to
> between the addl and the iret would be enough.  If you can reproduce
> this situation easily than I would ask that you try that.

This solves the stack overflow that I see, but it results in
inconsistent code. See below.

> Possibility two would be to try to handle this from, uh, inside
> resume_iret I guess.  I'm not 100% sure whether it can be unilateral
> there though.

This would be my preferred solution. In particular, I see two possibilities:

* Maybe it is not even a problem is irqs are sometimes disabled during
  trap.
* It is a problem if trap runs with irqs disabled. In this case, userland
  can easily trigger this by trapping on one of the pop instructions for
  ds, es, fs or gs. Those instructions in the interrupt return path run
  with irqs disabled. Thus moving the sti in the interupt return path down
  one instruction immediately before the iret does not really help.
  The proper solution would probably be to add explicit calls to sti in all
  of the resume paths (or even explicitly before calling "trap").

    regards    Christian

Reply via email to