Hi misc@!

I have figured out that it is possible to get vmd(8) into a state where
1) com1_dev.rcv_pending != 0 
2) there is data pending on com1_dev.fd
3) the guest doesn't seem to care

This results in a locked-up situation where com_rcv_event() is called
indefinitely.  It seems to me that an interrupt is lost somewhere, leading
to a situation where the guest OS is happily ignorant of the available data,
while the vmm is waiting for the guest to eat it up.

This has made it impossible to install Linux via the serial console on
vmm(4).  It seems that people have previously reported "freezing" problems
in vmm(4) from time to time, but no one else has been able to reproduce
them.

I have solved the problem for myself by changing com_rcv_event() to the
following:

static void
com_rcv_event(int fd, short kind, void *arg)
{
        mutex_lock(&com1_dev.mutex);

        /*
         * We already have other data pending to be received. The data that
         * has become available now will be moved to the com port later.
         */
        if (com1_dev.rcv_pending) {
                /* If pending interrupt, inject */
                if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
                        utrace("comrcv injintr", &com1_dev.regs.lsr,
                            sizeof(com1_dev.regs.lsr));
                        /* XXX: vcpu_id */
                        vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
                        vcpu_deassert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
                }
                mutex_unlock(&com1_dev.mutex);
                return;
        }
        if (com1_dev.regs.lsr & LSR_RXRDY)
                com1_dev.rcv_pending = 1;
        else {
                com_rcv(&com1_dev, (uintptr_t) arg, 0);

                /* If pending interrupt, inject */
                if ((com1_dev.regs.iir & IIR_NOPEND) == 0) {
                        /* XXX: vcpu_id */
                        vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
                        vcpu_deassert_pic_irq((uintptr_t) arg, 0, com1_dev.irq);
                }
        }

        mutex_unlock(&com1_dev.mutex);
}

However, I have little experience with interrupt behaviour on x86.  I'm
also aware that there has been an attempt to fix this behaviour [1].

I think the problem is that when com_rcv() is called from
vcpu_process_com_data(), the interrupt is triggered using vcpu_exit_inout(),
which was not touched in the previous attempt [1] to fix the "freezing"
problem.  vcpu_exit_inout() still uses a simple vcpu_assert_pic_irq() call
to trigger the interrupt, while com_rcv_event(), for example, uses the
vcpu_assert_pic_irq(); vcpu_deassert_pic_irq() sequence to trigger it.
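
To illustrate why I suspect the difference between the two call patterns
matters, here is a toy model.  It is only a sketch and not vmd(8) code: it
assumes an edge-triggered input that latches a request only when the line
goes from low to high, which may well be a simplification of what the
emulated PIC really does.  All toy_* names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy edge-triggered interrupt input (hypothetical, not vmd code):
 * a request is latched only on a low-to-high transition of the line.
 */
struct toy_pic {
	bool line;	/* current level of the IRQ line */
	int latched;	/* number of rising edges seen so far */
};

static void
toy_assert_irq(struct toy_pic *p)
{
	if (!p->line)
		p->latched++;	/* rising edge: latch a request */
	p->line = true;
}

static void
toy_deassert_irq(struct toy_pic *p)
{
	p->line = false;
}

/*
 * Signal `events` receive events and return how many requests were
 * latched.  With pulse == false the line is only asserted; with
 * pulse == true it is asserted and then deasserted each time.
 */
static int
toy_deliveries(int events, bool pulse)
{
	struct toy_pic pic = { false, 0 };
	int i;

	assert(events >= 0);
	for (i = 0; i < events; i++) {
		toy_assert_irq(&pic);
		if (pulse)
			toy_deassert_irq(&pic);
	}
	return pic.latched;
}
```

In this model, three events signalled with assert only latch a single
request, because the line never drops and no new edge is produced, while
three assert/deassert pulses latch three.  If the real code behaves
anything like this, a lone vcpu_assert_pic_irq() could leave later
receive events unnoticed by the guest.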

With my modifications to com_rcv_event() I was able to install not only
Alpine Linux, but even Debian using the serial console.  Without the
modification I can't even install Alpine Linux via the serial console.

Any thoughts on this?  If people think my change is a sound one, I can make
a proper patch for it.  If people think the change is unsound, I would have
to look into changing vcpu_exit_inout() and probably extending its interface
so the caller can decide how the interrupt should be triggered.

1. https://marc.info/?l=openbsd-cvs&m=153115270302514&w=2

--
Wictor Lund
