Re: [Qemu-devel] Re: Network connections stalling (due to lost interrupts/ticks?)
I'm seeing the same RTC error, but my systems are not hanging. I can still get to them, and they seem to handle a good load from time to time (4 running processes).

Is this a stability or performance issue? If it is a stability issue, how do I test it?

----- Original Message -----
From: Jason Wessel [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; qemu-devel@nongnu.org
Sent: Friday, August 3, 2007 8:18:50 AM
Subject: Re: [Qemu-devel] Re: Network connections stalling (due to lost interrupts/ticks?)

Charles,

Are you willing to try an experimental patch? Perhaps you could try the attached patch and post back if it happens to solve your problem.

There is most definitely a problem where qemu can get hung up indefinitely after an interrupt storm. I had not ever submitted it because there is no clean way to do this via the opaque information that is passed around. It seems wrong to have to make the ioapic a global. If this does fix the problem perhaps someone will decide to fix this up in a cleaner fashion via the opaque structures.

Jason.

Charles Duffy wrote:
> Charles Duffy wrote:
>> There's a warning on startup that the system can't set a 1024Hz timer, which persists even after I set /proc/sys/dev/rtc/max-user-freq to 1024, and I occasionally get warnings at runtime (Your time source seems to be instable or some driver is hogging interrupts).
>
> This was happening because my host kernel was compiled with CONFIG_HPET_RTC_IRQ=y. I've disabled this option, recompiled and rebooted, and it resolved the RTC warning (and apparently, the unstable time source messages) -- but my network connections are still stalling.
Re: [Qemu-devel] Re: Network connections stalling (due to lost interrupts/ticks?)
The RTC message has nothing to do with the interrupt controller load.

The patch I mentioned is a stability fix (a bug fix). It has nothing to do with performance whatsoever.

A simple test that usually breaks the qemu interrupt controller is to run ping -f against the target when using TAP. Then just run some other processes on the target, or try to use the network with telnet, or write to the disk with echo file blah ; sync... It usually doesn't last too long. It is the ping -f that keeps the interrupt load at the max.

Jason.

n schembr wrote:
> I'm seeing the same rtc error but my systems are not hanging. I can still get to them and they seem to handle a good load from time to time, 4 running proc.
>
> Is this a stability or performance issue? If it is a stability issue how do I test it?
>
> ----- Original Message -----
> From: Jason Wessel [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]; qemu-devel@nongnu.org
> Sent: Friday, August 3, 2007 8:18:50 AM
> Subject: Re: [Qemu-devel] Re: Network connections stalling (due to lost interrupts/ticks?)
>
> Charles,
>
> Are you willing to try an experimental patch? Perhaps you could try the attached patch and post back if it happens to solve your problem.
>
> There is most definitely a problem where qemu can get hung up indefinitely after an interrupt storm. I had not ever submitted it because there is no clean way to do this via the opaque information that is passed around. It seems wrong to have to make the ioapic a global. If this does fix the problem perhaps someone will decide to fix this up in a cleaner fashion via the opaque structures.
>
> Jason.
>
> Charles Duffy wrote:
>> Charles Duffy wrote:
>>> There's a warning on startup that the system can't set a 1024Hz timer, which persists even after I set /proc/sys/dev/rtc/max-user-freq to 1024, and I occasionally get warnings at runtime (Your time source seems to be instable or some driver is hogging interrupts).
>>
>> This was happening because my host kernel was compiled with CONFIG_HPET_RTC_IRQ=y. I've disabled this option, recompiled and rebooted, and it resolved the RTC warning (and apparently, the unstable time source messages) -- but my network connections are still stalling.
Re: [Qemu-devel] Re: Network connections stalling (due to lost interrupts/ticks?)
Charles,

Are you willing to try an experimental patch? Perhaps you could try the attached patch and post back if it happens to solve your problem.

There is most definitely a problem where qemu can get hung up indefinitely after an interrupt storm. I had not ever submitted it because there is no clean way to do this via the opaque information that is passed around. It seems wrong to have to make the ioapic a global. If this does fix the problem perhaps someone will decide to fix this up in a cleaner fashion via the opaque structures.

Jason.

Charles Duffy wrote:
> Charles Duffy wrote:
>> There's a warning on startup that the system can't set a 1024Hz timer, which persists even after I set /proc/sys/dev/rtc/max-user-freq to 1024, and I occasionally get warnings at runtime (Your time source seems to be instable or some driver is hogging interrupts).
>
> This was happening because my host kernel was compiled with CONFIG_HPET_RTC_IRQ=y. I've disabled this option, recompiled and rebooted, and it resolved the RTC warning (and apparently, the unstable time source messages) -- but my network connections are still stalling.

Recover from an interrupt flood by propagating the end of interrupt state.

Signed-off-by: Jason Wessel [EMAIL PROTECTED]
---
 hw/apic.c |   23 +--
 hw/pc.c   |    2 +-
 2 files changed, 22 insertions(+), 3 deletions(-)

Index: qemu/hw/apic.c
===================================================================
--- qemu.orig/hw/apic.c
+++ qemu/hw/apic.c
@@ -332,6 +332,26 @@ static void apic_set_irq(APICState *s, i
     apic_update_irq(s);
 }
 
+struct IOAPICState *ioapic;
+/* XXX Multi IOAPIC support */
+static void apic_propogate_eoi(int vector) {
+    uint32_t irr;
+    int pin;
+
+    if ((vector < 0x10) || (vector > 0xfe))
+        return;
+
+    irr = ioapic->irr;
+    while (irr) {
+        pin = ffs_bit(irr);
+        irr &= ~(1 << pin);
+        if ((ioapic->ioredtbl[pin] & 0xff) == vector) {
+            ioapic->irr &= ~(1 << pin);
+            break;
+        }
+    }
+}
+
 static void apic_eoi(APICState *s)
 {
     int isrv;
@@ -339,8 +359,7 @@ static void apic_eoi(APICState *s)
     if (isrv < 0)
         return;
     reset_bit(s->isr, isrv);
-    /* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
-            set the remote IRR bit for level triggered interrupts. */
+    apic_propogate_eoi(isrv);
     apic_update_irq(s);
 }
 
Index: qemu/hw/pc.c
===================================================================
--- qemu.orig/hw/pc.c
+++ qemu/hw/pc.c
@@ -36,7 +36,7 @@ static fdctrl_t *floppy_controller;
 static RTCState *rtc_state;
 static PITState *pit;
-static IOAPICState *ioapic;
+extern IOAPICState *ioapic;
 static PCIDevice *i440fx_state;
 
 static void ioport80_write(void *opaque, uint32_t addr, uint32_t data)
[Qemu-devel] Re: Network connections stalling (due to lost interrupts/ticks?)
Well, behavior with the patch applied is certainly different.

The large download I'm running still times out; however, it is now able to resume without needing to bring the interface down and back up. After the first timeout, though, subsequent timeouts occur with much greater frequency -- still making this multi-GB download impractical when using -net tap.

The flood ping is not killing the network connection, though it is interrupted by frequent messages:

Warning: time of day goes back (-23150us), taking countermeasures.

(This is on the high end of the time variances shown; the smallest are on the scale of 120us