Re: [Qemu-devel] fix clearing i8259 IRQ lines (Was: Should the i8259 devices remain no-user?)
On Sat, 26 Oct 2013, Matthew Ogilvie wrote:
> Although the 8259 (interrupts) model is clearly wrong with respect to clearing an IRQ request line, only one ancient unimportant guest (Microport UNIX ca. 1987) seems to care, and there are potentially significant risks to more important guests if we try to fix it:

There's at least one more guest I know about that cares, which is less ancient but maybe just as unimportant: OPENSTEP for Mach. Nevertheless, it is now a known bug which just seems to be tolerated by the OSes that are most commonly run under Qemu. What was not clear to me is how significant the risks of the fix are, and whether they were considered or the patch was simply forgotten without anyone ever giving thought to merging it.

> Risks: The 8254 (timers) model is wrong in various ways, some of which are hidden by the incorrect 8259 model, and fixing it could potentially break migration, depending on exact circumstances. Also, it isn't clear if there are other device models depending on the incorrect 8259 that would also need to be fixed.

I had the impression from the previous discussion that the main risk was a potential lost timer interrupt in some circumstances at migration, which may affect some guests, but it was not clear (to me at least) how big a risk this is. IMO, if other models depend on a bug, they are also buggy and should be fixed, but I don't know how many models that could affect.

> If someone actually showed real interest in actually merging these, including the selection of a migration compatibility strategy they would actually be willing to merge (and above: other devices, KVM, etc), I could look into updating the patches to match. But the if parts aren't looking particularly likely. This seems like a rather core-level wide-implication change for a newbie to be messing with.
> (I've already spent noticeably more time on qemu patches than I had intended to spend total on playing with this guest, although I may continue if I have a clearly defined strategy.)

I think you have already provided detailed analysis, test cases, and multiple options and patch versions, so it is not you who should spend more time on this now. What I think is needed is for the people who have the knowledge and insight to analyse and decide about the patches to give it some thought, come to a decision, and then say what to do or why it's better to leave it unfixed. Can this be done in this thread? Or maybe on one of the upcoming phone conferences, where the right people are together anyway to discuss it?

Regards,
BALATON Zoltan
Re: [Qemu-devel] fix clearing i8259 IRQ lines (Was: Should the i8259 devices remain no-user?)
On Wed, Oct 16, 2013 at 06:23:11PM +0200, Paolo Bonzini wrote:
> On 16/10/2013 18:21, BALATON Zoltan wrote:
> > A bit off topic but this reminded me of these patches:
> > http://patchwork.ozlabs.org/patch/206753/
> > http://patchwork.ozlabs.org/patch/208252/
> > which never got merged. Is there a chance that these fixes get merged sometime, or is there an explanation why it won't be fixed? As far as I remember, the patches were reviewed and multiple versions were proposed, but in the end no decision was reached on which one to merge and it was just left uncorrected.
>
> Right, thank you very much. ISTR the unanswered question was what to do about migration, but I need to reread all the threads.
>
> Paolo

Essentially correct. Although the 8259 (interrupts) model is clearly wrong with respect to clearing an IRQ request line, only one ancient unimportant guest (Microport UNIX ca. 1987) seems to care, and there are potentially significant risks to more important guests if we try to fix it:

Risks: The 8254 (timers) model is wrong in various ways, some of which are hidden by the incorrect 8259 model, and fixing it could potentially break migration, depending on exact circumstances. Also, it isn't clear if there are other device models depending on the incorrect 8259 that would also need to be fixed. Similar changes are needed in KVM for consistency, although some of the 8254 modes are implemented in a more simplistic way (pulses handled as fast as possible directly, instead of 1-millisecond-long pulses on real hardware). Note that I was never able to get my guest running successfully under KVM; I'm not sure what the remaining problems were.

Also, the patch series included a few other things:
- A couple of low priority fixes that can still be worked around without code changes, but could probably qualify as trivial patches.
- Some test cases to test for the 8259 problem.
- Plus an optional VGA hack to make it work when my ancient guest tries to directly (no BIOS) configure it for CGA text mode.
I didn't get much feedback about these.

If someone actually showed real interest in actually merging these, including the selection of a migration compatibility strategy they would actually be willing to merge (and above: other devices, KVM, etc), I could look into updating the patches to match. But the if parts aren't looking particularly likely. This seems like a rather core-level wide-implication change for a newbie to be messing with. (I've already spent noticeably more time on qemu patches than I had intended to spend total on playing with this guest, although I may continue if I have a clearly defined strategy.)

- Matthew Ogilvie