The underlying issue here isn't that Linux is being tricked into servicing the same interrupt twice; it's that one of the many, many modes of the APICs isn't doing what it's supposed to. The details of how it works are probably not important, but in order to fix it I need to know what APICs are out there and what some of their internal state is. From the manual, it sounds like, for messages from the IO APIC (the interrupt controller that handles devices), the IO APIC keeps its own local copies of that internal state, which the local APICs (the ones on the CPUs) send updates to. The local APICs arbitrate amongst themselves and don't keep copies of each other's state when they use this mechanism.
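To make the "local copies of internal state" idea concrete, here is a minimal sketch and not M5 code. Every name in it (ApicState, ApicStateTable, update, lowestPriority) is hypothetical. It assumes the flat logical destination model, where an APIC matches if its logical ID shares a bit with the message destination. It also assumes that comparing raw TPR values is a good enough stand-in for the real arbitration, with ties going to the lowest APIC ID.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical snapshot of the state the sender needs for one local APIC.
struct ApicState {
    uint8_t logicalId; // logical ID from the LDR (flat model assumed)
    uint8_t tpr;       // Task Priority Register value
};

// Hypothetical table of local APIC state, keyed by local APIC ID.
class ApicStateTable {
  public:
    // Called when a local APIC registers or reports a changed LDR/TPR.
    void update(uint8_t apicId, ApicState state) { apics[apicId] = state; }

    // Pick the target of a "Lowest Priority" message sent to a logical
    // destination. Returns an empty optional if no APIC matches.
    std::optional<uint8_t> lowestPriority(uint8_t dest) const {
        std::optional<uint8_t> best;
        uint8_t bestTpr = 0;
        for (const auto &[id, state] : apics) {
            if (!(state.logicalId & dest))
                continue; // not addressed by this message
            // Strict '<' plus ascending map order means ties keep
            // the lowest APIC ID (an assumption of this sketch).
            if (!best || state.tpr < bestTpr) {
                best = id;
                bestTpr = state.tpr;
            }
        }
        return best;
    }

  private:
    std::map<uint8_t, ApicState> apics;
};
```

In this sketch each local APIC would call update() whenever its LDR or TPR is written, and whoever sends the interrupt message would call lowestPriority() with the destination field to choose a single recipient.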
To solve the problem, there could be an object that keeps track of all the APICs in a system and knows the pertinent information to figure out where the different interrupt messages should go. That object could find out about APICs either by having them register themselves with it when they're being set up, or by having some mechanism in the Python code that gathers that information and passes it into C++. Alternatively, we could implement broadcast packets and set up something that behaves like hardware, where local APICs decide for themselves whether they should accept an interrupt message. The default choice here would be to make the system object (or maybe the platform?) keep track of APICs by having them register themselves, but the other ways sound like they could be useful for other things. Collecting aggregated information about the system's configuration in Python for building BIOS tables comes to mind. Is it worth biting the bullet and setting up something like that, or no?

If anyone is feeling industrious, the mechanism is the "Lowest Priority" delivery mode using the "Logical Destination" destination mode, and the state is the LDR (Logical Destination Register, not to be confused with the local APIC ID, not to be confused with the initial APIC ID) and the TPR (Task Priority Register). All the different widgets and modes get a bit confusing, so more or less state may be necessary.

Gabe

Gabe Black wrote:
> Never mind. There's some x86 memory weirdness going on where Linux is
> tricked into servicing the completion interrupt twice.
>
> Gabe
>
> Gabriel Michael Black wrote:
>
>> Or to rephrase, should the event be descheduled, or should it still
>> happen and the IDE disk just deal with it?
>>
>> Gabe
>>
>> Quoting Gabe Black <[email protected]>:
>>
>>> I think what's happening is that the controller is being told to stop
>>> DMAing before it's done.
>>> The controller tells the disk to abort the DMA,
>>> but the disk has already scheduled an event in the future to do the
>>> transfer. It sets itself up to not do a DMA, the event goes off, and the
>>> panic gets called. What is the right way to fix this? The chunk of trace
>>> I'm basing this on is below and ends at the point of the panic. A
>>> regular DMA is next for comparison. Note how eot is 0 until the DMA
>>> completes. In the first case, it's still zero when the DMA is stopped,
>>> and in the second it becomes 0x8000 first.
>>>
>>> Gabe
>>>
>>> 2792590196000: system.pc.south_bridge.ide: Write from offset: 0x80000000000001f7 size: 0x1 data: 0xca
>>> 2792590712000: system.pc.south_bridge.ide: Read from offset: 0x8000000000001000 size: 0x1 data: 0
>>> 2792590722500: system.pc.south_bridge.ide.primary: Starting DMA transfer
>>> 2792590722500: system.pc.south_bridge.ide: Write from offset: 0x8000000000001000 size: 0x1 data: 0x1
>>> 2792590802501: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x776cc00 (0x776cc00) byteCount:1024 (6) eot:0 sector:426053
>>> 2792590802501: system.pc.south_bridge.ide.disks0: doDmaRead, diskDelay: 1000000 totalDiskDelay: 1000002
>>> 2792591027500: system.pc.south_bridge.ide: Read from offset: 0x8000000000001002 size: 0x1 data: 0x65
>>> 2792591142000: system.pc.south_bridge.ide: Read from offset: 0x8000000000001000 size: 0x1 data: 0x1
>>> 2792591151000: system.pc.south_bridge.ide.primary: Stopping DMA transfer
>>> 2792591151000: system.pc.south_bridge.ide.disks0: Posting Interrupt
>>> 2792591151000: system.pc.south_bridge.ide: Write from offset: 0x8000000000001000 size: 0x1 data: 0
>>> 2792591157500: system.pc.south_bridge.ide: Read from offset: 0x8000000000001002 size: 0x1 data: 0x64
>>> 2792591168500: system.pc.south_bridge.ide: Write from offset: 0x8000000000001002 size: 0x1 data: 0x66
>>> 2792591205000: system.pc.south_bridge.ide.disks0: Clearing Interrupt
>>> 2792591205000: system.pc.south_bridge.ide.disks0: Read to disk at offset: 0x7 data 0x50
>>> 2792591205000: system.pc.south_bridge.ide: Read from offset: 0x80000000000001f7 size: 0x1 data: 0x50
>>>
>>> 2789322130500: system.pc.south_bridge.ide: Write from offset: 0x80000000000001f7 size: 0x1 data: 0xc8
>>> 2789322646500: system.pc.south_bridge.ide: Read from offset: 0x8000000000001000 size: 0x1 data: 0x8
>>> 2789322657000: system.pc.south_bridge.ide.primary: Starting DMA transfer
>>> 2789322657000: system.pc.south_bridge.ide: Write from offset: 0x8000000000001000 size: 0x1 data: 0x9
>>> 2789322737001: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7d9e000 (0x7d9e000) byteCount:4096 (56) eot:0 sector:568405
>>> 2789322737001: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789323897009: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7dbd000 (0x7dbd000) byteCount:4096 (48) eot:0 sector:568413
>>> 2789323897009: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789325057017: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7d75000 (0x7d75000) byteCount:4096 (40) eot:0 sector:568421
>>> 2789325057017: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789326217025: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7ca5000 (0x7ca5000) byteCount:4096 (32) eot:0 sector:568429
>>> 2789326217025: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789327377033: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7d90000 (0x7d90000) byteCount:4096 (24) eot:0 sector:568437
>>> 2789327377033: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789328537041: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7b46000 (0x7b46000) byteCount:4096 (16) eot:0 sector:568445
>>> 2789328537041: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789329697049: system.pc.south_bridge.ide.disks0: PRD: baseAddr:0x7caa000 (0x7caa000) byteCount:4096 (8) eot:0x8000 sector:568453
>>> 2789329697049: system.pc.south_bridge.ide.disks0: doDmaWrite, diskDelay: 1000000 totalDiskDelay: 1000008
>>> 2789330777057: system.pc.south_bridge.ide.disks0: Posting Interrupt
>>> 2789331068500: system.pc.south_bridge.ide: Read from offset: 0x8000000000001002 size: 0x1 data: 0x64
>>> 2789331183000: system.pc.south_bridge.ide: Read from offset: 0x8000000000001000 size: 0x1 data: 0x8
>>> 2789331192000: system.pc.south_bridge.ide: Write from offset: 0x8000000000001000 size: 0x1 data: 0x8
>>> 2789331198500: system.pc.south_bridge.ide: Read from offset: 0x8000000000001002 size: 0x1 data: 0x64
>>> 2789331209500: system.pc.south_bridge.ide: Write from offset: 0x8000000000001002 size: 0x1 data: 0x66
>>> 2789331246000: system.pc.south_bridge.ide.disks0: Clearing Interrupt
>>>
>>> Gabe Black wrote:
>>>
>>>> panic: Inconsistent DMA transfer state: dmaState = 2 devState = 0
>>>>
>>>> I think the 2 is correct. It's the same device/vendor ID but the BARs
>>>> work a little differently to be compatible with legacy IO ranges on PCs.
>>>> The chunk of log below makes me believe it's being identified as the
>>>> right device, so it's probably the accesses. I'll compare with Alpha
>>>> and see if I can figure it out. Thanks!
>>>>
>>>> Gabe
>>>>
>>>> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
>>>> PIIX4: IDE controller at PCI slot 0000:00:04.0
>>>> PIIX4: chipset revision 0
>>>> PIIX4: 100% native mode on irq 14
>>>> ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:DMA, hdb:DMA
>>>> ide1: BM-DMA at 0x1008-0x100f, BIOS settings: hdc:DMA, hdd:DMA
>>>> hda: M5 IDE Disk, ATA DISK drive
>>>> hdb: M5 IDE Disk, ATA DISK drive
>>>> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
>>>>
>>>> Ali Saidi wrote:
>>>>
>>>>> It's a panic that is supposed to check that the state machine is
>>>>> operating correctly. To start a DMA transfer the device is supposed to
>>>>> be in Transfer_Data_Dma (which some device access writes put it in)
>>>>> and the dmaState is supposed to be in Dma_Transfer (as opposed to some
>>>>> other phase of the DMA). Which one is incorrect in the panic?
>>>>>
>>>>> Are you identifying the device (PCI device/vendor IDs) exactly as we
>>>>> do for Alpha (it ends up being an Intel PIIX IDE controller or
>>>>> something)? If you're not, you might be stumbling on some
>>>>> unimplemented feature. If you are, then I would guess that it's
>>>>> something to do with I/O writes not getting to the device correctly
>>>>> (either order or just disappearing). You can boot the Alpha version
>>>>> with DPRINTFs and see the correct sequence of commands and then
>>>>> compare that to the DMA that dies for some more insight.
>>>>>
>>>>> Ali
>>>>>
>>>>> On Apr 8, 2009, at 5:46 PM, Steve Reinhardt wrote:
>>>>>
>>>>>> I don't know much about IDE either, but it may have more to do with
>>>>>> DMA in general than with IDE specifically. What is the specific
>>>>>> panic that you're getting (i.e., the actual output)?
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>> On Wed, Apr 8, 2009 at 1:36 AM, Gabe Black <[email protected]> wrote:
>>>>>> Anybody? The ATA spec I have is -really- long and I'm not feeling
>>>>>> ambitious enough to read it :). I don't know if this would be
>>>>>> covered in there anyway.
>>>>>>
>>>>>> Gabe
>>>>>>
>>>>>> Gabe Black wrote:
>>>>>>
>>>>>>> I have an SMP kernel booting and running the init process, but I'm
>>>>>>> getting the following panic from the IDE disk on line 321 in
>>>>>>> dev/ide_disk.cc:
>>>>>>>
>>>>>>> void
>>>>>>> IdeDisk::doDmaTransfer()
>>>>>>> {
>>>>>>>     if (dmaState != Dma_Transfer || devState != Transfer_Data_Dma)
>>>>>>>         panic("Inconsistent DMA transfer state: dmaState = %d devState = %d\n",
>>>>>>>               dmaState, devState);
>>>>>>>
>>>>>>> I'm not familiar with the details of IDE. Could someone please give me a
>>>>>>> 30 second explanation of what this is checking?
>>>>>>>
>>>>>>> Gabe

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
