On 01/18/16 20:33, Laszlo Ersek wrote: > On 01/18/16 19:44, John Snow wrote: >> >> >> On 01/18/2016 05:57 AM, Laszlo Ersek wrote: >>> Hello Feng, John, >>> >>> On 11/03/15 02:48, Tian Feng wrote: >>>> When executing ATAPI cmd at IDE mode, EFI_SUCCESS may be returned wrongly >>>> with old logic but in fact DRQ is not ready and the transaction doesn't >>>> get executed correctly at this time. >>>> >>>> Contributed-under: TianoCore Contribution Agreement 1.0 >>>> Signed-off-by: Feng Tian <feng.t...@intel.com> >>>> Cc: Star Zeng <star.z...@intel.com> >>>> --- >>>> MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>> b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>> index 4928ed5..f74e5ca 100644 >>>> --- a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>> +++ b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>> @@ -1949,7 +1949,7 @@ AtaPacketReadWrite ( >>>> // >>>> Status = DRQReady2 (PciIo, IdeRegisters, Timeout); >>>> if (EFI_ERROR (Status)) { >>>> - return CheckStatusRegister (PciIo, IdeRegisters); >>>> + return Status; >>>> } >>>> >>>> // >>>> >>> >>> Unfortunately, this patch breaks OVMF on QEMU's ide-cd device. The >>> symptoms were reported in <https://github.com/tianocore/edk2/issues/43>; >> >> To confirm, does this affect both the AHCI ATAPI device and the BMDMA >> ATAPI device? (Q35 and PC default HBA, respectively) Or is it just AHCI? > > I only used i440fx for testing. This is my command line: > > qemu-system-x86_64 \ > -debugcon file:ovmf.log \ > -global isa-debugcon.iobase=0x402 \ > -net none \ > -enable-kvm \ > -pflash OVMF.fd \ > -drive id=cd0,if=none,readonly,format=raw,file=f22.iso \ > -device ide-cd,drive=cd0,bootindex=0 > > If I add "-M q35" (--> AHCI), then things seem to work fine; I can reach > grub on the F22 workstation installer CD. > >>> I reproduced the issue and bisected it to this patch (git commit >>> 7cac2401 / SVN 19611). I also confirmed it by reverting the patch on top >>> of current master, and the CD-ROM started to work. >>> >>> I'm adding John Snow, who maintains QEMU's IDE emulation. >>> >>> Can you guys please investigate this patch, and discuss why it breaks on >>> QEMU's ide-cd device? >>> >>> John, if you'd like to browse the code, the following link highlights >>> the line that the above patch changes: >>> >>> https://github.com/tianocore/edk2/blob/7cac2401635317360226a235d1f95f8347081cbb/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c#L1952 >>> >>> As far as I can see, this problem could be timing-sensitive. The >>> DRQReady2() function polls for the DRQ bit to become set, in a certain >>> amount of time. If that doesn't happen (or a definitive error emerges), >>> the function fails. >>> >>> Pre-patch, such failure would not be decisive; the status register would >>> be read and evaluated. Post-patch, the failure is decisive, even if it >>> is just a timeout, and the CheckStatusRegister() call would otherwise >>> return success. >>> >>> ... Hm, I added a debug log right after DRQReady2(), on the error branch >>> (i.e., when it fails). The status code is *not* EFI_TIMEOUT, it is >>> EFI_NOT_READY. >>> >>> Looking at DRQReady2(), this means: >>> - BSY is clear >>> - ERR is clear >>> - DRQ is clear too >>> >>> Apparently, the DRQReady2() function expects that as soon as BSY is >>> clear (and there is no error), DRQ must be immediately set (more or >>> less: not busy, no error, so ready to transfer data). >>> >> >> Thank you for your research! >> >>> Is that a valid assumption to make? >>> >> >> Whenever BSY, DRQ and ERR are all clear that should be confirmation of a >> completed command. > > That's interesting, because the edk2 code seems to treat this condition > as "failure to transition to data transfer", or some such. > >> Under AHCI, this should *not* be true before the AHCI >> device has signalled completion (Either via the status shadow register, >> an interrupt, or an MSI interrupt, depending on device configuration.) >> >> (In fact, hw/ide/core.c ide_exec_cmd uses the presence of DRQ|BSY to >> know whether or not it is in a state fit to accept a new command. If the >> guest is not blocked when BSY|DRQ are unset, this means a new command >> can be issued.) >> >> == Thought #1: == >> >> However, when a command completes, it updates the shadow registers in >> the AHCI HBA right before signalling completion. It is, I think, >> possible that if you are polling the AHCI shadow registers instead of >> polling the AHCI status registers that you could find DRQ/BSY/ERR unset >> before receiving a command completed notice (by a time window of two >> passing electrons.) >> >> Take a look at ahci_cmd_done, which calls ahci_write_fis_d2h. We >> populate a guest memory FIS buffer with the values of the registers, >> then update the status shadow register, then signal completion (via IRQ, >> MSI, or neither.) >> >> You are probably not hitting that reliably, so can I ask what command is >> erroring out, here? 0xA0 PACKET for IDE, and 0x28 CMD_READ for SCSI with >> the DMA bit off? There might be a problem with the device not managing >> its state well. > > Again, I can only see it with i440fx. This is the section of the IDE > debug log from QEMU that fails: > > ide: CMD=a0 > ide: read addr=0x177 val=58 > ATAPI limit=0xfffe packet: 12 00 00 00 60 00 00 00 00 00 00 00 > reply: tx_size=36 elem_tx_size=0 index=0 > byte_count_limit=65534 > status=0x58 > ide: read status addr=0x376 val=58 > ide: read addr=0x175 val=00 > ide: read addr=0x174 val=24 > reply: tx_size=0 elem_tx_size=0 index=36 > end of transfer, status=0x50 > ide: read addr=0x177 val=50 > ide: read status addr=0x376 val=50 <---- triggers the failure in edk2 > IDE: write addr=0x176 val=0xe0 > ide: write control addr=0x376 val=0a > ide: read addr=0x177 val=50 > IDE: write addr=0x176 val=0xe0 > ide: read status addr=0x376 val=50 > IDE: write addr=0x171 val=0x00 > IDE: write addr=0x171 val=0x00 > IDE: write addr=0x172 val=0x00 > IDE: write addr=0x172 val=0x00 > IDE: write addr=0x173 val=0x00 > IDE: write addr=0x173 val=0x00 > IDE: write addr=0x174 val=0x00 > IDE: write addr=0x174 val=0xfe > IDE: write addr=0x175 val=0x00 > IDE: write addr=0x175 val=0xff > IDE: write addr=0x177 val=0xa0 > > >> >> >> == Thought #2: == >> >> We recently updated the ATAPI pio buffering loop to be asynchronous in >> cases where it can be, which included some changes to when we fiddle >> with DRQ and BSY. These changes went live for 2.5.0. Does version 2.4.1 >> work? That would point to the new buffering loop. > > Great tip, let me check. (The above is experienced on v2.5.0-618-g19b6d84.) > > With v2.4.1, i440fx fails the same, and q35 works the same. > >> Useful information here: >> >> - In the 0xA0 PACKET command sent, what is the value of the >> byte_count_limit? (this is a 16bit count comprised of the hcyl and lcyl >> fields in the packet command structure.) Depending on if it matches the >> block size of 2048 or not, we take a different buffering path. Multiples >> of 2048 are buffered asynchronously, others are buffered synchronously. > > Please see above: 0xfffe; seems to imply synchronous. > >> >> - In the 0xA0 PACKET command sent, what is the value of bit 0 in the >> "feature" field? (the ATAPI DMA bit.) Affects the buffer loop depending >> on which transport it's preparing to use. >> >> - What SCSI read command is being used? (Presumed 0x28 CMD_READ) > > Apparently 0x12 INQUIRY? > > Thanks > Laszlo > >> >> - How many sectors are being requested?
So it looks like the edk2-side code is trying to transition from HP2: Check_Status_B State to HP4: Transfer_Data State The spec says, "Transition HP2:HP4: When BSY is cleared to zero and DRQ is set to one, then the host shall make a transition to the HP4: Transfer_Data state". I think edk2 intends to read 0x60 (96 decimal) bytes of inquiry data, as data transfer. However, QEMU doesn't seem to set the DRQ bit; we have cmd_inquiry() [hw/ide/atapi.c] ide_atapi_cmd_reply() // DMA is not selected s->status = READY_STAT | SEEK_STAT; (== 0x50) However, I don't know if the inquiry response data counts as "data transfer". If it does not, then QEMU seems to be right, because then the transition is *not* to "HP4: Transfer_Data State". Instead, it could be one of the many "BSY = 0 & DRQ = 0" transitions, which mostly seem to mean "command complete". I have no clue which interpretation is correct. Thanks Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel