On 01/19/16 00:08, John Snow wrote: > > > On 01/18/2016 05:28 PM, Laszlo Ersek wrote: >> On 01/18/16 22:05, John Snow wrote: >>> >>> >>> On 01/18/2016 02:33 PM, Laszlo Ersek wrote: >>>> On 01/18/16 19:44, John Snow wrote: >>>>> >>>>> >>>>> On 01/18/2016 05:57 AM, Laszlo Ersek wrote: >>>>>> Hello Feng, John, >>>>>> >>>>>> On 11/03/15 02:48, Tian Feng wrote: >>>>>>> When executing ATAPI cmd at IDE mode, EFI_SUCCESS may be returned >>>>>>> wrongly >>>>>>> with old logic but in fact DRQ is not ready and the transaction doesn't >>>>>>> get executed correctly at this time. >>>>>>> >>>>>>> Contributed-under: TianoCore Contribution Agreement 1.0 >>>>>>> Signed-off-by: Feng Tian <feng.t...@intel.com> >>>>>>> Cc: Star Zeng <star.z...@intel.com> >>>>>>> --- >>>>>>> MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c | 2 +- >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>>>>> b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>>>>> index 4928ed5..f74e5ca 100644 >>>>>>> --- a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>>>>> +++ b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c >>>>>>> @@ -1949,7 +1949,7 @@ AtaPacketReadWrite ( >>>>>>> // >>>>>>> Status = DRQReady2 (PciIo, IdeRegisters, Timeout); >>>>>>> if (EFI_ERROR (Status)) { >>>>>>> - return CheckStatusRegister (PciIo, IdeRegisters); >>>>>>> + return Status; >>>>>>> } >>>>>>> >>>>>>> // >>>>>>> >>>>>> >>>>>> Unfortunately, this patch breaks OVMF on QEMU's ide-cd device. The >>>>>> symptoms were reported in <https://github.com/tianocore/edk2/issues/43>; >>>>> >>>>> To confirm, does this affect both the AHCI ATAPI device and the BMDMA >>>>> ATAPI device? (Q35 and PC default HBA, respectively) Or is it just AHCI? >>>> >>>> I only used i440fx for testing. This is my command line: >>>> >>>> qemu-system-x86_64 \ >>>> -debugcon file:ovmf.log \ >>>> -global isa-debugcon.iobase=0x402 \ >>>> -net none \ >>>> -enable-kvm \ >>>> -pflash OVMF.fd \ >>>> -drive id=cd0,if=none,readonly,format=raw,file=f22.iso \ >>>> -device ide-cd,drive=cd0,bootindex=0 >>>> >>>> If I add "-M q35" (--> AHCI), then things seem to work fine; I can reach >>>> grub on the F22 workstation installer CD. >>>> >>>>>> I reproduced the issue and bisected it to this patch (git commit >>>>>> 7cac2401 / SVN 19611). I also confirmed it by reverting the patch on top >>>>>> of current master, and the CD-ROM started to work. >>>>>> >>>>>> I'm adding John Snow, who maintains QEMU's IDE emulation. >>>>>> >>>>>> Can you guys please investigate this patch, and discuss why it breaks on >>>>>> QEMU's ide-cd device? >>>>>> >>>>>> John, if you'd like to browse the code, the following link highlights >>>>>> the line that the above patch changes: >>>>>> >>>>>> https://github.com/tianocore/edk2/blob/7cac2401635317360226a235d1f95f8347081cbb/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c#L1952 >>>>>> >>>>>> As far as I can see, this problem could be timing-sensitive. The >>>>>> DRQReady2() function polls for the DRQ bit to become set, in a certain >>>>>> amount of time. If that doesn't happen (or a definitive error emerges), >>>>>> the function fails. >>>>>> >>>>>> Pre-patch, such failure would not be decisive; the status register would >>>>>> be read and evaluated. Post-patch, the failure is decisive, even if it >>>>>> is just a timeout, and the CheckStatusRegister() call would otherwise >>>>>> return success. >>>>>> >>>>>> ... Hm, I added a debug log right after DRQReady2(), on the error branch >>>>>> (i.e., when it fails). The status code is *not* EFI_TIMEOUT, it is >>>>>> EFI_NOT_READY. >>>>>> >>>>>> Looking at DRQReady2(), this means: >>>>>> - BSY is clear >>>>>> - ERR is clear >>>>>> - DRQ is clear too >>>>>> >>>>>> Apparently, the DRQReady2() function expects that as soon as BSY is >>>>>> clear (and there is no error), DRQ must be immediately set (more or >>>>>> less: not busy, no error, so ready to transfer data). >>>>>> >>>>> >>>>> Thank you for your research! >>>>> >>>>>> Is that a valid assumption to make? >>>>>> >>>>> >>>>> Whenever BSY, DRQ and ERR are all clear that should be confirmation of a >>>>> completed command. >>>> >>>> That's interesting, because the edk2 code seems to treat this condition >>>> as "failure to transition to data transfer", or some such. >>>> >>> >>> If it's in the middle of a transfer loop, it's a weird condition to >>> encounter. We expect to see BSY and/or DRQ for the duration of the >>> command. At the end of the command, though, READY|SEEK is the normal >>> idle status. >>> >>>>> Under AHCI, this should *not* be true before the AHCI >>>>> device has signalled completion (Either via the status shadow register, >>>>> an interrupt, or an MSI interrupt, depending on device configuration.) >>>>> >>>>> (In fact, hw/ide/core.c ide_exec_cmd uses the presence of DRQ|BSY to >>>>> know whether or not it is in a state fit to accept a new command. If the >>>>> guest is not blocked when BSY|DRQ are unset, this means a new command >>>>> can be issued.) >>>>> >>>>> == Thought #1: == >>>>> >>>>> However, when a command completes, it updates the shadow registers in >>>>> the AHCI HBA right before signalling completion. It is, I think, >>>>> possible that if you are polling the AHCI shadow registers instead of >>>>> polling the AHCI status registers that you could find DRQ/BSY/ERR unset >>>>> before receiving a command completed notice (by a time window of two >>>>> passing electrons.) >>>>> >>>>> Take a look at ahci_cmd_done, which calls ahci_write_fis_d2h. We >>>>> populate a guest memory FIS buffer with the values of the registers, >>>>> then update the status shadow register, then signal completion (via IRQ, >>>>> MSI, or neither.) >>>>> >>>>> You are probably not hitting that reliably, so can I ask what command is >>>>> erroring out, here? 0xA0 PACKET for IDE, and 0x28 CMD_READ for SCSI with >>>>> the DMA bit off? There might be a problem with the device not managing >>>>> its state well. >>>> >>>> Again, I can only see it with i440fx. This is the section of the IDE >>>> debug log from QEMU that fails: >>>> >>> >>> That helps narrow it down, at least. AHCI is hard to track sometimes due >>> to all the register shadowing going on. >>> >>> Are you ready for more information than you wanted? Let's go! >>> >>>> ide: CMD=a0 >>> >>> ATAPI packet command received by ide_exec_cmd. >>> >>> (NOTE: I really should add a debug printf here in cmd_packet that tells >>> us if it's PIO or DMA, that'd be helpful!) >>> >>>> ide: read addr=0x177 val=58 >>> >>> primary status read by OVMF; 0x58 is READY | SEEK | DRQ. Interrupts are >>> reset. >>> OVMF sends the ATAPI/SCSI packet. >>> QEMU receives the last expected byte of this DRQ phase. >>> DRQ is unset. >>> ide_data_writew invokes ide_atapi_cmd as s->end_transfer_func(). >>> >>>> ATAPI limit=0xfffe packet: 12 00 00 00 60 00 00 00 00 00 00 00 >>> >>> DRQ read phase of the SCSI packet is over; command is 0x12 INQUIRY. >>> Max length OVMF is prepared to receive is 60 bytes. >>> QEMU prepares the reply in s->io_buffer. >>> Callchain is ide_atapi_cmd -> cmd_inquiry -> ide_atapi_cmd_reply. >>> Status (if pio ...) is set to 0x50, (READY | SEEK). >>> >>>> reply: tx_size=36 elem_tx_size=0 index=0 >>> >>> ide_atapi_cmd_reply_end has been called and we are preparing to send >>> data back to OVMF. >>> >>>> byte_count_limit=65534 >>> >>> OVMF has requested no more than 0xfffe bytes at a time. We only have 36 >>> bytes to send. s->hcyl and s->lcyl are updated to reflect the new limit >>> of just 36 bytes. >>> >>> QEMU starts the PIO transfer and sets +DRQ again. >>> >>>> status=0x58 >>> >>> print from ide_atapi_cmd_reply_end prints the status, +DRQ +READY +SEEK. >>> >>>> ide: read status addr=0x376 val=58 >>> >>> Alternate status read by OVMF, any pending interrupts are not reset. DRQ >>> is set, so OVMF is free to read our prepared reply. >>> >>>> ide: read addr=0x175 val=00 >>> >>> OVMF reads s->hcyl, gets 0x00. >>> >>>> ide: read addr=0x174 val=24 >>> >>> OVMF reads s->lcyl, gets 0x24. OVMF now knows to expect 0x24 = 36 bytes >>> of data. >> >> Aha! >> >>> There are no prints for clues in the middle here, but OVMF should have >>> then read 36 bytes, which causes ide_data_readw to: >>> - Clear DRQ >>> - Invoke s->end_transfer_func, which is ide_atapi_cmd_reply_end. >>> >>>> reply: tx_size=0 elem_tx_size=0 index=36 >>>> end of transfer, status=0x50 >>> >>> ide_atapi_cmd_reply_end is invoked as a callback, nothing left to >>> transfer; tx_size is 0 and the index is equal to the total transfer >>> size. DRQ remains unset and we begin cleaning up by calling >>> ide_atapi_cmd_ok: >>> >>> - status is set to 0x50 (READY | SEEK) >>> - IRQ is raised. >>> >>>> ide: read addr=0x177 val=50 >>> >>> Primary status register is read, IRQ is cleared, and status is READY|SEEK. >>> >>>> ide: read status addr=0x376 val=50 <---- triggers the failure in edk2 >>> >>> Alternate status register is read. Status is still 0x50. >>> >>> At this point, the transfer is done and EDK2 should not expect further >>> bytes. READY|SEEK should not be interpreted as an error, IMO. >>> >>> Is there some reason it's still expecting data (i.e., waiting for DRQ?) >> >> This is extremely helpful, thanks! >> >> The edk2 code is [MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c], sorry >> for the big paste: >> >> 1896 /** >> 1897 This function is called by AtaPacketCommandExecute(). >> 1898 It is used to transfer data between host and device. The data >> direction is specified >> 1899 by the fourth parameter. >> 1900 >> 1901 @param PciIo A pointer to >> ATA_ATAPI_PASS_THRU_INSTANCE data structure. >> 1902 @param IdeRegisters A pointer to EFI_IDE_REGISTERS data >> structure. >> 1903 @param Buffer Buffer contained data transferred >> between host and device. >> 1904 @param ByteCount Data size in byte unit of the buffer. >> 1905 @param Read Flag used to determine the data transfer >> direction. >> 1906 Read equals 1, means data transferred >> from device to host; >> 1907 Read equals 0, means data transferred >> from host to device. >> 1908 @param Timeout Timeout value for wait DRQ ready before >> each data stream's transfer >> 1909 , uses 100ns as a unit. >> 1910 >> 1911 @retval EFI_SUCCESS data is transferred successfully. >> 1912 @retval EFI_DEVICE_ERROR the device failed to transfer data. >> 1913 **/ >> 1914 EFI_STATUS >> 1915 EFIAPI >> 1916 AtaPacketReadWrite ( >> 1917 IN EFI_PCI_IO_PROTOCOL *PciIo, >> 1918 IN EFI_IDE_REGISTERS *IdeRegisters, >> 1919 IN OUT VOID *Buffer, >> 1920 IN UINT64 ByteCount, >> 1921 IN BOOLEAN Read, >> 1922 IN UINT64 Timeout >> 1923 ) >> 1924 { >> 1925 UINT32 RequiredWordCount; >> 1926 UINT32 ActualWordCount; >> 1927 UINT32 WordCount; >> 1928 EFI_STATUS Status; >> 1929 UINT16 *PtrBuffer; >> 1930 >> 1931 // >> 1932 // No data transfer is premitted. >> 1933 // >> 1934 if (ByteCount == 0) { >> 1935 return EFI_SUCCESS; >> 1936 } >> 1937 >> 1938 PtrBuffer = Buffer; >> 1939 RequiredWordCount = (UINT32)RShiftU64(ByteCount, 1); >> 1940 // >> 1941 // ActuralWordCount means the word count of data really >> transferred. >> 1942 // >> 1943 ActualWordCount = 0; >> 1944 >> 1945 while (ActualWordCount < RequiredWordCount) { >> 1946 // >> 1947 // before each data transfer stream, the host should poll >> DRQ bit ready, >> 1948 // to see whether indicates device is ready to transfer >> data. >> 1949 // >> 1950 Status = DRQReady2 (PciIo, IdeRegisters, Timeout); >> 1951 if (EFI_ERROR (Status)) { >> 1952 return Status; >> 1953 } >> 1954 >> 1955 // >> 1956 // get current data transfer size from Cylinder Registers. >> 1957 // >> 1958 WordCount = IdeReadPortB (PciIo, >> IdeRegisters->CylinderMsb) << 8; >> 1959 WordCount = WordCount | IdeReadPortB (PciIo, >> IdeRegisters->CylinderLsb); >> 1960 WordCount = WordCount & 0xffff; >> 1961 WordCount /= 2; >> 1962 >> 1963 WordCount = MIN (WordCount, (RequiredWordCount - >> ActualWordCount)); >> 1964 >> 1965 if (Read) { >> 1966 IdeReadPortWMultiple ( >> 1967 PciIo, >> 1968 IdeRegisters->Data, >> 1969 WordCount, >> 1970 PtrBuffer >> 1971 ); >> 1972 } else { >> 1973 IdeWritePortWMultiple ( >> 1974 PciIo, >> 1975 IdeRegisters->Data, >> 1976 WordCount, >> 1977 PtrBuffer >> 1978 ); >> 1979 } >> 1980 >> 1981 // >> 1982 // read status register to check whether error happens. >> 1983 // >> 1984 Status = CheckStatusRegister (PciIo, IdeRegisters); >> 1985 if (EFI_ERROR (Status)) { >> 1986 return EFI_DEVICE_ERROR; >> 1987 } >> 1988 >> 1989 PtrBuffer += WordCount; >> 1990 ActualWordCount += WordCount; >> 1991 } >> 1992 >> 1993 if (Read) { >> 1994 // >> 1995 // In the case where the drive wants to send more data than >> we need to read, >> 1996 // the DRQ bit will be set and cause delays from >> DRQClear2(). >> 1997 // We need to read data from the drive until it clears DRQ >> so we can move on. >> 1998 // >> 1999 AtaPacketReadPendingData (PciIo, IdeRegisters); >> 2000 } >> 2001 >> 2002 // >> 2003 // read status register to check whether error happens. >> 2004 // >> 2005 Status = CheckStatusRegister (PciIo, IdeRegisters); >> 2006 if (EFI_ERROR (Status)) { >> 2007 return EFI_DEVICE_ERROR; >> 2008 } >> 2009 >> 2010 // >> 2011 // After data transfer is completed, normally, DRQ bit should >> clear. >> 2012 // >> 2013 Status = DRQClear (PciIo, IdeRegisters, Timeout); >> 2014 if (EFI_ERROR (Status)) { >> 2015 return EFI_DEVICE_ERROR; >> 2016 } >> 2017 >> 2018 return Status; >> 2019 } >> >> I think I understand what's going on! >> >> In its current form, the loop above continues to recognize a short read -- >> same as the pre-patch version. We bail out of the loop when we notice that >> the device is willing to give us less than what we originally wanted to have. >> >> However, the exit status of the function changed! Instead of the old action >> (= "just make sure everything is fine with the status register"), we now >> directly propagate *any and all* errors from DRQReady2(). The short read >> (that is, BSY=0 && DRQ=0 && ERR=0) situation is reported by DRQReady2() as >> EFI_NOT_READY, and we blindly propagate that EFI error code outwards, >> instead of reporting it as EFI_SUCCESS. >> >> So, I think the fix here is to restore the original code, *conditional* on: >> - we are reading, not writing, *and* >> - DRQReady2() actually noticed a short read. >> >> Let me try to come up with an edk2 patch for this. >> >> Thank you! >> Laszlo >> > > Is this truly a "short read," though? The way QEMU treats the "60 bytes" > request is as a "maximum reply length," so QEMU does not really treat > the 32 byte reply as an underflow.
Maybe "short read" is not the right term then :) I meant "less data returned than the maximum possible". > INQUIRY works slightly different from other commands. It has a maximum > length instead of a deterministic length; once HCYL/LCYL are read back, > should RequiredBytes not be adjusted in this case? (This way we don't > short the loop, we just finish gracefully.) Hm, making AtaPacketReadWrite() specific to INQUIRY doesn't feel right. ... I think that the pre-patch state was actually correct for EFI_NOT_READY (i.e., BSY==0 && ERR==0 && DRQ==0), when reading. The after-loop code is superfluous for that case. I'll CC you on the patches, so you can see my reasoning in more detail. > > I don't know the full hierarchy of these commands, however, so I am > comfortable leaving it in capable hands :) > > Glad I could help either way. Thank you! Laszlo > > --js > >>> >>>> IDE: write addr=0x176 val=0xe0 >>>> ide: write control addr=0x376 val=0a >>>> ide: read addr=0x177 val=50 >>>> IDE: write addr=0x176 val=0xe0 >>>> ide: read status addr=0x376 val=50 >>>> IDE: write addr=0x171 val=0x00 >>>> IDE: write addr=0x171 val=0x00 >>>> IDE: write addr=0x172 val=0x00 >>>> IDE: write addr=0x172 val=0x00 >>>> IDE: write addr=0x173 val=0x00 >>>> IDE: write addr=0x173 val=0x00 >>>> IDE: write addr=0x174 val=0x00 >>>> IDE: write addr=0x174 val=0xfe >>>> IDE: write addr=0x175 val=0x00 >>>> IDE: write addr=0x175 val=0xff >>>> IDE: write addr=0x177 val=0xa0 >>>> >>>> >>>>> >>>>> >>>>> == Thought #2: == >>>>> >>>>> We recently updated the ATAPI pio buffering loop to be asynchronous in >>>>> cases where it can be, which included some changes to when we fiddle >>>>> with DRQ and BSY. These changes went live for 2.5.0. Does version 2.4.1 >>>>> work? That would point to the new buffering loop. >>>> >>>> Great tip, let me check. (The above is experienced on v2.5.0-618-g19b6d84.) >>>> >>>> With v2.4.1, i440fx fails the same, and q35 works the same. >>>> >>>>> Useful information here: >>>>> >>>>> - In the 0xA0 PACKET command sent, what is the value of the >>>>> byte_count_limit? (this is a 16bit count comprised of the hcyl and lcyl >>>>> fields in the packet command structure.) Depending on if it matches the >>>>> block size of 2048 or not, we take a different buffering path. Multiples >>>>> of 2048 are buffered asynchronously, others are buffered synchronously. >>>> >>>> Please see above: 0xfffe; seems to imply synchronous. >>>> >>>>> >>>>> - In the 0xA0 PACKET command sent, what is the value of bit 0 in the >>>>> "feature" field? (the ATAPI DMA bit.) Affects the buffer loop depending >>>>> on which transport it's preparing to use. >>>>> >>>>> - What SCSI read command is being used? (Presumed 0x28 CMD_READ) >>>> >>>> Apparently 0x12 INQUIRY? >>>> >>> >>> Good news and bad news. >>> >>> Bad news is I don't have any clue right now. >>> Good news is that the PIO buffering loop is annoying to debug anyway, >>> and inquiry is pretty easy. It's always synchronous. >>> >>> (Written before the above tracing.) >>> >>>> Thanks >>>> Laszlo >>>> >>>>> >>>>> - How many sectors are being requested? >>>>> >>>>> >>>>>> Thanks! >>>>>> Laszlo >>>>>> >>>>> >>>> >>> >>> >>> Thanks, >>> --js >>> >> _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel