On 01/19/16 00:08, John Snow wrote:
> 
> 
> On 01/18/2016 05:28 PM, Laszlo Ersek wrote:
>> On 01/18/16 22:05, John Snow wrote:
>>>
>>>
>>> On 01/18/2016 02:33 PM, Laszlo Ersek wrote:
>>>> On 01/18/16 19:44, John Snow wrote:
>>>>>
>>>>>
>>>>> On 01/18/2016 05:57 AM, Laszlo Ersek wrote:
>>>>>> Hello Feng, John,
>>>>>>
>>>>>> On 11/03/15 02:48, Tian Feng wrote:
>>>>>>> When executing ATAPI cmd at IDE mode, EFI_SUCCESS may be returned 
>>>>>>> wrongly
>>>>>>> with old logic but in fact DRQ is not ready and the transaction doesn't
>>>>>>> get executed correctly at this time.
>>>>>>>
>>>>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>>>>> Signed-off-by: Feng Tian <feng.t...@intel.com>
>>>>>>> Cc: Star Zeng <star.z...@intel.com>
>>>>>>> ---
>>>>>>>  MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c | 2 +-
>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c 
>>>>>>> b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>>>>>> index 4928ed5..f74e5ca 100644
>>>>>>> --- a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>>>>>> +++ b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>>>>>> @@ -1949,7 +1949,7 @@ AtaPacketReadWrite (
>>>>>>>      //
>>>>>>>      Status = DRQReady2 (PciIo, IdeRegisters, Timeout);
>>>>>>>      if (EFI_ERROR (Status)) {
>>>>>>> -      return CheckStatusRegister (PciIo, IdeRegisters);
>>>>>>> +      return Status;
>>>>>>>      }
>>>>>>>  
>>>>>>>      //
>>>>>>>
>>>>>>
>>>>>> Unfortunately, this patch breaks OVMF on QEMU's ide-cd device. The
>>>>>> symptoms were reported in <https://github.com/tianocore/edk2/issues/43>;
>>>>>
>>>>> To confirm, does this affect both the AHCI ATAPI device and the BMDMA
>>>>> ATAPI device? (Q35 and PC default HBA, respectively) Or is it just AHCI?
>>>>
>>>> I only used i440fx for testing. This is my command line:
>>>>
>>>> qemu-system-x86_64 \
>>>>   -debugcon file:ovmf.log \
>>>>   -global isa-debugcon.iobase=0x402 \
>>>>   -net none \
>>>>   -enable-kvm \
>>>>   -pflash OVMF.fd \
>>>>   -drive id=cd0,if=none,readonly,format=raw,file=f22.iso \
>>>>   -device ide-cd,drive=cd0,bootindex=0
>>>>
>>>> If I add "-M q35" (--> AHCI), then things seem to work fine; I can reach
>>>> grub on the F22 workstation installer CD.
>>>>
>>>>>> I reproduced the issue and bisected it to this patch (git commit
>>>>>> 7cac2401 / SVN 19611). I also confirmed it by reverting the patch on top
>>>>>> of current master, and the CD-ROM started to work.
>>>>>>
>>>>>> I'm adding John Snow, who maintains QEMU's IDE emulation.
>>>>>>
>>>>>> Can you guys please investigate this patch, and discuss why it breaks on
>>>>>> QEMU's ide-cd device?
>>>>>>
>>>>>> John, if you'd like to browse the code, the following link highlights
>>>>>> the line that the above patch changes:
>>>>>>
>>>>>> https://github.com/tianocore/edk2/blob/7cac2401635317360226a235d1f95f8347081cbb/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c#L1952
>>>>>>
>>>>>> As far as I can see, this problem could be timing-sensitive. The
>>>>>> DRQReady2() function polls for the DRQ bit to become set, in a certain
>>>>>> amount of time. If that doesn't happen (or a definitive error emerges),
>>>>>> the function fails.
>>>>>>
>>>>>> Pre-patch, such failure would not be decisive; the status register would
>>>>>> be read and evaluated. Post-patch, the failure is decisive, even if it
>>>>>> is just a timeout, and the CheckStatusRegister() call would otherwise
>>>>>> return success.
>>>>>>
>>>>>> ... Hm, I added a debug log right after DRQReady2(), on the error branch
>>>>>> (i.e., when it fails). The status code is *not* EFI_TIMEOUT, it is
>>>>>> EFI_NOT_READY.
>>>>>>
>>>>>> Looking at DRQReady2(), this means:
>>>>>> - BSY is clear
>>>>>> - ERR is clear
>>>>>> - DRQ is clear too
>>>>>>
>>>>>> Apparently, the DRQReady2() function expects that as soon as BSY is
>>>>>> clear (and there is no error), DRQ must be immediately set (more or
>>>>>> less: not busy, no error, so ready to transfer data).
>>>>>>
>>>>>
>>>>> Thank you for your research!
>>>>>
>>>>>> Is that a valid assumption to make?
>>>>>>
>>>>>
>>>>> Whenever BSY, DRQ and ERR are all clear that should be confirmation of a
>>>>> completed command.
>>>>
>>>> That's interesting, because the edk2 code seems to treat this condition
>>>> as "failure to transition to data transfer", or some such.
>>>>
>>>
>>> If it's in the middle of a transfer loop, it's a weird condition to
>>> encounter. We expect to see BSY and/or DRQ for the duration of the
>>> command. At the end of the command, though, READY|SEEK is the normal
>>> idle status.
>>>
>>>>>                    Under AHCI, this should *not* be true before the AHCI
>>>>> device has signalled completion (Either via the status shadow register,
>>>>> an interrupt, or an MSI interrupt, depending on device configuration.)
>>>>>
>>>>> (In fact, hw/ide/core.c ide_exec_cmd uses the presence of DRQ|BSY to
>>>>> know whether or not it is in a state fit to accept a new command. If the
>>>>> guest is not blocked when BSY|DRQ are unset, this means a new command
>>>>> can be issued.)
>>>>>
>>>>> == Thought #1: ==
>>>>>
>>>>> However, when a command completes, it updates the shadow registers in
>>>>> the AHCI HBA right before signalling completion. It is, I think,
>>>>> possible that if you are polling the AHCI shadow registers instead of
>>>>> polling the AHCI status registers that you could find DRQ/BSY/ERR unset
>>>>> before receiving a command completed notice (by a time window of two
>>>>> passing electrons.)
>>>>>
>>>>> Take a look at ahci_cmd_done, which calls ahci_write_fis_d2h. We
>>>>> populate a guest memory FIS buffer with the values of the registers,
>>>>> then update the status shadow register, then signal completion (via IRQ,
>>>>> MSI, or neither.)
>>>>>
>>>>> You are probably not hitting that reliably, so can I ask what command is
>>>>> erroring out, here? 0xA0 PACKET for IDE, and 0x28 CMD_READ for SCSI with
>>>>> the DMA bit off? There might be a problem with the device not managing
>>>>> its state well.
>>>>
>>>> Again, I can only see it with i440fx. This is the section of the IDE
>>>> debug log from QEMU that fails:
>>>>
>>>
>>> That helps narrow it down, at least. AHCI is hard to track sometimes due
>>> to all the register shadowing going on.
>>>
>>> Are you ready for more information than you wanted? Let's go!
>>>
>>>> ide: CMD=a0
>>>
>>> ATAPI packet command received by ide_exec_cmd.
>>>
>>> (NOTE: I really should add a debug printf here in cmd_packet that tells
>>> us if it's PIO or DMA, that'd be helpful!)
>>>
>>>> ide: read addr=0x177 val=58
>>>
>>> primary status read by OVMF; 0x58 is READY | SEEK | DRQ. Interrupts are
>>> reset.
>>> OVMF sends the ATAPI/SCSI packet.
>>> QEMU receives the last expected byte of this DRQ phase.
>>> DRQ is unset.
>>> ide_data_writew invokes ide_atapi_cmd as s->end_transfer_func().
>>>
>>>> ATAPI limit=0xfffe packet: 12 00 00 00 60 00 00 00 00 00 00 00
>>>
>>> DRQ read phase of the SCSI packet is over; command is 0x12 INQUIRY.
>>> Max length OVMF is prepared to receive is 60 bytes.
>>> QEMU prepares the reply in s->io_buffer.
>>> Callchain is ide_atapi_cmd -> cmd_inquiry -> ide_atapi_cmd_reply.
>>> Status (if pio ...) is set to 0x50, (READY | SEEK).
>>>
>>>> reply: tx_size=36 elem_tx_size=0 index=0
>>>
>>> ide_atapi_cmd_reply_end has been called and we are preparing to send
>>> data back to OVMF.
>>>
>>>> byte_count_limit=65534
>>>
>>> OVMF has requested no more than 0xfffe bytes at a time. We only have 36
>>> bytes to send. s->hcyl and s->lcyl are updated to reflect the new limit
>>> of just 36 bytes.
>>>
>>> QEMU starts the PIO transfer and sets +DRQ again.
>>>
>>>> status=0x58
>>>
>>> print from ide_atapi_cmd_reply_end prints the status, +DRQ +READY +SEEK.
>>>
>>>> ide: read status addr=0x376 val=58
>>>
>>> Alternate status read by OVMF, any pending interrupts are not reset. DRQ
>>> is set, so OVMF is free to read our prepared reply.
>>>
>>>> ide: read addr=0x175 val=00
>>>
>>> OVMF reads s->hcyl, gets 0x00.
>>>
>>>> ide: read addr=0x174 val=24
>>>
>>> OVMF reads s->lcyl, gets 0x24. OVMF now knows to expect 0x24 = 36 bytes
>>> of data.
>>
>> Aha!
>>
>>> There are no prints for clues in the middle here, but OVMF should have
>>> then read 36 bytes, which causes ide_data_readw to:
>>> - Clear DRQ
>>> - Invoke s->end_transfer_func, which is ide_atapi_cmd_reply_end.
>>>
>>>> reply: tx_size=0 elem_tx_size=0 index=36
>>>> end of transfer, status=0x50
>>>
>>> ide_atapi_cmd_reply_end is invoked as a callback, nothing left to
>>> transfer; tx_size is 0 and the index is equal to the total transfer
>>> size. DRQ remains unset and we begin cleaning up by calling
>>> ide_atapi_cmd_ok:
>>>
>>> - status is set to 0x50 (READY | SEEK)
>>> - IRQ is raised.
>>>
>>>> ide: read addr=0x177 val=50
>>>
>>> Primary status register is read, IRQ is cleared, and status is READY|SEEK.
>>>
>>>> ide: read status addr=0x376 val=50  <---- triggers the failure in edk2
>>>
>>> Alternate status register is read. Status is still 0x50.
>>>
>>> At this point, the transfer is done and EDK2 should not expect further
>>> bytes. READY|SEEK should not be interpreted as an error, IMO.
>>>
>>> Is there some reason it's still expecting data (i.e., waiting for DRQ?)
>>
>> This is extremely helpful, thanks!
>>
>> The edk2 code is [MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c], sorry 
>> for the big paste:
>>
>>   1896       /**
>>   1897         This function is called by AtaPacketCommandExecute().
>>   1898         It is used to transfer data between host and device. The data 
>> direction is specified
>>   1899         by the fourth parameter.
>>   1900       
>>   1901         @param PciIo         A pointer to 
>> ATA_ATAPI_PASS_THRU_INSTANCE data structure.
>>   1902         @param IdeRegisters  A pointer to EFI_IDE_REGISTERS data 
>> structure.
>>   1903         @param Buffer        Buffer contained data transferred 
>> between host and device.
>>   1904         @param ByteCount     Data size in byte unit of the buffer.
>>   1905         @param Read          Flag used to determine the data transfer 
>> direction.
>>   1906                              Read equals 1, means data transferred 
>> from device to host;
>>   1907                              Read equals 0, means data transferred 
>> from host to device.
>>   1908         @param Timeout       Timeout value for wait DRQ ready before 
>> each data stream's transfer
>>   1909                              , uses 100ns as a unit.
>>   1910       
>>   1911         @retval EFI_SUCCESS      data is transferred successfully.
>>   1912         @retval EFI_DEVICE_ERROR the device failed to transfer data.
>>   1913       **/
>>   1914       EFI_STATUS
>>   1915       EFIAPI
>>   1916       AtaPacketReadWrite (
>>   1917         IN     EFI_PCI_IO_PROTOCOL       *PciIo,
>>   1918         IN     EFI_IDE_REGISTERS         *IdeRegisters,
>>   1919         IN OUT VOID                      *Buffer,
>>   1920         IN     UINT64                    ByteCount,
>>   1921         IN     BOOLEAN                   Read,
>>   1922         IN     UINT64                    Timeout
>>   1923         )
>>   1924       {
>>   1925         UINT32      RequiredWordCount;
>>   1926         UINT32      ActualWordCount;
>>   1927         UINT32      WordCount;
>>   1928         EFI_STATUS  Status;
>>   1929         UINT16      *PtrBuffer;
>>   1930       
>>   1931         //
>>   1932         // No data transfer is premitted.
>>   1933         //
>>   1934         if (ByteCount == 0) {
>>   1935           return EFI_SUCCESS;
>>   1936         }
>>   1937       
>>   1938         PtrBuffer         = Buffer;
>>   1939         RequiredWordCount = (UINT32)RShiftU64(ByteCount, 1);
>>   1940         //
>>   1941         // ActuralWordCount means the word count of data really 
>> transferred.
>>   1942         //
>>   1943         ActualWordCount = 0;
>>   1944       
>>   1945         while (ActualWordCount < RequiredWordCount) {
>>   1946           //
>>   1947           // before each data transfer stream, the host should poll 
>> DRQ bit ready,
>>   1948           // to see whether indicates device is ready to transfer 
>> data.
>>   1949           //
>>   1950           Status = DRQReady2 (PciIo, IdeRegisters, Timeout);
>>   1951           if (EFI_ERROR (Status)) {
>>   1952             return Status;
>>   1953           }
>>   1954       
>>   1955           //
>>   1956           // get current data transfer size from Cylinder Registers.
>>   1957           //
>>   1958           WordCount  = IdeReadPortB (PciIo, 
>> IdeRegisters->CylinderMsb) << 8;
>>   1959           WordCount  = WordCount | IdeReadPortB (PciIo, 
>> IdeRegisters->CylinderLsb);
>>   1960           WordCount  = WordCount & 0xffff;
>>   1961           WordCount /= 2;
>>   1962       
>>   1963           WordCount = MIN (WordCount, (RequiredWordCount - 
>> ActualWordCount));
>>   1964       
>>   1965           if (Read) {
>>   1966             IdeReadPortWMultiple (
>>   1967               PciIo,
>>   1968               IdeRegisters->Data,
>>   1969               WordCount,
>>   1970               PtrBuffer
>>   1971               );
>>   1972           } else {
>>   1973             IdeWritePortWMultiple (
>>   1974               PciIo,
>>   1975               IdeRegisters->Data,
>>   1976               WordCount,
>>   1977               PtrBuffer
>>   1978               );
>>   1979           }
>>   1980       
>>   1981           //
>>   1982           // read status register to check whether error happens.
>>   1983           //
>>   1984           Status = CheckStatusRegister (PciIo, IdeRegisters);
>>   1985           if (EFI_ERROR (Status)) {
>>   1986             return EFI_DEVICE_ERROR;
>>   1987           }
>>   1988       
>>   1989           PtrBuffer       += WordCount;
>>   1990           ActualWordCount += WordCount;
>>   1991         }
>>   1992       
>>   1993         if (Read) {
>>   1994           //
>>   1995           // In the case where the drive wants to send more data than 
>> we need to read,
>>   1996           // the DRQ bit will be set and cause delays from 
>> DRQClear2().
>>   1997           // We need to read data from the drive until it clears DRQ 
>> so we can move on.
>>   1998           //
>>   1999           AtaPacketReadPendingData (PciIo, IdeRegisters);
>>   2000         }
>>   2001       
>>   2002         //
>>   2003         // read status register to check whether error happens.
>>   2004         //
>>   2005         Status = CheckStatusRegister (PciIo, IdeRegisters);
>>   2006         if (EFI_ERROR (Status)) {
>>   2007           return EFI_DEVICE_ERROR;
>>   2008         }
>>   2009       
>>   2010         //
>>   2011         // After data transfer is completed, normally, DRQ bit should 
>> clear.
>>   2012         //
>>   2013         Status = DRQClear (PciIo, IdeRegisters, Timeout);
>>   2014         if (EFI_ERROR (Status)) {
>>   2015           return EFI_DEVICE_ERROR;
>>   2016         }
>>   2017       
>>   2018         return Status;
>>   2019       }
>>
>> I think I understand what's going on!
>>
>> In its current form, the loop above continues to recognize a short read -- 
>> same as the pre-patch version. We bail out of the loop when we notice that 
>> the device is willing to give us less than what we originally wanted to have.
>>
>> However, the exit status of the function changed! Instead of the old action 
>> (= "just make sure everything is fine with the status register"), we now 
>> directly propagate *any and all* errors from DRQReady2(). The short read 
>> (that is, BSY=0 && DRQ=0 && ERR=0) situation is reported by DRQReady2() as 
>> EFI_NOT_READY, and we blindly propagate that EFI error code outwards, 
>> instead of reporting it as EFI_SUCCESS.
>>
>> So, I think the fix here is to restore the original code, *conditional* on:
>> - we are reading, not writing, *and*
>> - DRQReady2() actually noticed a short read.
>>
>> Let me try to come up with an edk2 patch for this.
>>
>> Thank you!
>> Laszlo
>>
> 
> Is this truly a "short read," though? The way QEMU treats the "60 bytes"
> request is as a "maximum reply length," so QEMU does not really treat
> the 32 byte reply as an underflow.

Maybe "short read" is not the right term then :) I meant "less data
returned than the maximum possible".

> INQUIRY works slightly different from other commands. It has a maximum
> length instead of a deterministic length; once HCYL/LCYL are read back,
> should RequiredBytes not be adjusted in this case? (This way we don't
> short the loop, we just finish gracefully.)

Hm, making AtaPacketReadWrite() specific to INQUIRY doesn't feel right.

... I think that the pre-patch state was actually correct for
EFI_NOT_READY (i.e., BSY==0 && ERR==0 && DRQ==0), when reading. The
after-loop code is superfluous for that case.

I'll CC you on the patches, so you can see my reasoning in more detail.

> 
> I don't know the full hierarchy of these commands, however, so I am
> comfortable leaving it in capable hands :)
> 
> Glad I could help either way.

Thank you!
Laszlo

> 
> --js
> 
>>>
>>>> IDE: write addr=0x176 val=0xe0
>>>> ide: write control addr=0x376 val=0a
>>>> ide: read addr=0x177 val=50
>>>> IDE: write addr=0x176 val=0xe0
>>>> ide: read status addr=0x376 val=50
>>>> IDE: write addr=0x171 val=0x00
>>>> IDE: write addr=0x171 val=0x00
>>>> IDE: write addr=0x172 val=0x00
>>>> IDE: write addr=0x172 val=0x00
>>>> IDE: write addr=0x173 val=0x00
>>>> IDE: write addr=0x173 val=0x00
>>>> IDE: write addr=0x174 val=0x00
>>>> IDE: write addr=0x174 val=0xfe
>>>> IDE: write addr=0x175 val=0x00
>>>> IDE: write addr=0x175 val=0xff
>>>> IDE: write addr=0x177 val=0xa0
>>>>
>>>>
>>>>>
>>>>>
>>>>> == Thought #2: ==
>>>>>
>>>>> We recently updated the ATAPI pio buffering loop to be asynchronous in
>>>>> cases where it can be, which included some changes to when we fiddle
>>>>> with DRQ and BSY. These changes went live for 2.5.0. Does version 2.4.1
>>>>> work? That would point to the new buffering loop.
>>>>
>>>> Great tip, let me check. (The above is experienced on v2.5.0-618-g19b6d84.)
>>>>
>>>> With v2.4.1, i440fx fails the same, and q35 works the same.
>>>>
>>>>> Useful information here:
>>>>>
>>>>>   - In the 0xA0 PACKET command sent, what is the value of the
>>>>> byte_count_limit? (this is a 16bit count comprised of the hcyl and lcyl
>>>>> fields in the packet command structure.) Depending on if it matches the
>>>>> block size of 2048 or not, we take a different buffering path. Multiples
>>>>> of 2048 are buffered asynchronously, others are buffered synchronously.
>>>>
>>>> Please see above: 0xfffe; seems to imply synchronous.
>>>>
>>>>>
>>>>>   - In the 0xA0 PACKET command sent, what is the value of bit 0 in the
>>>>> "feature" field? (the ATAPI DMA bit.) Affects the buffer loop depending
>>>>> on which transport it's preparing to use.
>>>>>
>>>>>   - What SCSI read command is being used? (Presumed 0x28 CMD_READ)
>>>>
>>>> Apparently 0x12 INQUIRY?
>>>>
>>>
>>> Good news and bad news.
>>>
>>> Bad news is I don't have any clue right now.
>>> Good news is that the PIO buffering loop is annoying to debug anyway,
>>> and inquiry is pretty easy. It's always synchronous.
>>>
>>> (Written before the above tracing.)
>>>
>>>> Thanks
>>>> Laszlo
>>>>
>>>>>
>>>>>   - How many sectors are being requested?
>>>>>
>>>>>
>>>>>> Thanks!
>>>>>> Laszlo
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> Thanks,
>>> --js
>>>
>>

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to