On 01/18/16 20:33, Laszlo Ersek wrote:
> On 01/18/16 19:44, John Snow wrote:
>>
>>
>> On 01/18/2016 05:57 AM, Laszlo Ersek wrote:
>>> Hello Feng, John,
>>>
>>> On 11/03/15 02:48, Tian Feng wrote:
>>>> When executing ATAPI cmd at IDE mode, EFI_SUCCESS may be returned wrongly
>>>> with old logic but in fact DRQ is not ready and the transaction doesn't
>>>> get executed correctly at this time.
>>>>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Feng Tian <feng.t...@intel.com>
>>>> Cc: Star Zeng <star.z...@intel.com>
>>>> ---
>>>>  MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c 
>>>> b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>>> index 4928ed5..f74e5ca 100644
>>>> --- a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>>> +++ b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>>> @@ -1949,7 +1949,7 @@ AtaPacketReadWrite (
>>>>      //
>>>>      Status = DRQReady2 (PciIo, IdeRegisters, Timeout);
>>>>      if (EFI_ERROR (Status)) {
>>>> -      return CheckStatusRegister (PciIo, IdeRegisters);
>>>> +      return Status;
>>>>      }
>>>>  
>>>>      //
>>>>
>>>
>>> Unfortunately, this patch breaks OVMF on QEMU's ide-cd device. The
>>> symptoms were reported in <https://github.com/tianocore/edk2/issues/43>;
>>
>> To confirm, does this affect both the AHCI ATAPI device and the BMDMA
>> ATAPI device? (Q35 and PC default HBA, respectively) Or is it just AHCI?
> 
> I only used i440fx for testing. This is my command line:
> 
> qemu-system-x86_64 \
>   -debugcon file:ovmf.log \
>   -global isa-debugcon.iobase=0x402 \
>   -net none \
>   -enable-kvm \
>   -pflash OVMF.fd \
>   -drive id=cd0,if=none,readonly,format=raw,file=f22.iso \
>   -device ide-cd,drive=cd0,bootindex=0
> 
> If I add "-M q35" (--> AHCI), then things seem to work fine; I can reach
> grub on the F22 workstation installer CD.
> 
>>> I reproduced the issue and bisected it to this patch (git commit
>>> 7cac2401 / SVN 19611). I also confirmed it by reverting the patch on top
>>> of current master, and the CD-ROM started to work.
>>>
>>> I'm adding John Snow, who maintains QEMU's IDE emulation.
>>>
>>> Can you guys please investigate this patch, and discuss why it breaks on
>>> QEMU's ide-cd device?
>>>
>>> John, if you'd like to browse the code, the following link highlights
>>> the line that the above patch changes:
>>>
>>> https://github.com/tianocore/edk2/blob/7cac2401635317360226a235d1f95f8347081cbb/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c#L1952
>>>
>>> As far as I can see, this problem could be timing-sensitive. The
>>> DRQReady2() function polls for the DRQ bit to become set, in a certain
>>> amount of time. If that doesn't happen (or a definitive error emerges),
>>> the function fails.
>>>
>>> Pre-patch, such failure would not be decisive; the status register would
>>> be read and evaluated. Post-patch, the failure is decisive, even if it
>>> is just a timeout, and the CheckStatusRegister() call would otherwise
>>> return success.
>>>
>>> ... Hm, I added a debug log right after DRQReady2(), on the error branch
>>> (i.e., when it fails). The status code is *not* EFI_TIMEOUT, it is
>>> EFI_NOT_READY.
>>>
>>> Looking at DRQReady2(), this means:
>>> - BSY is clear
>>> - ERR is clear
>>> - DRQ is clear too
>>>
>>> Apparently, the DRQReady2() function expects that as soon as BSY is
>>> clear (and there is no error), DRQ must be immediately set (more or
>>> less: not busy, no error, so ready to transfer data).
>>>
>>
>> Thank you for your research!
>>
>>> Is that a valid assumption to make?
>>>
>>
>> Whenever BSY, DRQ and ERR are all clear that should be confirmation of a
>> completed command.
> 
> That's interesting, because the edk2 code seems to treat this condition
> as "failure to transition to data transfer", or some such.
> 
>>                    Under AHCI, this should *not* be true before the AHCI
>> device has signalled completion (Either via the status shadow register,
>> an interrupt, or an MSI interrupt, depending on device configuration.)
>>
>> (In fact, hw/ide/core.c ide_exec_cmd uses the presence of DRQ|BSY to
>> know whether or not it is in a state fit to accept a new command. If the
>> guest is not blocked when BSY|DRQ are unset, this means a new command
>> can be issued.)
>>
>> == Thought #1: ==
>>
>> However, when a command completes, it updates the shadow registers in
>> the AHCI HBA right before signalling completion. It is, I think,
>> possible that if you are polling the AHCI shadow registers instead of
>> polling the AHCI status registers that you could find DRQ/BSY/ERR unset
>> before receiving a command completed notice (by a time window of two
>> passing electrons.)
>>
>> Take a look at ahci_cmd_done, which calls ahci_write_fis_d2h. We
>> populate a guest memory FIS buffer with the values of the registers,
>> then update the status shadow register, then signal completion (via IRQ,
>> MSI, or neither.)
>>
>> You are probably not hitting that reliably, so can I ask what command is
>> erroring out, here? 0xA0 PACKET for IDE, and 0x28 CMD_READ for SCSI with
>> the DMA bit off? There might be a problem with the device not managing
>> its state well.
> 
> Again, I can only see it with i440fx. This is the section of the IDE
> debug log from QEMU that fails:
> 
> ide: CMD=a0
> ide: read addr=0x177 val=58
> ATAPI limit=0xfffe packet: 12 00 00 00 60 00 00 00 00 00 00 00
> reply: tx_size=36 elem_tx_size=0 index=0
> byte_count_limit=65534
> status=0x58
> ide: read status addr=0x376 val=58
> ide: read addr=0x175 val=00
> ide: read addr=0x174 val=24
> reply: tx_size=0 elem_tx_size=0 index=36
> end of transfer, status=0x50
> ide: read addr=0x177 val=50
> ide: read status addr=0x376 val=50  <---- triggers the failure in edk2
> IDE: write addr=0x176 val=0xe0
> ide: write control addr=0x376 val=0a
> ide: read addr=0x177 val=50
> IDE: write addr=0x176 val=0xe0
> ide: read status addr=0x376 val=50
> IDE: write addr=0x171 val=0x00
> IDE: write addr=0x171 val=0x00
> IDE: write addr=0x172 val=0x00
> IDE: write addr=0x172 val=0x00
> IDE: write addr=0x173 val=0x00
> IDE: write addr=0x173 val=0x00
> IDE: write addr=0x174 val=0x00
> IDE: write addr=0x174 val=0xfe
> IDE: write addr=0x175 val=0x00
> IDE: write addr=0x175 val=0xff
> IDE: write addr=0x177 val=0xa0
> 
> 
>>
>>
>> == Thought #2: ==
>>
>> We recently updated the ATAPI pio buffering loop to be asynchronous in
>> cases where it can be, which included some changes to when we fiddle
>> with DRQ and BSY. These changes went live for 2.5.0. Does version 2.4.1
>> work? That would point to the new buffering loop.
> 
> Great tip, let me check. (The above is experienced on v2.5.0-618-g19b6d84.)
> 
> With v2.4.1, i440fx fails the same, and q35 works the same.
> 
>> Useful information here:
>>
>>      - In the 0xA0 PACKET command sent, what is the value of the
>> byte_count_limit? (this is a 16bit count comprised of the hcyl and lcyl
>> fields in the packet command structure.) Depending on if it matches the
>> block size of 2048 or not, we take a different buffering path. Multiples
>> of 2048 are buffered asynchronously, others are buffered synchronously.
> 
> Please see above: 0xfffe; seems to imply synchronous.
> 
>>
>>      - In the 0xA0 PACKET command sent, what is the value of bit 0 in the
>> "feature" field? (the ATAPI DMA bit.) Affects the buffer loop depending
>> on which transport it's preparing to use.
>>
>>      - What SCSI read command is being used? (Presumed 0x28 CMD_READ)
> 
> Apparently 0x12 INQUIRY?
> 
> Thanks
> Laszlo
> 
>>
>>      - How many sectors are being requested?

So it looks like the edk2-side code is trying to transition from

  HP2: Check_Status_B State

to

  HP4: Transfer_Data State

The spec says, "Transition HP2:HP4: When BSY is cleared to zero and DRQ
is set to one, then the host shall make a transition to the HP4:
Transfer_Data state". I think edk2 intends to read 0x60 (96 decimal)
bytes of inquiry data, as data transfer.

However, QEMU doesn't seem to set the DRQ bit; we have

cmd_inquiry() [hw/ide/atapi.c]
  ide_atapi_cmd_reply()
    // DMA is not selected
    s->status = READY_STAT | SEEK_STAT; (== 0x50)

However, I don't know if the inquiry response data counts as "data
transfer".

If it does not, then QEMU seems to be right, because then the transition
is *not* to "HP4: Transfer_Data State". Instead, it could be one of the
many "BSY = 0 & DRQ = 0" transitions, which mostly seem to mean "command
complete".

I have no clue which interpretation is correct.

Thanks
Laszlo

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to