On 01/18/16 19:44, John Snow wrote:
> 
> 
> On 01/18/2016 05:57 AM, Laszlo Ersek wrote:
>> Hello Feng, John,
>>
>> On 11/03/15 02:48, Tian Feng wrote:
>>> When executing ATAPI cmd at IDE mode, EFI_SUCCESS may be returned wrongly
>>> with old logic but in fact DRQ is not ready and the transaction doesn't
>>> get executed correctly at this time.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Feng Tian <feng.t...@intel.com>
>>> Cc: Star Zeng <star.z...@intel.com>
>>> ---
>>>  MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c 
>>> b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>> index 4928ed5..f74e5ca 100644
>>> --- a/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>> +++ b/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c
>>> @@ -1949,7 +1949,7 @@ AtaPacketReadWrite (
>>>      //
>>>      Status = DRQReady2 (PciIo, IdeRegisters, Timeout);
>>>      if (EFI_ERROR (Status)) {
>>> -      return CheckStatusRegister (PciIo, IdeRegisters);
>>> +      return Status;
>>>      }
>>>  
>>>      //
>>>
>>
>> Unfortunately, this patch breaks OVMF on QEMU's ide-cd device. The
>> symptoms were reported in <https://github.com/tianocore/edk2/issues/43>;
> 
> To confirm, does this affect both the AHCI ATAPI device and the BMDMA
> ATAPI device? (Q35 and PC default HBA, respectively) Or is it just AHCI?

I only used i440fx for testing. This is my command line:

qemu-system-x86_64 \
  -debugcon file:ovmf.log \
  -global isa-debugcon.iobase=0x402 \
  -net none \
  -enable-kvm \
  -pflash OVMF.fd \
  -drive id=cd0,if=none,readonly,format=raw,file=f22.iso \
  -device ide-cd,drive=cd0,bootindex=0

If I add "-M q35" (--> AHCI), then things seem to work fine; I can reach
grub on the F22 workstation installer CD.

>> I reproduced the issue and bisected it to this patch (git commit
>> 7cac2401 / SVN 19611). I also confirmed it by reverting the patch on top
>> of current master, and the CD-ROM started to work.
>>
>> I'm adding John Snow, who maintains QEMU's IDE emulation.
>>
>> Can you guys please investigate this patch, and discuss why it breaks on
>> QEMU's ide-cd device?
>>
>> John, if you'd like to browse the code, the following link highlights
>> the line that the above patch changes:
>>
>> https://github.com/tianocore/edk2/blob/7cac2401635317360226a235d1f95f8347081cbb/MdeModulePkg/Bus/Ata/AtaAtapiPassThru/IdeMode.c#L1952
>>
>> As far as I can see, this problem could be timing-sensitive. The
>> DRQReady2() function polls for the DRQ bit to become set, in a certain
>> amount of time. If that doesn't happen (or a definitive error emerges),
>> the function fails.
>>
>> Pre-patch, such failure would not be decisive; the status register would
>> be read and evaluated. Post-patch, the failure is decisive, even if it
>> is just a timeout, and the CheckStatusRegister() call would otherwise
>> return success.
>>
>> ... Hm, I added a debug log right after DRQReady2(), on the error branch
>> (i.e., when it fails). The status code is *not* EFI_TIMEOUT, it is
>> EFI_NOT_READY.
>>
>> Looking at DRQReady2(), this means:
>> - BSY is clear
>> - ERR is clear
>> - DRQ is clear too
>>
>> Apparently, the DRQReady2() function expects that as soon as BSY is
>> clear (and there is no error), DRQ must be immediately set (more or
>> less: not busy, no error, so ready to transfer data).
>>
> 
> Thank you for your research!
> 
>> Is that a valid assumption to make?
>>
> 
> Whenever BSY, DRQ and ERR are all clear that should be confirmation of a
> completed command.

That's interesting, because the edk2 code seems to treat this condition
as "failure to transition to data transfer", or some such.

>                    Under AHCI, this should *not* be true before the AHCI
> device has signalled completion (Either via the status shadow register,
> an interrupt, or an MSI interrupt, depending on device configuration.)
> 
> (In fact, hw/ide/core.c ide_exec_cmd uses the presence of DRQ|BSY to
> know whether or not it is in a state fit to accept a new command. If the
> guest is not blocked when BSY|DRQ are unset, this means a new command
> can be issued.)
> 
> == Thought #1: ==
> 
> However, when a command completes, it updates the shadow registers in
> the AHCI HBA right before signalling completion. It is, I think,
> possible that if you are polling the AHCI shadow registers instead of
> polling the AHCI status registers that you could find DRQ/BSY/ERR unset
> before receiving a command completed notice (by a time window of two
> passing electrons.)
> 
> Take a look at ahci_cmd_done, which calls ahci_write_fis_d2h. We
> populate a guest memory FIS buffer with the values of the registers,
> then update the status shadow register, then signal completion (via IRQ,
> MSI, or neither.)
> 
> You are probably not hitting that reliably, so can I ask what command is
> erroring out, here? 0xA0 PACKET for IDE, and 0x28 CMD_READ for SCSI with
> the DMA bit off? There might be a problem with the device not managing
> its state well.

Again, I can only see it with i440fx. This is the section of the IDE
debug log from QEMU that fails:

ide: CMD=a0
ide: read addr=0x177 val=58
ATAPI limit=0xfffe packet: 12 00 00 00 60 00 00 00 00 00 00 00
reply: tx_size=36 elem_tx_size=0 index=0
byte_count_limit=65534
status=0x58
ide: read status addr=0x376 val=58
ide: read addr=0x175 val=00
ide: read addr=0x174 val=24
reply: tx_size=0 elem_tx_size=0 index=36
end of transfer, status=0x50
ide: read addr=0x177 val=50
ide: read status addr=0x376 val=50  <---- triggers the failure in edk2
IDE: write addr=0x176 val=0xe0
ide: write control addr=0x376 val=0a
ide: read addr=0x177 val=50
IDE: write addr=0x176 val=0xe0
ide: read status addr=0x376 val=50
IDE: write addr=0x171 val=0x00
IDE: write addr=0x171 val=0x00
IDE: write addr=0x172 val=0x00
IDE: write addr=0x172 val=0x00
IDE: write addr=0x173 val=0x00
IDE: write addr=0x173 val=0x00
IDE: write addr=0x174 val=0x00
IDE: write addr=0x174 val=0xfe
IDE: write addr=0x175 val=0x00
IDE: write addr=0x175 val=0xff
IDE: write addr=0x177 val=0xa0


> 
> 
> == Thought #2: ==
> 
> We recently updated the ATAPI pio buffering loop to be asynchronous in
> cases where it can be, which included some changes to when we fiddle
> with DRQ and BSY. These changes went live for 2.5.0. Does version 2.4.1
> work? That would point to the new buffering loop.

Great tip, let me check. (The above is experienced on v2.5.0-618-g19b6d84.)

With v2.4.1, i440fx fails the same, and q35 works the same.

> Useful information here:
> 
>       - In the 0xA0 PACKET command sent, what is the value of the
> byte_count_limit? (this is a 16bit count comprised of the hcyl and lcyl
> fields in the packet command structure.) Depending on if it matches the
> block size of 2048 or not, we take a different buffering path. Multiples
> of 2048 are buffered asynchronously, others are buffered synchronously.

Please see above: 0xfffe; seems to imply synchronous.

> 
>       - In the 0xA0 PACKET command sent, what is the value of bit 0 in the
> "feature" field? (the ATAPI DMA bit.) Affects the buffer loop depending
> on which transport it's preparing to use.
> 
>       - What SCSI read command is being used? (Presumed 0x28 CMD_READ)

Apparently 0x12 INQUIRY?

Thanks
Laszlo

> 
>       - How many sectors are being requested?
> 
> 
>> Thanks!
>> Laszlo
>>
> 

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to