On 27/03/2020 14:26, Laszlo Ersek wrote:
On 03/25/20 17:10, Liran Alon wrote:
+/**
+  Returns if PVSCSI request ring is full
+**/
+STATIC
+BOOLEAN
+PvScsiIsReqRingFull (
+  IN CONST PVSCSI_DEV   *Dev
+  )
+{
+  PVSCSI_RINGS_STATE *RingsState;
+  UINT32             ReqNumEntries;
+
+  RingsState = Dev->RingDesc.RingState;
+  ReqNumEntries = 1U << RingsState->ReqNumEntriesLog2;
+  return (RingsState->ReqProdIdx - RingsState->CmpConsIdx) >= ReqNumEntries;
+}
(Just some thoughts, not a request for changing the code.)

Normally I prefer accessing buffers shared with the device through
volatile-qualified pointers.

Meaning, in this case, that every "PCI host" pointer (i.e., each pointer
that is associated with a PVSCSI_DMA_DESC) would have to be
volatile-qualified. In particular:

- in patch#13, PVSCSI_RING_DESC would have to be updated like this:

typedef struct {
   volatile PVSCSI_RINGS_STATE   *RingState;
   PVSCSI_DMA_DESC               RingStateDmaDesc;

   volatile PVSCSI_RING_REQ_DESC *RingReqs;
   PVSCSI_DMA_DESC               RingReqsDmaDesc;

   volatile PVSCSI_RING_CMP_DESC *RingCmps;
   PVSCSI_DMA_DESC               RingCmpsDmaDesc;
} PVSCSI_RING_DESC;
- in patch#14, PVSCSI_DEV would change as follows:

typedef struct {
   UINT32                          Signature;
   EFI_PCI_IO_PROTOCOL             *PciIo;
   EFI_EVENT                       ExitBoot;
   UINT64                          OriginalPciAttributes;
   PVSCSI_RING_DESC                RingDesc;
   volatile PVSCSI_DMA_BUFFER      *DmaBuf;
   PVSCSI_DMA_DESC                 DmaBufDmaDesc;
   UINT8                           MaxTarget;
   UINT8                           MaxLun;
   UINTN                           WaitForCmpStallInUsecs;
   EFI_EXT_SCSI_PASS_THRU_PROTOCOL PassThru;
   EFI_EXT_SCSI_PASS_THRU_MODE     PassThruMode;
} PVSCSI_DEV;
After these changes, the compiler would (justifiedly) flag a bunch of
code locations casting away the volatile qualification -- for example,
in the above function, in the assignment to the "RingsState" local
variable.

Clearly, most of these compilation errors would have to be fixed (not
suppressed), because they would be valid. Meaning:

- you'd have to volatile-qualify the "RingsState" local variable in all
   of PvScsiIsReqRingFull(), PvScsiGetCurrentRequest(),
   PvScsiWaitForRequestCompletion();

- you'd also have to volatile-qualify the return types of
   PvScsiGetCurrentRequest() and PvScsiWaitForRequestCompletion();

- you'd have to update PopulateRequest() and HandleResponse() too; and
   the most annoying part of that would be that you could no longer use
   CopyMem() and ZeroMem() -- because those functions take
   pointer-to-void parameters, rather than pointer-to-volatile-void ones.

(FWIW, we wouldn't have to change the PvScsiFreeSharedPages() prototype
-- it would be OK to cast away volatile in those calls, as we wouldn't
dereference the pointers in that case.)
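
To illustrate the CopyMem() / ZeroMem() point above, here is a minimal sketch of what volatile-aware replacements might look like. VolatileCopyFrom() and VolatileZero() are hypothetical names that do not exist in edk2, and standard C types are used instead of UINT8 / UINTN so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CopyMem() when the source is a device-shared,
   volatile-qualified buffer: every byte is read through the volatile
   pointer, so the compiler may not elide or cache the accesses. */
static void
VolatileCopyFrom (
  uint8_t                *Destination,
  const volatile uint8_t *Source,
  size_t                 Length
  )
{
  size_t Index;

  assert (Destination != NULL && Source != NULL);
  for (Index = 0; Index < Length; Index++) {
    Destination[Index] = Source[Index];
  }
}

/* Hypothetical stand-in for ZeroMem() on a volatile-qualified buffer. */
static void
VolatileZero (
  volatile uint8_t *Buffer,
  size_t           Length
  )
{
  size_t Index;

  assert (Buffer != NULL);
  for (Index = 0; Index < Length; Index++) {
    Buffer[Index] = 0;
  }
}
```

Byte-wise loops like these are the cost of the volatile approach: the optimized library copy routines cannot be used without casting the qualifier away.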

So... the reason I'm not actually requesting these
volatile-qualifications is that (a) your use of MemoryFence() seems
mostly OK, and (b) the UEFI Driver Writer's guide recommends *either*
volatile *or* MemoryFence(). Of course using both techniques at the same
time is not a problem -- and in code I write I actually like to use both
at the same time --, but just one suffices too. (See section 4.2.6
"Memory ordering" in the DWG.)

The reason I'm writing this up here is because I want the "record" (the
mailing list archive) to show that we have considered this topic
explicitly.
I prefer to stay with memory fences only, as the code is written now, if that's OK with you. In my opinion this allows for potential compiler optimizations and leads to more readable code.
Back to your patch:

On 03/25/20 17:10, Liran Alon wrote:
+  //
+  // This cast is safe as MaxLun is defined as UINT8
+  //
+  Request->Lun[1] = (UINT8)Lun;
+  Request->SenseLen = Packet->SenseDataLength;
Ah, *now* I understand why you chose MAX_UINT8 as the size of
"PVSCSI_DMA_BUFFER.SenseData". Because, "Packet->SenseDataLength" has
type UINT8, and this way you guarantee that the SCSI client's
"Packet->SenseDataLength" will always fit in the DMA buffer.

Good solution, but it *absolutely* needs to be documented in patch#14
("OvmfPkg/PvScsiDxe: Introduce DMA communication buffer") -- in fact,
see my question (4) under patch#14.
Please read my response to your patch#14 review, where I suggest that we define a constant in IndustryStandard/Scsi.h for the maximum total length of SenseData, which is 252 according to the SCSI specification.

(2) Also, please add a comment here that a "Dev->DmaBuf->SenseData"
overflow is not possible due to "Packet->SenseDataLength" having type
UINT8.

This would be a comment in the same vein as the "MaxLun" reference just
above -- I find *that* comment very helpful, too.
OK.
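
As a possible complement to such a comment, the "cannot overflow" argument could also be stated as a compile-time check. The following is only a sketch: DMA_BUFFER_SKETCH, SENSE_DATA_CAPACITY and CopySenseIn() are hypothetical names standing in for PVSCSI_DMA_BUFFER, its SenseData size and the copy into the DMA buffer, and standard C types replace the edk2 ones so the snippet is self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stands in for the MAX_UINT8-sized SenseData array discussed above. */
#define SENSE_DATA_CAPACITY  UINT8_MAX

typedef struct {
  uint8_t SenseData[SENSE_DATA_CAPACITY];
} DMA_BUFFER_SKETCH;

/* The length field is 8 bits wide, so its largest value is UINT8_MAX.
   If the buffer holds at least that many bytes, no length can overflow it. */
_Static_assert (
  SENSE_DATA_CAPACITY >= UINT8_MAX,
  "SenseData must hold any length representable in a uint8_t"
  );

/* Copy caller-provided sense data into the buffer; no runtime bounds check
   is needed because of the compile-time check above. */
static uint8_t
CopySenseIn (
  DMA_BUFFER_SKETCH *Buffer,
  const uint8_t     *Sense,
  uint8_t           SenseLength
  )
{
  assert (Buffer != NULL && Sense != NULL);
  memcpy (Buffer->SenseData, Sense, SenseLength);
  return SenseLength;
}
```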
+
+  return EFI_SUCCESS;
+}
+
+/**
+  Handle the PVSCSI device response:
+  - Copy returned data from DMA communication buffer.
+  - Update fields in Extended SCSI Pass Thru Protocol packet as required.
+  - Translate response code to EFI status code and host adapter status.
+**/
+STATIC
+EFI_STATUS
+HandleResponse (
+  IN PVSCSI_DEV                                     *Dev,
+  IN OUT EFI_EXT_SCSI_PASS_THRU_SCSI_REQUEST_PACKET *Packet,
+  IN CONST PVSCSI_RING_CMP_DESC                     *Response
+  )
+{
+  //
+  // Check if device returned sense data
+  //
+  if (Response->ScsiStatus == EFI_EXT_SCSI_STATUS_TARGET_CHECK_CONDITION) {
+    //
+    // Fix SenseDataLength to amount of data returned
+    //
+    if (Packet->SenseDataLength > Response->SenseLen) {
+      Packet->SenseDataLength = (UINT8)Response->SenseLen;
+    }
+    //
+    // Copy sense data from DMA communication buffer
+    //
+    CopyMem (
+      Packet->SenseData,
+      Dev->DmaBuf->SenseData,
+      Packet->SenseDataLength
+      );
+  } else {
+    //
+    // Signal no sense data returned
+    //
+    Packet->SenseDataLength = 0;
+  }
+
+  //
+  // Copy device output from DMA communication buffer
+  //
+  if (Packet->DataDirection == EFI_EXT_SCSI_DATA_DIRECTION_READ) {
+    CopyMem (Packet->InDataBuffer, Dev->DmaBuf->Data, Packet->InTransferLength);
+  }
I'm unfamiliar with the PVSCSI device model, but I think this is not
general enough. The "PVSCSI_RING_CMP_DESC.DataLen" field suggests that
short reads are possible at least in theory.

(5) If a short read occurs (Response->DataLen <
Packet->InTransferLength), then we should adjust
"Packet->InTransferLength", and also copy that many bytes only.

(6) I think it would be prudent to update "Packet->OutTransferLength"
too, for short writes.
As you can see below, this is done when the device returns a Response->HostStatus of either PvScsiBtStatDatarun or PvScsiBtStatDataUnderrun.

+
+  //
+  // Report target status
+  //
+  Packet->TargetStatus = Response->ScsiStatus;
+
+  //
+  // Host adapter status and function return value depend on
+  // device response's host status
+  //
+  switch (Response->HostStatus) {
+    case PvScsiBtStatSuccess:
+    case PvScsiBtStatLinkedCommandCompleted:
+    case PvScsiBtStatLinkedCommandCompletedWithFlag:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_OK;
+      return EFI_SUCCESS;
+
+    case PvScsiBtStatSelTimeout:
+      Packet->HostAdapterStatus =
+                EFI_EXT_SCSI_STATUS_HOST_ADAPTER_SELECTION_TIMEOUT;
+      return EFI_TIMEOUT;
+
+    case PvScsiBtStatDatarun:
+    case PvScsiBtStatDataUnderrun:
+      //
+      // Report residual data in overrun/underrun
+      //
+      if (Packet->DataDirection == EFI_EXT_SCSI_DATA_DIRECTION_READ) {
+        Packet->InTransferLength = Response->DataLen;
+      } else {
+        Packet->OutTransferLength = Response->DataLen;
+      }
OK, if we are sure that (a) the device will always report short
reads/writes like this, and that (b) the above assignments will never
cause InTransferLength / OutTransferLength to *grow*, then the
InTransferLength / OutTransferLength adjustments are sufficiently
covered.
I believe both of these are indeed true.
Even though the current QEMU VMware PVSCSI device emulation code has a bug in that it never sets this in pvscsi_command_complete() when it sets BTSTAT_DATARUN...
Still:

(8) The CopyMem() call above should not copy garbage (at the tail).
I don't think it matters. We don't guarantee anything on the content in Packet->InDataBuffer beyond Packet->InTransferLength.
I think the code is simpler how it is currently written.
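
For the record, the alternative raised in (5) and (8) could look roughly like the sketch below: clamp the length so it can only shrink, then copy exactly that many bytes. CopyShortRead() is a hypothetical helper, not something in the patch; it assumes a 32-bit transfer length and a 64-bit device-reported length, and it uses memcpy() in place of CopyMem() so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy at most InTransferLength bytes, shortened to
   DeviceDataLength if the device reported a short read. Returns the number
   of bytes copied, which the caller would store back into
   InTransferLength. The returned value never exceeds the input length. */
static uint32_t
CopyShortRead (
  void       *InDataBuffer,
  uint32_t   InTransferLength,
  const void *DmaData,
  uint64_t   DeviceDataLength
  )
{
  uint32_t CopyLength;

  assert (InDataBuffer != NULL && DmaData != NULL);

  CopyLength = InTransferLength;
  if (DeviceDataLength < CopyLength) {
    /* Safe narrowing: the value is below a uint32_t at this point. */
    CopyLength = (uint32_t)DeviceDataLength;
  }

  memcpy (InDataBuffer, DmaData, CopyLength);
  return CopyLength;
}
```

This only helps if the device model reliably fills in the reported length, which is exactly the open question in this part of the thread.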

Honestly, *if* the PVSCSI device model always sets "Response->DataLen",
I don't think this is the case.
then I would prefer if:

- we always updated InTransferLength / OutTransferLength (regardless of
"Response->HostStatus"),

- and we only used these case labels (PvScsiBtStatDatarun /
PvScsiBtStatDataUnderrun) for setting "Packet->HostAdapterStatus".

+      Packet->HostAdapterStatus =
+                EFI_EXT_SCSI_STATUS_HOST_ADAPTER_DATA_OVERRUN_UNDERRUN;
+      return EFI_BAD_BUFFER_SIZE;
I think EFI_BAD_BUFFER_SIZE is invalid here. According to the UEFI spec,
EFI_BAD_BUFFER_SIZE means "The SCSI Request Packet was not executed".
But that's not the case here -- we do have a partially completed
transfer.

Hmm... According to the documentation above EFI_SCSI_PASS_THRU_PASSTHRU in MdePkg/Include/Protocol/ScsiPassThru.h:

  @retval EFI_BAD_BUFFER_SIZE       The SCSI Request Packet was executed, but the
                                    entire DataBuffer could not be transferred.
                                    The actual number of bytes transferred is returned
                                    in TransferLength. See HostAdapterStatus,
                                    TargetStatus, SenseDataLength, and SenseData in
                                    that order for additional status information.

So I don't know who to believe... It does seem to me that this documentation in the code makes more sense
and then my current code is correct. What do you think?


(9) Thus I feel we should use a "break" here.

+
+    case PvScsiBtStatBusFree:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_BUS_FREE;
+      break;
+
+    case PvScsiBtStatInvPhase:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_PHASE_ERROR;
+      break;
+
+    case PvScsiBtStatSensFailed:
+      Packet->HostAdapterStatus =
+                EFI_EXT_SCSI_STATUS_HOST_ADAPTER_REQUEST_SENSE_FAILED;
+      break;
+
+    case PvScsiBtStatTagReject:
+    case PvScsiBtStatBadMsg:
+      Packet->HostAdapterStatus =
+          EFI_EXT_SCSI_STATUS_HOST_ADAPTER_MESSAGE_REJECT;
+      break;
+
+    case PvScsiBtStatBusReset:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_BUS_RESET;
+      break;
+
+    case PvScsiBtStatHaTimeout:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_TIMEOUT;
+      return EFI_TIMEOUT;
+
+    case PvScsiBtStatScsiParity:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_PARITY_ERROR;
+      break;
+
+    default:
+      Packet->HostAdapterStatus = EFI_EXT_SCSI_STATUS_HOST_ADAPTER_OTHER;
+      break;
+  }
+
+  return EFI_DEVICE_ERROR;
+}
+

  //
  // Ext SCSI Pass Thru
  //
@@ -144,7 +528,62 @@ PvScsiPassThru (
    IN EFI_EVENT                                      Event    OPTIONAL
    )
  {
-  return EFI_UNSUPPORTED;
+  PVSCSI_DEV            *Dev;
+  EFI_STATUS            Status;
+  PVSCSI_RING_REQ_DESC *Request;
+  PVSCSI_RING_CMP_DESC *Response;
+
+  Dev = PVSCSI_FROM_PASS_THRU (This);
+
+  if (PvScsiIsReqRingFull (Dev)) {
+    return EFI_NOT_READY;
+  }
+
+  Request = PvScsiGetCurrentRequest (Dev);
+
+  Status = PopulateRequest (Dev, Target, Lun, Packet, Request);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
+
+  //
+  // Writes to Request must be globally visible before making request
+  // available to device
+  //
+  MemoryFence ();
+  Dev->RingDesc.RingState->ReqProdIdx++;
+
(10) Please insert another MemoryFence () here.

That would be unnecessary and wrong.

The MemoryFence() here is used to make sure the request is globally visible before the update to the producer-index. As in any circular-buffer implementation.
There is no need for an additional MemoryFence() here.

Note that the MMIO access below is guaranteed to become globally visible only after the write to the producer index. If the EDK2 MMIO accessors did not guarantee this, you would have a very broken code base...
This is similar to why the Linux MMIO accessors (e.g. writel()) guarantee the same ordering.

For example, see how MdePkg/Library/BaseIoLibIntrinsic/IoLib.c MmioWrite32() internally calls MemoryFence() before and after MMIO access itself.


+  Status = PvScsiMmioWrite32 (Dev, PvScsiRegOffsetKickRwIo, 0);
+  if (EFI_ERROR (Status)) {
+    //
+    // If kicking the host fails, we must fake a host adapter error.
+    // EFI_NOT_READY would save us the effort, but it would also suggest that
+    // the caller retry.
+    //
+    return ReportHostAdapterError (Packet);
+  }
+
+  Status = PvScsiWaitForRequestCompletion (Dev);
+  if (EFI_ERROR (Status)) {
+    //
+    // If waiting for request completion fails, we must fake a host adapter
+    // error. EFI_NOT_READY would save us the effort, but it would also suggest
+    // that the caller retry.
+    //
+    return ReportHostAdapterError (Packet);
+  }
+
(11) Please insert a MemoryFence() here.

Why is a MemoryFence() needed here? I don't think that's true.

PvScsiWaitForRequestCompletion() ends with an MMIO write which is guaranteed to be a memory fence. Thus, there is no need for a MemoryFence() here (to serve as a rmb()) to make sure the completion-descriptor is globally visible.

Thanks,

-Liran


