On 12/01/2024 12:52, Mark Cave-Ayland wrote:

The ESP SCSI chip fundamentally consists of a FIFO for transferring data to/from
the SCSI bus along with a command sequencer which automates various processes 
such
as selection, message/command transfer and data transfer. What makes this chip
particularly interesting is that the FIFO is also used as a buffer for DMA 
transfers
which makes it possible to mix and match DMA and non-DMA transfers when sending 
and
receiving data to/from the SCSI bus.

Whilst the current ESP emulation works well for most cases, there are several
problems that I've encountered whilst trying to debug various guest issues:


1) Multiple code paths for non-DMA, DMA and pDMA

Consider a SCSI request being submitted by the guest: if it is a DMA SCSI
request then it takes one path through get_cmd(), and a different path through
get_cmd() for pDMA. This then leads to the DMA code path being called for DMA
and the pDMA codepath being called for each FIFO write. Finally once you add
in the non-DMA path you end up with a total of 5 potential codepaths for any
given SCSI request.

2) Manual handling of STAT_TC flag

According to the datasheet the STAT_TC flag in ESP_RSTAT is set when the DMA
transfer counter reaches zero. The current code handles this manually in
multiple places, including where it shouldn't be necessary.

3) Deferred interrupts are used only for reads and not writes

The ESP emulation currently makes use of a deferred interrupt when submitting
read commands i.e. the interrupt flags are not updated immediately but when
the QEMU SCSI layer indicates it is ready. This works well for reads, but isn't
implemented for writes which can cause problems when the host is heavily loaded
or a large SCSI write is requested.

4) Padding of pDMA TI transfers to a minimum of 16 bytes

In order to allow Classic MacOS to boot successfully there is a workaround that
pads all pDMA TI transfers to a minimum of 16 bytes (see commit 7aa6baee7c
"esp: add support for unaligned accesses"). This feature is not documented and 
it
turns out that this is a symptom of incorrect handling of transfer
overflow/underflow conditions.

5) Duplication of some non-DMA logic in esp_reg_read() and esp_reg_write()

When reading and writing to the FIFO there is some duplication of logic from
esp_do_nodma() in esp_reg_read() and esp_reg_write(). This should be eliminated
in favour of a single state machine for handling non-DMA FIFO accesses in a
single place.


The series here reworks the ESP emulation to use a SCSI phase-based state 
machine
with just two paths: one for DMA/pDMA requests, and another for non-DMA 
requests.
The pDMA callbacks are completely removed so that there is now only a single 
path
for DMA requests. As part of this work the manual STAT_TC handling is removed, 
and a
couple of bugs in the SCSI phase selection are fixed to allow proper transfer
underflow/overflow detection and recovery.

I've tested the series with all of my available ESP images and compatibility is
greatly improved with these changes: I believe that this series should also fix 
a
few GitLab issues including https://gitlab.com/qemu-project/qemu/-/issues/1831,
https://gitlab.com/qemu-project/qemu/-/issues/611 and
https://gitlab.com/qemu-project/qemu/-/issues/1127. In addition it allows 
Aurelien's
really old SPARC Linux images to boot once again, and also the NeXTCube machine 
can
now boot, load its kernel from disk and start to execute it.

The series breaks down into the following approximate sections:

   - Patches 1-6 do some initial tidy-up for ESP select commands

   - Patches 7-16 implement consolidation of the STAT_TC flag handling

   - Patches 17-36 start to bring esp_do_dma() and do_pdma_dma() into line with 
each
     other, removing unneeded hacks and fixing up end-of-phase logic

   - Patches 37-56 move the MESSAGE OUT and COMMAND phase processing to the new 
SCSI
     phase-based state machine, allowing the removal of the pDMA-specific 
callbacks

   - Patches 57-62 tidy up the end of SCSI command logic and adds deferred 
interrupt
     handling for the DATA IN phase to complement the existing DATA OUT phase
     implementation

   - Patches 63-64 remove the ti_cmd field which is now no longer required

   - Patches 65-67 move the last bit of MESSAGE OUT and COMMAND phase 
processing to
     the new SCSI phase-based state machine which allows get_cmd() to be removed

   - Patches 68-69 move the STATUS and MESSAGE IN phase processing to the new 
SCSI
     phase-based state machine

   - Patches 70-72 fix the reset of the ESP_RINTR/ESP_STAT registers which 
allows the
     TC underflow detection to work correctly

   - Patch 73 removes the restriction on FIFO read access if the DMA routines 
are not
     defined (which also allows the NeXTCube to pass its self-test)

   - Patches 74-77 update the state machine to allow more combinations of mixed 
TI
     and FIFO reads/writes exposed by my test images

   - Patches 78-88 are final tidy-ups and improvements, along with adding an
     implementation of the "Transfer Pad" command

There is a migration break for the Q800 machine here as the series removes the 
pDMA
subsection: this seems the safest option as it isn't possible to map the old 
pDMA
callback enum directly to the new state machine. In theory there is a small 
window
whereby the normal DMA state may miss setting data_ready correctly, but I 
haven't
been able to recreate this during my tests.


Notes for reviewers:

This series is not a particularly easy one to review without having good 
in-depth
knowledge of the ESP device. The biggest issue I found is that due to the state
machines used by the client upon receiving an interrupt, a change to a ESP 
register
in one place can cause a completely unrelated error elsewhere.

A prime example of this is the consolidation of the STAT_TC flag: updating the
emulated device to behave as described in the datasheet caused other failures 
until
other fixes/updates could be made. This is why the work is done over 2 separate
commits with a gap in-between to keep the series bisectible, clearly indicating 
how
sensitive this series is to commit order.

In order to implement the changes I decided to focus on 3 main test images that 
should
pass for each commit to keep the series bisectible:

1) Modern Linux kernel with am53c974.c (hppa)
2) Modern Linux kernel with mac_esp.c (q800)
3) Classic MacOS 8.0

Other images may stop working during parts of the series, which sometimes is the
only option as otherwise implementing a compatibility workaround would be
prohibitively time-consuming or impossible at that point in the series. By the
end of the series all of my test images work correctly, including a few extra 
that
don't work with QEMU current git master.

Finally it is worth reiterating that in its current form this series fixes 
several
significant bugs and improves the compatibility of the ESP emulation, and these 
changes
make future fixes considerably easier compared with the existing implementation.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk>


Mark Cave-Ayland (88):
   esp: don't clear cmdfifo when esp_select() fails in get_cmd()
   esp: move existing request cancel check into esp_select()
   esp.c: add FIFO wraparound support to esp_fifo_pop_buf()
   esp: remove FIFO clear from esp_select()
   esp: move esp_select() to ESP selection commands from get_cmd()
   esp: update esp_set_tc() to set STAT_TC flag
   esp: start removal of manual STAT_TC setting when transfer counter
     reaches zero
   esp: move command execution logic to new esp_run_cmd() function
   esp: update TC check logic in do_dma_pdma_cb() to check for TC == 0
   esp: move buffer and TC logic into separate to/from device paths in
     esp_do_dma()
   esp.c: remove unused case from esp_pdma_read()
   esp.c: don't accumulate directly into cmdfifo
   esp.c: decrement the TC during MESSAGE OUT and COMMAND phases
   esp.c: introduce esp_set_phase() helper function
   esp.c: remove another set of manual STAT_TC updates
   esp.c: remove MacOS TI workaround that pads FIFO transfers to
     ESP_FIFO_SZ
   esp.c: don't reset the TC and ESP_RSEQ state when executing a SCSI
     command
   esp.c: don't clear RFLAGS register when DMA is complete
   esp: remove zero transfer size check from esp_do_dma()
   esp.c: update condition for esp_dma_done() in esp_do_dma() from device
     path
   esp.c: update condition for esp_dma_done() in esp_do_dma() to device
     path
   esp.c: ensure that the PDMA callback is called for every device read
   esp.c: don't immediately raise INTR_BS if SCSI data needed in
     esp_do_dma()
   esp.c: remove TC adjustment in esp_do_dma() from device path
   esp.c: remove unaligned adjustment in do_dma_pdma_cb() to device path
   esp.c: remove unneeded if() check in esp_transfer_data()
   esp.c: update end of transfer logic at the end of esp_transfer_data()
   esp.c: consolidate async_len and TC == 0 checks in do_dma_pdma_cb()
     and esp_do_dma()
   esp.c: fix premature end of phase logic esp_command_complete
   esp.c: move TC and FIFO check logic into esp_dma_done()
   esp.c: rename esp_dma_done() to esp_dma_ti_check()
   esp.c: copy PDMA logic for transfers to device from do_dma_pdma_cb()
     to esp_do_dma()
   esp.c: copy logic for do_cmd transfers from do_dma_pdma_cb() to
     esp_do_dma()
   esp.c: update esp_do_dma() bypass if async_len is zero to include
     non-zero transfer check
   esp.c: move end of SCSI transfer check after TC adjustment in
     do_dma_pdma_cb()
   esp.c: remove s_without_satn_pdma_cb() PDMA callback
   esp.c: introduce esp_get_phase() function
   esp.c: convert esp_do_dma() to switch statement based upon SCSI phase
   esp.c: convert do_dma_pdma_db() to switch statement based upon SCSI
     phase
   esp.c: convert esp_do_nodma() to switch statement based upon SCSI
     phase
   esp.c: convert esp_do_dma() do_cmd path to check for SCSI phase
     instead
   esp.c: convert do_dma_pdma_cb() do_cmd path to check for SCSI phase
     instead
   esp.c: convert esp_do_nodma() do_cmd path to check for SCSI phase
     instead
   esp.c: convert esp_reg_write() do_cmd path to check for SCSI phase
     instead
   esp.c: remove do_cmd from ESPState
   esp.c: untangle MESSAGE OUT and COMMAND phase logic in esp_do_dma()
   esp.c: untangle MESSAGE OUT and COMMAND phase logic in
     do_dma_pdma_cb()
   esp.c: untangle MESSAGE OUT and COMMAND phase logic in esp_do_nodma()
   esp.c: move CMD_SELATN end of message phase detection to esp_do_dma()
     and do_dma_pdma_cb()
   esp.c: move CMD_TI end of message phase detection to esp_do_dma() and
     do_dma_pdma_cb()
   esp.c: don't use get_cmd() for CMD_SEL DMA commands
   esp.c: move CMD_SELATNS end of command logic to esp_do_dma() and
     do_dma_pdma_cb()
   esp.c: replace do_dma_pdma_cb() with esp_do_dma()
   esp.c: move CMD_ICCS command logic to esp_do_dma()
   esp.c: always use esp_do_dma() in pdma_cb()
   esp.c: remove unused PDMA callback implementation
   esp.c: rename data_in_ready to to data_ready
   esp.c: separate logic based upon ESP command in esp_command_complete()
   esp.c: separate logic based upon ESP command in esp_transfer_data()
   esp.c: use deferred interrupts for both DATA IN and DATA OUT phases
   esp.c: remove DATA IN phase logic when reading from FIFO
   esp.c: zero command register when TI command terminates due to phase
     change
   esp.c: remove unneeded ti_cmd field
   esp.c: don't raise INTR_BS interrupt in DATA IN phase until TI command
     issued
   esp.c: move non-DMA TI logic to separate esp_nodma_ti_dataout()
     function
   esp.c: process non-DMA FIFO writes in esp_do_nodma()
   esp.c: replace get_cmd() with esp_do_nodma()
   esp.c: move write_response() non-DMA logic to esp_do_nodma()
   esp.c: consolidate end of command sequence after ICCS command
   esp.c: ensure that STAT_INT is cleared when reading ESP_RINTR
   esp.c: don't clear the SCSI phase when reading ESP_RINTR
   esp.c: handle TC underflow for DMA SCSI requests
   esp.c: remove restriction on FIFO read access when DMA memory routines
     defined
   esp.c: handle non-DMA FIFO writes used to terminate DMA commands
   esp.c: improve ESP_RSEQ logic consolidation
   esp.c: only transfer non-DMA COMMAND phase data for specific commands
   esp.c: only transfer non-DMA MESSAGE OUT phase data for specific
     commands
   esp.c: consolidate DMA and PDMA logic in DATA OUT phase
   esp.c: consolidate DMA and PDMA logic in DATA IN phase
   esp.c: consolidate DMA and PDMA logic in MESSAGE OUT phase
   esp.c: remove redundant n variable in PDMA COMMAND phase
   esp.c: consolidate DMA and PDMA logic in STATUS and MESSAGE IN phases
   esp.c: replace n variable with len in esp_do_nodma()
   esp.c: implement DMA Transfer Pad command for DATA phases
   esp.c: rename irq_data IRQ to drq_irq
   esp.c: keep track of the DRQ state during DMA
   esp.c: switch TypeInfo registration to use DEFINE_TYPES() macro
   esp.c: add my copyright to the file

  hw/scsi/esp.c         | 1371 ++++++++++++++++++++++-------------------
  hw/scsi/trace-events  |    1 +
  include/hw/scsi/esp.h |   18 +-
  3 files changed, 727 insertions(+), 663 deletions(-)

Any other thoughts on this series, particularly from SCSI maintainers? I appreciate that it's not the easiest one to review, and I'm grateful for the T-B tags received so far.


ATB,

Mark.


Reply via email to