From: Arnd Bergmann
> Sent: 23 February 2018 15:37
> 
> 32-bit architectures generally cannot use writeq(), so we now get a build
> failure for the lpfc driver:
> 
> drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_sli4_wq_put':
> drivers/scsi/lpfc/lpfc_sli.c:145:4: error: implicit declaration of function 
> 'writeq'; did you mean
> 'writeb'? [-Werror=implicit-function-declaration]
> 
> Another problem here is that writing out actual data (unlike accessing
> mmio registers) means we must write the data with the same endianess
> that we have read from memory, but writeq() will perform byte swaps
> and add barriers inbetween accesses as we do for registers.
> 
> Using memcpy_toio() should do the right thing here, using register
> sized stores with correct endianess conversion and barriers (i.e. none),
> but on some architectures might fall back to byte-size access.
...

Have you looked at the performance impact of this on x86?
Last time I looked memcpy_toio() aliased directly to memcpy().
memcpy() is run-time patched between several different algorithms.
On recent Intel cpus memcpy() is implemented as 'rep movsb' relying
on the hardware to DTRT.
For uncached accesses (typical for io) the 'RT' has to be byte transfers.
So instead of the 8 byte transfers (on 64 bit) you get single bytes.
This won't be what is intended!
memcpy_toio() should probably use 'rep movsd' for the bulk of the transfer.

        David

Reply via email to