Re: [PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-26 Thread Logan Gunthorpe


On 26/03/18 01:50 PM, Arnd Bergmann wrote:
> I wouldn't expect it to matter: the byte swap is almost always much
> cheaper compared to the actual bus access for the MMIO, and I
> would also guess that modern compilers can eliminate the double
> byte swap on architectures where writel() is an inline function. Most of
> the important architectures use ARCH_USE_BUILTIN_BSWAP, which
> guarantees that.

Fair enough. Sometime this week I'll update my patch set to change that.

Thanks,

Logan


Re: [PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-26 Thread Arnd Bergmann
On Mon, Mar 26, 2018 at 6:21 PM, Logan Gunthorpe  wrote:
>
>
> On 26/03/18 04:53 AM, Arnd Bergmann wrote:
>> On most architectures, this is not important:
>> - For x86, the stores are aways atomic and no additional barriers
>>   are needed, so the two are the same
>> - For ARM (both 32 and 64-bit), powerpc and many others, we don't
>>   use the generic iowrite() and just fall back to writel() or
>>   writel(swab32()).
>>
>> However, shouldn't we just use the writel(swab32()) logic here as well
>> for the common case rather than risking missing barriers?
>
> Hmm, I don't know... it's complicated?
>
> Doing a bit of digging shows that the existing code was written during a
> time when writel() did not include extra barriers over __raw_writel() in
> any of the common arches.
>
> The commit logs don't seem to provide any guidance as to why this it was
> done this way, but I'd assume it was done to avoid a double swab() call
> on BE arches. Seeing writel() is typically implemented as:
>
> __raw_writel(__cpu_to_le32(value), addr);
>
> Then on BE arches, writel(swab32()) would become:
>
> __raw_writel(swab32(swab32(value)), addr)
>
> Which seems undesirable.

I wouldn't expect it to matter: the byte swap is almost always much
cheaper compared to the actual bus access for the MMIO, and I
would also guess that modern compilers can eliminate the double
byte swap on architectures where writel() is an inline function. Most of
the important architectures use ARCH_USE_BUILTIN_BSWAP, which
guarantees that.

   Arnd


Re: [PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-26 Thread Logan Gunthorpe


On 26/03/18 04:53 AM, Arnd Bergmann wrote:
> On most architectures, this is not important:
> - For x86, the stores are aways atomic and no additional barriers
>   are needed, so the two are the same
> - For ARM (both 32 and 64-bit), powerpc and many others, we don't
>   use the generic iowrite() and just fall back to writel() or
>   writel(swab32()).
> 
> However, shouldn't we just use the writel(swab32()) logic here as well
> for the common case rather than risking missing barriers?

Hmm, I don't know... it's complicated?

Doing a bit of digging shows that the existing code was written during a
time when writel() did not include extra barriers over __raw_writel() in
any of the common arches.

The commit logs don't seem to provide any guidance as to why this it was
done this way, but I'd assume it was done to avoid a double swab() call
on BE arches. Seeing writel() is typically implemented as:

__raw_writel(__cpu_to_le32(value), addr);

Then on BE arches, writel(swab32()) would become:

__raw_writel(swab32(swab32(value)), addr)

Which seems undesirable.

Logan


Re: [PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-26 Thread Arnd Bergmann
On Wed, Mar 21, 2018 at 5:37 PM, Logan Gunthorpe  wrote:
> The semantics of the iowriteXXbe() functions are to write a
> value in CPU endianess to an IO register that is known by the
> caller to be in Big Endian. The mmio_writeXXbe() macro, which
> is called by iowriteXXbe(), should therefore use cpu_to_beXX()
> instead of beXX_to_cpu().
>
> Seeing both beXX_to_cpu() and cpu_to_beXX() are both functionally
> implemented as either null operations or swabXX operations there
> was no noticable bug here. But it is confusing for both developers
> and code analysis tools alike.
>
> Signed-off-by: Logan Gunthorpe 

Your patch is a clear improvement of what we had before, but I notice
that we have a weird asymmetry between big-endian and little-endian
accessors before and after this patch:

void iowrite32(u32 val, void __iomem *addr)
{
IO_COND(addr, outl(val,port), writel(val, addr));
}
void iowrite32be(u32 val, void __iomem *addr)
{
IO_COND(addr, pio_write32be(val,port), mmio_write32be(val, addr));
}

The little-endian iowrite32() when applied to mmio registers uses
a 32-bit wide atomic store to a little-endian register with barriers to
order against both spinlocks and DMA.

The big-endian iowrite32be() on the same pointer uses a nonatomic
store with no barriers whatsoever and the opposite endianess.

On most architectures, this is not important:
- For x86, the stores are aways atomic and no additional barriers
  are needed, so the two are the same
- For ARM (both 32 and 64-bit), powerpc and many others, we don't
  use the generic iowrite() and just fall back to writel() or
  writel(swab32()).

However, shouldn't we just use the writel(swab32()) logic here as well
for the common case rather than risking missing barriers?

   Arnd


Re: [PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-21 Thread Luc Van Oostenryck
On Wed, Mar 21, 2018 at 10:37:36AM -0600, Logan Gunthorpe wrote:
> The semantics of the iowriteXXbe() functions are to write a
> value in CPU endianess to an IO register that is known by the
> caller to be in Big Endian. The mmio_writeXXbe() macro, which
> is called by iowriteXXbe(), should therefore use cpu_to_beXX()
> instead of beXX_to_cpu().
> 
> Seeing both beXX_to_cpu() and cpu_to_beXX() are both functionally
> implemented as either null operations or swabXX operations there
> was no noticable bug here. But it is confusing for both developers
> and code analysis tools alike.
> 
> Signed-off-by: Logan Gunthorpe 
> Cc: Philippe Ombredanne 
> Cc: Thomas Gleixner 
> Cc: Kate Stewart 
> Cc: Greg Kroah-Hartman 
> Cc: Luc Van Oostenryck 
> ---
>  lib/iomap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/iomap.c b/lib/iomap.c
> index 541d926da95e..be120c13d6cc 100644
> --- a/lib/iomap.c
> +++ b/lib/iomap.c
> @@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
>  #endif
>  
>  #ifndef mmio_write16be
> -#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
> -#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
> +#define mmio_write16be(val,port) __raw_writew(cpu_to_be16(val),port)
> +#define mmio_write32be(val,port) __raw_writel(cpu_to_be32(val),port)
>  #endif
>  
>  void iowrite8(u8 val, void __iomem *addr)
> -- 
> 2.11.0
> 

LGTM, feel free to add my
Reviewed-by: Luc Van Oostenryck 

-- Luc


[PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-21 Thread Logan Gunthorpe
The semantics of the iowriteXXbe() functions are to write a
value in CPU endianess to an IO register that is known by the
caller to be in Big Endian. The mmio_writeXXbe() macro, which
is called by iowriteXXbe(), should therefore use cpu_to_beXX()
instead of beXX_to_cpu().

Seeing both beXX_to_cpu() and cpu_to_beXX() are both functionally
implemented as either null operations or swabXX operations there
was no noticable bug here. But it is confusing for both developers
and code analysis tools alike.

Signed-off-by: Logan Gunthorpe 
Cc: Philippe Ombredanne 
Cc: Thomas Gleixner 
Cc: Kate Stewart 
Cc: Greg Kroah-Hartman 
Cc: Luc Van Oostenryck 
---
 lib/iomap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/iomap.c b/lib/iomap.c
index 541d926da95e..be120c13d6cc 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
 #endif
 
 #ifndef mmio_write16be
-#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
-#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
+#define mmio_write16be(val,port) __raw_writew(cpu_to_be16(val),port)
+#define mmio_write32be(val,port) __raw_writel(cpu_to_be32(val),port)
 #endif
 
 void iowrite8(u8 val, void __iomem *addr)
-- 
2.11.0