Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-27 Thread Will Deacon
On Tue, May 27, 2014 at 09:23:30PM +0100, Benjamin Herrenschmidt wrote:
> On Tue, 2014-05-27 at 20:34 +0100, Will Deacon wrote:
> 
> > Do you mean the io{read,write} functions? Funnily enough, they're already
> > relaxed on ARM if you go by the semantics I've proposed. That implies we at
> > least need some Documentation to that effect...
> > 
> > What do you do on ppc?
> 
> They are not supposed to be relaxed. If they are, you probably have a
> whole lot of busted drivers :-)

Lucky me!

> They have the same semantics as readl/writel for memory and as inb/outb
> for IO space; they just let you hide the "type" (memory vs. IO) from
> most of the driver code.
> 
> We probably need to create a set of _relaxed variants.

Ok. I'll try putting together a v3 including this and the mmiowb work.

Thanks for the feedback,

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-27 Thread Will Deacon
On Tue, May 27, 2014 at 09:21:38PM +0100, Benjamin Herrenschmidt wrote:
> On Tue, 2014-05-27 at 20:32 +0100, Will Deacon wrote:
> 
> > Why would you need two barriers? I would have thought an mmiowb() inlined
> > into writel after the store operation would be sufficient. Or is this to
> > ensure a non-relaxed write is ordered with respect to a relaxed write?
> 
> Well, so the non-relaxed writel would have to do:
> 
>   sync
>   store
>   sync
> 
> The first sync is to synchronize with DMAs, so that a sequence of
> 
>   store to mem
>   writel
> 
> Remains ordered vs. the device (ie, when the writel causes the device
> to do a DMA, it will see the previous store to mem).
> 
> The second sync is needed as mmiowb, to order with unlocks.

Ah yeah, thanks. I was so hung up on the ordering against locks that I
completely forgot about DMA!
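
To make the sync/store/sync sequence concrete, here is a minimal user-space sketch. It is a model only, not the powerpc implementation: model_sync() merely logs where a "sync" would be issued, and every model_* name is hypothetical. It records the program order in which a driver's store to memory, the two barriers and the MMIO store are issued.

```c
#include <stddef.h>
#include <stdint.h>

/* Events recorded in program order by the model. */
enum model_event { EV_MEM_STORE, EV_SYNC, EV_MMIO_STORE };

#define MODEL_TRACE_MAX 16
enum model_event model_trace[MODEL_TRACE_MAX];
size_t model_trace_len;

void model_record(enum model_event ev)
{
	if (model_trace_len < MODEL_TRACE_MAX)
		model_trace[model_trace_len++] = ev;
}

/* Stands in for the powerpc "sync" instruction; here it only logs. */
void model_sync(void)
{
	model_record(EV_SYNC);
}

/* Ordinary store to memory, e.g. filling in a DMA descriptor. */
void model_store_mem(uint32_t *addr, uint32_t val)
{
	*addr = val;
	model_record(EV_MEM_STORE);
}

/* Non-relaxed write as described above: sync, store, sync. */
void model_writel(uint32_t val, volatile uint32_t *reg)
{
	model_sync();	/* earlier stores to memory are ordered before the MMIO store */
	*reg = val;
	model_record(EV_MMIO_STORE);
	model_sync();	/* plays the mmiowb role ahead of a later unlock */
}
```

Calling model_store_mem() followed by model_writel() leaves the trace as: memory store, sync, MMIO store, sync. The first sync is the one that keeps a DMA triggered by the write from missing the earlier store to memory.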

> At this point, I'm keen on keeping my per-cpu trick to avoid that
> second one in most cases.

Makes sense. The alternative is dropping that requirement and instead
relying on drivers to use mmiowb() even with the non-relaxed accessors,
but I think that's going to be fairly painful (and hence why you have the
trick to start with).
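
As a rough illustration of why that alternative is painful, the sketch below models a writel that only carries the leading barrier, so ordering against the unlock depends on the driver calling mmiowb() itself. This is a user-space model under stated assumptions, not kernel code: the struct, its fields and all model_* functions are hypothetical, and the unlock simply counts the cases where an MMIO store was still outstanding.

```c
#include <stdint.h>

/* Per-CPU bookkeeping for the model; the field names are hypothetical. */
struct model_cpu {
	unsigned int syncs;		/* barriers executed so far */
	unsigned int pending_mmio;	/* MMIO stores not yet followed by a barrier */
	unsigned int unordered_unlocks;	/* unlocks issued with MMIO stores pending */
};

struct model_cpu this_cpu;

void model_sync(void)
{
	this_cpu.syncs++;
}

/* writel with only the leading barrier: sync, store. */
void model_writel(uint32_t val, volatile uint32_t *reg)
{
	model_sync();
	*reg = val;
	this_cpu.pending_mmio++;
}

/* Explicit barrier the driver has to remember before dropping the lock. */
void model_mmiowb(void)
{
	model_sync();
	this_cpu.pending_mmio = 0;
}

/* Unlock that does no I/O synchronisation of its own. */
void model_spin_unlock(void)
{
	if (this_cpu.pending_mmio) {
		this_cpu.unordered_unlocks++;
		this_cpu.pending_mmio = 0;
	}
}
```

A driver that calls model_mmiowb() before model_spin_unlock() never bumps unordered_unlocks; one that forgets does so on every critical section that touches the device, and nothing at build time flags it.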

Will


Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-27 Thread Benjamin Herrenschmidt
On Tue, 2014-05-27 at 20:34 +0100, Will Deacon wrote:

> Do you mean the io{read,write} functions? Funnily enough, they're already
> relaxed on ARM if you go by the semantics I've proposed. That implies we at
> least need some Documentation to that effect...
> 
> What do you do on ppc?

They are not supposed to be relaxed. If they are, you probably have a
whole lot of busted drivers :-)

They have the same semantics as readl/writel for memory and as inb/outb
for IO space; they just let you hide the "type" (memory vs. IO) from
most of the driver code.

We probably need to create a set of _relaxed variants.
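
A minimal sketch of what "hiding the type" means, written as a user-space model rather than the kernel's iomap code. The cookie encoding, the MODEL_PIO_LIMIT threshold and all model_* names are assumptions made for illustration; the point is only that the driver calls one function and the memory-vs-IO decision is made underneath it, with each branch keeping the semantics of the accessor it forwards to.

```c
#include <stdint.h>

/* Hypothetical split of the cookie space: values below this are port numbers. */
#define MODEL_PIO_LIMIT 0x10000u

enum model_space { MODEL_SPACE_NONE, MODEL_SPACE_PIO, MODEL_SPACE_MMIO };

enum model_space model_last_space;	/* back end used by the last access */
uint32_t model_ports[4];		/* stand-in for I/O port space */

/* Port-space read, playing the role of inl(). */
uint32_t model_inl(uintptr_t port)
{
	model_last_space = MODEL_SPACE_PIO;
	return model_ports[port % 4];
}

/* Memory-space read, playing the role of readl(). */
uint32_t model_readl(const volatile uint32_t *addr)
{
	model_last_space = MODEL_SPACE_MMIO;
	return *addr;
}

/* The driver only ever sees the cookie; the space is picked here. */
uint32_t model_ioread32(uintptr_t cookie)
{
	if (cookie < MODEL_PIO_LIMIT)
		return model_inl(cookie);
	return model_readl((const volatile uint32_t *)cookie);
}
```

A _relaxed variant in this model would be a second entry point whose memory branch forwards to a relaxed read instead of model_readl().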

Cheers,
Ben.




Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-27 Thread Benjamin Herrenschmidt
On Tue, 2014-05-27 at 20:32 +0100, Will Deacon wrote:

> Why would you need two barriers? I would have thought an mmiowb() inlined
> into writel after the store operation would be sufficient. Or is this to
> ensure a non-relaxed write is ordered with respect to a relaxed write?

Well, so the non-relaxed writel would have to do:

  sync
  store
  sync

The first sync is to synchronize with DMAs, so that a sequence of

  store to mem
  writel

Remains ordered vs. the device (ie, when the writel causes the device
to do a DMA, it will see the previous store to mem).

The second sync is needed as mmiowb, to order with unlocks.
 
At this point, I'm keen on keeping my per-cpu trick to avoid that
second one in most cases.
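
Roughly, the per-cpu trick can be modelled as below. This is a user-space sketch under stated assumptions, not the actual powerpc code: model_sync() only counts barriers, and the struct layout and model_* names are hypothetical. The ordered write sets a per-cpu flag instead of issuing the trailing sync, and the unlock path issues one sync only if the flag is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-CPU state for the model; the field names are hypothetical. */
struct model_cpu {
	bool io_sync;		/* an ordered MMIO store happened since the last sync */
	unsigned int syncs;	/* barriers executed so far */
};

struct model_cpu this_cpu;

struct model_lock {
	bool locked;
};

void model_sync(void)
{
	this_cpu.syncs++;
}

/* Ordered write: leading sync, store, then defer the trailing barrier. */
void model_writel(uint32_t val, volatile uint32_t *reg)
{
	model_sync();		/* order earlier stores to memory (DMA) */
	*reg = val;
	this_cpu.io_sync = true;
}

void model_spin_lock(struct model_lock *lock)
{
	lock->locked = true;
}

/* Unlock pays for the second barrier only if MMIO was actually done. */
void model_spin_unlock(struct model_lock *lock)
{
	if (this_cpu.io_sync) {
		model_sync();
		this_cpu.io_sync = false;
	}
	lock->locked = false;
}
```

With three ordered writes inside one critical section the model issues four syncs (three leading, one at unlock) rather than the six an unconditional sync/store/sync would cost, and a critical section that does no MMIO adds none.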

> Anyway, we may need something similar for other architectures with mmiowb
> implementations:
> 
>   blackfin
>   frv
>   ia64
>   mips
>   sh
> 
> so I'm anticipating some more discussion when I try to push that patch :)
> 
> Cheers,
> 
> Will




Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-27 Thread Will Deacon
On Sun, May 25, 2014 at 10:47:50PM +0100, Benjamin Herrenschmidt wrote:
> On Thu, 2014-05-22 at 17:47 +0100, Will Deacon wrote:
> > Hi all,
> > 
> > This is version 2 of the series I originally posted here:
> > 
> >   https://lkml.org/lkml/2014/4/17/269
> > 
> > Changes since v1 include:
> > 
> >  - Added relevant acks from arch maintainers
> >  - Fixed potential compiler re-ordering issue for x86 definitions
> > 
> > I'd *really* appreciate some feedback on the proposed semantics here, but
> > acks are still good :)
> > 
> > The original cover letter is duplicated below.
> 
> Question (sorry if I missed an existing explanation...), do we have an
> equivalent bunch for iomap ?

Do you mean the io{read,write} functions? Funnily enough, they're already
relaxed on ARM if you go by the semantics I've proposed. That implies we at
least need some Documentation to that effect...

What do you do on ppc?

Will


Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-27 Thread Will Deacon
Hi Ben,

On Sun, May 25, 2014 at 10:46:03PM +0100, Benjamin Herrenschmidt wrote:
> On Thu, 2014-05-22 at 17:47 +0100, Will Deacon wrote:
> > A corollary to this is that mmiowb() probably needs rethinking. As it
> > currently stands, an mmiowb() is required to order MMIO writes to a device
> > from multiple CPUs, even if that device is protected by a lock. However,
> > this isn't often used in practice, leading to PowerPC implementing both
> > mmiowb() *and* synchronising I/O in spin_unlock.
> > 
> > I would propose making the non-relaxed I/O accessors ordered with respect to
> > LOCK/UNLOCK, leaving mmiowb() to be used with the relaxed accessors, if
> > required, but would welcome thoughts/suggestions on this topic.
> 
> I agree on the proposed semantics, though for us that does mean we still need
> that per-cpu flag tracking non-relaxed MMIO stores and corresponding added
> barrier in unlock. Eventually, if the use of the relaxed accessors becomes
> pervasive enough I suppose I can just make the ordered ones unconditionally
> do 2 barriers.

Why would you need two barriers? I would have thought an mmiowb() inlined
into writel after the store operation would be sufficient. Or is this to
ensure a non-relaxed write is ordered with respect to a relaxed write?

Anyway, we may need something similar for other architectures with mmiowb
implementations:

  blackfin
  frv
  ia64
  mips
  sh

so I'm anticipating some more discussion when I try to push that patch :)

Cheers,

Will


Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-25 Thread Benjamin Herrenschmidt
On Thu, 2014-05-22 at 17:47 +0100, Will Deacon wrote:
> Hi all,
> 
> This is version 2 of the series I originally posted here:
> 
>   https://lkml.org/lkml/2014/4/17/269
> 
> Changes since v1 include:
> 
>  - Added relevant acks from arch maintainers
>  - Fixed potential compiler re-ordering issue for x86 definitions
> 
> I'd *really* appreciate some feedback on the proposed semantics here, but
> acks are still good :)
> 
> The original cover letter is duplicated below.

Question (sorry if I missed an existing explanation...), do we have an
equivalent bunch for iomap ?

Cheers,
Ben.

> Cheers,
> 
> Will
> 
> --->8
> 
> This RFC series attempts to define a portable (i.e. cross-architecture)
> definition of the {readX,writeX}_relaxed MMIO accessor functions. These
> functions are already in widespread use amongst drivers (mainly those
> supporting devices embedded in ARM SoCs), but lack any well-defined semantics
> and, consequently, any portable definitions to allow these drivers to be
> compiled for other architectures.
> 
> The two main motivations for this series are:
> 
>  (1) To promote use of the _relaxed MMIO accessors on weakly-ordered
>      architectures, where they can bring significant performance improvements
>      over their non-relaxed counterparts.
> 
>  (2) To allow COMPILE_TEST to build drivers using the relaxed accessors across
>      all architectures.
> 
> The proposed semantics match those provided by the ARM implementation
> (i.e. no weaker), with one exception (see below).
> 
> Informally:
> 
>   - Relaxed accesses to the same device are ordered with respect to each
>     other.
> 
>   - Relaxed accesses are *not* guaranteed to be ordered with respect to normal
>     memory accesses (e.g. DMA buffers -- this is what gives us the performance
>     boost over the non-relaxed versions).
> 
>   - Relaxed accesses are not guaranteed to be ordered with respect to
>     LOCK/UNLOCK operations.
> 
> In actual fact, the relaxed accessors *are* ordered with respect to
> LOCK/UNLOCK operations on ARM[64], but I have added this constraint for the
> benefit of PowerPC, which has expensive I/O barriers in the spin_unlock path
> for the non-relaxed accessors.
> 
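
For the second rule in the list above (no ordering against normal memory accesses), a small user-space model may help. Everything in it is an assumption made for illustration: the model_* names are hypothetical, model_wmb() stands in for whatever barrier an architecture would need, and the model always shows the worst case, where a store to a DMA buffer is not visible to the device until a barrier has been issued.

```c
#include <stdint.h>

/*
 * One word of DMA buffer, split into what the CPU has written and what
 * the device would currently observe.
 */
struct model_buf {
	uint32_t cpu_value;	/* latest value stored by the CPU */
	uint32_t dev_value;	/* value the device would fetch via DMA */
};

/* Normal store to memory; not yet guaranteed visible to the device. */
void model_store_mem(struct model_buf *buf, uint32_t val)
{
	buf->cpu_value = val;
}

/* Barrier: earlier stores to memory become visible to the device. */
void model_wmb(struct model_buf *buf)
{
	buf->dev_value = buf->cpu_value;
}

/* Relaxed doorbell: the device fetches whatever is visible to it now. */
uint32_t model_doorbell_relaxed(const struct model_buf *buf)
{
	return buf->dev_value;
}

/* Non-relaxed doorbell: ordered after earlier stores to memory. */
uint32_t model_doorbell(struct model_buf *buf)
{
	model_wmb(buf);
	return model_doorbell_relaxed(buf);
}
```

In this model a descriptor update followed directly by model_doorbell_relaxed() hands the device the old contents, while either an explicit model_wmb() or the non-relaxed model_doorbell() hands it the new ones. That is the cost the relaxed accessors avoid when the driver knows no such ordering is needed.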
> A corollary to this is that mmiowb() probably needs rethinking. As it
> currently stands, an mmiowb() is required to order MMIO writes to a device
> from multiple CPUs, even if that device is protected by a lock. However,
> this isn't often used in practice, leading to PowerPC implementing both
> mmiowb() *and* synchronising I/O in spin_unlock.
> 
> I would propose making the non-relaxed I/O accessors ordered with respect to
> LOCK/UNLOCK, leaving mmiowb() to be used with the relaxed accessors, if
> required, but would welcome thoughts/suggestions on this topic.
> 
> 
> Will Deacon (18):
>   asm-generic: io: implement relaxed accessor macros as conditional
>     wrappers
>   microblaze: io: remove dummy relaxed accessor macros
>   s390: io: remove dummy relaxed accessor macros for reads
>   xtensa: io: remove dummy relaxed accessor macros for reads
>   alpha: io: implement relaxed accessor macros for writes
>   frv: io: implement dummy relaxed accessor macros for writes
>   cris: io: implement dummy relaxed accessor macros for writes
>   ia64: io: implement dummy relaxed accessor macros for writes
>   m32r: io: implement dummy relaxed accessor macros for writes
>   m68k: io: implement dummy relaxed accessor macros for writes
>   mn10300: io: implement dummy relaxed accessor macros for writes
>   parisc: io: implement dummy relaxed accessor macros for writes
>   powerpc: io: implement dummy relaxed accessor macros for writes
>   sparc: io: implement dummy relaxed accessor macros for writes
>   tile: io: implement dummy relaxed accessor macros for writes
>   x86: io: implement dummy relaxed accessor macros for writes
>   documentation: memory-barriers: clarify relaxed io accessor semantics
>   asm-generic: io: define relaxed accessor macros unconditionally
> 
>  Documentation/memory-barriers.txt | 13 +
>  arch/alpha/include/asm/io.h       | 12
>  arch/cris/include/asm/io.h        |  3 +++
>  arch/frv/include/asm/io.h         |  3 +++
>  arch/ia64/include/asm/io.h        |  4
>  arch/m32r/include/asm/io.h        |  3 +++
>  arch/m68k/include/asm/io.h        |  8
>  arch/m68k/include/asm/io_no.h     |  4
>  arch/microblaze/include/asm/io.h  |  8
>  arch/mn10300/include/asm/io.h     |  4
>  arch/parisc/include/asm/io.h      | 12
>  arch/powerpc/include/asm/io.h     | 12
>  arch/s390/include/asm/io.h        |  5 -
>  arch/sparc/include/asm/io.h       |  9 +
>  arch/sparc/include/asm/io_32.h    |  3 ---
>  arch/sparc/include/asm/io_64.h    | 22 ++
>  arch/tile/include/asm/io.h        |  4
>  arch/x86/include/asm/io.h         | 10 +++---
>  arch/xtensa/include/asm/io.h      |  7 ---
>  include/asm-generic/io.h          | 10 ++
>  20 files changed, 98 insertions(+), 58 deletions(-)

Re: [PATCH v2 00/18] Cross-architecture definitions of relaxed MMIO accessors

2014-05-25 Thread Benjamin Herrenschmidt
On Thu, 2014-05-22 at 17:47 +0100, Will Deacon wrote:
> A corollary to this is that mmiowb() probably needs rethinking. As it
> currently stands, an mmiowb() is required to order MMIO writes to a device
> from multiple CPUs, even if that device is protected by a lock. However,
> this isn't often used in practice, leading to PowerPC implementing both
> mmiowb() *and* synchronising I/O in spin_unlock.
> 
> I would propose making the non-relaxed I/O accessors ordered with respect to
> LOCK/UNLOCK, leaving mmiowb() to be used with the relaxed accessors, if
> required, but would welcome thoughts/suggestions on this topic.

I agree on the proposed semantics, though for us that does mean we still need
that per-cpu flag tracking non-relaxed MMIO stores and corresponding added
barrier in unlock. Eventually, if the use of the relaxed accessors becomes
pervasive enough I suppose I can just make the ordered ones unconditionally
do 2 barriers.

Cheers,
Ben.



