Re: I/O memory barriers vs SMP memory barriers

2007-03-28 Thread Lennert Buytenhek
On Mon, Mar 26, 2007 at 01:07:11PM -0700, Paul E. McKenney wrote:

> > > > Does everybody agree on these semantics, though?  At least David
> > > > seems to think that mb/rmb/wmb aren't required to order normal
> > > > memory accesses against each other..
> > > 
> > > Not on UP.  On SMP, ordering is (almost certainly) required.
> > 
> > 'almost certainly'?  That sounds like there is a possibility that it
> > wouldn't have to?  What does this depend on?
> 
> The underlying memory model of the CPU.  For sequentially consistent
> systems, only compiler barriers are required.  There are very few such
> systems -- MIPS and PA-RISC, if I remember correctly.  Performance
> dictates otherwise.
> 
> I believe that ARM is -not- sequentially consistent, but have not yet
> purchased an architecture reference manual.

ARM Normal memory (RAM) accesses are weakly ordered, so on SMP, you
need barriers.  (SMP ARM systems are the definite minority, though.)

(For ARM UP, we generally don't care, since most have virtual caches
and are not I/O coherent, and so DMA coherent mappings will be done
as uncached mappings, and uncached mappings are strongly ordered --
except on XScale V3, which supports I/O coherency, and so you need to
use barriers when operating on DMA coherent memory because DMA coherent
mappings are done as Normal memory (which is weakly ordered) when I/O
coherency is enabled.)
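
To make the SMP case concrete, here is a minimal userspace sketch of the classic message-passing pattern. It is not kernel code. The macros model_smp_wmb() and model_smp_rmb() are hypothetical stand-ins built on C11 fences, which are at least as strong as what smp_wmb()/smp_rmb() are described as providing in this thread. The function and variable names are invented for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for smp_wmb()/smp_rmb(). */
#define model_smp_wmb() atomic_thread_fence(memory_order_release)
#define model_smp_rmb() atomic_thread_fence(memory_order_acquire)

static int payload;        /* plain data, published via 'ready' */
static atomic_int ready;   /* flag, accessed with relaxed ordering */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;                /* 1. write the data */
    model_smp_wmb();             /* 2. order the data before the flag */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;

    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                        /* spin until the flag is seen */
    model_smp_rmb();             /* order the flag read before the data read */
    *out = payload;
    return NULL;
}

/* Runs one producer/consumer pair and returns what the consumer read. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);
    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Without the two fences, a weakly ordered CPU (or the compiler) would be free to let the consumer observe ready == 1 while still reading a stale payload.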


> Given that ARM device drivers are accessing MMIO locations, which are
> often slow anyway, how much is ARM really gaining by dropping memory
> barriers when only I/O accesses need be ordered?  Is it measurable?

No idea -- I assume Catalin has looked at this.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: I/O memory barriers vs SMP memory barriers

2007-03-28 Thread Paul E. McKenney
On Mon, Mar 26, 2007 at 10:46:39AM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:
> 
> > > > > > [ background: On ARM, SMP synchronisation does need barriers but
> > > > > >   device synchronisation does not.  The question is that given
> > > > > >   this, whether mb() and friends can be NOPs on ARM or not (i.e.
> > > > > >   whether mb() is supposed to sync against other CPUs or not, or
> > > > > >   whether only smp_mb() can be used for this.)  ]
> > > > > 
> > > > > H...
> > > > > 
> > > > > [snip]
> > > > 
> > > > 3.  Orders memory accesses and device accesses, but not necessarily
> > > >     the union of the two -- mb(), rmb(), wmb().
> > > 
> > > If mb/rmb/wmb are required to order normal memory accesses, that means
> > > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
> > 
> > This was on UP ARM systems, right?
> 
> No.
> 
> If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
> see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
> The UP part is obviously fine, the SMP part is what is under debate here.

Yep, looks wrong to me.

> > Assuming that ARM CPUs respect the usual CPU-self-consistency
> > semantics, and given the background that device accesses are ordered,
> > then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> > systems.
> > 
> > Most likely not on SMP ARM systems, however.
> 
> Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
> on ARM UP systems.. I don't think anyone ever disagreed about that.

Good.

> > > Does everybody agree on these semantics, though?  At least David
> > > seems to think that mb/rmb/wmb aren't required to order normal
> > > memory accesses against each other..
> > 
> > Not on UP.  On SMP, ordering is (almost certainly) required.
> 
> 'almost certainly'?  That sounds like there is a possibility that it
> wouldn't have to?  What does this depend on?

The underlying memory model of the CPU.  For sequentially consistent
systems, only compiler barriers are required.  There are very few such
systems -- MIPS and PA-RISC, if I remember correctly.  Performance
dictates otherwise.

I believe that ARM is -not- sequentially consistent, but have not yet
purchased an architecture reference manual.

> At least David and Catalin seem to disagree with the statement
> that mb/rmb/wmb should order accesses from different CPUs.  And
> memory-barriers.txt is pretty vague about this..

mb() needs to do everything that smp_mb() does, ditto for rmb() and
wmb().  There really are cases where both I/O and memory accesses
need to be ordered, so just providing separate memory ordering and
I/O ordering is not enough.
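
A typical case where a memory access and a device access must be ordered against each other is a driver that fills in a descriptor in RAM and then rings a doorbell register. The following is a userspace sketch only. model_wmb(), struct model_desc, ring, fake_doorbell and model_queue_buffer() are all hypothetical names, and the "register" is just a volatile variable standing in for MMIO.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: a full fence stands in for the kernel's wmb() here. */
#define model_wmb() atomic_thread_fence(memory_order_seq_cst)

struct model_desc {            /* hypothetical DMA descriptor in RAM */
    uint32_t addr;
    uint32_t len;
};

static struct model_desc ring[4];
static volatile uint32_t fake_doorbell;  /* stands in for an MMIO register */

/*
 * Queue one buffer.  The descriptor (normal memory) has to be visible
 * before the doorbell write (device access) tells the device to fetch it.
 * Returns the value written to the simulated doorbell.
 */
uint32_t model_queue_buffer(unsigned int slot, uint32_t addr, uint32_t len)
{
    ring[slot].addr = addr;
    ring[slot].len = len;
    model_wmb();               /* memory write ordered before the I/O write */
    fake_doorbell = slot + 1;  /* a real driver would use an MMIO accessor */
    return fake_doorbell;
}
```

A barrier that ordered only memory against memory, or only I/O against I/O, would not be enough to make this sequence safe.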

Given that ARM device drivers are accessing MMIO locations, which are
often slow anyway, how much is ARM really gaining by dropping memory
barriers when only I/O accesses need be ordered?  Is it measurable?
If not, there is no point in adding yet another set of combinatorial
choices to the memory-barrier API.

Thanx, Paul


Re: I/O memory barriers vs SMP memory barriers

2007-03-26 Thread David Howells
Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:

> Hrm... I'm not sure I like the io_* name, I think it's even more
> confusing, people will never know when to use what ...

I'd've thought it more obvious, but given there are several types of I/O, some
of which might require different barriering to others, I can see your point.

However, I think mb() unadorned is also confusing.

> Maybe we should dig out again my attempt at properly defining semantics
> of IO accessors and related barriers and extend it to include CPU vs.
> DMA barriers.

That could be useful.

David


Re: I/O memory barriers vs SMP memory barriers

2007-03-26 Thread David Howells
Lennert Buytenhek <[EMAIL PROTECTED]> wrote:

> Does everybody agree on these semantics, though?  At least David seems
> to think that mb/rmb/wmb aren't required to order normal memory accesses
> against each other..

Ummm...  I've just realised that your statement here is ambiguous.  When you
say "aren't required to", do you mean "aren't necessary to" or do you mean
"don't have to"?  Isn't English a fun language?

Anyway, what I meant is that mb() and co. as they stand _must_ do everything
smp_mb() and co do respectively, _in_ _addition_ to other side effects.

mb() implies smp_mb()
rmb() implies smp_rmb()
wmb() implies smp_wmb()
...
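
One way to picture the "implies" relation is as a layering of macros. This is a sketch under stated assumptions, not the actual kernel headers. All model_* names are hypothetical, MODEL_SMP stands in for an SMP build option, and the hardware barrier is simulated with a counter so that the effect can be observed in userspace.

```c
/* Build with -DMODEL_SMP to model an SMP kernel; without it, a UP kernel. */

static unsigned long hw_barriers;   /* counts simulated hardware barriers */

/* Compiler-only barrier in the GCC/Clang inline-asm idiom. */
#define model_barrier()  __asm__ __volatile__("" ::: "memory")

/* Mandatory barrier: always costs one (simulated) hardware barrier. */
#define model_mb()       do { hw_barriers++; model_barrier(); } while (0)

#ifdef MODEL_SMP
#define model_smp_mb()   model_mb()        /* SMP: full hardware barrier */
#else
#define model_smp_mb()   model_barrier()   /* UP: compiler barrier only */
#endif

/* Number of simulated hardware barriers issued by one model_mb(). */
unsigned long model_count_after_mb(void)
{
    hw_barriers = 0;
    model_mb();
    return hw_barriers;
}

/* Number of simulated hardware barriers issued by one model_smp_mb(). */
unsigned long model_count_after_smp_mb(void)
{
    hw_barriers = 0;
    model_smp_mb();
    return hw_barriers;
}
```

In this model mb() is never weaker than smp_mb(), in either build. That is the property being argued for here.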

David


Re: I/O memory barriers vs SMP memory barriers

2007-03-26 Thread Lennert Buytenhek
On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:

> > > > > [ background: On ARM, SMP synchronisation does need barriers but
> > > > >   device synchronisation does not.  The question is that given this,
> > > > >   whether mb() and friends can be NOPs on ARM or not (i.e. whether
> > > > >   mb() is supposed to sync against other CPUs or not, or whether
> > > > >   only smp_mb() can be used for this.)  ]
> > > > 
> > > > H...
> > > > 
> > > > [snip]
> > > 
> > > 3.  Orders memory accesses and device accesses, but not necessarily
> > >     the union of the two -- mb(), rmb(), wmb().
> > 
> > If mb/rmb/wmb are required to order normal memory accesses, that means
> > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
> 
> This was on UP ARM systems, right?

No.

If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
The UP part is obviously fine, the SMP part is what is under debate here.
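
To illustrate why a compiler-only barrier can be enough on UP, here is a small userspace sketch in which a signal handler plays the role of an interrupt handler running on the same CPU. It relies on the assumption that C11's atomic_signal_fence() is a reasonable stand-in for barrier(): it constrains the compiler but emits no CPU barrier instruction. All names are hypothetical.

```c
#include <signal.h>
#include <stdatomic.h>

/* Assumption: compiler-only fence standing in for the kernel's barrier(). */
#define model_barrier() atomic_signal_fence(memory_order_seq_cst)

static int data;                     /* plain memory, published via 'flag' */
static atomic_int flag;
static volatile sig_atomic_t seen;

/* Plays the role of an interrupt handler on the same (only) CPU. */
static void on_signal(int signo)
{
    (void)signo;
    if (atomic_load_explicit(&flag, memory_order_relaxed)) {
        model_barrier();             /* flag read stays before data read */
        seen = data;
    }
}

/* Publishes 'data', triggers the handler, returns what the handler saw. */
int model_up_publish(void)
{
    seen = -1;
    data = 0;
    atomic_store_explicit(&flag, 0, memory_order_relaxed);
    signal(SIGINT, on_signal);

    data = 42;
    model_barrier();                 /* data store stays before flag store */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);

    raise(SIGINT);                   /* handler runs before raise() returns */
    signal(SIGINT, SIG_DFL);
    return (int)seen;
}
```

The same code would not be sufficient if the observer were another CPU on a weakly ordered machine, which is the SMP half of the disagreement.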


> Assuming that ARM CPUs respect the usual CPU-self-consistency
> semantics, and given the background that device accesses are ordered,
> then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> systems.
> 
> Most likely not on SMP ARM systems, however.

Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
on ARM UP systems.. I don't think anyone ever disagreed about that.


> > Does everybody agree on these semantics, though?  At least David
> > seems to think that mb/rmb/wmb aren't required to order normal
> > memory accesses against each other..
> 
> Not on UP.  On SMP, ordering is (almost certainly) required.

'almost certainly'?  That sounds like there is a possibility that it
wouldn't have to?  What does this depend on?

At least David and Catalin seem to disagree with the statement
that mb/rmb/wmb should order accesses from different CPUs.  And
memory-barriers.txt is pretty vague about this..


Re: I/O memory barriers vs SMP memory barriers

2007-03-25 Thread Paul E. McKenney
On Sun, Mar 25, 2007 at 11:38:43PM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:
> 
> > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > >   synchronisation does not.  The question is that given this, whether
> > > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > > >   can be used for this.)  ]
> > > 
> > > H...
> > > 
> > > [snip]
> > 
> > 3.  Orders memory accesses and device accesses, but not necessarily
> >     the union of the two -- mb(), rmb(), wmb().
> 
> If mb/rmb/wmb are required to order normal memory accesses, that means
> that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

This was on UP ARM systems, right?  Assuming that ARM CPUs respect the
usual CPU-self-consistency semantics, and given the background that
device accesses are ordered, then it might well be OK to have mb/rmb/wmb
be barrier() on UP ARM systems.

Most likely not on SMP ARM systems, however.

> Does everybody agree on these semantics, though?  At least David seems
> to think that mb/rmb/wmb aren't required to order normal memory accesses
> against each other..

Not on UP.  On SMP, ordering is (almost certainly) required.

> > 4.  Orders only device accesses, which is what seems to be looked
> >     for here.
> 
> Yes.  (As above, on ARM, SMP synchronisation does need barriers but
> device synchronisation does not.  If mb/rmb/wmb were only required to
> synchronise device accesses, they could have been regular compiler
> barriers on ARM, but if they are also required to synchronise normal
> memory accesses against each other, they have to map to hardware
> barriers.)

Again, for kernels built for UP, you might well be able to make the
mb() primitives be barrier().  I don't see it for SMP, though.

Thanx, Paul


Re: I/O memory barriers vs SMP memory barriers

2007-03-25 Thread Paul E. McKenney
On Fri, Mar 23, 2007 at 01:43:53PM +, David Howells wrote:
> 
> [Resend - this time with a comma in the addresses, not a dot]
> 
> Lennert Buytenhek <[EMAIL PROTECTED]> wrote:
> 
> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.)  ]
> 
> H...
> 
> I see your problem.  I think the right way to deal with this is to get rid of
> mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
> io_rmb(), ...

We will get combinatorial explosion if we aren't -extremely- careful:

1.  Orders only normal memory accesses, which is all that is required
    of smp_*().

2.  Orders both normal and device accesses -- mmiowb().

3.  Orders memory accesses and device accesses, but not necessarily
    the union of the two -- mb(), rmb(), wmb().

4.  Orders only device accesses, which is what seems to be looked
    for here.
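
The four classes can be written down as a small lookup table. This is only one possible reading of the list above. In particular, it assumes that class 2 also orders memory accesses against device accesses while class 3 does not. The enum, struct and function names are hypothetical.

```c
#include <string.h>

enum model_order {
    MODEL_MEM_MEM = 1 << 0,   /* memory access vs memory access */
    MODEL_IO_IO   = 1 << 1,   /* device access vs device access */
    MODEL_MEM_IO  = 1 << 2    /* memory access vs device access */
};

struct model_class {
    const char *primitives;   /* label for the class */
    unsigned int orders;      /* bitmask of enum model_order */
};

static const struct model_class model_classes[] = {
    { "smp_*",       MODEL_MEM_MEM },                               /* 1 */
    { "mmiowb",      MODEL_MEM_MEM | MODEL_IO_IO | MODEL_MEM_IO },  /* 2 */
    { "mb/rmb/wmb",  MODEL_MEM_MEM | MODEL_IO_IO },                 /* 3 */
    { "device-only", MODEL_IO_IO },                                 /* 4 */
};

/* Returns 1 if the named class orders 'what', 0 if not, -1 if unknown. */
int model_orders(const char *primitives, unsigned int what)
{
    size_t i;

    for (i = 0; i < sizeof(model_classes) / sizeof(model_classes[0]); i++) {
        if (strcmp(model_classes[i].primitives, primitives) == 0)
            return (model_classes[i].orders & what) == what;
    }
    return -1;
}
```

Each new class of primitive adds another row, which is the combinatorial growth being warned about.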

Thanx, Paul

> I think that there are only two places you should be using explicit memory
> barriers:
> 
>  (1) To control inter-CPU effects on an SMP system.
> 
>  (2) To control CPU vs device effects.
> 
> > On Thu, Mar 22, 2007 at 04:17:44PM +, Catalin Marinas wrote:
> > 
> > > Is the requirement for mb() to act correctly in the SMP case as well?
> > 
> > That's what the docs seem to suggest.  A couple of snippets from
> > memory-barriers.txt:
> > 
> > [1]  A write memory barrier gives a guarantee that all the STORE operations
> >  specified before the barrier will appear to happen before all the STORE
> >  operations specified after the barrier with respect to the other
> >  components of the system.
> > 
> > > [2]  A read barrier is a data dependency barrier plus a guarantee that
> > >      all the LOAD operations specified before the barrier will appear to
> > >      happen before all the LOAD operations specified after the barrier
> > >      with respect to the other components of the system.
> > > 
> > > [3] TYPE            MANDATORY               SMP CONDITIONAL
> > >     =============== ======================= ===========================
> > >     GENERAL         mb()                    smp_mb()
> > >     WRITE           wmb()                   smp_wmb()
> > >     READ            rmb()                   smp_rmb()
> > >     DATA DEPENDENCY read_barrier_depends()  smp_read_barrier_depends()
> > 
> > [4]  Mandatory barriers should not be used to control SMP effects,
> >  since mandatory barriers unnecessarily impose overhead on UP
> >  systems.
> > 
> > Note the wording of 'other components of the system' in [1] and [2] --
> > the way I read it, this includes devices as well as other CPUs.
> 
> Yes, but I suppose which "other components" may depend on the class of barrier
> used.
> 
> > [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> > read_barrier_depends()) SHOULD not be used to control SMP effects, but
> > it does not say that they MUST not.
> 
> As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc.,
> so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't be used if
> smp_mb() is sufficient as that may impact performance on a UP system.
> 
> Really, mb() should only be used with respect to I/O.
> 
> > > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > > case.
> > 
> > The exact wording is:
> > 
> > [!] Note that SMP memory barriers _must_ be used to control the
> > ordering of references to shared memory on SMP systems, though
> > the use of locking instead is sufficient.
> > 
> > This can IMHO be interpreted in two ways:
> > 1. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use SMP memory barriers and not any other kind
> >    of memory barrier.
> 
> If the shared memory is purely an inter-CPU effect, yes.  If the shared memory
> is actually a device with side effects, then I/O safe memory barriers are
> required - mb() and co.  Note that there must _also_ be safety wrt other
> CPUs in the system, as other CPUs may also try to access the device.
> 
> > 2. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use memory barriers, and the SMP memory barrier
> >    is the most appropriate barrier type to use.
> 
> You may use locking instead to control inter-CPU effects.  Locks imply one-way
> permeable SMP-class memory barriers.
> 
> > I'm thinking that [2] is what was intended.  [1] doesn't seem consistent
> > with the rest of the document, but if [1] _is_ what was intended,
> > we're off the hook and mb() and friends can be NOPs on ARM.  (But it'd
> > probably still need a thorough audit... :-/ )
> 
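
The quoted remark that locks imply one-way permeable barriers can be illustrated with a toy spinlock. This is a userspace sketch using C11 acquire/release ordering as an assumed analogue of that one-way behaviour. It is not how the kernel's spinlocks are implemented, and all names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag model_lock = ATOMIC_FLAG_INIT;
static long counter;               /* protected by model_lock */

/* Acquire: later accesses cannot move up above the lock operation. */
static void model_spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&model_lock,
                                             memory_order_acquire))
        ;                          /* spin until the flag was clear */
}

/* Release: earlier accesses cannot move down below the unlock. */
static void model_spin_unlock(void)
{
    atomic_flag_clear_explicit(&model_lock, memory_order_release);
}

static void *worker(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < 100000; i++) {
        model_spin_lock();
        counter++;                 /* critical section */
        model_spin_unlock();
    }
    return NULL;
}

/* Runs two workers and returns the final counter value. */
long model_run_locked_counter(void)
{
    pthread_t t[2];
    int i;

    counter = 0;
    for (i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Accesses outside the critical section may still drift into it, but accesses inside cannot leak out. That is the sense of "one-way permeable" here, and it is why locking can replace explicit SMP barriers for purely inter-CPU effects.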

Re: I/O memory barriers vs SMP memory barriers

2007-03-25 Thread Lennert Buytenhek
On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:

> > > [ background: On ARM, SMP synchronisation does need barriers but device
> > >   synchronisation does not.  The question is that given this, whether
> > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > >   can be used for this.)  ]
> > 
> > H...
> > 
> > [snip]
> 
> 3.  Orders memory accesses and device accesses, but not necessarily
>     the union of the two -- mb(), rmb(), wmb().

If mb/rmb/wmb are required to order normal memory accesses, that means
that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

Does everybody agree on these semantics, though?  At least David seems
to think that mb/rmb/wmb aren't required to order normal memory accesses
against each other..


> 4.  Orders only device accesses, which is what seems to be looked
>     for here.

Yes.  (As above, on ARM, SMP synchronisation does need barriers but
device synchronisation does not.  If mb/rmb/wmb were only required to
synchronise device accesses, they could have been regular compiler
barriers on ARM, but if they are also required to synchronise normal
memory accesses against each other, they have to map to hardware
barriers.)


Re: I/O memory barriers vs SMP memory barriers

2007-03-25 Thread Paul E. McKenney
On Fri, Mar 23, 2007 at 01:43:53PM +, David Howells wrote:
 
 [Resend - this time with a comma in the addresses, not a dot]
 
 Lennert Buytenhek [EMAIL PROTECTED] wrote:
 
  [ background: On ARM, SMP synchronisation does need barriers but device
synchronisation does not.  The question is that given this, whether
mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
supposed to sync against other CPUs or not, or whether only smp_mb()
can be used for this.)  ]
 
 H...
 
 I see your problem.  I think the right way to deal with this is to get rid of
 mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
 io_rmb(), ...

We will get combinatorial explosion if we aren't -extremely- careful:

1.  Orders only normal memory accesses, which is all that is required
of smp_*().

2.  Orders both normal and device accesses -- mmiowb().

3.  Orders memory accesses and device accesses, but not necessarily
the union of the two -- mb(), rmb(), wmb().

4.  Orders only device accesses, which is what seems to be looked
for here.

Thanx, Paul

 I think that there are only two places you should be using explicit memory
 barriers:
 
  (1) To control inter-CPU effects on an SMP system.
 
  (2) To control CPU vs device effects.
 
  On Thu, Mar 22, 2007 at 04:17:44PM +, Catalin Marinas wrote:
  
   Is the requirement for mb() to act correctly in the SMP case as well?
  
  That's what the docs seem to suggest.  A couple of snippets from
  memory-barriers.txt:
  
  [1]  A write memory barrier gives a guarantee that all the STORE operations
   specified before the barrier will appear to happen before all the STORE
   operations specified after the barrier with respect to the other
   components of the system.
  
  [2]  A read barrier is a data dependency barrier plus a guarantee that all 
  the
   LOAD operations specified before the barrier will appear to happen 
  before
   all the LOAD operations specified after the barrier with respect to the
   other components of the system.
  
  [3] TYPEMANDATORY   SMP CONDITIONAL
  === === ===
  GENERAL mb()smp_mb()
  WRITE   wmb()   smp_wmb()
  READrmb()   smp_rmb()
  DATA DEPENDENCY read_barrier_depends()  smp_read_barrier_depends()
  
  [4]  Mandatory barriers should not be used to control SMP effects,
   since mandatory barriers unnecessarily impose overhead on UP
   systems.
  
  Note the wording of 'other components of the system' in [1] and [2] --
  the way I read it, this includes devices as well as other CPUs.
 
 Yes, but I suppose which other components may depend on the class of barrier
 used.
 
  [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
  read_barrier_depends()) SHOULD not be used to control SMP effects, but
  it does not say that they MUST not.
 
 As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc.,
 so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't be used if
 smp_mb() is sufficient as that may impact performance on a UP system.
 
 Really, mb() should only be used with respect to I/O.
 
   The memory-barriers.txt doc says that smp_* must be used for the SMP
   case.
  
  The exact wording is:
  
  [!] Note that SMP memory barriers _must_ be used to control the
  ordering of references to shared memory on SMP systems, though
  the use of locking instead is sufficient.
  
  This can IMHO be interpreted in two ways:
  1. If you want to control ordering of references to shared memory on
 SMP systems, you must use SMP memory barriers and not any other kind
 of memory barrier.
 
 If the shared memory is purely an inter-CPU effect, yes.  If the shared memory
 is actually a device with side effects, then I/O safe memory barriers are
 required - mb() and co.  Note that there must _also_ be safety wrt to other
 CPUs in the system, as other CPUs may also try to access the device.
 
  2. If you want to control ordering of references to shared memory on
 SMP systems, you must use memory barriers, and the SMP memory barrier
 is the most appropriate barrier type to use.
 
 You may use locking instead to control inter-CPU effects.  Locks imply one-way
 permeable SMP-class memory barriers.
 
  I'm thinking that [2] is what was intended.  [1] doesn't seem consistent
  with the rest of the document, but if [1] _is_ what was intended,
  we're off the hook and mb() and friends can be NOPs on ARM.  (But it'd
  probably still need a thorough audit... :-/ )
 
 I think the best way to do an audit would be to make mb() and co. deprecated,
 pending obsolete, and to replace them with io_mb() and co.  That way people
 would 

Re: I/O memory barriers vs SMP memory barriers

2007-03-25 Thread Paul E. McKenney
On Sun, Mar 25, 2007 at 11:38:43PM +0200, Lennert Buytenhek wrote:
 On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:
 
[ background: On ARM, SMP synchronisation does need barriers but device
  synchronisation does not.  The question is that given this, whether
  mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
  supposed to sync against other CPUs or not, or whether only smp_mb()
  can be used for this.)  ]
   
   H...
   
   [snip]
  
  3.  Orders memory accesses and device accesses, but not necessarily
  the union of the two -- mb(), rmb(), wmb().
 
 If mb/rmb/wmb are required to order normal memory accesses, that means
 that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
 to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

This was on UP ARM systems, right?  Assuming that ARM CPUs respect the
usual CPU-self-consistency semantics, and given the background that
device accesses are ordered, then it might well be OK to have mb/rmb/wmb
be barrier() on UP ARM systems.

Most likely not on SMP ARM systems, however.

 Does everybody agree on these semantics, though?  At least David seems
 to think that mb/rmb/wmb aren't required to order normal memory accesses
 against each other..

Not on UP.  On SMP, ordering is (almost certainly) required.

  4.  Orders only device accesses, which is what seems to be looked
  for here.
 
 Yes.  (As above, on ARM, SMP synchronisation does need barriers but
 device synchronisation does not.  If mb/rmb/wmb were only required to
 synchronise device accesses, they could have been regular compiler
 barriers on ARM, but if they are also required to synchronise normal
 memory accesses against each other, they have to map to hardware
 barriers.)

Again, for kernels built for UP, you might well be able to make the
mb() primitives be barrier().  I don't see it for SMP, though.

Thanx, Paul


Re: I/O memory barriers vs SMP memory barriers

2007-03-24 Thread Benjamin Herrenschmidt
On Fri, 2007-03-23 at 13:43 +0000, David Howells wrote:
> [Resend - this time with a comma in the addresses, not a dot]
> 
> Lennert Buytenhek <[EMAIL PROTECTED]> wrote:
> 
> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.)  ]
> 
> H...
> 
> I see your problem.  I think the right way to deal with this is to get rid of
> mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
> io_rmb(), ...

Hrm... I'm not sure I like the io_* name, I think it's even more
confusing, people will never know when to use what ...

Maybe we should dig out again my attempt at properly defining semantics
of IO accessors and related barriers and extend it to include CPU vs.
DMA barriers.

Ben.




Re: I/O memory barriers vs SMP memory barriers

2007-03-23 Thread Lennert Buytenhek
On Fri, Mar 23, 2007 at 01:43:53PM +0000, David Howells wrote:

> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.)  ]
> 
> H...
> 
> I see your problem.  I think the right way to deal with this is to get
> rid of mb(), rmb(), wmb() and read_barrier_depends() and replace them
> with io_mb(), io_rmb(), ...

There's actually three different cases of interest on ARM:
1. direct-mapped and vmalloc()ed kernel memory
2. coherent DMA memory
3. I/O memory (device mappings)

smp_*() only make sense on (1).  Here, you'd want a hardware barrier
on SMP systems, and just a compiler barrier on UP systems.
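
As a rough sketch of case (1): an smp_*() style barrier can be built so
that it is a hardware barrier on SMP builds and collapses to a compiler
barrier on UP builds.  This is user-space C for GCC/Clang, not the
kernel's ARM code.  MY_CONFIG_SMP, my_barrier(), my_smp_mb(), publish(),
consume() and demo() are names made up for the example, and
__sync_synchronize() merely stands in for whatever barrier instruction
the architecture provides.

```c
#include <assert.h>

/* Hypothetical build switch standing in for CONFIG_SMP. */
#ifndef MY_CONFIG_SMP
#define MY_CONFIG_SMP 1
#endif

/* Compiler barrier only: the compiler may not move memory accesses
 * across this point, but no instruction is emitted. */
#define my_barrier() __asm__ __volatile__("" ::: "memory")

#if MY_CONFIG_SMP
/* SMP build: full hardware barrier (GCC/Clang builtin). */
#define my_smp_mb() __sync_synchronize()
#else
/* UP build: no other CPU can observe the order, so a compiler
 * barrier is assumed to be enough. */
#define my_smp_mb() my_barrier()
#endif

static int data;
static int flag;

/* Store the payload, then set the flag; the barrier keeps the two
 * stores in that order. */
static void publish(int value)
{
    data = value;
    my_smp_mb();
    flag = 1;
}

/* Return the payload, or -1 if nothing has been published yet. */
static int consume(void)
{
    if (!flag)
        return -1;
    my_smp_mb();
    return data;
}

/* Single-threaded usage example; it only shows the call pattern. */
static void demo(void)
{
    assert(consume() == -1);
    publish(42);
    assert(consume() == 42);
}
```

Building with -DMY_CONFIG_SMP=0 turns my_smp_mb() into a pure compiler
barrier, which is the UP behaviour described above.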

For (2), most ARM systems use uncached mappings of kernel memory, which
are strongly ordered, and you don't need hardware barriers.  However,
some ARM systems are cache coherent, and they can use ordinary mappings
for (2) (i.e. kmalloc), _but_, such ordinary mappings are weakly ordered,
and so on those systems, you _would_ need hardware barriers for (2).

For (3), Device memory (i.e. I/O mappings) is strongly ordered on all
ARM platforms.

(And of course, then there's the synchronisation issues _between_ the
different mapping types.)
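
The three cases can be restated as a small decision function.  This is
only the reasoning above written out in C, under the assumptions stated
in this mail.  The enum, the function name and the io_coherent flag are
invented for illustration, and the function deliberately says nothing
about ordering _between_ different mapping types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three ARM cases listed above. */
enum arm_mapping {
    MAP_KERNEL_MEMORY,  /* (1) direct-mapped and vmalloc()ed memory */
    MAP_COHERENT_DMA,   /* (2) coherent DMA memory */
    MAP_IO              /* (3) I/O memory (device mappings) */
};

/* Does ordering accesses within one mapping type need a hardware
 * barrier, or is a compiler barrier enough? */
static bool needs_hw_barrier(enum arm_mapping m, bool smp, bool io_coherent)
{
    switch (m) {
    case MAP_KERNEL_MEMORY:
        /* Weakly ordered; only another CPU can observe the order. */
        return smp;
    case MAP_COHERENT_DMA:
        /* Uncached mappings are strongly ordered; on I/O coherent
         * systems ordinary (weakly ordered) mappings are used. */
        return io_coherent;
    case MAP_IO:
        /* Device memory is strongly ordered. */
        return false;
    }
    return true; /* unknown mapping type: be conservative */
}

/* Usage example covering each case once. */
static void demo(void)
{
    assert(needs_hw_barrier(MAP_KERNEL_MEMORY, true, false));
    assert(!needs_hw_barrier(MAP_KERNEL_MEMORY, false, false));
    assert(needs_hw_barrier(MAP_COHERENT_DMA, false, true));
    assert(!needs_hw_barrier(MAP_IO, true, true));
}
```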

Anyway, we could split the barrier types into three groups, or even
more groups (I bet that on, say, ia64, there's at least a couple more
different scenarios of interest), however, I'm really worried that the
Average Joe Driver Writer's head is just going to explode.


> > [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> > read_barrier_depends()) SHOULD not be used to control SMP effects, but
> > it does not say that they MUST not.
> 
> As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(),
> etc.,

Not (anymore) on ARM:


http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9623b3732d11b0a18d9af3419f680d27ea24b014

The question is whether this change was correct.


> so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't
> be used if smp_mb() is sufficient as that may impact performance on
> a UP system.

There's two different statements that can be made about mb():

1. You shouldn't use mb() to synchronise with other CPUs as that is
   unnecessarily slow.

2. You must not use mb() to synchronise with other CPUs as that is
   wrong.

Which is it, (1) or (2)?  The memory-barriers.txt document confuses
these two issues, and you confuse these two issues, but there is a
_fundamental_ _semantic_ _difference_ between these two statements.
Let's not confuse them.


> Really, mb() should only be used with respect to I/O.

OK.  Can we clarify the docs on this point, please?


> > > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > > case.
> > 
> > The exact wording is:
> > 
> > [!] Note that SMP memory barriers _must_ be used to control the
> > ordering of references to shared memory on SMP systems, though
> > the use of locking instead is sufficient.
> > 
> > This can IMHO be interpreted in two ways:
> > 1. If you want to control ordering of references to shared memory on
> > >    SMP systems, you must use SMP memory barriers and not any other kind
> > >    of memory barrier.
> 
> If the shared memory is purely an inter-CPU effect, yes.  If the shared
> memory is actually a device with side effects, then I/O safe memory
> barriers are required - mb() and co.  Note that there must _also_ be
> safety wrt to other CPUs in the system, as other CPUs may also try to
> access the device.

I was not making any statement, I was just giving two possible
interpretations of the above-quoted snippet from memory-barriers.txt.

Yes, I'm aware of the issues you mention, and yes, all the other
necessary guarantees are provided on the ARM platform.


> > 2. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use memory barriers, and the SMP memory barrier
> >    is the most appropriate barrier type to use.
> 
> You may use locking instead to control inter-CPU effects.  Locks imply
> one-way permeable SMP-class memory barriers.

Again, I was not trying to make a statement here, just giving a
possible interpretation of a statement in memory-barriers.txt.


> > I'm thinking that [2] is what was intended.  [1] doesn't seem
> > consistent with the rest of the document, but if [1] _is_ what
> > was intended, we're off the hook and mb() and friends
> > can be NOPs on ARM.  (But it'd probably still need a thorough
> > audit... :-/ )
> 
> I think the best way to do an audit would be to make mb() and co.
> deprecated, pending obsolete, and to replace them with io_mb() and
> co.  That way people would have to eyeball any usages of mb() and
> co.

Sounds OK to me.  Then again, I have an idea of what all the different
types of barriers do.. Joe Driver Writer might not.

I/O memory barriers vs SMP memory barriers

2007-03-23 Thread David Howells

[Resend - this time with a comma in the addresses, not a dot]

Lennert Buytenhek <[EMAIL PROTECTED]> wrote:

> [ background: On ARM, SMP synchronisation does need barriers but device
>   synchronisation does not.  The question is that given this, whether
>   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
>   supposed to sync against other CPUs or not, or whether only smp_mb()
>   can be used for this.)  ]

H...

I see your problem.  I think the right way to deal with this is to get rid of
mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
io_rmb(), ...

I think that there are only two places you should be using explicit memory
barriers:

 (1) To control inter-CPU effects on an SMP system.

 (2) To control CPU vs device effects.

> On Thu, Mar 22, 2007 at 04:17:44PM +0000, Catalin Marinas wrote:
> 
> > Is the requirement for mb() to act correctly in the SMP case as well?
> 
> That's what the docs seem to suggest.  A couple of snippets from
> memory-barriers.txt:
> 
> [1]  A write memory barrier gives a guarantee that all the STORE operations
>  specified before the barrier will appear to happen before all the STORE
>  operations specified after the barrier with respect to the other
>  components of the system.
> 
> [2]  A read barrier is a data dependency barrier plus a guarantee that all the
>  LOAD operations specified before the barrier will appear to happen before
>  all the LOAD operations specified after the barrier with respect to the
>  other components of the system.
> 
> [3] TYPE            MANDATORY               SMP CONDITIONAL
>     =============== ======================= ===========================
>     GENERAL         mb()                    smp_mb()
>     WRITE           wmb()                   smp_wmb()
>     READ            rmb()                   smp_rmb()
>     DATA DEPENDENCY read_barrier_depends()  smp_read_barrier_depends()
> 
> [4]  Mandatory barriers should not be used to control SMP effects,
>  since mandatory barriers unnecessarily impose overhead on UP
>  systems.
> 
> Note the wording of 'other components of the system' in [1] and [2] --
> the way I read it, this includes devices as well as other CPUs.

Yes, but I suppose which "other components" may depend on the class of barrier
used.

> [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> read_barrier_depends()) SHOULD not be used to control SMP effects, but
> it does not say that they MUST not.

As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc.,
so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't be used if
smp_mb() is sufficient as that may impact performance on a UP system.

Really, mb() should only be used with respect to I/O.

> > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > case.
> 
> The exact wording is:
> 
>   [!] Note that SMP memory barriers _must_ be used to control the
>   ordering of references to shared memory on SMP systems, though
>   the use of locking instead is sufficient.
> 
> This can IMHO be interpreted in two ways:
> 1. If you want to control ordering of references to shared memory on
>    SMP systems, you must use SMP memory barriers and not any other kind
>    of memory barrier.

If the shared memory is purely an inter-CPU effect, yes.  If the shared memory
is actually a device with side effects, then I/O safe memory barriers are
required - mb() and co.  Note that there must _also_ be safety wrt to other
CPUs in the system, as other CPUs may also try to access the device.

> 2. If you want to control ordering of references to shared memory on
>    SMP systems, you must use memory barriers, and the SMP memory barrier
>    is the most appropriate barrier type to use.

You may use locking instead to control inter-CPU effects.  Locks imply one-way
permeable SMP-class memory barriers.
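
To illustrate the "one-way permeable" point: a lock acquire keeps later
accesses from moving before it, and a release keeps earlier accesses from
moving after it, but neither is a full barrier on its own.  A toy
spinlock written with C11 atomics shows the two halves.  struct toy_lock
and the helper names are hypothetical, and this is not how the kernel's
spinlocks are implemented.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical toy spinlock; not the kernel's spinlock_t. */
struct toy_lock {
    atomic_flag held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    /* ACQUIRE: accesses after this point cannot be moved before it;
     * accesses before it may still move into the critical section. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void toy_lock_release(struct toy_lock *l)
{
    /* RELEASE: accesses before this point cannot be moved after it;
     * accesses after it may still move into the critical section. */
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static struct toy_lock counter_lock = { .held = ATOMIC_FLAG_INIT };
static int shared_counter;

/* Increment the shared counter under the lock and return the new value. */
static int counter_increment(void)
{
    int v;

    toy_lock_acquire(&counter_lock);
    v = ++shared_counter;
    toy_lock_release(&counter_lock);
    return v;
}

/* Single-threaded usage example; it only shows the call pattern. */
static void demo(void)
{
    assert(counter_increment() == 1);
    assert(counter_increment() == 2);
}
```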

> I'm thinking that [2] is what was intended.  [1] doesn't seem consistent
> with the rest of the document, but if [1] _is_ what was intended,
> we're off the hook and mb() and friends can be NOPs on ARM.  (But it'd
> probably still need a thorough audit... :-/ )

I think the best way to do an audit would be to make mb() and co. deprecated,
pending obsolete, and to replace them with io_mb() and co.  That way people
would have to eyeball any usages of mb() and co.

> > This means that if code uses mb() to control SMP sharing, it is broken.
> 
> I'm not so sure.

If it's _purely_ to control inter-CPU SMP sharing, then yes, it's broken.  It
must use either a lock or an smp_*mb() barrier.

Of course, Linus may disagree...

David


