Re: I/O memory barriers vs SMP memory barriers
On Mon, Mar 26, 2007 at 01:07:11PM -0700, Paul E. McKenney wrote:

> > > > Does everybody agree on these semantics, though? At least David
> > > > seems to think that mb/rmb/wmb aren't required to order normal
> > > > memory accesses against each other..
> > >
> > > Not on UP. On SMP, ordering is (almost certainly) required.
> >
> > 'almost certainly'? That sounds like there is a possibility that it
> > wouldn't have to? What does this depend on?
>
> The underlying memory model of the CPU. For sequentially consistent
> systems, only compiler barriers are required. There are very few such
> systems -- MIPS and PA-RISC, if I remember correctly. Performance
> dictates otherwise.
>
> I believe that MIPS is -not- sequentially consistent, but have not yet
> purchased an architecture reference manual.

ARM Normal memory (RAM) accesses are weakly ordered, so on SMP, you
need barriers. (SMP ARM systems are the definite minority, though.)

(For ARM UP, we generally don't care, since most have virtual caches
and are not I/O coherent, and so DMA coherent mappings will be done
as uncached mappings, and uncached mappings are strongly ordered --
except on XScale V3, which supports I/O coherency, and so you need to
use barriers when operating on DMA coherent memory because DMA coherent
mappings are done as Normal memory (which is weakly ordered) when I/O
coherency is enabled.)

> Given that ARM device drivers are accessing MMIO locations, which are
> often slow anyway, how much is ARM really gaining by dropping memory
> barriers when only I/O accesses need be ordered? Is it measurable?

No idea -- I assume Catalin has looked at this.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: I/O memory barriers vs SMP memory barriers
On Mon, Mar 26, 2007 at 10:46:39AM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:
>
> > > > > > [ background: On ARM, SMP synchronisation does need barriers but
> > > > > >   device synchronisation does not. The question is that given
> > > > > >   this, whether mb() and friends can be NOPs on ARM or not (i.e.
> > > > > >   whether mb() is supposed to sync against other CPUs or not, or
> > > > > >   whether only smp_mb() can be used for this.) ]
> > > > >
> > > > > H...
> > > > >
> > > > > [snip]
> > > >
> > > > 3. Orders memory accesses and device accesses, but not necessarily
> > > >    the union of the two -- mb(), rmb(), wmb().
> > >
> > > If mb/rmb/wmb are required to order normal memory accesses, that means
> > > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
> >
> > This was on UP ARM systems, right?
>
> No.
>
> If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
> see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
> The UP part is obviously fine, the SMP part is what is under debate here.

Yep, looks wrong to me.

> > Assuming that ARM CPUs respect the usual CPU-self-consistency
> > semantics, and given the background that device accesses are ordered,
> > then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> > systems.
> >
> > Most likely not on SMP ARM systems, however.
>
> Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
> on ARM UP systems.. I don't think anyone ever disagreed about that.

Good.

> > > Does everybody agree on these semantics, though? At least David
> > > seems to think that mb/rmb/wmb aren't required to order normal
> > > memory accesses against each other..
> >
> > Not on UP. On SMP, ordering is (almost certainly) required.
>
> 'almost certainly'? That sounds like there is a possibility that it
> wouldn't have to? What does this depend on?

The underlying memory model of the CPU. For sequentially consistent
systems, only compiler barriers are required. There are very few such
systems -- MIPS and PA-RISC, if I remember correctly. Performance
dictates otherwise.

I believe that MIPS is -not- sequentially consistent, but have not yet
purchased an architecture reference manual.

> At least David and Catalin seem to disagree with the statement
> that mb/rmb/wmb should order accesses from different CPUs. And
> memory-barriers.txt is pretty vague about this..

mb() needs to do everything that smp_mb() does, ditto for rmb() and wmb().
There really are cases where both I/O and memory accesses need to be
ordered, so just providing separate memory ordering and I/O ordering
is not enough.

Given that ARM device drivers are accessing MMIO locations, which are
often slow anyway, how much is ARM really gaining by dropping memory
barriers when only I/O accesses need be ordered? Is it measurable?
If not, there is no point in adding yet another set of combinatorial
choices to the memory-barrier API.

						Thanx, Paul
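The point about the CPU memory model can be illustrated with a userspace message-passing sketch. It uses C11 fences and POSIX threads, not the kernel primitives; the release and acquire fences only roughly play the roles of smp_wmb() and smp_rmb(), and every name in it (payload, ready, run_message_passing) is made up for the illustration. On a weakly ordered SMP machine the fences have to become hardware barriers, while on a sequentially consistent one a compiler barrier would do.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, published through the flag */
static atomic_int ready;   /* flag polled by the consumer */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;
	/* Roughly the role of smp_wmb(): payload is visible before ready. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
	return NULL;
}

static void *consumer(void *arg)
{
	int *seen = arg;

	while (!atomic_load_explicit(&ready, memory_order_relaxed))
		;	/* spin until the flag is set */
	/* Roughly the role of smp_rmb(): read payload only after ready. */
	atomic_thread_fence(memory_order_acquire);
	*seen = payload;
	return NULL;
}

/* Runs one producer/consumer pair; returns the value the consumer saw. */
int run_message_passing(void)
{
	pthread_t p, c;
	int seen = -1;

	payload = 0;
	atomic_store(&ready, 0);
	if (pthread_create(&p, NULL, producer, NULL) != 0)
		return -1;
	if (pthread_create(&c, NULL, consumer, &seen) != 0) {
		pthread_join(p, NULL);
		return -1;
	}
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

With both fences in place the consumer can only ever observe the published value. Dropping either fence leaves the program racy on weakly ordered hardware, even though it may appear to work on a strongly ordered machine.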
Re: I/O memory barriers vs SMP memory barriers
Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:

> Hrm... I'm not sure I like the io_* name, I think it's even more
> confusing, people will never know when to use what ...

I'd've thought it more obvious, but given there are several types of I/O,
some of which might require different barriering to others, I can see your
point.

However, I think mb() unadorned is also confusing.

> Maybe we should dig out again my attempt at properly defining semantics
> of IO accessors and related barriers and extend it to include CPU vs.
> DMA barriers.

That could be useful.

David
Re: I/O memory barriers vs SMP memory barriers
Lennert Buytenhek <[EMAIL PROTECTED]> wrote:

> Does everybody agree on these semantics, though? At least David seems
> to think that mb/rmb/wmb aren't required to order normal memory accesses
> against each other..

Ummm... I've just realised that your statement here is ambiguous. When you
say "aren't required to", do you mean "aren't necessary to" or do you mean
"don't have to"? Isn't English a fun language?

Anyway, what I meant is that mb() and co. as they stand _must_ do everything
smp_mb() and co do respectively, _in_ _addition_ to other side effects.

	mb() implies smp_mb()
	rmb() implies smp_rmb()
	wmb() implies smp_wmb()
	...

David
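The "implies" relation above can be sketched with userspace stand-ins. None of this is the kernel's or any architecture's actual definition: the sk_ names and the SKETCH_SMP switch are hypothetical, and __sync_synchronize() (a GCC/Clang builtin) merely stands in for a hardware fence. The property being shown is that the mandatory barriers are always real fences, while the SMP-conditional ones expand to them on an SMP build and collapse to a compiler barrier on UP.

```c
/* Userspace stand-ins only; the sk_ names and SKETCH_SMP are hypothetical. */
#define sk_barrier()	__asm__ __volatile__("" ::: "memory")	/* compiler only */

/* Mandatory barriers: always a hardware fence in this sketch. */
#define sk_mb()		__sync_synchronize()
#define sk_rmb()	__sync_synchronize()
#define sk_wmb()	__sync_synchronize()

#ifdef SKETCH_SMP
/* SMP build: the SMP barriers are the mandatory ones. */
#define sk_smp_mb()	sk_mb()
#define sk_smp_rmb()	sk_rmb()
#define sk_smp_wmb()	sk_wmb()
#else
/* UP build: only the compiler needs restraining. */
#define sk_smp_mb()	sk_barrier()
#define sk_smp_rmb()	sk_barrier()
#define sk_smp_wmb()	sk_barrier()
#endif

/* Single-threaded use of the macros, so the sketch compiles and runs. */
int sk_publish(int value)
{
	static int data, flag;

	data = value;
	sk_smp_wmb();	/* order the data store before the flag store */
	flag = 1;
	sk_mb();	/* never weaker than sk_smp_mb() in either build */
	return flag ? data : -1;
}
```

Under this layering, replacing sk_smp_mb() with sk_mb() can cost performance on UP but cannot lose ordering, which is the superset property being described.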
Re: I/O memory barriers vs SMP memory barriers
On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:

> > > > > [ background: On ARM, SMP synchronisation does need barriers but
> > > > >   device synchronisation does not. The question is that given this,
> > > > >   whether mb() and friends can be NOPs on ARM or not (i.e. whether
> > > > >   mb() is supposed to sync against other CPUs or not, or whether
> > > > >   only smp_mb() can be used for this.) ]
> > > >
> > > > H...
> > > >
> > > > [snip]
> > >
> > > 3. Orders memory accesses and device accesses, but not necessarily
> > >    the union of the two -- mb(), rmb(), wmb().
> >
> > If mb/rmb/wmb are required to order normal memory accesses, that means
> > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
>
> This was on UP ARM systems, right?

No.

If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
The UP part is obviously fine, the SMP part is what is under debate here.

> Assuming that ARM CPUs respect the usual CPU-self-consistency
> semantics, and given the background that device accesses are ordered,
> then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> systems.
>
> Most likely not on SMP ARM systems, however.

Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
on ARM UP systems.. I don't think anyone ever disagreed about that.

> > Does everybody agree on these semantics, though? At least David
> > seems to think that mb/rmb/wmb aren't required to order normal
> > memory accesses against each other..
>
> Not on UP. On SMP, ordering is (almost certainly) required.

'almost certainly'? That sounds like there is a possibility that it
wouldn't have to? What does this depend on?

At least David and Catalin seem to disagree with the statement
that mb/rmb/wmb should order accesses from different CPUs. And
memory-barriers.txt is pretty vague about this..
Re: I/O memory barriers vs SMP memory barriers
On Sun, Mar 25, 2007 at 11:38:43PM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:
>
> > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > >   synchronisation does not. The question is that given this, whether
> > > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > > >   can be used for this.) ]
> > >
> > > H...
> > >
> > > [snip]
> >
> > 3. Orders memory accesses and device accesses, but not necessarily
> >    the union of the two -- mb(), rmb(), wmb().
>
> If mb/rmb/wmb are required to order normal memory accesses, that means
> that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

This was on UP ARM systems, right?

Assuming that ARM CPUs respect the usual CPU-self-consistency semantics,
and given the background that device accesses are ordered, then it might
well be OK to have mb/rmb/wmb be barrier() on UP ARM systems.

Most likely not on SMP ARM systems, however.

> Does everybody agree on these semantics, though? At least David seems
> to think that mb/rmb/wmb aren't required to order normal memory accesses
> against each other..

Not on UP. On SMP, ordering is (almost certainly) required.

> > 4. Orders only device accesses, which is what seems to be looked
> >    for here.
>
> Yes. (As above, on ARM, SMP synchronisation does need barriers but
> device synchronisation does not. If mb/rmb/wmb were only required to
> synchronise device accesses, they could have been regular compiler
> barriers on ARM, but if they are also required to synchronise normal
> memory accesses against each other, they have to map to hardware
> barriers.)

Again, for kernels built for UP, you might well be able to make the mb()
primitives be barrier(). I don't see it for SMP, though.

						Thanx, Paul
Re: I/O memory barriers vs SMP memory barriers
On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:

> > > [ background: On ARM, SMP synchronisation does need barriers but device
> > >   synchronisation does not. The question is that given this, whether
> > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > >   can be used for this.) ]
> >
> > H...
> >
> > [snip]
>
> 3. Orders memory accesses and device accesses, but not necessarily
>    the union of the two -- mb(), rmb(), wmb().

If mb/rmb/wmb are required to order normal memory accesses, that means
that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

Does everybody agree on these semantics, though? At least David seems
to think that mb/rmb/wmb aren't required to order normal memory accesses
against each other..

> 4. Orders only device accesses, which is what seems to be looked
>    for here.

Yes. (As above, on ARM, SMP synchronisation does need barriers but
device synchronisation does not. If mb/rmb/wmb were only required to
synchronise device accesses, they could have been regular compiler
barriers on ARM, but if they are also required to synchronise normal
memory accesses against each other, they have to map to hardware
barriers.)
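The case analysis in the last paragraph can be written down as a small decision function. This is only a restatement of the reasoning in this thread, taking the stated background (device accesses already ordered on ARM) as given and ignoring the XScale V3 I/O-coherency exception mentioned elsewhere in the thread. The enum and function names are hypothetical and appear in no kernel source.

```c
#include <stdbool.h>

/* Hypothetical names, used only for this sketch. */
enum barrier_impl {
	IMPL_COMPILER_BARRIER,	/* barrier(): restrains the compiler only */
	IMPL_HARDWARE_BARRIER	/* a real CPU barrier instruction */
};

/*
 * What mb()/rmb()/wmb() would have to be on ARM, assuming device
 * accesses are already ordered by the hardware.
 *
 * smp_kernel:               built for SMP rather than UP
 * must_order_normal_memory: mb() and friends are required to order
 *                           normal memory accesses against each other
 */
enum barrier_impl arm_mb_impl(bool smp_kernel, bool must_order_normal_memory)
{
	if (smp_kernel && must_order_normal_memory)
		return IMPL_HARDWARE_BARRIER;
	/* UP, or device-only ordering: a compiler barrier is enough. */
	return IMPL_COMPILER_BARRIER;
}
```

Only the SMP case with the stronger reading of mb() needs a hardware barrier, which is why the definition in the commit under discussion is fine for UP and in question for SMP.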
Re: I/O memory barriers vs SMP memory barriers
On Fri, Mar 23, 2007 at 01:43:53PM +, David Howells wrote:
>
> [Resend - this time with a comma in the addresses, not a dot]
>
> Lennert Buytenhek <[EMAIL PROTECTED]> wrote:
>
> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not. The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.) ]
>
> H...
>
> I see your problem. I think the right way to deal with this is to get rid of
> mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
> io_rmb(), ...

We will get combinatorial explosion if we aren't -extremely- careful:

1. Orders only normal memory accesses, which is all that is
   required of smp_*().

2. Orders both normal and device accesses -- mmiowb().

3. Orders memory accesses and device accesses, but not necessarily
   the union of the two -- mb(), rmb(), wmb().

4. Orders only device accesses, which is what seems to be looked
   for here.

						Thanx, Paul

> I think that there are only two places you should be using explicit memory
> barriers:
>
>  (1) To control inter-CPU effects on an SMP system.
>
>  (2) To control CPU vs device effects.
>
> > On Thu, Mar 22, 2007 at 04:17:44PM +, Catalin Marinas wrote:
> >
> > > Is the requirement for mb() to act correctly in the SMP case as well?
> >
> > That's what the docs seem to suggest. A couple of snippets from
> > memory-barriers.txt:
> >
> > [1] A write memory barrier gives a guarantee that all the STORE operations
> >     specified before the barrier will appear to happen before all the STORE
> >     operations specified after the barrier with respect to the other
> >     components of the system.
> >
> > [2] A read barrier is a data dependency barrier plus a guarantee that all
> >     the LOAD operations specified before the barrier will appear to happen
> >     before all the LOAD operations specified after the barrier with respect
> >     to the other components of the system.
> >
> > [3] TYPE             MANDATORY               SMP CONDITIONAL
> >     ===============  ======================  ==========================
> >     GENERAL          mb()                    smp_mb()
> >     WRITE            wmb()                   smp_wmb()
> >     READ             rmb()                   smp_rmb()
> >     DATA DEPENDENCY  read_barrier_depends()  smp_read_barrier_depends()
> >
> > [4] Mandatory barriers should not be used to control SMP effects,
> >     since mandatory barriers unnecessarily impose overhead on UP
> >     systems.
> >
> > Note the wording of 'other components of the system' in [1] and [2] --
> > the way I read it, this includes devices as well as other CPUs.
>
> Yes, but I suppose which "other components" may depend on the class of barrier
> used.
>
> > [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> > read_barrier_depends()) SHOULD not be used to control SMP effects, but
> > it does not say that they MUST not.
>
> As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc.,
> so, yes, currently, mb() implies smp_mb(). However, mb() shouldn't be used if
> smp_mb() is sufficient as that may impact performance on a UP system.
>
> Really, mb() should only be used with respect to I/O.
>
> > > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > > case.
> >
> > The exact wording is:
> >
> > [!] Note that SMP memory barriers _must_ be used to control the
> >     ordering of references to shared memory on SMP systems, though
> >     the use of locking instead is sufficient.
> >
> > This can IMHO be interpreted in two ways:
> > 1. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use SMP memory barriers and not any other kind
> >    of memory barrier.
>
> If the shared memory is purely an inter-CPU effect, yes. If the shared memory
> is actually a device with side effects, then I/O safe memory barriers are
> required - mb() and co. Note that there must _also_ be safety wrt other
> CPUs in the system, as other CPUs may also try to access the device.
>
> > 2. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use memory barriers, and the SMP memory barrier
> >    is the most appropriate barrier type to use.
>
> You may use locking instead to control inter-CPU effects. Locks imply one-way
> permeable SMP-class memory barriers.
>
> > I'm thinking that [2] is what was intended. [1] doesn't seem consistent
> > with the rest of the document, but if [1] _is_ what was intended,
> > we're off the hook and mb() and friends can be NOPs on ARM. (But it'd
> > probably still need a thorough audit... :-/ )
>
> I think the best way to do an audit would be to make mb() and co. deprecated,
> pending obsolete, and to replace them with io_mb() and co. That way people would
Re: I/O memory barriers vs SMP memory barriers
On Sun, Mar 25, 2007 at 11:38:43PM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 02:15:42PM -0700, Paul E. McKenney wrote:
>
> > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > >   synchronisation does not.  The question is that given this, whether
> > > >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > >   supposed to sync against other CPUs or not, or whether only smp_mb()
> > > >   can be used for this.) ]
> > >
> > > H...
> > >
> > > [snip]
> >
> > 3.  Orders memory accesses and device accesses, but not necessarily
> >     the union of the two -- mb(), rmb(), wmb().
>
> If mb/rmb/wmb are required to order normal memory accesses, that means
> that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> to always define mb/rmb/wmb as barrier() on ARM systems was wrong.

This was on UP ARM systems, right?  Assuming that ARM CPUs respect
the usual CPU-self-consistency semantics, and given the background
that device accesses are ordered, then it might well be OK to have
mb/rmb/wmb be barrier() on UP ARM systems.

Most likely not on SMP ARM systems, however.

> Does everybody agree on these semantics, though?  At least David
> seems to think that mb/rmb/wmb aren't required to order normal
> memory accesses against each other..

Not on UP.  On SMP, ordering is (almost certainly) required.

> > 4.  Orders only device accesses, which is what seems to be looked
> >     for here.
>
> Yes.
>
> (As above, on ARM, SMP synchronisation does need barriers but device
> synchronisation does not.  If mb/rmb/wmb were only required to
> synchronise device accesses, they could have been regular compiler
> barriers on ARM, but if they are also required to synchronise normal
> memory accesses against each other, they have to map to hardware
> barriers.)

Again, for kernels built for UP, you might well be able to make the
mb() primitives be barrier().  I don't see it for SMP, though.

Thanx, Paul
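[ Editorial sketch, not part of the thread. The position Paul takes
above (barrier() is enough on UP, a hardware barrier is needed on SMP)
can be modelled in userspace C roughly as follows. MODEL_SMP is a
hypothetical stand-in for CONFIG_SMP, hw_mb() is a made-up name for the
architecture's barrier instruction, and the C11 fence is only an
approximation of it. None of this is the actual ARM kernel code. ]

```c
#include <stdatomic.h>

/* Compiler-only barrier (GCC/Clang syntax): forbids the compiler from
 * moving memory accesses across this point, emits no instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for a hardware barrier instruction. */
#define hw_mb() atomic_thread_fence(memory_order_seq_cst)

/* MODEL_SMP is an assumed switch playing the role of CONFIG_SMP. */
#ifdef MODEL_SMP
#define mb()     hw_mb()    /* SMP: normal memory needs real ordering */
#define smp_mb() hw_mb()
#else
#define mb()     barrier()  /* UP: CPU self-consistency is enough */
#define smp_mb() barrier()
#endif

static int shared_a, shared_b;

/* Reports which flavour smp_mb() expands to in this build. */
int smp_mb_is_hardware(void)
{
#ifdef MODEL_SMP
    return 1;
#else
    return 0;
#endif
}

/* Two stores that must become visible in program order. */
int write_pair(int a, int b)
{
    shared_a = a;
    smp_mb();               /* order the store to shared_a before shared_b */
    shared_b = b;
    return shared_a + shared_b;
}
```

[ Building with -DMODEL_SMP switches both macros to the fence. Whether
mb() may stay a plain barrier() in that configuration is exactly the
point the thread leaves open. ]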
Re: I/O memory barriers vs SMP memory barriers
On Fri, 2007-03-23 at 13:43 +, David Howells wrote:
> [Resend - this time with a comma in the addresses, not a dot]
>
> Lennert Buytenhek <[EMAIL PROTECTED]> wrote:
>
> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.) ]
>
> H...
>
> I see your problem.  I think the right way to deal with this is to get rid of
> mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
> io_rmb(), ...

Hrm... I'm not sure I like the io_* name, I think it's even more
confusing, people will never know when to use what ...

Maybe we should dig out again my attempt at properly defining semantics
of IO accessors and related barriers and extend it to include CPU vs.
DMA barriers.

Ben.
I/O memory barriers vs SMP memory barriers
[Resend - this time with a comma in the addresses, not a dot]

Lennert Buytenhek <[EMAIL PROTECTED]> wrote:

> [ background: On ARM, SMP synchronisation does need barriers but device
>   synchronisation does not.  The question is that given this, whether
>   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
>   supposed to sync against other CPUs or not, or whether only smp_mb()
>   can be used for this.) ]

H...

I see your problem.  I think the right way to deal with this is to get rid of
mb(), rmb(), wmb() and read_barrier_depends() and replace them with io_mb(),
io_rmb(), ...

I think that there are only two places you should be using explicit memory
barriers:

 (1) To control inter-CPU effects on an SMP system.

 (2) To control CPU vs device effects.

> On Thu, Mar 22, 2007 at 04:17:44PM +, Catalin Marinas wrote:
>
> > Is the requirement for mb() to act correctly in the SMP case as well?
>
> That's what the docs seem to suggest.  A couple of snippets from
> memory-barriers.txt:
>
> [1] A write memory barrier gives a guarantee that all the STORE operations
>     specified before the barrier will appear to happen before all the STORE
>     operations specified after the barrier with respect to the other
>     components of the system.
>
> [2] A read barrier is a data dependency barrier plus a guarantee that all the
>     LOAD operations specified before the barrier will appear to happen before
>     all the LOAD operations specified after the barrier with respect to the
>     other components of the system.
>
> [3] TYPE             MANDATORY               SMP CONDITIONAL
>     ===============  ======================  ==========================
>     GENERAL          mb()                    smp_mb()
>     WRITE            wmb()                   smp_wmb()
>     READ             rmb()                   smp_rmb()
>     DATA DEPENDENCY  read_barrier_depends()  smp_read_barrier_depends()
>
> [4] Mandatory barriers should not be used to control SMP effects,
>     since mandatory barriers unnecessarily impose overhead on UP
>     systems.
>
> Note the wording of 'other components of the system' in [1] and [2] --
> the way I read it, this includes devices as well as other CPUs.

Yes, but I suppose which "other components" may depend on the class of barrier
used.

> [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> read_barrier_depends()) SHOULD not be used to control SMP effects, but
> it does not say that they MUST not.

As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(), etc., so,
yes, currently, mb() implies smp_mb().  However, mb() shouldn't be used if
smp_mb() is sufficient as that may impact performance on a UP system.

Really, mb() should only be used with respect to I/O.

> > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > case.
>
> The exact wording is:
>
>     [!] Note that SMP memory barriers _must_ be used to control the
>     ordering of references to shared memory on SMP systems, though
>     the use of locking instead is sufficient.
>
> This can IMHO be interpreted in two ways:
> 1. If you want to control ordering of references to shared memory on
>    SMP systems, you must use SMP memory barriers and not any other kind
>    of memory barrier.

If the shared memory is purely an inter-CPU effect, yes.  If the shared memory
is actually a device with side effects, then I/O safe memory barriers are
required - mb() and co.  Note that there must _also_ be safety wrt to other
CPUs in the system, as other CPUs may also try to access the device.

> 2. If you want to control ordering of references to shared memory on
>    SMP systems, you must use memory barriers, and the SMP memory barrier
>    is the most appropriate barrier type to use.

You may use locking instead to control inter-CPU effects.  Locks imply one-way
permeable SMP-class memory barriers.

> I'm thinking that [2] is what was intended.  [1] doesn't seem consistent
> with the rest of the document, but if [1] _is_ what was intended,
> we're off the hook and mb() and friends can be NOPs on ARM.  (But it'd
> probably still need a thorough audit... :-/ )

I think the best way to do an audit would be to make mb() and co. deprecated,
pending obsolete, and to replace them with io_mb() and co.  That way people
would have to eyeball any usages of mb() and co.

> > This means that if code uses mb() to control SMP sharing, it is broken.
>
> I'm not so sure.

If it's _purely_ to control inter-CPU SMP sharing, then yes, it's broken.  It
must use either a lock or an smp_*mb() barrier.

Of course, Linus may disagree...

David
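[ Editorial sketch, not part of the thread. David's case (1), a purely
inter-CPU effect on shared memory, is the classic "write the data, then
set a flag" pattern. The userspace model below approximates smp_wmb()
and smp_rmb() with C11 release and acquire fences, which are at least
as strong as the kernel primitives they stand in for. The names
payload, ready, publish() and try_consume() are invented for the
illustration. ]

```c
#include <stdatomic.h>

/* Userspace approximations of the kernel primitives (assumption). */
#define smp_wmb() atomic_thread_fence(memory_order_release)
#define smp_rmb() atomic_thread_fence(memory_order_acquire)

static int payload;       /* ordinary shared memory */
static atomic_int ready;  /* flag the other CPU polls; starts at 0 */

/* Producer side: make the data visible before the flag. */
void publish(int value)
{
    payload = value;
    smp_wmb();  /* order the payload store before the flag store */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer side: returns 1 and fills *out once the flag is seen. */
int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    smp_rmb();  /* order the flag load before the payload load */
    *out = payload;
    return 1;
}
```

[ No device is involved here, so under David's reading only the smp_*()
variants (or a lock) are appropriate. If payload were a device register
with side effects, this sketch would not apply. ]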
Re: I/O memory barriers vs SMP memory barriers
On Fri, Mar 23, 2007 at 01:43:53PM +, David Howells wrote:

> > [ background: On ARM, SMP synchronisation does need barriers but device
> >   synchronisation does not.  The question is that given this, whether
> >   mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> >   supposed to sync against other CPUs or not, or whether only smp_mb()
> >   can be used for this.) ]
>
> H...
>
> I see your problem.  I think the right way to deal with this is to get
> rid of mb(), rmb(), wmb() and read_barrier_depends() and replace them
> with io_mb(), io_rmb(), ...

There's actually three different cases of interest on ARM:
1. direct-mapped and vmalloc()ed kernel memory
2. coherent DMA memory
3. I/O memory (device mappings)

smp_*() only make sense on (1).  Here, you'd want a hardware barrier
on SMP systems, and just a compiler barrier on UP systems.

For (2), most ARM systems use uncached mappings of kernel memory,
which are strongly ordered, and you don't need hardware barriers.
However, some ARM systems are cache coherent, and they can use ordinary
mappings for (2) (i.e. kmalloc), _but_, such ordinary mappings are weakly
ordered, and so on those systems, you _would_ need hardware barriers
for (2).

For (3), Device memory (i.e. I/O mappings) are strongly ordered on all
ARM platforms.

(And of course, then there's the synchronisation issues _between_ the
different mapping types.)

Anyway, we could split the barrier types into three groups, or even
more groups (I bet that on, say, ia64, there's at least a couple more
different scenarios of interest), however, I'm really worried that
the Average Joe Driver Writer's head is just going to explode.

> > [4] says that mandatory barriers (i.e. from [3]: mb(), wmb(), rmb(),
> > read_barrier_depends()) SHOULD not be used to control SMP effects, but
> > it does not say that they MUST not.
>
> As it stands, mb() is a superset of smp_mb(), and rmb() of smp_rmb(),
> etc.,

Not (anymore) on ARM:

    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9623b3732d11b0a18d9af3419f680d27ea24b014

The question is whether this change was correct.

> so, yes, currently, mb() implies smp_mb().  However, mb() shouldn't
> be used if smp_mb() is sufficient as that may impact performance on
> a UP system.

There's two different statements that can be made about mb():
1. You shouldn't use mb() to synchronise with other CPUs as that is
   unnecessarily slow.
2. You must not use mb() to synchronise with other CPUs as that is
   wrong.

Which is it, (1) or (2)?

The memory-barriers.txt document confuses these two issues, and you
confuse these two issues, but there is a _fundamental_ _semantic_
_difference_ between these two statements.  Let's not confuse them.

> Really, mb() should only be used with respect to I/O.

OK.  Can we clarify the docs on this point, please?

> > > The memory-barriers.txt doc says that smp_* must be used for the SMP
> > > case.
> >
> > The exact wording is:
> >
> >     [!] Note that SMP memory barriers _must_ be used to control the
> >     ordering of references to shared memory on SMP systems, though
> >     the use of locking instead is sufficient.
> >
> > This can IMHO be interpreted in two ways:
> > 1. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use SMP memory barriers and not any other kind
> >    of memory barrier.
>
> If the shared memory is purely an inter-CPU effect, yes.  If the shared
> memory is actually a device with side effects, then I/O safe memory
> barriers are required - mb() and co.  Note that there must _also_ be
> safety wrt to other CPUs in the system, as other CPUs may also try to
> access the device.

I was not making any statement, I was just giving two possible
interpretations of the above-quoted snippet from memory-barriers.txt.

Yes, I'm aware of the issues you mention, and yes, all the other
necessary guarantees are provided on the ARM platform.

> > 2. If you want to control ordering of references to shared memory on
> >    SMP systems, you must use memory barriers, and the SMP memory barrier
> >    is the most appropriate barrier type to use.
>
> You may use locking instead to control inter-CPU effects.  Locks imply
> one-way permeable SMP-class memory barriers.

Again, I was not trying to make a statement here, just giving a
possible interpretation of a statement in memory-barriers.txt.

> > I'm thinking that [2] is what was intended.  [1] doesn't seem
> > consistent with the rest of the document, but if [1] _is_ what
> > was intended, we're off the hook and mb() and friends
> > can be NOPs on ARM.  (But it'd probably still need a thorough
> > audit... :-/ )
>
> I think the best way to do an audit would be to make mb() and co.
> deprecated, pending obsolete, and to replace them with io_mb() and
> co.  That way people would have to eyeball any usages of mb() and
> co.

Sounds OK to me.

Then again, I have an idea of what all the different types of barriers
do..  Joe Driver Writer might not.
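[ Editorial sketch, not part of the thread. Lennert's three ARM cases
can be written down as a small decision function. It encodes only what
his message states and, like the message, leaves aside ordering
_between_ different mapping types. The enum, the function name and both
flags are invented for the illustration and are not kernel API. ]

```c
#include <stdbool.h>

/* The three cases of interest on ARM, as listed in the message. */
enum mapping {
    KERNEL_MEMORY,  /* (1) direct-mapped and vmalloc()ed kernel memory */
    COHERENT_DMA,   /* (2) coherent DMA memory */
    IO_MEMORY       /* (3) I/O memory (device mappings) */
};

/*
 * Returns true when a hardware barrier, rather than just a compiler
 * barrier, is needed to order accesses within one mapping type.
 *
 *   smp          - kernel built for SMP
 *   io_coherent  - coherent DMA uses ordinary (weakly ordered) mappings
 *                  instead of uncached (strongly ordered) ones
 */
bool needs_hw_barrier(enum mapping m, bool smp, bool io_coherent)
{
    switch (m) {
    case KERNEL_MEMORY:
        return smp;          /* hardware barrier on SMP, compiler barrier on UP */
    case COHERENT_DMA:
        return io_coherent;  /* uncached mappings are already strongly ordered */
    case IO_MEMORY:
        return false;        /* device memory is strongly ordered on all ARM */
    }
    return false;
}
```

[ Three inputs for one architecture already give several distinct
answers, which is roughly the "head is going to explode" concern about
multiplying barrier families. ]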