Re: MMIO and gcc re-ordering issue

2008-06-12 Thread Paul Mackerras
Nick Piggin writes: /* turn off LED */ val64 = readq(&bar0->adapter_control); val64 = val64 & (~ADAPTER_LED_ON); writeq(val64, &bar0->adapter_control); s2io_link(nic, LINK_DOWN); }

Re: MMIO and gcc re-ordering issue

2008-06-12 Thread Nick Piggin
On Thursday 12 June 2008 22:14, Paul Mackerras wrote: Nick Piggin writes: /* turn off LED */ val64 = readq(&bar0->adapter_control); val64 = val64 & (~ADAPTER_LED_ON); writeq(val64, &bar0->adapter_control);

Re: MMIO and gcc re-ordering issue

2008-06-12 Thread Matthew Wilcox
On Thu, Jun 05, 2008 at 06:43:53PM +1000, Benjamin Herrenschmidt wrote: Note that the powerpc implementation currently clears the flag on spin_lock and tests it on unlock. We are considering changing that to not touch the flag on spin_lock and just clear it whenever we do a sync (ie, on

Re: MMIO and gcc re-ordering issue

2008-06-12 Thread Benjamin Herrenschmidt
On Thu, 2008-06-12 at 09:07 -0600, Matthew Wilcox wrote: On Thu, Jun 05, 2008 at 06:43:53PM +1000, Benjamin Herrenschmidt wrote: Note that the powerpc implementation currently clears the flag on spin_lock and tests it on unlock. We are considering changing that to not touch the flag on

Re: MMIO and gcc re-ordering issue

2008-06-11 Thread Nick Piggin
On Wednesday 11 June 2008 15:35, Nick Piggin wrote: On Wednesday 11 June 2008 15:13, Paul Mackerras wrote: Nick Piggin writes: I just wish we had even one actual example of things going wrong with the current rules we have on powerpc to motivate changing to this model.

Re: MMIO and gcc re-ordering issue

2008-06-11 Thread Paul Mackerras
Nick Piggin writes: Now that doesn't leave weaker ordering architectures lumped with slow old x86 semantics. Think of it as giving them the benefit of sharing x86 development and testing :) Worth something, but not perhaps as much as you think, given that many x86 driver writers still don't

Re: MMIO and gcc re-ordering issue

2008-06-11 Thread Linus Torvalds
On Wed, 11 Jun 2008, Nick Piggin wrote: I can't actually find the definitive statement in the Intel manuals saying UC is strongly ordered also WRT WB. Linus? Definitive? Dunno. But look in the Architecture manual, volume 3A, 10.3 Methods of Caching Available, and then under the bullet

Re: MMIO and gcc re-ordering issue

2008-06-11 Thread Jesse Barnes
On Tuesday, June 10, 2008 8:29 pm Nick Piggin wrote: On Wednesday 11 June 2008 05:19, Jesse Barnes wrote: On Tuesday, June 10, 2008 12:05 pm Roland Dreier wrote: me too. That's the whole basis for readX_relaxed() and its cohorts: we make our weirdest machines (like altix) conform to

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Nick Piggin
On Wednesday 04 June 2008 05:07, Linus Torvalds wrote: On Tue, 3 Jun 2008, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Jesse Barnes
On Monday, June 09, 2008 11:56 pm Nick Piggin wrote: So that still doesn't tell us what *minimum* level of ordering we should provide in the cross platform readl/writel API. Some relatively sane suggestions would be: - as strong as x86. guaranteed not to break drivers that work on x86, but

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread James Bottomley
On Tue, 2008-06-10 at 10:41 -0700, Jesse Barnes wrote: On Monday, June 09, 2008 11:56 pm Nick Piggin wrote: So that still doesn't tell us what *minimum* level of ordering we should provide in the cross platform readl/writel API. Some relatively sane suggestions would be: - as strong as

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Roland Dreier
me too. That's the whole basis for readX_relaxed() and its cohorts: we make our weirdest machines (like altix) conform to the x86 norm. Then where it really kills us we introduce additional semantics to selected drivers that enable us to recover I/O speed on the abnormal platforms.

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Jesse Barnes
On Tuesday, June 10, 2008 12:05 pm Roland Dreier wrote: me too. That's the whole basis for readX_relaxed() and its cohorts: we make our weirdest machines (like altix) conform to the x86 norm. Then where it really kills us we introduce additional semantics to selected drivers that

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Nick Piggin
On Wednesday 11 June 2008 05:19, Jesse Barnes wrote: On Tuesday, June 10, 2008 12:05 pm Roland Dreier wrote: me too. That's the whole basis for readX_relaxed() and its cohorts: we make our weirdest machines (like altix) conform to the x86 norm. Then where it really kills us we

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Benjamin Herrenschmidt
On Wed, 2008-06-11 at 13:29 +1000, Nick Piggin wrote: Exactly, yes. I guess everybody has had good intentions here, but as noticed, what is lacking is coordination and documentation. You mention strong ordering WRT spin_unlock, which suggests that you would prefer to take option #2 (the

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Nick Piggin
On Wednesday 11 June 2008 13:40, Benjamin Herrenschmidt wrote: On Wed, 2008-06-11 at 13:29 +1000, Nick Piggin wrote: Exactly, yes. I guess everybody has had good intentions here, but as noticed, what is lacking is coordination and documentation. You mention strong ordering WRT

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Paul Mackerras
Nick Piggin writes: OK, I'm still not quite sure where this has ended up. I guess you are happy with x86 semantics as they are now. That is, all IO accesses are strongly ordered WRT one another and WRT cacheable memory (which includes keeping them within spinlocks), My understanding was that

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Nick Piggin
On Wednesday 11 June 2008 14:18, Paul Mackerras wrote: Nick Piggin writes: OK, I'm still not quite sure where this has ended up. I guess you are happy with x86 semantics as they are now. That is, all IO accesses are strongly ordered WRT one another and WRT cacheable memory (which includes

Re: MMIO and gcc re-ordering issue

2008-06-10 Thread Paul Mackerras
Nick Piggin writes: I just wish we had even one actual example of things going wrong with the current rules we have on powerpc to motivate changing to this model. ~/usr/src/linux-2.6 git grep test_and_set_bit drivers/ | wc -l 506 How sure are you that none of those forms part of a

Re: MMIO and gcc re-ordering issue

2008-06-05 Thread Jes Sorensen
Jesse Barnes wrote: Now, in hindsight, using a PIO write set test flag approach in writeX/spin_unlock (ala powerpc) might have been a better approach, but iirc that never came up in the discussion, probably because we were focused on PCI posting and not uncached vs. cached ordering. Hi

Re: MMIO and gcc re-ordering issue

2008-06-05 Thread Benjamin Herrenschmidt
On Thu, 2008-06-05 at 10:40 +0200, Jes Sorensen wrote: Jesse Barnes wrote: Now, in hindsight, using a PIO write set test flag approach in writeX/spin_unlock (ala powerpc) might have been a better approach, but iirc that never came up in the discussion, probably because we were focused

Re: MMIO and gcc re-ordering issue

2008-06-04 Thread Linus Torvalds
On Mon, 2 Jun 2008, Haavard Skinnemoen wrote: So what happened to the old idea of putting the accessor function pointers in the device/bus structure? Don't know. I think it sounds like overkill to replace a simple load or store with an indirect function call. Indeed. *Especially* as

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Jeremy Higdon
On Tue, Jun 03, 2008 at 06:19:05PM +1000, Nick Piggin wrote: On Tuesday 03 June 2008 18:15, Jeremy Higdon wrote: On Tue, Jun 03, 2008 at 02:33:11PM +1000, Nick Piggin wrote: On Monday 02 June 2008 19:56, Jes Sorensen wrote: Would we be able to use Ben's trick of setting a per cpu flag in

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Nick Piggin
On Tuesday 03 June 2008 18:15, Jeremy Higdon wrote: On Tue, Jun 03, 2008 at 02:33:11PM +1000, Nick Piggin wrote: On Monday 02 June 2008 19:56, Jes Sorensen wrote: Would we be able to use Ben's trick of setting a per cpu flag in writel() then and checking that in spin unlock issuing the

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Haavard Skinnemoen
Scott Wood [EMAIL PROTECTED] wrote: On Mon, Jun 02, 2008 at 10:11:02AM +0200, Haavard Skinnemoen wrote: Geert Uytterhoeven [EMAIL PROTECTED] wrote: On Fri, 30 May 2008, Haavard Skinnemoen wrote: Maybe we need another interface that does not do byteswapping but provides stronger

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Benjamin Herrenschmidt
On Tue, 2008-06-03 at 16:11 +1000, Nick Piggin wrote: - readl is synchronous (ie, makes the CPU think the data was actually used before executing subsequent instructions, thus waits for the data to come back, for example to ensure that a read used to

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Nick Piggin
On Tuesday 03 June 2008 16:53, Paul Mackerras wrote: Nick Piggin writes: So your readl can pass an earlier cacheable store or earlier writel? No. It's quite gross at the moment, it has a sync before the access (i.e. a full mb()) and a twi; isync sequence after the access that stalls

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Jeremy Higdon
On Tue, Jun 03, 2008 at 02:33:11PM +1000, Nick Piggin wrote: On Monday 02 June 2008 19:56, Jes Sorensen wrote: Jeremy Higdon wrote: We don't actually have that problem on the Altix. All writes issued by CPU X will be ordered with respect to each other. But writes by CPU X and CPU Y

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Paul Mackerras
Nick Piggin writes: So your readl can pass an earlier cacheable store or earlier writel? No. It's quite gross at the moment, it has a sync before the access (i.e. a full mb()) and a twi; isync sequence after the access that stalls execution until the data comes back. We don't provide

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Linus Torvalds
On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to other memory types (ie. load/load and store/store reordering is allowed). Also, as you know, store/load reordering is explicitly allowed as well,

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Linus Torvalds
On Tue, 3 Jun 2008, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to other memory types (ie. load/load and store/store

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Trent Piepho
On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to other memory types (ie. load/load and store/store reordering is allowed). Also, as you know, store/load

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Matthew Wilcox
On Tue, Jun 03, 2008 at 11:47:00AM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to other memory types (ie. load/load and

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Trent Piepho
On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 11:47:00AM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Trent Piepho
On Tue, 3 Jun 2008, Nick Piggin wrote: On Monday 02 June 2008 17:24, Russell King wrote: So, can the semantics of what's expected from these IO accessor functions be documented somewhere. Please? Before this thread gets lost in the depths of time? This whole thread also ties in with my

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Matthew Wilcox
On Tue, Jun 03, 2008 at 12:43:21PM -0700, Trent Piepho wrote: IOW, there are four ways one can define endianness/swapping: 1) Little-endian 2) Big-endian 3) Native-endian aka non-byte-swapping 4) Foreign-endian aka byte-swapping 1 and 2 are by far the most used. Some code wants 3. No
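
The four conventions listed in the snippet above can be sketched in portable userspace C. This is only an illustration, not the kernel's accessors: every function name here is hypothetical, and the helpers read from a plain byte buffer rather than from a real device register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 1) Little-endian: byte 0 is the least significant byte. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 2) Big-endian: byte 0 is the most significant byte. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* 3) Native-endian: whatever the host CPU uses, with no swapping. */
static uint32_t load_native32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Unconditional byte swap of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* 4) Foreign-endian: always the opposite of the host, i.e. always swapped. */
static uint32_t load_foreign32(const uint8_t *p)
{
    return swap32(load_native32(p));
}
```

On any given host, the native load agrees with exactly one of the fixed-endian loads and the foreign load agrees with the other, which is why a driver for a device with a defined register endianness normally wants 1 or 2 rather than 3 or 4.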

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Matthew Wilcox
On Tue, Jun 03, 2008 at 12:57:56PM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 11:47:00AM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Trent Piepho
On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 12:43:21PM -0700, Trent Piepho wrote: IOW, there are four ways one can define endianness/swapping: 1) Little-endian 2) Big-endian 3) Native-endian aka non-byte-swapping 4) Foreign-endian aka byte-swapping 1 and 2 are by far the

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Trent Piepho
On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 12:57:56PM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 11:47:00AM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote:

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Benjamin Herrenschmidt
On Tue, 2008-06-03 at 12:43 -0700, Trent Piepho wrote: Byte-swapping vs not byte-swapping is not usually what the programmer wants. Usually your device's registers are defined as being big-endian or little-endian and you want whatever is needed to give you that. Yes, which is why I (and

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Nick Piggin
On Wednesday 04 June 2008 07:58, Trent Piepho wrote: On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 12:57:56PM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Matthew Wilcox wrote: On Tue, Jun 03, 2008 at 11:47:00AM -0700, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Nick Piggin
On Wednesday 04 June 2008 05:07, Linus Torvalds wrote: On Tue, 3 Jun 2008, Trent Piepho wrote: On Tue, 3 Jun 2008, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Nick Piggin
On Wednesday 04 June 2008 00:47, Linus Torvalds wrote: On Tue, 3 Jun 2008, Nick Piggin wrote: Linus: on x86, memory operations to wc and wc+ memory are not ordered with one another, or operations to other memory types (ie. load/load and store/store reordering is allowed). Also, as you know,

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Nick Piggin
On Wednesday 04 June 2008 07:44, Trent Piepho wrote: On Tue, 3 Jun 2008, Matthew Wilcox wrote: I don't understand why you keep talking about DMA. Are you talking about ordering between readX() and DMA? PCI provides those guarantees. I guess you haven't been reading the whole thread. The

Re: MMIO and gcc re-ordering issue

2008-06-03 Thread Linus Torvalds
On Wed, 4 Jun 2008, Nick Piggin wrote: Actually, according to the document I am looking at (the AMD one), a UC store may pass a previous WC store. Hmm. Intel arch manual, Vol 3, 10.3 (page 10-7 in my version): If the WC buffer is partially filled, the writes may be delayed until the

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Russell King
On Tue, May 27, 2008 at 02:55:56PM -0700, Linus Torvalds wrote: On Wed, 28 May 2008, Benjamin Herrenschmidt wrote: A problem with __raw_ though is that they -also- don't do byteswap, Well, that's why there is __readl() and __raw_readl(), no? Neither does ordering, and __raw_readl()

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Haavard Skinnemoen
Geert Uytterhoeven [EMAIL PROTECTED] wrote: On Fri, 30 May 2008, Haavard Skinnemoen wrote: Maybe we need another interface that does not do byteswapping but provides stronger ordering guarantees? The byte swapping depends on the device/bus. Of course. But isn't it reasonable to assume

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Jes Sorensen
Jeremy Higdon wrote: We don't actually have that problem on the Altix. All writes issued by CPU X will be ordered with respect to each other. But writes by CPU X and CPU Y will not be, unless an mmiowb() is done by the original CPU before the second CPU writes. I.e. CPU X writel

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Ingo Molnar
* Linus Torvalds [EMAIL PROTECTED] wrote: Here's a UNTESTED patch for x86 that may or may not compile and work, and which serializes (on a compiler level) the IO accesses against regular memory accesses. Ok, so it at least boots on x86-32. Thus probably on x86-64 too (since the code

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Jes Sorensen
Pavel Machek wrote: Still better than changing semantics of writel for _all_ drivers. If you are really sure driver does not depend on writel order, it is just a sed/// command, so I don't see any big code maintenance issues... This isn't changing the semantics for all drivers, it means it

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Scott Wood
On Mon, Jun 02, 2008 at 10:11:02AM +0200, Haavard Skinnemoen wrote: Geert Uytterhoeven [EMAIL PROTECTED] wrote: On Fri, 30 May 2008, Haavard Skinnemoen wrote: Maybe we need another interface that does not do byteswapping but provides stronger ordering guarantees? The byte swapping

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Jeremy Higdon
On Mon, Jun 02, 2008 at 11:56:39AM +0200, Jes Sorensen wrote: Jeremy Higdon wrote: We don't actually have that problem on the Altix. All writes issued by CPU X will be ordered with respect to each other. But writes by CPU X and CPU Y will not be, unless an mmiowb() is done by the original

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Benjamin Herrenschmidt
On Mon, 2008-06-02 at 12:36 +0200, Ingo Molnar wrote: The patch passed initial light testing in -tip (~30 successful random self-builds and bootups on various mixed 32-bit/64-bit boxes) but it's still v2.6.27 material IMO. I think adding the memory clobber should be .26 and even -stable. We

Re: MMIO and gcc re-ordering issue

2008-06-02 Thread Nick Piggin
On Monday 02 June 2008 19:56, Jes Sorensen wrote: Jeremy Higdon wrote: We don't actually have that problem on the Altix. All writes issued by CPU X will be ordered with respect to each other. But writes by CPU X and CPU Y will not be, unless an mmiowb() is done by the original CPU

Re: MMIO and gcc re-ordering issue

2008-06-01 Thread Pavel Machek
Hi! Though it's my understanding that at least ia64 does require the explicit barriers anyway, so we are still in a dodgy situation here where it's not clear what drivers should do and we end up with possibly excessive barriers on powerpc where I end up with both the wmb/rmb/mb

Re: MMIO and gcc re-ordering issue

2008-06-01 Thread Pavel Machek
Hi! The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On Altix you have to go and read from the PCI bridge to ensure all writes to it have been flushed, which is also what mmiowb() is

Re: MMIO and gcc re-ordering issue

2008-05-31 Thread Jeremy Higdon
On Thu, May 29, 2008 at 10:47:18AM -0400, Jes Sorensen wrote: That's not going to solve the problem on Altix. On Altix the issue is that there can be multiple paths through the NUMA fabric from cpuX to PCI bridge Y. Consider this uber-cool(tm) ascii art - NR is my abbreviation for NUMA router:

Re: MMIO and gcc re-ordering issue

2008-05-31 Thread Jeremy Higdon
On Fri, May 30, 2008 at 10:21:00AM -0700, Jesse Barnes wrote: On Friday, May 30, 2008 2:36 am Jes Sorensen wrote: James Bottomley wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's.

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Haavard Skinnemoen
On Fri, 30 May 2008 11:13:23 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: Currently, this is the only interface I know that can do native-endian accesses, so if you take it away, I'm gonna need an alternative interface that doesn't do byteswapping. Are you aware that these

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Benjamin Herrenschmidt
On Fri, 2008-05-30 at 08:07 +0200, Haavard Skinnemoen wrote: On Fri, 30 May 2008 11:13:23 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: Currently, this is the only interface I know that can do native-endian accesses, so if you take it away, I'm gonna need an alternative

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Haavard Skinnemoen
On Fri, 30 May 2008 17:24:27 +1000 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: On Fri, 2008-05-30 at 08:07 +0200, Haavard Skinnemoen wrote: I think the drivers I've written have the necessary barriers (or dma ops with implicit barriers) that they don't actually depend on any DMA vs.

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Geert Uytterhoeven
On Fri, 30 May 2008, Haavard Skinnemoen wrote: Maybe we need another interface that does not do byteswapping but provides stronger ordering guarantees? The byte swapping depends on the device/bus. So what happened to the old idea of putting the accessor function pointers in the device/bus

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Jes Sorensen
James Bottomley wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On Altix you have to go and read from the PCI bridge to ensure all writes to it have been flushed, which is also what

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Jes Sorensen
Jesse Barnes wrote: On Thursday, May 29, 2008 2:40 pm Benjamin Herrenschmidt wrote: On Thu, 2008-05-29 at 10:47 -0400, Jes Sorensen wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Jes Sorensen
Benjamin Herrenschmidt wrote: On Thu, 2008-05-29 at 10:47 -0400, Jes Sorensen wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On Altix you have to go and read from the PCI bridge to ensure

Re: MMIO and gcc re-ordering issue

2008-05-30 Thread Jesse Barnes
On Friday, May 30, 2008 2:36 am Jes Sorensen wrote: James Bottomley wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On Altix you have to go and read from the PCI bridge to ensure all

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Arnd Bergmann
On Wednesday 28 May 2008, Benjamin Herrenschmidt wrote: On Tue, 2008-05-27 at 14:55 -0700, Linus Torvalds wrote: On Wed, 28 May 2008, Benjamin Herrenschmidt wrote: A problem with __raw_ though is that they -also- don't do byteswap, Well, that's why there is __readl() and

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Alan Cox
It's not exactly a well-established interface. Only five architectures define these functions, and there is not a single user in the kernel source outside of these architecture's io.h files. That is because the drivers using them had them removed (eg I²O) - mostly because it didn't compile on

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Pantelis Antoniou
On 28 May 2008, at 11:36 AM, Haavard Skinnemoen wrote: Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: I'm happy to say that __raw is purely about ordering and make them byteswap on powerpc tho (ie, make them little endian like the non-raw counterpart). That would break a lot of drivers.

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread James Bottomley
On Thu, 2008-05-29 at 10:47 -0400, Jes Sorensen wrote: Roland == Roland Dreier [EMAIL PROTECTED] writes: This is a different issue. We deal with it on powerpc by having writel set a per-cpu flag and spin_unlock() test it, and do the barrier if needed there. Roland Cool... I assume you

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Jes Sorensen
Roland == Roland Dreier [EMAIL PROTECTED] writes: This is a different issue. We deal with it on powerpc by having writel set a per-cpu flag and spin_unlock() test it, and do the barrier if needed there. Roland Cool... I assume you do this for mutex_unlock() etc? Roland Is there any reason

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Benjamin Herrenschmidt
On Thu, 2008-05-29 at 10:47 -0400, Jes Sorensen wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On Altix you have to go and read from the PCI bridge to ensure all writes to it have been

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Trent Piepho
On Fri, 30 May 2008, Benjamin Herrenschmidt wrote: On Thu, 2008-05-29 at 10:47 -0400, Jes Sorensen wrote: Interesting. I've always been taught by ia64 people that mmiowb() was intended to be used solely between writel() and spin_unlock(). That's what I gathered too, based on what's written in

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Jesse Barnes
On Thursday, May 29, 2008 2:40 pm Benjamin Herrenschmidt wrote: On Thu, 2008-05-29 at 10:47 -0400, Jes Sorensen wrote: The only way to guarantee ordering in the above setup, is to either make writel() fully ordered or adding the mmiowb()'s in between the two writel's. On Altix you have to go

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Roland Dreier
The problem is that your two writel's, despite being both issued on cpu X, due to the spin lock, in your example, can end up with the first one going through NR 1 and the second one going through NR 2. If there's contention on NR 1, the write going via NR 2 may hit the PCI bridge prior

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Benjamin Herrenschmidt
On Thu, 2008-05-29 at 14:48 -0700, Trent Piepho wrote: I wrote a JTAG over gpio driver for the powerpc MPC8572DS platform. With the non-raw io accessors, the JTAG clock can run at almost ~9.5 MHz. Using raw versions (which I had to write since powerpc doesn't have any), the clock speed

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Benjamin Herrenschmidt
I do -- in all the drivers for on-chip peripherals that are shared between AT91 ARM (LE) and AVR32 (BE). Since everything goes on inside the chip, we must use LE accesses on ARM and BE accesses on AVR32. Currently, this is the only interface I know that can do native-endian accesses, so if

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Trent Piepho
On Fri, 30 May 2008, Benjamin Herrenschmidt wrote: On Thu, 2008-05-29 at 14:48 -0700, Trent Piepho wrote: I wrote a JTAG over gpio driver for the powerpc MPC8572DS platform. With the non-raw io accessors, the JTAG clock can run at almost ~9.5 MHz. Using raw versions (which I had to write

Re: MMIO and gcc re-ordering issue

2008-05-29 Thread Paul Mackerras
Trent Piepho writes: On Thu, 29 May 2008, Roland Dreier wrote: The problem is that your two writel's, despite being both issued on cpu X, due to the spin lock, in your example, can end up with the first one going through NR 1 and the second one going through NR 2. If there's

Re: MMIO and gcc re-ordering issue

2008-05-28 Thread Haavard Skinnemoen
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: I'm happy to say that __raw is purely about ordering and make them byteswap on powerpc tho (ie, make them little endian like the non-raw counterpart). That would break a lot of drivers. How many actually use __raw_ * ? I do --

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds
On Tue, 27 May 2008, Linus Torvalds wrote: Here's a UNTESTED patch for x86 that may or may not compile and work, and which serializes (on a compiler level) the IO accesses against regular memory accesses. Ok, so it at least boots on x86-32. Thus probably on x86-64 too (since the code is
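
The snippet above mentions serializing IO accesses against regular memory accesses "on a compiler level". A minimal userspace sketch of that idea follows, assuming GCC or Clang inline-asm syntax. It is not the patch from the thread: `compiler_barrier`, `sketch_writel`, `sketch_desc` and `sketch_kick` are hypothetical names, and the "register" is an ordinary variable.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier: emits no instruction, but the "memory" clobber
 * tells the compiler it may not move memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical MMIO-style write: the volatile store is fenced on both
 * sides so plain loads and stores cannot be reordered around it by the
 * compiler. This does nothing about reordering done by the CPU or bus. */
static inline void sketch_writel(uint32_t val, volatile uint32_t *addr)
{
    compiler_barrier();
    *addr = val;
    compiler_barrier();
}

/* Hypothetical descriptor that a device would read after the doorbell. */
struct sketch_desc {
    uint32_t len;
    uint32_t ready;
};

/* Fill the descriptor in ordinary memory, then ring the doorbell. Without
 * the clobber, a compiler would be free to sink the descriptor stores
 * below the volatile doorbell store. */
static void sketch_kick(struct sketch_desc *d, volatile uint32_t *doorbell,
                        uint32_t len)
{
    d->len = len;
    d->ready = 1;
    sketch_writel(1, doorbell);
}
```

A compiler barrier of this kind only constrains code generation. Hardware-level ordering, which most of the thread is about, still needs real barrier instructions or ordered accessors on weakly ordered machines.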

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread James Bottomley
On Tue, 2008-05-27 at 08:50 -0700, Roland Dreier wrote: Though it's my understanding that at least ia64 does require the explicit barriers anyway, so we are still in a dodgy situation here where it's not clear what drivers should do and we end up with possibly excessive barriers on

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
Actually, this specifically should not be. The need for mmiowb on altix is because it explicitly violates some of the PCI rules that would otherwise impede performance. The compromise is that readX on altix contains the needed dma flush but there's a variant operator, readX_relaxed

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread James Bottomley
On Tue, 2008-05-27 at 10:38 -0700, Roland Dreier wrote: Actually, this specifically should not be. The need for mmiowb on altix is because it explicitly violates some of the PCI rules that would otherwise impede performance. The compromise is that readX on altix contains the needed

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
Um, OK, you've said write twice now ... I was assuming you meant read. Even on an x86, writes are posted, so there's no way a spin lock could serialise a write without an intervening read to flush the posting (that's why only reads have a relaxed version on altix). Or is there something

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
Writes are posted yes, but not reordered arbitrarily. on standard x86 I mean here...

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Trent Piepho
On Tue, 27 May 2008, Linus Torvalds wrote: On Tue, 27 May 2008, Benjamin Herrenschmidt wrote: Yes. As it is today, tg3 for example is potentially broken on all archs with newer gcc unless we either add memory clobber to readl/writel or stick some wmb's in there (just a random driver I picked).

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Scott Wood
Trent Piepho wrote: Is there an issue with anything _besides_ coherent DMA? Could one have a special version of the accessors for drivers that want to assume they are strictly ordered vs coherent DMA memory? That would be much easier to get right, without slowing _everything_ down. It's

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt
On Tue, 2008-05-27 at 08:35 -0700, Linus Torvalds wrote: On Tue, 27 May 2008, Benjamin Herrenschmidt wrote: Yes. As it is today, tg3 for example is potentially broken on all archs with newer gcc unless we either add memory clobber to readl/writel or stick some wmb's in there (just a

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt
On Tue, 2008-05-27 at 08:50 -0700, Roland Dreier wrote: Though it's my understanding that at least ia64 does require the explicit barriers anyway, so we are still in a dodgy situation here where it's not clear what drivers should do and we end up with possibly excessive barriers on

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt
On Tue, 2008-05-27 at 09:47 -0700, Linus Torvalds wrote: __read[bwlq]()/__write[bwlq]() are not serialized with a :memory barrier, although since they still use asm volatile I suspect that in practice they are probably serial too. Did not look very closely at any generated code (only

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Chris Friesen
Roland Dreier wrote: Writes are posted yes, but not reordered arbitrarily. If I have code like: spin_lock(&mmio_lock); writel(val1, reg1); writel(val2, reg2); spin_unlock(&mmio_lock); then I have a reasonable expectation that if two CPUs run this at the same

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
Writes are posted yes, but not reordered arbitrarily. If I have code like: spin_lock(&mmio_lock); writel(val1, reg1); writel(val2, reg2); spin_unlock(&mmio_lock); then I have a reasonable expectation that if two CPUs run this at the same time, their writes to

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds
On Wed, 28 May 2008, Benjamin Herrenschmidt wrote: On Tue, 2008-05-27 at 08:35 -0700, Linus Torvalds wrote: Expecting people to fix up all drivers is simply not going to happen. And serializing things shouldn't be *that* expensive. People who cannot take the expense can continue to

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
This is a different issue. We deal with it on powerpc by having writel set a per-cpu flag and spin_unlock() test it, and do the barrier if needed there. Cool... I assume you do this for mutex_unlock() etc? Is there any reason why ia64 can't do this too so we can kill mmiowb and save
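
The per-cpu flag approach described in the snippet above (writel sets a flag, spin_unlock tests it and does the barrier only if needed) can be modeled in userspace as follows. This is a sketch under stated assumptions, not the powerpc implementation: `sim_cpu`, `sim_barrier`, `sim_writel` and `sim_spin_unlock` are hypothetical names, a single simulated CPU is used, and a counter stands in for the real barrier instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state for the simulation. */
struct sim_cpu {
    bool io_sync_pending;   /* set when an MMIO write has been issued */
    unsigned barriers;      /* number of barriers "executed" so far */
};

/* Stand-in for the hardware barrier: count it and clear the flag. */
static void sim_barrier(struct sim_cpu *cpu)
{
    cpu->barriers++;
    cpu->io_sync_pending = false;
}

/* MMIO write: perform the store and record that a barrier is owed. */
static void sim_writel(struct sim_cpu *cpu, volatile uint32_t *reg,
                       uint32_t val)
{
    *reg = val;
    cpu->io_sync_pending = true;
}

/* Unlock: pay for the barrier only if an MMIO write happened since the
 * last barrier, then release the (simulated) lock. */
static void sim_spin_unlock(struct sim_cpu *cpu, bool *lock)
{
    if (cpu->io_sync_pending)
        sim_barrier(cpu);
    *lock = false;
}
```

The point of the design is that critical sections with no MMIO writes pay nothing extra at unlock, while any number of writes inside one critical section cost a single barrier.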

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt
So practically speaking, I suspect that the right approach is to just say ok, x86 will continue to be pretty darn ordered, and the barriers won't really matter (*) but at the same time also saying we wish reality was different, and well-maintained drivers should probably try to work in the

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Alan Cox
re-ordering, even though I doubt it will be visible in practice. So if you use the __ versions, you'd better have barriers even on x86! Are we also going to have __ioread*/__iowrite* ? Also are the semantics of __readl/__writel defined for all architectures - I used it ages ago in the i2o

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Matthew Wilcox
On Tue, May 27, 2008 at 10:38:22PM +0100, Alan Cox wrote: re-ordering, even though I doubt it will be visible in practice. So if you use the __ versions, you'd better have barriers even on x86! Are we also going to have __ioread*/__iowrite* ? Didn't we already define ioread*() to have

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds
On Wed, 28 May 2008, Benjamin Herrenschmidt wrote: A problem with __raw_ though is that they -also- don't do byteswap, Well, that's why there is __readl() and __raw_readl(), no? Neither does ordering, and __raw_readl() doesn't do byte-swap. Of course, I'm not going to guarantee every

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds
On Tue, 27 May 2008, Alan Cox wrote: re-ordering, even though I doubt it will be visible in practice. So if you use the __ versions, you'd better have barriers even on x86! Are we also going to have __ioread*/__iowrite* ? I doubt there is any reason to. Let's just keep them very
