Bestcomm and AC97

2008-05-27 Thread Jon Smirl
I'm playing around with an ALSA SoC driver for the Efika. The Efika
uses a STAC9266 for AC97 which supports simultaneous SPDIF and analog
output. To do this the PSC needs DMA data from two different buffers
interleaved and placed into its FIFO. Is this something Bestcomm can
do?

Examples
mono + spdif = 1 sample from buffer one, two from buffer two
stereo + spdif = 2 samples from buffer one, two from buffer two
clone the output, 2 samples from buffer one, write to FIFO twice

When only spdif or analog is used standalone the existing generic task works.
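
For illustration only, the FIFO ordering asked for above can be modelled in
plain C. This is not Bestcomm task code; interleave_frames() and its
parameters are hypothetical names, and it only shows the sample order the
PSC FIFO would need to see.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical software model of the requested FIFO order.
 * For every frame, n_a samples are taken from buffer a, then n_b samples
 * from buffer b. Returns the number of samples written to fifo.
 */
static size_t interleave_frames(const uint32_t *a, size_t n_a,
                                const uint32_t *b, size_t n_b,
                                size_t frames, uint32_t *fifo)
{
    size_t out = 0;

    for (size_t f = 0; f < frames; f++) {
        for (size_t i = 0; i < n_a; i++)
            fifo[out++] = a[f * n_a + i];   /* analog slots */
        for (size_t i = 0; i < n_b; i++)
            fifo[out++] = b[f * n_b + i];   /* spdif slots */
    }
    return out;
}
```

With n_a = 1 and n_b = 2 this gives the "mono + spdif" case; passing the
same buffer as both a and b with n_a = n_b = 2 gives the "clone" case.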

-- 
Jon Smirl
[EMAIL PROTECTED]
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: PPC PReP link failure - copy_page, empty_zero_page

2008-05-27 Thread Tony Breeds
On Tue, May 27, 2008 at 08:25:43PM +0300, Meelis Roos wrote:
> While compiling the current kernel on prep ppc:
> 
>   MODPOST 703 modules
> ERROR: "copy_page" [fs/fuse/fuse.ko] undefined!
> ERROR: "empty_zero_page" [fs/ext4/ext4dev.ko] undefined!

Try that patch at: http://patchwork.ozlabs.org/linuxppc/patch?id=18710

Fixes it for me.

Yours Tony

  linux.conf.au    http://www.marchsouth.org/
  Jan 19 - 24 2009 The Australian Linux Technical Conference!



[PATCH] [PPC] Export empty_zero_page and copy_page on ppc

2008-05-27 Thread Tony Breeds
Currently ext4 and fuse fail to link if modular:
ERROR: "copy_page" [fs/fuse/fuse.ko] undefined!
ERROR: "empty_zero_page" [fs/ext4/ext4dev.ko] undefined!
make[3]: *** [__modpost] Error 1
make[2]: *** [modules] Error 2
make[1]: *** [sub-make] Error 2

While arch ppc exists it may as well compile.

Signed-off-by: Tony Breeds <[EMAIL PROTECTED]>
---
Should be safe for 2.6.26.

 arch/ppc/kernel/ppc_ksyms.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/ppc/kernel/ppc_ksyms.c b/arch/ppc/kernel/ppc_ksyms.c
index 602c268..5d529bc 100644
--- a/arch/ppc/kernel/ppc_ksyms.c
+++ b/arch/ppc/kernel/ppc_ksyms.c
@@ -60,8 +60,10 @@ long long __ashrdi3(long long, int);
 long long __ashldi3(long long, int);
 long long __lshrdi3(long long, int);
 
+EXPORT_SYMBOL(empty_zero_page);
 EXPORT_SYMBOL(clear_pages);
 EXPORT_SYMBOL(clear_user_page);
+EXPORT_SYMBOL(copy_page);
 EXPORT_SYMBOL(transfer_to_handler);
 EXPORT_SYMBOL(do_IRQ);
 EXPORT_SYMBOL(machine_check_exception);
-- 
1.5.5.1



[PATCH] [POWERPC] Add "memory" clobber to MMIO accessors

2008-05-27 Thread Benjamin Herrenschmidt
gcc might re-order MMIO accessors vs. surrounding consistent
memory accesses, which is a "bad thing".

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

 include/asm-powerpc/io.h |   12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

--- linux-work.orig/include/asm-powerpc/io.h 2008-05-28 09:39:11.0 +1000
+++ linux-work/include/asm-powerpc/io.h 2008-05-28 09:40:28.0 +1000
@@ -100,7 +100,7 @@ static inline type name(const volatile t
 {  \
type ret;   \
__asm__ __volatile__("sync;" insn ";twi 0,%0,0;isync"   \
-   : "=r" (ret) : "r" (addr), "m" (*addr));\
+   : "=r" (ret) : "r" (addr), "m" (*addr) : "memory"); \
return ret; \
 }
 
@@ -108,8 +108,8 @@ static inline type name(const volatile t
 static inline void name(volatile type __iomem *addr, type val) \
 {  \
__asm__ __volatile__("sync;" insn   \
-   : "=m" (*addr) : "r" (val), "r" (addr));\
-   IO_SET_SYNC_FLAG(); \
+   : "=m" (*addr) : "r" (val), "r" (addr) : "memory"); \
+   IO_SET_SYNC_FLAG(); \
 }
 
 
@@ -333,7 +333,8 @@ static inline unsigned int name(unsigned
"   .long   3b,5b\n"\
".previous" \
: "=&r" (x) \
-   : "r" (port + _IO_BASE));   \
+   : "r" (port + _IO_BASE) \
+   : "memory");
return x;   \
 }
 
@@ -350,7 +351,8 @@ static inline void name(unsigned int val
"   .long   0b,2b\n"\
"   .long   1b,2b\n"\
".previous" \
-   : : "r" (val), "r" (port + _IO_BASE));  \
+   : : "r" (val), "r" (port + _IO_BASE)\
+   : "memory");
 }
 
 __do_in_asm(_rec_inb, "lbzx")
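
As a rough user-space illustration of what the "memory" clobber buys, the
sketch below wraps a volatile store in empty asm statements that carry the
clobber. It assumes GCC-style inline asm; demo_writel() and demo_kick() are
hypothetical names, not the accessors touched by the patch, and the barrier
here constrains only the compiler, not the hardware.

```c
#include <stdint.h>

/*
 * Compiler-only barrier: emits no instruction, but the "memory" clobber
 * tells gcc that any memory may be read or written at this point, so it
 * cannot move ordinary loads and stores across it.
 */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for an MMIO store accessor. */
static inline void demo_writel(volatile uint32_t *reg, uint32_t val)
{
    compiler_barrier();
    *reg = val;
    compiler_barrier();
}

/* Update ordinary memory, then poke the (pretend) device register. */
static void demo_kick(uint32_t *buf, volatile uint32_t *reg)
{
    buf[0] = 0x1000;        /* plain stores to consistent memory */
    buf[1] = 64;
    demo_writel(reg, 1);    /* gcc may not sink the stores above past this */
}
```

Without the clobber, gcc is free to treat the plain stores and the asm as
unrelated and emit them in either order.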


[PATCH V2] [POWERPC] Improve (in|out)_[bl]eXX() asm code

2008-05-27 Thread Trent Piepho
Since commit 4cb3cee03d558fd457cb58f56c80a2a09a66110c the code generated
for the in_beXX() and out_beXX() mmio functions has been sub-optimal.

The out_leXX() family of functions are created with the macro
DEF_MMIO_OUT_LE() while the out_beXX() family are created with
DEF_MMIO_OUT_BE().  In what was perhaps a bit too much macro use, both of
these macros are in turn created via the macro DEF_MMIO_OUT().

For the LE versions, eventually they boil down to an asm that will look
something like this:
asm("sync; stwbrx %1,0,%2" : "=m" (*addr) : "r" (val), "r" (addr));

The issue is that the "stwbrx" instruction only comes in an indexed, or
'x', version, in which the address is represented by the sum of two
registers (the "0,%2").  Unfortunately, gcc doesn't have a constraint for
an indexed memory reference.  The "m" constraint allows both indexed and
offset, i.e. register plus constant, memory references and there is no
"stwbr" version for offset references.  "m" also allows updating addresses
and there is no 'u' version of "stwbrx" like there is with "stwux".

The unused first operand to the asm is just to tell gcc that *addr is an
output of the asm.  The address used is passed in a single register via the
third asm operand, and the index register is just hard coded as 0.  This
means gcc is forced to put the address in a single register and can't use
index addressing, e.g. if one has the data in register 9, a base address in
register 3 and an index in register 4, gcc must emit code like "add 11,4,3;
stwbrx 9,0,11" instead of just "stwbrx 9,4,3".  This costs an extra add
instruction and another register.

For gcc 4.0 and older, there doesn't appear to be anything that can be
done.  But for 4.1 and newer, there is a 'Z' constraint.  It does not allow
"updating" addresses, but does allow both indexed and offset addresses.
However, the only allowed constant offset is 0.  We can then use the
undocumented 'y' operand modifier, which causes gcc to convert "0(reg)"
into the equivalent "0,reg" format that can be used with stwbrx.

This brings us to the problem with the BE version.  In this case, the "stw"
instruction does have both indexed and non-indexed versions.  The final asm
ends up looking like this:
asm("sync; stw%U0%X0 %1,%0" : "=m" (*addr) : "r" (val), "r" (addr));

The undocumented codes "%U0" and "%X0" will generate a 'u' if the memory
reference should be an auto-updating one, and an 'x' if the memory
reference is indexed, respectively.  The third operand is unused; it's just
there because the asm code is reused from the LE version.  However, gcc
does not know this, and generates unnecessary code to stick addr in a
register!  To use the example from the LE version, gcc will generate "add
11,4,3; stwx 9,4,3".  It is able to use the indexed address "4,3" for the
"stwx", but still thinks it needs to put 4+3 into register 11, which will
never be used.

This also ends up happening a lot for the offset addressing mode, where
common code like this:  out_be32(&device_registers->some_register, data);
uses an instruction like "stw 9, 42(3)", where register 3 has the pointer
device_registers and 42 is the offset of some_register in that structure.
gcc will be forced to generate the unnecessary instruction "addi 11, 3, 42"
to put the address into a single (unused) register.

The in_* versions end up having these exact same problems as well.
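
A minimal sketch of the 'Z' constraint and 'y' modifier described above,
written as a standalone function rather than the DEF_MMIO_* macros.
demo_store_swapped32() is a hypothetical name, the "sync" and
IO_SET_SYNC_FLAG() parts are left out, and the non-PowerPC branch is only a
plain C fallback so the example builds elsewhere.

```c
#include <stdint.h>

/* Store val to *addr with its bytes reversed relative to native order. */
static inline void demo_store_swapped32(volatile uint32_t *addr, uint32_t val)
{
#if defined(__powerpc__) && defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1))
    /*
     * "=Z" lets gcc choose an indexed or single-register address for
     * *addr; %y0 prints it in the "ra,rb" form that stwbrx requires.
     * No separate address operand is needed, so no extra add is forced.
     */
    __asm__ __volatile__("stwbrx %1,%y0" : "=Z" (*addr) : "r" (val));
#else
    /* Fallback (an assumption for non-PowerPC builds): swap by hand. */
    *addr = (val >> 24) | ((val >> 8) & 0x0000ff00u) |
            ((val << 8) & 0x00ff0000u) | (val << 24);
#endif
}
```

Because the memory operand is the only description of the address, gcc is
free to emit "stwbrx 9,4,3" directly when it already has a base and index
in registers.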

Signed-off-by: Trent Piepho <[EMAIL PROTECTED]>
CC: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
CC: Andreas Schwab <[EMAIL PROTECTED]>
---

This version uses Andreas's suggestions about how to use the "Z"
constraint and "y" modifier to improve the LE version as well.

 include/asm-powerpc/io.h |   63 -
 1 files changed, 45 insertions(+), 18 deletions(-)

diff --git a/include/asm-powerpc/io.h b/include/asm-powerpc/io.h
index e0062d7..a2547ac 100644
--- a/include/asm-powerpc/io.h
+++ b/include/asm-powerpc/io.h
@@ -95,33 +95,60 @@ extern resource_size_t isa_mem_base;
 #define IO_SET_SYNC_FLAG()
 #endif
 
-#define DEF_MMIO_IN(name, type, insn)  \
-static inline type name(const volatile type __iomem *addr) \
+/* gcc 4.0 and older doesn't have 'Z' constraint */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ == 0)
+#define DEF_MMIO_IN_LE(name, size, insn)   \
+static inline u##size name(const volatile u##size __iomem *addr)   \
 {  \
-   type ret;   \
-   __asm__ __volatile__("sync;" insn ";twi 0,%0,0;isync"   \
-   : "=r" (ret) : "r" (addr), "m" (*addr));\
+   u##size ret;\
+   __asm__ __volatile__("sync;"#insn" %0,0,%1;twi 0,%0,0;isync"\
+   : "=r" (ret) : "r" (addr), "m" (*addr));\
return ret;

Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Paul Mackerras
Chris Friesen writes:

> Roland Dreier wrote:
> 
> > Writes are posted yes, but not reordered arbitrarily.  If I have code like:
> > 
> > spin_lock(&mmio_lock);
> > writel(val1, reg1);
> > writel(val2, reg2);
> > spin_unlock(&mmio_lock);
> > 
> > then I have a reasonable expectation that if two CPUs run this at the
> > same time, their writes to reg1/reg2 won't be interleaved with each
> > other (because the whole section is inside a spinlock).  And Altix
> > violates that expectation.
> 
> Does that necessarily follow?
> 
> If you've got a large system with multiple pci bridges, could you end up 
> with posted writes coming from different cpus taking a different amount 
> of time to propagate to a device and thus colliding?

On powerpc we explicitly make sure that can't happen.  That's the "do
a sync in spin_unlock if there were any writels since the last
spin_lock" magic.  The sync instruction makes sure the writes have got
to the host bridge before the spinlock is unlocked.

Paul.
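
The mechanism described above can be modelled roughly as follows. This is a
single-CPU user-space sketch with hypothetical names (the kernel keeps the
flag per CPU and issues a real "sync"); here a counter stands in for the
sync instruction so the behaviour can be observed.

```c
#include <stdbool.h>
#include <stdint.h>

static bool io_sync_pending;      /* models the per-cpu flag */
static unsigned int sync_count;   /* counts modelled "sync" instructions */

static void demo_spin_lock(void)
{
    /* A real lock would be acquired here. */
    io_sync_pending = false;
}

static void demo_writel(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;
    io_sync_pending = true;       /* remember that MMIO was issued */
}

static void demo_spin_unlock(void)
{
    if (io_sync_pending) {
        sync_count++;             /* stands in for the powerpc "sync" */
        io_sync_pending = false;
    }
    /* A real lock would be released here. */
}
```

A critical section with no writel pays nothing extra; one with any number
of writels pays for a single sync at unlock time.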


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
 > > Cool... I assume you do this for mutex_unlock() etc?

 > That's a good point... I don't think we do. Maybe we should.

I think it's needed -- take a look at 76d7cc03, which came from a real
bug seen on Altix boxes.


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

On Tue, 2008-05-27 at 14:55 -0700, Linus Torvalds wrote:
> 
> On Wed, 28 May 2008, Benjamin Herrenschmidt wrote:
> > 
> > A problem with __raw_ though is that they -also- don't do byteswap,
> 
> Well, that's why there is __readl() and __raw_readl(), no?

As I replied to somebody else, __readl() is news to me :-) we don't have
those on powerpc.

> Neither does ordering, and __raw_readl() doesn't do byte-swap.

But I can add them :-)

> Of course, I'm not going to guarantee every architecture even has all 
> those versions, nor am I going to guarantee they all work as advertised :)
> 
> For x86, they have historically all been 100% identical. With the inline 
> asm patch I posted, the "__" version (whether "raw" or not) lack the 
> "memory" barrier, so they allow a *little* bit more re-ordering.
> 
> (They won't be re-ordered wrt spinlocks etc, unless gcc starts reordering 
> volatile asm's against each other, which would be a bug).
> 
> In practice, I doubt it matters. Whatever small compiler re-ordering it 
> might affect won't have any real performance impact one way or the other, 
> I think.

I prefer explicit endian. Always. Thus I prefer introducing _be variants
(we already have those on powerpc and iomap has its own _be versions
too) so we should probably generalize _be.

Ben.




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

On Tue, 2008-05-27 at 15:42 -0600, Matthew Wilcox wrote:
> On Wed, May 28, 2008 at 07:38:55AM +1000, Benjamin Herrenschmidt wrote:
> > A problem with __raw_ though is that they -also- don't do byteswap,
> > which is a pain in the neck as people use them for either one reason
> > (relaxed ordering) or the other (no byteswap) without always knowing the
> > consequences of doing so...
> 
> That's why there's __readl() which does byteswap, but doesn't do
> ordering ...

Ah, that one is news to me. I don't think we ever had it on powerpc :-)

> > I'm happy to say that __raw is purely about ordering and make them
> > byteswap on powerpc tho (ie, make them little endian like the non-raw
> > counterpart).
> 
> That would break a lot of drivers.

How many actually use __raw_ * ?

> > Some archs started providing writel_be etc... I added those to powerpc a
> > little while ago, and I tend to prefer that approach for the byteswap
> > issue.

> Those are for people who use big endian chips on little endian
> architectures.

Why limit them to LE architecture ? There is nothing fundamentally
specific to LE architectures here, and it's wrong to provide accessors
on some archs and not others. The endianness is a property of the device
registers. Current writel/readl are basically writel_le/readl_le. It
thus makes sense to have the opposite, ie, readl_be/writel_be, which
thus byteswaps on LE platforms and not on BE platforms, which is what I
provided on powerpc a while ago.

Ben.
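
To make the "endianness is a property of the device registers" point
concrete, here is a small sketch with hypothetical helpers. It assembles
the value byte by byte so the result is the same on LE and BE hosts; real
accessors would do a single 32-bit access and swap only when the host order
differs from the register order.

```c
#include <stdint.h>

/* Read a 32-bit register whose layout is little endian. */
static inline uint32_t demo_read_le32(const volatile uint8_t *reg)
{
    return (uint32_t)reg[0] | ((uint32_t)reg[1] << 8) |
           ((uint32_t)reg[2] << 16) | ((uint32_t)reg[3] << 24);
}

/* Read a 32-bit register whose layout is big endian. */
static inline uint32_t demo_read_be32(const volatile uint8_t *reg)
{
    return ((uint32_t)reg[0] << 24) | ((uint32_t)reg[1] << 16) |
           ((uint32_t)reg[2] << 8) | (uint32_t)reg[3];
}
```

The driver picks the helper that matches the device, and the same source
then works unchanged on either kind of host.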




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

On Tue, 2008-05-27 at 14:33 -0700, Roland Dreier wrote:
> > This is a different issue. We deal with it on powerpc by having writel
>  > set a per-cpu flag and spin_unlock() test it, and do the barrier if
>  > needed there.
> 
> Cool... I assume you do this for mutex_unlock() etc?

That's a good point... I don't think we do. Maybe we should.

> Is there any reason why ia64 can't do this too so we can kill mmiowb and
> save everyone a lot of hassle?  (mips, sh and frv have non-empty
> mmiowb() definitions too but I'd guess that these are all bugs based on
> misunderstandings of the mmiowb() semantics...)

Well, basically our approach was that mmiowb() is a pain in the neck,
nobody (ie. driver writers) really understands what it's for, and so
it's either not there or misused. So we didn't want to introduce it for
powerpc, but instead did the trick above in order to -slightly- improve
our writel (ie avoid a sync -after- the write) .

>  > However, drivers such as e1000 -also- have a wmb() between filling the
>  > ring buffer and kicking the DMA with MMIO, with a comment about this
>  > being needed for ia64 relaxed ordering.
> 
> I put these barriers into mthca, mlx4 etc, although it came from my
> possible misunderstanding of the memory ordering rules in the kernel
> more than any experience of problems (as opposed to the mmiowb()s,
> which all came from real world bugs).

Ben.




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Matthew Wilcox
On Wed, May 28, 2008 at 07:38:55AM +1000, Benjamin Herrenschmidt wrote:
> A problem with __raw_ though is that they -also- don't do byteswap,
> which is a pain in the neck as people use them for either one reason
> (relaxed ordering) or the other (no byteswap) without always knowing the
> consequences of doing so...

That's why there's __readl() which does byteswap, but doesn't do
ordering ...

> I'm happy to say that __raw is purely about ordering and make them
> byteswap on powerpc tho (ie, make them little endian like the non-raw
> counterpart).

That would break a lot of drivers.

> Some archs started providing writel_be etc... I added those to powerpc a
> little while ago, and I tend to prefer that approach for the byteswap
> issue.

Those are for people who use big endian chips on little endian
architectures.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds


On Tue, 27 May 2008, Matthew Wilcox wrote:

> On Tue, May 27, 2008 at 10:38:22PM +0100, Alan Cox wrote:
> > > re-ordering, even though I doubt it will be visible in practice. So if you 
> > > use the "__" versions, you'd better have barriers even on x86!
> > 
> > Are we also going to have __ioread*/__iowrite* ?
> 
> Didn't we already define ioread*() to have loose semantics?

They are supposed to have the same semantics as readl/writel.

And yes, it's "loose", but only when compared to inb/outb (which are 
really very strict, if you want to emulate x86 - an "outb" basically is 
not only ordered, it doesn't even post the write, and waits until it has 
hit the bus!)

Linus


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Alan Cox
On Tue, 27 May 2008 15:53:28 -0600
Matthew Wilcox <[EMAIL PROTECTED]> wrote:

> On Tue, May 27, 2008 at 10:38:22PM +0100, Alan Cox wrote:
> > > re-ordering, even though I doubt it will be visible in practice. So if you 
> > > use the "__" versions, you'd better have barriers even on x86!
> > 
> > Are we also going to have __ioread*/__iowrite* ?
> 
> Didn't we already define ioread*() to have loose semantics?

The ATA layer doesn't think so.

Alan


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds


On Tue, 27 May 2008, Alan Cox wrote:
>
> > re-ordering, even though I doubt it will be visible in practice. So if you 
> > use the "__" versions, you'd better have barriers even on x86!
> 
> Are we also going to have __ioread*/__iowrite* ?

I doubt there is any reason to. Let's just keep them very strictly 
ordered. 

> Also are the semantics of __readl/__writel defined for all architectures -
> I used it ages ago in the i2o drivers for speed and it got removed
> because it didn't build on some platforms.

Agreed - I'm not sure the __ versions are really worth it. We have them, 
but the semantics are subtle enough that most drivers will never care 
enough to really use them.

I would suggest using them mainly for architecture-specific drivers, on 
architectures where it actually matters (which is not the case on x86).

Linus


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds


On Wed, 28 May 2008, Benjamin Herrenschmidt wrote:
> 
> A problem with __raw_ though is that they -also- don't do byteswap,

Well, that's why there is __readl() and __raw_readl(), no?

Neither does ordering, and __raw_readl() doesn't do byte-swap.

Of course, I'm not going to guarantee every architecture even has all 
those versions, nor am I going to guarantee they all work as advertised :)

For x86, they have historically all been 100% identical. With the inline 
asm patch I posted, the "__" version (whether "raw" or not) lack the 
"memory" barrier, so they allow a *little* bit more re-ordering.

(They won't be re-ordered wrt spinlocks etc, unless gcc starts reordering 
volatile asm's against each other, which would be a bug).

In practice, I doubt it matters. Whatever small compiler re-ordering it 
might affect won't have any real performance impact one way or the other, 
I think.

Linus


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Matthew Wilcox
On Tue, May 27, 2008 at 10:38:22PM +0100, Alan Cox wrote:
> > re-ordering, even though I doubt it will be visible in practice. So if you 
> > use the "__" versions, you'd better have barriers even on x86!
> 
> Are we also going to have __ioread*/__iowrite* ?

Didn't we already define ioread*() to have loose semantics?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Alan Cox
> re-ordering, even though I doubt it will be visible in practice. So if you 
> use the "__" versions, you'd better have barriers even on x86!

Are we also going to have __ioread*/__iowrite* ?

Also are the semantics of __readl/__writel defined for all architectures -
I used it ages ago in the i2o drivers for speed and it got removed
because it didn't build on some platforms.

Alan



Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

> So practically speaking, I suspect that the right approach is to just say 
> "ok, x86 will continue to be pretty darn ordered, and the barriers won't 
> really matter (*)" but at the same time also saying "we wish reality was 
> different, and well-maintained drivers should probably try to work in the 
> presence of re-ordering".

Ok, well, I'll slap "memory" clobbers onto powerpc accessors, we made
them fully ordered a while ago anyway. The extra barriers in drivers
like USB etc.. won't hurt us much, we can always fine tune drivers that
really want high performances.

A problem with __raw_ though is that they -also- don't do byteswap,
which is a pain in the neck as people use them for either one reason
(relaxed ordering) or the other (no byteswap) without always knowing the
consequences of doing so...

I'm happy to say that __raw is purely about ordering and make them
byteswap on powerpc tho (ie, make them little endian like the non-raw
counterpart).

Some archs started providing writel_be etc... I added those to powerpc a
little while ago, and I tend to prefer that approach for the byteswap
issue.

What do you think ?

Ben.




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
 > This is a different issue. We deal with it on powerpc by having writel
 > set a per-cpu flag and spin_unlock() test it, and do the barrier if
 > needed there.

Cool... I assume you do this for mutex_unlock() etc?

Is there any reason why ia64 can't do this too so we can kill mmiowb and
save everyone a lot of hassle?  (mips, sh and frv have non-empty
mmiowb() definitions too but I'd guess that these are all bugs based on
misunderstandings of the mmiowb() semantics...)

 > However, drivers such as e1000 -also- have a wmb() between filling the
 > ring buffer and kicking the DMA with MMIO, with a comment about this
 > being needed for ia64 relaxed ordering.

I put these barriers into mthca, mlx4 etc, although it came from my
possible misunderstanding of the memory ordering rules in the kernel
more than any experience of problems (as opposed to the mmiowb()s,
which all came from real world bugs).

 - R.


Re: [RFC] 4xx hardware watchpoint support

2008-05-27 Thread Roland McGrath
> Kumar was just mentioning this post a few messages ago:
> http://ozlabs.org/pipermail/linuxppc-dev/2008-May/055745.html
> 
> That is a very interesting approach to handle all the differences
> between each processor's architecture, and a much cleaner way to set the
> facilities we want than the current interface we have. Do you know what
> is the status of this work? Did it move any further?

[EMAIL PROTECTED] was going to look into this.  I don't think there
has been much progress yet, but I hope we can get it started up again.


Thanks,
Roland


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds


On Wed, 28 May 2008, Benjamin Herrenschmidt wrote:
> 
> On Tue, 2008-05-27 at 08:35 -0700, Linus Torvalds wrote:
> > 
> > Expecting people to fix up all drivers is simply not going to happen. And 
> > serializing things shouldn't be *that* expensive. People who cannot take 
> > the expense can continue to use the magic __raw_writel() etc stuff.
> 
> Ok.
> 
> Do we also remove wmb/rmb/... from drivers then ? :-) I think ia64 would
> need to be fixed to make their writel serializing...

Well..

There's really two different issues:

 (a) x86 and the fact that we have thousands of drivers

which in turn conflicts with

 (b) non-x86 and the fact that other architectures tend to be absolute 
 pieces of cr*p when it comes to ordering, _especially_ across IO.

and the thing about (b) is that the number of drivers involved is a hell 
of a lot smaller. For example, ia64 and the big SGI machines probably 
really only care about roughly five drivers (number taken out of my nether 
regions). 

So practically speaking, I suspect that the right approach is to just say 
"ok, x86 will continue to be pretty darn ordered, and the barriers won't 
really matter (*)" but at the same time also saying "we wish reality was 
different, and well-maintained drivers should probably try to work in the 
presence of re-ordering".

In *practice*, that probably means that most architectures will be better 
off if they emulate x86 closely, just because that means that they won't 
rely on drivers always getting things right, but I think we can leave the 
door open for the odd machines. We should just realize that they will 
never get a lot of testing, but on the other hand, their usage scenarios 
will generally also be very limited (very specific loads, and _very_ 
specific hardware).

And the patch I sent out actually made "__[raw_]readl()" different from 
"readl()" on x86 too, in that the assembly _allows_ a bit more 
re-ordering, even though I doubt it will be visible in practice. So if you 
use the "__" versions, you'd better have barriers even on x86!

Linus

(*) With the possible but unlikely exception of some big machines with 
separate IO networks, but if they happen they will fall into the 'ia64' 
case of just having a few relevant drivers.


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
 > > Writes are posted yes, but not reordered arbitrarily.  If I have code like:
 > >
 > >spin_lock(&mmio_lock);
 > >writel(val1, reg1);
 > >writel(val2, reg2);
 > >spin_unlock(&mmio_lock);
 > >
 > > then I have a reasonable expectation that if two CPUs run this at the
 > > same time, their writes to reg1/reg2 won't be interleaved with each
 > > other (because the whole section is inside a spinlock).  And Altix
 > > violates that expectation.
 > 
 > Does that necessarily follow?
 > 
 > If you've got a large system with multiple pci bridges, could you end
 > up with posted writes coming from different cpus taking a different
 > amount of time to propagate to a device and thus colliding?

Not on x86.  And a given PCI device can only be reached from a single
host bridge, so I don't see how it can happen.  But on SGI altix
systems, there is a routed fabric between the CPU and the PCI bus, so
the reordering can happen there.  Hence mmiowb() and the endless supply
of driver bugs that it causes.

 - R.
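
For readers unfamiliar with the pattern, the sketch below shows where
mmiowb() goes in the locked sequence quoted above. The stub_* functions are
hypothetical user-space stand-ins that only record the order of events;
they do not implement real locking or real barriers.

```c
#include <stddef.h>
#include <stdint.h>

enum stub_event { EV_LOCK, EV_WRITE, EV_MMIOWB, EV_UNLOCK };

static enum stub_event stub_log[8];
static size_t stub_log_len;

static void stub_record(enum stub_event ev)
{
    if (stub_log_len < 8)
        stub_log[stub_log_len++] = ev;
}

static void stub_spin_lock(void)   { stub_record(EV_LOCK); }
static void stub_mmiowb(void)      { stub_record(EV_MMIOWB); }
static void stub_spin_unlock(void) { stub_record(EV_UNLOCK); }

static void stub_writel(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
    stub_record(EV_WRITE);
}

/* The locked register sequence, with the barrier before the unlock. */
static void program_device(volatile uint32_t *reg1, volatile uint32_t *reg2,
                           uint32_t val1, uint32_t val2)
{
    stub_spin_lock();
    stub_writel(val1, reg1);
    stub_writel(val2, reg2);
    stub_mmiowb();        /* order the MMIO writes before the lock is dropped */
    stub_spin_unlock();
}
```

The driver bugs mentioned above are typically the case where the mmiowb()
line is simply missing, which nothing on x86 will ever catch.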


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Chris Friesen

Roland Dreier wrote:


> Writes are posted yes, but not reordered arbitrarily.  If I have code like:
> 
> spin_lock(&mmio_lock);
> writel(val1, reg1);
> writel(val2, reg2);
> spin_unlock(&mmio_lock);
> 
> then I have a reasonable expectation that if two CPUs run this at the
> same time, their writes to reg1/reg2 won't be interleaved with each
> other (because the whole section is inside a spinlock).  And Altix
> violates that expectation.


Does that necessarily follow?

If you've got a large system with multiple pci bridges, could you end up 
with posted writes coming from different cpus taking a different amount 
of time to propagate to a device and thus colliding?


Chris


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

On Tue, 2008-05-27 at 09:47 -0700, Linus Torvalds wrote:
> 
> __read[bwlq]()/__write[bwlq]() are not serialized with a :"memory" 
> barrier, although since they still use "asm volatile" I suspect that in 
> practice they are probably serial too. Did not look very closely at any 
> generated code (only did a trivial test to see that the code looks 
> *roughly* correct).

Nah, asm volatile doesn't help, it does need the "memory" clobber too.

Cheers,
Ben.




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

On Tue, 2008-05-27 at 08:50 -0700, Roland Dreier wrote:
> > Though it's my understanding that at least ia64 does require the
>  > explicit barriers anyway, so we are still in a dodgy situation here
>  > where it's not clear what drivers should do and we end up with
>  > possibly excessive barriers on powerpc where I end up with both
>  > the wmb/rmb/mb that were added for ia64 -and- the ones I have in
>  > readl/writel to make them look synchronous... Not nice.
> 
> ia64 is a disaster with a slightly different ordering problem -- the
> mmiowb() issue.  I know Ben knows far too much about this, but for big
> SGI boxes, you sometimes need mmiowb() to avoid problems with driver
> code that does totally sane stuff like

This is a different issue. We deal with it on powerpc by having writel
set a per-cpu flag and spin_unlock() test it, and do the barrier if
needed there.

However, drivers such as e1000 -also- have a wmb() between filling the
ring buffer and kicking the DMA with MMIO, with a comment about this
being needed for ia64 relaxed ordering.

Ben.
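
The driver-side barrier mentioned above (as opposed to one hidden inside
the accessor) looks roughly like this. It is a user-space sketch with
hypothetical names: a C11 release fence stands in for wmb(), and a plain
volatile store stands in for the MMIO doorbell write.

```c
#include <stdatomic.h>
#include <stdint.h>

struct demo_desc {
    uint64_t addr;
    uint32_t len;
};

struct demo_ring {
    struct demo_desc desc[4];
    uint32_t tail;
};

/* Queue one buffer, then tell the (pretend) device about the new tail. */
static void demo_queue(struct demo_ring *ring, volatile uint32_t *doorbell,
                       uint64_t addr, uint32_t len)
{
    struct demo_desc *d = &ring->desc[ring->tail % 4];

    d->addr = addr;                 /* fill the ring entry in memory */
    d->len = len;
    ring->tail++;

    /* Stand-in for wmb(): descriptor stores complete before the kick. */
    atomic_thread_fence(memory_order_release);

    *doorbell = ring->tail;         /* kick the DMA */
}
```

If the MMIO accessor itself is fully ordered, as on powerpc, this fence is
redundant there, which is the double cost being discussed.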




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Benjamin Herrenschmidt

On Tue, 2008-05-27 at 08:35 -0700, Linus Torvalds wrote:
> 
> On Tue, 27 May 2008, Benjamin Herrenschmidt wrote:
> > 
> > Yes. As it is today, tg3 for example is potentially broken on all archs
> > with newer gcc unless we either add "memory" clobber to readl/writel or
> > stick some wmb's in there (just a random driver I picked).
> > 
> > So Linus, what is your take on that matter ?
> 
> Let's just serialize the damn things, and add a memory clobber to them.
> 
> Expecting people to fix up all drivers is simply not going to happen. And 
> serializing things shouldn't be *that* expensive. People who cannot take 
> the expense can continue to use the magic __raw_writel() etc stuff.

Ok.

Do we also remove wmb/rmb/... from drivers then ? :-) I think ia64 would
need to be fixed to make their writel serializing...

Regarding __raw_* their semantics are dodgy ... we might want to provide
something better but it's a different subject.

Cheers,
Ben.




Re: JFFS2 FS not writeable

2008-05-27 Thread Scott Wood

Ron Madrid wrote:
> In which function should I start searching for this
> problem?  What would cause a block to be marked bad?

No particular function; just something that causes things to not work. :-P

You may want to try increasing the OR[SCY] field or other timing
parameters associated with the NAND flash.

-Scott


Re: JFFS2 FS not writeable

2008-05-27 Thread Ron Madrid
In which function should I start searching for this
problem?  What would cause a block to be marked bad?

Ron
--- Scott Wood <[EMAIL PROTECTED]> wrote:

> Ron Madrid wrote:
> > I have tried the latest code and now, I'm able to
> > write a limited amount.  After which when I reboot,
> > more than 100 contiguous blocks have been marked bad,
> > through the end of the device (according to the dts).
> > Is this something that has been seen before?
>
> Yes, but not with the latest code.
>
> > My NAND device is a large page.
>
> I've only had small page devices to test on, so it's quite likely that
> there's a large page bug in the driver somewhere.
>
> If your erase block size is not 128K, try the latest set_addr patch in
> the mtd-2.6.22.1 branch of u-boot-nand-flash (I still need to send it
> out for Linux).
>
> -Scott
> 



Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Scott Wood

Trent Piepho wrote:
> Is there an issue with anything _besides_ coherent DMA?
>
> Could one have a special version of the accessors for drivers that
> want to assume they are strictly ordered vs coherent DMA memory?
> That would be much easier to get right, without slowing _everything_
> down.

It's better to be safe by default and then optimize the fast paths than
to be relaxed by default and hang the machine in some piece of code that
runs once a month.  "Premature optimization is the root of all evil",
and what not.

> One could even go as far as to allow a driver to "#define
> WANT_STRICT_IO" and then it would get the strict versions.  Add that
> to any driver that uses DMA and then worry about vetting those
> drivers.

See above -- if you must have a #define, then it should be WANT_RELAXED_IO.

-Scott



Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Trent Piepho

On Tue, 27 May 2008, Linus Torvalds wrote:
> On Tue, 27 May 2008, Benjamin Herrenschmidt wrote:
> > Yes. As it is today, tg3 for example is potentially broken on all archs
> > with newer gcc unless we either add "memory" clobber to readl/writel or
> > stick some wmb's in there (just a random driver I picked).
> >
> > So Linus, what is your take on that matter ?
>
> Let's just serialize the damn things, and add a memory clobber to them.
>
> Expecting people to fix up all drivers is simply not going to happen. And
> serializing things shouldn't be *that* expensive. People who cannot take
> the expense can continue to use the magic __raw_writel() etc stuff.


Is there an issue with anything _besides_ coherent DMA?

Could one have a special version of the accessors for drivers that want to
assume they are strictly ordered vs coherent DMA memory?  That would be much
easier to get right, without slowing _everything_ down.  The problem with the
raw versions is that on some arches they are much more raw than just not being
ordered w.r.t normal memory.

__raw_writel()
No ordering beyond what the arch provides natively.  The only thing you can
assume is that reads and writes to the same location on the same CPU are
ordered.

writel()
Strictly ordered with respect to any other IO, but not to normal memory
(this is what Documentation/memory-barriers.txt claims).  Should probably be
strictly ordered vs locks as well.  Would be strictly ordered w.r.t. the
streaming DMA sync calls of course.

strict_writel()
Strictly ordered with respect to normal (incl. coherent DMA) memory as well.

One could even go as far as to allow a driver to "#define WANT_STRICT_IO" and
then it would get the strict versions.  Add that to any driver that uses DMA
and then worry about vetting those drivers.
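
A minimal user-space sketch of the opt-in idea, assuming GCC or Clang: the acc_* names are made up for illustration, a release fence stands in for wmb(), and the middle "ordered against other IO only" level is not modelled because user space has no equivalent.

```c
#include <stdatomic.h>
#include <stdint.h>

/* GCC/Clang compiler barrier: the "memory" clobber forbids the compiler from
 * moving or caching memory accesses across it. It emits no instruction. */
#define acc_compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Raw flavor: a bare volatile store, nothing more. */
static inline void acc_raw_writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Strict flavor: earlier stores to normal memory are meant to be ordered
 * before the register store. The fence is a user-space approximation of
 * wmb(), not the kernel primitive. */
static inline void acc_strict_writel(uint32_t val, volatile uint32_t *addr)
{
	atomic_thread_fence(memory_order_release);
	*addr = val;
	acc_compiler_barrier();
}

/* Per-driver opt-in: define WANT_STRICT_IO before this point to get the
 * strict flavor from the common name. */
#ifdef WANT_STRICT_IO
#define acc_writel(val, addr) acc_strict_writel((val), (addr))
#else
#define acc_writel(val, addr) acc_raw_writel((val), (addr))
#endif
```

A driver that has not been vetted would define WANT_STRICT_IO before the accessors are declared and keep calling acc_writel() unchanged.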

It's also worth noting that adding barriers to IO accessors doesn't mean you
never have to worry about barriers with coherent DMA.  Inherent with coherent
DMA, and driving hardware in general, there are various sequence points where
one must be sure one operation has finished before starting the next.  Making
sure you're done with DMA memory before telling the hardware to do something
with it is a common example.  Often the "telling the hardware to do something"
is done via an IO accessor function, but not always.

You can have buffer descriptors in DMA memory that describe the DMA buffers to
the hardware.  It would be critical to be finished reading a DMA buffer before
updating the descriptor to indicate that it's empty.  You could trigger a DMA
operation not by a call to an IO accessor, but with an atomic operation to a
variable shared with an ISR.  When the ISR runs it sees the change and does
the necessary IO.
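
The descriptor case can be sketched in user-space C as follows. The ring_* names and the descriptor layout are hypothetical, and the C11 release fence only approximates the barrier a real driver would place between reading the buffer and handing it back.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_DESC_FULL  1u	/* buffer holds data for the driver */
#define RING_DESC_EMPTY 0u	/* device may refill the buffer */

struct ring_desc {
	volatile uint32_t status;	/* ownership word the device would poll */
	const uint8_t *buf;		/* DMA buffer this descriptor describes */
};

/* Copy the data out, then hand the buffer back to the device.
 * Returns the number of bytes copied, or 0 if the descriptor was not full. */
static inline size_t ring_consume(struct ring_desc *d, uint8_t *out, size_t len)
{
	if (d->status != RING_DESC_FULL)
		return 0;

	memcpy(out, d->buf, len);

	/* All reads of the buffer must be finished before the status store
	 * becomes visible; no IO accessor is involved at this point. */
	atomic_thread_fence(memory_order_release);
	d->status = RING_DESC_EMPTY;
	return len;
}
```

Without the fence, nothing stops the status store from becoming visible while the copy is still reading the buffer, regardless of how the IO accessors are defined.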


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
 > Writes are posted yes, but not reordered arbitrarily.

on standard x86 I mean here...


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
 > Um, OK, you've said write twice now ... I was assuming you meant read.
 > Even on an x86, writes are posted, so there's no way a spin lock could
 > serialise a write without an intervening read to flush the posting
 > (that's why only reads have a relaxed version on altix).  Or is there
 > something else I'm missing?

Writes are posted yes, but not reordered arbitrarily.  If I have code like:

  spin_lock(&mmio_lock);
  writel(val1, reg1);
  writel(val2, reg2);
  spin_unlock(&mmio_lock);

then I have a reasonable expectation that if two CPUs run this at the
same time, their writes to reg1/reg2 won't be interleaved with each
other (because the whole section is inside a spinlock).  And Altix
violates that expectation.
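
For reference, the fix on such platforms is an mmiowb() before the unlock (as in the 76d7cc03 commit cited in this thread). The sketch below only illustrates the call order: the seq_* functions are hypothetical user-space stubs that record a trace, not the kernel primitives.

```c
#include <stdint.h>

enum seq_event { SEQ_LOCK, SEQ_WRITE, SEQ_MMIOWB, SEQ_UNLOCK };

#define SEQ_TRACE_MAX 8
static enum seq_event seq_trace[SEQ_TRACE_MAX];
static int seq_trace_len;

/* Append one event to the trace, ignoring overflow. */
static void seq_log(enum seq_event e)
{
	if (seq_trace_len < SEQ_TRACE_MAX)
		seq_trace[seq_trace_len++] = e;
}

/* Stubs: each only records that it was called. */
static void seq_spin_lock(void)   { seq_log(SEQ_LOCK); }
static void seq_spin_unlock(void) { seq_log(SEQ_UNLOCK); }
static void seq_mmiowb(void)      { seq_log(SEQ_MMIOWB); }

static void seq_writel(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
	seq_log(SEQ_WRITE);
}

/* Both register writes, then the MMIO write barrier, then the unlock, so
 * the writes are pushed out before another CPU can take the lock. */
static void seq_program_device(volatile uint32_t *reg1, volatile uint32_t *reg2,
			       uint32_t val1, uint32_t val2)
{
	seq_spin_lock();
	seq_writel(val1, reg1);
	seq_writel(val2, reg2);
	seq_mmiowb();		/* before the unlock, not after */
	seq_spin_unlock();
}
```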

 - R.


Re: JFFS2 FS not writeable

2008-05-27 Thread Scott Wood

Ron Madrid wrote:
> I have tried the latest code and now, I'm able to
> write a limited amount.  After which when I reboot,
> more than 100 contiguous blocks have been marked bad,
> through the end of the device (according to the dts).
> Is this something that has been seen before?

Yes, but not with the latest code.

> My NAND device is a large page.

I've only had small page devices to test on, so it's quite likely that
there's a large page bug in the driver somewhere.

If your erase block size is not 128K, try the latest set_addr patch in
the mtd-2.6.22.1 branch of u-boot-nand-flash (I still need to send it
out for Linux).


-Scott


Re: JFFS2 FS not writeable

2008-05-27 Thread Ron Madrid
I have tried the latest code and now, I'm able to
write a limited amount.  After which when I reboot,
more than 100 contiguous blocks have been marked bad,
through the end of the device (according to the dts). 
Is this something that has been seen before?  My NAND
device is a large page.

Ron
--- Scott Wood <[EMAIL PROTECTED]> wrote:

> Ron Madrid wrote:
> > I'm trying to make new directories, create files, save
> > changes to files, etc. and when I reboot my changes
> > are not being saved.  I'm sorry if this is vague, but
> > I haven't run into this problem before in previous
> > kernels.  I'm using 2.6.25 and my configuration is
> > very similar to mpc8313erdb_defconfig, with changes to
> > include NAND support as my flash device is a NAND.
> > Also, my dts is very similar to mpc8313erdb.dts as
> > well.  I don't know where to start looking for the
> > problem as there is no error.  Here is what is printed
> > on boot.
>
> Try head-of-tree; there have been several fixes to
> the eLBC FCM NAND driver.
>
> -Scott
> 



Re: [PATCH] Add support for binary includes.

2008-05-27 Thread Jon Loeliger

Kumar Gala wrote:
> On Feb 25, 2008, at 6:39 PM, David Gibson wrote:
> > On Wed, Feb 20, 2008 at 01:19:41PM -0600, Scott Wood wrote:
> > > A property's data can be populated with a file's contents
> > > as follows:
> > >
> > > node {
> > > 	prop = /incbin/("path/to/data");
> > > };
> > >
> > > A subset of a file can be included by passing start and size parameters.
> > > For example, to include bytes 8 through 23:
> > >
> > > node {
> > > 	prop = /incbin/("path/to/data", 8, 16);
> > > };
> > >
> > > As with /include/, non-absolute paths are looked for in the directory
> > > of the source file that includes them.

That issue was resolved, I believe.

> > Well, while I discuss the syntax with Jon, here's some comments on the
> > implementation.
>
> have we made any progress on the syntax?

My last suggestions garnered a "I like that even less." response,
so as far as I know, we're still waiting for a good proposal here.

jdl



Re: MMIO and gcc re-ordering issue

2008-05-27 Thread James Bottomley
On Tue, 2008-05-27 at 10:38 -0700, Roland Dreier wrote:
> > Actually, this specifically should not be.  The need for mmiowb on altix
> > is because it explicitly violates some of the PCI rules that would
> > otherwise impede performance.   The compromise is that readX on altix
> > contains the needed dma flush but there's a variant operator,
> > readX_relaxed that doesn't (for drivers that know what they're doing).
> > The altix critical drivers have all been converted to use the relaxed
> > form for performance, and the unconverted ones should all operate just
> > fine (albeit potentially more slowly).
> 
> Is this a recent change?  Because as of October 2007, 76d7cc03
> ("IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up")
> was needed.  But this was involving writel() (__raw_writel() actually,
> looking at the code), not readl().  But writel_relaxed() doesn't exist
> (and doesn't make sense).

Um, OK, you've said write twice now ... I was assuming you meant read.
Even on an x86, writes are posted, so there's no way a spin lock could
serialise a write without an intervening read to flush the posting
(that's why only reads have a relaxed version on altix).  Or is there
something else I'm missing?

James




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Roland Dreier
 > Actually, this specifically should not be.  The need for mmiowb on altix
 > is because it explicitly violates some of the PCI rules that would
 > otherwise impede performance.   The compromise is that readX on altix
 > contains the needed dma flush but there's a variant operator,
 > readX_relaxed that doesn't (for drivers that know what they're doing).
 > The altix critical drivers have all been converted to use the relaxed
 > form for performance, and the unconverted ones should all operate just
 > fine (albeit potentially more slowly).

Is this a recent change?  Because as of October 2007, 76d7cc03
("IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up")
was needed.  But this was involving writel() (__raw_writel() actually,
looking at the code), not readl().  But writel_relaxed() doesn't exist
(and doesn't make sense).

 - R.


Re: MMIO and gcc re-ordering issue

2008-05-27 Thread James Bottomley
On Tue, 2008-05-27 at 08:50 -0700, Roland Dreier wrote:
> > Though it's my understanding that at least ia64 does require the
> > explicit barriers anyway, so we are still in a dodgy situation here
> > where it's not clear what drivers should do and we end up with
> > possibly excessive barriers on powerpc where I end up with both
> > the wmb/rmb/mb that were added for ia64 -and- the ones I have in
> > readl/writel to make them look synchronous... Not nice.
> 
> ia64 is a disaster with a slightly different ordering problem -- the
> mmiowb() issue.  I know Ben knows far too much about this, but for big
> SGI boxes, you sometimes need mmiowb() to avoid problems with driver
> code that does totally sane stuff like
> 
>   spin_lock(&mmio_lock);
>   writel(val1, reg1);
>   writel(val2, reg2);
>   spin_unlock(&mmio_lock);
> 
> If that snippet is called on two CPUs at the same time, then the device
> might see a sequence like
> 
>   CPU1 -- write reg1
>   CPU2 -- write reg1
>   CPU1 -- write reg2
>   CPU2 -- write reg2
> 
> in spite of the fact that everything is totally ordered on the CPUs by
> the spin lock.
> 
> The reason this is such a disaster is because the code looks right,
> makes sense, and works fine on 99.99% of all systems out there.  So I
> would bet that 99% of our drivers have missing mmiowb() "bugs" -- no one
> has plugged the hardware into an Altix box and cared enough to stress
> test it.
> 
> However for the issue at hand, my expectation as a driver writer is that
> readl()/writel() are ordered with respect to MMIO operations, but not
> necessarily with respect to normal writes to coherent CPU memory.  And
> I've included explicit wmb()s in code I've written like
> drivers/infiniband/hw/mthca.

Actually, this specifically should not be.  The need for mmiowb on altix
is because it explicitly violates some of the PCI rules that would
otherwise impede performance.   The compromise is that readX on altix
contains the needed dma flush but there's a variant operator,
readX_relaxed that doesn't (for drivers that know what they're doing).
The altix critical drivers have all been converted to use the relaxed
form for performance, and the unconverted ones should all operate just
fine (albeit potentially more slowly).

You can see all of this in include/asm-ia64/sn/io.h

It is confusing to me that sn_dma_flush() and sn_mmiowb() have different
implementations, but I think both fix the spinlock problem you allude to
by ensuring the DMA operation is completed before the CPU instruction is
executed.

James




Re: MMIO and gcc re-ordering issue

2008-05-27 Thread Linus Torvalds


On Tue, 27 May 2008, Linus Torvalds wrote:
> 
> Here's a UNTESTED patch for x86 that may or may not compile and work, and 
> which serializes (on a compiler level) the IO accesses against regular 
> memory accesses.

Ok, so it at least boots on x86-32. Thus probably on x86-64 too (since the 
code is now shared). I didn't look at whether it generates much bigger 
code due to the potential extra serialization, but some of the code 
generation I looked at looked fine.

IOW, it doesn't at least create any _obviously_ worse code, and it should 
be arguably safer than assuming the compiler does volatile accesses the 
way we want it to.

Linus


PPC PReP link failure - copy_page, empty_zero_page

2008-05-27 Thread Meelis Roos
While compiling the current kernel on prep ppc:

  MODPOST 703 modules
ERROR: "copy_page" [fs/fuse/fuse.ko] undefined!
ERROR: "empty_zero_page" [fs/ext4/ext4dev.ko] undefined!

-- 
Meelis Roos ([EMAIL PROTECTED])