On Mon, 2003-02-03 at 18:09, Benjamin Herrenschmidt wrote:
> On Mon, 2003-02-03 at 17:05, Michel Dänzer wrote:
> > On Mon, 2003-02-03 at 17:34, Alan Cox wrote:
> > > On Mon, 2003-02-03 at 15:02, Keith Whitwell wrote:
> > > > >  
> > > > > -#define COMMIT_RING() do {                                       \
> > > > > -     RADEON_WRITE( RADEON_CP_RB_WPTR, dev_priv->ring.tail );             \
> > > > > +#define COMMIT_RING() do {                                           \
> > > > > +     /* read from PCI bus to ensure correct posting */               \
> > > > > +     RADEON_READ( RADEON_CP_RB_WPTR );                               \
> > > > > +     RADEON_WRITE( RADEON_CP_RB_WPTR, dev_priv->ring.tail );         \
> > > > > +     RADEON_READ( RADEON_CP_RB_WPTR );                               \
> > > > >  } while (0)
> > > > 
> > > > Ouch.  Put a conditional around that at least, so that not everybody suffers...
> > > 
> > > PCI posting applies to all platforms. However I'm trying to understand what this
> > > is trying to do. The final read has an effect in that it ensures that the WPTR is
> > > written not left posted for an undefined time. What does the previous one 
>achieve.
> > > Is there some kind of synchronization requirement against the GART/main memory ?
> > 
> > That's my understanding, we need to make sure the chip reads from the
> > ring what we wrote to it.
> 
> Well... You are asking for trouble ;)
> 
> The problem is that the behaviour will be pretty much HW implementation
> dependant. 
> 
> In the AGP case, the ring is mapped uncacheable. So your card and the
> ring are typically on the same memory type from the CPU, that helps.
> Though I would still make sure the correct bus path is flushed by doing
> that first read from the ring and not from the card.
> 
> In the PCI case, the ring is mapped cacheable in normal memory and you
> rely on the PCI cache coherency (snooping). That means that you have a
> new problem which is to synchronize writes to cacheable memory (the
> ring) with write to non cacheable MMIO space (the card). At least on
> PPC, I don't think anything but a full sync instruction will acheive
> that, so you'd rather add an mb(). And do the read from memory (actually
> cache), not the card.

You mean like this? Chris, does that work for you?


-- 
Earthling Michel Dänzer (MrCooper)/ Debian GNU/Linux (powerpc) developer
XFree86 and DRI project member   /  CS student, Free Software enthusiast
Index: radeon_drv.h
===================================================================
RCS file: /cvsroot/dri/xc/xc/programs/Xserver/hw/xfree86/os-support/shared/drm/kernel/radeon_drv.h,v
retrieving revision 1.12
diff -p -u -r1.12 radeon_drv.h
--- radeon_drv.h	2 Jan 2003 18:38:07 -0000	1.12
+++ radeon_drv.h	3 Feb 2003 18:25:00 -0000
@@ -864,8 +857,13 @@ do {									\
 		dev_priv->ring.tail = write;				\
 } while (0)
 
-#define COMMIT_RING() do {					    \
-	RADEON_WRITE( RADEON_CP_RB_WPTR, dev_priv->ring.tail );		    \
+#define COMMIT_RING() do {						\
+	/* Flush writes to ring */					\
+	DRM_READMEMORYBARRIER();					\
+	GET_RING_HEAD( &dev_priv->ring );				\
+	RADEON_WRITE( RADEON_CP_RB_WPTR, dev_priv->ring.tail );		\
+	/* read from PCI bus to ensure correct posting */		\
+	RADEON_READ( RADEON_CP_RB_RPTR );				\
 } while (0)
 
 #define OUT_RING( x ) do {						\

Reply via email to