-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 07/11/2011 12:49 PM, Dan McCabe wrote: > On 07/09/2011 08:56 AM, Chad Versace wrote: >> Up until this point, we incorrectly believed that the stencil buffer is >> Y-tiled. In fact, it is W tiled. From PRM Vol 1 Part 2 Section 4.5.2.1 >> W-Major Tile Format: >> "W-Major Tile Format is used for separate stencil." >> >> Since the stencil buffer is allocated with I915_TILING_Y, the span >> functions must decode W tiling through a Y tiled fence. >> >> On gen5 with intel_screen.hw_must_use_separate_stencil enabled, >> Fixes-Piglit-test: stencil-drawpixels >> Fixes-Piglit-test: stencil-scissor-clear >> Fixes-Piglit-test: readpixels-24_8 >> >> Note: This is a candidate for the 7.11 branch >> Signed-off-by: Chad Versace<c...@chad-versace.us> >> --- >> src/mesa/drivers/dri/intel/intel_span.c | 52 >> +++++++++++++++++++++---------- >> 1 files changed, 35 insertions(+), 17 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/intel/intel_span.c >> b/src/mesa/drivers/dri/intel/intel_span.c >> index 153803f..f39c008 100644 >> --- a/src/mesa/drivers/dri/intel/intel_span.c >> +++ b/src/mesa/drivers/dri/intel/intel_span.c >> @@ -141,28 +141,46 @@ intel_set_span_functions(struct intel_context *intel, >> /** >> * \brief Get pointer offset into stencil buffer. >> * >> - * The stencil buffer interleaves two rows into one. Yay for crazy hardware. >> - * The table below demonstrates how the pointer arithmetic behaves for a >> buffer >> - * with positive stride (s=stride). >> - * >> - * x | y | byte offset >> - * -------------------------- >> - * 0 | 0 | 0 >> - * 0 | 1 | 1 >> - * 1 | 0 | 2 >> - * 1 | 1 | 3 >> - * ... | ... | ... >> - * 0 | 2 | s >> - * 0 | 3 | s + 1 >> - * 1 | 2 | s + 2 >> - * 1 | 3 | s + 3 >> - * >> + * The stencil buffer is W-tiled, yet the drm buffer is allocated with >> + * I915_TILING_Y. So here we must decode the W tiling through a Y fence. >> * >> + * From PRM Vol 1 Part 2 Section 4.5.2.1 W-Major Tile Format: >> + * "W-Major Tile Format is used for separate stencil." >> */ >> static inline intptr_t >> intel_offset_S8(int stride, GLint x, GLint y) >> { >> - return 2 * ((y / 2) * stride + x) + y % 2; >> + /* f: (x, y) -> (fx, fy) */ >> + int fx = x / 8; >> + int fy = y / 4; >> + >> + /* e: (x, y) -> (ex, 0) */ >> + int ex = (x % 8) / 4; >> + >> + /* d: (x, y) -> (dx, dy) */ >> + int dx = (x % 4) / 2; >> + int dy = (y % 4) / 2; >> + >> + /* c: (x, y) -> (cx, cy) */ >> + int cx = x % 2; >> + int cy = y % 2; >> + >> + int s = stride; >> + intptr_t o = 0; >> + >> + if (s> 0) { >> + /*f*/ o += 16 * fx + 4 * s * fy; >> + /*e*/ o += 2 * s * ex; >> + /*d*/ o += 4 * dx + 8 * dy; >> + /*c*/ o += cx + 2 * cy; >> + } else { >> + /*f*/ o += 16 * fx + 4 * s * fy; >> + /*e*/ o += 2 * s * (1 - ex); >> + /*d*/ o += 4 * dx + 8 * (1 - dy); >> + /*c*/ o += cx + 2 * (1 - cy); >> + } >> + >> + return o; >> } >> >> #define WRITE_STENCIL(x, y, src) buf[intel_offset_S8(stride, x, y)] = src; > Can stride ever be negative? If so, why?
Yes. The stride is negative for window-system renderbuffers. > If the app ever specified a negative stride, it could have fixed at buffer > creation time (by also adjusting the buffer base address). No need to worry > about that issue thereafter. No-can-do. As far as I know, X demands that its buffers have negative stride. > On the other hand, negative strides could be considered evil :). Yes, they are :) > Also, can x or y ever be negative? No. [snip] > Using "*", "/" and "%" for bit manipulations of pixel addresses should be > avoided. Shifts and masks are clearer for bit manipulation, IMO. > > Regarding the offset computation, it might be useful to think of x,y > addresses of the tile and then of x,y addresses within the tile to make the > code more readable and perhaps simplify your computations. Damn crazy hardware. Here we need to decode a W tile through a Y fence, so the "x,y addresses of the tile and then of x,y addresses within the tile" are non-existent. The fence mapping has carved up the W tile and made a mess of it. So, there are no tile addresses to compute. Here is the little meaning that my equations possess: - (fx + ex, fy) is the address of a 4x4 block in which (x, y) resides. - Decompose that 4x4 block into 2x2 blocks. (dx, dy) is the address of that 2x2 block within the 4x4 block. - (cx, cy) is the address of (x, y) within that 2x2 block. > For example, if stride is a power of 2, tile size is 2x2 pixels, and your x > and y address bits look like (upper case bits are tile bits and lower case > letters are intra-tile bits): > XXXXXx > and > YYYYYy > then the offset for that pixel and a power of two stride has a bit pattern > that looks like > YYYYXXXXyx > But the devil is in the details and this might not be valid for our > particular (crazy?) hardware. YMMV. Can a similiar set of bit operations replicate intel_offset_S8()? > cheers, danm - -- Chad Versace c...@chad-versace.us -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIcBAEBAgAGBQJOG4EuAAoJEAIvNt057x8iZowQAIJXGQBGlBDUC9US7U+r83Jp SulXEua8kE/KHfptTN7WoPgmdmrRKH2JMcIC2kcCUE6I9/PdSRV2VIaVgWr/pMdP Z1SUVM4lQP57rPRe5iZmlS43u4gUSQaEr0fbYU6RynD4tXphH5W8gnF5QdF6/pGo LNIaGJ4ya3euAWX+aJwYI7Z+skbZ1ucyDUbII+RpDOnrO80v4E8uVrrGizr5PQtp VrX6Zpr1s800kgUXBtkiREK71VSSUYSxFEQ3xP/xBYzUcyhJxpqapN7qOWu9tOI+ 2OGy9MX2JuMwqmjVDLg8ap6dsQ92OjKtsgatNJ94Llg5xfP1hN7t91IRWtYpK9/V v90d/OEu4R1z+8864oqmo7oKlAKersFhhrvIUQ22N6lVxzQqjoR+Ph9BXUwhgnKb iJ3JMDE90UrJWeO+EnqBbL75HgvgcjZr15PlqZSKNvcNd5M5R/7lPKg/3Zbk88cy NRJtnrQdWrFk7aGrdi2bfIqm1teCq9kWBBbYuKQn1buXsXtMEC46SJKS9vSua/pF nVvr1361M9c+Whd4jTSErd/vLvgUA0dxptWPAestalT6i7yYoeq+ZxA236TIqvDZ 1AbSiDeXzslv3ArshbWrtQAZw33cop1z9HGiYMNCdbsdLVJtN1zf0Y25VV3nBcom 3anb5cgbYF601N7wI+CN =Y7Df -----END PGP SIGNATURE----- _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev