On Mon, Feb 16, 2015 at 2:59 PM, Ben Widawsky <[email protected]> wrote:
> On Mon, Feb 16, 2015 at 02:14:41PM -0500, Alex Deucher wrote:
>> On Sat, Feb 14, 2015 at 2:21 AM, Ilia Mirkin <[email protected]> wrote:
>> > On Sat, Feb 14, 2015 at 2:11 AM, Ben Widawsky
>> > <[email protected]> wrote:
>> >> On Sat, Feb 14, 2015 at 02:07:32AM -0500, Ilia Mirkin wrote:
>> >>> On Sat, Feb 14, 2015 at 1:54 AM, Ben Widawsky
>> >>> <[email protected]> wrote:
>> >>> > +static struct query queries[] = {
>> >>> > +     {
>> >>> > +      .query = GL_FRAGMENT_SHADER_INVOCATIONS_ARB,
>> >>> > +      .name = "GL_FRAGMENT_SHADER_INVOCATIONS_ARB",
>> >>> > +      .min = TEST_WIDTH * TEST_HEIGHT / 2,
>> >>> > +      .max = 0xffffffff},
>> >>> > +     /* XXX:
>> >>> > +      * Intel hardware has some very unpredictable results for fragment
>> >>> > +      * shader invocations. After a day of head scratching, I've given up.
>> >>> > +      * Generating a real min, or max is not possible. The spec allows this.
>> >>> > +      * This will also help variance across vendors.
>> >>> > +      */
>> >>>
>> >>> Is there a working theory as to how this could be less than width *
>> >>> height? Does it count 1 per quad? (Or how it could be much more than
>> >>> width * height... I can see edges getting processed unnecessarily,
>> >>> but... max_int seems high.)
>> >>
>> >> No working theory on min, but I figured if we're going to fudge the max,
>> >> we may as well fudge the min. What would you like as a max? I can show
>> >> you hardware which generates way more invocations than anything I can
>> >> contrive. 1440 invocations for an 8x8.
>> >
>> > Impressive :)
>> >
>> > Best I can do is suggest that I don't think you're counting what you
>> > think you're counting. This has probably occurred to you, but you
>> > really should triple-check that you're reading (and writing) from the
>> > right place for this counter.
>> >
>>
>> I echo this sentiment. You might also check if there are any
>> additional state bits related to counts. For example, IIRC, on radeon
>> hw there is some additional state you need to set to get accurate
>> counts for occlusion queries.
>>
>> Alex
>
> Triple check? I'm way past triple. I've had some other people look into it
> as well, so I am not the only one confused.
>
> Alex, FYI, there was some follow-up discussion on IRC which probably should
> have been in the commit message in the first place. Haswell works exactly as
> expected. For example, a 4x4 rectangle of 2 triangles generates 3 2x2
> subspans per triangle, for a total of 6 subspans, or 24 pixels. For all
> powers of two squares tested, Haswell works exactly as expected.
>
> Perhaps not a coincidence, but the HW that counts this stuff changed for
> IVB, and then again for HSW. So for some time, the working theory was, we
> just don't know how to count pre-HSW (in particular, IVB). No big deal.
> However, it turns out Gen8 seems to behave in exactly the same manner as
> IVB. I've yet to try Gen9, but there is definitely no errata I can find
> that I haven't already implemented.
>
> Further confusion which I didn't mention - very large triangles generate a
> PS invocation count that is about 1/4 the total number of pixels. I forget
> the exact count, but a 256x256 square was something like 10,000 pixels.
>
> I think we all agree there is no point in holding up the series for this,
> right?
No objections from me. I was just throwing out possible ideas to
explain the behavior you were seeing, but it sounds like you've pretty
well trodden that path at this point.

Alex
_______________________________________________
Piglit mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/piglit
