Framebuffer read coherency (r3xx)
Hi,

I'm currently trying to figure out why glean/texCube reproducibly fails on my R420. That test basically has a number of sections that look like this:

    loop(..) {
        render();
        loop(..) {
            glReadPixels(..);
            check_values();
        }
    }

The outer loop overwrites the same region of the framebuffer in every iteration. In the third iteration of the outer loop in the TestReflectionMap section, this test reads a stale pixel, i.e. the value returned from glReadPixels is the value from the second iteration.

Each of the following changes individually fixes the problem:

1) Do not overwrite the same region of the framebuffer in every iteration; instead, use a different framebuffer region for every iteration.
2) Add a sleep(1) before glReadPixels.
3) Add a sleep(1) after glReadPixels.
4) Call wbinvd() in the DRM at the end of radeon_cp_idle() (after the call to radeon_do_cp_idle), so that radeon_span.c ends up triggering a wbinvd before the actual read takes place.

To me, this looks like glReadPixels reads a stale value from a cache somewhere. But which cache, and why?

To add to the mystery, Glean has a lot of tests that look very similar, and they work just fine (though their cache access pattern might be different enough to hide the symptom).

Is there something I'm missing? And finally: What's the proper way to fix a problem like this?

I'd appreciate all the help I can get in understanding this problem, since I've never dealt with something like it before.

Thanks,
Nicolai

--
Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source.
http://sourceforge.net/services/buy/index.php
--
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: Framebuffer read coherency (r3xx)
> Each of the following changes individually fixes the problem:
>
> 1) Do not overwrite the same region of the framebuffer in every iteration; instead, use a different framebuffer region for every iteration.
> 2) Add a sleep(1) before glReadPixels.
> 3) Add a sleep(1) after glReadPixels.
> 4) Call wbinvd() in the DRM at the end of radeon_cp_idle() (after the call to radeon_do_cp_idle), so that radeon_span.c ends up triggering a wbinvd before the actual read takes place.

It sounds like a GPU cache flush is missing. The pixels are probably sitting in the texture cache and may not have hit the framebuffer when you read. The wbinvd is only acting like a sleep in this case.

Dave.
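The hypothesis above (rendered pixels still sitting in a GPU-side cache when the CPU reads the framebuffer) can be illustrated with a toy model. This is only a sketch under stated assumptions, not driver code: `struct toy_gpu`, `TOY_WRITEBACK_LATENCY` and every `toy_*` function are hypothetical names invented for the illustration, and the model reduces the framebuffer to a single pixel with a write-back that takes a fixed number of ticks to land.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: one pixel of "framebuffer" behind a GPU-side cache.
 * All names here are hypothetical; none of this is real radeon/DRM code. */
struct toy_gpu {
    uint32_t memory;     /* what a CPU read of the framebuffer returns */
    uint32_t cache;      /* last rendered value, possibly not written back */
    int      dirty;      /* cache holds data newer than memory */
    int      ticks_left; /* ticks until the in-flight write-back lands */
};

#define TOY_WRITEBACK_LATENCY 3 /* arbitrary delay, in ticks */

/* Rendering only updates the cache. */
void toy_render(struct toy_gpu *g, uint32_t value)
{
    g->cache = value;
    g->dirty = 1;
}

/* Kick off a write-back, but do not wait for it to complete. */
void toy_flush_start(struct toy_gpu *g)
{
    if (g->dirty)
        g->ticks_left = TOY_WRITEBACK_LATENCY;
}

/* Let one unit of time pass; the write-back lands when the counter hits 0. */
void toy_tick(struct toy_gpu *g)
{
    if (g->ticks_left > 0 && --g->ticks_left == 0) {
        g->memory = g->cache;
        g->dirty = 0;
    }
}

/* Wait until no write-back is in flight (also what a long sleep achieves). */
void toy_wait_idle(struct toy_gpu *g)
{
    while (g->ticks_left > 0)
        toy_tick(g);
}

/* One outer-loop iteration of the test: render, flush, read back.
 * With wait == 0 the read races the write-back. Returns the value read. */
uint32_t toy_iteration(struct toy_gpu *g, uint32_t value, int wait)
{
    uint32_t seen;

    assert(g != NULL);
    toy_render(g, value);
    toy_flush_start(g);
    if (wait)
        toy_wait_idle(g);
    seen = g->memory;  /* the "glReadPixels" of this model */
    toy_wait_idle(g);  /* time spent checking values lets the data land */
    return seen;
}
```

In this model, calling `toy_iteration()` with `wait == 0` returns the value from the previous iteration, while `wait == 1` returns the value just rendered. That matches the observation that a sleep hides the symptom, but it does not say which cache is responsible on real hardware.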
Re: Framebuffer read coherency (r3xx)
On 07.06.2008 23:42, Dave Airlie wrote:
>> Each of the following changes individually fixes the problem:
>>
>> 1) Do not overwrite the same region of the framebuffer in every iteration; instead, use a different framebuffer region for every iteration.
>> 2) Add a sleep(1) before glReadPixels.
>> 3) Add a sleep(1) after glReadPixels.
>> 4) Call wbinvd() in the DRM at the end of radeon_cp_idle() (after the call to radeon_do_cp_idle), so that radeon_span.c ends up triggering a wbinvd before the actual read takes place.
>
> It sounds like a GPU cache flush is missing, the pixels are probably sitting in the texture cache, and may not have hit the framebuffer when you read, the wbinvd is only acting like a sleep in this case.
>
> Dave.

Hmm, radeon_cp_idle() should purge the destination cache (and wait for it too, including checking the DC_BUSY bit). At least the r200 driver has a comment in r200SpanRenderStart (including a dodgy workaround) which sure looks like the same issue to me.

Roland
Re: Framebuffer read coherency (r3xx)
> Hmm, the radeon_cp_idle() should purge the destination cache (and wait for it too, including checking the DC_BUSY bit). At least the r200 driver has a comment in r200SpanRenderStart (including a dodgy workaround) which sure looks like the same issue to me.

We may purge the cache but not wait for the purge to finish... Sounds like it's time for some experiments. There may also be a HOST data cache that we need to deal with.

Dave.
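The difference between "purge" and "purge and wait for the purge to finish" can be sketched against a simulated register. This is a minimal illustration, not the radeon code path: `struct fake_regs`, `FAKE_DC_FLUSH`, `FAKE_DC_BUSY` and the `fake_*` helpers are hypothetical, the bit positions are made up, and the "hardware" is simulated by a counter that clears the busy bit after a fixed number of register reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for the illustration only. */
#define FAKE_DC_FLUSH (1u << 0)  /* request a destination cache purge */
#define FAKE_DC_BUSY  (1u << 31) /* purge still in progress */

/* Simulated hardware state; not a real register block. */
struct fake_regs {
    uint32_t dstcache_ctl;
    int      busy_polls_left; /* register reads until the purge completes */
};

/* Request a purge. Returning from this does NOT mean the purge is done. */
void fake_request_purge(struct fake_regs *r, int latency)
{
    assert(latency > 0);
    r->dstcache_ctl |= FAKE_DC_FLUSH | FAKE_DC_BUSY;
    r->busy_polls_left = latency;
}

/* Simulated register read: the busy bit clears after enough reads. */
uint32_t fake_read_ctl(struct fake_regs *r)
{
    if (r->busy_polls_left > 0 && --r->busy_polls_left == 0)
        r->dstcache_ctl &= ~FAKE_DC_BUSY;
    return r->dstcache_ctl;
}

/* Purge, then poll the busy bit with a bounded number of attempts.
 * Returns true once the purge has finished, false on timeout. */
bool purge_and_wait(struct fake_regs *r, int latency, int max_polls)
{
    fake_request_purge(r, latency);
    for (int i = 0; i < max_polls; i++) {
        if (!(fake_read_ctl(r) & FAKE_DC_BUSY))
            return true;
    }
    return false; /* still busy: reading the framebuffer now may be stale */
}
```

A caller that only issues `fake_request_purge()` and then reads the framebuffer is in the "purged but did not wait" situation described above. The bounded poll is a design choice for the sketch, so that a hung purge shows up as a timeout instead of an endless loop.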