On Wed, Apr 16, 2014 at 3:18 AM, Eric Anholt <e...@anholt.net> wrote: > Kenneth Graunke <kenn...@whitecape.org> writes: > >> On 04/14/2014 05:33 PM, Eric Anholt wrote: >>> This manifested as rendering failures or sometimes GPU hangs in >>> compositors when they accidentally got MSAA visuals due to a bug in the X >>> Server. Today we decided that the problem in compositors was equivalent >>> to a corruption bug we'd noticed recently in resizing MSAA-visual >>> glxgears, and debugging got a lot easier. >>> >>> When we allocate our MCS MT, libdrm takes the size we request, aligns it >>> to Y tile size (blowing it up from 300x300=900000 bytes to 384*320=122880 >>> bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes, >>> 32 pages). Because it's Y tiled, we attach a 384-byte-stride fence to it. >>> When we memset by the BO size in Mesa, between bytes 122880 and 131072 the >>> data gets stored to the first 20 or so scanlines of each of the 3 tiled >>> pages in that row, even though only 2 of those pages were allocated by >>> libdrm. >> >> What? >> >> I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a >> drm_intel_bo where bo->size is larger than what you asked for, due to >> the BO cache. But...what you're saying is, it doesn't actually allocate >> enough pages to back the whole bo->size it gives you? So, if you write >> bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's >> really difficult to detect? > > You have that many pages, really. But you've attached a fence to it, so > your allocated pages are structured as: > > +---+---+---+ > | | | | > +---+---+---+ > | | | | > +---+---+---+ > | | | | > +---+---+---+ > | | | > +---+---+ > > (except taller in this specific example). If you hit the pixels in > those quads, you'll be fine. > >> >> There are other places where we memset an entire BO using bo->size. For >> example, your INTEL_DEBUG=shader_time code does exactly that (though it >> isn't tiled). >> >> Could we change libdrm to set bo->size to the actual usable size of the >> buffer, rather than the bucket size? > > The pages containing pixels you asked for go to 122880, and the BO is > 131072, but the pixels you asked for have a maximum linear address of > 384*320=115200. Which value are you thinking is the "actual usable > size"? We certainly shouldn't have been memsetting more pixels than > 115200. 384*320 is 122880. It feels like bo->size could be 122880, and 131072 could be stored elsewhere in bo_gem.
With that change assumed, do you think it makes sense to add if (tiling_mode != I915_TILING_NONE && bo->size % stride) fprintf(stderr, "bo size is not a multiple of stride\n"); to drm_intel_gem_bo_set_tiling_internal? That is, emit a warning when drm_intel_gem_bo_map_gtt(bo); memset(bo->virtual, 0, bo->size); is known to explode. > > _______________________________________________ > mesa-stable mailing list > mesa-sta...@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-stable > -- o...@lunarg.com _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev