On 06/12/16 10:24 PM, Marek Olšák wrote: > On Mon, Dec 5, 2016 at 10:05 AM, Michel Dänzer <mic...@daenzer.net> wrote: >> On 03/12/16 05:38 AM, Marek Olšák wrote: >>> From: Marek Olšák <marek.ol...@amd.com> >>> >>> This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and >>> probably many other games too. >>> >>> cso_cache deletes sampler states when the cache size is too big and doesn't >>> check which sampler states are bound, causing use-after-free in drivers. >>> Because of that, radeonsi uploaded garbage sampler states and the hardware >>> went bananas. Other drivers may have experienced similar issues. >>> >>> Cc: 13.0 12.0 <mesa-sta...@lists.freedesktop.org> >>> --- >>> src/gallium/auxiliary/cso_cache/cso_cache.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/src/gallium/auxiliary/cso_cache/cso_cache.c >>> b/src/gallium/auxiliary/cso_cache/cso_cache.c >>> index b240c93..1f3be4b 100644 >>> --- a/src/gallium/auxiliary/cso_cache/cso_cache.c >>> +++ b/src/gallium/auxiliary/cso_cache/cso_cache.c >>> @@ -181,21 +181,23 @@ static inline void sanitize_cb(struct cso_hash *hash, >>> enum cso_cache_type type, >>> --to_remove; >>> } >>> } >>> >>> struct cso_hash_iter >>> cso_insert_state(struct cso_cache *sc, >>> unsigned hash_key, enum cso_cache_type type, >>> void *state) >>> { >>> struct cso_hash *hash = _cso_hash_for_type(sc, type); >>> - sanitize_hash(sc, hash, type, sc->max_size); >>> + >>> + if (type != CSO_SAMPLER) >>> + sanitize_hash(sc, hash, type, sc->max_size); >>> >>> return cso_hash_insert(hash, hash_key, state); >>> } >>> >>> struct cso_hash_iter >>> cso_find_state(struct cso_cache *sc, >>> unsigned hash_key, enum cso_cache_type type) >>> { >>> struct cso_hash *hash = _cso_hash_for_type(sc, type); >>> >>> >> >> If I understand correctly, this means that the maximum cache size >> effectively isn't enforced for sampler states? Wouldn't it be better >> instead to add code which prevents currently bound states from being >> deleted from here? > > Not in this patch. Maybe later and if CPU profiling results show that > it matters. > > Currently, if the cache size is exceeded, the GPU usually hangs. If > pruning the cache was important, most apps would hang. > > Thus, I'd like to push this as-is.
Fair enough, but this will require some follow-up work: Why are sampler states singled out? The same problem can at least theoretically happen with other CSOs as well, right? The maximum cache size needs to be addressed, either something like above or by just removing the maximum cache size. BTW, if varying the LOD bias is common, maybe the LOD bias value should be tracked separately from the remaining sampler state? P.S. The bug has been there forever, so I don't see the justification for bypassing the normal stable branch process. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev