On 07/12/16 07:16 PM, Nicolai Hähnle wrote: > On 07.12.2016 08:50, Michel Dänzer wrote: >> On 06/12/16 10:24 PM, Marek Olšák wrote: >>> On Mon, Dec 5, 2016 at 10:05 AM, Michel Dänzer <mic...@daenzer.net> >>> wrote: >>>> On 03/12/16 05:38 AM, Marek Olšák wrote: >>>>> From: Marek Olšák <marek.ol...@amd.com> >>>>> >>>>> This fixes random radeonsi GPU hangs in Batman Arkham: Origins >>>>> (Wine) and >>>>> probably many other games too. >>>>> >>>>> cso_cache deletes sampler states when the cache size is too big and >>>>> doesn't >>>>> check which sampler states are bound, causing use-after-free in >>>>> drivers. >>>>> Because of that, radeonsi uploaded garbage sampler states and the >>>>> hardware >>>>> went bananas. Other drivers may have experienced similar issues. >>>>> >>>>> Cc: 13.0 12.0 <mesa-sta...@lists.freedesktop.org> >>>>> --- >>>>> src/gallium/auxiliary/cso_cache/cso_cache.c | 4 +++- >>>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/src/gallium/auxiliary/cso_cache/cso_cache.c >>>>> b/src/gallium/auxiliary/cso_cache/cso_cache.c >>>>> index b240c93..1f3be4b 100644 >>>>> --- a/src/gallium/auxiliary/cso_cache/cso_cache.c >>>>> +++ b/src/gallium/auxiliary/cso_cache/cso_cache.c >>>>> @@ -181,21 +181,23 @@ static inline void sanitize_cb(struct >>>>> cso_hash *hash, enum cso_cache_type type, >>>>> --to_remove; >>>>> } >>>>> } >>>>> >>>>> struct cso_hash_iter >>>>> cso_insert_state(struct cso_cache *sc, >>>>> unsigned hash_key, enum cso_cache_type type, >>>>> void *state) >>>>> { >>>>> struct cso_hash *hash = _cso_hash_for_type(sc, type); >>>>> - sanitize_hash(sc, hash, type, sc->max_size); >>>>> + >>>>> + if (type != CSO_SAMPLER) >>>>> + sanitize_hash(sc, hash, type, sc->max_size); >>>>> >>>>> return cso_hash_insert(hash, hash_key, state); >>>>> } >>>>> >>>>> struct cso_hash_iter >>>>> cso_find_state(struct cso_cache *sc, >>>>> unsigned hash_key, enum cso_cache_type type) >>>>> { >>>>> struct cso_hash *hash = _cso_hash_for_type(sc, type); >>>>> >>>>> >>>> >>>> If I understand correctly, this means that the maximum cache size >>>> effectively isn't enforced for sampler states? Wouldn't it be better >>>> instead to add code which prevents currently bound states from being >>>> deleted from here? >>> >>> Not in this patch. Maybe later and if CPU profiling results show that >>> it matters. >>> >>> Currently, if the cache size is exceeded, the GPU usually hangs. If >>> pruning the cache was important, most apps would hang. >>> >>> Thus, I'd like to push this as-is. >> >> Fair enough, but this will require some follow-up work: >> >> Why are sampler states singled out? The same problem can at least >> theoretically happen with other CSOs as well, right? > > The other CSOs only have single bind points. So with other CSOs, what > happens is that: > > 1. State tracker wants to bind some state, creates the CSO. > 2. cso_cache gets cleaned out; this might delete the currently bound CSO. > 3. Currently bound CSO gets immediately overwritten with the new state > anyway. > > Samplers are different because there are many bind points within a > single context. > > You're right though that the sounder approach would be to check all CSOs > against current bind points during cache cleanup, just in case a driver > does something with the old state on unbind.
That does seem rather unlikely though. Thanks for clarifying this. >> P.S. The bug has been there forever, so I don't see the justification >> for bypassing the normal stable branch process. > > And we've had mysterious hang bug reports forever as well. I wouldn't be > surprised if a large number or even all of them could be traced back to > this issue. If I had JPY 100 for every time I had such hope for a fix... :) I really hope this is the one, but experience suggests it's more likely that there are lots of different issues causing GPU hangs. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev