On 23-07-14 09:37, Christian König wrote:
> On 23.07.2014 09:31, Daniel Vetter wrote:
>> On Wed, Jul 23, 2014 at 9:26 AM, Christian König
>> <deathsimple at vodafone.de> wrote:
>>> It's not a locking problem I'm talking about here. Radeon's lockup handling
>>> kicks in when anything calls into the driver from the outside. If you have a
>>> fence wait function that's called from the outside but doesn't handle
>>> lockups, you essentially rely on somebody else calling another radeon
>>> function for the lockup to be resolved.
>> So you don't have a timer in radeon that periodically checks whether
>> progress is still being made? That's the approach we're using in i915,
>> together with some tricks to kick any stuck waiters so that we can
>> reliably step in and grab locks for the reset.
>
> We tried this approach, but it didn't work at all.
>
> I already considered trying it again because of the upcoming fence
> implementation, but on reflection, a driver being forced to change its
> handling because of the fence implementation is just another hint that
> something is wrong here.
As far as I can tell it wouldn't need to be reworked for the fence
implementation currently, only from the moment you want to allow callers
outside of radeon. :-)
Doing a GPU lockup recovery in the wait function would be messy even right now;
you would hit a deadlock in ttm_bo_delayed_delete ->
ttm_bo_cleanup_refs_and_unlock.

Regardless of the fence implementation, why would it be a good idea to do a
full lockup recovery when some other driver is calling your wait function?
That doesn't seem to be a nice thing to do, so I think a timeout is the best
error you could return here. Other drivers have to deal with that anyway.
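
To make the timeout idea concrete, here is a rough userspace sketch, not actual radeon or fence code. All names in it (sketch_fence, sketch_fence_poll, sketch_fence_wait_external) are made up for illustration, and the "ticks" are a simulated clock. I'm assuming a return convention of "remaining ticks on success, 0 on timeout". The point is only that the externally callable wait never attempts a reset; it gives up and reports a timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver fence; not the real radeon structure. */
struct sketch_fence {
    bool signaled;       /* set once the simulated GPU has passed the fence */
    long signal_at_tick; /* simulated tick of completion; < 0 means never (lockup) */
};

/* Simulated hardware poll: marks the fence signaled once its tick is reached. */
static bool sketch_fence_poll(struct sketch_fence *f, long now)
{
    if (!f->signaled && f->signal_at_tick >= 0 && now >= f->signal_at_tick)
        f->signaled = true;
    return f->signaled;
}

/*
 * Wait entry point intended for callers outside the driver (assumption).
 * Returns the remaining ticks (>= 1) if the fence signaled in time,
 * or 0 if the wait timed out. It deliberately never tries a GPU reset.
 */
long sketch_fence_wait_external(struct sketch_fence *f, long timeout_ticks)
{
    assert(f != NULL);
    for (long now = 0; now < timeout_ticks; now++) {
        if (sketch_fence_poll(f, now))
            return timeout_ticks - now;
    }
    return 0; /* timed out; lockup handling is left to the owning driver */
}
```

A foreign caller would treat a return value of 0 like any other timed-out wait, while the owning driver stays free to run its lockup recovery from its own context.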

~Maarten
