Jerome Glisse wrote:
> Hi Thomas,
>
> Dave found the root of a strange oops we encountered.
> I spent some time today trying to hack around it in ttm, but
> in the end my solution is wrong.
>
> So when we move an object, a ttm ghost object is created.
> If the GPU takes time to evict the bo, the ghost object ends up
> on the destroy list & stays on the lru list (unless I have
> misunderstood the code the whole day). Now if the ghost is
> in GTT (a similar issue can happen in different configurations;
> the bottom line is evicting a ghost object), it can get evicted,
> and that's when trouble starts. The driver callback gets
> called with the ghost object, but the ghost object wasn't
> created by the driver, and thus the driver will more than
> likely end up oopsing trying to access its private bo
> structure (the ttm_bo structure is embedded in the radeon_bo
> structure, and any driver relying on accessing the
> driver structure will hit this issue).
>
> I see 2 solutions :
>   - Don't put ghost on lru list
>   - Add a flag so we know if we can call driver
>     callback on object or not.
>   

Jerome,

In general, since the driver bo is *derived* from the ttm bo, and the
callback takes the base type (ttm bos) as arguments, the driver needs to
check the object type before typecasting. We do a similar check in the
vmwgfx driver by checking the bo destroy function to see whether it's
the driver-specific destroy, so as I see it, this first problem should
be viewed as a driver bug.
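
To illustrate the kind of check I mean, here is a minimal standalone sketch. It is not the actual ttm or vmwgfx code: struct base_bo, struct driver_bo, tiling_flags and all function names below are hypothetical stand-ins, and the base struct only carries the one field the check needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the base (ttm) buffer object. */
struct base_bo {
    void (*destroy)(struct base_bo *bo);
};

/* Hypothetical driver object that embeds the base object. */
struct driver_bo {
    struct base_bo base;
    int tiling_flags; /* example of driver-private state */
};

/* Destroy function installed only on objects the driver created. */
static void driver_bo_destroy(struct base_bo *bo)
{
    struct driver_bo *dbo =
        (struct driver_bo *)((char *)bo - offsetof(struct driver_bo, base));
    dbo->tiling_flags = 0;
}

/* Destroy function of an object the driver did not create (a ghost). */
static void ghost_bo_destroy(struct base_bo *bo)
{
    bo->destroy = NULL;
}

/* The type check: only driver-created objects carry the driver destroy. */
static bool is_driver_bo(const struct base_bo *bo)
{
    return bo->destroy == driver_bo_destroy;
}

/* Downcast that is only valid after the check above. */
static struct driver_bo *to_driver_bo(struct base_bo *bo)
{
    assert(is_driver_bo(bo));
    return (struct driver_bo *)((char *)bo - offsetof(struct driver_bo, base));
}

/*
 * Hypothetical callback that receives the base type. It returns the
 * driver-private value, or -1 when handed an object that has no
 * driver-private part.
 */
static int driver_move_notify(struct base_bo *bo)
{
    if (!is_driver_bo(bo))
        return -1; /* handle as a plain base object */
    return to_driver_bo(bo)->tiling_flags;
}
```

With a check like this, a callback handed a ghost object takes the base-object path instead of dereferencing driver state that was never allocated.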

Note that if you need driver-private per-bo information to be added to a
bo in order for move() to work, you should carefully review whether it's
*really* needed. If it is, we must set up a callback to add that
information at bo creation, but in general the driver-specific move
function should be able to handle the base object type.

> I will send the first solution patch, but I haven't yet
> found an easy way to exercise this path. My concern is
> that in ttm_bo_mem_force_space we might fail because
> we don't wait for the ghost object to actually die and
> free space (an issue only if no_wait=false).
>   
> Also I wonder whether leaving a ghost bo object on the lru might
> lead to an infinite eviction loop. The case I am thinking
> of:
>   - VRAM is full with only 1 object we can evict; we evict
>     it and create a ghost object holding the vram space.
>     The eviction takes long enough that we put the ghost
>     on the lru. ttm_bo_mem_force_space evicts the ghost object
>     and we loop on this.
>
> Anyway, what are your thoughts on this?
>   

This situation is actually handled by the evict bool. When @evict==true,
no ghost object is created and eviction is synchronous, so rather than
being incorrect, we're being suboptimal.

I admit this isn't an ideal solution.

My plan when I get time is to implement fully asynchronous memory 
management. That means that the managers are in sync with the CPU and 
not the GPU, and all buffer moves are pipelined, provided that the 
driver supports it. This also means that I will hang a sync object on 
each memory type manager, so that if we need to switch hw engine, and 
sync the manager with the GPU, we can wait on that sync object.

This will mean that when you evict a buffer object, its space will 
immediately show up as free, although it really isn't free yet, but it 
*will* be free when the gpu executes a move to that memory region, since 
the eviction will be scheduled before the move to memory.
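
As a rough model of that ordering argument (a toy sketch only, not a proposed ttm interface; struct mem_manager, the command queue and every function name here are made up for illustration), the manager tracks free space as the CPU sees it, and an in-order queue stands in for the GPU:

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 16

enum cmd_kind { CMD_EVICT, CMD_MOVE_IN };

struct cmd {
    enum cmd_kind kind;
    size_t size;
};

/* Hypothetical memory type manager, kept in sync with the CPU only. */
struct mem_manager {
    size_t cpu_free; /* free space as the manager accounts for it */
    size_t gpu_free; /* space actually free once queued work has run */
    struct cmd queue[QUEUE_MAX];
    size_t head; /* next command the GPU will execute */
    size_t tail; /* next free slot in the queue */
};

static bool queue_push(struct mem_manager *m, enum cmd_kind kind, size_t size)
{
    if (m->tail == QUEUE_MAX)
        return false;
    m->queue[m->tail].kind = kind;
    m->queue[m->tail].size = size;
    m->tail++;
    return true;
}

/* Eviction is only scheduled, yet its space shows up as free at once. */
static bool schedule_evict(struct mem_manager *m, size_t size)
{
    if (!queue_push(m, CMD_EVICT, size))
        return false;
    m->cpu_free += size;
    return true;
}

/* A move into the region is accepted against the CPU-side accounting. */
static bool schedule_move_in(struct mem_manager *m, size_t size)
{
    if (m->cpu_free < size)
        return false;
    if (!queue_push(m, CMD_MOVE_IN, size))
        return false;
    m->cpu_free -= size;
    return true;
}

/*
 * Stand-in for waiting on the manager's sync object: run all queued
 * commands in order. Returns false if a move ever found its space
 * still occupied, which in-order execution should rule out.
 */
static bool manager_sync(struct mem_manager *m)
{
    while (m->head < m->tail) {
        const struct cmd *c = &m->queue[m->head++];

        if (c->kind == CMD_EVICT) {
            m->gpu_free += c->size;
        } else {
            if (m->gpu_free < c->size)
                return false;
            m->gpu_free -= c->size;
        }
    }
    return true;
}
```

In this model a move into a full region is refused, but once an eviction has been queued the same move is accepted immediately, and it still finds the space free when it executes because the eviction sits ahead of it in the queue.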

Thanks,
Thomas


> Cheers,
> Jerome
>   


--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
