Hi Thomas,
      I understand your meaning as below:
a. When the CPU joins in, it must wait for the sync object before the
device address space is really free.
b. When the CPU is absent, but there are two independent HW engines
touching the space, the second one must wait for the sync object.
c. Fully pipelined bo moves are supported when only one HW engine is
related to the space.
Am I right?
About *b*, let's say:
1) Schedule a copy of the bo out of VRAM on the HW DMA engine.
2) Put a corresponding sync object on the manager.
3) Free the vram region.
4) Region gets allocated.
5) The GPU 2D engine renders to this region.
Since the GPU 2D engine and the HW DMA engine are totally independent of
each other, the sync object must still be waited on (signaled) before
step 5, as in the sketch below.
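A minimal sketch of what I mean (all the names here are made up, not the
real TTM API):

#include <stdbool.h>

/* Hypothetical stand-ins; none of these are real TTM types or calls. */
struct sync_object { bool signaled; };

struct mem_type_manager {
        struct sync_object *sync; /* left behind by the last scheduled move */
};

/* Stand-in for a real fence wait (a driver would sleep on an irq). */
static void sync_object_wait(struct sync_object *sync)
{
        sync->signaled = true; /* pretend the DMA copy just completed */
}

/* Step 5: an independent engine touches the freshly freed region. */
static void gpu_2d_render_to_region(struct mem_type_manager *man)
{
        /*
         * The eviction copy was queued on the HW DMA engine; the 2D
         * engine has no implicit ordering with it, so we must wait on
         * the manager's sync object before the 2D engine may render.
         */
        if (man->sync && !man->sync->signaled)
                sync_object_wait(man->sync);
        /* ... emit the 2D render commands here ... */
}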



2009/12/15 Thomas Hellström <tho...@shipmail.org>

> Donnie Fang wrote:
>
>> Hi Thomas,
>>      I have several doubts. please check them as below.
>>
>> 2009/12/15 Thomas Hellstrom <tho...@shipmail.org>
>>
>>
>>    Jerome Glisse wrote:
>>    > Hi Thomas,
>>    >
>>    > Dave found the root cause of a strange oops we encountered.
>>    > I spent some time today trying to hack ttm around, but
>>    > in the end my solution is wrong.
>>    >
>>    > So when we move an object, a ttm ghost object is created.
>>    > If the GPU takes time to evict the bo, the ghost object ends up
>>    > on the destroy list & stays on the lru list (unless I have
>>    > misunderstood the code the whole day). Now if the ghost is
>>    > in GTT (a similar issue can happen in a different configuration;
>>    > the bottom line is evicting a ghost object), it can get evicted,
>>    > and that's when trouble starts. The driver callback gets
>>    > called with the ghost object, but the ghost object hasn't been
>>    > created by the driver, and thus the driver will more than
>>    > likely end up oopsing trying to access its private bo
>>    > structure (the ttm_bo structure is embedded in the radeon_bo
>>    > structure, and any driver relying on accessing the
>>    > driver structure will hit this issue).
>>    >
>>    > I see 2 solutions:
>>    >   - Don't put ghosts on the lru list.
>>    >   - Add a flag so we know whether we can call the driver
>>    >     callback on an object or not.
>>    >
>>
>>    Jerome,
>>
>>    In general, since the driver bo is *derived* from the ttm bo, and the
>>    callback takes the base type (ttm bos) as arguments, the driver needs
>>    to check the object type before typecasting. We do a similar check in
>>    the vmwgfx driver by checking the bo destroy function, to see whether
>>    it's the driver-specific destroy, so this first problem should be
>>    viewed as a driver bug, as I see it.
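If I follow you, the vmwgfx-style check could look roughly like this (my
own minimal sketch; the type and function names are hypothetical, only the
destroy-pointer comparison mirrors what you describe):

#include <stddef.h>

/* Hypothetical, simplified mirrors of the base and derived bo types. */
struct ttm_buffer_object {
        void (*destroy)(struct ttm_buffer_object *bo);
};

struct driver_bo {
        struct ttm_buffer_object base; /* base type embedded first */
        int driver_private;            /* driver-only state */
};

static void driver_bo_destroy(struct ttm_buffer_object *bo)
{
        /* free the containing driver_bo here */
}

/* Downcast only when the destroy callback proves the bo is ours. */
static struct driver_bo *to_driver_bo(struct ttm_buffer_object *bo)
{
        if (bo->destroy != driver_bo_destroy)
                return NULL; /* e.g. a ghost bo created by the ttm core */
        return (struct driver_bo *)bo; /* safe: base is the first member */
}

So a move() callback handed a ghost bo would get NULL back and fall back
to base-object handling instead of oopsing.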
>>
>>    Note that if you need driver-private per-bo information to be added
>>    to a bo in order for move() to work, you should carefully review
>>    whether it's *really* needed, and in that case we must set up a
>>    callback to add that information at bo creation; but in general the
>>    driver-specific move function should be able to handle the base
>>    object type.
>>
>>    > I will send a patch for the first solution, but I haven't yet
>>    > found an easy way to exercise this path. My concern is
>>    > that in ttm_bo_mem_force_space we might fail because
>>    > we don't wait for the ghost object to actually die and
>>    > free its space (an issue only if no_wait=false).
>>    >
>>    > Also, I wonder whether leaving a ghost bo on the lru might
>>    > not lead to an infinite eviction loop. The case I am thinking
>>    > of:
>>    >   - VRAM is full and there is only 1 object we can evict. We
>>    >     evict it and create a ghost object holding the vram space;
>>    >     the eviction takes long enough that we put the ghost on
>>    >     the lru. ttm_bo_mem_force_space evicts the ghost object,
>>    >     and we loop on this.
>>    >
>>    > Anyway, what are your thoughts on this?
>>    >
>>
>>    This situation is actually handled by the evict bool. When
>>    @evict==true, no ghost object is created, and eviction is
>>    synchronous, so rather than being incorrect, we're being suboptimal.
>>
>>    I admit this isn't the most optimal solution.
>>
>>    My plan when I get time is to implement fully asynchronous memory
>>    management. That means that the managers are in sync with the CPU and
>>    not the GPU, and all buffer moves are pipelined, provided that the
>>    driver supports it. This also means that I will hang a sync object on
>>    each memory type manager, so that if we need to switch hw engines and
>>    sync the manager with the GPU, we can wait on that sync object.
>>
>>    This will mean that when you evict a buffer object, its space will
>>    immediately show up as free, although it really isn't free yet, but it
>>    *will* be free when the gpu executes a move to that memory region,
>>    since the eviction will be scheduled before the move to that memory.
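So, if I read this right, each memory type manager would carry something
like the following (my own sketch, hypothetical names):

/* Hypothetical sketch of the proposed per-manager state; not existing code. */
struct sync_object;

struct mem_type_manager {
        /*
         * Signals when every move scheduled out of this manager's space
         * has completed. The free-space accounting runs ahead of the GPU
         * (in sync with the CPU); this object closes the gap on demand.
         */
        struct sync_object *sync;
};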
>>
>> Does the space show up as free immediately even when the fence object of
>> this bo hasn't been signaled?
>> The ttm core currently hands it to a ghost bo and lets it track the old
>> bo's space, freeing that space only when the bo's fence is signaled.
>> How would these modifications fit in? Would you please give more hints?
>>
>
> Yes, the space will show up as free immediately.  However, if you want to
> *use* the space immediately, you will have to wait on the sync object that
> will be attached to the manager. Let's say you copy _from_ vram using a hw
> dma engine, and copy _to_ vram using the CPU:
>
> 1) Schedule a copy from a vram region.
> 2) Put a corresponding sync object on the manager.
> 3) Free the vram region.
> 4) Region gets allocated.
> 5) We want to memcpy to the vram region. We need to wait for the manager
> fence.
> 6) Memcpy.
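So steps 5) and 6) on the CPU path would be something like this (my sketch,
hypothetical names again):

#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins, as in my sketch above. */
struct sync_object { bool signaled; };
struct mem_type_manager { struct sync_object *sync; };

static void sync_object_wait(struct sync_object *sync)
{
        sync->signaled = true; /* stand-in for a real fence wait */
}

/* The CPU has no ordering against the hw dma engine, so wait first. */
static void memcpy_to_vram(struct mem_type_manager *man,
                           void *dst, const void *src, size_t n)
{
        if (man->sync && !man->sync->signaled)
                sync_object_wait(man->sync); /* wait for the manager fence */
        memcpy(dst, src, n); /* the region is now really free */
}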
>
> The second example is when you use the hw dma engine for both copies.
> 1) Schedule a copy from a vram region.
> 2) Put a corresponding sync object on the manager.
> 3) Free the vram region.
> 4) Region gets allocated.
> 5) Schedule a copy to vram. No need to wait for the fence, since the move
> _to_ vram will be carried out after the move _from_ vram.
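And step 5) of this second example, by contrast (same hypothetical names):

#include <stddef.h>

/* Hypothetical stand-ins, as above. */
struct sync_object;
struct mem_type_manager { struct sync_object *sync; };
struct vram_region;

/* Stand-in: queue a copy on the hw dma engine, return its sync object. */
static struct sync_object *schedule_hw_copy_to(struct vram_region *reg)
{
        (void)reg;   /* unused in this stub */
        return NULL; /* a real driver would emit commands and make a fence */
}

/* Both copies run on the same engine, which executes commands in order. */
static void move_to_vram_pipelined(struct mem_type_manager *man,
                                   struct vram_region *reg)
{
        /*
         * No wait on man->sync: the copy _to_ vram is queued behind the
         * copy _from_ vram on the same hw dma engine, so the ordering is
         * guaranteed by the engine itself.
         */
        (void)man;
        schedule_hw_copy_to(reg);
}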
>
> To make this work, the driver move() will have to return a sync object that
> we can hang on the manager, if the move is not carried out instantly.
> The driver will also need to provide a method to order sync objects (a
> schedule_barrier), so that when the "last" sync object is signaled, both
> sync objects are guaranteed to be signaled.
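If I understand this last part, the core would then do roughly the following
with the sync object returned by move() (my sketch; sync_object_order is a
made-up name for the schedule_barrier you mention):

/* Hypothetical stand-ins, as above. */
struct sync_object;
struct mem_type_manager { struct sync_object *sync; };

/*
 * Stand-in for the driver-provided ordering method (the schedule_barrier):
 * returns an object that signals only after both arguments have signaled.
 */
static struct sync_object *sync_object_order(struct sync_object *a,
                                             struct sync_object *b)
{
        (void)a; /* a real driver would emit a barrier so b implies a */
        return b;
}

/* Hang the sync object returned by the driver move() on the manager. */
static void manager_attach_move_sync(struct mem_type_manager *man,
                                     struct sync_object *move_sync)
{
        if (man->sync)
                move_sync = sync_object_order(man->sync, move_sync);
        man->sync = move_sync;
}

Is that the idea?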
>
>
> /Thomas