Re: Memory eviction in ttm

Thomas Hellström Fri, 14 Sep 2012 03:51:18 -0700

Hi Maarten!

Broadening the audience a bit..


On 9/14/12 9:12 AM, Maarten Lankhorst wrote:

On 13-09-12 23:00, Thomas Hellstrom wrote:

On 09/13/2012 07:13 PM, Maarten Lankhorst wrote:

Hey

On 13-09-12 18:41, Thomas Hellstrom wrote:

On 09/13/2012 05:19 PM, Maarten Lankhorst wrote:

Hey,

On 12-09-12 15:28, Thomas Hellstrom wrote:

On 09/12/2012 02:48 PM, Maarten Lankhorst wrote:

Hey Thomas,

I'm playing around with moving reservations from ttm to global, but how
ttm is handling reservations is getting in the way.  The code wants to remove
the bo from the lru list at the same time a reservation is made, but that
seems to be slightly too strict. It would really help me if that guarantee
is removed.

Hi, Maarten.

Removing that restriction is not really possible at the moment.
Also the memory accounting code depends on this, and may cause reservations
in the most awkward places. Since these reservations don't have a ticket
they may and will cause deadlocks. So in short the restriction is there
to avoid deadlocks caused by ticketless reservations.

I have finished the lockdep annotations now which seems to catch almost
all abuse I threw at it, so I'm feeling slightly more confident about moving
the locking order and reservations around.

Maarten, moving reservations in TTM out of the lru lock is incorrect as the
code is written now. If we want to move it out we need something for
ticketless reservations.
I've been thinking of having a global hash table of tickets with the task
struct pointer as the key, but even then, we'd need to be able to handle
EBUSY errors on every operation that might try to reserve a buffer.

The fact that lockdep doesn't complain isn't enough. There *will* be deadlock
use-cases when TTM is handed the right data-set.

Isn't there a way that a subsystem can register a callback to be performed to
remove stuff from LRU and to take a pre-reservation lock?

What if multiple subsystems need those? You will end up with a deadlock again.

I think it would be easier to change the code in ttm_bo.c to not assume the
first item on the lru list is really the least recently used, and assume the
first item that can be reserved without blocking IS the least recently used
instead.

So what would happen then is that we'd spin on the first item on the LRU list,
since when reserving we must release the LRU lock, and if reserving fails, we
thus need to restart LRU traversal. Typically after a schedule(). That's bad.
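
For illustration, the "first item that can be reserved without blocking"
proposal can be sketched in user-space C. This is only a sketch under stated
assumptions, not TTM code: struct bo, struct lru and every function name below
are hypothetical, a pthread mutex stands in for the reservation, and the
trylock happens while the LRU lock is still held, so a busy entry is skipped
rather than waited on.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not struct ttm_buffer_object. */
struct bo {
    pthread_mutex_t reserve;   /* stands in for the reservation */
    struct bo *prev, *next;    /* LRU links, protected by lru.lock */
    int on_lru;
};

struct lru {
    pthread_mutex_t lock;
    struct bo head;            /* sentinel; head.next is least recently used */
};

static void lru_init(struct lru *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->head.next = l->head.prev = &l->head;
}

static void bo_init(struct bo *b)
{
    pthread_mutex_init(&b->reserve, NULL);
    b->prev = b->next = NULL;
    b->on_lru = 0;
}

/* Append a buffer at the most-recently-used end of the list. */
static void lru_add_tail(struct lru *l, struct bo *b)
{
    pthread_mutex_lock(&l->lock);
    assert(!b->on_lru);
    b->prev = l->head.prev;
    b->next = &l->head;
    l->head.prev->next = b;
    l->head.prev = b;
    b->on_lru = 1;
    pthread_mutex_unlock(&l->lock);
}

/*
 * Walk the list from the least recently used end and take the first entry
 * whose reservation can be acquired without blocking. The entry is removed
 * from the list under the same LRU lock, so a reserved buffer is never on
 * the list. Returns NULL if every entry is currently reserved.
 */
static struct bo *lru_reserve_first_idle(struct lru *l)
{
    struct bo *b, *found = NULL;

    pthread_mutex_lock(&l->lock);
    for (b = l->head.next; b != &l->head; b = b->next) {
        if (pthread_mutex_trylock(&b->reserve) == 0) {
            b->prev->next = b->next;
            b->next->prev = b->prev;
            b->on_lru = 0;
            found = b;
            break;
        }
    }
    pthread_mutex_unlock(&l->lock);
    return found;
}
```

Note that this only covers the non-blocking case. A blocking reservation would
still have to drop the LRU lock first, which is where the restart described
above comes from.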

So let's take a step back and analyze why the LRU lock has become a problem.
From what I can tell, it's because you want to use a per-object lock when
reserving instead of a global reservation lock (that TTM could use as the
LRU lock). Is that correct? And in that case, in what situation do you
envision such a global lock being contended to the extent that it hurts
performance?

Lockdep WILL complain about trying to use multiple tickets, doing ticketed
and unticketed blocking reservations mixed, etc.

I want to remove the global fence_lock and make it a per buffer lock, with some
lockdep annotations it's perfectly legal to grab obj->fence_lock and
obj2->fence_lock if you have a reservation, but it should complain loudly
about trying to take 2 fence_locks at the same time without a reservation.

Yes, TTM was previously using per buffer fence locks, and that works fine from
a deadlock perspective, but it hurts performance. Fencing 200 buffers in a
command submission (google-earth for example) will mean 198 unnecessary locks,
each discarding the processor pipelines. Locking is a *slow* operation,
particularly on systems with many processors, and I don't think it's a good
idea to change that back, without analyzing the performance impact. There are
reasons people are writing stuff like RCU to avoid locking...
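
To make the cost argument concrete, here is a minimal user-space sketch that
counts lock acquisitions when attaching one fence to many buffers. It is an
illustration only: struct buf, struct fence and both functions are
hypothetical, and pthread mutexes stand in for the kernel spinlocks.

```c
#include <pthread.h>
#include <stddef.h>

struct fence { unsigned seqno; };

/* Hypothetical buffer with its own fence lock (the per-buffer scheme). */
struct buf {
    pthread_mutex_t fence_lock;
    struct fence *fence;
};

/* Single lock shared by all buffers (the global scheme). */
static pthread_mutex_t global_fence_lock = PTHREAD_MUTEX_INITIALIZER;

/* Attach f to n buffers under one global lock; returns locks taken. */
static size_t fence_all_global(struct buf *bufs, size_t n, struct fence *f)
{
    size_t i;

    pthread_mutex_lock(&global_fence_lock);
    for (i = 0; i < n; i++)
        bufs[i].fence = f;
    pthread_mutex_unlock(&global_fence_lock);
    return 1;
}

/* Attach f to n buffers, locking each buffer in turn; returns locks taken. */
static size_t fence_all_per_buffer(struct buf *bufs, size_t n, struct fence *f)
{
    size_t i, locks = 0;

    for (i = 0; i < n; i++) {
        pthread_mutex_lock(&bufs[i].fence_lock);
        bufs[i].fence = f;
        pthread_mutex_unlock(&bufs[i].fence_lock);
        locks++;
    }
    return locks;
}
```

With the global lock the number of acquisitions is constant per submission;
with per-buffer locks it grows with the number of buffers fenced.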

So why don't we simply use RCU for fence pointers and get rid of the fence
locking? :D
danvet originally suggested it as a joke but if you think about it, it would
make a lot of sense for this usecase.

I thought of that before, but the problem is you'd still need a spinlock to
change the buffer's fence pointer, even if reading it becomes quick.
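
The split described here (lock-free reads, serialized updates) can be sketched
in user space with C11 atomics. This is an analogy under stated assumptions,
not kernel RCU: struct buf and the buf_* helpers are hypothetical, a pthread
mutex stands in for the spinlock, and nothing here waits for readers to finish
before an old fence may be freed. In the kernel the corresponding primitives
would be rcu_dereference() and rcu_assign_pointer().

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct fence { unsigned seqno; };

/* Hypothetical buffer whose fence pointer is read without taking a lock. */
struct buf {
    pthread_mutex_t update_lock;      /* serializes writers only */
    _Atomic(struct fence *) fence;    /* published with release ordering */
};

static void buf_init(struct buf *b)
{
    pthread_mutex_init(&b->update_lock, NULL);
    atomic_init(&b->fence, NULL);
}

/* Reader side: a single acquire load, no lock taken. */
static struct fence *buf_get_fence(struct buf *b)
{
    return atomic_load_explicit(&b->fence, memory_order_acquire);
}

/*
 * Writer side: still needs mutual exclusion against other writers.
 * Returns the previous fence; the caller must not free it until no
 * reader can still be using it (the grace period is not modelled here).
 */
static struct fence *buf_set_fence(struct buf *b, struct fence *f)
{
    struct fence *old;

    pthread_mutex_lock(&b->update_lock);
    old = atomic_load_explicit(&b->fence, memory_order_relaxed);
    atomic_store_explicit(&b->fence, f, memory_order_release);
    pthread_mutex_unlock(&b->update_lock);
    return old;
}
```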

Actually, I changed lockdep annotations a bit to distinguish between the
cases where ttm_bo_wait is called without reservation, and ttm_bo_wait
is called with, as far as I can see there are only 2 places that do it without,
at least if I converted my git tree properly..

http://cgit.freedesktop.org/~mlankhorst/linux/log/?h=v10-wip

First one is nouveau_bo_vma_del, this can be fixed easily.
Second one is ttm_bo_cleanup_refs and ttm_bo_cleanup_refs_or_queue,
if reservation is done first before ttm_bo_wait, the fence_lock could be
dropped entirely by adding smp_mb() in reserve and unreserve, functionally
there would be no difference. So if you can verify my lockdep annotations are
correct in the most recent commit wrt what's using ttm_bo_wait without
reservation we could remove the fence_lock entirely.

~Maarten

Being able to wait for buffer idle or get the fence pointer without
reserving is a fundamental property of TTM. Reservation is a long-term
lock. The fence lock is a very short term lock. If I were to choose, I'd
rather accept per-object fence locks than removing this property, but
see below.

Likewise, to be able to guarantee that a reserved object is not on any
LRU list is also an important property. Removing that property will, in
addition to the spin wait we've already discussed, make understanding TTM
locking even more difficult, and I'd really like to avoid it.

If this were a real performance problem we were trying to solve it would
be easier to motivate changes in this area, but if it's just trying to
avoid a global reservation lock and a global fence lock that will rarely
if ever see any contention, I can't see the point. On the contrary,
having per-object locks will be very costly when reserving / fencing
many objects. As mentioned before, in the fence lock case it's been
tried and removed, so I'd like to know the reasoning behind introducing
it again, and in what situations you think the global locks will be
contended.


/Thomas






