Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-25 Thread Thomas Hellstrom
Jerome Glisse wrote: > On Thu, Jan 21, 2010 at 01:59:26PM +0100, Thomas Hellstrom wrote: > >> Jerome Glisse wrote: >> >>> On Thu, Jan 21, 2010 at 04:49:39AM +0100, Luca Barbieri wrote: >>> > We had to do a similar thing in the > Poulsbo driver and it turned out that we could

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-25 Thread Jerome Glisse
On Thu, Jan 21, 2010 at 04:14:39PM +0100, Luca Barbieri wrote: > I'm not sure I understand your proposal correctly. > It seems your proposoal is similar to mine, replacing the term "fence > nodes" with "ttm transactions", but I'm not sure if I understand it > correctly > > Here is some pseudocode

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-25 Thread Jerome Glisse
On Thu, Jan 21, 2010 at 01:59:26PM +0100, Thomas Hellstrom wrote: > Jerome Glisse wrote: > >On Thu, Jan 21, 2010 at 04:49:39AM +0100, Luca Barbieri wrote: > >>>We had to do a similar thing in the > >>>Poulsbo driver and it turned out that we could save a significant amount of > >>>CPU by using a de

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Francisco Jerez
Luca Barbieri writes: >> The card also keeps some internal FIFO caches, so it seems to me that >> getting that safe of races and efficient at the same time could be a bit >> non-trivial. > > Are they flushable? > It seems this *might* do the job: > if (!pfifo->cache_flush(dev)) > return; > pfifo-

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Luca Barbieri
> The card also keeps some internal FIFO caches, so it seems to me that > getting that safe of races and efficient at the same time could be a bit > non-trivial. Are they flushable? It seems this *might* do the job: if (!pfifo->cache_flush(dev)) return; pfifo->reassign(dev, false); pfifo->cache_fl

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Francisco Jerez
Luca Barbieri writes: >>> If not, it could possibly be hacked around by reading from a DMA >>> object at the address of the fence sequence number and then resizing >>> the DMA object so that addresses from a certain point on would trigger >>> a protection fault interrupt. >> >> I don't think you

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Maarten Maathuis
On Thu, Jan 21, 2010 at 4:56 PM, Luca Barbieri wrote: >> Doing it without software methods means you need to have a semaphore >> that "exists" in the cpu domain and therefore cannot be used again >> until the cpu is sure it's done. So that will probably require a >> rotating queue of several semap

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Luca Barbieri
> Doing it without software methods means you need to have a semaphore > that "exists" in the cpu domain and therefore cannot be used again > until the cpu is sure it's done. So that will probably require a > rotating queue of several semaphores and a fallback to cpu waiting. Why would it need a s

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Maarten Maathuis
On Thu, Jan 21, 2010 at 3:44 PM, Francisco Jerez wrote: > Luca Barbieri writes: > >>> Nvidia cards have a synchronization primitive that could be used to >>> synchronize several FIFOs in hardware (AKA semaphores, see [1] for an >>> example). >> >> Does this operate wholly on the GPU on all nVidia

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Luca Barbieri
>> If not, it could possibly be hacked around by reading from a DMA >> object at the address of the fence sequence number and then resizing >> the DMA object so that addresses from a certain point on would trigger >> a protection fault interrupt. > > I don't think you can safely modify a DMA object

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Luca Barbieri
I'm not sure I understand your proposal correctly. It seems your proposoal is similar to mine, replacing the term "fence nodes" with "ttm transactions", but I'm not sure if I understand it correctly Here is some pseudocode for a improved, simplified version of my proposal. It is modified so that t

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Francisco Jerez
Luca Barbieri writes: >> Nvidia cards have a synchronization primitive that could be used to >> synchronize several FIFOs in hardware (AKA semaphores, see [1] for an >> example). > > Does this operate wholly on the GPU on all nVidia cards? > > It seems that at least on some GPUs this will trigger

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Thomas Hellstrom
Luca Barbieri wrote: >> At a first glance: >> >> 1) We probably *will* need a delayed destroyed workqueue to avoid wasting >> memory that otherwise should be freed to the system. At the very least, the >> delayed delete process should optionally be run by a system shrinker. >> > You are right.

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Luca Barbieri
> Nvidia cards have a synchronization primitive that could be used to > synchronize several FIFOs in hardware (AKA semaphores, see [1] for an > example). Does this operate wholly on the GPU on all nVidia cards? It seems that at least on some GPUs this will trigger "software methods" that are basi

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Francisco Jerez
Luca Barbieri writes: >> At a first glance: >> >> 1) We probably *will* need a delayed destroyed workqueue to avoid wasting >> memory that otherwise should be freed to the system. At the very least, the >> delayed delete process should optionally be run by a system shrinker. > You are right. For

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Luca Barbieri
> At a first glance: > > 1) We probably *will* need a delayed destroyed workqueue to avoid wasting > memory that otherwise should be freed to the system. At the very least, the > delayed delete process should optionally be run by a system shrinker. You are right. For VRAM we don't care since we are

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Thomas Hellstrom
Jerome Glisse wrote: > On Thu, Jan 21, 2010 at 04:49:39AM +0100, Luca Barbieri wrote: > >>> We had to do a similar thing in the >>> Poulsbo driver and it turned out that we could save a significant amount of >>> CPU by using a delayed workqueue, collecting objects and destroying them >>> periodi

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Thomas Hellstrom
Luca Barbieri wrote: >> We had to do a similar thing in the >> Poulsbo driver and it turned out that we could save a significant amount of >> CPU by using a delayed workqueue, collecting objects and destroying them >> periodically. >> > Yes, indeed, we don't really care about a fence expiring

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-21 Thread Jerome Glisse
On Thu, Jan 21, 2010 at 04:49:39AM +0100, Luca Barbieri wrote: > > We had to do a similar thing in the > > Poulsbo driver and it turned out that we could save a significant amount of > > CPU by using a delayed workqueue, collecting objects and destroying them > > periodically. > Yes, indeed, we don

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
> We had to do a similar thing in the > Poulsbo driver and it turned out that we could save a significant amount of > CPU by using a delayed workqueue, collecting objects and destroying them > periodically. Yes, indeed, we don't really care about a fence expiring unless we want to use that buffer o

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Luca Barbieri wrote: > When designing this, we should also keep in mind that some drivers > (e.g. nouveau) have multiple FIFO channels, and thus we would like a > buffer to be referenced for reading by multiple channels at once (and > be destroyed only when all fences are expired, obviously). > Als

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Luca Barbieri wrote: >> Also note that the delayed delete list is not in fence order but in >> deletion-time order, which perhaps gives room for more optimizations. >> > You are right. > I think then that ttm_bo_delayed_delete may still need to be changed, > because it stops when ttm_bo_cleanu

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
When designing this, we should also keep in mind that some drivers (e.g. nouveau) have multiple FIFO channels, and thus we would like a buffer to be referenced for reading by multiple channels at once (and be destroyed only when all fences are expired, obviously). Also, hardware may support on-GPU

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
> Also note that the delayed delete list is not in fence order but in > deletion-time order, which perhaps gives room for more optimizations. You are right. I think then that ttm_bo_delayed_delete may still need to be changed, because it stops when ttm_bo_cleanup_refs returns -EBUSY, which happens

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Luca Barbieri wrote: > Yes it's fine. I sent your patch to Dave with an expanded commit > comment for merging. > > Here is a possible redesign of the mechanism inspired by this issue. > It seems that what we are racing against is buffer eviction, due to > delayed deletion buffers being still kept o

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Luca Barbieri
Yes it's fine. I sent your patch to Dave with an expanded commit comment for merging. Here is a possible redesign of the mechanism inspired by this issue. It seems that what we are racing against is buffer eviction, due to delayed deletion buffers being still kept on the LRU list. I'm wondering i

[Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete (v3, final)

2010-01-20 Thread Luca Barbieri
Resending this with Thomas Hellstrom's signoff for merging into 2.6.33 ttm_bo_delayed_delete has a race condition, because after we do: kref_put(&nentry->list_kref, ttm_bo_release_list); we are not holding the list lock and not holding any reference to objects, and thus every bo in the list can b

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Thomas Hellstrom wrote: Thomas Hellstrom wrote: Yes, it looks correct. Although it seems a little unintuitive to enter the loop with the spinlock held, and exit it with the spinlock not held. I've attached yet another patch to have that fixed. Could you take a look at whether it seems OK wi

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Thomas Hellstrom wrote: Yes, it looks correct. Although it seems a little unintuitive to enter the loop with the spinlock held, and exit it with the spinlock not held. I've attached yet another patch to have that fixed. Could you take a look at whether it seems OK with you and, in that case,

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-20 Thread Thomas Hellstrom
Luca Barbieri wrote: >> Would nentry=list_first_entry(&entry->ddestroy, ) work? >> > Yes, it seems a bit less intuitive, but if that's the accepted > practice, let's do that instead. > > >> Here nentry may have been removed from the list by another process, which >> would trigger the u

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-19 Thread Thomas Hellstrom
Luca, Good catch. Some comments inline: Luca Barbieri wrote: > + entry = list_first_entry(&bdev->ddestroy, > + struct ttm_buffer_object, ddestroy); > + kref_get(&entry->list_kref); > > - if (next != &bdev->ddestroy) { > - nentry = list_entry

Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-18 Thread Luca Barbieri
> Would  nentry=list_first_entry(&entry->ddestroy, ) work? Yes, it seems a bit less intuitive, but if that's the accepted practice, let's do that instead. > Here nentry may have been removed from the list by another process, which > would trigger the unnecessary call, mentioned above. You are

[Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

2010-01-18 Thread Luca Barbieri
ttm_bo_delayed_delete has a race condition, because after we do: kref_put(&nentry->list_kref, ttm_bo_release_list); we are not holding the list lock and not holding any reference to objects, and thus every bo in the list can be removed and freed at this point. However, we then use the next pointe