On Thu, 10 Jan 2013 18:03:00 +0100, Daniel Vetter <[email protected]> wrote:
> This partially reverts
>
> commit 6c085a728cf000ac1865d66f8c9b52935558b328
> Author: Chris Wilson <[email protected]>
> Date:   Mon Aug 20 11:40:46 2012 +0200
>
>     drm/i915: Track unbound pages
>
> Closer inspection of that patch revealed a bunch of unrelated changes
> in the shrinker:
> - The shrinker count is now in pages instead of objects.
> - For counting the shrinkable objects the old code only looked at the
>   inactive list, the new code looks at all bound objects (including
>   pinned ones). That is obviously in addition to the new unbound list.
> - The shrinker count is no longer scaled with
>   sysctl_vfs_cache_pressure. Note though that with the default tuning
>   value of vfs_cache_pressure = 100 this doesn't affect the shrinker
>   behaviour.
> - When actually shrinking objects, the old code first dropped
>   purgeable objects, then normal (inactive) objects. Only then did it,
>   in a last-ditch effort, idle the gpu and evict everything. The new
>   code omits the intermediate step of evicting normal inactive
>   objects.
>
> Save for the first change, which seems benign, and the shrinker count
> scaling, which is a bit of a different story, the end result of all
> these changes is that the shrinker is _much_ more likely to fall back
> to the last-ditch resort of idling the gpu and evicting everything.
> The old code could only do that if something else evicted lots of
> objects meanwhile (since without any other changes the nr_to_scan will
> be smaller than the object count).
>
> Reverting the vfs_cache_pressure behaviour itself is a bit bogus: Only
> dentry/inode object caches should scale their shrinker counts with
> vfs_cache_pressure. Originally I had that change reverted, too. But
> Chris Wilson insisted that it's too bogus and shouldn't again see the
> light of day.
>
> Hence revert all these other changes and restore the old shrinker
> behaviour, with the minor adjustment that we now first scan the
> unbound list, then the inactive list for each object category
> (purgeable or normal).
>
> A similar patch has been tested by a few people affected by the gen4/5
> hangs which started to appear in 3.7, which some people bisected to
> the "drm/i915: Track unbound pages" commit. But just disabling the
> unbound logic alone didn't change things at all.
>
> Note that this patch doesn't fix the referenced bugs, it only hides
> the underlying bug(s) well enough to restore pre-3.7 behaviour. The
> key to achieving that is to massively reduce the likelihood of going
> into a full gpu stall and evicting everything.
>
> v2: Reword commit message a bit, taking Chris Wilson's comment into
> account.
>
> v3: On Chris Wilson's insistence, do not reinstate the rather bogus
> vfs_cache_pressure change.
>
> Tested-by: Greg KH <[email protected]>
> Tested-by: Dave Kleikamp <[email protected]>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=55984
> References: https://bugs.freedesktop.org/show_bug.cgi?id=57122
> References: https://bugs.freedesktop.org/show_bug.cgi?id=56916
> References: https://bugs.freedesktop.org/show_bug.cgi?id=57136
> Cc: Chris Wilson <[email protected]>
> Cc: [email protected]
> Signed-off-by: Daniel Vetter <[email protected]>

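
For reference, the scan order that the quoted commit message restores can be modelled in a few lines of userspace C. This is only a rough sketch under stated assumptions, not the i915 code: struct obj, scan_list() and shrink() are hypothetical names, the busy flag stands in for active or pinned objects, and the target is counted in pages purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GEM buffer object; not the real i915 struct. */
struct obj {
	size_t pages;    /* backing pages still held by the object */
	bool purgeable;  /* contents may be discarded without writeback */
	bool bound;      /* false: on the unbound list, true: on the inactive list */
	bool busy;       /* active or pinned: only reclaimable after a full stall */
};

/* Free pages from idle objects matching (purgeable, bound) until target is met. */
static size_t scan_list(struct obj *objs, size_t n, bool purgeable, bool bound,
			size_t target)
{
	size_t freed = 0;

	for (size_t i = 0; i < n && freed < target; i++) {
		struct obj *o = &objs[i];

		if (o->busy || o->pages == 0 ||
		    o->purgeable != purgeable || o->bound != bound)
			continue;
		freed += o->pages;
		o->pages = 0;
	}
	return freed;
}

/*
 * Purgeable objects first, then normal ones; within each category the
 * unbound list is scanned before the inactive list. Only when that still
 * falls short do we take the last-ditch path, modelled here by setting
 * *stalled and dropping every remaining page.
 */
static size_t shrink(struct obj *objs, size_t n, size_t target, bool *stalled)
{
	size_t freed = 0;

	*stalled = false;
	for (int purgeable = 1; purgeable >= 0 && freed < target; purgeable--)
		for (int bound = 0; bound <= 1 && freed < target; bound++)
			freed += scan_list(objs, n, purgeable != 0, bound != 0,
					   target - freed);

	if (freed < target) {
		*stalled = true; /* stands in for "idle the gpu, evict everything" */
		for (size_t i = 0; i < n; i++) {
			freed += objs[i].pages;
			objs[i].pages = 0;
			objs[i].busy = false;
		}
	}
	return freed;
}
```

In this model a small request is satisfied entirely from purgeable objects, and the stall path is only reached once both the purgeable and the normal idle objects are exhausted, which is the property the patch is trying to get back.
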
Acked-by: Chris Wilson <[email protected]>

I just hope the clue bat descends soonest before we find another way of
triggering the spurious hangs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
