Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. That hack goes back to the days when the locking around our error handler was somewhere between nonexistent and totally broken, nowadays we keep things from live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll whip up a patch to rip this out. I'll also check that our testsuite properly exercises this path (needs a bit of work on a quick look for better coverage). The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) Ah the case where a mmap'd address is passed to the execbuf ioctl? :P Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve held. Actually, from looking at the mm code, it seems OK to do the following: if (!bo_tryreserve()) { up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks. bo_reserve(); // Wait for the BO to become available (interruptible) bo_unreserve(); // Where is bo_wait_unreserved() when we need it, Maarten :P return VM_FAULT_RETRY; // Go ahead and retry the VMA walk, after regrabbing } Somebody conveniently added a VM_FAULT_RETRY, but for a different purpose. If possible, I suggest to take this route for now to avoid the mess of changing locking order in all TTM drivers, with all give-up-locking slowpaths that comes with it. IIRC it took some time for i915 to get that right, and completely get rid of all lockdep warnings. This will keep the official locking order bo::reserve mmap_sem /Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 57919] Visual glitches in unity with Radeon HD 7600M
https://bugs.freedesktop.org/show_bug.cgi?id=57919 --- Comment #21 from Thilo Cestonaro --- Latest Saucy. Still no luck :( Kernel: 3.11.0-7-generic Greetings Thilo -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1
https://bugs.freedesktop.org/show_bug.cgi?id=68235 --- Comment #25 from Alexandre Demers --- (In reply to comment #22) > (In reply to comment #20) > > Ok, if I apply the whole suggested patch but the following, it hangs: > > @@ -4152,14 +4152,14 @@ int ni_dpm_init(struct radeon_device *rdev) > > } > > ni_pi->mclk_rtt_mode_threshold = eg_pi->mclk_edc_wr_enable_threshold; > > > > - pi->voltage_control = > > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, > > 0); > > + pi->voltage_control = false; > > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, > > 0); > > > > - pi->mvdd_control = > > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, > > 0); > > + pi->mvdd_control = false; > > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, > > 0); > > > > - eg_pi->vddci_control = > > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, > > 0); > > + eg_pi->vddci_control = false; > > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, > > 0); > > > > rv770_get_engine_memory_ss(rdev); > > So does just applying this portion of the patch by itself fix the hang? The only way I don't have a "ni_upload_sw_state failed" is by letting pi->voltage_control = radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 0); However, I inevitably end up with a hang either at login or when my session is loading (however, going in a terminal before it hangs prevents any hang from happening as long as I stay in terminal). If I patch that part of code, I always have the "ni_upload_sw_state failed" error, thus not hanging but preventing any dpm. I can patch everything else or nothing at all (I tried different combinations) and they don't seem to change a thing about the hang. -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On 09/12/2013 11:50 PM, Maarten Lankhorst wrote: Op 12-09-13 18:44, Thomas Hellstrom schreef: On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. That hack goes back to the days when the locking around our error handler was somewhere between nonexistent and totally broken, nowadays we keep things from live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll whip up a patch to rip this out. I'll also check that our testsuite properly exercises this path (needs a bit of work on a quick look for better coverage). The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) Ah the case where a mmap'd address is passed to the execbuf ioctl? :P Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve held. CONFIG_PROVE_LOCKING=y and hard grab that reserve lock within the fault handler, done.. lockdep will spit it out for you :p ~Maarten Given that all copy_to_user / copy_from_user paths are actually hit during testing, right? /Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Intel-gfx] [PATCH 1/2] drm/i915: kill set_need_resched
hmm, looks like I cargo-cult'd the same into msm. I guess in i915 (and ttm) case, the issue arises due to need for CPU access to buffer via GTT? In which case I should be safe to drop the set_need_resched() as well? (Since CPU always has direct access to the pages.) Or am I missing something about the original issue that necessitated set_need_resched()? BR, -R On Thu, Sep 12, 2013 at 11:57 AM, Daniel Vetter wrote: > This is just a remnant from the old days when our reset handling was > horribly racy, suffered from terribly locking issues and often happily > live-locked. Those days are now gone so we can drop the hacks and just > rip the reschedule-point out. > > Reported-by: Peter Zijlstra > Cc: Peter Zijlstra > Cc: Linux Kernel Mailing List > Signed-off-by: Daniel Vetter > --- > drivers/gpu/drm/i915/i915_gem.c | 11 --- > 1 file changed, 4 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index d80f33d..3871060 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -1389,14 +1389,11 @@ out: > if (i915_terminally_wedged(&dev_priv->gpu_error)) > return VM_FAULT_SIGBUS; > case -EAGAIN: > - /* Give the error handler a chance to run and move the > -* objects off the GPU active list. Next time we service the > -* fault, we should be able to transition the page into the > -* GTT without touching the GPU (and so avoid further > -* EIO/EGAIN). If the GPU is wedged, then there is no issue > -* with coherency, just lost writes. > + /* > +* EAGAIN means the gpu is hung and we'll wait for the error > +* handler to reset everything when re-faulting in > +* i915_mutex_lock_interruptible. > */ > - set_need_resched(); > case 0: > case -ERESTARTSYS: > case -EINTR: > -- > 1.8.4.rc3 > > ___ > Intel-gfx mailing list > intel-...@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 69081] [radeonsi] Incorrect rendering of textures
https://bugs.freedesktop.org/show_bug.cgi?id=69081 José Suárez changed: What|Removed |Added Summary|Incorrect rendering of |[radeonsi] Incorrect |textures|rendering of textures -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra wrote: >> The one in ttm is just bonghits to shut up lockdep: ttm can recurse >> into it's own pagefault handler and then deadlock, the trylock just >> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some >> fun userspace did and now have testcases for them. The right solution >> to fix this is to use copy_to|from_user_atomic in ttm everywhere it >> holds locks and have slowpaths which drops locks, copies stuff into a >> temp allocation and then continues. At least that's how we've fixed >> all those inversions in i915-gem. I'm not volunteering to fix this ;-) > > Yikes.. so how common is it? If I simply rip the set_need_resched() out > it will 'spin' on the fault a little longer until a 'natural' preemption > point -- if such a thing is every going to happen. It's a case of "our userspace doesn't do this", so as long as you're not evil and frob the drm device nodes of ttm drivers directly the deadlock will never happen. No idea how much contention actually happens on e.g. shared buffer objects - in i915 we have just one lock and so suffer quite a bit more from contention. So no idea how much removing the yield would hurt. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 1/2] drm/i915: kill set_need_resched
This is just a remnant from the old days when our reset handling was horribly racy, suffered from terribly locking issues and often happily live-locked. Those days are now gone so we can drop the hacks and just rip the reschedule-point out. Reported-by: Peter Zijlstra Cc: Peter Zijlstra Cc: Linux Kernel Mailing List Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_gem.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d80f33d..3871060 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1389,14 +1389,11 @@ out: if (i915_terminally_wedged(&dev_priv->gpu_error)) return VM_FAULT_SIGBUS; case -EAGAIN: - /* Give the error handler a chance to run and move the -* objects off the GPU active list. Next time we service the -* fault, we should be able to transition the page into the -* GTT without touching the GPU (and so avoid further -* EIO/EGAIN). If the GPU is wedged, then there is no issue -* with coherency, just lost writes. + /* +* EAGAIN means the gpu is hung and we'll wait for the error +* handler to reset everything when re-faulting in +* i915_mutex_lock_interruptible. */ - set_need_resched(); case 0: case -ERESTARTSYS: case -EINTR: -- 1.8.4.rc3 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 06:22:10PM +0200, Peter Zijlstra wrote: > On Thu, Sep 12, 2013 at 05:58:49PM +0200, Daniel Vetter wrote: > > On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra > > wrote: > > >> The one in ttm is just bonghits to shut up lockdep: ttm can recurse > > >> into it's own pagefault handler and then deadlock, the trylock just > > >> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some > > >> fun userspace did and now have testcases for them. The right solution > > >> to fix this is to use copy_to|from_user_atomic in ttm everywhere it > > >> holds locks and have slowpaths which drops locks, copies stuff into a > > >> temp allocation and then continues. At least that's how we've fixed > > >> all those inversions in i915-gem. I'm not volunteering to fix this ;-) > > > > > > Yikes.. so how common is it? If I simply rip the set_need_resched() out > > > it will 'spin' on the fault a little longer until a 'natural' preemption > > > point -- if such a thing is every going to happen. > > > > It's a case of "our userspace doesn't do this", so as long as you're > > not evil and frob the drm device nodes of ttm drivers directly the > > deadlock will never happen. No idea how much contention actually > > happens on e.g. shared buffer objects - in i915 we have just one lock > > and so suffer quite a bit more from contention. So no idea how much > > removing the yield would hurt. > > If 'sane' userspace is never supposed to do this, then only insane > userspace is going to hurt from this and that's a GOOD (tm) thing, > right? ;-) Not quite, as it would be possible for the evil userspace to trigger a GPU hang that would cause the sane userspace to spin indefinitely waiting for the error recovery to kick in. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 05:11:25PM +0200, Maarten Lankhorst wrote: > Op 12-09-13 17:06, Peter Zijlstra schreef: > > Hi Dave, > > > > So I'm poking around the preemption code and stumbled upon: > > > > drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); > > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); > > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); > > drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); > > > > All these sites basically do: > > > > while (!trylock()) > > yield(); > > > > which is a horrible and broken locking pattern. > > > > Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 > > task that preempted the lock holder at FIFOn. > > > > Secondly the implementation is worse than usual by abusing > > VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault > > doesn't retry, but you're using it as a get out of fault path. And > > you're using set_need_resched() which is not something a driver should > > _ever_ touch. > > > > Now I'm going to take away set_need_resched() -- and while you can > > 'reimplement' it using set_thread_flag() you're not going to do that > > because it will be broken due to changes to the preempt code. > > > > So please as to fix ASAP and don't allow anybody to trick you into > > merging silly things like that again ;-) > > > Agreed, but this is a case of locking inversion. How do you propose to get > around that? me? No idea, I've never looked at the actual locking in DRM. Someone who's familiar with that code would have to tackle that. I just spotted the fail because I was going to remove set_need_resched() and had a WTF moment when I tripped over this stuff. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
Hi Dave, So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
Op 12-09-13 17:06, Peter Zijlstra schreef: > Hi Dave, > > So I'm poking around the preemption code and stumbled upon: > > drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); > drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); > > All these sites basically do: > > while (!trylock()) > yield(); > > which is a horrible and broken locking pattern. > > Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 > task that preempted the lock holder at FIFOn. > > Secondly the implementation is worse than usual by abusing > VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault > doesn't retry, but you're using it as a get out of fault path. And > you're using set_need_resched() which is not something a driver should > _ever_ touch. > > Now I'm going to take away set_need_resched() -- and while you can > 'reimplement' it using set_thread_flag() you're not going to do that > because it will be broken due to changes to the preempt code. > > So please as to fix ASAP and don't allow anybody to trick you into > merging silly things like that again ;-) > Agreed, but this is a case of locking inversion. How do you propose to get around that? ~Maarten ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 6:44 PM, Thomas Hellstrom wrote: > > I think a possible fix would be if fault() were allowed to return an error > and drop the mmap_sem() before returning. > > Otherwise we need to track down all copy_to_user / copy_from_user which > happen with bo::reserve held. For maximal evilness submit the relocation list (or whatever data execbuf slurps in with copy_from_user while holding bo::reserve) of a bo in the execbuf list. At least that's the testcase we have for drm/i915. Then make sure that the execbuf wants the bo somewhere it can't be mmaped from userspace, so needs to be moved both in the fault handler and then back for the execbuf to continue ;-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra wrote: > If 'sane' userspace is never supposed to do this, then only insane > userspace is going to hurt from this and that's a GOOD (tm) thing, > right? ;-) Afaik sane userspace doesn't hit the _deadlock_ (or lifelock if we have the set_need_resched in there). drm/i915 is a bit different since we have just one lock, and so the same design would actually deadlock even for sane userspace. But hitting contention there and yielding is somewhat expected. Obviously shouldn't happen too often since it'll hurt performance, with either blocking or the yield spinning loop. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Fwd: Intel/Radeon muxless laptop issues
Hello, I am writing to report two issues I'm having with a hybrid graphics laptop. Namely, Sony Vaio SA3S9R with Intel HD3000 and Radeon HD6630M First one looks like a regression introduced by the commit named "gpu/vga_switcheroo: add driver control power feature. (v3)". After I echo the "OFF" command to the switch, the DGPU turns off. However, after that the vga_switcheroo disappears. Reinserting the radeon module makes the kernel print the stack trace. I have uploaded the dmesg log to http://paste.debian.net/38539/ . Another issue I would like to bring up is the handling of suspend-resume cycle on muxless laptops. On most intel/radeon laptops, the DGPU turns on after a resume without an ACPI notification (which sound like a bug). Switcheroo is still thinking it's off, and it keeps draining its extra 10W of power. Now, if we do a "echo OFF", switcheroo does nothing. If we do "echo ON", it fails and gets stuck for some seconds (because it tries to power-up a powered-up card messing the initialization routine and then a GPU hang happens). The hacky solution is to turn it on before suspend and turn it off after via a suspend hook script, but this has several limitations: 1. Most distros don't ship the script by default leaving the inexperienced users with hot laptops 2. Doesn't work well when someone is really using the DGPU 3. I have no idea, but maybe turning it on makes it drain more power in suspend (well, since BIOSes tend to be buggy, who knows what happens next). My hacky solution was to forcibly turn off the DGPU in the switcheroo driver at resume if it was unused, ignoring its state inside the switcheroo struct. I have already made a proof-of-concept patch back in 2012, but somehow that received no attention. http://lkml.indiana.edu/hypermail/linux/kernel/1204.3/02530.html Since fixing ACPI for every broken laptop or shipping a suspend hook is a no-go, we need some kind of a workaround. I don't really know how it should be done, via DMI probing or whatever, but I would like someone who has more knowledge to look into that. Well, I guess, a solution would be to forcibly enable my hack for radeon/intel combos and then blacklist the laptops for which it breaks the resume cycle via DMI (i.e., I think that since most laptops behave this way, we should make the powersaving behavior default one). -- Regards, Alexander ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1
https://bugs.freedesktop.org/show_bug.cgi?id=68235 --- Comment #24 from Alexandre Demers --- (In reply to comment #22) > (In reply to comment #20) > > Ok, if I apply the whole suggested patch but the following, it hangs: > > @@ -4152,14 +4152,14 @@ int ni_dpm_init(struct radeon_device *rdev) > > } > > ni_pi->mclk_rtt_mode_threshold = eg_pi->mclk_edc_wr_enable_threshold; > > > > - pi->voltage_control = > > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, > > 0); > > + pi->voltage_control = false; > > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, > > 0); > > > > - pi->mvdd_control = > > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, > > 0); > > + pi->mvdd_control = false; > > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, > > 0); > > > > - eg_pi->vddci_control = > > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, > > 0); > > + eg_pi->vddci_control = false; > > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, > > 0); > > > > rv770_get_engine_memory_ss(rdev); > > So does just applying this portion of the patch by itself fix the hang? Applying just this returns an error when booting: ni_upload_sw_state failed, but obviously the system doesn't hang after that (though it can't change its performance state) -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: Op 12-09-13 17:36, Daniel Vetter schreef: On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. That hack goes back to the days when the locking around our error handler was somewhere between nonexistent and totally broken, nowadays we keep things from live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll whip up a patch to rip this out. I'll also check that our testsuite properly exercises this path (needs a bit of work on a quick look for better coverage). The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) Ah the case where a mmap'd address is passed to the execbuf ioctl? :P Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve held. /Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On 09/12/2013 10:39 PM, Thomas Gleixner wrote: On Thu, 12 Sep 2013, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner wrote: I think for ttm drivers it's just execbuf being exploitable. But on drm/i915 we've had the same issue with the pwrite/pread ioctls, so a simple glBufferData(glMap) kind of recursion from gl clients blew the kernel to pieces ... And the only answer you folks came up with is set_need_resched() and yield()? Oh well The yield was for a different lifelock, and that one is also fixed by now. The fault handler deadlock was fixed in the usual "drop locks and jump into slowpath" fasion, at least in drm/i915. So we can remove that whole yield/set_need_resched() mess completely ? Thanks, tglx No. The while(trylock) is there to address a potential locking inversion deadlock. If the trylock fails, the code returns to user-space which retries the fault. This code needs to stay until we can come up with either a way to drop the mmap_sem and sleep before returning to user-space, or a bunch of code is fixed with a different locking order. The set_need_resched() can (and should according to Peter) go. /Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
Op 12-09-13 17:36, Daniel Vetter schreef: > On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra wrote: >> So I'm poking around the preemption code and stumbled upon: >> >> drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); >> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); >> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); >> drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); >> >> All these sites basically do: >> >> while (!trylock()) >> yield(); >> >> which is a horrible and broken locking pattern. >> >> Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 >> task that preempted the lock holder at FIFOn. >> >> Secondly the implementation is worse than usual by abusing >> VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault >> doesn't retry, but you're using it as a get out of fault path. And >> you're using set_need_resched() which is not something a driver should >> _ever_ touch. >> >> Now I'm going to take away set_need_resched() -- and while you can >> 'reimplement' it using set_thread_flag() you're not going to do that >> because it will be broken due to changes to the preempt code. >> >> So please as to fix ASAP and don't allow anybody to trick you into >> merging silly things like that again ;-) > The set_need_resched in i915_gem.c:i915_gem_fault can actually be > removed. It was there to give the error handler a chance to sneak in > and reset the hw/sw tracking when the gpu is dead. That hack goes back > to the days when the locking around our error handler was somewhere > between nonexistent and totally broken, nowadays we keep things from > live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll > whip up a patch to rip this out. I'll also check that our testsuite > properly exercises this path (needs a bit of work on a quick look for > better coverage). > > The one in ttm is just bonghits to shut up lockdep: ttm can recurse > into it's own pagefault handler and then deadlock, the trylock just > keeps lockdep quiet. We've had that bug arise in drm/i915 due to some > fun userspace did and now have testcases for them. The right solution > to fix this is to use copy_to|from_user_atomic in ttm everywhere it > holds locks and have slowpaths which drops locks, copies stuff into a > temp allocation and then continues. At least that's how we've fixed > all those inversions in i915-gem. I'm not volunteering to fix this ;-) Ah the case where a mmap'd address is passed to the execbuf ioctl? :P Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. > The one in udl just looks like copypasta from i915, without any > justification (at least I don't see any) for why it's there. Probably > can die, too, since there isn't any gpu to reset on usb display-link > devices ... > > Cheers, Daniel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
Op 12-09-13 18:44, Thomas Hellstrom schreef: > On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: >> Op 12-09-13 17:36, Daniel Vetter schreef: >>> On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra >>> wrote: So I'm poking around the preemption code and stumbled upon: drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); All these sites basically do: while (!trylock()) yield(); which is a horrible and broken locking pattern. Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 task that preempted the lock holder at FIFOn. Secondly the implementation is worse than usual by abusing VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault doesn't retry, but you're using it as a get out of fault path. And you're using set_need_resched() which is not something a driver should _ever_ touch. Now I'm going to take away set_need_resched() -- and while you can 'reimplement' it using set_thread_flag() you're not going to do that because it will be broken due to changes to the preempt code. So please as to fix ASAP and don't allow anybody to trick you into merging silly things like that again ;-) >>> The set_need_resched in i915_gem.c:i915_gem_fault can actually be >>> removed. It was there to give the error handler a chance to sneak in >>> and reset the hw/sw tracking when the gpu is dead. That hack goes back >>> to the days when the locking around our error handler was somewhere >>> between nonexistent and totally broken, nowadays we keep things from >>> live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll >>> whip up a patch to rip this out. I'll also check that our testsuite >>> properly exercises this path (needs a bit of work on a quick look for >>> better coverage). >>> >>> The one in ttm is just bonghits to shut up lockdep: ttm can recurse >>> into it's own pagefault handler and then deadlock, the trylock just >>> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some >>> fun userspace did and now have testcases for them. The right solution >>> to fix this is to use copy_to|from_user_atomic in ttm everywhere it >>> holds locks and have slowpaths which drops locks, copies stuff into a >>> temp allocation and then continues. At least that's how we've fixed >>> all those inversions in i915-gem. I'm not volunteering to fix this ;-) >> Ah the case where a mmap'd address is passed to the execbuf ioctl? :P >> >> Fine I'll look into it a bit, hopefully before tuesday. Else it might take a >> bit longer since I'll be on my way to plumbers.. > > I think a possible fix would be if fault() were allowed to return an error > and drop the mmap_sem() before returning. > > Otherwise we need to track down all copy_to_user / copy_from_user which > happen with bo::reserve held. CONFIG_PROVE_LOCKING=y and hard grab that reserve lock within the fault handler, done.. lockdep will spit it out for you :p ~Maarten ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra wrote: > > So I'm poking around the preemption code and stumbled upon: > > drivers/gpu/drm/i915/i915_gem.c:set_need_resched(); > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched(); > drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); > > All these sites basically do: > > while (!trylock()) > yield(); > > which is a horrible and broken locking pattern. > > Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 > task that preempted the lock holder at FIFOn. > > Secondly the implementation is worse than usual by abusing > VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault > doesn't retry, but you're using it as a get out of fault path. And > you're using set_need_resched() which is not something a driver should > _ever_ touch. > > Now I'm going to take away set_need_resched() -- and while you can > 'reimplement' it using set_thread_flag() you're not going to do that > because it will be broken due to changes to the preempt code. > > So please as to fix ASAP and don't allow anybody to trick you into > merging silly things like that again ;-) The set_need_resched in i915_gem.c:i915_gem_fault can actually be removed. It was there to give the error handler a chance to sneak in and reset the hw/sw tracking when the gpu is dead. That hack goes back to the days when the locking around our error handler was somewhere between nonexistent and totally broken, nowadays we keep things from live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll whip up a patch to rip this out. I'll also check that our testsuite properly exercises this path (needs a bit of work on a quick look for better coverage). The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) The one in udl just looks like copypasta from i915, without any justification (at least I don't see any) for why it's there. Probably can die, too, since there isn't any gpu to reset on usb display-link devices ... Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 68451] Texture flicker in native Dota2 in mesa 9.2.0rc1
https://bugs.freedesktop.org/show_bug.cgi?id=68451 --- Comment #18 from Peter Kraus --- (In reply to comment #17) > Hi, > I am also having this behavior in Dota2. Peter, are you sure > 7948ed1250cae78ae1b22dbce4ab23aceacc6159 is the first non-working? Can you > try one earlier commit, 1b40398d024d2ac5c8e8b78d0f4941e2a007de2c, and > confirm that it works? Because, I am seeing the corruption on even earlier > (haven't figured out yet how much earlier). > > Maybe I am doing something wrong, I am compiling 32-bit mesa on otherwise > 64-bit system, but I can see my own printf()'s I've put in to make sure the > correct library is loaded when running Dota2. Hello, I have just checked, commit 1b40398d024d2ac5c8e8b78d0f4941e2a007de2c (+ the 5 patches for Bison 3, above) still works fine. Current HEAD is still broken. Peter -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 2/2] drm/udl: rip out set_need_resched
This very much looks like copypasta from drm/i915's fault handler. It was used there to duct-tape over issues around gpu reset handling. Since that can't ever happen for udl and there's seemingly no other reason for this just drop it. Reported-by: Peter Zijlstra Cc: Peter Zijlstra Cc: Linux Kernel Mailing List Cc: Dave Airlie Signed-off-by: Daniel Vetter --- drivers/gpu/drm/udl/udl_gem.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c index 8dbe9d0..8bf6461 100644 --- a/drivers/gpu/drm/udl/udl_gem.c +++ b/drivers/gpu/drm/udl/udl_gem.c @@ -97,7 +97,6 @@ int udl_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) ret = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page); switch (ret) { case -EAGAIN: - set_need_resched(); case 0: case -ERESTARTSYS: return VM_FAULT_NOPAGE; -- 1.8.4.rc3 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 9:58 PM, Thomas Gleixner wrote: > >> On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra wrote: >> > If 'sane' userspace is never supposed to do this, then only insane >> > userspace is going to hurt from this and that's a GOOD (tm) thing, >> > right? ;-) >> >> Afaik sane userspace doesn't hit the _deadlock_ (or lifelock if we >> have the set_need_resched in there). drm/i915 is a bit different since >> we have just one lock, and so the same design would actually deadlock >> even for sane userspace. But hitting contention there and yielding is >> somewhat expected. Obviously shouldn't happen too often since it'll >> hurt performance, with either blocking or the yield spinning loop. > > So this is actually a non priviledged DoS interface, right? I think for ttm drivers it's just execbuf being exploitable. But on drm/i915 we've had the same issue with the pwrite/pread ioctls, so a simple glBufferData(glMap) kind of recursion from gl clients blew the kernel to pieces ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner wrote: >> I think for ttm drivers it's just execbuf being exploitable. But on >> drm/i915 we've >> had the same issue with the pwrite/pread ioctls, so a simple >> glBufferData(glMap) kind of recursion from gl clients blew the kernel >> to pieces ... > > And the only answer you folks came up with is set_need_resched() and > yield()? Oh well The yield was for a different lifelock, and that one is also fixed by now. The fault handler deadlock was fixed in the usual "drop locks and jump into slowpath" fasion, at least in drm/i915. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/2] drm/i915: kill set_need_resched
On Thu, Sep 12, 2013 at 05:59:51PM +0200, Peter Zijlstra wrote: > On Thu, Sep 12, 2013 at 05:57:28PM +0200, Daniel Vetter wrote: > > This is just a remnant from the old days when our reset handling was > > horribly racy, suffered from terribly locking issues and often happily > > live-locked. Those days are now gone so we can drop the hacks and just > > rip the reschedule-point out. > > > > Reported-by: Peter Zijlstra > > Thanks! > > Acked-by: Peter Zijlstra Merged to drm-intel-fixes with Chris' irc-review that we indeed don't need this any more to duct-tape over bugs. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/2] drm/i915: kill set_need_resched
On Thu, Sep 12, 2013 at 05:57:28PM +0200, Daniel Vetter wrote: > This is just a remnant from the old days when our reset handling was > horribly racy, suffered from terribly locking issues and often happily > live-locked. Those days are now gone so we can drop the hacks and just > rip the reschedule-point out. > > Reported-by: Peter Zijlstra Thanks! Acked-by: Peter Zijlstra ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] Allow kmscon to be cross-compiled
Hi On Mon, Sep 9, 2013 at 1:43 PM, Thierry Reding wrote: > On Mon, Sep 09, 2013 at 12:48:08PM +0200, David Herrmann wrote: >> Hi >> >> On Mon, Sep 9, 2013 at 12:27 PM, Thierry Reding >> wrote: >> > The genshader and genunifont utilities are run during the build process >> > to generate source files. In order for that to work when cross-compiling >> > the files need to be built using the native compiler instead of the >> > cross-compiler. >> > >> > Add the AX_PROG_CC_FOR_BUILD m4 macro which defines various *_FOR_BUILD >> > variables that are the native equivalents of CC, CFLAGS, LDFLAGS, etc. >> > Override CC, CFLAGS and LDFLAGS for genshader and genunifont and their >> > object files so that they will be built natively and can be executed >> > during the build. >> >> Thanks a lot! Looks all good, few comments below. I think I will apply >> it as is, anyway. New bugfix-release is planned for this week, too. >> >> > Signed-off-by: Thierry Reding >> > --- >> > Makefile.am| 16 -- >> > configure.ac | 3 +- >> > m4/ax_prog_cc_for_build.m4 | 125 >> > + >> > 3 files changed, 139 insertions(+), 5 deletions(-) >> > create mode 100644 m4/ax_prog_cc_for_build.m4 >> > >> > diff --git a/Makefile.am b/Makefile.am >> > index 7019290..f1b4435 100644 >> > --- a/Makefile.am >> > +++ b/Makefile.am >> > @@ -445,8 +445,12 @@ EXTRA_DIST += $(SHADERS) >> > CLEANFILES += src/static_shaders.c >> > genshader_SOURCES = src/genshader.c >> > >> > -src/static_shaders.c: $(SHADERS) genshader$(EXEEXT) >> > - $(AM_V_GEN)./genshader$(EXEEXT) src/static_shaders.c $(SHADERS) >> > +src/static_shaders.c: $(SHADERS) genshader$(BUILD_EXEEXT) >> > + $(AM_V_GEN)./genshader$(BUILD_EXEEXT) src/static_shaders.c >> > $(SHADERS) >> > + >> > +genshader$(BUILD_EXEEXT) $(genshader_OBJECTS): CC = $(CC_FOR_BUILD) >> > +genshader$(BUILD_EXEEXT) $(genshader_OBJECTS): CFLAGS = >> > $(CFLAGS_FOR_BUILD) >> > +genshader$(BUILD_EXEEXT): LDFLAGS = $(LDFLAGS_FOR_BUILD) >> >> Just wondering, isn't this going to break if $BUILD_EXEEXT != $EXEEXT >> I mean, noinst_PROGRAMS generates build-rules for $EXEEXT, not for >> $BUILD_EXEEXT, so a dependency on "genshader$(BUILD_EXEEXT)" won't do >> anything if it differs from $EXEEXT. But maybe I am just missing >> something and automake creates rules for both? > > That's a good point. And yes, I think it would break if both extensions > differed. On the other hand I have no idea on how to make automake > generate rules for $(BUILD_EXEEXT). I don't think it can. > > Interestingly, I mistakenly used genshader$(EXEEXT) initially, but that > causes automake to spew out warnings that these "rules" (which aren't > really rules at all) were overriding previous commands for the same > target. So while $(BUILD_EXEEXT) is the right extension to use, it also > works around a shortcoming in automake. Another shortcoming of automake > will cause this to break when doing MinGW->Unix and Unix->MinGW cross- > builds... > > A different approach of this patch was to move genshader and genunifont > to a tools subdirectory with a separate Makefile.am and overwrite the > CC, CFLAGS and LDFLAGS variables in that directory only, but automake > doesn't accept that either because CFLAGS is considered a user variable > and therefore can't be overwritten. > > I think this is as good as it gets for now. I've tried to fix this kind > of problem in automake several times but never managed to. Perhaps it's > time to move on to something like SCons... Yepp, automake has issues but it's still better than the alternatives, imho. Patch is applied and pushed. I only added a small TODO-comment so I don't forget about this issue. I don't have any better idea right now. Btw., there's kmscon-devel at lists. (which is an open list and doesn't require registration). Thanks David
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 05:35:43PM +0100, Chris Wilson wrote: > Not quite, as it would be possible for the evil userspace to trigger a > GPU hang that would cause the sane userspace to spin indefinitely > waiting for the error recovery to kick in. So with FIFOn+1 preempting FIFOn its a live-lock because the faulting thread will forever keep yielding to itself since its the highest priority task around, therefore the set_need_resched() is an absolute NOP in that case. For OTHER it might run another task with set_need_resched(), without set_need_resched() it'll simply spin on the fault until it runs out of time and gets force preempted and another task gets to run. So for either case, the set_need_resched() doesn't make an appreciable difference. Removing it will not make evil userspace much worse -- at worst it will cause slightly more wasted cycles. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] drm/udl: rip out set_need_resched
On Thu, Sep 12, 2013 at 05:57:29PM +0200, Daniel Vetter wrote: > This very much looks like copypasta from drm/i915's fault handler. > It was used there to duct-tape over issues around gpu reset handling. > > Since that can't ever happen for udl and there's seemingly no other > reason for this just drop it. > > Reported-by: Peter Zijlstra Thanks! Acked-by: Peter Zijlstra ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On 09/12/2013 05:58 PM, Daniel Vetter wrote: On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra wrote: The one in ttm is just bonghits to shut up lockdep: ttm can recurse into it's own pagefault handler and then deadlock, the trylock just keeps lockdep quiet. Could you describe how it could recurse into it's own pagefault handler? IIRC the VM flags of the TTM VMAs makes get_user_pages() refrain from touching these VMAs, hence I don't think this code can deadlock, but admittedly it's far from the optimal solution. Never mind, more on the set_need_resched() below. We've had that bug arise in drm/i915 due to some fun userspace did and now have testcases for them. The right solution to fix this is to use copy_to|from_user_atomic in ttm everywhere it holds locks and have slowpaths which drops locks, copies stuff into a temp allocation and then continues. At least that's how we've fixed all those inversions in i915-gem. I'm not volunteering to fix this ;-) Yikes.. so how common is it? If I simply rip the set_need_resched() out it will 'spin' on the fault a little longer until a 'natural' preemption point -- if such a thing is every going to happen. A typical case is if a process is throwing out a buffer from the GPU or system memory while another process pagefaults while writing to it. It's not a common situation, and it's by no means a fastpath situation. For correctness purposes, I think set_need_resched() can be safely removed. It's a case of "our userspace doesn't do this", so as long as you're not evil and frob the drm device nodes of ttm drivers directly the deadlock will never happen. No idea how much contention actually happens on e.g. shared buffer objects - in i915 we have just one lock and so suffer quite a bit more from contention. So no idea how much removing the yield would hurt. -Daniel /Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 05:58:49PM +0200, Daniel Vetter wrote: > On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra wrote: > >> The one in ttm is just bonghits to shut up lockdep: ttm can recurse > >> into it's own pagefault handler and then deadlock, the trylock just > >> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some > >> fun userspace did and now have testcases for them. The right solution > >> to fix this is to use copy_to|from_user_atomic in ttm everywhere it > >> holds locks and have slowpaths which drops locks, copies stuff into a > >> temp allocation and then continues. At least that's how we've fixed > >> all those inversions in i915-gem. I'm not volunteering to fix this ;-) > > > > Yikes.. so how common is it? If I simply rip the set_need_resched() out > > it will 'spin' on the fault a little longer until a 'natural' preemption > > point -- if such a thing is every going to happen. > > It's a case of "our userspace doesn't do this", so as long as you're > not evil and frob the drm device nodes of ttm drivers directly the > deadlock will never happen. No idea how much contention actually > happens on e.g. shared buffer objects - in i915 we have just one lock > and so suffer quite a bit more from contention. So no idea how much > removing the yield would hurt. If 'sane' userspace is never supposed to do this, then only insane userspace is going to hurt from this and that's a GOOD (tm) thing, right? ;-) And it won't actually deadlock if you don't use FIFO, for the regular scheduler class it'll just spin a little longer before getting preempted so no real worries there. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE
On Thu, Sep 12, 2013 at 05:36:43PM +0200, Daniel Vetter wrote: > The set_need_resched in i915_gem.c:i915_gem_fault can actually be > removed. It was there to give the error handler a chance to sneak in > and reset the hw/sw tracking when the gpu is dead. That hack goes back > to the days when the locking around our error handler was somewhere > between nonexistent and totally broken, nowadays we keep things from > live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll > whip up a patch to rip this out. I'll also check that our testsuite > properly exercises this path (needs a bit of work on a quick look for > better coverage). > > The one in udl just looks like copypasta from i915, without any > justification (at least I don't see any) for why it's there. Probably > can die, too, since there isn't any gpu to reset on usb display-link > devices ... OK, awesome that. 2 down. > The one in ttm is just bonghits to shut up lockdep: ttm can recurse > into it's own pagefault handler and then deadlock, the trylock just > keeps lockdep quiet. We've had that bug arise in drm/i915 due to some > fun userspace did and now have testcases for them. The right solution > to fix this is to use copy_to|from_user_atomic in ttm everywhere it > holds locks and have slowpaths which drops locks, copies stuff into a > temp allocation and then continues. At least that's how we've fixed > all those inversions in i915-gem. I'm not volunteering to fix this ;-) Yikes.. so how common is it? If I simply rip the set_need_resched() out it will 'spin' on the fault a little longer until a 'natural' preemption point -- if such a thing is every going to happen. It would make that path take longer, but not be more or less broken. So if its a rare path, I'll just rip the set_need_resched() out and you DRM people can then fix up at your own pace. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 68451] Texture flicker in native Dota2 in mesa 9.2.0rc1
https://bugs.freedesktop.org/show_bug.cgi?id=68451 --- Comment #17 from Marko Srebre --- Hi, I am also having this behavior in Dota2. Peter, are you sure 7948ed1250cae78ae1b22dbce4ab23aceacc6159 is the first non-working? Can you try one earlier commit, 1b40398d024d2ac5c8e8b78d0f4941e2a007de2c, and confirm that it works? Because, I am seeing the corruption on even earlier (haven't figured out yet how much earlier). Maybe I am doing something wrong, I am compiling 32-bit mesa on otherwise 64-bit system, but I can see my own printf()'s I've put in to make sure the correct library is loaded when running Dota2. -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1
https://bugs.freedesktop.org/show_bug.cgi?id=68235 --- Comment #23 from Alex Deucher --- (In reply to comment #21) > Adding printk(KERN_DEBUG "DEBUG: about to pass the following value of > module_index to radeon_atom_init_mc_reg_table(): %d", module_index); just > before calling radeon_atom_init_mc_reg_table() returns 2. Ok, that looks good. -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1
https://bugs.freedesktop.org/show_bug.cgi?id=68235 --- Comment #22 from Alex Deucher --- (In reply to comment #20) > Ok, if I apply the whole suggested patch but the following, it hangs: > @@ -4152,14 +4152,14 @@ int ni_dpm_init(struct radeon_device *rdev) > } > ni_pi->mclk_rtt_mode_threshold = eg_pi->mclk_edc_wr_enable_threshold; > > - pi->voltage_control = > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, > 0); > + pi->voltage_control = false; > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, > 0); > > - pi->mvdd_control = > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, > 0); > + pi->mvdd_control = false; > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, > 0); > > - eg_pi->vddci_control = > - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, > 0); > + eg_pi->vddci_control = false; > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, > 0); > > rv770_get_engine_memory_ss(rdev); So does just applying this portion of the patch by itself fix the hang? -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On Wed, Sep 11, 2013 at 11:45:19AM +0300, Jani Nikula wrote: > On Wed, 11 Sep 2013, Aaron Lu wrote: > > It is possible the i915 driver decides not to register a backlight > > interface for the graphics card for some reason(memory allocation failed > > or it knows the native control does not work on this card or whatever), > > so I would prefer let i915 tell ACPI video that it has registered a > > native backlight control interface as Jani has said. > > > > Then together with the video.use_native_backlight, we can register or > > not register ACPI video backlight interface accordingly. Or rather, we > > can simply not register ACPI video backlight interface for Win8 systems > > as long as i915 indicates that it has native backlight control(if the > > native control is broken, i915 should fix it or blacklist it so that > > i915 will not indicate it has native backlight control and ACPI video > > will continue to register its own). > > > > How does this sound? > > Sounds good to me. > > Before plunging forward, have you observed any difference between the > boot modes? We have reports [1] that the backlight behaviour is Not yet from ACPI's point of view. > different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the > acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole > story. This check in patch 2/2 is a policy: for Win8 system, we think the native backlight control has a better chance of working than the ACPI video's, so I think the check is enough in ACPI video. > > Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code > paths, what guarantees do we have of UEFI+CSM or legacy boots working? I suppose the 'tested BIOS code paths' means the pure UEFI boot mode? I don't know what guatantees do we have since I don't know what happened underneath after the backlight register is set in i915 driver, you or other i915 driver people should know more than I do :-) BTW, after the backlight register is set in i915, is it that some find of firmware code will run in response to the setting of the register(e.g. the BLC_PWM_CTL/BLC_PWM_CPU_CTL/PCI_LBPC reg)? Thanks, Aaron ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Intel/Radeon muxless laptop issues
Hello, I am writing to report two issues I'm having with a hybrid graphics laptop. Namely, Sony Vaio SA3S9R with Intel HD3000 and Radeon HD6630M First one looks like a regression introduced by the commit named "gpu/vga_switcheroo: add driver control power feature. (v3)". After I echo the "OFF" command to the switch, the DGPU turns off. However, after that the vga_switcheroo disappears. Reinserting the radeon module makes the kernel print the stack trace. I have uploaded the dmesg log to http://paste.debian.net/38539/ . Another issue I would like to bring up is the handling of suspend-resume cycle on muxless laptops. On most intel/radeon laptops, the DGPU turns on after a resume without an ACPI notification (which sound like a bug). Switcheroo is still thinking it's off, and it keeps draining its extra 10W of power. Now, if we do a "echo OFF", switcheroo does nothing. If we do "echo ON", it fails and gets stuck for some seconds (because it tries to power-up a powered-up card messing the initialization routine and then a GPU hang happens). The hacky solution is to turn it on before suspend and turn it off after via a suspend hook script, but this has several limitations: 1. Most distros don't ship the script by default leaving the inexperienced users with hot laptops 2. Doesn't work well when someone is really using the DGPU 3. I have no idea, but maybe turning it on makes it drain more power in suspend (well, since BIOSes tend to be buggy, who knows what happens next). My hacky solution was to forcibly turn off the DGPU in the switcheroo driver at resume if it was unused, ignoring its state inside the switcheroo struct. I have already made a proof-of-concept patch back in 2012, but somehow that received no attention. http://lkml.indiana.edu/hypermail/linux/kernel/1204.3/02530.html Since fixing ACPI for every broken laptop or shipping a suspend hook is a no-go, we need some kind of a workaround. I don't really know how it should be done, via DMI probing or whatever, but I would like someone who has more knowledge to look into that. Well, I guess, a solution would be to forcibly enable my hack for radeon/intel combos and then blacklist the laptops for which it breaks the resume cycle via DMI (i.e., I think that since most laptops behave this way, we should make the powersaving behavior default one). -- Regards, Alexander ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/msm: fix potential NULL pointer dereference
From: Wei Yongjun The dereference to 'pdata' should be moved below the NULL test. Signed-off-by: Wei Yongjun --- drivers/gpu/drm/msm/msm_gpu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index e1e1ec9..6b50e6b 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -29,13 +29,14 @@ static void bs_init(struct msm_gpu *gpu, struct platform_device *pdev) { struct drm_device *dev = gpu->dev; - struct kgsl_device_platform_data *pdata = pdev->dev.platform_data; + struct kgsl_device_platform_data *pdata; if (!pdev) { dev_err(dev->dev, "could not find dtv pdata\n"); return; } + pdata = pdev->dev.platform_data; if (pdata->bus_scale_table) { gpu->bsc = msm_bus_scale_register_client(pdata->bus_scale_table); DBG("bus scale client: %08x", gpu->bsc); ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/omap: fix error return code in omap_dmm_probe()
From: Wei Yongjun Fix to return -ENOMEM in the refill memory alloc error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun --- drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c index 9b794c9..ab546c9 100644 --- a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c +++ b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c @@ -675,6 +675,7 @@ static int omap_dmm_probe(struct platform_device *dev) &omap_dmm->refill_pa, GFP_KERNEL); if (!omap_dmm->refill_va) { dev_err(&dev->dev, "could not allocate refill memory\n"); + ret = -ENOMEM; goto fail; } ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On Wed, 2013-09-11 at 13:29 +0300, Jani Nikula wrote: > On Wed, 11 Sep 2013, Matthew Garrett wrote: > > On Wed, 2013-09-11 at 11:45 +0300, Jani Nikula wrote: > > > >> Before plunging forward, have you observed any difference between the > >> boot modes? We have reports [1] that the backlight behaviour is > >> different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the > >> acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole > >> story. > >> > >> Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code > >> paths, what guarantees do we have of UEFI+CSM or legacy boots working? > > > > We have no evidence of Windows behaving differently based on the exposed > > firmware type. > > By "behaving differently", do you mean internally adapting to the boot > mode, or exhibiting different behaviour to the user? As far as backlight control goes, both. > We have evidence of the firmware behaving differently (VBT, backlight) > based on the boot mode, all else being equal. We don't adapt to that, > and we fail. I don't know if we should adapt, or do things differently > altogether. I don't even know if Windows 8 works on all boot modes on > the machines in question. Sure, but Windows knows nothing about VBT or opregion-backed backlight control. That's up to the Intel driver. -- Matthew Garrett ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On Wed, 2013-09-11 at 11:45 +0300, Jani Nikula wrote: > Before plunging forward, have you observed any difference between the > boot modes? We have reports [1] that the backlight behaviour is > different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the > acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole > story. > > Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code > paths, what guarantees do we have of UEFI+CSM or legacy boots working? We have no evidence of Windows behaving differently based on the exposed firmware type. -- Matthew Garrett ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] nouveau: fix build eror when VGA_SWITCHEROO is not enabled
From: Randy Dunlap Fix nouveau build error on x86, when ACPI is enabled but VGA_SWITCHEROO is not enabled, by providing a stub function. drivers/built-in.o: In function `nouveau_pmops_runtime_suspend': nouveau_drm.c:(.text+0x3aac89): undefined reference to `nouveau_switcheroo_optimus_dsm' Signed-off-by: Randy Dunlap --- drivers/gpu/drm/nouveau/nouveau_acpi.c |1 + 1 file changed, 1 insertion(+) --- linux-next-20130911.orig/drivers/gpu/drm/nouveau/nouveau_acpi.c +++ linux-next-20130911/drivers/gpu/drm/nouveau/nouveau_acpi.c @@ -373,6 +373,7 @@ void nouveau_unregister_dsm_handler(void #else void nouveau_register_dsm_handler(void) {} void nouveau_unregister_dsm_handler(void) {} +void nouveau_switcheroo_optimus_dsm(void) {} #endif /* retrieve the ROM in 4k blocks */ ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8
On mer., 2013-09-11 at 08:45 +, Matthew Garrett wrote: > On Wed, 2013-09-11 at 11:45 +0300, Jani Nikula wrote: > > > Before plunging forward, have you observed any difference between the > > boot modes? We have reports [1] that the backlight behaviour is > > different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the > > acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole > > story. > > > > Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code > > paths, what guarantees do we have of UEFI+CSM or legacy boots working? > > We have no evidence of Windows behaving differently based on the exposed > firmware type. > I do have an X230 which boots fine using pure UEFI or UEFI+CSM (and I guess I can try booting it with a grml for pure legacy), so I can do tests if needed. Regards, -- Yves-Alexis ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 69062] Corrupted visuals in Gnome3 (A6-4400M)
https://bugs.freedesktop.org/show_bug.cgi?id=69062 --- Comment #7 from LRN --- Upgraded kernel to b7a5ae97502e104371c7eb3b7b17ae959e50d6f5 No effect. Tried to build and install older drm and mesa (git revs from ~Jan 2013, since i remember that things worked back then). Installed drm - no effect. Installed xf86-video-ati - no effect. Tried to build old mesa, got errors. Installed llvm-3.3, made it the default. Got some more errors. Tinkered a bit more. At some point i had to reboot. When the laptop booted again, gdm3 showed no corruption, and neither did gnome3. Upgraded drm, xf86-video-ati and mesa back to git their respective git master HEADs. Rebooted. No corruption. So, yeah. Weird. Maybe i was building xorg in a wrong order, and my git gyrations made it right... -- You are receiving this mail because: You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 69245] New: Opencl random lockups whilst running tstellar's opencl-examples
on :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [29164.980562] radeon :01:00.0: GPU reset succeeded, trying to resume [29165.001309] [drm] PCIE GART of 512M enabled (table at 0x00142000). [29165.001440] radeon :01:00.0: WB enabled [29165.001444] radeon :01:00.0: fence driver on ring 0 use gpu addr 0x4c00 and cpu addr 0x88020e246c00 [29165.001447] radeon :01:00.0: fence driver on ring 3 use gpu addr 0x4c0c and cpu addr 0x88020e246c0c [29165.003766] radeon :01:00.0: fence driver on ring 5 use gpu addr 0x0f60c418 and cpu addr 0xc9000579c418 [29165.020514] [drm] ring test on 0 succeeded in 0 usecs [29165.020576] [drm] ring test on 3 succeeded in 1 usecs [29165.216744] [drm] ring test on 5 succeeded in 1 usecs [29165.216748] [drm] UVD initialized successfully. [29165.216753] [drm] Enabling audio support [29165.229845] [drm] ib test on ring 0 succeeded in 0 usecs [29165.229891] [drm] ib test on ring 3 succeeded in 0 usecs [29165.401525] [drm] ib test on ring 5 succeeded [29196.684784] radeon :01:00.0: GPU lockup CP stall for more than 1msec [29196.684793] radeon :01:00.0: GPU lockup (waiting for 0x00441178 last fence id 0x00441176) [29196.684811] [drm] Disabling audio support [29196.697087] radeon :01:00.0: fence driver on ring 5 use gpu addr 0x0f60c418 and cpu addr 0xc9000449b418 [29196.698143] radeon :01:00.0: Saved 55 dwords of commands on ring 0. [29196.698164] radeon :01:00.0: GPU softreset: 0x0009 [29196.698167] radeon :01:00.0: GRBM_STATUS = 0xF0701028 [29196.698170] radeon :01:00.0: GRBM_STATUS_SE0 = 0x9C02 [29196.698172] radeon :01:00.0: GRBM_STATUS_SE1 = 0x0007 [29196.698174] radeon :01:00.0: SRBM_STATUS = 0x20C0 [29196.698177] radeon :01:00.0: SRBM_STATUS2 = 0x [29196.698180] radeon :01:00.0: R_008674_CP_STALLED_STAT1 = 0x [29196.698182] radeon :01:00.0: R_008678_CP_STALLED_STAT2 = 0x400C [29196.698185] radeon :01:00.0: R_00867C_CP_BUSY_STAT = 0x00058004 [29196.698187] radeon :01:00.0: R_008680_CP_STAT = 0x80268647 [29196.698190] radeon :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [29196.714799] radeon :01:00.0: GRBM_SOFT_RESET=0x7F6B [29196.714855] radeon :01:00.0: SRBM_SOFT_RESET=0x0100 [29196.716015] radeon :01:00.0: GRBM_STATUS = 0x3828 [29196.716019] radeon :01:00.0: GRBM_STATUS_SE0 = 0x0007 [29196.716021] radeon :01:00.0: GRBM_STATUS_SE1 = 0x0007 [29196.716024] radeon :01:00.0: SRBM_STATUS = 0x20C0 [29196.716026] radeon :01:00.0: SRBM_STATUS2 = 0x [29196.716029] radeon :01:00.0: R_008674_CP_STALLED_STAT1 = 0x [29196.716031] radeon :01:00.0: R_008678_CP_STALLED_STAT2 = 0x [29196.716034] radeon :01:00.0: R_00867C_CP_BUSY_STAT = 0x [29196.716036] radeon :01:00.0: R_008680_CP_STAT = 0x [29196.716039] radeon :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [29196.716055] radeon :01:00.0: GPU reset succeeded, trying to resume [29196.736296] [drm] PCIE GART of 512M enabled (table at 0x00142000). [29196.736421] radeon :01:00.0: WB enabled [29196.736425] radeon :01:00.0: fence driver on ring 0 use gpu addr 0x4c00 and cpu addr 0x88020e246c00 [29196.736427] radeon :01:00.0: fence driver on ring 3 use gpu addr 0x4c0c and cpu addr 0x88020e246c0c [29196.738058] radeon :01:00.0: fence driver on ring 5 use gpu addr 0x0bf38418 and cpu addr 0xc9000579c418 [29196.754590] [drm] ring test on 0 succeeded in 1 usecs [29196.754651] [drm] ring test on 3 succeeded in 1 usecs [29196.950834] [drm] ring test on 5 succeeded in 1 usecs [29196.950839] [drm] UVD initialized successfully. [29196.950844] [drm] Enabling audio support [29196.950905] [drm] ib test on ring 0 succeeded in 0 usecs [29196.950961] [drm] ib test on ring 3 succeeded in 1 usecs [29197.122612] [drm] ib test on ring 5 succeeded On a related note after the restart the default power profile is loaded instead of the previous one, but I can post another bug for this issue if desired. If I can provide any other information which can be of help debugging this please tell me. Thanks a lot for your time and efforts in the project :) klondike -- You are receiving this mail because: You are the assignee for the bug. -- next part -- An HTML attachment was scrubbed... URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20130912/286ab64e/attachment-0001.html>