Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom

On 09/12/2013 11:50 PM, Maarten Lankhorst wrote:

Op 12-09-13 18:44, Thomas Hellstrom schreef:

On 09/12/2013 05:45 PM, Maarten Lankhorst wrote:

Op 12-09-13 17:36, Daniel Vetter schreef:

On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra  wrote:

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();

All these sites basically do:

while (!trylock())
  yield();

which is a horrible and broken locking pattern.

Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn.

Secondly the implementation is worse than usual by abusing
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get out of fault path. And
you're using set_need_resched() which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please as to fix ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)

The set_need_resched in i915_gem.c:i915_gem_fault can actually be
removed. It was there to give the error handler a chance to sneak in
and reset the hw/sw tracking when the gpu is dead. That hack goes back
to the days when the locking around our error handler was somewhere
between nonexistent and totally broken, nowadays we keep things from
live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
whip up a patch to rip this out. I'll also check that our testsuite
properly exercises this path (needs a bit of work on a quick look for
better coverage).

The one in ttm is just bonghits to shut up lockdep: ttm can recurse
into it's own pagefault handler and then deadlock, the trylock just
keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
fun userspace did and now have testcases for them. The right solution
to fix this is to use copy_to|from_user_atomic in ttm everywhere it
holds locks and have slowpaths which drops locks, copies stuff into a
temp allocation and then continues. At least that's how we've fixed
all those inversions in i915-gem. I'm not volunteering to fix this ;-)

Ah the case where a mmap'd address is passed to the execbuf ioctl? :P

Fine I'll look into it a bit, hopefully before tuesday. Else it might take a 
bit longer since I'll be on my way to plumbers..

I think a possible fix would be if fault() were allowed to return an error and 
drop the mmap_sem() before returning.

Otherwise we need to track down all copy_to_user / copy_from_user which happen 
with bo::reserve held.


Actually, from looking at the mm code, it seems OK to do the following:

if (!bo_tryreserve()) {
up_read mmap_sem(); // Release the mmap_sem to avoid deadlocks.
bo_reserve();   // Wait for the BO to become available 
(interruptible)
bo_unreserve();   // Where is bo_wait_unreserved() when we 
need it, Maarten :P
return VM_FAULT_RETRY; // Go ahead and retry the VMA walk, after 
regrabbing

}

Somebody conveniently added a VM_FAULT_RETRY, but for a different purpose.

If possible, I suggest to take this route for now to avoid the mess of 
changing locking order in all TTM drivers, with
all give-up-locking slowpaths that comes with it. IIRC it took some time 
for i915 to get that right, and completely get rid of all lockdep warnings.


This will keep the official locking order
bo::reserve
mmap_sem

/Thomas
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 57919] Visual glitches in unity with Radeon HD 7600M

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=57919

--- Comment #21 from Thilo Cestonaro  ---
Latest Saucy. Still no luck :(
Kernel: 3.11.0-7-generic

Greetings Thilo

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68235

--- Comment #25 from Alexandre Demers  ---
(In reply to comment #22)
> (In reply to comment #20)
> > Ok, if I apply the whole suggested patch but the following, it hangs:
> > @@ -4152,14 +4152,14 @@ int ni_dpm_init(struct radeon_device *rdev)
> > }
> > ni_pi->mclk_rtt_mode_threshold = eg_pi->mclk_edc_wr_enable_threshold;
> >  
> > -   pi->voltage_control =
> > -   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 
> > 0);
> > +   pi->voltage_control = false;
> > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 
> > 0);
> >  
> > -   pi->mvdd_control =
> > -   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, 
> > 0);
> > +   pi->mvdd_control = false;
> > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, 
> > 0);
> >  
> > -   eg_pi->vddci_control =
> > -   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, 
> > 0);
> > +   eg_pi->vddci_control = false;
> > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, 
> > 0);
> >  
> > rv770_get_engine_memory_ss(rdev);
> 
> So does just applying this portion of the patch by itself fix the hang?

The only way I don't have a "ni_upload_sw_state failed" is by letting
pi->voltage_control = radeon_atom_is_voltage_gpio(rdev,
SET_VOLTAGE_TYPE_ASIC_VDDC, 0); However, I inevitably end up with a hang either
at login or when my session is loading (however, going in a terminal before it
hangs prevents any hang from happening as long as I stay in terminal).

If I patch that part of code, I always have the "ni_upload_sw_state failed"
error, thus not hanging but preventing any dpm.

I can patch everything else or nothing at all (I tried different combinations)
and they don't seem to change a thing about the hang.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom

On 09/12/2013 11:50 PM, Maarten Lankhorst wrote:

Op 12-09-13 18:44, Thomas Hellstrom schreef:

On 09/12/2013 05:45 PM, Maarten Lankhorst wrote:

Op 12-09-13 17:36, Daniel Vetter schreef:

On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra  wrote:

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();

All these sites basically do:

while (!trylock())
  yield();

which is a horrible and broken locking pattern.

Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn.

Secondly the implementation is worse than usual by abusing
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get out of fault path. And
you're using set_need_resched() which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please as to fix ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)

The set_need_resched in i915_gem.c:i915_gem_fault can actually be
removed. It was there to give the error handler a chance to sneak in
and reset the hw/sw tracking when the gpu is dead. That hack goes back
to the days when the locking around our error handler was somewhere
between nonexistent and totally broken, nowadays we keep things from
live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
whip up a patch to rip this out. I'll also check that our testsuite
properly exercises this path (needs a bit of work on a quick look for
better coverage).

The one in ttm is just bonghits to shut up lockdep: ttm can recurse
into it's own pagefault handler and then deadlock, the trylock just
keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
fun userspace did and now have testcases for them. The right solution
to fix this is to use copy_to|from_user_atomic in ttm everywhere it
holds locks and have slowpaths which drops locks, copies stuff into a
temp allocation and then continues. At least that's how we've fixed
all those inversions in i915-gem. I'm not volunteering to fix this ;-)

Ah the case where a mmap'd address is passed to the execbuf ioctl? :P

Fine I'll look into it a bit, hopefully before tuesday. Else it might take a 
bit longer since I'll be on my way to plumbers..

I think a possible fix would be if fault() were allowed to return an error and 
drop the mmap_sem() before returning.

Otherwise we need to track down all copy_to_user / copy_from_user which happen 
with bo::reserve held.

CONFIG_PROVE_LOCKING=y

and hard grab that reserve lock within the fault handler, done.. lockdep will 
spit it out for you :p

~Maarten


Given that all copy_to_user / copy_from_user paths are actually hit 
during testing, right?


/Thomas
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 1/2] drm/i915: kill set_need_resched

2013-09-12 Thread Rob Clark
hmm, looks like I cargo-cult'd the same into msm.

I guess in i915 (and ttm) case, the issue arises due to need for CPU
access to buffer via GTT?  In which case I should be safe to drop the
set_need_resched() as well? (Since CPU always has direct access to the
pages.)  Or am I missing something about the original issue that
necessitated set_need_resched()?

BR,
-R


On Thu, Sep 12, 2013 at 11:57 AM, Daniel Vetter  wrote:
> This is just a remnant from the old days when our reset handling was
> horribly racy, suffered from terribly locking issues and often happily
> live-locked. Those days are now gone so we can drop the hacks and just
> rip the reschedule-point out.
>
> Reported-by: Peter Zijlstra 
> Cc: Peter Zijlstra 
> Cc: Linux Kernel Mailing List 
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 11 ---
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d80f33d..3871060 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1389,14 +1389,11 @@ out:
> if (i915_terminally_wedged(&dev_priv->gpu_error))
> return VM_FAULT_SIGBUS;
> case -EAGAIN:
> -   /* Give the error handler a chance to run and move the
> -* objects off the GPU active list. Next time we service the
> -* fault, we should be able to transition the page into the
> -* GTT without touching the GPU (and so avoid further
> -* EIO/EGAIN). If the GPU is wedged, then there is no issue
> -* with coherency, just lost writes.
> +   /*
> +* EAGAIN means the gpu is hung and we'll wait for the error
> +* handler to reset everything when re-faulting in
> +* i915_mutex_lock_interruptible.
>  */
> -   set_need_resched();
> case 0:
> case -ERESTARTSYS:
> case -EINTR:
> --
> 1.8.4.rc3
>
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 69081] [radeonsi] Incorrect rendering of textures

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=69081

José Suárez  changed:

   What|Removed |Added

Summary|Incorrect rendering of  |[radeonsi] Incorrect
   |textures|rendering of textures

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra  wrote:
>> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
>> into it's own pagefault handler and then deadlock, the trylock just
>> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
>> fun userspace did and now have testcases for them. The right solution
>> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
>> holds locks and have slowpaths which drops locks, copies stuff into a
>> temp allocation and then continues. At least that's how we've fixed
>> all those inversions in i915-gem. I'm not volunteering to fix this ;-)
>
> Yikes.. so how common is it? If I simply rip the set_need_resched() out
> it will 'spin' on the fault a little longer until a 'natural' preemption
> point -- if such a thing is every going to happen.

It's a case of "our userspace doesn't do this", so as long as you're
not evil and frob the drm device nodes of ttm drivers directly the
deadlock will never happen. No idea how much contention actually
happens on e.g. shared buffer objects - in i915 we have just one lock
and so suffer quite a bit more from contention. So no idea how much
removing the yield would hurt.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/2] drm/i915: kill set_need_resched

2013-09-12 Thread Daniel Vetter
This is just a remnant from the old days when our reset handling was
horribly racy, suffered from terribly locking issues and often happily
live-locked. Those days are now gone so we can drop the hacks and just
rip the reschedule-point out.

Reported-by: Peter Zijlstra 
Cc: Peter Zijlstra 
Cc: Linux Kernel Mailing List 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_gem.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d80f33d..3871060 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1389,14 +1389,11 @@ out:
if (i915_terminally_wedged(&dev_priv->gpu_error))
return VM_FAULT_SIGBUS;
case -EAGAIN:
-   /* Give the error handler a chance to run and move the
-* objects off the GPU active list. Next time we service the
-* fault, we should be able to transition the page into the
-* GTT without touching the GPU (and so avoid further
-* EIO/EGAIN). If the GPU is wedged, then there is no issue
-* with coherency, just lost writes.
+   /*
+* EAGAIN means the gpu is hung and we'll wait for the error
+* handler to reset everything when re-faulting in
+* i915_mutex_lock_interruptible.
 */
-   set_need_resched();
case 0:
case -ERESTARTSYS:
case -EINTR:
-- 
1.8.4.rc3

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Chris Wilson
On Thu, Sep 12, 2013 at 06:22:10PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 12, 2013 at 05:58:49PM +0200, Daniel Vetter wrote:
> > On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra  
> > wrote:
> > >> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
> > >> into it's own pagefault handler and then deadlock, the trylock just
> > >> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
> > >> fun userspace did and now have testcases for them. The right solution
> > >> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
> > >> holds locks and have slowpaths which drops locks, copies stuff into a
> > >> temp allocation and then continues. At least that's how we've fixed
> > >> all those inversions in i915-gem. I'm not volunteering to fix this ;-)
> > >
> > > Yikes.. so how common is it? If I simply rip the set_need_resched() out
> > > it will 'spin' on the fault a little longer until a 'natural' preemption
> > > point -- if such a thing is every going to happen.
> > 
> > It's a case of "our userspace doesn't do this", so as long as you're
> > not evil and frob the drm device nodes of ttm drivers directly the
> > deadlock will never happen. No idea how much contention actually
> > happens on e.g. shared buffer objects - in i915 we have just one lock
> > and so suffer quite a bit more from contention. So no idea how much
> > removing the yield would hurt.
> 
> If 'sane' userspace is never supposed to do this, then only insane
> userspace is going to hurt from this and that's a GOOD (tm) thing,
> right? ;-)

Not quite, as it would be possible for the evil userspace to trigger a
GPU hang that would cause the sane userspace to spin indefinitely 
waiting for the error recovery to kick in.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:11:25PM +0200, Maarten Lankhorst wrote:
> Op 12-09-13 17:06, Peter Zijlstra schreef:
> > Hi Dave,
> >
> > So I'm poking around the preemption code and stumbled upon:
> >
> > drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
> > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
> > drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
> > drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();
> >
> > All these sites basically do:
> >
> >   while (!trylock())
> > yield();
> >
> > which is a horrible and broken locking pattern. 
> >
> > Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
> > task that preempted the lock holder at FIFOn.
> >
> > Secondly the implementation is worse than usual by abusing
> > VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
> > doesn't retry, but you're using it as a get out of fault path. And
> > you're using set_need_resched() which is not something a driver should
> > _ever_ touch.
> >
> > Now I'm going to take away set_need_resched() -- and while you can
> > 'reimplement' it using set_thread_flag() you're not going to do that
> > because it will be broken due to changes to the preempt code.
> >
> > So please as to fix ASAP and don't allow anybody to trick you into
> > merging silly things like that again ;-)
> >
> Agreed, but this is a case of locking inversion. How do you propose to get 
> around that?

me? No idea, I've never looked at the actual locking in DRM. Someone
who's familiar with that code would have to tackle that.

I just spotted the fail because I was going to remove set_need_resched()
and had a WTF moment when I tripped over this stuff.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
Hi Dave,

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();

All these sites basically do:

  while (!trylock())
yield();

which is a horrible and broken locking pattern. 

Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn.

Secondly the implementation is worse than usual by abusing
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get out of fault path. And
you're using set_need_resched() which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please as to fix ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Maarten Lankhorst
Op 12-09-13 17:06, Peter Zijlstra schreef:
> Hi Dave,
>
> So I'm poking around the preemption code and stumbled upon:
>
> drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
> drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();
>
> All these sites basically do:
>
>   while (!trylock())
>   yield();
>
> which is a horrible and broken locking pattern. 
>
> Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
> task that preempted the lock holder at FIFOn.
>
> Secondly the implementation is worse than usual by abusing
> VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
> doesn't retry, but you're using it as a get out of fault path. And
> you're using set_need_resched() which is not something a driver should
> _ever_ touch.
>
> Now I'm going to take away set_need_resched() -- and while you can
> 'reimplement' it using set_thread_flag() you're not going to do that
> because it will be broken due to changes to the preempt code.
>
> So please as to fix ASAP and don't allow anybody to trick you into
> merging silly things like that again ;-)
>
Agreed, but this is a case of locking inversion. How do you propose to get 
around that?

~Maarten
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 6:44 PM, Thomas Hellstrom  wrote:
>
> I think a possible fix would be if fault() were allowed to return an error
> and drop the mmap_sem() before returning.
>
> Otherwise we need to track down all copy_to_user / copy_from_user which
> happen with bo::reserve held.

For maximal evilness submit the relocation list (or whatever data
execbuf slurps in with copy_from_user while holding bo::reserve) of a
bo in the execbuf list. At least that's the testcase we have for
drm/i915. Then make sure that the execbuf wants the bo somewhere it
can't be mmaped from userspace, so needs to be moved both in the fault
handler and then back for the execbuf to continue ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra  wrote:
> If 'sane' userspace is never supposed to do this, then only insane
> userspace is going to hurt from this and that's a GOOD (tm) thing,
> right? ;-)

Afaik sane userspace doesn't hit the _deadlock_ (or lifelock if we
have the set_need_resched in there). drm/i915 is a bit different since
we have just one lock, and so the same design would actually deadlock
even for sane userspace. But hitting contention there and yielding is
somewhat expected. Obviously shouldn't happen too often since it'll
hurt performance, with either blocking or the yield spinning loop.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Fwd: Intel/Radeon muxless laptop issues

2013-09-12 Thread Alexander Tarasikov
Hello,
I am writing to report two issues I'm having with a hybrid graphics laptop.
Namely, Sony Vaio SA3S9R with Intel HD3000 and Radeon HD6630M

First one looks like a regression introduced by the commit named
"gpu/vga_switcheroo: add driver control power feature. (v3)". After I echo
the "OFF" command to the switch, the DGPU turns off. However, after that
the vga_switcheroo disappears. Reinserting the radeon module makes
the kernel print the stack trace.

I have uploaded the dmesg log to http://paste.debian.net/38539/ .

Another issue I would like to bring up is the handling of suspend-resume
cycle on muxless laptops. On most intel/radeon laptops, the DGPU turns
on after a resume without an ACPI notification (which sound like a bug).
Switcheroo is still thinking it's off, and it keeps draining
its extra 10W of power. Now, if we do a "echo OFF", switcheroo does nothing.
If we do "echo ON", it fails and gets stuck for some seconds (because it
tries to power-up a powered-up card messing the initialization routine and
then a GPU hang happens). The hacky solution is to turn it on before suspend
and turn it off after via a suspend hook script, but this has several
limitations:
1. Most distros don't ship the script by default leaving the inexperienced
users
with hot laptops
2. Doesn't work well when someone is really using the DGPU
3. I have no idea, but maybe turning it on makes it drain more power in
suspend
(well, since BIOSes tend to be buggy, who knows what happens next).

My hacky solution was to forcibly turn off the DGPU in the switcheroo driver
at resume if it was unused, ignoring its state inside the switcheroo struct.
I have already made a proof-of-concept patch back in 2012, but somehow
that received no attention.
http://lkml.indiana.edu/hypermail/linux/kernel/1204.3/02530.html

Since fixing ACPI for every broken laptop or shipping a suspend hook is a
no-go,
we need some kind of a workaround. I don't really know how it should be
done,
via DMI probing or whatever, but I would like someone who has more knowledge
to look into that. Well, I guess, a solution would be to forcibly enable my
hack
for radeon/intel combos and then blacklist the laptops for which it breaks
the resume
cycle via DMI (i.e., I think that since most laptops behave this way, we
should
make the powersaving behavior default one).

--
Regards, Alexander
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68235

--- Comment #24 from Alexandre Demers  ---
(In reply to comment #22)
> (In reply to comment #20)
> > Ok, if I apply the whole suggested patch but the following, it hangs:
> > @@ -4152,14 +4152,14 @@ int ni_dpm_init(struct radeon_device *rdev)
> > }
> > ni_pi->mclk_rtt_mode_threshold = eg_pi->mclk_edc_wr_enable_threshold;
> >  
> > -   pi->voltage_control =
> > -   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 
> > 0);
> > +   pi->voltage_control = false;
> > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 
> > 0);
> >  
> > -   pi->mvdd_control =
> > -   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, 
> > 0);
> > +   pi->mvdd_control = false;
> > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, 
> > 0);
> >  
> > -   eg_pi->vddci_control =
> > -   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, 
> > 0);
> > +   eg_pi->vddci_control = false;
> > +// radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, 
> > 0);
> >  
> > rv770_get_engine_memory_ss(rdev);
> 
> So does just applying this portion of the patch by itself fix the hang?

Applying just this returns an error when booting: ni_upload_sw_state failed,
but obviously the system doesn't hang after that (though it can't change its
performance state)

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom

On 09/12/2013 05:45 PM, Maarten Lankhorst wrote:

Op 12-09-13 17:36, Daniel Vetter schreef:

On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra  wrote:

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();

All these sites basically do:

   while (!trylock())
 yield();

which is a horrible and broken locking pattern.

Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn.

Secondly the implementation is worse than usual by abusing
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get out of fault path. And
you're using set_need_resched() which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please as to fix ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)

The set_need_resched in i915_gem.c:i915_gem_fault can actually be
removed. It was there to give the error handler a chance to sneak in
and reset the hw/sw tracking when the gpu is dead. That hack goes back
to the days when the locking around our error handler was somewhere
between nonexistent and totally broken, nowadays we keep things from
live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
whip up a patch to rip this out. I'll also check that our testsuite
properly exercises this path (needs a bit of work on a quick look for
better coverage).

The one in ttm is just bonghits to shut up lockdep: ttm can recurse
into it's own pagefault handler and then deadlock, the trylock just
keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
fun userspace did and now have testcases for them. The right solution
to fix this is to use copy_to|from_user_atomic in ttm everywhere it
holds locks and have slowpaths which drops locks, copies stuff into a
temp allocation and then continues. At least that's how we've fixed
all those inversions in i915-gem. I'm not volunteering to fix this ;-)

Ah the case where a mmap'd address is passed to the execbuf ioctl? :P

Fine I'll look into it a bit, hopefully before tuesday. Else it might take a 
bit longer since I'll be on my way to plumbers..


I think a possible fix would be if fault() were allowed to return an 
error and drop the mmap_sem() before returning.


Otherwise we need to track down all copy_to_user / copy_from_user which 
happen with bo::reserve held.


/Thomas
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom

On 09/12/2013 10:39 PM, Thomas Gleixner wrote:

On Thu, 12 Sep 2013, Daniel Vetter wrote:


On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner  wrote:

I think for ttm drivers it's just execbuf being exploitable. But on
drm/i915 we've
had the same issue with the pwrite/pread ioctls, so a simple
glBufferData(glMap) kind of recursion from gl clients blew the kernel
to pieces ...

And the only answer you folks came up with is set_need_resched() and
yield()? Oh well

The yield was for a different lifelock, and that one is also fixed by
now. The fault handler deadlock was fixed in the usual "drop locks and
jump into slowpath" fasion, at least in drm/i915.

So we can remove that whole yield/set_need_resched() mess completely ?

Thanks,

tglx

No.

The while(trylock) is there to address a potential locking inversion 
deadlock. If the trylock fails, the code returns to user-space which 
retries the fault. This code needs to stay until we can come up with 
either a way to drop the mmap_sem and sleep before returning to 
user-space, or a bunch of code is fixed with a different locking order.


The set_need_resched() can (and should according to Peter) go.

/Thomas
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Maarten Lankhorst
Op 12-09-13 17:36, Daniel Vetter schreef:
> On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra  wrote:
>> So I'm poking around the preemption code and stumbled upon:
>>
>> drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
>> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
>> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
>> drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();
>>
>> All these sites basically do:
>>
>>   while (!trylock())
>> yield();
>>
>> which is a horrible and broken locking pattern.
>>
>> Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
>> task that preempted the lock holder at FIFOn.
>>
>> Secondly the implementation is worse than usual by abusing
>> VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
>> doesn't retry, but you're using it as a get out of fault path. And
>> you're using set_need_resched() which is not something a driver should
>> _ever_ touch.
>>
>> Now I'm going to take away set_need_resched() -- and while you can
>> 'reimplement' it using set_thread_flag() you're not going to do that
>> because it will be broken due to changes to the preempt code.
>>
>> So please as to fix ASAP and don't allow anybody to trick you into
>> merging silly things like that again ;-)
> The set_need_resched in i915_gem.c:i915_gem_fault can actually be
> removed. It was there to give the error handler a chance to sneak in
> and reset the hw/sw tracking when the gpu is dead. That hack goes back
> to the days when the locking around our error handler was somewhere
> between nonexistent and totally broken, nowadays we keep things from
> live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
> whip up a patch to rip this out. I'll also check that our testsuite
> properly exercises this path (needs a bit of work on a quick look for
> better coverage).
>
> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
> into it's own pagefault handler and then deadlock, the trylock just
> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
> fun userspace did and now have testcases for them. The right solution
> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
> holds locks and have slowpaths which drops locks, copies stuff into a
> temp allocation and then continues. At least that's how we've fixed
> all those inversions in i915-gem. I'm not volunteering to fix this ;-)
Ah the case where a mmap'd address is passed to the execbuf ioctl? :P

Fine I'll look into it a bit, hopefully before tuesday. Else it might take a 
bit longer since I'll be on my way to plumbers..
> The one in udl just looks like copypasta from i915, without any
> justification (at least I don't see any) for why it's there. Probably
> can die, too, since there isn't any gpu to reset on usb display-link
> devices ...
>
> Cheers, Daniel

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Maarten Lankhorst
Op 12-09-13 18:44, Thomas Hellstrom schreef:
> On 09/12/2013 05:45 PM, Maarten Lankhorst wrote:
>> Op 12-09-13 17:36, Daniel Vetter schreef:
>>> On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra  
>>> wrote:
 So I'm poking around the preemption code and stumbled upon:

 drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
 drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
 drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
 drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();

 All these sites basically do:

while (!trylock())
  yield();

 which is a horrible and broken locking pattern.

 Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
 task that preempted the lock holder at FIFOn.

 Secondly the implementation is worse than usual by abusing
 VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
 doesn't retry, but you're using it as a get out of fault path. And
 you're using set_need_resched() which is not something a driver should
 _ever_ touch.

 Now I'm going to take away set_need_resched() -- and while you can
 'reimplement' it using set_thread_flag() you're not going to do that
 because it will be broken due to changes to the preempt code.

 So please as to fix ASAP and don't allow anybody to trick you into
 merging silly things like that again ;-)
>>> The set_need_resched in i915_gem.c:i915_gem_fault can actually be
>>> removed. It was there to give the error handler a chance to sneak in
>>> and reset the hw/sw tracking when the gpu is dead. That hack goes back
>>> to the days when the locking around our error handler was somewhere
>>> between nonexistent and totally broken, nowadays we keep things from
>>> live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
>>> whip up a patch to rip this out. I'll also check that our testsuite
>>> properly exercises this path (needs a bit of work on a quick look for
>>> better coverage).
>>>
>>> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
>>> into it's own pagefault handler and then deadlock, the trylock just
>>> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
>>> fun userspace did and now have testcases for them. The right solution
>>> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
>>> holds locks and have slowpaths which drops locks, copies stuff into a
>>> temp allocation and then continues. At least that's how we've fixed
>>> all those inversions in i915-gem. I'm not volunteering to fix this ;-)
>> Ah the case where a mmap'd address is passed to the execbuf ioctl? :P
>>
>> Fine I'll look into it a bit, hopefully before tuesday. Else it might take a 
>> bit longer since I'll be on my way to plumbers..
>
> I think a possible fix would be if fault() were allowed to return an error 
> and drop the mmap_sem() before returning.
>
> Otherwise we need to track down all copy_to_user / copy_from_user which 
> happen with bo::reserve held.
CONFIG_PROVE_LOCKING=y

and hard grab that reserve lock within the fault handler, done.. lockdep will 
spit it out for you :p

~Maarten
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra  wrote:
>
> So I'm poking around the preemption code and stumbled upon:
>
> drivers/gpu/drm/i915/i915_gem.c:set_need_resched();
> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
> drivers/gpu/drm/ttm/ttm_bo_vm.c:set_need_resched();
> drivers/gpu/drm/udl/udl_gem.c:  set_need_resched();
>
> All these sites basically do:
>
>   while (!trylock())
> yield();
>
> which is a horrible and broken locking pattern.
>
> Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
> task that preempted the lock holder at FIFOn.
>
> Secondly the implementation is worse than usual by abusing
> VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
> doesn't retry, but you're using it as a get out of fault path. And
> you're using set_need_resched() which is not something a driver should
> _ever_ touch.
>
> Now I'm going to take away set_need_resched() -- and while you can
> 'reimplement' it using set_thread_flag() you're not going to do that
> because it will be broken due to changes to the preempt code.
>
> So please as to fix ASAP and don't allow anybody to trick you into
> merging silly things like that again ;-)

The set_need_resched in i915_gem.c:i915_gem_fault can actually be
removed. It was there to give the error handler a chance to sneak in
and reset the hw/sw tracking when the gpu is dead. That hack goes back
to the days when the locking around our error handler was somewhere
between nonexistent and totally broken, nowadays we keep things from
live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
whip up a patch to rip this out. I'll also check that our testsuite
properly exercises this path (needs a bit of work on a quick look for
better coverage).

The one in ttm is just bonghits to shut up lockdep: ttm can recurse
into it's own pagefault handler and then deadlock, the trylock just
keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
fun userspace did and now have testcases for them. The right solution
to fix this is to use copy_to|from_user_atomic in ttm everywhere it
holds locks and have slowpaths which drops locks, copies stuff into a
temp allocation and then continues. At least that's how we've fixed
all those inversions in i915-gem. I'm not volunteering to fix this ;-)

The one in udl just looks like copypasta from i915, without any
justification (at least I don't see any) for why it's there. Probably
can die, too, since there isn't any gpu to reset on usb display-link
devices ...

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 68451] Texture flicker in native Dota2 in mesa 9.2.0rc1

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68451

--- Comment #18 from Peter Kraus  ---
(In reply to comment #17)
> Hi,
> I am also having this behavior in Dota2. Peter, are you sure
> 7948ed1250cae78ae1b22dbce4ab23aceacc6159 is the first non-working? Can you
> try one earlier commit, 1b40398d024d2ac5c8e8b78d0f4941e2a007de2c, and
> confirm that it works? Because, I am seeing the corruption on even earlier
> (haven't figured out yet how much earlier).
> 
> Maybe I am doing something wrong, I am compiling 32-bit mesa on otherwise
> 64-bit system, but I can see my own printf()'s I've put in to make sure the
> correct library is loaded when running Dota2.

Hello,
I have just checked, commit 1b40398d024d2ac5c8e8b78d0f4941e2a007de2c (+ the 5
patches for Bison 3, above) still works fine.

Current HEAD is still broken.

Peter

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 2/2] drm/udl: rip out set_need_resched

2013-09-12 Thread Daniel Vetter
This very much looks like copypasta from drm/i915's fault handler.
It was used there to duct-tape over issues around gpu reset handling.

Since that can't ever happen for udl and there's seemingly no other
reason for this just drop it.

Reported-by: Peter Zijlstra 
Cc: Peter Zijlstra 
Cc: Linux Kernel Mailing List 
Cc: Dave Airlie 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/udl/udl_gem.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c
index 8dbe9d0..8bf6461 100644
--- a/drivers/gpu/drm/udl/udl_gem.c
+++ b/drivers/gpu/drm/udl/udl_gem.c
@@ -97,7 +97,6 @@ int udl_gem_fault(struct vm_area_struct *vma, struct vm_fault 
*vmf)
ret = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
switch (ret) {
case -EAGAIN:
-   set_need_resched();
case 0:
case -ERESTARTSYS:
return VM_FAULT_NOPAGE;
-- 
1.8.4.rc3

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 9:58 PM, Thomas Gleixner  wrote:
>
>> On Thu, Sep 12, 2013 at 6:22 PM, Peter Zijlstra  wrote:
>> > If 'sane' userspace is never supposed to do this, then only insane
>> > userspace is going to hurt from this and that's a GOOD (tm) thing,
>> > right? ;-)
>>
>> Afaik sane userspace doesn't hit the _deadlock_ (or lifelock if we
>> have the set_need_resched in there). drm/i915 is a bit different since
>> we have just one lock, and so the same design would actually deadlock
>> even for sane userspace. But hitting contention there and yielding is
>> somewhat expected. Obviously shouldn't happen too often since it'll
>> hurt performance, with either blocking or the yield spinning loop.
>
> So this is actually a non priviledged DoS interface, right?

I think for ttm drivers it's just execbuf being exploitable. But on
drm/i915 we've
had the same issue with the pwrite/pread ioctls, so a simple
glBufferData(glMap) kind of recursion from gl clients blew the kernel
to pieces ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 10:20 PM, Thomas Gleixner  wrote:
>> I think for ttm drivers it's just execbuf being exploitable. But on
>> drm/i915 we've
>> had the same issue with the pwrite/pread ioctls, so a simple
>> glBufferData(glMap) kind of recursion from gl clients blew the kernel
>> to pieces ...
>
> And the only answer you folks came up with is set_need_resched() and
> yield()? Oh well

The yield was for a different lifelock, and that one is also fixed by
now. The fault handler deadlock was fixed in the usual "drop locks and
jump into slowpath" fasion, at least in drm/i915.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] drm/i915: kill set_need_resched

2013-09-12 Thread Daniel Vetter
On Thu, Sep 12, 2013 at 05:59:51PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 12, 2013 at 05:57:28PM +0200, Daniel Vetter wrote:
> > This is just a remnant from the old days when our reset handling was
> > horribly racy, suffered from terribly locking issues and often happily
> > live-locked. Those days are now gone so we can drop the hacks and just
> > rip the reschedule-point out.
> > 
> > Reported-by: Peter Zijlstra 
> 
> Thanks!
> 
> Acked-by: Peter Zijlstra 

Merged to drm-intel-fixes with Chris' irc-review that we indeed don't need
this any more to duct-tape over bugs.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] drm/i915: kill set_need_resched

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:57:28PM +0200, Daniel Vetter wrote:
> This is just a remnant from the old days when our reset handling was
> horribly racy, suffered from terribly locking issues and often happily
> live-locked. Those days are now gone so we can drop the hacks and just
> rip the reschedule-point out.
> 
> Reported-by: Peter Zijlstra 

Thanks!

Acked-by: Peter Zijlstra 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] Allow kmscon to be cross-compiled

2013-09-12 Thread David Herrmann
Hi

On Mon, Sep 9, 2013 at 1:43 PM, Thierry Reding  
wrote:
> On Mon, Sep 09, 2013 at 12:48:08PM +0200, David Herrmann wrote:
>> Hi
>>
>> On Mon, Sep 9, 2013 at 12:27 PM, Thierry Reding
>>  wrote:
>> > The genshader and genunifont utilities are run during the build process
>> > to generate source files. In order for that to work when cross-compiling
>> > the files need to be built using the native compiler instead of the
>> > cross-compiler.
>> >
>> > Add the AX_PROG_CC_FOR_BUILD m4 macro which defines various *_FOR_BUILD
>> > variables that are the native equivalents of CC, CFLAGS, LDFLAGS, etc.
>> > Override CC, CFLAGS and LDFLAGS for genshader and genunifont and their
>> > object files so that they will be built natively and can be executed
>> > during the build.
>>
>> Thanks a lot! Looks all good, few comments below. I think I will apply
>> it as is, anyway. New bugfix-release is planned for this week, too.
>>
>> > Signed-off-by: Thierry Reding 
>> > ---
>> >  Makefile.am|  16 --
>> >  configure.ac   |   3 +-
>> >  m4/ax_prog_cc_for_build.m4 | 125 
>> > +
>> >  3 files changed, 139 insertions(+), 5 deletions(-)
>> >  create mode 100644 m4/ax_prog_cc_for_build.m4
>> >
>> > diff --git a/Makefile.am b/Makefile.am
>> > index 7019290..f1b4435 100644
>> > --- a/Makefile.am
>> > +++ b/Makefile.am
>> > @@ -445,8 +445,12 @@ EXTRA_DIST += $(SHADERS)
>> >  CLEANFILES += src/static_shaders.c
>> >  genshader_SOURCES = src/genshader.c
>> >
>> > -src/static_shaders.c: $(SHADERS) genshader$(EXEEXT)
>> > -   $(AM_V_GEN)./genshader$(EXEEXT) src/static_shaders.c $(SHADERS)
>> > +src/static_shaders.c: $(SHADERS) genshader$(BUILD_EXEEXT)
>> > +   $(AM_V_GEN)./genshader$(BUILD_EXEEXT) src/static_shaders.c 
>> > $(SHADERS)
>> > +
>> > +genshader$(BUILD_EXEEXT) $(genshader_OBJECTS): CC = $(CC_FOR_BUILD)
>> > +genshader$(BUILD_EXEEXT) $(genshader_OBJECTS): CFLAGS = 
>> > $(CFLAGS_FOR_BUILD)
>> > +genshader$(BUILD_EXEEXT): LDFLAGS = $(LDFLAGS_FOR_BUILD)
>>
>> Just wondering, isn't this going to break if $BUILD_EXEEXT != $EXEEXT
>> I mean, noinst_PROGRAMS generates build-rules for $EXEEXT, not for
>> $BUILD_EXEEXT, so a dependency on "genshader$(BUILD_EXEEXT)" won't do
>> anything if it differs from $EXEEXT. But maybe I am just missing
>> something and automake creates rules for both?
>
> That's a good point. And yes, I think it would break if both extensions
> differed. On the other hand I have no idea on how to make automake
> generate rules for $(BUILD_EXEEXT). I don't think it can.
>
> Interestingly, I mistakenly used genshader$(EXEEXT) initially, but that
> causes automake to spew out warnings that these "rules" (which aren't
> really rules at all) were overriding previous commands for the same
> target. So while $(BUILD_EXEEXT) is the right extension to use, it also
> works around a shortcoming in automake. Another shortcoming of automake
> will cause this to break when doing MinGW->Unix and Unix->MinGW cross-
> builds...
>
> A different approach of this patch was to move genshader and genunifont
> to a tools subdirectory with a separate Makefile.am and overwrite the
> CC, CFLAGS and LDFLAGS variables in that directory only, but automake
> doesn't accept that either because CFLAGS is considered a user variable
> and therefore can't be overwritten.
>
> I think this is as good as it gets for now. I've tried to fix this kind
> of problem in automake several times but never managed to. Perhaps it's
> time to move on to something like SCons...

Yepp, automake has issues but it's still better than the alternatives,
imho. Patch is applied and pushed. I only added a small TODO-comment
so I don't forget about this issue. I don't have any better idea right
now.

Btw., there's kmscon-devel at lists. (which is an open list and
doesn't require registration).

Thanks
David


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:35:43PM +0100, Chris Wilson wrote:
> Not quite, as it would be possible for the evil userspace to trigger a
> GPU hang that would cause the sane userspace to spin indefinitely 
> waiting for the error recovery to kick in.

So with FIFOn+1 preempting FIFOn its a live-lock because the faulting
thread will forever keep yielding to itself since its the highest
priority task around, therefore the set_need_resched() is an absolute
NOP in that case.

For OTHER it might run another task with set_need_resched(), without
set_need_resched() it'll simply spin on the fault until it runs out of
time and gets force preempted and another task gets to run.

So for either case, the set_need_resched() doesn't make an appreciable
difference.

Removing it will not make evil userspace much worse -- at worst it will
cause slightly more wasted cycles.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] drm/udl: rip out set_need_resched

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:57:29PM +0200, Daniel Vetter wrote:
> This very much looks like copypasta from drm/i915's fault handler.
> It was used there to duct-tape over issues around gpu reset handling.
> 
> Since that can't ever happen for udl and there's seemingly no other
> reason for this just drop it.
> 
> Reported-by: Peter Zijlstra 

Thanks!

Acked-by: Peter Zijlstra 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Thomas Hellstrom

On 09/12/2013 05:58 PM, Daniel Vetter wrote:

On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra  wrote:

The one in ttm is just bonghits to shut up lockdep: ttm can recurse
into it's own pagefault handler and then deadlock, the trylock just
keeps lockdep quiet.


Could you describe how it could recurse into it's own pagefault handler?
IIRC the VM flags of the TTM VMAs makes get_user_pages() refrain from 
touching these VMAs,
hence I don't think this code can deadlock, but admittedly it's far from 
the optimal solution.


Never mind, more on the set_need_resched() below.



  We've had that bug arise in drm/i915 due to some
fun userspace did and now have testcases for them. The right solution
to fix this is to use copy_to|from_user_atomic in ttm everywhere it
holds locks and have slowpaths which drops locks, copies stuff into a
temp allocation and then continues. At least that's how we've fixed
all those inversions in i915-gem. I'm not volunteering to fix this ;-)

Yikes.. so how common is it? If I simply rip the set_need_resched() out
it will 'spin' on the fault a little longer until a 'natural' preemption
point -- if such a thing is every going to happen.


A typical case is if a process is throwing out a buffer from the GPU or 
system memory while another
process pagefaults while writing to it. It's not a common situation, and 
it's by no means a fastpath situation.

For correctness purposes, I think set_need_resched() can be safely removed.


It's a case of "our userspace doesn't do this", so as long as you're
not evil and frob the drm device nodes of ttm drivers directly the
deadlock will never happen. No idea how much contention actually
happens on e.g. shared buffer objects - in i915 we have just one lock
and so suffer quite a bit more from contention. So no idea how much
removing the yield would hurt.
-Daniel


/Thomas
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:58:49PM +0200, Daniel Vetter wrote:
> On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra  wrote:
> >> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
> >> into it's own pagefault handler and then deadlock, the trylock just
> >> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
> >> fun userspace did and now have testcases for them. The right solution
> >> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
> >> holds locks and have slowpaths which drops locks, copies stuff into a
> >> temp allocation and then continues. At least that's how we've fixed
> >> all those inversions in i915-gem. I'm not volunteering to fix this ;-)
> >
> > Yikes.. so how common is it? If I simply rip the set_need_resched() out
> > it will 'spin' on the fault a little longer until a 'natural' preemption
> > point -- if such a thing is every going to happen.
> 
> It's a case of "our userspace doesn't do this", so as long as you're
> not evil and frob the drm device nodes of ttm drivers directly the
> deadlock will never happen. No idea how much contention actually
> happens on e.g. shared buffer objects - in i915 we have just one lock
> and so suffer quite a bit more from contention. So no idea how much
> removing the yield would hurt.

If 'sane' userspace is never supposed to do this, then only insane
userspace is going to hurt from this and that's a GOOD (tm) thing,
right? ;-)

And it won't actually deadlock if you don't use FIFO, for the regular
scheduler class it'll just spin a little longer before getting preempted
so no real worries there.


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

2013-09-12 Thread Peter Zijlstra
On Thu, Sep 12, 2013 at 05:36:43PM +0200, Daniel Vetter wrote:
> The set_need_resched in i915_gem.c:i915_gem_fault can actually be
> removed. It was there to give the error handler a chance to sneak in
> and reset the hw/sw tracking when the gpu is dead. That hack goes back
> to the days when the locking around our error handler was somewhere
> between nonexistent and totally broken, nowadays we keep things from
> live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll
> whip up a patch to rip this out. I'll also check that our testsuite
> properly exercises this path (needs a bit of work on a quick look for
> better coverage).
> 
> The one in udl just looks like copypasta from i915, without any
> justification (at least I don't see any) for why it's there. Probably
> can die, too, since there isn't any gpu to reset on usb display-link
> devices ...

OK, awesome that. 2 down.

> The one in ttm is just bonghits to shut up lockdep: ttm can recurse
> into it's own pagefault handler and then deadlock, the trylock just
> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some
> fun userspace did and now have testcases for them. The right solution
> to fix this is to use copy_to|from_user_atomic in ttm everywhere it
> holds locks and have slowpaths which drops locks, copies stuff into a
> temp allocation and then continues. At least that's how we've fixed
> all those inversions in i915-gem. I'm not volunteering to fix this ;-)

Yikes.. so how common is it? If I simply rip the set_need_resched() out
it will 'spin' on the fault a little longer until a 'natural' preemption
point -- if such a thing is every going to happen.

It would make that path take longer, but not be more or less broken.

So if its a rare path, I'll just rip the set_need_resched() out and you
DRM people can then fix up at your own pace.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 68451] Texture flicker in native Dota2 in mesa 9.2.0rc1

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68451

--- Comment #17 from Marko Srebre  ---
Hi,
I am also having this behavior in Dota2. Peter, are you sure
7948ed1250cae78ae1b22dbce4ab23aceacc6159 is the first non-working? Can you try
one earlier commit, 1b40398d024d2ac5c8e8b78d0f4941e2a007de2c, and confirm that
it works? Because, I am seeing the corruption on even earlier (haven't figured
out yet how much earlier).

Maybe I am doing something wrong, I am compiling 32-bit mesa on otherwise
64-bit system, but I can see my own printf()'s I've put in to make sure the
correct library is loaded when running Dota2.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68235

--- Comment #23 from Alex Deucher  ---
(In reply to comment #21)
> Adding printk(KERN_DEBUG "DEBUG: about to pass the following value of
> module_index to radeon_atom_init_mc_reg_table(): %d", module_index); just
> before calling radeon_atom_init_mc_reg_table() returns 2.

Ok, that looks good.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 68235] Display freezes after login with kernel 3.11.0-rc5 on Cayman with dpm=1

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68235

--- Comment #22 from Alex Deucher  ---
(In reply to comment #20)
> Ok, if I apply the whole suggested patch but the following, it hangs:
> @@ -4152,14 +4152,14 @@ int ni_dpm_init(struct radeon_device *rdev)
>   }
>   ni_pi->mclk_rtt_mode_threshold = eg_pi->mclk_edc_wr_enable_threshold;
>  
> - pi->voltage_control =
> - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 
> 0);
> + pi->voltage_control = false;
> +//   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDC, 
> 0);
>  
> - pi->mvdd_control =
> - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, 
> 0);
> + pi->mvdd_control = false;
> +//   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_MVDDC, 
> 0);
>  
> - eg_pi->vddci_control =
> - radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, 
> 0);
> + eg_pi->vddci_control = false;
> +//   radeon_atom_is_voltage_gpio(rdev, SET_VOLTAGE_TYPE_ASIC_VDDCI, 
> 0);
>  
>   rv770_get_engine_memory_ss(rdev);

So does just applying this portion of the patch by itself fix the hang?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8

2013-09-12 Thread Aaron Lu
On Wed, Sep 11, 2013 at 11:45:19AM +0300, Jani Nikula wrote:
> On Wed, 11 Sep 2013, Aaron Lu  wrote:
> > It is possible the i915 driver decides not to register a backlight
> > interface for the graphics card for some reason(memory allocation failed
> > or it knows the native control does not work on this card or whatever),
> > so I would prefer let i915 tell ACPI video that it has registered a
> > native backlight control interface as Jani has said.
> >
> > Then together with the video.use_native_backlight, we can register or
> > not register ACPI video backlight interface accordingly. Or rather, we
> > can simply not register ACPI video backlight interface for Win8 systems
> > as long as i915 indicates that it has native backlight control(if the
> > native control is broken, i915 should fix it or blacklist it so that
> > i915 will not indicate it has native backlight control and ACPI video
> > will continue to register its own).
> >
> > How does this sound?
> 
> Sounds good to me.
> 
> Before plunging forward, have you observed any difference between the
> boot modes? We have reports [1] that the backlight behaviour is

Not yet from ACPI's point of view.

> different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the
> acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole
> story.

This check in patch 2/2 is a policy: for Win8 system, we think the
native backlight control has a better chance of working than the ACPI
video's, so I think the check is enough in ACPI video.

> 
> Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code
> paths, what guarantees do we have of UEFI+CSM or legacy boots working?

I suppose the 'tested BIOS code paths' means the pure UEFI boot mode? I
don't know what guatantees do we have since I don't know what happened
underneath after the backlight register is set in i915 driver, you or
other i915 driver people should know more than I do :-)

BTW, after the backlight register is set in i915, is it that some
find of firmware code will run in response to the setting of the
register(e.g. the BLC_PWM_CTL/BLC_PWM_CPU_CTL/PCI_LBPC reg)?

Thanks,
Aaron
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Intel/Radeon muxless laptop issues

2013-09-12 Thread Alexander Tarasikov
Hello,
I am writing to report two issues I'm having with a hybrid graphics laptop.
Namely, Sony Vaio SA3S9R with Intel HD3000 and Radeon HD6630M

First one looks like a regression introduced by the commit named
"gpu/vga_switcheroo: add driver control power feature. (v3)". After I echo
the "OFF" command to the switch, the DGPU turns off. However, after that
the vga_switcheroo disappears. Reinserting the radeon module makes
the kernel print the stack trace.

I have uploaded the dmesg log to http://paste.debian.net/38539/ .

Another issue I would like to bring up is the handling of suspend-resume
cycle on muxless laptops. On most intel/radeon laptops, the DGPU turns
on after a resume without an ACPI notification (which sound like a bug).
Switcheroo is still thinking it's off, and it keeps draining
its extra 10W of power. Now, if we do a "echo OFF", switcheroo does nothing.
If we do "echo ON", it fails and gets stuck for some seconds (because it
tries to power-up a powered-up card messing the initialization routine and
then a GPU hang happens). The hacky solution is to turn it on before suspend
and turn it off after via a suspend hook script, but this has several
limitations:
1. Most distros don't ship the script by default leaving the inexperienced
users
with hot laptops
2. Doesn't work well when someone is really using the DGPU
3. I have no idea, but maybe turning it on makes it drain more power in
suspend
(well, since BIOSes tend to be buggy, who knows what happens next).

My hacky solution was to forcibly turn off the DGPU in the switcheroo driver
at resume if it was unused, ignoring its state inside the switcheroo struct.
I have already made a proof-of-concept patch back in 2012, but somehow
that received no attention.
http://lkml.indiana.edu/hypermail/linux/kernel/1204.3/02530.html

Since fixing ACPI for every broken laptop or shipping a suspend hook is a
no-go,
we need some kind of a workaround. I don't really know how it should be
done,
via DMI probing or whatever, but I would like someone who has more knowledge
to look into that. Well, I guess, a solution would be to forcibly enable my
hack
for radeon/intel combos and then blacklist the laptops for which it breaks
the resume
cycle via DMI (i.e., I think that since most laptops behave this way, we
should
make the powersaving behavior default one).

--
Regards, Alexander
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm/msm: fix potential NULL pointer dereference

2013-09-12 Thread Wei Yongjun
From: Wei Yongjun 

The dereference to 'pdata' should be moved below the NULL test.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/msm/msm_gpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index e1e1ec9..6b50e6b 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -29,13 +29,14 @@
 static void bs_init(struct msm_gpu *gpu, struct platform_device *pdev)
 {
struct drm_device *dev = gpu->dev;
-   struct kgsl_device_platform_data *pdata = pdev->dev.platform_data;
+   struct kgsl_device_platform_data *pdata;
 
if (!pdev) {
dev_err(dev->dev, "could not find dtv pdata\n");
return;
}
 
+   pdata = pdev->dev.platform_data;
if (pdata->bus_scale_table) {
gpu->bsc = 
msm_bus_scale_register_client(pdata->bus_scale_table);
DBG("bus scale client: %08x", gpu->bsc);

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm/omap: fix error return code in omap_dmm_probe()

2013-09-12 Thread Wei Yongjun
From: Wei Yongjun 

Fix to return -ENOMEM in the refill memory alloc error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c 
b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
index 9b794c9..ab546c9 100644
--- a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
+++ b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
@@ -675,6 +675,7 @@ static int omap_dmm_probe(struct platform_device *dev)
&omap_dmm->refill_pa, GFP_KERNEL);
if (!omap_dmm->refill_va) {
dev_err(&dev->dev, "could not allocate refill memory\n");
+   ret = -ENOMEM;
goto fail;
}
 

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8

2013-09-12 Thread Matthew Garrett
On Wed, 2013-09-11 at 13:29 +0300, Jani Nikula wrote:
> On Wed, 11 Sep 2013, Matthew Garrett  wrote:
> > On Wed, 2013-09-11 at 11:45 +0300, Jani Nikula wrote:
> >
> >> Before plunging forward, have you observed any difference between the
> >> boot modes? We have reports [1] that the backlight behaviour is
> >> different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the
> >> acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole
> >> story.
> >> 
> >> Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code
> >> paths, what guarantees do we have of UEFI+CSM or legacy boots working?
> >
> > We have no evidence of Windows behaving differently based on the exposed
> > firmware type.
> 
> By "behaving differently", do you mean internally adapting to the boot
> mode, or exhibiting different behaviour to the user?

As far as backlight control goes, both.

> We have evidence of the firmware behaving differently (VBT, backlight)
> based on the boot mode, all else being equal. We don't adapt to that,
> and we fail. I don't know if we should adapt, or do things differently
> altogether. I don't even know if Windows 8 works on all boot modes on
> the machines in question.

Sure, but Windows knows nothing about VBT or opregion-backed backlight
control. That's up to the Intel driver.

-- 
Matthew Garrett 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8

2013-09-12 Thread Matthew Garrett
On Wed, 2013-09-11 at 11:45 +0300, Jani Nikula wrote:

> Before plunging forward, have you observed any difference between the
> boot modes? We have reports [1] that the backlight behaviour is
> different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the
> acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole
> story.
> 
> Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code
> paths, what guarantees do we have of UEFI+CSM or legacy boots working?

We have no evidence of Windows behaving differently based on the exposed
firmware type.

-- 
Matthew Garrett 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] nouveau: fix build eror when VGA_SWITCHEROO is not enabled

2013-09-12 Thread Randy Dunlap
From: Randy Dunlap 

Fix nouveau build error on x86, when ACPI is enabled but
VGA_SWITCHEROO is not enabled, by providing a stub function.

drivers/built-in.o: In function `nouveau_pmops_runtime_suspend':
nouveau_drm.c:(.text+0x3aac89): undefined reference to 
`nouveau_switcheroo_optimus_dsm'

Signed-off-by: Randy Dunlap 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c |1 +
 1 file changed, 1 insertion(+)

--- linux-next-20130911.orig/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ linux-next-20130911/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -373,6 +373,7 @@ void nouveau_unregister_dsm_handler(void
 #else
 void nouveau_register_dsm_handler(void) {}
 void nouveau_unregister_dsm_handler(void) {}
+void nouveau_switcheroo_optimus_dsm(void) {}
 #endif
 
 /* retrieve the ROM in 4k blocks */
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8

2013-09-12 Thread Yves-Alexis Perez
On mer., 2013-09-11 at 08:45 +, Matthew Garrett wrote:
> On Wed, 2013-09-11 at 11:45 +0300, Jani Nikula wrote:
> 
> > Before plunging forward, have you observed any difference between the
> > boot modes? We have reports [1] that the backlight behaviour is
> > different with UEFI vs. UEFI+CSM or legacy boot. So I'm wondering if the
> > acpi_gbl_osi_data >= ACPI_OSI_WIN_8 check in patch 2/2 is the whole
> > story.
> > 
> > Further, if we tell the BIOS we're Windows 8 to use the tested BIOS code
> > paths, what guarantees do we have of UEFI+CSM or legacy boots working?
> 
> We have no evidence of Windows behaving differently based on the exposed
> firmware type.
> 
I do have an X230 which boots fine using pure UEFI or UEFI+CSM (and I
guess I can try booting it with a grml for pure legacy), so I can do
tests if needed.

Regards,
-- 
Yves-Alexis

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 69062] Corrupted visuals in Gnome3 (A6-4400M)

2013-09-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=69062

--- Comment #7 from LRN  ---
Upgraded kernel to b7a5ae97502e104371c7eb3b7b17ae959e50d6f5
No effect.

Tried to build and install older drm and mesa (git revs from ~Jan 2013, since i
remember that things worked back then).
Installed drm - no effect.
Installed xf86-video-ati - no effect.
Tried to build old mesa, got errors. Installed llvm-3.3, made it the default.
Got some more errors. Tinkered a bit more.
At some point i had to reboot. When the laptop booted again, gdm3 showed no
corruption, and neither did gnome3.

Upgraded drm, xf86-video-ati and mesa back to git their respective git master
HEADs. Rebooted. No corruption.

So, yeah. Weird.
Maybe i was building xorg in a wrong order, and my git gyrations made it
right...

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 69245] New: Opencl random lockups whilst running tstellar's opencl-examples

2013-09-12 Thread bugzilla-dae...@freedesktop.org
on :01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[29164.980562] radeon :01:00.0: GPU reset succeeded, trying to resume
[29165.001309] [drm] PCIE GART of 512M enabled (table at 0x00142000).
[29165.001440] radeon :01:00.0: WB enabled
[29165.001444] radeon :01:00.0: fence driver on ring 0 use gpu addr
0x4c00 and cpu addr 0x88020e246c00
[29165.001447] radeon :01:00.0: fence driver on ring 3 use gpu addr
0x4c0c and cpu addr 0x88020e246c0c
[29165.003766] radeon :01:00.0: fence driver on ring 5 use gpu addr
0x0f60c418 and cpu addr 0xc9000579c418
[29165.020514] [drm] ring test on 0 succeeded in 0 usecs
[29165.020576] [drm] ring test on 3 succeeded in 1 usecs
[29165.216744] [drm] ring test on 5 succeeded in 1 usecs
[29165.216748] [drm] UVD initialized successfully.
[29165.216753] [drm] Enabling audio support
[29165.229845] [drm] ib test on ring 0 succeeded in 0 usecs
[29165.229891] [drm] ib test on ring 3 succeeded in 0 usecs
[29165.401525] [drm] ib test on ring 5 succeeded
[29196.684784] radeon :01:00.0: GPU lockup CP stall for more than 1msec
[29196.684793] radeon :01:00.0: GPU lockup (waiting for 0x00441178
last fence id 0x00441176)
[29196.684811] [drm] Disabling audio support
[29196.697087] radeon :01:00.0: fence driver on ring 5 use gpu addr
0x0f60c418 and cpu addr 0xc9000449b418
[29196.698143] radeon :01:00.0: Saved 55 dwords of commands on ring 0.
[29196.698164] radeon :01:00.0: GPU softreset: 0x0009
[29196.698167] radeon :01:00.0:   GRBM_STATUS   = 0xF0701028
[29196.698170] radeon :01:00.0:   GRBM_STATUS_SE0   = 0x9C02
[29196.698172] radeon :01:00.0:   GRBM_STATUS_SE1   = 0x0007
[29196.698174] radeon :01:00.0:   SRBM_STATUS   = 0x20C0
[29196.698177] radeon :01:00.0:   SRBM_STATUS2  = 0x
[29196.698180] radeon :01:00.0:   R_008674_CP_STALLED_STAT1 = 0x
[29196.698182] radeon :01:00.0:   R_008678_CP_STALLED_STAT2 = 0x400C
[29196.698185] radeon :01:00.0:   R_00867C_CP_BUSY_STAT = 0x00058004
[29196.698187] radeon :01:00.0:   R_008680_CP_STAT  = 0x80268647
[29196.698190] radeon :01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[29196.714799] radeon :01:00.0: GRBM_SOFT_RESET=0x7F6B
[29196.714855] radeon :01:00.0: SRBM_SOFT_RESET=0x0100
[29196.716015] radeon :01:00.0:   GRBM_STATUS   = 0x3828
[29196.716019] radeon :01:00.0:   GRBM_STATUS_SE0   = 0x0007
[29196.716021] radeon :01:00.0:   GRBM_STATUS_SE1   = 0x0007
[29196.716024] radeon :01:00.0:   SRBM_STATUS   = 0x20C0
[29196.716026] radeon :01:00.0:   SRBM_STATUS2  = 0x
[29196.716029] radeon :01:00.0:   R_008674_CP_STALLED_STAT1 = 0x
[29196.716031] radeon :01:00.0:   R_008678_CP_STALLED_STAT2 = 0x
[29196.716034] radeon :01:00.0:   R_00867C_CP_BUSY_STAT = 0x
[29196.716036] radeon :01:00.0:   R_008680_CP_STAT  = 0x
[29196.716039] radeon :01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[29196.716055] radeon :01:00.0: GPU reset succeeded, trying to resume
[29196.736296] [drm] PCIE GART of 512M enabled (table at 0x00142000).
[29196.736421] radeon :01:00.0: WB enabled
[29196.736425] radeon :01:00.0: fence driver on ring 0 use gpu addr
0x4c00 and cpu addr 0x88020e246c00
[29196.736427] radeon :01:00.0: fence driver on ring 3 use gpu addr
0x4c0c and cpu addr 0x88020e246c0c
[29196.738058] radeon :01:00.0: fence driver on ring 5 use gpu addr
0x0bf38418 and cpu addr 0xc9000579c418
[29196.754590] [drm] ring test on 0 succeeded in 1 usecs
[29196.754651] [drm] ring test on 3 succeeded in 1 usecs
[29196.950834] [drm] ring test on 5 succeeded in 1 usecs
[29196.950839] [drm] UVD initialized successfully.
[29196.950844] [drm] Enabling audio support
[29196.950905] [drm] ib test on ring 0 succeeded in 0 usecs
[29196.950961] [drm] ib test on ring 3 succeeded in 1 usecs
[29197.122612] [drm] ib test on ring 5 succeeded


On a related note after the restart the default power profile is loaded instead
of the previous one, but I can post another bug for this issue if desired.

If I can provide any other information which can be of help debugging this
please tell me.

Thanks a lot for your time and efforts in the project :)
klondike

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20130912/286ab64e/attachment-0001.html>