Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780

2013-07-10 Thread Markus Trippelsdorf
On 2013.07.10 at 11:56 +0200, Maarten Lankhorst wrote:
> On 10-07-13 11:46, Markus Trippelsdorf wrote:
> > On 2013.07.10 at 11:29 +0200, Maarten Lankhorst wrote:
> >> On 10-07-13 11:22, Markus Trippelsdorf wrote:
> >>> By simply copy/pasting a big document under LibreOffice my system hangs
> >>> itself up. Only a hard reset gets it working again.
> >>> see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
> >>>
> >>> I've bisected the issue to:
> >>>
> >>> commit ecff665f5e3f1c6909353e00b9420e45ae23d995
> >>> Author: Maarten Lankhorst 
> >>> Date:   Thu Jun 27 13:48:17 2013 +0200
> >>>
> >>> drm/ttm: make ttm reservation calls behave like reservation calls
> >>> 
> >>> This commit converts the source of the val_seq counter to
> >>> the ww_mutex api. The reservation objects are converted later,
> >>> because there is still a lockdep splat in nouveau that has to
> >>> be resolved first.
> >>> 
> >>> Signed-off-by: Maarten Lankhorst 
> >>> Reviewed-by: Jerome Glisse 
> >>> Signed-off-by: Dave Airlie 
> >> Hey,
> >>
> >> Can you try current head with CONFIG_PROVE_LOCKING set and post the
> >> lockdep splat from dmesg, if any? If there is any locking issue
> >> lockdep should warn about it.  Lockdep will turn itself off after the
> >> first splat, so if the lockdep splat happens before running the
> >> affected parts those will have to be fixed first.
> > There was an unrelated EDAC lockdep splat, so I simply disabled it.
> >
> > This is what I get:
> >
> > Jul 10 11:40:44 x4 kernel: 
> > Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
> > Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
> > Jul 10 11:40:44 x4 kernel: 
> > Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
> > Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
> > Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [813279f0] radeon_bo_list_validate+0x20/0xd0
> > Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [81309306] ttm_eu_reserve_buffers+0x126/0x4b0
> > Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
> > Jul 10 11:40:53 x4 kernel: Emergency Sync complete
> >
> Thanks, exactly what I thought. I missed a backoff somewhere..
> 
> Does the below patch fix it?

Yes. Thank you for your quick reply.

-- 
Markus


Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780

2013-07-10 Thread Maarten Lankhorst
On 10-07-13 11:46, Markus Trippelsdorf wrote:
> On 2013.07.10 at 11:29 +0200, Maarten Lankhorst wrote:
>> On 10-07-13 11:22, Markus Trippelsdorf wrote:
>>> By simply copy/pasting a big document under LibreOffice my system hangs
>>> itself up. Only a hard reset gets it working again.
>>> see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
>>>
>>> I've bisected the issue to:
>>>
>>> commit ecff665f5e3f1c6909353e00b9420e45ae23d995
>>> Author: Maarten Lankhorst 
>>> Date:   Thu Jun 27 13:48:17 2013 +0200
>>>
>>> drm/ttm: make ttm reservation calls behave like reservation calls
>>> 
>>> This commit converts the source of the val_seq counter to
>>> the ww_mutex api. The reservation objects are converted later,
>>> because there is still a lockdep splat in nouveau that has to
>>> be resolved first.
>>> 
>>> Signed-off-by: Maarten Lankhorst 
>>> Reviewed-by: Jerome Glisse 
>>> Signed-off-by: Dave Airlie 
>> Hey,
>>
>> Can you try current head with CONFIG_PROVE_LOCKING set and post the
>> lockdep splat from dmesg, if any? If there is any locking issue
>> lockdep should warn about it.  Lockdep will turn itself off after the
>> first splat, so if the lockdep splat happens before running the
>> affected parts those will have to be fixed first.
> There was an unrelated EDAC lockdep splat, so I simply disabled it.
>
> This is what I get:
>
> Jul 10 11:40:44 x4 kernel: 
> Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
> Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
> Jul 10 11:40:44 x4 kernel: 
> Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
> Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
> Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [813279f0] radeon_bo_list_validate+0x20/0xd0
> Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [81309306] ttm_eu_reserve_buffers+0x126/0x4b0
> Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
> Jul 10 11:40:53 x4 kernel: Emergency Sync complete
>
Thanks, exactly what I thought. I missed a backoff somewhere..

Does the below patch fix it?

---
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 0219d26..2020bf4 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -377,6 +377,7 @@ int radeon_bo_list_validate(struct ww_acquire_ctx *ticket,
 					domain = lobj->alt_domain;
 					goto retry;
 				}
+				ttm_eu_backoff_reservation(ticket, head);
 				return r;
 			}
 		}
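
The idea behind the one-line fix can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the radeon or TTM code: struct fake_bo and every fake_* function are hypothetical stand-ins, with a boolean modelling the per-buffer reservation lock. It shows that an error return taken after all buffers were reserved leaves every reservation held unless the error path backs them off first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object with a reservation lock. */
struct fake_bo {
    bool reserved;     /* models the per-buffer reservation being held */
    int validate_err;  /* error that validation reports, 0 for success */
};

/* Reserve every buffer in the list (models the "reserve buffers" step). */
static void fake_reserve_all(struct fake_bo *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        list[i].reserved = true;
}

/* Drop every reservation (models the "backoff reservation" step). */
static void fake_backoff_all(struct fake_bo *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        list[i].reserved = false;
}

/* Count reservations still held; nonzero after an error means a leak. */
static size_t fake_count_held(const struct fake_bo *list, size_t n)
{
    size_t held = 0;

    for (size_t i = 0; i < n; i++)
        if (list[i].reserved)
            held++;
    return held;
}

/*
 * Reserve all buffers, then validate them one by one. On the first
 * validation error, return it. With backoff_on_error == false this
 * mimics the buggy path: the function returns with locks still held.
 * On success the reservations are intentionally left held; in this
 * sketch the caller is assumed to release them later.
 */
static int fake_list_validate(struct fake_bo *list, size_t n,
                              bool backoff_on_error)
{
    fake_reserve_all(list, n);

    for (size_t i = 0; i < n; i++) {
        int r = list[i].validate_err;

        if (r) {
            if (backoff_on_error)
                fake_backoff_all(list, n);
            return r;
        }
    }
    return 0;
}
```

Calling fake_list_validate() with backoff_on_error set to false on a list containing one failing buffer returns the error with all reservations still marked held, which is the situation the "lock held when returning to user space" report describes. With the flag set to true, the same call returns the same error with nothing held.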



Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780

2013-07-10 Thread Markus Trippelsdorf
On 2013.07.10 at 11:29 +0200, Maarten Lankhorst wrote:
> On 10-07-13 11:22, Markus Trippelsdorf wrote:
> > By simply copy/pasting a big document under LibreOffice my system hangs
> > itself up. Only a hard reset gets it working again.
> > see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
> >
> > I've bisected the issue to:
> >
> > commit ecff665f5e3f1c6909353e00b9420e45ae23d995
> > Author: Maarten Lankhorst 
> > Date:   Thu Jun 27 13:48:17 2013 +0200
> >
> > drm/ttm: make ttm reservation calls behave like reservation calls
> > 
> > This commit converts the source of the val_seq counter to
> > the ww_mutex api. The reservation objects are converted later,
> > because there is still a lockdep splat in nouveau that has to
> > be resolved first.
> > 
> > Signed-off-by: Maarten Lankhorst 
> > Reviewed-by: Jerome Glisse 
> > Signed-off-by: Dave Airlie 
> Hey,
> 
> Can you try current head with CONFIG_PROVE_LOCKING set and post the
> lockdep splat from dmesg, if any? If there is any locking issue
> lockdep should warn about it.  Lockdep will turn itself off after the
> first splat, so if the lockdep splat happens before running the
> affected parts those will have to be fixed first.

There was an unrelated EDAC lockdep splat, so I simply disabled it.

This is what I get:

Jul 10 11:40:44 x4 kernel: 
Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
Jul 10 11:40:44 x4 kernel: 
Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [813279f0] radeon_bo_list_validate+0x20/0xd0
Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [81309306] ttm_eu_reserve_buffers+0x126/0x4b0
Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
Jul 10 11:40:53 x4 kernel: Emergency Sync complete

-- 
Markus


Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780

2013-07-10 Thread Maarten Lankhorst
On 10-07-13 11:22, Markus Trippelsdorf wrote:
> By simply copy/pasting a big document under LibreOffice my system hangs
> itself up. Only a hard reset gets it working again.
> see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
>
> I've bisected the issue to:
>
> commit ecff665f5e3f1c6909353e00b9420e45ae23d995
> Author: Maarten Lankhorst 
> Date:   Thu Jun 27 13:48:17 2013 +0200
>
> drm/ttm: make ttm reservation calls behave like reservation calls
> 
> This commit converts the source of the val_seq counter to
> the ww_mutex api. The reservation objects are converted later,
> because there is still a lockdep splat in nouveau that has to
> be resolved first.
> 
> Signed-off-by: Maarten Lankhorst 
> Reviewed-by: Jerome Glisse 
> Signed-off-by: Dave Airlie 
Hey,

Can you try current head with CONFIG_PROVE_LOCKING set and post the lockdep 
splat from dmesg, if any? If there is any locking issue lockdep should warn 
about it.
Lockdep will turn itself off after the first splat, so if the lockdep splat 
happens before running the affected parts those will have to be fixed first.

~Maarten
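
The "turns itself off after the first splat" behaviour can be pictured with a small userspace C sketch. It is a simplified model under stated assumptions, not lockdep's implementation: checker_enabled and report_violation() are hypothetical names. The point is that a one-shot checker silently drops every problem after the first one it reports, so an earlier unrelated report hides the one being hunted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag: models "lock debugging is still active". */
static bool checker_enabled = true;

/*
 * Report a locking violation. Returns true only if this call actually
 * printed the report. After the first report the checker disables
 * itself, so later violations go unreported.
 */
static bool report_violation(const char *what)
{
    if (!checker_enabled)
        return false;        /* already tripped once: stay silent */

    checker_enabled = false; /* one-shot: disable after first report */
    fprintf(stderr, "BUG: %s\n", what);
    return true;
}
```

In this model, if an unrelated violation is reported first, a later call for the violation of interest returns false and prints nothing. That is why an earlier, unrelated splat has to be fixed or avoided before the relevant one can show up.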


Commit ecff665f5e3f (drm/ttm: make ttm reservation calls...) causes system hang on Radeon RS780

2013-07-10 Thread Markus Trippelsdorf
By simply copy/pasting a big document under LibreOffice my system hangs
itself up. Only a hard reset gets it working again.
see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551

I've bisected the issue to:

commit ecff665f5e3f1c6909353e00b9420e45ae23d995
Author: Maarten Lankhorst 
Date:   Thu Jun 27 13:48:17 2013 +0200

drm/ttm: make ttm reservation calls behave like reservation calls

This commit converts the source of the val_seq counter to
the ww_mutex api. The reservation objects are converted later,
because there is still a lockdep splat in nouveau that has to
be resolved first.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Jerome Glisse 
Signed-off-by: Dave Airlie 

-- 
Markus


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

