[PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-09-08 Thread Michel Dänzer
On Mit, 2011-08-10 at 09:27 -0400, Alex Deucher wrote: 
> 2011/8/10 Michel D?nzer :
> > On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote:
> >> Added a check for the radeon ring buffer write index in r600.c which
> >> reads 0x on resume. This results in an Oops during
> >> radeon_ring_write. Masking the value averts this.
> >>
> >> This problem is not seen to be fixed in 3.0 r600.c as well.
> >>
> >> Detailed analysis of the problem can be found at -
> >>
> >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/
> >>
> >> ---
> >>
> >> BUG: unable to handle kernel paging request at fa501ffc - Oops at
> >> r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]
> >>
> >> drivers/gpu/drm/radeon/r600.c
> >>
> >>
> >>
> >> --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05
> >> 15:39:40.824612700 +0530
> >> +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08
> >> 05:29:21.744417857 +0530
> >> @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device
> >>
> >>   rdev->cp.rptr = RREG32(CP_RB_RPTR);
> >>   rdev->cp.wptr = RREG32(CP_RB_WPTR);
> >> +/* protect against crazy HW on resume */
> >> +rdev->cp.wptr &= rdev->cp.ptr_mask;
> >
> > Although the same workaround is already in r100.c, I wonder if we
> > shouldn't rather try and eliminate all reads from the CP_RB_WPTR
> > register, at least other than for debugging purposes. Alex, what do you
> > think?
> 
> Either this or reset the registers to 0 or a saved value on resume
> rather than reading from them.

The patch below is what I had in mind. Does this fix the problem above?



Re: [PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-09-08 Thread Michel Dänzer
On Mit, 2011-08-10 at 09:27 -0400, Alex Deucher wrote: 
 2011/8/10 Michel Dänzer mic...@daenzer.net:
  On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote:
  Added a check for the radeon ring buffer write index in r600.c which
  reads 0x on resume. This results in an Oops during
  radeon_ring_write. Masking the value averts this.
 
  This problem is not seen to be fixed in 3.0 r600.c as well.
 
  Detailed analysis of the problem can be found at -
 
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/
 
  ---
 
  BUG: unable to handle kernel paging request at fa501ffc - Oops at
  r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]
 
  drivers/gpu/drm/radeon/r600.c
 
 
 
  --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05
  15:39:40.824612700 +0530
  +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08
  05:29:21.744417857 +0530
  @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device
 
rdev-cp.rptr = RREG32(CP_RB_RPTR);
rdev-cp.wptr = RREG32(CP_RB_WPTR);
  +/* protect against crazy HW on resume */
  +rdev-cp.wptr = rdev-cp.ptr_mask;
 
  Although the same workaround is already in r100.c, I wonder if we
  shouldn't rather try and eliminate all reads from the CP_RB_WPTR
  register, at least other than for debugging purposes. Alex, what do you
  think?
 
 Either this or reset the registers to 0 or a saved value on resume
 rather than reading from them.

The patch below is what I had in mind. Does this fix the problem above?


From c564bc8e6d449216d74ee134d5bf470221f79e8d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= michel.daen...@amd.com
Date: Thu, 8 Sep 2011 11:09:39 +0200
Subject: [PATCH] drm/radeon: Don't read from CP ring write pointer registers.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Apparently this doesn't always work reliably, e.g. at resume time.

Just initialize to 0, so the ring is considered empty.

Tested with hibernation on Sumo and Cayman cards.

Should fix https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/ .

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 drivers/gpu/drm/radeon/evergreen.c |4 ++--
 drivers/gpu/drm/radeon/ni.c|   12 ++--
 drivers/gpu/drm/radeon/r100.c  |6 ++
 drivers/gpu/drm/radeon/r600.c  |4 ++--
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 15bd047..f2bd90a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1378,7 +1378,8 @@ int evergreen_cp_resume(struct radeon_device *rdev)
/* Initialize the ring buffer's read and write pointers */
WREG32(CP_RB_CNTL, tmp | RB_RPTR_WR_ENA);
WREG32(CP_RB_RPTR_WR, 0);
-   WREG32(CP_RB_WPTR, 0);
+   rdev-cp.wptr = 0;
+   WREG32(CP_RB_WPTR, rdev-cp.wptr);
 
/* set the wb address wether it's enabled or not */
WREG32(CP_RB_RPTR_ADDR,
@@ -1403,7 +1404,6 @@ int evergreen_cp_resume(struct radeon_device *rdev)
WREG32(CP_DEBUG, (1  27) | (1  28));
 
rdev-cp.rptr = RREG32(CP_RB_RPTR);
-   rdev-cp.wptr = RREG32(CP_RB_WPTR);
 
evergreen_cp_start(rdev);
rdev-cp.ready = true;
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 559dbd4..e3489ee 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1182,7 +1182,8 @@ int cayman_cp_resume(struct radeon_device *rdev)
 
/* Initialize the ring buffer's read and write pointers */
WREG32(CP_RB0_CNTL, tmp | RB_RPTR_WR_ENA);
-   WREG32(CP_RB0_WPTR, 0);
+   rdev-cp.wptr = 0;
+   WREG32(CP_RB0_WPTR, rdev-cp.wptr);
 
/* set the wb address wether it's enabled or not */
WREG32(CP_RB0_RPTR_ADDR, (rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 
 0xFFFC);
@@ -1202,7 +1203,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
WREG32(CP_RB0_BASE, rdev-cp.gpu_addr  8);
 
rdev-cp.rptr = RREG32(CP_RB0_RPTR);
-   rdev-cp.wptr = RREG32(CP_RB0_WPTR);
 
/* ring1  - compute only */
/* Set ring buffer size */
@@ -1215,7 +1215,8 @@ int cayman_cp_resume(struct radeon_device *rdev)
 
/* Initialize the ring buffer's read and write pointers */
WREG32(CP_RB1_CNTL, tmp | RB_RPTR_WR_ENA);
-   WREG32(CP_RB1_WPTR, 0);
+   rdev-cp1.wptr = 0;
+   WREG32(CP_RB1_WPTR, rdev-cp1.wptr);
 
/* set the wb address wether it's enabled or not */
WREG32(CP_RB1_RPTR_ADDR, (rdev-wb.gpu_addr + 
RADEON_WB_CP1_RPTR_OFFSET)  0xFFFC);
@@ -1227,7 +1228,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
WREG32(CP_RB1_BASE, rdev-cp1.gpu_addr  8);
 
rdev-cp1.rptr = RREG32(CP_RB1_RPTR);
-   rdev-cp1.wptr = RREG32(CP_RB1_WPTR);
 
/* ring2 - compute only */
/* Set ring buffer size */
@@ -1240,7 +1240,8 @@ int 

[PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Mayank Rungta

On 08/10/2011 02:54 PM, Michel D?nzer wrote:

> On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote:
>> Added a check for the radeon ring buffer write index in r600.c which
>> reads 0x on resume. This results in an Oops during
>> radeon_ring_write. Masking the value averts this.
>>
>> This problem is not seen to be fixed in 3.0 r600.c as well.
>>
>> Detailed analysis of the problem can be found at -
>>
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/
>>
>> ---
>>
>> BUG: unable to handle kernel paging request at fa501ffc - Oops at
>> r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]
>>
>> drivers/gpu/drm/radeon/r600.c
>>
>>
>>
>> --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05
>> 15:39:40.824612700 +0530
>> +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08
>> 05:29:21.744417857 +0530
>> @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device
>>
>>rdev->cp.rptr = RREG32(CP_RB_RPTR);
>>rdev->cp.wptr = RREG32(CP_RB_WPTR);
>> +/* protect against crazy HW on resume */
>> +rdev->cp.wptr&= rdev->cp.ptr_mask;
>
> The indentation of the lines you're adding doesn't match the surrounding
> lines.
Sorry. This looked fine in the mail I sent. I shall be careful in future.

>
>
> Although the same workaround is already in r100.c, I wonder if we
> shouldn't rather try and eliminate all reads from the CP_RB_WPTR
> register, at least other than for debugging purposes. Alex, what do you
> think?
>
> Otherwise, this should probably be added in evergreen.c as well.
>
>
>>   Developer's Certificate of Origin 1.1
>>
>>   [...]
>
> No need to include all this text, just the *-by: tags are enough.
>
Point taken.



[PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Michel Dänzer
On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote: 
> Added a check for the radeon ring buffer write index in r600.c which 
> reads 0x on resume. This results in an Oops during 
> radeon_ring_write. Masking the value averts this.
> 
> This problem is not seen to be fixed in 3.0 r600.c as well.
> 
> Detailed analysis of the problem can be found at -
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/
> 
> ---
> 
> BUG: unable to handle kernel paging request at fa501ffc - Oops at 
> r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]
> 
> drivers/gpu/drm/radeon/r600.c
> 
> 
> 
> --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05 
> 15:39:40.824612700 +0530
> +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08 
> 05:29:21.744417857 +0530
> @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device
> 
>   rdev->cp.rptr = RREG32(CP_RB_RPTR);
>   rdev->cp.wptr = RREG32(CP_RB_WPTR);
> +/* protect against crazy HW on resume */
> +rdev->cp.wptr &= rdev->cp.ptr_mask;

The indentation of the lines you're adding doesn't match the surrounding
lines.


Although the same workaround is already in r100.c, I wonder if we
shouldn't rather try and eliminate all reads from the CP_RB_WPTR
register, at least other than for debugging purposes. Alex, what do you
think?

Otherwise, this should probably be added in evergreen.c as well.


>  Developer's Certificate of Origin 1.1
> 
>  [...]

No need to include all this text, just the *-by: tags are enough.


-- 
Earthling Michel D?nzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


[PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Mayank Rungta
Added a check for the radeon ring buffer write index in r600.c which 
reads 0x on resume. This results in an Oops during 
radeon_ring_write. Masking the value averts this.

This problem is not seen to be fixed in 3.0 r600.c as well.

Detailed analysis of the problem can be found at -

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/

---

BUG: unable to handle kernel paging request at fa501ffc - Oops at 
r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]

drivers/gpu/drm/radeon/r600.c



--- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05 
15:39:40.824612700 +0530
+++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08 
05:29:21.744417857 +0530
@@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device

  rdev->cp.rptr = RREG32(CP_RB_RPTR);
  rdev->cp.wptr = RREG32(CP_RB_WPTR);
+/* protect against crazy HW on resume */
+rdev->cp.wptr &= rdev->cp.ptr_mask;

  r600_cp_start(rdev);
  rdev->cp.ready = true;


 Developer's Certificate of Origin 1.1

 By making a contribution to this project, I certify that:

 (a) The contribution was created in whole or in part by me and I
 have the right to submit it under the open source license
 indicated in the file; or

 (b) The contribution is based upon previous work that, to the best
 of my knowledge, is covered under an appropriate open source
 license and I have the right under that license to submit that
 work with modifications, whether created in whole or in part
 by me, under the same open source license (unless I am
 permitted to submit under a different license), as indicated
 in the file; or

 (c) The contribution was provided directly to me by some other
 person who certified (a), (b) or (c) and I have not modified
 it.

 (d) I understand and agree that this project and the contribution
 are public and that a record of the contribution (including all
 personal information I submit with it, including my sign-off) is
 maintained indefinitely and may be redistributed consistent with
 this project or the open source license(s) involved.


 Signed-off-by: A R Karthick 

 Reported-by: Mayank Rungta 

 Tested-by: Mayank Rungta 


[PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Mayank Rungta
Added a check for the radeon ring buffer write index in r600.c which 
reads 0x on resume. This results in an Oops during 
radeon_ring_write. Masking the value averts this.


This problem is not seen to be fixed in 3.0 r600.c as well.

Detailed analysis of the problem can be found at -

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/

---

BUG: unable to handle kernel paging request at fa501ffc - Oops at 
r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]


drivers/gpu/drm/radeon/r600.c



--- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05 
15:39:40.824612700 +0530
+++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08 
05:29:21.744417857 +0530

@@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device

 rdev-cp.rptr = RREG32(CP_RB_RPTR);
 rdev-cp.wptr = RREG32(CP_RB_WPTR);
+/* protect against crazy HW on resume */
+rdev-cp.wptr = rdev-cp.ptr_mask;

 r600_cp_start(rdev);
 rdev-cp.ready = true;


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or

(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or

(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.

(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.


Signed-off-by: A R Karthick a.r.karth...@gmail.com

Reported-by: Mayank Rungta mr.m...@gmail.com

Tested-by: Mayank Rungta mr.m...@gmail.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Michel Dänzer
On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote: 
 Added a check for the radeon ring buffer write index in r600.c which 
 reads 0x on resume. This results in an Oops during 
 radeon_ring_write. Masking the value averts this.
 
 This problem is not seen to be fixed in 3.0 r600.c as well.
 
 Detailed analysis of the problem can be found at -
 
 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/
 
 ---
 
 BUG: unable to handle kernel paging request at fa501ffc - Oops at 
 r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]
 
 drivers/gpu/drm/radeon/r600.c
 
 
 
 --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05 
 15:39:40.824612700 +0530
 +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08 
 05:29:21.744417857 +0530
 @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device
 
   rdev-cp.rptr = RREG32(CP_RB_RPTR);
   rdev-cp.wptr = RREG32(CP_RB_WPTR);
 +/* protect against crazy HW on resume */
 +rdev-cp.wptr = rdev-cp.ptr_mask;

The indentation of the lines you're adding doesn't match the surrounding
lines.


Although the same workaround is already in r100.c, I wonder if we
shouldn't rather try and eliminate all reads from the CP_RB_WPTR
register, at least other than for debugging purposes. Alex, what do you
think?

Otherwise, this should probably be added in evergreen.c as well.


  Developer's Certificate of Origin 1.1
 
  [...]

No need to include all this text, just the *-by: tags are enough.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Alex Deucher
2011/8/10 Michel Dänzer mic...@daenzer.net:
 On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote:
 Added a check for the radeon ring buffer write index in r600.c which
 reads 0x on resume. This results in an Oops during
 radeon_ring_write. Masking the value averts this.

 This problem is not seen to be fixed in 3.0 r600.c as well.

 Detailed analysis of the problem can be found at -

 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/

 ---

 BUG: unable to handle kernel paging request at fa501ffc - Oops at
 r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]

 drivers/gpu/drm/radeon/r600.c



 --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig    2011-08-05
 15:39:40.824612700 +0530
 +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c    2011-08-08
 05:29:21.744417857 +0530
 @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device

       rdev-cp.rptr = RREG32(CP_RB_RPTR);
       rdev-cp.wptr = RREG32(CP_RB_WPTR);
 +    /* protect against crazy HW on resume */
 +    rdev-cp.wptr = rdev-cp.ptr_mask;

 The indentation of the lines you're adding doesn't match the surrounding
 lines.


 Although the same workaround is already in r100.c, I wonder if we
 shouldn't rather try and eliminate all reads from the CP_RB_WPTR
 register, at least other than for debugging purposes. Alex, what do you
 think?

Either this or reset the registers to 0 or a saved value on resume
rather than reading from them.


 Otherwise, this should probably be added in evergreen.c as well.


Yes, and ni.c as well.

Alex


          Developer's Certificate of Origin 1.1

          [...]

 No need to include all this text, just the *-by: tags are enough.


 --
 Earthling Michel Dänzer           |                   http://www.amd.com
 Libre software enthusiast         |          Debian, X and DRI developer
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2.6.38-10-generic] device driver: fix oops in radeon driver due to incorrect value from hardware

2011-08-10 Thread Mayank Rungta


On 08/10/2011 02:54 PM, Michel Dänzer wrote:


On Die, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote:

Added a check for the radeon ring buffer write index in r600.c which
reads 0x on resume. This results in an Oops during
radeon_ring_write. Masking the value averts this.

This problem is not seen to be fixed in 3.0 r600.c as well.

Detailed analysis of the problem can be found at -

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/

---

BUG: unable to handle kernel paging request at fa501ffc - Oops at
r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]

drivers/gpu/drm/radeon/r600.c



--- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig2011-08-05
15:39:40.824612700 +0530
+++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c2011-08-08
05:29:21.744417857 +0530
@@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device

   rdev-cp.rptr = RREG32(CP_RB_RPTR);
   rdev-cp.wptr = RREG32(CP_RB_WPTR);
+/* protect against crazy HW on resume */
+rdev-cp.wptr= rdev-cp.ptr_mask;


The indentation of the lines you're adding doesn't match the surrounding
lines.

Sorry. This looked fine in the mail I sent. I shall be careful in future.




Although the same workaround is already in r100.c, I wonder if we
shouldn't rather try and eliminate all reads from the CP_RB_WPTR
register, at least other than for debugging purposes. Alex, what do you
think?

Otherwise, this should probably be added in evergreen.c as well.



  Developer's Certificate of Origin 1.1

  [...]


No need to include all this text, just the *-by: tags are enough.


Point taken.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel