[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-22 Thread Brad Figg
This patch was picked up as part of an upstream stable release. This
commit first appeared in 3.0.0-13.21.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/820746

Title:
  Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with
  radeon

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-17 Thread a.r.karth...@gmail.com
Patch for this bug that also addresses other chipsets affected on
resume.


** Patch added: Fix for 820746
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2636981/+files/video_820746.patch



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-16 Thread Brad Figg
@a-r-karthick,

When I try to apply that patch, it seems to be corrupt. Can you just add
the patch email as you received it as an attachment to this bug, and I'll
see if I can apply that. Thanks.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-15 Thread Brad Figg
@a-r-karthick,

I don't see this patch in Linus' tree. Has this been submitted upstream?

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-15 Thread a.r.karth...@gmail.com
@brad-figg

Yes, we did mail lkml and dri-devel. They acked it back then and
proposed to resolve it by just resetting the ring buffer write index to
0 on resume to be safe.

But I'm not sure it got submitted upstream, as they had asked for
confirmation. I am sure the patch would have worked, but I only debugged
the issue that @mynk reproduced on his hardware, as it is specific to his
setup. Since I didn't have access to the hardware and couldn't reproduce
it locally, I just debugged it with the objdump. Maybe you should pitch
for it upstream if it hasn't been merged.

Here is the mail that was sent back in response to our submission:


From c564bc8e6d449216d74ee134d5bf470221f79e8d Mon Sep 17 00:00:00 2001
From: Michel Dänzer <michel.daen...@amd.com>
Date: Thu, 8 Sep 2011 11:09:39 +0200
Subject: [PATCH] drm/radeon: Don't read from CP ring write pointer registers.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit


The patch below is what I had in mind. Does this fix the problem above?


Apparently this doesn't always work reliably, e.g. at resume time.

Just initialize to 0, so the ring is considered empty.

Tested with hibernation on Sumo and Cayman cards.

Should fix https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/ .

Signed-off-by: Michel Dänzer <michel.daen...@amd.com>
---
 drivers/gpu/drm/radeon/evergreen.c |    4 ++--
 drivers/gpu/drm/radeon/ni.c        |   12 ++++++------
 drivers/gpu/drm/radeon/r100.c      |    6 ++----
 drivers/gpu/drm/radeon/r600.c      |    4 ++--
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 15bd047..f2bd90a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1378,7 +1378,8 @@ int evergreen_cp_resume(struct radeon_device *rdev)
    /* Initialize the ring buffer's read and write pointers */
    WREG32(CP_RB_CNTL, tmp | RB_RPTR_WR_ENA);
    WREG32(CP_RB_RPTR_WR, 0);
-   WREG32(CP_RB_WPTR, 0);
+   rdev->cp.wptr = 0;
+   WREG32(CP_RB_WPTR, rdev->cp.wptr);

    /* set the wb address wether it's enabled or not */
    WREG32(CP_RB_RPTR_ADDR,
@@ -1403,7 +1404,6 @@ int evergreen_cp_resume(struct radeon_device *rdev)
    WREG32(CP_DEBUG, (1 << 27) | (1 << 28));

    rdev->cp.rptr = RREG32(CP_RB_RPTR);
-   rdev->cp.wptr = RREG32(CP_RB_WPTR);

    evergreen_cp_start(rdev);
    rdev->cp.ready = true;
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 559dbd4..e3489ee 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1182,7 +1182,8 @@ int cayman_cp_resume(struct radeon_device *rdev)

    /* Initialize the ring buffer's read and write pointers */
    WREG32(CP_RB0_CNTL, tmp | RB_RPTR_WR_ENA);
-   WREG32(CP_RB0_WPTR, 0);
+   rdev->cp.wptr = 0;
+   WREG32(CP_RB0_WPTR, rdev->cp.wptr);

    /* set the wb address wether it's enabled or not */
    WREG32(CP_RB0_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) & 0xFFFFFFFC);
@@ -1202,7 +1203,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
    WREG32(CP_RB0_BASE, rdev->cp.gpu_addr >> 8);

    rdev->cp.rptr = RREG32(CP_RB0_RPTR);
-   rdev->cp.wptr = RREG32(CP_RB0_WPTR);

    /* ring1  - compute only */
    /* Set ring buffer size */
@@ -1215,7 +1215,8 @@ int cayman_cp_resume(struct radeon_device *rdev)

    /* Initialize the ring buffer's read and write pointers */
    WREG32(CP_RB1_CNTL, tmp | RB_RPTR_WR_ENA);
-   WREG32(CP_RB1_WPTR, 0);
+   rdev->cp1.wptr = 0;
+   WREG32(CP_RB1_WPTR, rdev->cp1.wptr);

    /* set the wb address wether it's enabled or not */
    WREG32(CP_RB1_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP1_RPTR_OFFSET) & 0xFFFFFFFC);
@@ -1227,7 +1228,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
    WREG32(CP_RB1_BASE, rdev->cp1.gpu_addr >> 8);

    rdev->cp1.rptr = RREG32(CP_RB1_RPTR);
-   rdev->cp1.wptr = RREG32(CP_RB1_WPTR);

    /* ring2 - compute only */
    /* Set ring buffer size */
@@ -1240,7 +1240,8 @@ int cayman_cp_resume(struct radeon_device *rdev)

    /* Initialize the ring buffer's read and write pointers */
    WREG32(CP_RB2_CNTL, tmp | RB_RPTR_WR_ENA);
-   WREG32(CP_RB2_WPTR, 0);
+   rdev->cp2.wptr = 0;
+   WREG32(CP_RB2_WPTR, rdev->cp2.wptr);

    /* set the wb address wether it's enabled or not */
    WREG32(CP_RB2_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP2_RPTR_OFFSET) & 0xFFFFFFFC);
@@ -1252,7 +1253,6 @@ int cayman_cp_resume(struct radeon_device *rdev)
    WREG32(CP_RB2_BASE, rdev->cp2.gpu_addr >> 8);

    rdev->cp2.rptr = RREG32(CP_RB2_RPTR);
-   rdev->cp2.wptr = RREG32(CP_RB2_WPTR);

    /* start the rings */
    cayman_cp_start(rdev);
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index f2204cb..11e44a3 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -990,7 

[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-15 Thread Ubuntu Foundation's Bug Bot
** Tags added: patch



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-15 Thread Brad Figg
@a-r-karthick,

I'd be happy to build some kernels with that patch applied if someone is
willing to test them. I'll do it first thing tomorrow.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-12-15 Thread Mynk
I have changed my setup and I don't remember seeing this issue on linux
3.0.x kernel. I will need to check if I can reproduce the problem. If I
can I am willing to test the same.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-15 Thread Brad Figg
** Changed in: linux (Ubuntu)
   Status: New => Confirmed



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-07 Thread Mynk
@a-r-karthick: the patch works fine. I saw some r600 errors that you
might want to check; the log is attached.

** Attachment added: kern.log
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2260727/+files/kern.log



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-07 Thread a.r.karth...@gmail.com
@Mynk: Thanks for your efforts in testing the patch. I know you spent
sleepless nights getting this verified amidst other preemptions.

Regarding the error log on resume, it's a different issue altogether. It
has to do with the fact that radeon_pcie_gart_enable never really
happened (the GART being the graphics address remapping table, an
IOMMU of sorts for PCI Express devices). So on device startup, GPU
acceleration was disabled for your hardware, which also releases the GART
table (radeon_gart_table_vram_free is invoked on GART finalize, which
frees the VRAM object).

Aug  8 05:34:29 mayankr-T400 kernel: [   10.481258] [drm:radeon_ring_write] *ERROR* radeon: writting more dword to ring than expected !
Aug  8 05:34:29 mayankr-T400 kernel: [   10.626140] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0x)
Aug  8 05:34:29 mayankr-T400 kernel: [   10.626146] radeon 0000:01:00.0: disabling GPU acceleration

However, the radeon driver doesn't record this when it disables GPU
acceleration in r600_startup (no flag is set). So during resume, when it
tries to re-enable the GART table, it finds an empty VRAM object in
r600_pcie_gart_enable and fails the resume. But since your ring and GPU
were initialized anyway and suspend doesn't touch them, you were not
impacted and were just left with a "resume failed" message.

I think this has something to do with the fact that your hardware returns
invalid (~0U) values for the ring buffer write index during
initialization, which we now AND with the ring buffer pointer mask. That
effectively leaves the write pointer on the last slot at the tail, i.e.
with reduced ring buffer write space for the GPU acceleration feature to
be enabled.

So I guess our quirk is letting you _live_ with broken/crazy hardware :)
Otherwise you would be Oopsing as before. If you want GPU acceleration
enabled, maybe we could retry a finite number of times on receiving
invalid ring buffer write index values, as in my original patch, in the
expectation that a subsequent retry works. But I guess it's not worth it,
and it makes sense to live without GPU acceleration on your seemingly
broken graphics chipset.

I also believe that we could set a flag in rdev->flags, e.g.
rdev->flags |= RADEON_IS_GART_DISABLED, and then check this flag when
pcie_gart_enable fails on resume. In that case we would clear it
(rdev->flags &= ~RADEON_IS_GART_DISABLED) and continue with the resume
instead of failing it, since the GART VRAM object was freed during
r600_startup while disabling the GART.

But I don't think it's a big deal, and we can treat it as benign for now
for the reasons mentioned above.

So to cut it short, let's now pull the trigger for the patch to be pushed
upstream :)
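
The bounded-retry idea mentioned in this message could look roughly like the sketch below. It is standalone C rather than driver code: `read_wptr_fn`, `read_wptr_bounded`, `fake_reg_read` and the retry count are hypothetical names chosen for illustration, and the register read is passed in as a callback so the logic can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a register read such as RREG32(CP_RB_WPTR). */
typedef uint32_t (*read_wptr_fn)(void *ctx);

/*
 * Read the write pointer up to max_tries times and accept the first value
 * that is not the all-ones pattern (~0U) reported for this hardware.
 * Returns true and stores the value in *out on success; returns false if
 * every read came back as ~0U, so the caller can fall back to masking the
 * value or resetting the pointer to 0.
 */
static bool read_wptr_bounded(read_wptr_fn read_reg, void *ctx,
                              unsigned int max_tries, uint32_t *out)
{
    for (unsigned int i = 0; i < max_tries; i++) {
        uint32_t v = read_reg(ctx);
        if (v != UINT32_MAX) {
            *out = v;
            return true;
        }
        /* A real driver would delay between reads, e.g. mdelay(15). */
    }
    return false;
}

/* Fake register for demonstration: returns ~0U until *remaining_bad hits 0. */
static uint32_t fake_reg_read(void *ctx)
{
    unsigned int *remaining_bad = ctx;
    if (*remaining_bad > 0) {
        (*remaining_bad)--;
        return UINT32_MAX;
    }
    return 0x10;
}
```

Unlike an open-ended do/while loop, this gives up after `max_tries` reads, so a register that never returns a sane value cannot hang boot or resume.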



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Mynk
Adding the objdump referred to in the update. Will try the suggestions
and update.

** Attachment added: objdump -d -j .text radeon.ko radeon.ko.out
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2255879/+files/radeon.ko.out



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Mynk
Thanks Karthick,

Your suggestion works -

T400:~/linux-2.6.38/drivers/gpu/drm/radeon# diff r600.c.orig r600.c
2221a2222,2234
>       /*
>        * Re-read the read and write if the value returned isn't sane. before calling r600_cp_start
>        */
>       do {
>               rdev->cp.rptr = RREG32(CP_RB_RPTR);
>               mdelay(15);
>       } while((int)rdev->cp.rptr < 0);
>
>       do {
>               rdev->cp.wptr = RREG32(CP_RB_WPTR);
>               mdelay(15);
>       } while( (int)rdev->cp.wptr < 0 );
>

If I boot without this fix I run into the Oops. With the patched module
it works fine.

If a patch would help, kindly let me know which steps to use to produce
it (which directory to run diff from) and I can upload it.

I hope this fix is reviewed and makes it into the next release.

Thanks,
Mayank



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread a.r.karth...@gmail.com
Great news!
Attach your r600.c and I can make a proper patch which you can retest before we 
can pitch
for it to be included in the next release.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Mynk
Attaching the r600.c as requested.

** Attachment added: r600.c
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2256717/+files/r600.c



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Herton R. Krzesinski
a-r-karthick: thanks for your analysis.

This invalid register read seems very similar to one that was already
worked around in the r100_cp_init function; take a look at this change:

commit 9e5786bd14cb9ffe29ebe66d41cedf03311b0d30
Author: Dave Airlie <airl...@redhat.com>
Date:   Wed Mar 31 13:38:56 2010 +1000

    drm/radeon/kms: add sanity check to wptr.

    If we resume in a bad way, we'll get 0xffffffff in wptr, and then
    oops with no console. This just adds a sanity check so that we can
    avoid the oops and hopefully get more details out of people's systems.

    Signed-off-by: Dave Airlie <airl...@redhat.com>

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 138ddd4..c8f4b03 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -744,6 +744,8 @@ int r100_cp_init(struct radeon_device *rdev, unsigned ring_size)
        udelay(10);
        rdev->cp.rptr = RREG32(RADEON_CP_RB_RPTR);
        rdev->cp.wptr = RREG32(RADEON_CP_RB_WPTR);
+       /* protect against crazy HW on resume */
+       rdev->cp.wptr &= rdev->cp.ptr_mask;
        /* Set cp mode to bus mastering & enable cp*/
        WREG32(RADEON_CP_CSQ_MODE,
               REG_SET(RADEON_INDIRECT2_START, indirect2_start) |


a-r-karthick: Can you raise this issue upstream, on the
dri-de...@lists.freedesktop.org mailing list? You can just test and send a
similar patch for review, also using ptr_mask.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Herton R. Krzesinski
I mean, r600_cp_resume also seems to need the same workaround that is
already present in r100_cp_init.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Herton R. Krzesinski
And this isn't a regression: the update from 2.6.38-8.42 doesn't touch
this area. It seems the issue happens by luck; some code shuffle or
something else (timing?) may have made it more likely to happen on the
newer kernel.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread a.r.karth...@gmail.com
@Herton: Good point regarding the similar fix in r100.c, which I wasn't
aware of. I didn't even reproduce this myself; @mynk (reporter and buddy)
reproduced it, as I don't have any hardware with a radeon chipset :)

I debugged this with the objdump disassembly and the Oops information.
It seems to exactly match the fix and the comment in r100.c that corrects
the write index.

Masking the write pointer with the ring buffer ptr_mask, as in r100.c,
makes sense even if it doesn't yield the exact write index on resume (the
first write would go to the last slot before rolling back to 0 in the
ring buffer), whereas I was trying to re-read with a delay. The fact that
the retry with mdelay was working for him on resume implies that it was
indeed fetching the right values on a re-read.

But my patch was causing a boot-time lockup, as it was doing the same
thing for the ring buffer read index, and it seems that the ring buffer
read index returned from the hardware is always uninitialized or ~0U
during boot. So maybe a mask for the read index makes sense too, but the
fact that it's acceptable to issue a read to the iommu at an invalid
offset before it's corrected in the next read pass, which patches it with
the ring ptr_mask, suggests that it's not a deal breaker.

So let's play it safe and use the same fix in r600.c as in r100.c.

@Mayank: Please revert the last patch and apply the patch below (also
attached) on top of your original un-patched r600.c.

@Herton: Once Mayank re-tests with the patch and confirms that it works,
I can push for it upstream, quoting this bug as the reference. I am
surprised that the bug still exists in 3.0 as well. And I don't believe
it is a regression. It could be that we were just plain lucky with resume
before, since this is related to the hardware returning an invalid index
on resume.

--- r600_orig.c 2011-08-05 13:39:25.833427436 -0700
+++ r600.c  2011-08-05 14:50:32.037670946 -0700
@@ -2218,6 +2218,8 @@

        rdev->cp.rptr = RREG32(CP_RB_RPTR);
        rdev->cp.wptr = RREG32(CP_RB_WPTR);
+       /* protect against crazy HW on resume */
+       rdev->cp.wptr &= rdev->cp.ptr_mask;

        r600_cp_start(rdev);
        rdev->cp.ready = true;
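
For illustration, the effect of the ptr_mask on a bogus register value can be seen in a standalone sketch. It assumes a power-of-two ring of 16 dwords; `struct ring`, `ring_sanitize_wptr` and `ring_write` are simplified stand-ins modelled on radeon_ring_write, not the driver's real types or functions.

```c
#include <stdint.h>

#define RING_DWORDS 16u               /* assumed size; must be a power of two */

struct ring {
    uint32_t buf[RING_DWORDS];
    uint32_t wptr;                    /* write index, in dwords */
    uint32_t ptr_mask;                /* RING_DWORDS - 1 */
};

/* Clamp whatever the register returned into [0, RING_DWORDS - 1]. */
static void ring_sanitize_wptr(struct ring *r, uint32_t raw_wptr)
{
    r->wptr = raw_wptr & r->ptr_mask;
}

/* Same shape as radeon_ring_write: store, advance, wrap. */
static void ring_write(struct ring *r, uint32_t v)
{
    r->buf[r->wptr++] = v;
    r->wptr &= r->ptr_mask;
}
```

With a raw value of 0xFFFFFFFF, the masked index is 15, the last slot, so the first write lands inside the buffer and the pointer then wraps to 0. Without the mask, the same store would index far outside the ring.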


** Patch added: Patch for kernel panic in resume in r600 radeon driver module
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/+attachment/2257107/+files/r600.c.resume.patch



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-05 Thread Herton R. Krzesinski
@a-r-karthick: yes, once it has been tested, please submit it upstream.
Thank you.



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-04 Thread Seth Forshee
** Tags added: regression-update



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-04 Thread a.r.karth...@gmail.com
I'm a curious guy who doesn't know much about radeon/video drivers, but
with a can-debug approach I'm taking a stab at this issue based on the
radeon.ko objdump disassembly provided by the bug reporter, who happens
to be my friend. (Please try the recommendation at the end of this
report, in r600_cp_resume in r600.c, to see if it resolves the Oops.)

The kernel panic is the result of an invalid ring write pointer being
used while writing a value to the radeon ring buffer. The write pointer
read from the radeon control register (r100_mm_rreg function in radeon.h)
is returning an incorrect (seemingly negative) value on RESUME. It looks
like we may have to add a retry in r600_cp_resume to make it work.

RCA enclosed below:
--

Mapping the Oops to the disassembly, it's clear that the kernel panic was
triggered in this function:
static inline void radeon_ring_write(struct radeon_device *rdev, uint32_t v)
{
#if DRM_DEBUG_CODE
        if (rdev->cp.count_dw <= 0) {
                DRM_ERROR("radeon: writting more dword to ring than expected !\n");
        }
#endif
        rdev->cp.ring[rdev->cp.wptr++] = v; /* <- PANICs here as rdev->cp.wptr seems to be negative */
        rdev->cp.wptr &= rdev->cp.ptr_mask;
        rdev->cp.count_dw--;
        rdev->cp.ring_free_dw--;
}

Let's now map the above to the EIP first:

EIP at the time of the kernel panic was r600_cp_start+0x48
From objdump disassembly, it maps to:

r600_cp_start: (0x709e8):
  709e8:   c7 02 00 44 05 c0   movl   $0xc0054400,(%edx)

It also exactly matches the Oops hex dump:
Aug 4 09:25:20 mayankr-T400 kernel: [ 10.356006] Code: c6 0f 85 fd 02 00 00 8b bb a4 07 00 00 85 ff 0f 8e d6 02 00 00 8b 83 94 07 00 00 8d 14 85 00 00 00 00 83 c0 01 03 93 8c 07 00 00 <c7> 02 00 44 05 c0 8b 93 a4 07 00 00 23 83 b4 07 00 00 83 ab a0

Refer to the byte c7 in angle brackets, which marks the start of the
faulting instruction; the hex codes that follow also match the above EIP
(c7 02 00 44 05 c0).
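
Locating the marked byte in a "Code:" line can also be done mechanically. The helper below is a minimal sketch, assuming the bytes are lowercase two-digit hex values separated by single spaces as in the dump quoted here; `faulting_byte_index` is a made-up name, not a kernel or binutils function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return how many hex bytes precede the byte wrapped in <..> in an Oops
 * "Code:" string, or -1 if the string contains no marker.
 */
static int faulting_byte_index(const char *code)
{
    const char *marker = strchr(code, '<');
    if (marker == NULL)
        return -1;

    int index = 0;
    for (const char *p = code; p < marker; p++) {
        /* Count each byte once, at its first hex digit. */
        int is_hex = (*p >= '0' && *p <= '9') || (*p >= 'a' && *p <= 'f');
        int starts_token = (p == code) || (p[-1] == ' ');
        if (is_hex && starts_token)
            index++;
    }
    return index;
}
```

The returned count says how far the faulting instruction's first byte sits from the start of the dump, which helps when lining the dump up against objdump output.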

If you map the code back to the disassembly, the panic EIP is evident.
From the objdump, EBX holds the radeon_device pointer *rdev.

The ring buffer remaining count cp.count_dw is 16 and is held in register
EDI.

The write pointer, i.e. the rdev->cp.wptr index for the radeon ring
buffer, is stored in EAX. From the panic this value is shown as 0, as EAX
holds 0.

The ring buffer address that triggered the Oops is in EDX, as also seen
from the above target of the store. The EDX value from the Oops is
0xfa501ffc, and this is also the address that took the page fault, as
seen from the Oops:

Aug 4 09:25:20 mayankr-T400 kernel: [ 10.354151] BUG: unable to handle kernel paging request at fa501ffc
movl $0xc0054400, (%edx)

The C call for the above store is

radeon_ring_write(rdev, PACKET3(PACKET3_ME_INITIALIZE, 5));

from r600_cp_start. The PACKET3(PACKET3_ME_INITIALIZE, 5) macro evaluates
to 0xc0054400.
So now we are dead sure that we had an incorrect radeon ring write
pointer read from the register in r600_cp_resume, before calling
r600_cp_start:

rdev->cp.rptr = RREG32(CP_RB_RPTR);
rdev->cp.wptr = RREG32(CP_RB_WPTR);
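
As a cross-check on the 0xc0054400 immediate, the packet header can be recomputed by hand. The sketch below assumes the header layout is packet type 3 in bits 31:30, the count in bits 29:16 and the opcode in bits 15:8, and that PACKET3_ME_INITIALIZE is 0x44; both are recalled from the driver headers rather than quoted in this thread, so the macros carry a SKETCH_ prefix to mark them as illustrative.

```c
#include <stdint.h>

/* Assumed layout: bits 31:30 packet type, bits 29:16 count, bits 15:8 opcode. */
#define SKETCH_PACKET_TYPE3 3u
#define SKETCH_PACKET3(op, n) \
    ((SKETCH_PACKET_TYPE3 << 30) | (((uint32_t)(op) & 0xFFu) << 8) | \
     (((uint32_t)(n) & 0x3FFFu) << 16))

/* Assumed opcode value for PACKET3_ME_INITIALIZE. */
#define SKETCH_PACKET3_ME_INITIALIZE 0x44u

/* Header word for the first packet that r600_cp_start writes to the ring. */
static uint32_t me_initialize_header(void)
{
    return SKETCH_PACKET3(SKETCH_PACKET3_ME_INITIALIZE, 5);
}
```

Under those assumptions the three fields are 0xc0000000, 0x00050000 and 0x00004400, which OR together to the immediate in the faulting movl.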

Now from the assembly, the value of the write pointer is stored in EAX
at the time of the panic:

Taking a few instructions above the faulting instruction:

   709d2:   8b 83 94 07 00 00       mov    0x794(%ebx),%eax
   709d8:   8d 14 85 00 00 00 00    lea    0x0(,%eax,4),%edx
   709df:   83 c0 01                add    $0x1,%eax
   709e2:   03 93 8c 07 00 00       add    0x78c(%ebx),%edx
   709e8:   c7 02 00 44 05 c0       movl   $0xc0054400,(%edx)

The 0x794 offset from EBX (the rdev pointer) is rdev->cp.wptr, the write
index for the ring buffer. We can see it being moved to EAX. The indexed
absolute address is then computed in EDX (the address that took the page
fault, as mentioned above).

Now we can see that:
add $0x1, %eax
happens BEFORE the movl, which is the faulting instruction.
In other words, the rdev->cp.wptr++ increment is applied to EAX after its
use in the address computation but before the store. Barring some
reordering that we missed, there is no way this increment was skipped. So
for all intents and purposes, the value read from the register cannot
have been 0, since EAX shows 0 only after being incremented.

This implies that EAX was negative on reading from the radeon register
on resume. As a result, we indexed an invalid location in the ring
buffer (rdev->cp.ring), which expectedly triggered the kernel panic.
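
The arithmetic can be replayed in a few lines of C, taking "negative" to mean -1, i.e. 0xFFFFFFFF as an unsigned value. The ring base of 0xfa502000 used in the note below is an assumption, inferred as the faulting address 0xfa501ffc plus 4; it does not appear in the Oops. The helper names are made up for this sketch.

```c
#include <stdint.h>

/*
 * Mirrors "lea 0x0(,%eax,4),%edx" followed by "add 0x78c(%ebx),%edx",
 * using 32-bit wrap-around arithmetic as on the i386 kernel in the Oops.
 */
static uint32_t store_address(uint32_t ring_base, uint32_t wptr)
{
    return (uint32_t)(ring_base + wptr * 4u);
}

/* Mirrors "add $0x1,%eax": the post-increment of the write index. */
static uint32_t next_wptr(uint32_t wptr)
{
    return (uint32_t)(wptr + 1u);
}
```

With wptr = 0xFFFFFFFF and the assumed base, `store_address` gives 0xfa501ffc, 4 bytes below the ring, and `next_wptr` gives 0, matching both the faulting address and the EAX value in the register dump.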

I am not sure if we have to re-read the values on resume if the value
returned is negative from the radeon register for the read and write
pointers.

You can try the following changes to see if it works:

In r600_cp_resume before calling r600_cp_start : (r600.c)

/*
 * Re-read the read and write pointers if the value returned isn't sane,
 * before calling r600_cp_start.
 */
do {
        rdev->cp.rptr = RREG32(CP_RB_RPTR);
        mdelay(15);
} while ((int)rdev->cp.rptr < 0);

do {
        rdev->cp.wptr = RREG32(CP_RB_WPTR);
        mdelay(15);
} while ((int)rdev->cp.wptr < 0);

r600_cp_start(rdev);

If the above still triggers 

[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-04 Thread a.r.karth...@gmail.com
Just to avoid confusion, as rdev->cp.wptr is unsigned: by negative I
meant that it was ~0 or 0xFFFFFFFFU when read in r600_cp_resume.
So the increment (add $1, %eax) before the Oops wrapped it back to 0,
which is why EAX, the ring buffer write pointer index, was shown as 0 at
the time of the panic. Since the increment had happened before the fault,
it had to be ~0 or 0xFFFFFFFFU on resume, which is again an invalid write
pointer value for the radeon ring buffer.

So we retry till we get a sane value for the read and write pointers of
the radeon ring buffer on RESUME in r600_cp_resume. I have a strong hunch
that it would fix the panic, as this seems to be a timing issue with
reading registers on resume (maybe the device isn't ready yet when the
resume tries to fetch the read and write indexes).



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-03 Thread Mynk
** Attachment added: Relevant log under kern.log
   
https://bugs.launchpad.net/bugs/820746/+attachment/2253248/+files/Aug4_kern.log



[Bug 820746] Re: Black screen on boot - Oops: 0002 [#1] SMP on Lenovo T400 laptop with radeon

2011-08-03 Thread Mynk
Ok I see that lspci details are attached.

I am able to reproduce this problem at will. The laptop won't boot with
the latest kernel. I have to boot from the previous kernel.

Kindly note that the bug report was taken when booted from the older
version of the kernel.

Hope this helps,
Mayank
