[Nouveau] [Bug 49243] New: graphical corruption with GeForce 6150SE nForce 430

2012-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=49243

             Bug #: 49243
           Summary: graphical corruption with GeForce 6150SE nForce 430
    Classification: Unclassified
           Product: Mesa
           Version: git
          Platform: x86-64 (AMD64)
        OS/Version: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/DRI/nouveau
        AssignedTo: nouveau@lists.freedesktop.org
        ReportedBy: shawnland...@gmail.com


I just built Wayland, Mesa, and related components with this script, which
pulls from git:
http://www.chaosreigns.com/wayland/buildscript/dl/wayland-build-master.sh

I am using the Ubuntu 12.04 release with kernel 3.2.0-24-generic.

window-screen-cap: https://imgur.com/Wg1ej

kern.log:
Apr 27 23:42:57 host kernel: [50991.175684] [drm] nouveau :00:0d.0: PGRAPH - ERROR nsource: DATA_ERROR nstatus: BAD_ARGUMENT
Apr 27 23:42:57 host kernel: [50991.175699] [drm] nouveau :00:0d.0: PGRAPH - ch 4 (0x00073000) subc 7 class 0x4497 mthd 0x0208 data 0x050a0228
etc...
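
For readers who want to pick the second PGRAPH line apart, here is a minimal
user-space sketch that extracts its numeric fields. The struct and function
names are hypothetical and are not part of nouveau. The field meanings
(channel, subchannel, object class, method, data) follow the abbreviations in
the log; reading the parenthesised value as the channel instance address is an
assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "PGRAPH - ch ..." log line. */
struct sketch_pgraph_err {
	int channel;            /* "ch 4" */
	unsigned int inst;      /* "(0x00073000)", assumed to be the instance address */
	int subchannel;         /* "subc 7" */
	unsigned int class_id;  /* "class 0x4497" */
	unsigned int method;    /* "mthd 0x0208" */
	unsigned int data;      /* "data 0x050a0228" */
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
static int sketch_parse_pgraph(const char *line, struct sketch_pgraph_err *out)
{
	const char *p;

	assert(line != NULL && out != NULL);

	/* Skip the syslog prefix and start at the "- ch" marker. */
	p = strstr(line, "- ch ");
	if (p == NULL)
		return -1;

	/* %x accepts an optional 0x prefix, so the hex fields parse directly. */
	if (sscanf(p, "- ch %d (%x) subc %d class %x mthd %x data %x",
		   &out->channel, &out->inst, &out->subchannel,
		   &out->class_id, &out->method, &out->data) != 6)
		return -1;

	return 0;
}
```

Feeding it the line above should yield channel 4, subchannel 7, class 0x4497,
method 0x0208, and data 0x050a0228.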

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [Bug 49243] graphical corruption with GeForce 6150SE nForce 430

2012-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=49243

--- Comment #1 from Shawn Landden <shawnland...@gmail.com> 2012-04-27 23:49:54 PDT ---
Created attachment 60719
  --> https://bugs.freedesktop.org/attachment.cgi?id=60719
weston with nouveau in xwayland



Re: [Nouveau] [PATCH v2 4/4] drm/nouveau: gpu lockup recovery

2012-04-28 Thread Marcin Slusarz
On Thu, Apr 26, 2012 at 05:32:29PM +1000, Ben Skeggs wrote:
> On Wed, 2012-04-25 at 23:20 +0200, Marcin Slusarz wrote:
> > Overall idea:
> > Detect lockups by watching for timeouts (vm flush / fence), return -EIOs,
> > handle them at ioctl level, reset the GPU and repeat last ioctl.
> >
> > GPU reset is done by doing suspend / resume cycle with few tweaks:
> > - CPU-only bo eviction
> > - ignoring vm flush / fence timeouts
> > - shortening waits
> Okay.  I've thought about this a bit for a couple of days and think I'll
> be able to coherently share my thoughts on this issue now :)
>
> Firstly, while I agree that we need to become more resilient to errors,
> I don't think that following in the radeon/intel footsteps with
> something (imo, hackish) like this is the right choice for us
> necessarily.

This is not only the radeon/intel way. Windows, since Vista SP1, does the
same; see http://msdn.microsoft.com/en-us/windows/hardware/gg487368.
It's funny how similar it is to this patch (I hadn't seen this page before).

If you fear that people will stop reporting bugs, don't. GPU reset is painfully
slow and can take up to 50 seconds (BO eviction is the most time-consuming
part), so people will be annoyed enough to report them.
Currently, GPU lockups make users so angry that they frequently switch to the
blob without even thinking about reporting anything.

> The *vast* majority of lockups we have are as a result of us badly
> mishandling exceptions reported to us by the GPU.  There are a couple of
> exceptions, however, they're very rare..
>
> A very common example is where people gain DMA_PUSHERs for whatever
> reason, and things go haywire eventually.

Nope. I had tens of lockups during testing, and only once did I see a
DMA_PUSHER before the GPU lockup was detected.

> To handle a DMA_PUSHER
> sanely, generally you have to drop all pending commands for the channel
> (set GET=PUT, etc) and continue on.  However, this leaves us with fences
> and semaphores unsignalled etc, causing issues further up the stack with
> perfectly good channels hanging on attempting to sync with the crashed
> channel etc.
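
The GET=PUT step and its side effect on fences can be illustrated with a toy
model. This is only a sketch: the struct and all function names are
hypothetical, sequence-number wraparound is ignored, and the last helper shows
one conceivable way to release waiters, not something proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified channel state; not the nouveau structures. */
struct sketch_channel {
	uint32_t get;             /* next command the GPU would fetch */
	uint32_t put;             /* one past the last command the CPU queued */
	uint32_t fence_emitted;   /* sequence number of the last fence queued */
	uint32_t fence_signalled; /* sequence number of the last fence completed */
};

/* Drop every pending command: afterwards the GPU sees an empty queue. */
static void sketch_drop_pending(struct sketch_channel *ch)
{
	ch->get = ch->put;
}

/* A waiter on fence `seq` may proceed only once it has been signalled. */
static bool sketch_fence_done(const struct sketch_channel *ch, uint32_t seq)
{
	return ch->fence_signalled >= seq;
}

/*
 * Assumed mitigation: treat fences that were dropped together with their
 * commands as completed, so other channels stop waiting on them.
 */
static void sketch_signal_dropped_fences(struct sketch_channel *ch)
{
	ch->fence_signalled = ch->fence_emitted;
}
```

After `sketch_drop_pending()` alone, `sketch_fence_done()` stays false for any
fence that was still queued, which is the hang described in the quoted
paragraph.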
 
> The next most common example I can think of is nv4x hardware, getting a
> LIMIT_COLOR/ZETA exception from PGRAPH, and then a hang.  The solution
> is simple, learn how to handle the exception, log it, and PGRAPH
> survives.
>
> I strongly believe that if we focused our efforts on dealing with what
> the GPU reports to us a lot better, we'll find we really don't need such
> lockup recovery.

While I agree we need to improve error handling to make lockup recovery
unnecessary, the reality is that we can't predict everything, and the driver
needs to cope with its own bugs.

> I am, however, considering pulling the vm flush timeout error
> propagation and break-out-of-waits-on-signals that builds on it.  As we
> really do need to become better at having killable processes if things
> go wrong :)

Good :)

Marcin
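
The detect, reset, and repeat flow from the patch description quoted at the
top of this message can be sketched in user space as follows. Everything here
is a hypothetical stand-in: none of the names come from the patch, the "GPU"
is a flag, and the retry bound is an assumption the description does not
mention.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical simulated device; not a nouveau structure. */
struct sketch_gpu {
	int hung;    /* non-zero while the simulated GPU is locked up */
	int resets;  /* number of resets performed so far */
};

/* Simulated ioctl body: a timeout on a hung GPU surfaces as -EIO. */
static int sketch_do_ioctl(struct sketch_gpu *gpu)
{
	if (gpu->hung)
		return -EIO;
	return 0;
}

/* Simulated reset; the patch describes a suspend / resume cycle here. */
static void sketch_reset(struct sketch_gpu *gpu)
{
	gpu->hung = 0;
	gpu->resets++;
}

/*
 * ioctl-level wrapper: on -EIO, reset the device and repeat the same ioctl.
 * max_retries is an assumed safety bound so a dead device cannot loop forever.
 */
static int sketch_ioctl_with_recovery(struct sketch_gpu *gpu, int max_retries)
{
	int ret = sketch_do_ioctl(gpu);

	while (ret == -EIO && max_retries-- > 0) {
		sketch_reset(gpu);
		ret = sketch_do_ioctl(gpu);
	}
	return ret;
}
```

Handling the error in one wrapper keeps the individual ioctl bodies unaware of
recovery; they only need to propagate -EIO upwards.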


Re: [Nouveau] [PATCH v2 4/4] drm/nouveau: gpu lockup recovery

2012-04-28 Thread Marcin Slusarz
On Wed, Apr 25, 2012 at 11:20:36PM +0200, Marcin Slusarz wrote:
> Overall idea:
> Detect lockups by watching for timeouts (vm flush / fence), return -EIOs,
> handle them at ioctl level, reset the GPU and repeat last ioctl.
>
> GPU reset is done by doing suspend / resume cycle with few tweaks:
> - CPU-only bo eviction
> - ignoring vm flush / fence timeouts
> - shortening waits
>
> Signed-off-by: Marcin Slusarz <marcin.slus...@gmail.com>
> ---

Martin,

I'm wondering how the patch below (which builds upon the above) affects
reclocking stability. I can't test it on my card because it has only
one performance level. Can you test it on yours?

---
From: Marcin Slusarz <marcin.slus...@gmail.com>
Subject: [PATCH] drm/nouveau: take ioctls_rwsem before reclocking

Signed-off-by: Marcin Slusarz <marcin.slus...@gmail.com>
---
 drivers/gpu/drm/nouveau/nouveau_pm.c|6 ++
 drivers/gpu/drm/nouveau/nouveau_reset.c |2 +-
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_pm.c b/drivers/gpu/drm/nouveau/nouveau_pm.c
index 34d591b..4716f39 100644
--- a/drivers/gpu/drm/nouveau/nouveau_pm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_pm.c
@@ -383,9 +383,15 @@ nouveau_pm_set_perflvl(struct device *d, struct device_attribute *a,
 		       const char *buf, size_t count)
 {
 	struct drm_device *dev = pci_get_drvdata(to_pci_dev(d));
+	struct drm_nouveau_private *dev_priv = dev->dev_private;
 	int ret;
 
+	intr_rwsem_down_write(&dev_priv->ioctls_rwsem);
+
 	ret = nouveau_pm_profile_set(dev, buf);
+
+	intr_rwsem_up_write(&dev_priv->ioctls_rwsem);
+
 	if (ret)
 		return ret;
 	return strlen(buf);
diff --git a/drivers/gpu/drm/nouveau/nouveau_reset.c b/drivers/gpu/drm/nouveau/nouveau_reset.c
index e893096..7c25a3c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_reset.c
+++ b/drivers/gpu/drm/nouveau/nouveau_reset.c
@@ -139,7 +139,7 @@ int nouveau_reset_device(struct drm_device *dev)
 		end = jiffies;
 		NV_INFO(dev, "GPU reset done, took %lu s\n", (end - start) / DRM_HZ);
 		while (intr_rwsem_down_read_interruptible(&dev_priv->ioctls_rwsem))
-			; /* not possible, we are holding reset_lock */
+			;
 	}
 	mutex_unlock(&dev_priv->reset_lock);
 
-- 
1.7.8.5
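
The locking pattern in the patch above (ioctls hold the semaphore for read,
reclocking takes it for write) can be modelled with a toy, single-threaded
reader/writer semaphore. This is a sketch under loud assumptions: all names
are hypothetical, nothing here is the intr_rwsem implementation from the
series, and where a real rwsem would sleep this model simply fails with
-EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy reader/writer semaphore: counts readers, allows a single writer. */
struct sketch_rwsem {
	int readers;
	bool writer;
};

/* ioctl entry: readers may share the semaphore unless a writer holds it. */
static bool sketch_try_down_read(struct sketch_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void sketch_up_read(struct sketch_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* A writer needs exclusive access: no readers and no other writer. */
static bool sketch_try_down_write(struct sketch_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void sketch_up_write(struct sketch_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
}

/* Same shape as the patched reclock path: lock for write, change, unlock. */
static int sketch_set_perflvl(struct sketch_rwsem *ioctls, int *level,
			      int new_level)
{
	if (!sketch_try_down_write(ioctls))
		return -EBUSY; /* a real rwsem would block here instead */

	*level = new_level;

	sketch_up_write(ioctls);
	return 0;
}
```

With an ioctl in flight (one reader held), `sketch_set_perflvl()` refuses to
change the level; once the reader drops the semaphore, the change goes
through. That mutual exclusion is what the patch aims for during reclocking.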



[Nouveau] [Bug 49243] graphical corruption with GeForce 6150SE nForce 430

2012-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=49243

--- Comment #2 from Shawn Landden <shawnland...@gmail.com> 2012-04-28 15:05:55 PDT ---
Created attachment 60750
  --> https://bugs.freedesktop.org/attachment.cgi?id=60750
dmesg | grep nouveau



[Nouveau] [Bug 49243] graphical corruption with GeForce 6150SE nForce 430

2012-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=49243

Marcin Slusarz <marcin.slus...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #60750|application/octet-stream    |text/plain
          mime type|                            |



[Nouveau] [Bug 43029] System won't boot using nouveau and Gainward Phantom adapters

2012-04-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=43029

Robert Riches <rm.ric...@jacob21819.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rm.ric...@jacob21819.net

--- Comment #8 from Robert Riches <rm.ric...@jacob21819.net> 2012-04-28 22:01:44 PDT ---
I have similar symptoms with an Asus ENGTX560 DC/2DI/1GD5 card running Mageia 1
(kernel 2.6.38.8-server-10.mga).  According to modinfo, srcversion is
7FFBFFA368D6517B0115747.  During boot with the normal kernel, nouveau said it
detected an NVc0 card, 0ce080a1 (if I wrote it down correctly).  Then came "fb:
conflicting fb hw usage nouveaufb vs VESA VGA - removing generic driver".
After that, the machine locked up so hard that sysrq did nothing I could
discern, and I had to use the hardware reset button.

Using the linux-nonfb GRUB option, which I understand points to a
non-framebuffer version of the non-updated kernel/initrd, I saw a nouveau stack
trace fly by on the screen, then the same hard lockup.

Using the failsafe option, which I understand is a different non-updated
kernel/initrd, booting gets farther before locking up, and sysrq is able to
reboot the machine.

Knoppix 6.4.4 produces a stack trace from drm or nouveau, and sysrq is able to
reboot the machine--blindly if I remember correctly.

Is there documentation of whether a later (Mageia 2, perhaps) kernel would work
with this card?
