On Tue, 10 Mar 2015 02:15:28 +1100
Jonathan Gray <[email protected]> wrote:

> On Mon, Mar 09, 2015 at 08:06:57AM -0400, dan mclaughlin wrote:
> > >Synopsis:	system hangs after drm:i915_hangcheck_hung error
> > >Category:	video driver
> > >Environment:
> > 	System      : OpenBSD 5.7
> > 	Details     : OpenBSD 5.7 (GENERIC) #738: Sun Mar 8 10:59:31 MDT 2015
> >
> > 	[email protected]:/usr/src/sys/arch/i386/compile/GENERIC
> >
> > 	Architecture: OpenBSD.i386
> > 	Machine     : i386
> > >Description:
> > system hangs after moderate use. it mostly seems to occur when a
> > program starts (ie qiv feh mplayer). scrolling thru photos hasn't
> > triggered it yet, but quitting and restarting a viewer may.
> > it doesn't take much to trigger it, i can reproduce fairly quickly.
> > just now got around to reporting, but it's been happening for at
> > least a couple months now. usually the system freezes before anything
> > gets recorded, but by sheer luck i just got this:
> >
> > Mar 9 07:07:26 node02 /bsd: error: [drm:pid3557:i915_hangcheck_hung]
> > *ERROR* Hangcheck timer elapsed... GPU hung
> > Mar 9 07:07:28 node02 /bsd: error: [drm:pid3557:i915_hangcheck_hung]
> > *ERROR* Hangcheck timer elapsed... GPU hung
> > Mar 9 07:07:28 node02 /bsd: error: [drm:pid21570:i915_reset] *ERROR* GPU
> > hanging too fast, declaring wedged!
> > Mar 9 07:07:28 node02 /bsd: error: [drm:pid21570:i915_reset] *ERROR*
> > Failed to reset chip.
> >
> > i am pretty certain this is the same issue as reported here:
> >
> > https://bugs.freedesktop.org/show_bug.cgi?id=54226
>
> That looks unrelated and only for gen6 hardware.
>
> At least part of the problem here is that the gen2 reset itself
> is known to cause problems and was at one point removed.
>
> Here is an equivalent of
>
> commit e252d07aff961f8553822cda621490d9aeef8a06
> Author: Daniel Vetter <[email protected]>
> Date:   Tue Oct 8 12:25:41 2013 +0200
>
>     drm/i915: rip out gen2 reset code
>
>     At least on my i830M here it reliably results in hard system hangs
>     nowadays. This is much worse than falling back to software rendering,
>     so I think we should simply rip this out.
>
>     After all we don't have any gpu reset for gen3 either, and there are a
>     lot more of those still around.
>
>     Cc: Chris Wilson <[email protected]>
>     Acked-by: Chris Wilson <[email protected]>
>     Signed-off-by: Daniel Vetter <[email protected]>
>
> Index: sys/dev/pci/drm/i915/i915_drv.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/i915/i915_drv.c,v
> retrieving revision 1.75
> diff -u -p -r1.75 i915_drv.c
> --- sys/dev/pci/drm/i915/i915_drv.c	12 Feb 2015 04:56:03 -0000	1.75
> +++ sys/dev/pci/drm/i915/i915_drv.c	9 Mar 2015 15:08:05 -0000
> @@ -1364,36 +1364,6 @@ inteldrm_timeout(void *arg)
>  	task_add(dev_priv->mm.retire_taskq, &dev_priv->mm.retire_task);
>  }
>  
> -static int i8xx_do_reset(struct drm_device *dev)
> -{
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -
> -	if (IS_I85X(dev))
> -		return -ENODEV;
> -
> -	I915_WRITE(D_STATE, I915_READ(D_STATE) | DSTATE_GFX_RESET_I830);
> -	POSTING_READ(D_STATE);
> -
> -	if (IS_I830(dev) || IS_845G(dev)) {
> -		I915_WRITE(DEBUG_RESET_I830,
> -			   DEBUG_RESET_DISPLAY |
> -			   DEBUG_RESET_RENDER |
> -			   DEBUG_RESET_FULL);
> -		POSTING_READ(DEBUG_RESET_I830);
> -		drm_msleep(1, "8res1");
> -
> -		I915_WRITE(DEBUG_RESET_I830, 0);
> -		POSTING_READ(DEBUG_RESET_I830);
> -	}
> -
> -	drm_msleep(1, "8res2");
> -
> -	I915_WRITE(D_STATE, I915_READ(D_STATE) & ~DSTATE_GFX_RESET_I830);
> -	POSTING_READ(D_STATE);
> -
> -	return 0;
> -}
> -
>  static int i965_reset_complete(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -1540,9 +1510,6 @@ int intel_gpu_reset(struct drm_device *d
>  		break;
>  	case 4:
>  		ret = i965_do_reset(dev);
> -		break;
> -	case 2:
> -		ret = i8xx_do_reset(dev);
>  		break;
>  	}

that did do something. before, the screen would just freeze with its contents.
now it freezes like that for a second, and then goes blank.

i reproduced it twice. the first time took about the same amount of time as
before, the second went a bit longer before hanging. it seems there is also
enough time before it freezes to get error messages now.

the first time i got the expected messages:

Mar 9 17:15:48 node02 /bsd: error: [drm:pid2359:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
Mar 9 17:15:48 node02 /bsd: error: [drm:pid11430:i915_reset] *ERROR* Failed to reset chip.

the second time, though, i got this:

Mar 10 06:37:01 node02 /bsd: error: [drm:pid27180:i915_get_vblank_timestamp] *ERROR* Invalid crtc 1
Mar 10 06:37:01 node02 /bsd: uvm_fault(0xd5c34634, 0x80038000, 0, 1) -> e

i do get occasional vblank_timestamp errors, but they have never seemed to
cause any significant problem before (although i don't use graphics too
intensely).
