Re: [Intel-gfx] Patch for crashing intel server

2013-11-03 Thread Bas Wijnen
Hi Chris,

I got a black screen while using your patch.
The contents of /sys/kernel/debug/dri/0/i915_gem_objects are shown below.  The
first dump was taken while the video was running, the second after stopping
it.  AFAICS, there is no difference between them.

However, after starting a new video, there is a difference in active objects;
not sure if it is related (I don't really know what any of it means).  That is
the third one.

Thanks,
Bas

root@star:/sys/kernel/debug/dri/0# cat i915_gem_objects 
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
  0 [0] active objects, 0 [0] bytes
  131 [131] inactive objects, 34430976 [34430976] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total

Xorg: 217 objects, 36642816 bytes (0 active, 30703616 inactive, 5922816 unbound)
root@star:/sys/kernel/debug/dri/0# cat i915_gem_objects 
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
  0 [0] active objects, 0 [0] bytes
  131 [131] inactive objects, 34430976 [34430976] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total

Xorg: 217 objects, 36642816 bytes (0 active, 30703616 inactive, 5922816 unbound)
root@star:/sys/kernel/debug/dri/0# cat i915_gem_objects 
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
  2 [2] active objects, 32768 [32768] bytes
  129 [129] inactive objects, 34398208 [34398208] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total

Xorg: 217 objects, 36642816 bytes (32768 active, 30670848 inactive, 5922816 unbound)
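
Comparing dumps like the three above by eye is error-prone, so here is a minimal sketch in C of how the active, inactive and unbound counters could be pulled out of one dump and compared with another. The struct and function names (gem_stats, parse_gem_stats, gem_stats_differ) are hypothetical and not part of any Intel tool. The line formats are assumed to match the output shown in this message; other kernel versions may print something different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counters taken from three lines of an i915_gem_objects dump.
 * Hypothetical helper type; the layout follows the dumps in this message. */
struct gem_stats {
    long active_objects, active_bytes;
    long inactive_objects, inactive_bytes;
    long unbound_objects, unbound_bytes;
};

/* Parse one dump held in a NUL-terminated buffer.
 * Returns how many of the three lines were recognised (0..3). */
static int parse_gem_stats(const char *text, struct gem_stats *out)
{
    char line[256];
    int found = 0;

    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    while (*text != '\0') {
        size_t len = strcspn(text, "\n");
        size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
        long objects, bytes;

        /* Copy one line (truncated if overly long) and advance past it. */
        memcpy(line, text, n);
        line[n] = '\0';
        text += len;
        if (*text == '\n')
            text++;

        /* The bracketed duplicate count is skipped with %*ld. */
        if (sscanf(line, " %ld [%*ld] active objects, %ld",
                   &objects, &bytes) == 2) {
            out->active_objects = objects;
            out->active_bytes = bytes;
            found++;
        } else if (sscanf(line, " %ld [%*ld] inactive objects, %ld",
                          &objects, &bytes) == 2) {
            out->inactive_objects = objects;
            out->inactive_bytes = bytes;
            found++;
        } else if (sscanf(line, " %ld unbound objects, %ld",
                          &objects, &bytes) == 2) {
            out->unbound_objects = objects;
            out->unbound_bytes = bytes;
            found++;
        }
    }
    return found;
}

/* Returns 1 if any counter differs between two snapshots, else 0. */
static int gem_stats_differ(const struct gem_stats *a, const struct gem_stats *b)
{
    return a->active_objects != b->active_objects ||
           a->active_bytes != b->active_bytes ||
           a->inactive_objects != b->inactive_objects ||
           a->inactive_bytes != b->inactive_bytes ||
           a->unbound_objects != b->unbound_objects ||
           a->unbound_bytes != b->unbound_bytes;
}
```

Reading the file into a buffer at intervals and calling gem_stats_differ on consecutive snapshots would show whether the unbound or inactive totals keep growing, which is presumably what an object leak would look like.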

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Patch for crashing intel server

2013-10-24 Thread Bas Wijnen
On Wed, Oct 23, 2013 at 09:28:28AM +0100, Chris Wilson wrote:
> No worries, if you can run
>
> addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so -i 0xfcd79 0xf8215
>
> that should give me the information needed to pinpoint the crash.

$ addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so -i 0xfcd79 0xf8215
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/intel.h:138
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/i915_video.c:156
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/intel_video.c:1584

Note that I'm running the unpatched Debian version again (so not with
your or my patch), which is why it was crashing.

In case you have different sources, here's some context for those lines:

intel.h:138 is
    static inline Bool intel_pixmap_tiled(PixmapPtr pixmap)
    {
        return intel_get_pixmap_private(pixmap)->tiling != I915_TILING_NONE;
    }

i915_video.c:156 is
    /* front buffer, pitch, offset */
    if (intel_pixmap_tiled(target)) {
        tiling = BUF_3D_TILED_SURFACE;

and intel_video.c:1584 is
    } else {
        I915DisplayVideoTextured(scrn, adaptor_priv, id, clipBoxes,
                                 width, height, dstPitch, dstPitch2,
                                 src_w, src_h, drw_w, drw_h,
                                 pixmap);
    }
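
To make the failure at intel.h:138 concrete: the helper dereferences the pixmap's private data unconditionally, and elsewhere in this thread that private pointer is reported to be NULL. The sketch below uses stand-in types (sketch_pixmap, sketch_pixmap_private and the SKETCH_TILING_* values are hypothetical, not the driver's definitions) to contrast two ways of handling that. It is an illustration under those assumptions, not the patch discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration only; not the driver's real definitions. */
enum sketch_tiling { SKETCH_TILING_NONE = 0, SKETCH_TILING_X, SKETCH_TILING_Y };

struct sketch_pixmap_private {
    enum sketch_tiling tiling;
};

struct sketch_pixmap {
    struct sketch_pixmap_private *priv; /* NULL in the failing case */
};

/* Guarded variant: silently reports "not tiled" when the private data is
 * missing. This avoids the segfault but hides the underlying problem, so
 * the caller carries on with a pixmap that has no buffer behind it. */
static inline bool sketch_pixmap_tiled_checked(const struct sketch_pixmap *pixmap)
{
    if (pixmap == NULL || pixmap->priv == NULL)
        return false;
    return pixmap->priv->tiling != SKETCH_TILING_NONE;
}

/* Asserting variant: states the invariant that the private data must
 * exist, so a violation is reported at the point where it is observed. */
static inline bool sketch_pixmap_tiled_asserting(const struct sketch_pixmap *pixmap)
{
    assert(pixmap != NULL && pixmap->priv != NULL);
    return pixmap->priv->tiling != SKETCH_TILING_NONE;
}
```

Neither variant explains why the private pointer is NULL in the first place. The guarded one only turns a crash into a wrong answer, which would be consistent with video going black instead of the server aborting.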

Thanks,
Bas




Re: [Intel-gfx] Patch for crashing intel server

2013-10-16 Thread Bas Wijnen
On Tue, Oct 15, 2013 at 09:25:41AM +0100, Chris Wilson wrote:
> > This does indeed stop the server from crashing, but actually makes the
> > problem worse: it used to play video for a few minutes and then crash
> > when trying.  With my patch it would play video for a few minutes and
> > then present black screens when trying.  With your patch, it presents
> > black screens from the start.
>
> Start of video, or beginning of X?

Beginning of X.  After starting and logging in, I can play videos for a
few minutes; afterwards it will crash.

> > I must say I'm not entirely sure if the backtrace I sent you is a
> > typical case; I managed to crash it sooner than usual, so perhaps it
> > wasn't the bug that I triggered before.  It did stop the crashing
> > however.
> >
> > > However, that still leaves the question as to how you ended up being
> > > unable to allocate bo...

I didn't check the backtrace myself, but when I wrote my shotgun-patch,
the problem was that pixmap_private was NULL; bo is in there, right?  So
at least in that case, it could never have allocated it, or at least it
couldn't store the pointer.

> > While looking for it I did find and try intel-gpu-time, and noticed that
> > it always reports the gpu 100% busy, even when running intel-gpu-time
> > sleep 5 from a linux virtual terminal (so not even X is displayed).  Is
> > that normal?
>
> Hmm, looks like it should report correctly on i915.

Due to unrelated problems (unbearable slowness) I switched from gnome to
xfce.  It does report 0% now.  It seems gnome keeps the gpu busy even if
it's not displaying anything...

Thanks,
Bas




Re: [Intel-gfx] Patch for crashing intel server

2013-10-14 Thread Bas Wijnen
On Sun, Oct 13, 2013 at 10:43:49AM +0100, Chris Wilson wrote:
> > My X server was crashing when playing video, and I wrote a patch to fix
> > it.  Please find the background and the patch at
> > http://bugs.debian.org/724944 .
>
> Ok, I can see the allocation failure that leads to the crash:
>
> commit f9a18c9f38d09c145eb513ca989966dc135c1e9b
> Author: Chris Wilson <ch...@chris-wilson.co.uk>
> Date:   Sun Oct 13 10:36:35 2013 +0100

This does indeed stop the server from crashing, but actually makes the
problem worse: it used to play video for a few minutes and then crash
when trying.  With my patch it would play video for a few minutes and
then present black screens when trying.  With your patch, it presents
black screens from the start.

I must say I'm not entirely sure if the backtrace I sent you is a
typical case; I managed to crash it sooner than usual, so perhaps it
wasn't the bug that I triggered before.  It did stop the crashing
however.

> However, that still leaves the question as to how you ended up being
> unable to allocate bo...
>
> You can watch /sys/kernel/debug/dri/0/i915_gem_objects (or just use
> intel-gpu-overlay) and see if there is an object leak.

I don't have enough knowledge about the internals to know how that
works.  I can see the file if I mount the debugfs, but what am I looking
for?

I don't seem to have intel-gpu-overlay on my system; does it make sense
to install it?  If so, where do I get it?

While looking for it I did find and try intel-gpu-time, and noticed that
it always reports the gpu 100% busy, even when running intel-gpu-time
sleep 5 from a linux virtual terminal (so not even X is displayed).  Is
that normal?

Thanks,
Bas




Re: [Intel-gfx] Bug#724944: Patch

2013-10-12 Thread Bas Wijnen
On Fri, Oct 11, 2013 at 08:53:03PM +0200, Julien Cristau wrote:
> Thanks.  Can you please send this upstream to
> intel-gfx@lists.freedesktop.org?

Done.  (I didn't subscribe to the list; not sure if that was required.
My mail wasn't bounced, so I suppose it worked.)

By the way, I just noticed that while the patch does prevent the server
from crashing, it doesn't actually solve the problem: videos are now all
black.  Not crashing the server is certainly an improvement, but this is
still unusable. :-(

I'm guessing the problem is whatever sets the intel_pixmap_private field
to NULL, but I have no idea where to look for that, or how to debug it.

It only happens after the server has been running for some time (a few
minutes), which sounds like it will not be easy to track down,
unfortunately.  If anyone wants to try, or can tell me what I can try,
please let me know.

Thanks,
Bas


Re: [Intel-gfx] Patch for crashing intel server

2013-10-12 Thread Bas Wijnen
On Sat, Oct 12, 2013 at 09:46:14PM +0100, Chris Wilson wrote:
> On Fri, Oct 11, 2013 at 09:24:54PM +0200, Bas Wijnen wrote:
> > Hello,
> >
> > My X server was crashing when playing video, and I wrote a patch to fix
> > it.  Please find the background and the patch at
> > http://bugs.debian.org/724944 .
>
> The patch is a shotgun solution, putting NULL pointer checks where the
> pointer is explicitly not allowed to be NULL. I need an actual
> stacktrace to find the root cause.
> -Chris

Sure thing; you can find it attached.  Of course it shows when the
segfault is triggered, not when the data became NULL.  And that should
be fixed, because even though the server doesn't crash with the patch,
it also doesn't play video.

If you need any more information (like debug statements in the
set_pixmap_private?), please let me know how I can generate it.

Thanks,
Bas
#0  0xb758f424 in __kernel_vsyscall ()
#1  0xb719380f in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2  0xb7196cc3 in __GI_abort () at abort.c:90
#3  0xb77574a9 in OsAbort () at ../../os/utils.c:1299
#4  0xb7630d07 in ddxGiveUp (error=error@entry=EXIT_ERR_ABORT)
    at ../../../../hw/xfree86/common/xf86Init.c:1063
#5  0xb7630da3 in AbortDDX (error=error@entry=EXIT_ERR_ABORT)
    at ../../../../hw/xfree86/common/xf86Init.c:1107
#6  0xb775cc41 in AbortServer () at ../../os/log.c:767
#7  0xb775d6be in FatalError (
    f=f@entry=0xb7785084 "Caught signal %d (%s). Server aborting\n")
    at ../../os/log.c:908
#8  0xb7754d84 in OsSigHandler (signo=11, sip=0xbfbe0f0c, unused=0xbfbe0f8c)
    at ../../os/osinit.c:147
#9  <signal handler called>
#10 0xb6f26d79 in intel_pixmap_tiled (pixmap=0xb8945808)
    at ../../../src/uxa/intel.h:138
#11 I915DisplayVideoTextured (scrn=0xb83f2f08, adaptor_priv=0xb83eed70,
    id=808596553, dstRegion=0xbfbe14a8, width=352, height=288,
    video_pitch=176, video_pitch2=352, src_w=352, src_h=288, drw_w=384,
    drw_h=288, pixmap=0xb8425d98) at ../../../src/uxa/i915_video.c:156
#12 0xb6f22215 in I830PutImageTextured (scrn=0xb83f2f08, src_x=0, src_y=0,
    drw_x=1308, drw_y=192, src_w=352, src_h=288, drw_w=384, drw_h=288,
    id=808596553, buf=0xb314 <Address 0xb314 out of bounds>,
    width=352, height=288, sync=0, clipBoxes=0xbfbe14a8, data=0xb83eed70,
    drawable=0xb88fad78) at ../../../src/uxa/intel_video.c:1584
#13 0xb764877c in xf86XVPutImage (client=0xb8906010, pDraw=0xb88fad78,
    pPort=0xb840ce58, pGC=0xb89206d8, src_x=0, src_y=0, src_w=352, src_h=288,
    drw_x=0, drw_y=0, drw_w=384, drw_h=288, format=0xb840ccd0,
    data=0xb314 <Address 0xb314 out of bounds>, sync=0, width=352,
    height=288) at ../../../../hw/xfree86/common/xf86xv.c:1827
#14 0xb769304c in XvdiPutImage (client=client@entry=0xb8906010,
    pDraw=0xb88fad78, pPort=0xb840ce58, pGC=0xb89206d8, src_x=0, src_y=0,
    src_w=352, src_h=288, drw_x=0, drw_y=0, drw_w=384, drw_h=288,
    image=image@entry=0xb840ccd0,
    data=0xb314 <Address 0xb314 out of bounds>, sync=0,
    width=width@entry=352, height=288) at ../../Xext/xvmain.c:673
#15 0xb7694648 in ProcXvShmPutImage (client=0xb8906010)
    at ../../Xext/xvdisp.c:1025
#16 0xb7696dae in ProcXvDispatch (client=0xb8906010)
    at ../../Xext/xvdisp.c:1212
#17 0xb75ed35d in Dispatch () at ../../dix/dispatch.c:432
#18 0xb75db38a in main (argc=13, argv=0xbfbe1774, envp=0xbfbe17ac)
    at ../../dix/main.c:298
#0  0xb758f424 in __kernel_vsyscall ()
No symbol table info available.
#1  0xb719380f in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = <optimized out>
        resultvar = <optimized out>
        pid = -1221525504
        selftid = 29720
#2  0xb7196cc3 in __GI_abort () at abort.c:90
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0xbfbe0db0,
            sa_sigaction = 0xbfbe0db0}, sa_mask = {__val = {3078486656,
              3073441792, 171515904, 3076194304, 3076196648, 5, 3078438944,
              3076120729, 3076197088, 3071376880, 1, 5, 0, 0, 0, 0, 0, 0, 0,
              3076259256, 0, 0, 0, 3071717608, 0, 0, 0, 3078438912, 1,
              3078475476, 3078475380, 3076146208}}, sa_flags = -1216480640,
          sa_restorer = 0xb7196b80 <__GI_abort>}
        sigs = {__val = {32, 0 <repeats 31 times>}}
#3  0xb77574a9 in OsAbort () at ../../os/utils.c:1299
No locals.
#4  0xb7630d07 in ddxGiveUp (error=error@entry=EXIT_ERR_ABORT)
    at ../../../../hw/xfree86/common/xf86Init.c:1063
        i = <optimized out>
#5  0xb7630da3 in AbortDDX (error=error@entry=EXIT_ERR_ABORT)
    at ../../../../hw/xfree86/common/xf86Init.c:1107
        i = <optimized out>
#6  0xb775cc41 in AbortServer () at ../../os/log.c:767
No locals.
#7  0xb775d6be in FatalError (
    f=f@entry=0xb7785084 "Caught signal %d (%s). Server aborting\n")
    at ../../os/log.c:908
        args = 0xbfbe0ee4 "\v"
        args2 = 0xbfbe0ee4 "\v"
        beenhere = 1
#8  0xb7754d84 in OsSigHandler (signo=11, sip=0xbfbe0f0c, unused=0xbfbe0f8c