Re: [Intel-gfx] Patch for crashing intel server
Hi Chris,

I got a black screen while using your patch. The contents of /sys/kernel/debug/dri/0/i915_gem_objects are shown below. The first dump was taken while the video was running, the second after stopping it. AFAICS, there is no difference between them. However, after starting a new video, there is a difference in active objects; I'm not sure if it is related (I don't really know what any of it means). That is the third one.

Thanks,
Bas

root@star:/sys/kernel/debug/dri/0# cat i915_gem_objects
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
0 [0] active objects, 0 [0] bytes
131 [131] inactive objects, 34430976 [34430976] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total
Xorg: 217 objects, 36642816 bytes (0 active, 30703616 inactive, 5922816 unbound)

root@star:/sys/kernel/debug/dri/0# cat i915_gem_objects
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
0 [0] active objects, 0 [0] bytes
131 [131] inactive objects, 34430976 [34430976] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total
Xorg: 217 objects, 36642816 bytes (0 active, 30703616 inactive, 5922816 unbound)

root@star:/sys/kernel/debug/dri/0# cat i915_gem_objects
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
2 [2] active objects, 32768 [32768] bytes
129 [129] inactive objects, 34398208 [34398208] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total
Xorg: 217 objects, 36642816 bytes (32768 active, 30670848 inactive, 5922816 unbound)

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
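
Comparing the dumps by eye is error-prone, so here is a minimal Python sketch that parses two i915_gem_objects dumps and reports only the counters that changed. The function names (parse_gem_objects, diff_snapshots) and the regular expression are illustrative assumptions, not part of any i915 tool, and the parser only handles the line shapes seen in the dumps above. The two sample strings are the second and third dumps from this message.

```python
import re

# Matches counter lines such as "2 [2] active objects, 32768 [32768] bytes".
# Bracketed numbers are ignored; "gtt total" and per-client lines are skipped.
LINE_RE = re.compile(
    r"^(\d+)(?: \[\d+\])? (?:(.+?) )?objects, (\d+)(?: \[\d+\])? bytes(?: in (\w+))?$"
)

# Second dump in the message above (taken after stopping the video).
SNAPSHOT_STOPPED = """\
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
0 [0] active objects, 0 [0] bytes
131 [131] inactive objects, 34430976 [34430976] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total
Xorg: 217 objects, 36642816 bytes (0 active, 30703616 inactive, 5922816 unbound)
"""

# Third dump in the message above (taken after starting a new video).
SNAPSHOT_NEW_VIDEO = """\
220 objects, 36782080 bytes
131 [131] objects, 34430976 [34430976] bytes in gtt
2 [2] active objects, 32768 [32768] bytes
129 [129] inactive objects, 34398208 [34398208] bytes
49 unbound objects, 638976 bytes
1 purgeable objects, 4096 bytes
6 pinned mappable objects, 15884288 bytes
118 fault mappable objects, 27901952 bytes
536870912 [268435456] gtt total
Xorg: 217 objects, 36642816 bytes (32768 active, 30670848 inactive, 5922816 unbound)
"""


def parse_gem_objects(text):
    """Return {label: (object_count, byte_count)} for each counter line."""
    counters = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a plain "N <label> objects, M bytes" line
        count, label, size, location = m.groups()
        # The first line has no label ("total"); the second is labelled by "in gtt".
        counters[label or location or "total"] = (int(count), int(size))
    return counters


def diff_snapshots(before, after):
    """Return {label: (count_delta, byte_delta)} for counters that changed."""
    old, new = parse_gem_objects(before), parse_gem_objects(after)
    return {
        label: (new[label][0] - old[label][0], new[label][1] - old[label][1])
        for label in new
        if label in old and new[label] != old[label]
    }


changes = diff_snapshots(SNAPSHOT_STOPPED, SNAPSHOT_NEW_VIDEO)
for label, (d_count, d_bytes) in sorted(changes.items()):
    print(f"{label}: {d_count:+d} objects, {d_bytes:+d} bytes")
```

For these two dumps only the active and inactive counters move (two objects, 32768 bytes, shifting from inactive to active), while the total stays at 220 objects. That is at least consistent with no object leak over this short interval, though three samples are too few to rule one out.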
Re: [Intel-gfx] Patch for crashing intel server
On Wed, Oct 23, 2013 at 09:28:28AM +0100, Chris Wilson wrote:
> No worries, if you can run
> addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so -i 0xfcd79 0xf8215
> that should give me the information needed to pinpoint the crash.

$ addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so -i 0xfcd79 0xf8215
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/intel.h:138
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/i915_video.c:156
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/intel_video.c:1584

Note that I'm running the unpatched Debian version again (so with neither your patch nor mine), which is why it was crashing.

In case you have different sources, here's some context for those lines:

intel.h:138 is

static inline Bool intel_pixmap_tiled(PixmapPtr pixmap)
{
	return intel_get_pixmap_private(pixmap)->tiling != I915_TILING_NONE;
}

i915_video.c:156 is

	/* front buffer, pitch, offset */
	if (intel_pixmap_tiled(target)) {
		tiling = BUF_3D_TILED_SURFACE;

and intel_video.c:1584 is

	} else {
		I915DisplayVideoTextured(scrn, adaptor_priv, id, clipBoxes, width, height, dstPitch, dstPitch2, src_w, src_h, drw_w, drw_h, pixmap);
	}

Thanks,
Bas
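
The two offsets passed to addr2line line up with frames #10 (0xb6f26d79) and #12 (0xb6f22215) of the backtrace posted earlier in this thread: subtracting either offset from its frame address gives the same value, 0xb6e2a000. The Python sketch below works through that arithmetic and also collapses the "build/src/uxa/../../.." detours in the addr2line output. Treating 0xb6e2a000 as the load base of intel_drv.so is an inference from those numbers and is not stated anywhere in the thread; on a live system the base would normally be read from /proc/<pid>/maps. The helper names are hypothetical, and the parser assumes plain "path:line" output lines.

```python
import posixpath

# Frame addresses taken from the backtrace in this thread (frames #10 and #12).
FRAME_ADDRS = [0xB6F26D79, 0xB6F22215]

# ASSUMPTION: load base of intel_drv.so, inferred as frame address minus
# the offset that was passed to addr2line; not stated in the thread.
INFERRED_LOAD_BASE = 0xB6E2A000

# Output of the addr2line command shown in the message above.
ADDR2LINE_OUTPUT = """\
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/intel.h:138
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/i915_video.c:156
/build/xserver-xorg-video-intel-WbV7Z9/xserver-xorg-video-intel-2.21.15/build/src/uxa/../../../src/uxa/intel_video.c:1584
"""


def module_offsets(frame_addrs, load_base):
    """Convert runtime addresses into offsets relative to the module base."""
    return [hex(addr - load_base) for addr in frame_addrs]


def normalize_locations(addr2line_output):
    """Split 'path:line' entries and collapse '..' components in the path."""
    locations = []
    for line in addr2line_output.splitlines():
        line = line.strip()
        if not line:
            continue
        path, _, lineno = line.rpartition(":")
        locations.append((posixpath.normpath(path), int(lineno)))
    return locations


print(module_offsets(FRAME_ADDRS, INFERRED_LOAD_BASE))
for path, lineno in normalize_locations(ADDR2LINE_OUTPUT):
    print(f"{posixpath.basename(path)}:{lineno}")
```

Three locations come back for two addresses because of the -i flag: the first address sits in the inlined intel_pixmap_tiled(), so addr2line reports both intel.h:138 and its call site at i915_video.c:156.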
Re: [Intel-gfx] Patch for crashing intel server
On Tue, Oct 15, 2013 at 09:25:41AM +0100, Chris Wilson wrote:
> > This does indeed stop the server from crashing, but actually makes the problem worse: it used to play video for a few minutes and then crash when trying. With my patch it would play video for a few minutes and then present black screens when trying. With your patch, it presents black screens from the start.
>
> Start of video, or beginning of X?

Beginning of X. After starting and logging in, I can play them for a few minutes; afterwards it will crash.

> > I must say I'm not entirely sure if the backtrace I sent you is a typical case; I managed to crash it sooner than usual, so perhaps it wasn't the bug that I triggered before. It did stop the crashing however.
> >
> > > However, that still leaves the question as to how you ended up being unable to allocate bo...

I didn't check the backtrace myself, but when I wrote my shotgun-patch, the problem was that pixmap_private was NULL; bo is in there, right? So at least in that case, it could never have allocated it, or at least it couldn't store the pointer.

> > While looking for it I did find and try intel-gpu-time, and noticed that it always reports the gpu 100% busy, even when running intel-gpu-time sleep 5 from a linux virtual terminal (so not even X is displayed). Is that normal?
>
> Hmm, looks like it should report correctly on i915.

Due to unrelated problems (unbearable slowness) I switched from gnome to xfce. It does report 0% now. It seems gnome keeps the gpu busy even if it's not displaying anything...

Thanks,
Bas
Re: [Intel-gfx] Patch for crashing intel server
On Sun, Oct 13, 2013 at 10:43:49AM +0100, Chris Wilson wrote:
> > My X server was crashing when playing video, and I wrote a patch to fix it. Please find the background and the patch at http://bugs.debian.org/724944 .
>
> Ok, I can see the allocation failure that leads to the crash:
>
> commit f9a18c9f38d09c145eb513ca989966dc135c1e9b
> Author: Chris Wilson ch...@chris-wilson.co.uk
> Date: Sun Oct 13 10:36:35 2013 +0100

This does indeed stop the server from crashing, but actually makes the problem worse: it used to play video for a few minutes and then crash when trying. With my patch it would play video for a few minutes and then present black screens when trying. With your patch, it presents black screens from the start.

I must say I'm not entirely sure if the backtrace I sent you is a typical case; I managed to crash it sooner than usual, so perhaps it wasn't the bug that I triggered before. It did stop the crashing however.

> However, that still leaves the question as to how you ended up being unable to allocate bo...
>
> You can watch /sys/kernel/debug/dri/0/i915_gem_objects (or just use intel-gpu-overlay) and see if there is an object leak.

I don't have enough knowledge about the internals to know how that works. I can see the file if I mount the debugfs, but what am I looking for?

I don't seem to have intel-gpu-overlay on my system; does it make sense to install it? If so, where do I get it?

While looking for it I did find and try intel-gpu-time, and noticed that it always reports the gpu 100% busy, even when running intel-gpu-time sleep 5 from a linux virtual terminal (so not even X is displayed). Is that normal?

Thanks,
Bas
Re: [Intel-gfx] Bug#724944: Patch
On Fri, Oct 11, 2013 at 08:53:03PM +0200, Julien Cristau wrote:
> Thanks. Can you please send this upstream to intel-gfx@lists.freedesktop.org?

Done. (I didn't subscribe to the list; not sure if that was required. My mail wasn't bounced, so I suppose it worked.)

By the way, I just noticed that while the patch does prevent the server from crashing, it doesn't actually solve the problem: videos are now all black. Not crashing the server is certainly an improvement, but this is still unusable. :-(

I'm guessing the problem is whatever sets the intel_pixmap_private field to NULL, but I have no idea where to look for that, or how to debug it. It only happens after the server has been running for some time (a few minutes), which sounds like it will not be easy to track down, unfortunately.

If anyone wants to try, or can tell me what I can try, please let me know.

Thanks,
Bas
Re: [Intel-gfx] Patch for crashing intel server
On Sat, Oct 12, 2013 at 09:46:14PM +0100, Chris Wilson wrote:
> On Fri, Oct 11, 2013 at 09:24:54PM +0200, Bas Wijnen wrote:
> > Hello,
> >
> > My X server was crashing when playing video, and I wrote a patch to fix it. Please find the background and the patch at http://bugs.debian.org/724944 .
>
> The patch is a shotgun solution, putting NULL pointer checks where the pointer is explicitly not allowed to be NULL. I need an actual stacktrace to find the root cause.
> -Chris

Sure thing; you can find it attached. Of course it shows when the segfault is triggered, not when the data became NULL. And that should be fixed, because even though the server doesn't crash with the patch, it also doesn't play video.

If you need any more information (like debug statements in the set_pixmap_private?), please let me know how I can generate it.

Thanks,
Bas

#0  0xb758f424 in __kernel_vsyscall ()
#1  0xb719380f in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2  0xb7196cc3 in __GI_abort () at abort.c:90
#3  0xb77574a9 in OsAbort () at ../../os/utils.c:1299
#4  0xb7630d07 in ddxGiveUp (error=error@entry=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1063
#5  0xb7630da3 in AbortDDX (error=error@entry=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1107
#6  0xb775cc41 in AbortServer () at ../../os/log.c:767
#7  0xb775d6be in FatalError (f=f@entry=0xb7785084 "Caught signal %d (%s). Server aborting\n") at ../../os/log.c:908
#8  0xb7754d84 in OsSigHandler (signo=11, sip=0xbfbe0f0c, unused=0xbfbe0f8c) at ../../os/osinit.c:147
#9  <signal handler called>
#10 0xb6f26d79 in intel_pixmap_tiled (pixmap=0xb8945808) at ../../../src/uxa/intel.h:138
#11 I915DisplayVideoTextured (scrn=0xb83f2f08, adaptor_priv=0xb83eed70, id=808596553, dstRegion=0xbfbe14a8, width=352, height=288, video_pitch=176, video_pitch2=352, src_w=352, src_h=288, drw_w=384, drw_h=288, pixmap=0xb8425d98) at ../../../src/uxa/i915_video.c:156
#12 0xb6f22215 in I830PutImageTextured (scrn=0xb83f2f08, src_x=0, src_y=0, drw_x=1308, drw_y=192, src_w=352, src_h=288, drw_w=384, drw_h=288, id=808596553, buf=0xb314 <Address 0xb314 out of bounds>, width=352, height=288, sync=0, clipBoxes=0xbfbe14a8, data=0xb83eed70, drawable=0xb88fad78) at ../../../src/uxa/intel_video.c:1584
#13 0xb764877c in xf86XVPutImage (client=0xb8906010, pDraw=0xb88fad78, pPort=0xb840ce58, pGC=0xb89206d8, src_x=0, src_y=0, src_w=352, src_h=288, drw_x=0, drw_y=0, drw_w=384, drw_h=288, format=0xb840ccd0, data=0xb314 <Address 0xb314 out of bounds>, sync=0, width=352, height=288) at ../../../../hw/xfree86/common/xf86xv.c:1827
#14 0xb769304c in XvdiPutImage (client=client@entry=0xb8906010, pDraw=0xb88fad78, pPort=0xb840ce58, pGC=0xb89206d8, src_x=0, src_y=0, src_w=352, src_h=288, drw_x=0, drw_y=0, drw_w=384, drw_h=288, image=image@entry=0xb840ccd0, data=0xb314 <Address 0xb314 out of bounds>, sync=0, width=width@entry=352, height=288) at ../../Xext/xvmain.c:673
#15 0xb7694648 in ProcXvShmPutImage (client=0xb8906010) at ../../Xext/xvdisp.c:1025
#16 0xb7696dae in ProcXvDispatch (client=0xb8906010) at ../../Xext/xvdisp.c:1212
#17 0xb75ed35d in Dispatch () at ../../dix/dispatch.c:432
#18 0xb75db38a in main (argc=13, argv=0xbfbe1774, envp=0xbfbe17ac) at ../../dix/main.c:298

#0  0xb758f424 in __kernel_vsyscall ()
No symbol table info available.
#1  0xb719380f in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = <optimized out>
        resultvar = <optimized out>
        pid = -1221525504
        selftid = 29720
#2  0xb7196cc3 in __GI_abort () at abort.c:90
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0xbfbe0db0, sa_sigaction = 0xbfbe0db0}, sa_mask = {__val = {3078486656, 3073441792, 171515904, 3076194304, 3076196648, 5, 3078438944, 3076120729, 3076197088, 3071376880, 1, 5, 0, 0, 0, 0, 0, 0, 0, 3076259256, 0, 0, 0, 3071717608, 0, 0, 0, 3078438912, 1, 3078475476, 3078475380, 3076146208}}, sa_flags = -1216480640, sa_restorer = 0xb7196b80 <__GI_abort>}
        sigs = {__val = {32, 0 <repeats 31 times>}}
#3  0xb77574a9 in OsAbort () at ../../os/utils.c:1299
No locals.
#4  0xb7630d07 in ddxGiveUp (error=error@entry=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1063
        i = <optimized out>
#5  0xb7630da3 in AbortDDX (error=error@entry=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1107
        i = <optimized out>
#6  0xb775cc41 in AbortServer () at ../../os/log.c:767
No locals.
#7  0xb775d6be in FatalError (f=f@entry=0xb7785084 "Caught signal %d (%s). Server aborting\n") at ../../os/log.c:908
        args = 0xbfbe0ee4 "\v"
        args2 = 0xbfbe0ee4 "\v"
        beenhere = 1
#8  0xb7754d84 in OsSigHandler (signo=11, sip=0xbfbe0f0c, unused=0xbfbe0f8c
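
In a trace like this one, frames #0 to #8 are only the server's own abort path. The frame that actually faulted is the one listed directly after gdb's "signal handler called" marker. A minimal Python sketch of picking that frame out of a saved backtrace follows; the function name faulting_frame and the regular expression are illustrative assumptions, and the sample text is three frames copied from the trace above.

```python
import re

# "#10 0xb6f26d79 in func (...)" or, for inlined frames, "#11 func (...)".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.*)$")

SAMPLE = """\
#8  0xb7754d84 in OsSigHandler (signo=11, sip=0xbfbe0f0c, unused=0xbfbe0f8c) at ../../os/osinit.c:147
#9  <signal handler called>
#10 0xb6f26d79 in intel_pixmap_tiled (pixmap=0xb8945808) at ../../../src/uxa/intel.h:138
"""


def faulting_frame(backtrace):
    """Return (frame_number, function_name) of the frame after the
    'signal handler called' marker, or None if there is no such frame."""
    frames = {}
    handler = None
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None:
            continue  # locals, blank lines, or wrapped text
        number, rest = int(m.group(1)), m.group(2)
        frames[number] = rest
        # Works with or without the angle brackets around the marker.
        if handler is None and "signal handler called" in rest:
            handler = number
    if handler is None or handler + 1 not in frames:
        return None
    return handler + 1, frames[handler + 1].split(" ", 1)[0]


print(faulting_frame(SAMPLE))
```

Here it points at intel_pixmap_tiled(), the function that dereferences the result of intel_get_pixmap_private(), which matches the NULL pixmap private described elsewhere in the thread.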