Bug#776800: X server crashes when switching to another user
As of stretch, with the nvidia-driver 375.20 and Xorg 1.19 this problem has disappeared.
Bug#776800: X server crashes when switching to another user
I've observed that most of these X server crashes have a stack backtrace that looks like this:

#0  0x7fde06264107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fde062654e8 in __GI_abort () at abort.c:89
#2  0x7fde089e49e1 in OsAbort () at ../../os/utils.c:1361
#3  0x7fde0883f86e in ddxGiveUp (error=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1088
#4  0x7fde0883f996 in AbortDDX (error=EXIT_ERR_ABORT) at ../../../../hw/xfree86/common/xf86Init.c:1132
#5  0x7fde089ee370 in AbortServer () at ../../os/log.c:783
#6  0x7fde089ee8b1 in FatalError (f=0x7fde08a1de68 "Caught signal %d (%s). Server aborting\n") at ../../os/log.c:924
#7  0x7fde089e1653 in OsSigHandler (signo=11, sip=0x7fff53cb6530, unused=0x7fff53cb6400) at ../../os/osinit.c:147
#8  <signal handler called>
#9  0x7fde0889cd87 in xf86CursorSetCursor (pDev=0x7fde0a9bb260, pScreen=0x7fde0a8d2110, pCurs=0x7fde0ac7b280, x=772, y=596) at ../../../../hw/xfree86/ramdac/xf86Cursor.c:332
#10 0x7fde0889c9e6 in xf86CursorEnableDisableFBAccess (pScrn=0x7fde0a894420, enable=1) at ../../../../hw/xfree86/ramdac/xf86Cursor.c:232
#11 0x7fde00be4ef2 in ?? () from /usr/lib/xorg/modules/drivers/nvidia_drv.so
#12 0x7fde00bdc791 in ?? () from /usr/lib/xorg/modules/drivers/nvidia_drv.so
#13 0x7fde0883afdb in xf86VTEnter () at ../../../../hw/xfree86/common/xf86Events.c:581
#14 0x7fde0883b0c8 in xf86VTSwitch () at ../../../../hw/xfree86/common/xf86Events.c:633
#15 0x7fde0883a5ad in xf86Wakeup (blockData=0x0, err=-1, pReadmask=0x7fde08c85500 <LastSelectMask>) at ../../../../hw/xfree86/common/xf86Events.c:291
#16 0x7fde087e5d0e in WakeupHandler (result=-1, pReadmask=0x7fde08c85500 <LastSelectMask>) at ../../dix/dixutils.c:423
#17 0x7fde089d70f0 in WaitForSomething (pClientsReady=0x7fde0abc4dd0) at ../../os/WaitFor.c:229
#18 0x7fde087d5edd in Dispatch () at ../../dix/dispatch.c:361
#19 0x7fde087e4f70 in dix_main (argc=14, argv=0x7fff53cb6ef8, envp=0x7fff53cb6f70) at ../../dix/main.c:296
#20 0x7fde087c5fc8 in main (argc=14, argv=0x7fff53cb6ef8, envp=0x7fff53cb6f70) at ../../dix/stubmain.c:34

I don't know what the closed-source nvidia_drv.so does in #11 and #12. But in #10 I applied the appended brute-force patch to see what happens and, lo and behold, no crashes after a hundred times switching user and two days of doing normal work!

This patch may introduce a small memory leak - I don't know. But the machine doesn't freeze any more!

@Aaron: Do you still think this is a bug in Xorg?

  -richard.
-- 
Richard B. Kreckel
http://in.terlu.de/~kreckel/

--- xorg-server-1.16.4.orig/hw/xfree86/ramdac/xf86Cursor.c	2015-02-11 00:32:06.0 +0100
+++ xorg-server-1.16.4/hw/xfree86/ramdac/xf86Cursor.c	2015-02-27 22:12:45.166164479 +0100
@@ -223,16 +223,6 @@ xf86CursorEnableDisableFBAccess(ScrnInfo
 
     if (ScreenPriv->EnableDisableFBAccess)
         (*ScreenPriv->EnableDisableFBAccess) (pScrn, enable);
-
-    if (enable && ScreenPriv->SavedCursor) {
-        /*
-         * Re-set current cursor so drivers can react to FB access having been
-         * temporarily disabled.
-         */
-        xf86CursorSetCursor(pDev, pScreen, ScreenPriv->SavedCursor,
-                            ScreenPriv->x, ScreenPriv->y);
-        ScreenPriv->SavedCursor = NULL;
-    }
 }
 
 static Bool
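The patch above drops the code that re-sets ScreenPriv->SavedCursor when framebuffer access is re-enabled, which is also where the backtrace enters xf86CursorSetCursor. One possible reading, and it is only an assumption, is that the saved pointer can outlive the cursor it refers to. The sketch below is not the X server's code and not a proposed fix. It only illustrates the general technique of taking a reference on a saved pointer so that it stays valid until it is restored. Every name in it (Cursor, ScreenState, cursor_ref, fb_access_disable and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted cursor object. */
typedef struct {
    int refcnt;
} Cursor;

/* Hypothetical per-screen state: the cursor currently shown and the one
 * remembered while framebuffer access is disabled. */
typedef struct {
    Cursor *current;
    Cursor *saved;
} ScreenState;

/* Number of cursors not yet freed; used only to check for leaks. */
static int live_cursors = 0;

static Cursor *cursor_new(void)
{
    Cursor *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->refcnt = 1;          /* the creator owns the first reference */
    live_cursors++;
    return c;
}

static Cursor *cursor_ref(Cursor *c)
{
    c->refcnt++;
    return c;
}

static void cursor_unref(Cursor *c)
{
    assert(c->refcnt > 0);
    if (--c->refcnt == 0) {
        free(c);
        live_cursors--;
    }
}

/* Make c the current cursor; the screen holds its own reference. */
static void set_cursor(ScreenState *s, Cursor *c)
{
    if (c != NULL)
        cursor_ref(c);
    if (s->current != NULL)
        cursor_unref(s->current);
    s->current = c;
}

/* FB access disabled: remember the current cursor AND hold a reference,
 * so the saved pointer cannot dangle if everyone else lets go of it. */
static void fb_access_disable(ScreenState *s)
{
    if (s->current != NULL && s->saved == NULL)
        s->saved = cursor_ref(s->current);
}

/* FB access re-enabled: restore the saved cursor, then drop the extra
 * reference taken in fb_access_disable(). */
static void fb_access_enable(ScreenState *s)
{
    if (s->saved != NULL) {
        set_cursor(s, s->saved);
        cursor_unref(s->saved);
        s->saved = NULL;
    }
}
```

With this ownership rule, a cursor released by its creator and replaced on screen while access is disabled is still alive when fb_access_enable() restores it, and it is freed exactly once afterwards.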
Bug#776800: X server crashes when switching to another user
On 02/27/2015 02:55 PM, Richard B. Kreckel wrote:
> I've observed that most of these X server crashes have a stack backtrace
> that looks like this:
>
> [...]
>
> I don't know what the closed-source nvidia_drv.so does in #11 and #12.
> But in #10 I applied the appended brute-force patch to see what happens
> and, lo and behold, no crashes after a hundred times switching user and
> two days of doing normal work!
>
> This patch may introduce a small memory leak - I don't know. But the
> machine doesn't freeze any more!
>
> @Aaron: Do you still think this is a bug in Xorg?

Almost certainly, yes. The NVIDIA calls are part of the EnableDisableFBAccess wrap chain. The NVIDIA driver doesn't do a whole lot with the cursor during that path, especially since it is just calling down to the wrapped xf86CursorEnableDisableFBAccess rather than doing anything with the cursor code directly.

-- Aaron

-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#776800: X server crashes when switching to another user
Package: xorg-server
Version: 2:1.16.2.901-1
Severity: important

I have a couple of Debian/jessie machines configured to be used by several users in turns. They use the "log in as another user" feature of GNOME while other users stay logged in. In about 1 out of 15 cases, X freezes for good after a user has entered the password. When X freezes, the screen remains black and only the mouse pointer is visible but cannot be moved.

Attaching to the frozen process using gdb reveals that the hang occurs because the X server crashed in a call to malloc(). The server's crash handler then calls the driver's LeaveVT function, which tries to allocate memory and hangs while trying to take a malloc lock that is already held.

I should add here that the machines are all equipped with several types of NVIDIA cards and run the proprietary driver. I contacted Aaron Plattner of NVIDIA, who convinced me that the crash occurs on a code path that does not involve the NVIDIA driver.

In order to track down the crashing call, I replaced /usr/bin/Xorg with a script like this:

    #!/bin/bash
    ulimit -c unlimited
    export MALLOC_CHECK_=2
    exec /usr/bin/Xorg.testing "$@"

As expected, switching users makes X crash now instead of freezing. And this is the stack backtrace of a core dump:

Core was generated by `/usr/bin/Xorg.testing :1 -novtswitch -background none -noreset -verbose 3 -auth'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fd92080e107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7fd92080e107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fd92080f4e8 in __GI_abort () at abort.c:89
#2  0x7fd920851850 in malloc_printerr (action=<optimized out>, str=0x7fd92093ad1e "free(): invalid pointer", ptr=<optimized out>) at malloc.c:5000
#3  0x7fd922b2edc7 in FreeCursor (value=0x7fd9238d9fd0, cid=cid@entry=0) at ../../dix/cursor.c:128
#4  0x7fd922bbf988 in xf86CursorSetCursor (pDev=0x7fd9235f4b10, pScreen=0x7fd92350b9c0, pCurs=0x7fd9238d50c0, x=685, y=591) at ../../../../hw/xfree86/ramdac/xf86Cursor.c:327
#5  0x7fd922c85e2b in miPointerUpdateSprite (pDev=0x7fd9235f4b10) at ../../mi/mipointer.c:442
#6  0x7fd922c8607e in miPointerDisplayCursor (pDev=0x7fd9235f4b10, pScreen=0x7fd92350b9c0, pCursor=0x7fd9238d50c0) at ../../mi/mipointer.c:194
#7  0x7fd922bcdef9 in CursorDisplayCursor (pDev=pDev@entry=0x7fd9235f4b10, pScreen=pScreen@entry=0x7fd92350b9c0, pCursor=0x7fd9238d50c0) at ../../xfixes/cursor.c:150
#8  0x7fd922bcdff6 in CursorFreeHideCount (data=<optimized out>, id=<optimized out>) at ../../xfixes/cursor.c:974
#9  0x7fd922b5e1e2 in doFreeResource (res=0x7fd9238d2310, skip=0) at ../../dix/resource.c:873
#10 0x7fd922b5ed2b in FreeResource (id=1094745046, skipDeleteFuncType=skipDeleteFuncType@entry=0) at ../../dix/resource.c:903
#11 0x7fd922bceebf in ProcXFixesShowCursor (client=0x7fd923859bd0) at ../../xfixes/cursor.c:931
#12 0x7fd922b3b0c7 in Dispatch () at ../../dix/dispatch.c:432
#13 0x7fd922b3f266 in dix_main (argc=14, argv=0x7fff220c2138, envp=<optimized out>) at ../../dix/main.c:296
#14 0x7fd9207fab45 in __libc_start_main (main=0x7fd922b295c0 <main>, argc=14, argv=0x7fff220c2138, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff220c2128) at libc-start.c:287
#15 0x7fd922b295ee in _start ()

  -rbk.
-- 
  .''`.  Richard B. Kreckel
 : :' :  krec...@debian.org
 `. `'   krec...@ginac.de
   `-    http://www.ginac.de/~kreckel/
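The freeze described in the report comes from a crash handler that ends up allocating memory while the malloc lock is already held by the interrupted code. As a rough illustration of how a handler can avoid that class of deadlock, here is a minimal sketch that restricts itself to calls POSIX lists as async-signal-safe (write, _exit) and formats its message into a stack buffer. It is not what the X server or any driver does. The names format_signal_message, crash_handler and install_crash_handler are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Build "Caught signal N\n" in a caller-supplied buffer without calling
 * malloc() or stdio. Output is truncated to len bytes and is not
 * NUL-terminated. Returns the number of bytes written. */
static size_t format_signal_message(char *buf, size_t len, int signo)
{
    static const char prefix[] = "Caught signal ";
    char digits[16];
    size_t ndigits = 0, pos = 0;
    unsigned int n = signo < 0 ? 0u : (unsigned int)signo;

    /* Convert the signal number to decimal, least significant digit first. */
    do {
        digits[ndigits++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0 && ndigits < sizeof digits);

    for (size_t i = 0; i < sizeof prefix - 1 && pos < len; i++)
        buf[pos++] = prefix[i];
    while (ndigits > 0 && pos < len)
        buf[pos++] = digits[--ndigits];
    if (pos < len)
        buf[pos++] = '\n';
    return pos;
}

/* Crash handler that never allocates: it reports the signal with write()
 * and terminates with _exit(), both async-signal-safe. */
static void crash_handler(int signo)
{
    char msg[64];
    size_t n = format_signal_message(msg, sizeof msg, signo);
    ssize_t written = write(STDERR_FILENO, msg, n);
    (void)written;          /* nothing useful can be done on failure here */
    _exit(128 + signo);
}

/* Install the handler for SIGSEGV. */
static void install_crash_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}
```

A program would call install_crash_handler() once at startup. Any cleanup that needs to allocate memory would have to happen outside the handler, for example in a separate process that notices the exit status.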