Bug#926212: gnome-shell crashed: segfault in libgnome-shell.so after printing email from evolution
On Tue, 16 Apr 2019 at 22:47:14 +0100, Simon McVittie wrote:
> On Tue, 02 Apr 2019 at 08:11:23 +0200, Guenter Grodotzki wrote:
> > [39719.061358] gnome-shell[1279]: segfault at 0 ip 7fd4fa6ae3bf sp
> > 7ffcf4dbaea0 error 4 in libgnome-shell.so[7fd4fa6a6000+1f000]
>
> How often has this happened? Is it reproducible, or is it something
> that happened once and has not recurred?

Please try gnome-shell 3.30.2-8 when it becomes available in unstable.
I think it might fix this crash.

    smcv
Bug#926212: gnome-shell crashed: segfault in libgnome-shell.so after printing email from evolution
On Tue, 16 Apr 2019 22:47:14 +0100 Simon McVittie wrote:
> How sure are you that the virtual memory area starting at 0x7fd4fa6a6000
> starts with .init and not .text?

Unfortunately I am not completely sure. However, I deliberately caused a
crash in a process whose memory layout I knew, and the dmesg line for that
crash also contained the start of .init. The same calculation also worked
later in bug 927142.

Kind regards,
Bernhard
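For readers following the arithmetic, the subtraction being questioned can be sketched in C. This is only an illustration: the helper names `fault_offset` and `ip_in_mapping` are hypothetical, and the three constants in the usage note are taken from the submitter's dmesg line. The result is an offset into the mapping the kernel reported; which section that mapping starts with is exactly the open question above.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction pointer relative to the base of the
 * mapping printed in square brackets in the dmesg line.
 * Hypothetical helper, not part of any existing tool. */
uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
    return ip - map_base;
}

/* Returns 1 if ip lies inside [map_base, map_base + map_size), else 0.
 * A cheap sanity check before trusting the computed offset. */
int ip_in_mapping(uint64_t ip, uint64_t map_base, uint64_t map_size)
{
    return ip >= map_base && (ip - map_base) < map_size;
}
```

With the values from the report, `fault_offset(0x7fd4fa6ae3bf, 0x7fd4fa6a6000)` gives 0x83bf, and that offset lies inside the 0x1f000-byte mapping. To turn it into an address that gdb or addr2line understands, it still has to be added to the start address of whichever section the mapping really begins with.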
Bug#926212: gnome-shell crashed (segfault)
Control: forwarded -1 https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/497

On Sun, 07 Apr 2019 at 20:00:23 +0200, Bernhard Übelacker wrote:
> PS.: My untested change in message 10 might not crash, but lead to an
> infinite loop, as app->running_state might not change anymore...

Yeah, let's not do that.

If line 1485 of shell_app_dispose() is indeed what's crashing, then that
would mean that app->running_state is non-NULL but
app->running_state->windows is NULL. This doesn't seem to be meant to
happen: we can see that in window_backed_app_get_window(), there's an
assertion that if app->running_state is non-NULL, then
app->running_state->windows is meant to be non-NULL too.

A common feature of upstream bugs 750 and 822 is that
_shell_app_remove_window() line 1110 is in the stack trace: this means
the ShellApp is being disposed while shell_app_sync_running_state() is
still running. At this point in _shell_app_remove_window(), window has
already been removed from app->running_state->windows, but
app->running_state has not yet been cleared: so the invariant that
app->running_state is only non-NULL if app->running_state->windows is
also non-NULL does not hold, and indeed in both those upstream bugs we
seem to have a violation of that invariant, causing a crash when user
code disposes the ShellApp in response to one of its signals.

I think the solution is probably to stop believing that
app->running_state != NULL implies app->running_state->windows != NULL,
and check for the latter whenever we need it; but the refcounting of the
ShellApp still seems suspicious, so I'm hoping for input from upstream
before uploading anything for this.

    smcv
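To make the direction described above concrete, here is a minimal, self-contained sketch of a dispose loop that does not rely on the invariant. It is not the upstream change in merge request 497 and does not use GLib: `WindowNode`, `RunningState`, `App` and all three functions are hypothetical stand-ins for GSList, ShellAppRunningState, ShellApp and the real shell-app.c code. The ShellApp refcounting question is not addressed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GSList, ShellAppRunningState and ShellApp. */
typedef struct WindowNode { void *data; struct WindowNode *next; } WindowNode;
typedef struct { WindowNode *windows; } RunningState;
typedef struct { RunningState *running_state; } App;

/* Frees the running state and any window nodes still attached to it. */
void app_clear_running_state(App *app)
{
    if (app->running_state == NULL)
        return;
    WindowNode *node = app->running_state->windows;
    while (node != NULL) {
        WindowNode *next = node->next;
        free(node);
        node = next;
    }
    free(app->running_state);
    app->running_state = NULL;
}

/* Removes the first window; clears the running state once the list is
 * empty. The caller must ensure the list is non-empty. */
void app_remove_first_window(App *app)
{
    RunningState *rs = app->running_state;
    WindowNode *head = rs->windows;
    rs->windows = head->next;
    free(head);
    if (rs->windows == NULL)
        app_clear_running_state(app);
}

/* Dispose that checks windows explicitly instead of assuming that
 * running_state != NULL implies windows != NULL. */
void app_dispose(App *app)
{
    while (app->running_state != NULL && app->running_state->windows != NULL)
        app_remove_first_window(app);
    /* Covers the half-torn-down case: running_state set, windows NULL. */
    app_clear_running_state(app);
}
```

Because the loop condition tests the window list itself, the loop ends in the half-torn-down state instead of dereferencing NULL or spinning forever, and the final call releases the leftover running state.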
Bug#926212: gnome-shell crashed: segfault in libgnome-shell.so after printing email from evolution
Control: retitle -1 gnome-shell crashed: segfault in libgnome-shell.so after printing email from evolution
Control: tags -1 + moreinfo
Control: forwarded -1 https://gitlab.gnome.org/GNOME/gnome-shell/issues/750

I'm retitling this bug to try to stop other people using it to represent
different segfaults, because after someone starts doing that it becomes
really difficult to disentangle who has which bug and which bugs have
been solved.

Upstream bug 750 seems particularly similar.

On Tue, 02 Apr 2019 at 08:11:23 +0200, Guenter Grodotzki wrote:
> [39719.061358] gnome-shell[1279]: segfault at 0 ip 7fd4fa6ae3bf sp
> 7ffcf4dbaea0 error 4 in libgnome-shell.so[7fd4fa6a6000+1f000]

How often has this happened? Is it reproducible, or is it something
that happened once and has not recurred?

On Fri, 05 Apr 2019 at 22:01:58 +0200, Bernhard Übelacker wrote:
> As this information is still kind of small, you might consider
> to install a coredump collector like systemd-coredump.
> That way you could list crashes of the current boot by:
>     coredumpctl list
> And some more information is entered into journal that would
> help a lot to triage such crashes ("Stack trace of thread...").
>     journalctl --no-pager
>
> Even better would be if you could install the debug symbol
> packages e.g. gnome-shell-dbgsym like described in [1].
> Then following commands should print a backtrace
> with source line information.

This would be very useful information.

> Nevertheless, I tried if that little information brings
> us somewhere and I think it leads into function
> shell_app_dispose. There, I assume, we reach line 1485,
> unfortunately dereferencing a null pointer
> in app->running_state->windows.
>
> crash instruction - start .init == diff
> 0x7fd4fa6ae3bf - 0x7fd4fa6a6000 == 0x83BF

How sure are you that the virtual memory area starting at 0x7fd4fa6a6000
starts with .init and not .text?

    smcv
Bug#926212: gnome-shell crashed (segfault)
Hello Guenter Grodotzki,

(I guess you wanted me to receive your last message, so you should use
"reply all"; otherwise it just gets attached to your bug report.)

I have left a note in this upstream report [1]; let's see if they agree.

Kind regards,
Bernhard

[1] https://gitlab.gnome.org/GNOME/gnome-shell/issues/822#note_484642

PS.: My untested change in message 10 might not crash, but lead to an
infinite loop, as app->running_state might not change anymore...
Bug#926212: gnome-shell crashed (segfault)
Thanks Bernhard for the help. I am only now realising that, even though I am running "testing", the bug is most likely an upstream one.

> On 05 Apr 2019, at 22:01, Bernhard Übelacker wrote:
>
> Hello Guenter Grodotzki,
> I just tried to help triage that issue.
>
> For some reason you just added the segfault line.
> I assume there was one line following starting with "Code:".
> Please add that line too when submitting bugs.
>
> As this information is still kind of small, you might consider
> to install a coredump collector like systemd-coredump.
> That way you could list crashes of the current boot by:
>     coredumpctl list
> And some more information is entered into journal that would
> help a lot to triage such crashes ("Stack trace of thread...").
>     journalctl --no-pager
>
> Even better would be if you could install the debug symbol
> packages e.g. gnome-shell-dbgsym like described in [1].
> Then following commands should print a backtrace
> with source line information.
>
>
> Nevertheless, I tried if that little information brings
> us somewhere and I think it leads into function
> shell_app_dispose. There, I assume, we reach line 1485,
> unfortunately dereferencing a null pointer
> in app->running_state->windows.
>
>
> There are some upstream bugs [2], which point to that line.
> Unfortunately it looks like there is no fix yet committed.
>
>
> But, if I am right, something like this could
> help already (untested)?
>
>    while (app->running_state)
> -    _shell_app_remove_window (app, app->running_state->windows->data);
> +    if (app->running_state->windows) _shell_app_remove_window (app, app->running_state->windows->data);
>
>    /* We should have been transitioned when we removed all of our windows */
>
>
> Kind regards,
> Bernhard
>
>
> [1] https://wiki.debian.org/HowToGetABacktrace#Installing_the_debugging_symbols
> [2] https://gitlab.gnome.org/GNOME/gnome-shell/issues/590
>     https://gitlab.gnome.org/GNOME/gnome-shell/issues/766
>     https://gitlab.gnome.org/GNOME/gnome-shell/issues/750
>     https://gitlab.gnome.org/GNOME/gnome-shell/issues/918
>     https://gitlab.gnome.org/GNOME/gnome-shell/issues/822
>     https://bugzilla.redhat.com/show_bug.cgi?id=1654420#c22
>
>
> (gdb) list shell-app.c:1477,1492
> 1477    static void
> 1478    shell_app_dispose (GObject *object)
> 1479    {
> 1480      ShellApp *app = SHELL_APP (object);
> 1481
> 1482      g_clear_object (&app->info);
> 1483
> 1484      while (app->running_state)
> 1485        _shell_app_remove_window (app, app->running_state->windows->data);
> 1486
> 1487      /* We should have been transitioned when we removed all of our windows */
> 1488      g_assert (app->state == SHELL_APP_STATE_STOPPED);
> 1489      g_assert (app->running_state == NULL);
> 1490
> 1491      G_OBJECT_CLASS(shell_app_parent_class)->dispose (object);
> 1492    }
>
Bug#926212: gnome-shell crashed (segfault)
Hello Guenter Grodotzki,
I just tried to help triage that issue.

For some reason you just added the segfault line.
I assume there was one line following starting with "Code:".
Please add that line too when submitting bugs.

As this information is still kind of small, you might consider
installing a coredump collector like systemd-coredump.
That way you could list crashes of the current boot by:
    coredumpctl list
And some more information is entered into the journal that would
help a lot to triage such crashes ("Stack trace of thread...").
    journalctl --no-pager

Even better would be if you could install the debug symbol
packages, e.g. gnome-shell-dbgsym, as described in [1].
Then the following commands should print a backtrace
with source line information.


Nevertheless, I tried to see whether that little information gets
us somewhere, and I think it leads into function
shell_app_dispose. There, I assume, we reach line 1485,
unfortunately dereferencing a null pointer
in app->running_state->windows.


There are some upstream bugs [2] which point to that line.
Unfortunately it looks like no fix has been committed yet.


But, if I am right, something like this could
help already (untested)?

   while (app->running_state)
-    _shell_app_remove_window (app, app->running_state->windows->data);
+    if (app->running_state->windows) _shell_app_remove_window (app, app->running_state->windows->data);

   /* We should have been transitioned when we removed all of our windows */


Kind regards,
Bernhard


[1] https://wiki.debian.org/HowToGetABacktrace#Installing_the_debugging_symbols
[2] https://gitlab.gnome.org/GNOME/gnome-shell/issues/590
    https://gitlab.gnome.org/GNOME/gnome-shell/issues/766
    https://gitlab.gnome.org/GNOME/gnome-shell/issues/750
    https://gitlab.gnome.org/GNOME/gnome-shell/issues/918
    https://gitlab.gnome.org/GNOME/gnome-shell/issues/822
    https://bugzilla.redhat.com/show_bug.cgi?id=1654420#c22


(gdb) list shell-app.c:1477,1492
1477    static void
1478    shell_app_dispose (GObject *object)
1479    {
1480      ShellApp *app = SHELL_APP (object);
1481
1482      g_clear_object (&app->info);
1483
1484      while (app->running_state)
1485        _shell_app_remove_window (app, app->running_state->windows->data);
1486
1487      /* We should have been transitioned when we removed all of our windows */
1488      g_assert (app->state == SHELL_APP_STATE_STOPPED);
1489      g_assert (app->running_state == NULL);
1490
1491      G_OBJECT_CLASS(shell_app_parent_class)->dispose (object);
1492    }


# Buster amd64 qemu VM 2019-04-05

apt update
apt dist-upgrade
apt install dpkg-dev devscripts systemd-coredump bc xserver-xorg dbus-x11 gdm3 gnome gdb elfutils binutils gnome-shell-dbgsym
systemctl start gdm3

mkdir /home/benutzer/source/gnome-shell/orig -p
cd /home/benutzer/source/gnome-shell/orig
apt source gnome-shell
cd


# From submitter
[39719.061358] gnome-shell[1279]: segfault at 0 ip 7fd4fa6ae3bf sp 7ffcf4dbaea0 error 4 in libgnome-shell.so[7fd4fa6a6000+1f000]


https://www.enodev.fr/posts/decode-segfault-errors-in-dmesg.html
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/mm/fault.c?h=linux-4.9.y#n31

/*
 * Page fault error code bits:
 *
 *   bit 0 ==    0: no page found       1: protection fault
 *   bit 1 ==    0: read access         1: write access
 *   bit 2 ==    0: kernel-mode access  1: user-mode access
 *   bit 3 ==                           1: use of reserved bit detected
 *   bit 4 ==                           1: fault was an instruction fetch
 *   bit 5 ==                           1: protection keys block access
 */
enum x86_pf_error_code {
        PF_PROT         =               1 << 0,
        PF_WRITE        =               1 << 1,
        PF_USER         =               1 << 2,
        PF_RSVD         =               1 << 3,
        PF_INSTR        =               1 << 4,
        PF_PK           =               1 << 5,
};

"error 4" == 0b100
    bit 0 ==    0: no page found
    bit 1 ==    0: read access
    bit 2 ==    1: user-mode access


# From submitter
[39719.061358] gnome-shell[1279]: segfault at 0 ip 7fd4fa6ae3bf sp 7ffcf4dbaea0 error 4 in libgnome-shell.so[7fd4fa6a6000+1f000]

crash instruction - start .init  == diff
0x7fd4fa6ae3bf    - 0x7fd4fa6a6000 == 0x83BF


benutzer@debian:~$ gdb -q -ex 'set width 0' -ex 'set pagination off' -ex 'info share' -ex 'info target' -ex 'detach' -ex 'quit' --pid $(pidof gnome-shell) 2>&1 | grep libgnome-shell.so
0x7f2482ab2f10 0x7f2482acd22e Yes /usr/lib/gnome-shell/libgnome-shell.so
0x7f2482a98238 - 0x7f2482a9825c is .note.gnu.build-id in /usr/lib/gnome-shell/libgnome-shell.so
0x7f2482a98260 - 0x7f2482a99004 is .gnu.hash in /usr/lib/gnome-shell/libgnome-shell.so
0x7f2482a99008 - 0x7f2482a9fd40 is .dynsym in /usr/lib/gnome-shell/libgnome-shell.so
0x
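The manual decoding of "error 4" earlier in this message can also be written as a small C helper. This is a sketch only: the bit values mirror the x86_pf_error_code enum quoted from the kernel source, while `struct pf_decoded` and `decode_pf_error` are hypothetical names invented for illustration and are not kernel or libc interfaces.

```c
#include <assert.h>

/* Bit values as in the kernel's x86_pf_error_code enum quoted above. */
enum pf_bits {
    PF_PROT  = 1 << 0,  /* 0: no page found, 1: protection fault */
    PF_WRITE = 1 << 1,  /* 0: read access,   1: write access     */
    PF_USER  = 1 << 2,  /* 0: kernel mode,   1: user mode        */
    PF_RSVD  = 1 << 3,  /* reserved bit detected                 */
    PF_INSTR = 1 << 4,  /* fault was an instruction fetch        */
    PF_PK    = 1 << 5   /* protection keys block access          */
};

/* Hypothetical result type: one flag per bit of the error code. */
struct pf_decoded {
    int protection_fault;
    int write_access;
    int user_mode;
    int reserved_bit;
    int instruction_fetch;
    int protection_key;
};

/* Splits the numeric "error N" value from a dmesg segfault line into
 * individual flags (each field is 0 or 1). */
struct pf_decoded decode_pf_error(unsigned long code)
{
    struct pf_decoded d;
    d.protection_fault  = (code & PF_PROT)  != 0;
    d.write_access      = (code & PF_WRITE) != 0;
    d.user_mode         = (code & PF_USER)  != 0;
    d.reserved_bit      = (code & PF_RSVD)  != 0;
    d.instruction_fetch = (code & PF_INSTR) != 0;
    d.protection_key    = (code & PF_PK)    != 0;
    return d;
}
```

For the submitter's value, `decode_pf_error(4)` sets only `user_mode`: a read from user mode of a page that was not found, which fits a NULL pointer dereference at address 0.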
Bug#926212: gnome-shell crashed (segfault)
Package: gnome-shell
Version: 3.30.2-3
Severity: critical
Justification: causes serious data loss

Dear Maintainer,

   * What led up to the situation?
     I was trying to print an email from evolution
   * What exactly did you do (or not do) that was effective (or ineffective)?
     gnome-shell automatically restarted
   * What was the outcome of this action?
     -
   * What outcome did you expect instead?
     -

Via dmesg:
[39719.061358] gnome-shell[1279]: segfault at 0 ip 7fd4fa6ae3bf sp 7ffcf4dbaea0 error 4 in libgnome-shell.so[7fd4fa6a6000+1f000]

Unfortunately due to the crash it seems evolution has a corrupted mail database.

-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-4-amd64 (SMP w/4 CPU cores)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_ZA.UTF-8, LC_CTYPE=en_ZA.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_ZA.UTF-8), LANGUAGE=en_ZA:en (charmap=UTF-8) (ignored: LC_ALL set to en_ZA.UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages gnome-shell depends on:
ii  dconf-gsettings-backend [gsettings-backend]  0.30.1-2
ii  evolution-data-server  3.30.5-1
ii  gir1.2-accountsservice-1.0  0.6.45-2
ii  gir1.2-atspi-2.0  2.30.0-7
ii  gir1.2-freedesktop  1.58.3-2
ii  gir1.2-gcr-3  3.28.1-1
ii  gir1.2-gdesktopenums-3.0  3.28.1-1
ii  gir1.2-gdm-1.0  3.30.2-3
ii  gir1.2-geoclue-2.0  2.5.2-1
ii  gir1.2-glib-2.0  1.58.3-2
ii  gir1.2-gnomebluetooth-1.0  3.28.2-3
ii  gir1.2-gnomedesktop-3.0  3.30.2.1-1
ii  gir1.2-gtk-3.0  3.24.5-1
ii  gir1.2-gweather-3.0  3.28.2-2
ii  gir1.2-ibus-1.0  1.5.19-4
ii  gir1.2-mutter-3  3.30.2-6
ii  gir1.2-nm-1.0  1.14.6-2
ii  gir1.2-nma-1.0  1.8.20-1
ii  gir1.2-pango-1.0  1.42.4-6
ii  gir1.2-polkit-1.0  0.105-25
ii  gir1.2-rsvg-2.0  2.44.10-1
ii  gir1.2-soup-2.4  2.64.2-2
ii  gir1.2-upowerglib-1.0  0.99.10-1
ii  gjs  1.54.3-1
ii  gnome-backgrounds  3.30.0-1
ii  gnome-settings-daemon  3.30.2-3
ii  gnome-shell-common  3.30.2-3
ii  gsettings-desktop-schemas  3.28.1-1
ii  libatk-bridge2.0-0  2.30.0-5
ii  libatk1.0-0  2.30.0-2
ii  libc6  2.28-8
ii  libcairo2  1.16.0-4
ii  libcanberra-gtk3-0  0.30-7
ii  libcanberra0  0.30-7
ii  libcroco3  0.6.12-3
ii  libecal-1.2-19  3.30.5-1
ii  libedataserver-1.2-23  3.30.5-1
ii  libgcr-base-3-1  3.28.1-1
ii  libgdk-pixbuf2.0-0  2.38.1+dfsg-1
ii  libgirepository-1.0-1  1.58.3-2
ii  libgjs0g  1.54.3-1
ii  libglib2.0-0  2.58.3-1
ii  libglib2.0-bin  2.58.3-1
ii  libgstreamer1.0-0  1.14.4-1
ii  libgtk-3-0  3.24.5-1
ii  libical3  3.0.4-3
ii  libjson-glib-1.0-0  1.4.4-2
ii  libmutter-3-0  3.30.2-6
ii  libnm0  1.14.6-2
ii  libpango-1.0-0  1.42.4-6
ii  libpangocairo-1.0-0  1.42.4-6
ii  libpolkit-agent-1-0  0.105-25
ii  libpolkit-gobject-1-0  0.105-25
ii  libpulse-mainloop-glib0  12.2-4
ii  libpulse0  12.2-4
ii  libsecret-1-0  0.18.7-1
ii  libstartup-notification0  0.12-6
ii  libsystemd0  241-1
ii  libx11-6  2:1.6.7-1
ii  libxfixes3  1:5.0.3-1
ii  mutter  3.30.2-6
ii  python3  3.7.2-1

Versions of packages gnome-shell recommends:
ii  bolt  0.7-2
ii  chrome-gnome-shell  10.1-5
ii  gdm3  3.30.2-3
ii  gkbd-capplet  3.26.1-1
ii  gnome-control-