Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Fri, 2019-11-01 at 01:49 -0400, David H. Gutteridge wrote:
> On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> > On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> > > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > > on real hardware with functional dri2. I get the same as with the
> > > VirtualBox client - I have to disable compositing to get xfwm4
> > > working. At the same time glmark2 returns the usual or close to
> > > results.
> >
> > What do you find if you disable compositing to get Xfce to start, and
> > then enable it once xfwm4 is running successfully? I find that seems
> > to work. So it fails some sort of initial probing, but then is able to
> > activate the feature later, anyway. (As if there are two different
> > code paths for this, or something is getting corrupted in memory
> > during start up, but that isn't happening later on. I haven't had a
> > chance to look at it in gdb again, yet.)
>
> Sorry, that's the wrong example, the right example is:
>
> - Move or delete the xfwm4.xml file from the .config path
> - Start Xfce
> - Go to Window Manager Tweaks->Compositor
> - Note that the compositor is enabled, and related setting changes
>   (e.g. opacity of window decorations) successfully apply.
>
> Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
> previous example of starting with the compositor turned off, and then
> turning it on.)

Some time between November and now, this seems to have been resolved in
-current (9.99.32 from earlier this week is what I tested) with both
Intel graphics and in a Qemu VM. Both now start for me without issue
with compositing enabled.

Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
I don't remove xfwm4.xml; I just set use_compositing to false. Then I can
start xfce4 many times without a problem. Even if I delete that file
(outside of an xfce4 session) and then run startxfce4, it starts the first
time; on exit use_compositing is set to true, and the next startxfce4
starts, but without xfwm4 (and with a longer delay while it dumps core).
As I have a terminal in the session, I can still edit xfwm4.xml, set
use_compositing to false and run nohup xfwm4...

On Fri, 1 Nov 2019 at 05:49, David H. Gutteridge wrote:
>
> On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> > On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> > > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > > on real hardware with functional dri2. I get the same as with the
> > > VirtualBox client - I have to disable compositing to get xfwm4
> > > working. At the same time glmark2 returns the usual or close to
> > > results.
> >
> > What do you find if you disable compositing to get Xfce to start, and
> > then enable it once xfwm4 is running successfully? I find that seems
> > to work. So it fails some sort of initial probing, but then is able to
> > activate the feature later, anyway. (As if there are two different
> > code paths for this, or something is getting corrupted in memory
> > during start up, but that isn't happening later on. I haven't had a
> > chance to look at it in gdb again, yet.)
>
> Sorry, that's the wrong example, the right example is:
>
> - Move or delete the xfwm4.xml file from the .config path
> - Start Xfce
> - Go to Window Manager Tweaks->Compositor
> - Note that the compositor is enabled, and related setting changes
>   (e.g. opacity of window decorations) successfully apply.
>
> Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
> previous example of starting with the compositor turned off, and then
> turning it on.)
>
> Regards,
>
> Dave
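The workaround in this message (setting use_compositing to false in xfwm4.xml before starting the session) can be scripted. The following is only a minimal sketch: it assumes the file uses the usual xfconf per-channel layout of nested `property` elements with `name`, `type` and `value` attributes, and the `SAMPLE` text is a hypothetical, cut-down stand-in, not the real file from any of the machines discussed here.

```python
import xml.etree.ElementTree as ET


def set_compositing(xml_text: str, enabled: bool) -> str:
    """Return xml_text with every use_compositing property set to `enabled`."""
    root = ET.fromstring(xml_text)
    # Assumption: booleans are stored as the strings "true" and "false".
    for prop in root.iter("property"):
        if prop.get("name") == "use_compositing":
            prop.set("value", "true" if enabled else "false")
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for xfwm4.xml; the layout is an assumption.
SAMPLE = """<channel name="xfwm4" version="1.0">
  <property name="general" type="empty">
    <property name="use_compositing" type="bool" value="true"/>
    <property name="vblank_mode" type="string" value="auto"/>
  </property>
</channel>"""

if __name__ == "__main__":
    print(set_compositing(SAMPLE, enabled=False))
```

To apply it to a real file, read the text from the xfwm4.xml path mentioned in the thread, pass it through `set_compositing(text, False)`, and write the result back while no Xfce session is running. Other properties are left untouched.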
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > on real hardware with functional dri2. I get the same as with the
> > VirtualBox client - I have to disable compositing to get xfwm4
> > working. At the same time glmark2 returns the usual or close to
> > results.
>
> What do you find if you disable compositing to get Xfce to start, and
> then enable it once xfwm4 is running successfully? I find that seems
> to work. So it fails some sort of initial probing, but then is able to
> activate the feature later, anyway. (As if there are two different
> code paths for this, or something is getting corrupted in memory during
> start up, but that isn't happening later on. I haven't had a chance to
> look at it in gdb again, yet.)

Sorry, that's the wrong example, the right example is:

- Move or delete the xfwm4.xml file from the .config path
- Start Xfce
- Go to Window Manager Tweaks->Compositor
- Note that the compositor is enabled, and related setting changes (e.g.
  opacity of window decorations) successfully apply.

Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
previous example of starting with the compositor turned off, and then
turning it on.)

Regards,

Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> I've tested xfce4 - a few days old build from -current pkgsrc - now on
> real hardware with functional dri2. I get the same as with the
> VirtualBox client - I have to disable compositing to get xfwm4
> working. At the same time glmark2 returns the usual or close to
> results.

What do you find if you disable compositing to get Xfce to start, and
then enable it once xfwm4 is running successfully? I find that seems to
work. So it fails some sort of initial probing, but then is able to
activate the feature later, anyway. (As if there are two different code
paths for this, or something is getting corrupted in memory during start
up, but that isn't happening later on. I haven't had a chance to look at
it in gdb again, yet.)

> The other thing is - firefox used to be able to run WebGL under
> -current a few months ago; now it reports that the system does not
> support it. With overnight built firefox 70.0 I now get a core every
> time I start it up, but then it works fine. The trace is again:
> ...
> (gdb) bt
> #0 0x7ced69a09a41 in pthread_mutex_lock () from /usr/lib/libpthread.so.1
> #1 0x7ced4881c42c in _mesa_error () from /usr/X11R7/lib/modules/dri/swrast_dri.so
> #2 0x7ced48836816 in _mesa_GetString () from /usr/X11R7/lib/modules/dri/swrast_dri.so
> #3 0x7ced581475d3 in ?? () from /usr/pkg/lib/firefox/libxul.so
>
> Now it seems only epiphany can run WebGL (but has some other problems,
> e.g. can't quit from the gui if WebGL was running).

Yes, I've found the same thing. I was confused when video streaming
suddenly stopped working for me, and then I saw Firefox was logging that
it had "exhausted GL driver options", for a site that worked a month or
so ago. (I imagine 9-BETA may still work, but I don't have anything
running it at present.)

Regards,

Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
I've tested xfce4 - a build a few days old from -current pkgsrc - now on
real hardware with functional dri2. I get the same as with the VirtualBox
client - I have to disable compositing to get xfwm4 working. At the same
time glmark2 returns the usual results, or close to them.

The other thing is - firefox used to be able to run WebGL under -current
a few months ago; now it reports that the system does not support it.
With an overnight build of firefox 70.0 I now get a core every time I
start it up, but then it works fine. The trace is again:
...
(gdb) bt
#0 0x7ced69a09a41 in pthread_mutex_lock () from /usr/lib/libpthread.so.1
#1 0x7ced4881c42c in _mesa_error () from /usr/X11R7/lib/modules/dri/swrast_dri.so
#2 0x7ced48836816 in _mesa_GetString () from /usr/X11R7/lib/modules/dri/swrast_dri.so
#3 0x7ced581475d3 in ?? () from /usr/pkg/lib/firefox/libxul.so

Now it seems only epiphany can run WebGL (but it has some other problems,
e.g. it can't quit from the GUI if WebGL was running).

On Sun, 27 Oct 2019 at 23:54, Robert Swindells wrote:
>
> "David H. Gutteridge" wrote:
> >On Wed, 2019-10-16 at 12:10 +0100, Chavdar Ivanov wrote:
> > On Wed, 16 Oct 2019 at 11:03, David H. Gutteridge wrote:
> >
> > > FWIW, aside from Firefox (where I also see this issue), I've found
> > > since the recent Mesa upgrade, Xfce4's window manager consistently
> > > crashes during startup. There's a correlation with Firefox in the
> > > backtrace:
> > >
> > > #3 0x79f26fa5e256 in _mesa_GetString (name=7937) at
> > > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/getstring.c:124
> > > ctx = 0x79f288ae5898
> > > vendor = 0x79f271833414 "Brian Paul"
> > > renderer = 0x79f2718214ab "Mesa"
> > > #4 0x0041b1b5 in ?? ()
> > > No symbol table info available.
> > > #5 0x00442bb8 in ?? ()
> > > No symbol table info available.
> > > [...]
>
> If you still have the core dump for this, or can generate another, it
> could be helpful to examine the ctx variable in _mesa_GetString() using
> gdb.
>
> If it is like firefox then I think you will find that not all of the
> structure is in mapped memory.
>
> Can only think of two ways this could happen, either the ctx pointer
> itself is garbage or the size given to calloc() to allocate the context
> was too small. Have been trying to work out where this gets allocated
> but not found it yet.
>
> I have got some patches to firefox that let me display Google Maps but
> they basically just disable the use of OpenGL.
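The second hypothesis quoted above (the size given to calloc() being too small for the context) can be illustrated without touching Mesa at all. This is only a sketch of one way such a mismatch could arise, namely the allocating code and the reading code disagreeing about the structure layout. Both structures below are hypothetical and are not Mesa's real context type; nothing in the thread confirms this is what happened.

```python
import ctypes


class ContextAsAllocated(ctypes.Structure):
    # Hypothetical layout the allocating code was built against.
    _fields_ = [("api", ctypes.c_int), ("driver_ctx", ctypes.c_void_p)]


class ContextAsRead(ctypes.Structure):
    # Hypothetical layout the reading code expects: one extra array.
    _fields_ = [
        ("api", ctypes.c_int),
        ("extensions", ctypes.c_char * 4096),
        ("driver_ctx", ctypes.c_void_p),
    ]


def fields_past_allocation(allocated, read):
    """Names of fields in `read` that end beyond sizeof(allocated)."""
    limit = ctypes.sizeof(allocated)
    past = []
    for name, _ctype in read._fields_:
        field = getattr(read, name)  # field descriptor with .offset and .size
        if field.offset + field.size > limit:
            past.append(name)
    return past


if __name__ == "__main__":
    print(ctypes.sizeof(ContextAsAllocated), ctypes.sizeof(ContextAsRead))
    print(fields_past_allocation(ContextAsAllocated, ContextAsRead))
```

Any field reported by `fields_past_allocation` would lie outside the block that was actually allocated, so reading it could touch unmapped memory, which is the symptom described for the firefox core.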
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
"David H. Gutteridge" wrote: >On Wed, 2019-10-16 at 12:10 +0100, Chavdar Ivanov wrote: > On Wed, 16 Oct 2019 at 11:03, David H. Gutteridge > wrote: > > > FWIW, aside from Firefox (where I also see this issue), I've found > > since the recent Mesa upgrade, Xfce4's window manager consistently > > crashes during startup. These's a correlation with Firefox in the > > backtrace: > > > > #3 0x79f26fa5e256 in _mesa_GetString (name=7937) at > > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/getstring.c:124 > > ctx = 0x79f288ae5898 > > vendor = 0x79f271833414 "Brian Paul" > > renderer = 0x79f2718214ab "Mesa" > > #4 0x0041b1b5 in ?? () > > No symbol table info available. > > #5 0x00442bb8 in ?? () > > No symbol table info available. > > [...] If you still have the core dump for this, or can generate another, it could be helpful to examine the ctx variable in _mesa_GetString() using gdb. If it is like firefox then I think you will find that not all of the structure is in mapped memory. Can only think of two ways this could happen, either the ctx pointer itself is garbage or the size given to calloc() to allocate the context was too small. Have been trying to work out where this gets allocated but not found it yet. I have got some patches to firefox that let me display Google Maps but they basically just disable the use of OpenGL.
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Sun, Oct 27, 2019 at 02:47:43PM -0400, David H. Gutteridge wrote:
> On Sun, 2019-10-27 at 02:24 +, m...@netbsd.org wrote:
> > On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > > In my case its also swrast_dri, VirtualBox host. I haven't recently
> > > tried xfce4 on a real hardware with intel, I might di that later.
> >
> > I could finally reproduce a crash.
> > And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> > mixing two libGL implementations. That's a minefield.
>
> Interesting. In my case, neither of my machines have MesaLib installed
> from pkgsrc, they're just using native X in this context. (Just curious
> what your graphics chip is? Nvidia?)
>
> Regards,
>
> Dave

I might have been too desperate to reproduce the issue and made some
unusual setups. The scenario Chavdar Ivanov has should be reproducible
with LIBGL_ALWAYS_SOFTWARE=1.

I don't usually use xfce, so this was with a game. My graphics chip is
an nvidia GTX 770.
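The LIBGL_ALWAYS_SOFTWARE=1 suggestion above can be wrapped in a small helper so that a test program is always launched with software rendering forced. This is a minimal sketch; the helper names are made up for illustration, and which program to run (xfwm4, a game, a GL test tool) is left to the caller.

```python
import os
import subprocess


def software_gl_env(base=None):
    """Return a copy of the environment with software rendering forced."""
    env = dict(os.environ if base is None else base)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


def run_with_software_gl(argv):
    """Run argv with LIBGL_ALWAYS_SOFTWARE=1 and return its exit status."""
    # On POSIX a negative status -N means the child was killed by signal N.
    return subprocess.run(argv, env=software_gl_env()).returncode
```

For example, `run_with_software_gl(["xfwm4", "--replace"])` inside a running X session would be one way to try the reproduction. That command line is an assumption about the local setup, not something taken from the thread.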
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Sun, 2019-10-27 at 14:14 +, Chavdar Ivanov wrote:
> I do not have MesaLib installed on this v/b guest at all.
>
> I bisected xfwm4.xml to try to find out which setting was causing the
> problem. I didn't bother to read it first, as the result was obvious:
> ..
> ~ diff -u .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE
> .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml
> --- .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE 2019-10-25
> 22:13:04.791908990 +0100
> +++ .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml 2019-10-27
> 14:09:13.334172740 +
> @@ -71,7 +71,7 @@
>
>
>
> -
> +
>
>
>
>
> So the problem is that on first invocation xfce4 sets use_composing to
> true, even if composing is not available or not functional.

Well, there's more to it than that. This is happening on real hardware
with Intel graphics, where there should be no such issue. (That's why I
was referring to testing the different vblank settings in that config
file before. Those settings in turn make xfwm4 choose different back
ends for that aspect. Though that's just one piece.)

Regardless, it's not that simple with virtualized environments, either.
Or, at least, not mine. Having compositing enabled in that config file
worked without issue before, and now it doesn't. (I just re-tested on an
older QEMU VM snapshot of 8.99.50 from mid-July, with the current state
of Xfce in pkgsrc, and there's no xfwm4 startup crash.)

Regards,

Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Sun, 2019-10-27 at 02:24 +, m...@netbsd.org wrote:
> On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > In my case its also swrast_dri, VirtualBox host. I haven't recently
> > tried xfce4 on a real hardware with intel, I might di that later.
>
> I could finally reproduce a crash.
> And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> mixing two libGL implementations. That's a minefield.

Interesting. In my case, neither of my machines have MesaLib installed
from pkgsrc, they're just using native X in this context. (Just curious
what your graphics chip is? Nvidia?)

Regards,

Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
Chavdar Ivanov wrote:
>On Sun, 27 Oct 2019 at 16:25, Robert Swindells wrote:
>>
>> Chavdar Ivanov wrote:
>> >I do not have MesaLib installed on this v/b guest at all.
>>
>> Are you running modular or native xorg ?
>
>Native.

Ok, so either you have MesaLib from xsrc installed or you have deleted
it yourself.
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
Native.

On Sun, 27 Oct 2019 at 16:25, Robert Swindells wrote:
>
> Chavdar Ivanov wrote:
> >I do not have MesaLib installed on this v/b guest at all.
>
> Are you running modular or native xorg ?
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
Chavdar Ivanov wrote:
>I do not have MesaLib installed on this v/b guest at all.

Are you running modular or native xorg?
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
I do not have MesaLib installed on this v/b guest at all.

I bisected xfwm4.xml to try to find out which setting was causing the
problem. I didn't bother to read it first, as the result was obvious:
..
~ diff -u .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml
--- .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE 2019-10-25 22:13:04.791908990 +0100
+++ .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml 2019-10-27 14:09:13.334172740 +
@@ -71,7 +71,7 @@
-
+

So the problem is that on first invocation xfce4 sets use_compositing to
true, even if compositing is not available or not functional.

On Sun, 27 Oct 2019 at 02:24, wrote:
>
> On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > In my case its also swrast_dri, VirtualBox host. I haven't recently
> > tried xfce4 on a real hardware with intel, I might di that later.
>
> I could finally reproduce a crash.
> And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> mixing two libGL implementations. That's a minefield.
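Bisecting a configuration file, as described in this message, amounts to repeatedly halving the set of candidate settings and keeping the half that still triggers the crash. The sketch below shows only that search loop. It assumes exactly one setting is responsible, and the `crashes` callback is hypothetical: in practice it would have to write the chosen settings into xfwm4.xml, start xfwm4 and report whether it dumped core.

```python
def bisect_setting(settings, crashes):
    """Return the one setting for which crashes(subset) is true.

    Assumes a single culprit, and that crashes() depends only on
    whether that culprit is present in the subset it is given.
    """
    candidates = list(settings)
    if not candidates:
        raise ValueError("no settings to bisect")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the crash.
        candidates = half if crashes(half) else candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    names = ["theme", "vblank_mode", "use_compositing", "snap_width"]
    print(bisect_setting(names, lambda subset: "use_compositing" in subset))
```

The setting names in the usage example are placeholders. With n settings the loop needs about log2(n) restarts of the window manager rather than n.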
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> In my case its also swrast_dri, VirtualBox host. I haven't recently
> tried xfce4 on a real hardware with intel, I might di that later.

I could finally reproduce a crash.
And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
mixing two libGL implementations. That's a minefield.
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
In my case it's also swrast_dri, VirtualBox host. I haven't recently
tried xfce4 on real hardware with Intel graphics; I might do that later.

On Sat, 26 Oct 2019 at 19:25, David H. Gutteridge wrote:
>
> On Sat, 2019-10-26 at 00:40 +, m...@netbsd.org wrote:
> > Can someone who has this issue explain it shortly?
> >
> > - Which GPU?
> > - What part of updating (kernel, userland) did it?
> > - Does a clean build of everything fix it?
> >
> > the i915 driver has broken userland compatibility. mrg/riastradh fixed
> > it, but I won't be surprised if there's more we haven't spotted with
> > the high bar of "does startx work".
>
> I'm seeing it in two contexts:
> - on a laptop with Intel graphics (presently kernel as of Oct. 15th,
>   userland as of Oct. 13th, pkgsrc has gone through many updates)
> - in a QEMU VM that has no DRM capabilities (so in that case, xfwm4 is
>   falling back to swrast_dri.so) (kernel and userland as of Oct. 2nd)
>
> Both were working as of -current's state in mid-August, as I tested the
> xfwm4 update to 4.14.0 on them. Some time between then and early October
> (9.99.15 from a kernel perspective), this issue emerged, it seems.
>
> I've done a subsequent full update to a -current 9.99.17 plus userland
> from mid-October on the laptop, which hasn't made any difference.
> (The kernel and userland in the VM are from Releng builds.) I haven't
> yet tried a full replacement of every package for either of those
> machines, but I have rebuilt all of Xfce, plus dependencies like gtk3
> and such on the laptop, and that hasn't made a difference. (I
> specifically walked through the dependency chain for xfwm4.)
>
> My suspicion is this relates to the Mesa update, but I can't say for
> sure. But there seemed to be overlap with the Firefox issue that was
> being discussed.
>
> Regards,
>
> Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
On Sat, 2019-10-26 at 00:40 +, m...@netbsd.org wrote:
> Can someone who has this issue explain it shortly?
>
> - Which GPU?
> - What part of updating (kernel, userland) did it?
> - Does a clean build of everything fix it?
>
> the i915 driver has broken userland compatibility. mrg/riastradh fixed
> it, but I won't be surprised if there's more we haven't spotted with the
> high bar of "does startx work".

I'm seeing it in two contexts:
- on a laptop with Intel graphics (presently kernel as of Oct. 15th,
  userland as of Oct. 13th, pkgsrc has gone through many updates)
- in a QEMU VM that has no DRM capabilities (so in that case, xfwm4 is
  falling back to swrast_dri.so) (kernel and userland as of Oct. 2nd)

Both were working as of -current's state in mid-August, as I tested the
xfwm4 update to 4.14.0 on them. Some time between then and early October
(9.99.15 from a kernel perspective), this issue emerged, it seems.

I've done a subsequent full update to a -current 9.99.17 plus userland
from mid-October on the laptop, which hasn't made any difference.
(The kernel and userland in the VM are from Releng builds.) I haven't
yet tried a full replacement of every package for either of those
machines, but I have rebuilt all of Xfce, plus dependencies like gtk3
and such on the laptop, and that hasn't made a difference. (I
specifically walked through the dependency chain for xfwm4.)

My suspicion is this relates to the Mesa update, but I can't say for
sure. But there seemed to be overlap with the Firefox issue that was
being discussed.

Regards,

Dave
Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")
Can someone who has this issue explain it briefly?

- Which GPU?
- What part of updating (kernel, userland) did it?
- Does a clean build of everything fix it?

The i915 driver has broken userland compatibility. mrg/riastradh fixed
it, but I won't be surprised if there's more we haven't spotted with the
high bar of "does startx work".

https://github.com/NetBSD/src/commit/52ef9d9e2c837c205a00799c3d54c3ef4d65d68d