Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2020-01-03 Thread David H. Gutteridge
On Fri, 2019-11-01 at 01:49 -0400, David H. Gutteridge wrote:
> On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> > On Tue, 2019-10-29 at 10:38 +0000, Chavdar Ivanov wrote:
> > > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > > on real hardware with functional dri2. I get the same as with the
> > > VirtualBox client - I have to disable compositing to get xfwm4
> > > working. At the same time glmark2 returns the usual results, or
> > > close to them.
> > 
> > What do you find if you disable compositing to get Xfce to start,
> > and then enable it once xfwm4 is running successfully? I find that
> > seems to work. So it fails some sort of initial probing, but then is
> > able to activate the feature later, anyway. (As if there are two
> > different code paths for this, or something is getting corrupted in
> > memory during start up, but that isn't happening later on. I haven't
> > had a chance to look at it in gdb again, yet.)
> 
> Sorry, that's the wrong example, the right example is:
> 
> - Move or delete the xfwm4.xml file from the .config path
> - Start Xfce
> - Go to Window Manager Tweaks->Compositor
> - Note that the compositor is enabled, and related setting changes
>   (e.g. opacity of window decorations) successfully apply.
> 
> Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
> previous example of starting with the compositor turned off, and then
> turning it on.)

Some time between November and now, this seems to have been resolved in
-current (9.99.32 from earlier this week is what I tested) with both
Intel graphics and in a Qemu VM. Both now start for me without issue
with compositing enabled.

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-11-01 Thread Chavdar Ivanov
I don't remove xfwm4.xml, I just set the use_compositing setting to
false. Then I am able to start xfce4 without a problem many times. Even
if I delete that file - outside of xfce4 - and then run startxfce4, it
starts the first time; on exit use_compositing is set to true, and a
subsequent startxfce4 starts, but without xfwm4 (and with a longer delay
while it dumps core). As I have a terminal in the session, I can still
edit xfwm4.xml, set use_compositing to false and nohup xfwm4...
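
A minimal sketch of the workaround described above, not a tested fix. It
assumes the setting is stored on a single line of the form
<property name="use_compositing" type="bool" value="..."/> in the
xfwm4.xml path quoted elsewhere in this thread; the helper name
set_compositing is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: rewrite the use_compositing value in an xfwm4.xml file.
# Assumption: the property sits on one line, e.g.
#   <property name="use_compositing" type="bool" value="true"/>
set_compositing() {
    file=$1      # path to xfwm4.xml
    value=$2     # "true" or "false"
    tmp="$file.tmp.$$"
    # Only touch the line that names use_compositing; write via a temp
    # file so no in-place sed extension is needed.
    sed "/name=\"use_compositing\"/s/value=\"[a-z]*\"/value=\"$value\"/" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage, from a terminal inside the session (left commented out here):
# set_compositing "$HOME/.config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml" false
# nohup xfwm4 &
```

Writing through a temporary file is a deliberate choice to keep the
sketch portable between sed implementations.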

On Fri, 1 Nov 2019 at 05:49, David H. Gutteridge  wrote:
>
> On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> > On Tue, 2019-10-29 at 10:38 +0000, Chavdar Ivanov wrote:
> > > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > > on real hardware with functional dri2. I get the same as with the
> > > VirtualBox client - I have to disable compositing to get xfwm4
> > > working. At the same time glmark2 returns the usual results, or
> > > close to them.
> >
> > What do you find if you disable compositing to get Xfce to start, and
> > then enable it once xfwm4 is running successfully? I find that seems
> > to work. So it fails some sort of initial probing, but then is able to
> > activate the feature later, anyway. (As if there are two different
> > code paths for this, or something is getting corrupted in memory during
> > start up, but that isn't happening later on. I haven't had a chance to
> > look at it in gdb again, yet.)
>
> Sorry, that's the wrong example, the right example is:
>
> - Move or delete the xfwm4.xml file from the .config path
> - Start Xfce
> - Go to Window Manager Tweaks->Compositor
> - Note that the compositor is enabled, and related setting changes (e.g.
> opacity of window decorations) successfully apply.
>
> Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
> previous example of starting with the compositor turned off, and then
> turning it on.)
>
> Regards,
>
> Dave
>
>





Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-31 Thread David H. Gutteridge
On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> On Tue, 2019-10-29 at 10:38 +0000, Chavdar Ivanov wrote:
> > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > on real hardware with functional dri2. I get the same as with the
> > VirtualBox client - I have to disable compositing to get xfwm4
> > working. At the same time glmark2 returns the usual results, or
> > close to them.
> 
> What do you find if you disable compositing to get Xfce to start, and
> then enable it once xfwm4 is running successfully? I find that seems
> to work. So it fails some sort of initial probing, but then is able to
> activate the feature later, anyway. (As if there are two different
> code paths for this, or something is getting corrupted in memory during
> start up, but that isn't happening later on. I haven't had a chance to
> look at it in gdb again, yet.)

Sorry, that's the wrong example, the right example is:

- Move or delete the xfwm4.xml file from the .config path
- Start Xfce
- Go to Window Manager Tweaks->Compositor
- Note that the compositor is enabled, and related setting changes (e.g.
opacity of window decorations) successfully apply.

Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
previous example of starting with the compositor turned off, and then
turning it on.)
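
The first reproduction step above could be scripted roughly as follows.
This is only a sketch: the helper name hide_xfwm4_config is
hypothetical, the directory is the xfce-perchannel-xml path quoted
elsewhere in this thread, and the .HIDE suffix simply mirrors the file
name used in that diff.

```sh
#!/bin/sh
# Hypothetical helper: move xfwm4.xml aside so the next Xfce start
# regenerates it with default settings.
hide_xfwm4_config() {
    # Assumed default location; pass another directory as $1 to override.
    conf_dir=${1:-"$HOME/.config/xfce4/xfconf/xfce-perchannel-xml"}
    if [ -f "$conf_dir/xfwm4.xml" ]; then
        mv "$conf_dir/xfwm4.xml" "$conf_dir/xfwm4.xml.HIDE"
    fi
}

# Usage (left commented out here):
# hide_xfwm4_config && startxfce4
```

Moving the file rather than deleting it keeps the old settings around
for a later diff.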

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-29 Thread David H. Gutteridge
On Tue, 2019-10-29 at 10:38 +0000, Chavdar Ivanov wrote:
> I've tested xfce4 - a few days old build from -current pkgsrc - now on
> real hardware with functional dri2. I get the same as with the
> VirtualBox client - I have to disable compositing to get xfwm4
> working. At the same time glmark2 returns the usual results, or close
> to them.

What do you find if you disable compositing to get Xfce to start, and
then enable it once xfwm4 is running successfully? I find that seems to
work. So it fails some sort of initial probing, but then is able to
activate the feature later, anyway. (As if there are two different code
paths for this, or something is getting corrupted in memory during
start up, but that isn't happening later on. I haven't had a chance to
look at it in gdb again, yet.)

> The other thing is - firefox used to be able to run WebGL under
> -current a few months ago; now it reports that the system does not
> support it. With overnight built firefox 70.0 I now get a core every
> time I start it up, but then it works fine. The trace is again:
> ...
> (gdb) bt
> #0  0x7ced69a09a41 in pthread_mutex_lock () from
> /usr/lib/libpthread.so.1
> #1  0x7ced4881c42c in _mesa_error () from
> /usr/X11R7/lib/modules/dri/swrast_dri.so
> #2  0x7ced48836816 in _mesa_GetString () from
> /usr/X11R7/lib/modules/dri/swrast_dri.so
> #3  0x7ced581475d3 in ?? () from /usr/pkg/lib/firefox/libxul.so
> 
> 
> Now it seems only epiphany can run WebGL (but has some other problems,
> e.g. can't quit from the gui if WebGL was running).

Yes, I've found the same thing. I was confused when video streaming
suddenly stopped working for me, and then I saw Firefox was logging that
it had "exhausted GL driver options", for a site that worked a month or
so ago. (I imagine 9-BETA may still work, but I don't have anything
running it at present.)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-29 Thread Chavdar Ivanov
I've tested xfce4 - a few days old build from -current pkgsrc - now on
real hardware with functional dri2. I get the same as with the
VirtualBox client - I have to disable compositing to get xfwm4
working. At the same time glmark2 returns the usual results, or close
to them.

The other thing is - firefox used to be able to run WebGL under
-current a few months ago; now it reports that the system does not
support it. With overnight built firefox 70.0 I now get a core every
time I start it up, but then it works fine. The trace is again:
...
(gdb) bt
#0  0x7ced69a09a41 in pthread_mutex_lock () from /usr/lib/libpthread.so.1
#1  0x7ced4881c42c in _mesa_error () from /usr/X11R7/lib/modules/dri/swrast_dri.so
#2  0x7ced48836816 in _mesa_GetString () from /usr/X11R7/lib/modules/dri/swrast_dri.so
#3  0x7ced581475d3 in ?? () from /usr/pkg/lib/firefox/libxul.so


Now it seems only epiphany can run WebGL (but has some other problems,
e.g. can't quit from the gui if WebGL was running).

On Sun, 27 Oct 2019 at 23:54, Robert Swindells  wrote:
>
>
> "David H. Gutteridge"  wrote:
> >On Wed, 2019-10-16 at 12:10 +0100, Chavdar Ivanov wrote:
> > On Wed, 16 Oct 2019 at 11:03, David H. Gutteridge wrote:
> >
> > > FWIW, aside from Firefox (where I also see this issue), I've found
> > > since the recent Mesa upgrade, Xfce4's window manager consistently
> > > crashes during startup. There's a correlation with Firefox in the
> > > backtrace:
> > >
> > > #3  0x79f26fa5e256 in _mesa_GetString (name=7937) at
> > > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/getstring.c:124
> > > ctx = 0x79f288ae5898
> > > vendor = 0x79f271833414 "Brian Paul"
> > > renderer = 0x79f2718214ab "Mesa"
> > > #4  0x0041b1b5 in ?? ()
> > > No symbol table info available.
> > > #5  0x00442bb8 in ?? ()
> > > No symbol table info available.
> > > [...]
>
> If you still have the core dump for this, or can generate another, it
> could be helpful to examine the ctx variable in _mesa_GetString() using
> gdb.
>
> If it is like firefox then I think you will find that not all of the
> structure is in mapped memory.
>
> Can only think of two ways this could happen, either the ctx pointer
> itself is garbage or the size given to calloc() to allocate the context
> was too small. Have been trying to work out where this gets allocated
> but not found it yet.
>
> I have got some patches to firefox that let me display Google Maps but
> they basically just disable the use of OpenGL.
>





Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread Robert Swindells


"David H. Gutteridge"  wrote:
>On Wed, 2019-10-16 at 12:10 +0100, Chavdar Ivanov wrote:
> On Wed, 16 Oct 2019 at 11:03, David H. Gutteridge wrote:
> 
> > FWIW, aside from Firefox (where I also see this issue), I've found
> > since the recent Mesa upgrade, Xfce4's window manager consistently
> > crashes during startup. There's a correlation with Firefox in the
> > backtrace:
> >
> > #3  0x79f26fa5e256 in _mesa_GetString (name=7937) at
> > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/getstring.c:124
> > ctx = 0x79f288ae5898
> > vendor = 0x79f271833414 "Brian Paul"
> > renderer = 0x79f2718214ab "Mesa"
> > #4  0x0041b1b5 in ?? ()
> > No symbol table info available.
> > #5  0x00442bb8 in ?? ()
> > No symbol table info available.
> > [...]

If you still have the core dump for this, or can generate another, it
could be helpful to examine the ctx variable in _mesa_GetString() using
gdb.

If it is like firefox then I think you will find that not all of the
structure is in mapped memory.

Can only think of two ways this could happen, either the ctx pointer
itself is garbage or the size given to calloc() to allocate the context
was too small. Have been trying to work out where this gets allocated
but not found it yet.

I have got some patches to firefox that let me display Google Maps but
they basically just disable the use of OpenGL.



Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread maya
On Sun, Oct 27, 2019 at 02:47:43PM -0400, David H. Gutteridge wrote:
> On Sun, 2019-10-27 at 02:24 +0000, m...@netbsd.org wrote:
> > On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > > In my case it's also swrast_dri, VirtualBox host. I haven't recently
> > > tried xfce4 on real hardware with Intel, I might do that later.
> > 
> > I could finally reproduce a crash.
> > And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> > mixing two libGL implementations. That's a minefield.
> 
> Interesting. In my case, neither of my machines have MesaLib installed
> from pkgsrc, they're just using native X in this context. (Just curious
> what your graphics chip is? Nvidia?)
> 
> Regards,
> 
> Dave
> 
> 

I might have been too desperate to try and reproduce the issues and made
some unusual setups.

The scenario Chavdar Ivanov has should be reproducible with
LIBGL_ALWAYS_SOFTWARE=1.

I don't usually use xfce, so this was with a game.

My graphics chip is an nvidia GTX 770.
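
A small sketch of how the LIBGL_ALWAYS_SOFTWARE=1 suggestion above might
be applied to a single program without exporting the variable for the
whole session. The wrapper name with_software_gl is hypothetical, and
whether this actually reproduces the crash is exactly what remains to be
tested.

```sh
#!/bin/sh
# Hypothetical wrapper: run one command with Mesa forced onto its
# software rasterizer, leaving the calling shell's environment untouched.
with_software_gl() {
    env LIBGL_ALWAYS_SOFTWARE=1 "$@"
}

# Usage (left commented out here; both commands need a running X session):
# with_software_gl glxinfo | grep "OpenGL renderer"
# with_software_gl xfwm4 --replace
```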


Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread David H. Gutteridge
On Sun, 2019-10-27 at 14:14 +0000, Chavdar Ivanov wrote:
> I do not have MesaLib installed on this v/b guest at all.
> 
> I bisected xfwm4.xml to try to find out which setting was causing the
> problem. I didn't bother to read it first, as the result was obvious:
> ..
> ~ diff -u .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml
> --- .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE 2019-10-25 22:13:04.791908990 +0100
> +++ .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml  2019-10-27 14:09:13.334172740 +0000
> @@ -71,7 +71,7 @@
>  
>  
>  
> -
> +
>  
>  
>  
> 
> So the problem is that on first invocation xfce4 sets use_compositing
> to true, even if compositing is not available or not functional.

Well, there's more to it than that. This is happening on real hardware
with Intel graphics, where there should be no such issue. (That's why I
was referring to testing the different vblank settings in that config
file before. Those settings in turn make xfwm4 choose different back
ends for that aspect. Though that's just one piece.)

Regardless, it's not that simple with virtualized environments, either.
Or, at least, not mine. Having compositing enabled in that config file
worked without issue before, and now it doesn't. (I just re-tested on
an older QEMU VM snapshot of 8.99.50 from mid-July, with the current
state of Xfce in pkgsrc, and there's no xfwm4 startup crash.)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread David H. Gutteridge
On Sun, 2019-10-27 at 02:24 +0000, m...@netbsd.org wrote:
> On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > In my case it's also swrast_dri, VirtualBox host. I haven't recently
> > tried xfce4 on real hardware with Intel, I might do that later.
> 
> I could finally reproduce a crash.
> And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> mixing two libGL implementations. That's a minefield.

Interesting. In my case, neither of my machines have MesaLib installed
from pkgsrc, they're just using native X in this context. (Just curious
what your graphics chip is? Nvidia?)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread Robert Swindells


Chavdar Ivanov  wrote:
>On Sun, 27 Oct 2019 at 16:25, Robert Swindells  wrote:
>>
>> Chavdar Ivanov  wrote:
>> >I do not have MesaLib installed on this v/b guest at all.
>>
>> Are you running modular or native xorg ?
>
>Native.

Ok, so either you have MesaLib from xsrc installed or you have deleted
it yourself.





Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread Chavdar Ivanov
Native.

On Sun, 27 Oct 2019 at 16:25, Robert Swindells  wrote:
>
>
> Chavdar Ivanov  wrote:
> >I do not have MesaLib installed on this v/b guest at all.
>
> Are you running modular or native xorg ?






Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread Robert Swindells


Chavdar Ivanov  wrote:
>I do not have MesaLib installed on this v/b guest at all.

Are you running modular or native xorg ?


Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread Chavdar Ivanov
I do not have MesaLib installed on this v/b guest at all.

I bisected xfwm4.xml to try to find out which setting was causing the
problem. I didn't bother to read it first, as the result was obvious:
..
~ diff -u .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml
--- .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE 2019-10-25 22:13:04.791908990 +0100
+++ .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml  2019-10-27 14:09:13.334172740 +0000
@@ -71,7 +71,7 @@
 
 
 
-
+
 
 
 

So the problem is that on first invocation xfce4 sets use_compositing to
true, even if compositing is not available or not functional.


On Sun, 27 Oct 2019 at 02:24, m...@netbsd.org wrote:
>
> On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > In my case it's also swrast_dri, VirtualBox host. I haven't recently
> > tried xfce4 on real hardware with Intel, I might do that later.
>
> I could finally reproduce a crash.
> And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> mixing two libGL implementations. That's a minefield.






Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-26 Thread maya
On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> In my case it's also swrast_dri, VirtualBox host. I haven't recently
> tried xfce4 on real hardware with Intel, I might do that later.

I could finally reproduce a crash.
And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
mixing two libGL implementations. That's a minefield.


Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-26 Thread Chavdar Ivanov
In my case it's also swrast_dri, VirtualBox host. I haven't recently
tried xfce4 on real hardware with Intel, I might do that later.

On Sat, 26 Oct 2019 at 19:25, David H. Gutteridge  wrote:
>
> On Sat, 2019-10-26 at 00:40 +0000, m...@netbsd.org wrote:
> > Can someone who has this issue explain it briefly?
> >
> > - Which GPU?
> > - What part of updating (kernel, userland) did it?
> > - Does a clean build of everything fix it?
> >
> > The i915 driver has broken userland compatibility. mrg/riastradh
> > fixed it, but I won't be surprised if there's more we haven't spotted
> > with the high bar of "does startx work".
>
> I'm seeing it in two contexts:
>  - on a laptop with Intel graphics (presently kernel as of Oct. 15th,
>    userland as of Oct. 13th, pkgsrc has gone through many updates)
>  - in a QEMU VM that has no DRM capabilities (so in that case, xfwm4 is
>    falling back to swrast_dri.so) (kernel and userland as of Oct. 2nd)
>
> Both were working as of -current's state in mid-August, as I tested the
> xfwm4 update to 4.14.0 on them. Some time between then and early October
> (9.99.15 from a kernel perspective), this issue emerged, it seems.
>
> I've done a subsequent full update to a -current 9.99.17 plus userland
> from mid-October on the laptop, which hasn't made any difference.
> (The kernel and userland in the VM are from Releng builds.) I haven't
> yet tried a full replacement of every package for either of those
> machines, but I have rebuilt all of Xfce, plus dependencies like gtk3
> and such on the laptop, and that hasn't made a difference. (I
> specifically walked through the dependency chain for xfwm4.)
>
> My suspicion is this relates to the Mesa update, but I can't say for
> sure. But there seemed to be overlap with the Firefox issue that was
> being discussed.
>
> Regards,
>
> Dave
>
>





Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-26 Thread David H. Gutteridge
On Sat, 2019-10-26 at 00:40 +0000, m...@netbsd.org wrote:
> Can someone who has this issue explain it briefly?
> 
> - Which GPU?
> - What part of updating (kernel, userland) did it?
> - Does a clean build of everything fix it?
> 
> The i915 driver has broken userland compatibility. mrg/riastradh fixed
> it, but I won't be surprised if there's more we haven't spotted with
> the high bar of "does startx work".

I'm seeing it in two contexts:
 - on a laptop with Intel graphics (presently kernel as of Oct. 15th,
   userland as of Oct. 13th, pkgsrc has gone through many updates)
 - in a QEMU VM that has no DRM capabilities (so in that case, xfwm4 is
   falling back to swrast_dri.so) (kernel and userland as of Oct. 2nd)

Both were working as of -current's state in mid-August, as I tested the
xfwm4 update to 4.14.0 on them. Some time between then and early October
(9.99.15 from a kernel perspective), this issue emerged, it seems.

I've done a subsequent full update to a -current 9.99.17 plus userland
from mid-October on the laptop, which hasn't made any difference.
(The kernel and userland in the VM are from Releng builds.) I haven't
yet tried a full replacement of every package for either of those
machines, but I have rebuilt all of Xfce, plus dependencies like gtk3
and such on the laptop, and that hasn't made a difference. (I
specifically walked through the dependency chain for xfwm4.)

My suspicion is this relates to the Mesa update, but I can't say for
sure. But there seemed to be overlap with the Firefox issue that was
being discussed.

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-25 Thread maya
Can someone who has this issue explain it briefly?

- Which GPU?
- What part of updating (kernel, userland) did it?
- Does a clean build of everything fix it?

The i915 driver has broken userland compatibility. mrg/riastradh fixed it,
but I won't be surprised if there's more we haven't spotted with the
high bar of "does startx work".

https://github.com/NetBSD/src/commit/52ef9d9e2c837c205a00799c3d54c3ef4d65d68d