Re: FYI: new X server in -current, among other X things

2022-08-02 Thread David H. Gutteridge


On Thu, 28 Jul 2022 at 12:38:10 -0500 (CDT), John D. Baker wrote:
> I updated my -current install to shortly after the GCC 10.4 update,
> doing a non-update build on everything.  I then rebuilt all my extra
> packages.
> 
> Logging in via xdm using the failsafe mode, the in-tree 'ctwm' works
> fine. Using my preferred fvwm2 (wm/fvwm), however, hangs as soon as it
> tries to decorate any windows (my '.xsession' script starts 'urxvt'
> among other things before exec-ing the window manager).
> 
> The root window is mostly vacant, the undecorated icon region for
> 'xconsole' is the only thing visible and the mouse cursor is the
> "wristwatch" glyph.
> 
> Switching to a text console and running 'ps' or 'top' shows the
> 'fvwm2' process is "parked".  So far the only way to stop it is
> 'kill -9'.
> 
> I know pkgsrc-HEAD is preferred on -current, but I'm using an up-to-
> date pkgsrc-2022Q2.

I've tested fvwm on 9.99.99 builds from July 22nd and July 29th (fvwm
rebuilt both times from pkgsrc HEAD), as initiated via startx, and had
no issues. I also tried rxvt-unicode and didn't have any issue there,
either. Perhaps there's some combination of your full setup that's
tripping a regression. (I don't use xdm at all.)

I've tested the following WMs and DEs and haven't found any issues
(other than the one already known and addressed with Xfce):

blackbox
ctwm (from X base)
enlightenment16
fluxbox
fvwm
fvwm3
icewm14 (1.4.2)
jwm
lxde
lxqt
mate
openbox
sawfish
xfce4

Regards,

Dave



Re: FYI: new X server in -current, among other X things

2022-07-22 Thread David H. Gutteridge
On Mon, 2022-07-18 at 21:23 -0400, David H. Gutteridge wrote:
> On Mon, 2022-07-18 at 07:17 +1000, matthew green wrote:
> > > can you post the whole Xorg.0.log somewhere?  most of
> > > my i915 systems have become non-functional the last few
> > > years, but i have one system to test.
> > 
> > unfortunately, my system (kaby lake, GT 630) seems to work
> > fine with xorg-server 21.1.4 for me.
> 
> This seems isolated to the modesetting driver. If I use intel instead,
> then there are no issues. I've opened PR xsrc/56934 to track this.

With mrg@'s commit to fix this[1], I can now start an X session without
issue with the modesetting driver, and things generally work. I do see
what seems like a separate, much more minor regression where rendering
is incomplete (this doesn't occur with the intel driver, only
modesetting), e.g., the mouse pointer gets garbled when I move it
between windows. No idea where the trouble would be here.

Anyway, thanks to rjs@ and mrg@ for fixing this!

Regards,

Dave

1. http://mail-index.netbsd.org/source-changes/2022/07/21/msg139935.html


Re: FYI: new X server in -current, among other X things

2022-07-22 Thread David H. Gutteridge
On Sun, 2022-07-17 at 00:28 -0400, David H. Gutteridge wrote:
> Separately, libX11 added a feature called "thread safety constructor"
> which we have enabled. It can cause hangs with X11 clients that aren't
> coded safely. This did include xfce4-settings from Xfce until the
> version I pushed to pkgsrc a couple of days ago (4.16.3). I believe
> LXDE is also affected, but haven't had time to deal with it yet. Not
> sure about any other DEs or X clients. (I'm not able to test at the
> moment, of course.)

LXDE has a basically identical code block (copied from Xfce, even the
"xfce" function names are retained) to the one that was causing
deadlocks when Xfce starts, though it's used for other purposes. I was
not able to reproduce any hangs with that LXDE component, though. I'm
not aware of anything else adversely affected by the libX11 change, but
I guess we'll see.

Regards,

Dave



Re: FYI: new X server in -current, among other X things

2022-07-16 Thread David H. Gutteridge


On Fri, 15 Jul 2022 at 15:12:07 +1000, matthew green wrote:
> i've updated most of xsrc to their latest versions.
> fontconfig and Mesa are remaining.  i've tested the
> new code on amd64 and arm64, and built several ports
> to confirm they still build.  the biggest change is
> the new xorg-server.
> 
> there are probably a few build issues left to find
> across all ports, and perhaps some run-time ones too
> but basic testing looks fine for me.
> 
> please send-pr or email here if you find problems.

TL;DR: after upgrading via the sets available from releng builds from
July 16th (http://releng.netbsd.org/builds/HEAD/202207160630Z) I'm not
able to start X on amd64 with i915 graphics. Separately, there may be
issues with libX11 1.8.1 where clients will hang due to recursive locks
occurring.

I haven't had time to look into this in any detail, but after upgrading
kernel and userland to the July 16th sets (and running etcupgrade), I'm
now unable to start any window manager. I get the following:

[   378.027] (EE) 
[   378.027] (EE) Backtrace:
[   378.033] (EE) 0: /usr/X11R7/bin/X (xorg_backtrace+0x44) [0x1467d46d5]
[   378.033] (EE) 1: /usr/X11R7/bin/X (os_move_fd+0x79) [0x1467d0465]
[   378.033] (EE) 2: /usr/lib/libc.so.12 (__sigtramp_siginfo_2+0x0) [0x75b46379c930]
[   378.034] (EE) 
[   378.034] (EE) Segmentation fault at address 0x0
[   378.034] (EE) 
Fatal server error:
[   378.034] (EE) Caught signal 11 (Segmentation fault). Server aborting
[   378.034] (EE) 
[   378.034] (EE) 
Please consult the The X.Org Foundation support 
 at http://wiki.x.org
 for help. 
[   378.034] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   378.034] (EE) 
[   378.053] (EE) Server terminated with error (1). Closing log file.

This happens with ctwm as part of the base installation, as well as with
other pre-existing window managers and such from pkgsrc built against
9.99.97.
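
For anyone comparing notes, here is a minimal sketch for pulling only the
error-level lines out of the server log before posting it. The function
name xorg_errors is made up for illustration; the default path is simply
the one the server prints in its own message above.

```sh
#!/bin/sh
# xorg_errors: hypothetical helper (not part of X) that prints only
# the error-level "(EE)" lines from an Xorg log file.
xorg_errors() {
    # Default to the log path named in the server's own message.
    log="${1:-/var/log/Xorg.0.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # -F: treat "(EE)" as a fixed string, not a regular expression.
    grep -F '(EE)' "$log"
}
```

Calling xorg_errors with no argument reads the default log; a different
file can be passed as the first argument. Note that grep exits non-zero
when nothing matches, so the function does too.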

Separately, libX11 added a feature called "thread safety constructor"
which we have enabled. It can cause hangs with X11 clients that aren't
coded safely. This did include xfce4-settings from Xfce until the
version I pushed to pkgsrc a couple of days ago (4.16.3). I believe
LXDE is also affected, but haven't had time to deal with it yet. Not
sure about any other DEs or X clients. (I'm not able to test at the
moment, of course.)

Regards,

Dave



Re: Steinberg UR44 uaudio device

2022-06-11 Thread David H. Gutteridge
On Sat, 2022-06-11 at 19:31 +0200, Eivind Nicolay Evensen wrote:
> On Fri, 10 Jun 2022 21:25:48 -0400,
> "David H. Gutteridge" wrote:
> > 
> > To add to what others have said, a general rule as I've understood
> > it is that if a device offers > 24/96 for ADC/DAC, it will be
> > expecting USB Audio 2.0, at least for processing at those higher
> > rates. Some devices are switchable between 1.0 and 2.0; some actually
> > have a physical switch to select which USB Audio class should be
> > used. I've never seen a UR44, much less operated one, but at a glance
> > the manual refers to a switch on the rear labelled "CC Mode". This is
> > for "Class Compliant" mode, which implies it will work with generic
> > OS drivers. Have you tried flipping that switch to "on"? (The manual
> > is a bit light on specific technical details.)
> 
> Yes, this was all done with the "class compliant" switch set to "on".

Interesting, that's not what I would necessarily have expected. Anyway,
clearly we're missing some significant support for this device.

Dave



Re: Steinberg UR44 uaudio device

2022-06-10 Thread David H. Gutteridge


On Thu, 9 Jun 2022 at 17:43:02 +0200, Eivind Nicolay Evensen wrote:
> > On 6/8/22 16:50, Eivind Nicolay Evensen wrote:
> > > If I read that right, Martin's guess that this is a v2 device is
> > > right.  
> > 
> > Hi,
> > 
> > If you compile the kernel with "options USB_DEBUG", there will be a 
> > "sysctl hw.usb.uaudio.debug=16", which will print this in dmesg.
> 
> 
> So then, this confirms it, it seems:
[...]

To add to what others have said, a general rule as I've understood it
is that if a device offers > 24/96 for ADC/DAC, it will be expecting
USB Audio 2.0, at least for processing at those higher rates. Some
devices are switchable between 1.0 and 2.0; some actually have a
physical switch to select which USB Audio class should be used. I've
never seen a UR44, much less operated one, but at a glance the manual
refers to a switch on the rear labelled "CC Mode". This is for "Class
Compliant" mode, which implies it will work with generic OS drivers.
Have you tried flipping that switch to "on"? (The manual is a bit light
on specific technical details.)

Regards,

Dave



Re: CVS broken/down/changed?

2022-05-12 Thread David H. Gutteridge

On Wed, 11 May 2022 at 16:45:06 -0700, Greywolf wrote:

> Hi, all!
> 
> cvs up -AdP in my source tree yields the message
> 
> cvs [update aborted]: permission denied for src
> 
> What's up?


There was an issue with some data syncing, but it was fixed earlier
today, so it should work for you now.

Regards,

Dave


Re: Blank screen when build packages

2021-06-07 Thread David H. Gutteridge



On Mon, 7 Jun 2021 at 18:15:43 +0200, Roland Illig wrote:
> On 07.06.2021 at 16:12, Dmitrii Postolov wrote:
> > Hi! Sorry for my bad English...
> >
> > NetBSD 9.99.83 GENERIC Sun Jun 6 2021
> >
> > After install NetBSD 9.99.83 and download and unpack pkgsrc-current,
> > I try to build some apps from pkgsrc, for example 'editors/nano'.
> > After 'make install clean' the app begin to build but after some
> > time the screen is blank and build stop. There is no this problem in
> > NetBSD 9.2_STABLE and pkgsrc-current.
> >
> > How can I resolve this problem in NetBSD 9.99.x?
> >
> > Video: https://disk.yandex.ru/i/ruKLubdmfKYpHQ
> 
> Hi Dmitrii,
> 
> that video indeed looks interesting.  The most interesting detail is
> at 00:20.  There, the screen does not become black at once, and the
> lines do not scroll up with constant speed.
> 
> My guess is that some program changes the foreground and background
> colors of the terminal, so that the remaining build commands are still
> shown, but the colors are "black on black", which of course looks
> invisible.
> 
> To verify this assumption, please try this command:
> 
> make install clean 2>&1 | tr -d '\033'
> 
> This removes any ANSI escape sequences from the output.  Now all build
> commands should be visible.

This is probably PR 56223.
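
A side note on the quoted tr suggestion: tr -d '\033' deletes only the
ESC byte itself, so the remainder of each sequence (e.g. "[31m") still
shows up as literal text. That is enough for the test, since the terminal
no longer interprets the sequences. If cleaner output is wanted, a sketch
along these lines strips whole sequences instead. The name strip_ansi is
made up, and the pattern only covers the common "ESC [ parameters letter"
form, not every escape sequence a program could emit.

```sh
#!/bin/sh
# strip_ansi: hypothetical filter that removes CSI escape sequences
# (ESC, "[", parameter characters, one final letter) from stdin.
strip_ansi() {
    # Produce a literal ESC byte portably instead of relying on
    # sed understanding "\033" or "\x1b".
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;?]*[A-Za-z]//g"
}
```

Usage would be the same pipeline as in the quoted mail, e.g.
make install clean 2>&1 | strip_ansi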

Dave




Re: nvmm-induced panic on -current

2020-01-05 Thread David H. Gutteridge
On Sun, 5 Jan 2020, at 14:17:04 +, Chavdar Ivanov wrote:
> I just got:
> .
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] panic: fpudna from kernel, ip 0x80226c3f, trapframe 0xdc81fab42c80
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] cpu0: Begin traceback...
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] vpanic() at netbsd:vpanic+0x178
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] snprintf() at netbsd:snprintf
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] fpu_set_default_cw() at netbsd:fpu_set_default_cw
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] Xtrap07() at netbsd:Xtrap07+0xbd
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] nvmm_ioctl() at nvmm:nvmm_ioctl+0xec
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] sys_ioctl() at netbsd:sys_ioctl+0x59e
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] syscall() at netbsd:syscall+0x299
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] --- syscall (number 54) ---
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] 73188eb8199a:
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7533581] cpu0: End traceback...
> Jan  5 14:07:04 ymir /netbsd:
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7649742] dumping to dev 168,2 (offset=8, size=5225879):
> Jan  5 14:07:04 ymir /netbsd: [ 59319.7649742] dump failed: insufficient space (7221248 < 11753276)
> 
> 
> on
> 
> NetBSD 9.99.33 (GENERIC) #17: Sat Jan  4 18:16:34 GMT 2020
> sysbuild@ymir:/home/sysbuild/amd64/obj/home/sysbuild/src/sys/arch/amd64/compile/GENERIC
> 
> starting one of my qemu-nvmm virtual machines - Windows 10 32-bit -
> which used to work rather well, but I haven't started it for several
> weeks, so am not sure when the break has happened.

As another data point, I'm able to run two VM images (one DragonFly,
one Ubuntu) with nvmm on an Intel system with NetBSD 9.99.31 (GENERIC)
built Sun Dec 29 16:34:10 EST 2019, a userland from Dec. 14th, and the
qemu-nvmm package from pkgsrc-wip built Dec. 15th.

Thanks for mentioning this. I'm kind of addicted to nvmm and need it
for testing something right now, so I'll hold off updating -current
further at the moment.

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2020-01-03 Thread David H. Gutteridge
On Fri, 2019-11-01 at 01:49 -0400, David H. Gutteridge wrote:
> On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> > On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> > > I've tested xfce4 - a few days old build from -current pkgsrc -
> > > now on real hardware with functional dri2. I get the same as with
> > > the VirtualBox client - I have to disable compositing to get xfwm4
> > > working. At the same time glmark2 returns the usual or close to
> > > results.
> > 
> > What do you find if you disable compositing to get Xfce to start,
> > and then enable it once xfwm4 is running successfully? I find that
> > seems to work. So it fails some sort of initial probing, but then is
> > able to activate the feature later, anyway. (As if there are two
> > different code paths for this, or something is getting corrupted in
> > memory during start up, but that isn't happening later on. I haven't
> > had a chance to look at it in gdb again, yet.)
> 
> Sorry, that's the wrong example, the right example is:
> 
> - Move or delete the xfwm4.xml file from the .config path
> - Start Xfce
> - Go to Window Manager Tweaks->Compositor
> - Note that the compositor is enabled, and related setting changes
> (e.g. opacity of window decorations) successfully apply.
> 
> Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
> previous example of starting with the compositor turned off, and then
> turning it on.)

Some time between November and now, this seems to have been resolved in
-current (9.99.32 from earlier this week is what I tested) with both
Intel graphics and in a Qemu VM. Both now start for me without issue
with compositing enabled.

Dave




Re: Request for testing: i915 + suspend, or i915 heavy use

2019-12-21 Thread David H. Gutteridge
On Fri, 2019-12-13 at 23:28 -0500, David H. Gutteridge wrote:
> On Thu, 12 Dec 2019, at 19:42:37 +, co...@sdf.org wrote:
> > hi folks,
> > 
> > I applied an upstream security fix to i915. It's pretty big.
> > 
> > It touches the suspend codepath, and I can't test that on my
> > machine.
> > 
> > Additionally I am looking for confirmation that i915 is fine in the
> > last week. My testing wasn't very intensive.
> > 
> > Any -current later than Dec 6 2019.
> > 
> > Thanks.

Hi Maya,

I tested on two laptops with somewhat different vintage Intel graphics,
and didn't find any regressions on either during heavy use or suspend
and resume.

Thanks,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-31 Thread David H. Gutteridge
On Tue, 2019-10-29 at 19:52 -0400, David H. Gutteridge wrote:
> On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> > I've tested xfce4 - a few days old build from -current pkgsrc - now
> > on real hardware with functional dri2. I get the same as with the
> > VirtualBox client - I have to disable compositing to get xfwm4
> > working. At the same time glmark2 returns the usual or close to
> > results.
> 
> What do you find if you disable compositing to get Xfce to start, and
> then enable it once xfwm4 is running successfully? I find that seems
> to work. So it fails some sort of initial probing, but then is able to
> activate the feature later, anyway. (As if there are two different
> code paths for this, or something is getting corrupted in memory
> during start up, but that isn't happening later on. I haven't had a
> chance to look at it in gdb again, yet.)

Sorry, that's the wrong example, the right example is:

- Move or delete the xfwm4.xml file from the .config path
- Start Xfce
- Go to Window Manager Tweaks->Compositor
- Note that the compositor is enabled, and related setting changes (e.g.
opacity of window decorations) successfully apply.

Yet, on the next startup cycle, xfwm4 crashes. (And it crashes with my
previous example of starting with the compositor turned off, and then
turning it on.)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-29 Thread David H. Gutteridge
On Tue, 2019-10-29 at 10:38 +, Chavdar Ivanov wrote:
> I've tested xfce4 - a few days old build from -current pkgsrc - now on
> real hardware with functional dri2. I get the same as with the
> VirtualBox client - I have to disable compositing to get xfwm4
> working. At the same time glmark2 returns the usual or close to
> results.

What do you find if you disable compositing to get Xfce to start, and
then enable it once xfwm4 is running successfully? I find that seems to
work. So it fails some sort of initial probing, but then is able to
activate the feature later, anyway. (As if there are two different code
paths for this, or something is getting corrupted in memory during
start up, but that isn't happening later on. I haven't had a chance to
look at it in gdb again, yet.)

> The other thing is - firefox used to be able to run WebGL under
> -current a few months ago; now it reports that the system does not
> support it. With overnight built firefox 70.0 I now get a core every
> time I start it up, but then it works fine. The trace is again:
> ...
> (gdb) bt
> #0  0x7ced69a09a41 in pthread_mutex_lock () from /usr/lib/libpthread.so.1
> #1  0x7ced4881c42c in _mesa_error () from /usr/X11R7/lib/modules/dri/swrast_dri.so
> #2  0x7ced48836816 in _mesa_GetString () from /usr/X11R7/lib/modules/dri/swrast_dri.so
> #3  0x7ced581475d3 in ?? () from /usr/pkg/lib/firefox/libxul.so
> 
> 
> Now it seems only epiphany can run WebGL (but has some other problems,
> e.g. can't quit from the gui if WebGL was running).

Yes, I've found the same thing. I was confused when video streaming
suddenly stopped working for me, and then I saw Firefox was logging that
it had "exhausted GL driver options", for a site that worked a month or
so ago. (I imagine 9-BETA may still work, but I don't have anything
running it at present.)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread David H. Gutteridge
On Sun, 2019-10-27 at 14:14 +, Chavdar Ivanov wrote:
> I do not have MesaLib installed on this v/b guest at all.
> 
> I bisected xfwm4.xml to try to find out which setting was causing the
> problem. I didn't bother to read it first, as the result was obvious:
> ..
> ~ diff -u .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml
> --- .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml.HIDE 2019-10-25 22:13:04.791908990 +0100
> +++ .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml  2019-10-27 14:09:13.334172740 +
> @@ -71,7 +71,7 @@
>  
>  
>  
> -
> +
>  
>  
>  
> 
> So the problem is that on first invocation xfce4 sets use_composing to
> true, even if composing is not available or not functional.

Well, there's more to it than that. This is happening on real hardware
with Intel graphics, where there should be no such issue. (That's why I
was referring to testing the different vblank settings in that config
file before. Those settings in turn make xfwm4 choose different back
ends for that aspect. Though that's just one piece.)

Regardless, it's not that simple with virtualized environments, either.
Or, at least, not mine. Having compositing enabled in that config file
worked without issue before, and now it doesn't. (I just re-tested on
an older QEMU VM snapshot of 8.99.50 from mid-July, with the current
state of Xfce in pkgsrc, and there's no xfwm4 startup crash.)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-27 Thread David H. Gutteridge
On Sun, 2019-10-27 at 02:24 +, m...@netbsd.org wrote:
> On Sun, Oct 27, 2019 at 01:30:48AM +0100, Chavdar Ivanov wrote:
> > In my case it's also swrast_dri, VirtualBox host. I haven't recently
> > tried xfce4 on real hardware with intel, I might do that later.
> 
> I could finally reproduce a crash.
> And it went away when I pkg_delete'd MesaLib. I wonder if our issue is
> mixing two libGL implementations. That's a minefield.

Interesting. In my case, neither of my machines has MesaLib installed
from pkgsrc; they're just using native X in this context. (Just curious:
what's your graphics chip? Nvidia?)

Regards,

Dave




Re: xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-26 Thread David H. Gutteridge
On Sat, 2019-10-26 at 00:40 +, m...@netbsd.org wrote:
> Can someone who has this issue explain it shortly?
> 
> - Which GPU?
> - What part of updating (kernel, userland) did it?
> - Does a clean build of everything fix it?
> 
> the i915 driver has broken userland compatibility. mrg/riastradh fixed
> it,
> but I won't be surprised if there's more we haven't spotted with the
> high bar of "does startx work".

I'm seeing it in two contexts:
 - on a laptop with Intel graphics (presently kernel as of Oct. 15th,
   userland as of Oct. 13th, pkgsrc has gone through many updates)
 - in a QEMU VM that has no DRM capabilities (so in that case, xfwm4 is
   falling back to swrast_dri.so) (kernel and userland as of Oct. 2nd)

Both were working as of -current's state in mid-August, as I tested the
xfwm4 update to 4.14.0 on them. Some time between then and early October
(9.99.15 from a kernel perspective), this issue emerged, it seems.

I've done a subsequent full update to a -current 9.99.17 plus userland
from mid-October on the laptop, which hasn't made any difference.
(The kernel and userland in the VM are from Releng builds.) I haven't
yet tried a full replacement of every package for either of those
machines, but I have rebuilt all of Xfce, plus dependencies like gtk3
and such on the laptop, and that hasn't made a difference. (I
specifically walked through the dependency chain for xfwm4.)

My suspicion is this relates to the Mesa update, but I can't say for
sure. But there seemed to be overlap with the Firefox issue that was
being discussed.

Regards,

Dave




xfwm4 crashes on NetBSD 9.99.17 (was "Re: firefox dumping core after NetBSD upgrade")

2019-10-25 Thread David H. Gutteridge
On Wed, 2019-10-16 at 12:10 +0100, Chavdar Ivanov wrote:
> On Wed, 16 Oct 2019 at 11:03, David H. Gutteridge wrote:
> 
> > FWIW, aside from Firefox (where I also see this issue), I've found
> > since the recent Mesa upgrade, Xfce4's window manager consistently
> > crashes during startup. There's a correlation with Firefox in the
> > backtrace:
> > 
> > Core was generated by `xfwm4'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > (gdb) bt full
> > #0  debug_namespace_get (severity=MESA_DEBUG_SEVERITY_HIGH, id=1,
> > ns=0x79f288af02ef) at
> > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/debug_output.c:393
> > elem = 0x0
> > node = 0x0
> > state = 0
> > node = 
> > state = 
> > elem = 
> > #1  _mesa_debug_is_message_enabled (debug=0x79f288af77a0,
> > source=source@entry=MESA_DEBUG_SOURCE_API,
> > type=type@entry=MESA_DEBUG_TYPE_ERROR, id=1,
> > severity=severity@entry=MESA_DEBUG_SEVERITY_HIGH) at
> > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/debug_output.c:623
> > gstack = 0
> > grp = 0x79f288af02ef
> > nspace = 0x79f288af02ef
> > #2  0x79f26fa440b0 in _mesa_error (ctx=ctx@entry=0x79f288ae5898,
> > error=error@entry=1282,
> > fmtString=fmtString@entry=0x79f271815ee4 "Inside glBegin/glEnd")
> > at /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/errors.c:311
> > do_output = 
> > do_log = 
> > error_msg_id = 1
> > #3  0x79f26fa5e256 in _mesa_GetString (name=7937) at
> > /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/getstring.c:124
> > ctx = 0x79f288ae5898
> > vendor = 0x79f271833414 "Brian Paul"
> > renderer = 0x79f2718214ab "Mesa"
> > #4  0x0041b1b5 in ?? ()
> > No symbol table info available.
> > #5  0x00442bb8 in ?? ()
> > No symbol table info available.
> > [...]
> > 
> > (I haven't had any time to look into this further, so I haven't
> > enabled debugging symbols for xfwm4 itself.)
> > 
> > Regards,
> > 
> > Dave
>
> I also have xfwm4 crash, but only if there is .config/xfce4 directory.
> So far if I remove it, xfce4 works fine. Otherwise the trace appeared
> similar to the above.

I found that the file .config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml
specifically relates to this problem. If I remove it alone, xfwm4 starts.
What's curious is that after each startup, that file gets automatically
regenerated, but the regenerated version of the file causes xfwm4 to
crash on the next startup cycle.
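
A sketch of that workaround as a script, so it can be run from a text
console before each session start. The function name hide_xfwm4_config
and the ".HIDE" suffix are arbitrary choices for illustration; the only
thing taken as given is the config path mentioned above.

```sh
#!/bin/sh
# hide_xfwm4_config: hypothetical helper that moves xfwm4.xml aside
# so xfwm4 starts without the previously generated settings.
# Optional argument: the home directory to operate on (default: $HOME).
hide_xfwm4_config() {
    home="${1:-$HOME}"
    cfg="$home/.config/xfce4/xfconf/xfce-perchannel-xml/xfwm4.xml"
    if [ -f "$cfg" ]; then
        # Keep a copy rather than deleting, so versions can be diffed later.
        mv "$cfg" "$cfg.HIDE"
        echo "moved $cfg to $cfg.HIDE"
    else
        echo "no xfwm4.xml found, nothing to do"
    fi
}
```

Since the file is regenerated after each startup, this would have to be
repeated before every session, and each run overwrites the previous
.HIDE copy.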

I also tried various vblank setting options in that config file, but none
have made a difference. Obviously there's more to this, but I haven't
narrowed it down.

Dave




Re: firefox dumping core after NetBSD upgrade

2019-10-16 Thread David H. Gutteridge
On Tue, 15 Oct 2019, at 12:00:42 +0100, Robert Swindells wrote:
> I wrote:
> >From the stack trace that Paul Goyette provided it looks to me like
> >a Firefox bug is triggering one in Mesa.
> 
> I have now got a debug system and firefox build with debug-info, a
> firefox build with debug wouldn't display an URL.
> 
> I commented out the locking code to see what happened:
> 
> Index: errors.c
> ===
> RCS file: /cvsroot/xsrc/external/mit/MesaLib/dist/src/mesa/main/errors.c,v
> retrieving revision 1.1.1.4
> diff -u -r1.1.1.4 errors.c
> --- errors.c 24 Sep 2019 18:10:11 -  1.1.1.4
> +++ errors.c 15 Oct 2019 10:57:17 -
> @@ -306,6 +306,7 @@
>  
> do_output = should_output(ctx, error, fmtString);
>  
> +#if 0
> simple_mtx_lock(&ctx->DebugMutex);
> if (ctx->Debug) {
>do_log = _mesa_debug_is_message_enabled(ctx->Debug,
> @@ -318,6 +319,9 @@
>do_log = GL_FALSE;
> }
> simple_mtx_unlock(&ctx->DebugMutex);
> +#else
> +   do_log = GL_FALSE;
> +#endif
>  
> if (do_output || do_log) {
>char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH];
[...]

FWIW, aside from Firefox (where I also see this issue), I've found
since the recent Mesa upgrade, Xfce4's window manager consistently
crashes during startup. There's a correlation with Firefox in the
backtrace:

Core was generated by `xfwm4'.
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt full
#0  debug_namespace_get (severity=MESA_DEBUG_SEVERITY_HIGH, id=1, 
ns=0x79f288af02ef) at 
/usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/debug_output.c:393
elem = 0x0
node = 0x0
state = 0
node = 
state = 
elem = 
#1  _mesa_debug_is_message_enabled (debug=0x79f288af77a0, 
source=source@entry=MESA_DEBUG_SOURCE_API, 
type=type@entry=MESA_DEBUG_TYPE_ERROR, id=1, 
severity=severity@entry=MESA_DEBUG_SEVERITY_HIGH) at 
/usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/debug_output.c:623
gstack = 0
grp = 0x79f288af02ef
nspace = 0x79f288af02ef
#2  0x79f26fa440b0 in _mesa_error (ctx=ctx@entry=0x79f288ae5898,
error=error@entry=1282,
fmtString=fmtString@entry=0x79f271815ee4 "Inside glBegin/glEnd")
at /usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/errors.c:311
do_output = 
do_log = 
error_msg_id = 1
#3  0x79f26fa5e256 in _mesa_GetString (name=7937) at 
/usr/xsrc/external/mit/MesaLib/dist/src/mesa/main/getstring.c:124
ctx = 0x79f288ae5898
vendor = 0x79f271833414 "Brian Paul"
renderer = 0x79f2718214ab "Mesa"
#4  0x0041b1b5 in ?? ()
No symbol table info available.
#5  0x00442bb8 in ?? ()
No symbol table info available.
[...]

(I haven't had any time to look into this further, so I haven't enabled
debugging symbols for xfwm4 itself.)

Regards,

Dave




Re: cvs init

2019-02-11 Thread David H. Gutteridge
On Fri, 8 Feb 2019 at 11:12:22 +, Patrick Welche wrote:
> On Thu, Feb 07, 2019 at 01:46:00PM -0500, Greg Troxel wrote:
> > As for the man page omission, maybe see if the bug is in upstream
> > and file a bug with them ;-) ?
> 
> I can give it a go ;-) That is part of the point, "init" doesn't
> appear.
> 
> > We could change the code to just not allow init of an existing dir
> > at all.
> 
> ... and maybe 
> https://wiki.netbsd.org/tutorials/how_to_setup_a_cvs_server/
> 
> which appears to do "mkdir mycompany"

Aside from the most recent upstream documentation* not being as
complete as we might like, there are two other general issues that
probably almost everyone knows of, but could be a bit of a stumbling
block for newer developers. One is that there are undocumented local
alterations and enhancements in the version NetBSD ships (there's more
than one PR open about this). The other is that the cvs(1) man page and
related info file NetBSD ships are missing fundamental commands that
are documented upstream; e.g., "add" and "remove" were missing until
I fixed that the other day. (I happened to notice recently that "add"
was missing, which is why I started looking into this, but it wasn't at
the top of my to-do list.) And there's actually a PR open about "init"
being missing from NetBSD's distributed documentation: PR 45446. (And I
guess that wiki entry may need fixing, too.)

* The documentation shipped with CVS 1.11.23 is in some ways more
complete than what came with 1.12.13, because the 1.11 branch continued
to be maintained for a few years after the CVS project effectively
abandoned the 1.12 branch.

Also, part of why I'd referenced PR 45182 before is that there was a
broader issue raised in it: whether in general it's beneficial to
expect a "cvsadmin" group to exist locally to govern behaviour with
personal repositories. I imagine at this point the likelihood of
anything being reconsidered in this regard is low; it's more probable
NetBSD will move to a new VCS first.

Dave




Re: cvs init

2019-02-07 Thread David H. Gutteridge
On Fri, 2019-02-08 at 01:15 -0500, David H. Gutteridge wrote:
> On Thu, 07 Feb 2019, at 13:46:00 -0500, Greg Troxel wrote:
> > > $ cd /tmp
> > > $ mkdir foo
> > > $ cvs -d /tmp/foo init
> > > cvs [init aborted]: init to an existing repository is restricted to members of the group cvsadmin
> > > $ grep cvsadmin /etc/group 
> > > $
> > > 
> > > I thought that if the cvsadmin group didn't exist on the system,
> > > this restriction would be completely ignored? (according to
> > > "cvs admin" command - no mention of it being applicable at all to
> > > "cvs init")
> > 
> > I just did "cvs -d /tmp/foo init" without creating foo first, and it
> > worked fine (netbsd-8).
> > 
> > The error is about running init on an *existing* repository.
> > 
> > I don't see that rerunning init on a repo that exists is something
> > anybody really wants to do, and if they do why using rm first is a
> > real problem.
> 
> The CVS documentation for version 1.12.13 states:
> 
> "cvs init is careful to never overwrite any existing files in the
> repository, so no harm is done if you run cvs init on an already set-
> up repository."
> https://web.archive.org/web/20111020045251/http://ximbiot.com/cvs/manual/cvs-1.12.13/cvs_2.html#SEC2

I realized I may have been unclear: I wasn't advocating for that as a
normal practice, or denying the code treats this as an error. I meant
that it's kind of counterintuitive to put a statement like that in
documentation without a caveat. (Basically what everyone else is saying
too.)

> > As for the man page omission, maybe see if the bug is in upstream
> > and file a bug with them ;-) ?
> > 
> > We could change the code to just not allow init of an existing dir
> > at all.
> 
> There is also a related NetBSD PR filed back in 2011:
> http://gnats.netbsd.org/45182

That PR is no longer relevant; it was addressed by christos@ in 2011.

Dave




Re: cvs init

2019-02-07 Thread David H. Gutteridge
On Thu, 07 Feb 2019, at 13:46:00 -0500, Greg Troxel wrote:
> > $ cd /tmp
> > $ mkdir foo
> > $ cvs -d /tmp/foo init
> > cvs [init aborted]: init to an existing repository is restricted to
> > members of the group cvsadmin
> > $ grep cvsadmin /etc/group
> > $
> >
> > I thought that if the cvsadmin group didn't exist on the system,
> > this restriction would be completely ignored? (according to "cvs
> > admin" command - no mention of it being applicable at all to "cvs
> > init")
> 
> I just did "cvs -d /tmp/foo init" without creating foo first, and it
> worked fine (netbsd-8).
> 
> The error is about running init on an *existing* repository.
> 
> I don't see that rerunning init on a repo that exists is something
> anybody really wants to do, and if they do why using rm first is a
> real problem.

The CVS documentation for version 1.12.13 states:

"cvs init is careful to never overwrite any existing files in the
repository, so no harm is done if you run cvs init on an already set-up
repository."
https://web.archive.org/web/20111020045251/http://ximbiot.com/cvs/manual/cvs-1.12.13/cvs_2.html#SEC2

> As for the man page omission, maybe see if the bug is in upstream and
> file a bug with them ;-) ?
> 
> We could change the code to just not allow init of an existing dir at
> all.

There is also a related NetBSD PR filed back in 2011:
http://gnats.netbsd.org/45182

Regards,

Dave




Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-29 Thread David H. Gutteridge
On Sat, 2018-11-24 at 18:33 +0100, Riccardo Mottola wrote:
> On 11/21/18 6:57 AM, David H. Gutteridge wrote:
> > I have access to a Toshiba Satellite Pro that's a roughly similar
> > vintage to your T43; I'll see how it behaves when I have a chance.
> 
> That would be interesting. Try it as-is, possibly with 8.0 and HEAD. If 
> it works fine, then try to disable audio, video and see if it helps.
> 
> Of course I want to pinpoint which driver(s) cause me problems, so at 
> least I can open the correct bug (and "bug" some kind soul to fix it). 
> Right now the information is a little more than "it doesn't work for me 
> on several computers".

I have good news and bad news, and it's the same thing: I was hoping
the Toshiba laptop had more parity with one of your laptops in terms
of hardware, but unfortunately, it's a consumer model, an M40X with an
Intel Dothan CPU, integrated 915GM graphics, a RealTek Ethernet card,
Atheros WiFi, etc. It's similar to the X110: it has a bit of trouble
on 8.0_STABLE resuming with the i915 DRM driver, but that's addressed
in 8.99.26, where it resumes as reliably as my T420. (With the same
caveats about restoring hardware state that are being discussed
elsewhere in this thread: I haven't tried that patch on it yet.)

Dave




Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-29 Thread David H. Gutteridge
On Thu, 2018-11-29 at 15:15 +0900, Masanobu SAITOH wrote:
> On 2018/11/28 22:12, SAITOH Masanobu wrote:
> > On 2018/11/28 14:18, Masanobu SAITOH wrote:
> > > The diff says we should save/restore MSI table.
> > > We also should save/restore some other registers.
> > > 
> > >   Give me one or two days to resolve the problem.
> > 
> >   Please try the following diff:
> > 
> > http://www.netbsd.org/~msaitoh/pci-resume-20181118-0.dif
> > 
> > Even if I use this change with Thinkpad X220, it doesn't recover from
> > suspend...
> 
>   But, my X61 survived from suspend with this patch!

With that patch, networking devices now function reliably on my T420
after wakeup. That's a big improvement, thanks!

Dave




Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-27 Thread David H. Gutteridge
On Sat, 2018-11-24 at 22:47 +, David Brownlee wrote:
> On Sat, 24 Nov 2018 at 18:52, David H. Gutteridge wrote:
> > On Fri, 2018-11-23 at 21:42 +, David Brownlee wrote:
> > > netbsd-8 Single user:
> > > - Suspend (hw.acpi.sleep.state=3) and resume appears to work
> > > reliably
> > > many times in a row
> > > - Booting multi user after suspend/resume: wireless iwn0 does not
> > > appear to work "iwn0: could not load firmware .text section"
> > 
> > I see that too. I haven't looked into it yet, but wondered if it was
> > as simple as forcing it to reload its firmware after resumption.
> 
> Mmm, the man page indicates "iwn0: could not load firmware .text
> section" is reported when it attempted to
> load the firmware from disk into the device but failed, so it may be a
> little more than that :/

That error definitely can mean just that, but it's notable that it's
not a case of the firmware file being absent or unloadable: for me,
it loads successfully on boot and only gives that error on wakeup.

Dave




Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-27 Thread David H. Gutteridge
On Tue, 2018-11-27 at 18:08 +, David Brownlee wrote:
> On Sun, 25 Nov 2018 at 21:11, David Brownlee wrote:
> > I've bisected the changes against the github src copy, and it looks like 
> > the suspend/resume issue is related to the following commit:
> > 
> > commit 0fe469276f49bf0dc003300e0b8a35a80b7b246d (HEAD)
> > Author: jdolecek 
> > Date:   Mon Oct 22 20:57:07 2018 +
> > 
> > enable MSI support where available, blatantly copied from jmcneill's 
> > msk(4)
> > 
> > I tried building from HEAD with just that one commit reverted, and my T420s 
> > suspends and resumes again!
> > 
> > iwn0 is still non responsive after resume and wm0 will not pick up an IP 
> > via dhcpcd, but the disk responds :-p
> 
> So it turns out I'm as effective at off-by-one errors in git
> bisect as I am in coding... :/
> 
> It turns out the commit with the issue was:
> 
> commit 1628082c6b882d064bd5d77e5847c42b44b59fde (HEAD, refs/bisect/bad)
> Author: jdolecek 
> Date:   Mon Oct 22 21:04:53 2018 +
> 
> enable MSI support where available
> 
> M   sys/dev/pci/ahcisata_pci.c
> 
> Apologies...

No worries, I don't have an siisata device on that laptop, so I
figured it was ahcisata I needed to revert. I've done so, and tested,
and, yes, backing that change set out gets my laptop resuming without
disk errors again.

Thanks for the work you've put into isolating this and getting the PCI
config dumps!

Dave





Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-24 Thread David H. Gutteridge
On Fri, 2018-11-23 at 21:42 +, David Brownlee wrote:
> Another couple of data points in case it helps
> 
> Tested on Thinkpad T420s and T530 with NetBSD/amd64 - both have
> similar behaviour
> 
> 8.99.25 Single user:
> - Suspends and seems to resume but hangs on first disk access "wd0a:
> device timeout reading fsbn ..."

Yes, I get that too. pgoyette@ suggested I follow up with jdolecek@
about it, but I haven't had time yet to look for more details. There
are a number of PRs that jdolecek@ was working on fixing that
reference "clearing WDCTL_RST failed for drive" in the dmesg. In my
case, I get that error on both 8.0_STABLE and 8.99.26 (after his
latest changes), but it seems like it's a red herring or there's more
to it, because 8 still resumes reliably regardless of that warning,
while HEAD behaves as you've seen. I just keep getting continuous
output with "wd0a: device timeout writing fsbn X of X..."

> netbsd-8 Single user:
> - Suspend (hw.acpi.sleep.state=3) and resume appears to work reliably
> many times in a row
> - Booting multi user after suspend/resume: wireless iwn0 does not
> appear to work "iwn0: could not load firmware .text section"

I see that too. I haven't looked into it yet, but wondered if it was
as simple as forcing it to reload its firmware after resumption.

(Actually, my iwn didn't work at all, originally, because it requires
a different firmware file than any that are distributed by NetBSD at
present, and needed an addition in the driver to target that firmware.
I made those changes in my tree and have been testing with them on
both 8 and HEAD.)

> netbsd-8 Multi user no x11:
> - Suspends, keyboard *usually* non responsive on resume (but can
> switch virtual terminals)

I've never had this problem; I've found my T420 consistently responsive
whether I'm at a console or have suspended with X running (typically
with an Xfce4 session). When it comes back, no issues there (aside from
iwn).

Dave




Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-20 Thread David H. Gutteridge
On Tue, 2018-11-20 at 16:25 +0100, Riccardo Mottola wrote:
> Hi David,
> 
> David H. Gutteridge wrote:
> > FWIW, I'm able to get suspend and resume to work reliably on a
> > Lenovo
> > T420 with NetBSD-8.0_STABLE. (With 8.99.x, it doesn't work as
> > reliably
> > because the SATA driver seems to have issues after resumption which
> > don't occur with 8.0.) I didn't have to do anything of note to get
> > it
> > to work, it just does, assuming there's nothing extra attached.
> > (Read
> > on below.)
> 
> The T420 is a lot newer and is amd64 instead of x86.

True. What I can also say is that I tested the i386 port on it
(see kern/53658), and that worked as well. I've also tested i386 with an LG
X110 that I used to run NetBSD 5.x and then 7.x on. I could never get
it to suspend. With 8.0, it doesn't resume successfully, because of an
issue with the i915 DRM driver, but with the newer DRM code base that
was pulled into HEAD, I've found 8.99.25 did successfully resume, so
there may be hope.

I have access to a Toshiba Satellite Pro that's a roughly similar
vintage to your T43; I'll see how it behaves when I have a chance.

> > On the other hand, I cannot get it to work on a Lenovo x131e (the
> > AMD
> > CPU version, with Radeon graphics). With that machine, it resumes,
> > but
> > the display stays dark. (This behaviour is consistent with most
> > Linux
> > kernels I've tried as well, so there's something tricky about it.)
> 
> Yes, I have this issue on the T43 with ATI graphics; however, it often
> does work and come back (but takes several seconds). It is unreliable,
> but it is the only ThinkPad I have NetBSD on which sometimes
> suspends/resumes correctly.
> 
> Did you try disabling video in configure and then trying to 
> suspend/resume after a clean boot?

I haven't tried that yet, no. I'll add it to the list.

> Christos suggested to me disabling video and audio (things in my 
> experience cause issue too).
> I tried also fine-grained disabling based on dmesg devices.
> 
> However, I went down to bare-bones, leaving just internal hard disk
> and 
> keyboard - disabling everything which I could (but perhaps I missed 
> something) and yet it fails!
> 
> > One other thing to consider is whether you have anything extra
> > plugged
> > into the USB stack when you're trying to suspend. I've found having
> > pretty much anything plugged in (including a mouse) causes my T420
> > to
> > fail to completely suspend.
> 
> Indeed, I did all the tests with a just clean-booted machine with no
> USB mouse, keyboard, dongles or anything else.
> 
> It did not help.

Hopefully, we'll figure this out.

Dave




Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-19 Thread David H. Gutteridge
On Wed, 14 Nov 2018, at 13:10:55 +0100, Riccardo Mottola wrote:
>Hi all,
>
>I take the discussion started on a similar thread on netbsd-users over here, 
>since it is still a "current" issue and to debug it I am using netbsd-GENERIC 
>kernels from RelEng.
[...]
>
>the question is again.. suggestions on what to disable, if you have patch 
>suggestions, etc.

FWIW, I'm able to get suspend and resume to work reliably on a Lenovo
T420 with NetBSD-8.0_STABLE. (With 8.99.x, it doesn't work as reliably
because the SATA driver seems to have issues after resumption which
don't occur with 8.0.) I didn't have to do anything of note to get it
to work, it just does, assuming there's nothing extra attached. (Read
on below.)

On the other hand, I cannot get it to work on a Lenovo x131e (the AMD
CPU version, with Radeon graphics). With that machine, it resumes, but
the display stays dark. (This behaviour is consistent with most Linux
kernels I've tried as well, so there's something tricky about it.)

One other thing to consider is whether you have anything extra plugged
into the USB stack when you're trying to suspend. I've found having
pretty much anything plugged in (including a mouse) causes my T420 to
fail to completely suspend.

Regards,

Dave




Re: Networking issues with NetBSD-8 on Supermicro with X8DTU board?

2017-11-30 Thread David H. Gutteridge
On Thu, 30 Nov 2017, at 12:43:38 -0800, Brian Buhrow wrote:
>hello.  I'm trying to run the latest NetBSD-8 code on a Supermicro
>board, but I can't get the wm(4) network cards to work.  The dmesg is
>below.  NetBSD-5.2, using my production sources works just fine.  It
>looks like an interrupt routing issue to me, but I don't understand
>enough about how interrupts work in NetBSD-8 yet to know what to focus
>on to narrow the problem down.  Can someone look at the two dmesg
>outputs, one from NetBSD-8, the other from NetBSD-5.2 and make
>suggestions as to what to try to figure out what's going wrong?  The
>NetBSD-8 sources are CVS'd from 11/28/2017.

From a quick look at your dmesg, it seems this is related to the
following PR and discussion:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=52717

Discussion beginning with
http://mail-index.netbsd.org/current-users/2017/11/10/msg032599.html

Regards,

Dave



Recent -current builds provide files with modification dates of zero

2017-01-30 Thread David H. Gutteridge
Hi all,

I've noticed that all the recent builds of -current on nyftp.netbsd.org
provide files with a modification date of zero. E.g.:

A recent netbsd-7 build:

[disciple@arcusix ~]$ tar tvzf kern-GENERIC.tgz 
-rwxr-xr-x root/wheel 17808252 2017-01-29 21:40 ./netbsd

A recent -current build:

[disciple@arcusix ~]$ tar tvzf kern-GENERIC.tgz 
-rwxr-xr-x root/wheel 21141528 1969-12-31 19:00 ./netbsd
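
(The 1969-12-31 19:00 shown above is simply the Unix epoch, timestamp 0,
rendered in a UTC-5 timezone.) For anyone who wants to check a set
quickly, here is a minimal Python sketch that lists the members of a
tar archive whose stored modification time is zero. The helper names
and the in-memory sample archive are mine, purely for illustration;
only the standard tarfile module is assumed.

```python
import io
import tarfile


def zero_mtime_members(fileobj):
    """Return the names of archive members whose stored mtime is 0."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        return [m.name for m in archive.getmembers() if m.mtime == 0]


def make_sample(mtime):
    """Build a tiny gzip-compressed tar in memory (hypothetical test data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        payload = b"not a real kernel"
        info = tarfile.TarInfo(name="netbsd")
        info.size = len(payload)
        info.mtime = mtime  # seconds since the epoch, as stored in the header
        archive.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # One archive with a zeroed timestamp, one with an ordinary timestamp.
    print(zero_mtime_members(make_sample(0)))
    print(zero_mtime_members(make_sample(1485726000)))
```

Against a real set, one would pass open("kern-GENERIC.tgz", "rb")
instead of the sample buffer.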

Regards,

Dave



Re: HAL trouble

2015-06-06 Thread David H. Gutteridge
On Fri, 05 Jun 2015 at 10:55:11 +0100, Jaap Boender wrote:
> Hi all,
>
> Running KDE with HAL, I'm having some problems - at some point during
> my session, the mouse will stop responding and screen updates only
> seem to be done sporadically (I get the impression that sometimes I
> actually have to press a key for updates to happen...)
>
> I think HAL is involved because a) disabling it seems to be a
> workaround and b) at the point the problems occur, hal becomes
> unkillable.
>
> Given that without HAL, things like the battery monitor do not work
> (slightly annoying on a laptop), I'd really like to investigate this
> problem a bit more closely.
>
> Does anyone have hints, leads or advice?
>
> best,
>
>   Jaap
>
> P.S. I'm running this: NetBSD marion-dufresne.kerguelen.org 7.99.18
> NetBSD 7.99.18 (MARION-DUFRESNE) #3: Wed Jun  3 14:36:43 BST 2015
> jaapb@marion-dufresne.kerguelen.org:/usr/obj/sys/arch/amd64/compile/MARION-DUFRESNE
> amd64
>
> with radeon drm-enabled graphics.

Hello,

I'm seeing the very same problem on 7.0_BETA on an i386 machine with
Intel graphics, running (or should I say testing) MATE. I'm also sure
HAL is the problem, because disabling it eliminates any issues for me
too.

Unfortunately, I can't really tell you how to fix it; all I can tell
you is what I've noted so far. For me, when hald gets stuck in
tstile, not only can I not kill it (or indeed, reboot the machine
because of it), it seems to adversely affect other processes, as if
HAL's trying to say "I'm sorry, Dave. I'm afraid I can't do that."
For example, if I plug in a USB mouse at that point, the kernel
recognizes it, but it doesn't light up underneath indicating it's
active, and it doesn't work.

I've enabled HAL's logging feature in /etc/rc.d to output to
/var/log/messages to see if that helps. It doesn't. It doesn't log
anything before or after the freeze. Once it becomes unresponsive
and won't respond to kill -9, it doesn't log anything about that,
either. A kernel with DIAGNOSTIC, DEBUG, and LOCKDEBUG similarly
doesn't output anything useful. (After this point top isn't able
to show CPU states in its heading summary, either.) Of course, maybe
I'm doing something wrong here...

I can't see anything in Sergio Lenzi's recent patches that would have
any impact on the issue, either. My guess is there's some sort of
interaction issue with newer NetBSD releases. I never had this
problem on NetBSD 5 on the same machine, with HAL running as a Gnome
2 dependency.

Anyway, this is as far as I've looked into it. I haven't really had
the time of late.

Regards,

Dave



Re: HEADS UP: arm ports now building EABI by default

2014-08-07 Thread David H . Gutteridge
On Wed, 6 Aug 2014, Alan Barrett wrote:
> On Tue, 05 Aug 2014, Greg Troxel wrote:
> > 1) This is with passing -m but not -a.   I see in BUILDING that evbarm
> > is basically not allowed as -m without -a, and this is a change from
> > before.  Perhaps a note belongs in updating.
>
> I think BUILDING is out of date with respect to permitted -m/-a options.
> For the definitive list, search for valid_MACHINE_ARCH in build.sh.
>
> Both BUILDING and UPDATING should probably be updated for the recent
> changes.

UPDATING should get a note about this, given that users who simply
supplied -m evbarm will suddenly be building with a different ABI than
before, and they'll potentially get more than one surprise. (If they
haven't cleaned their objdir and stay with the default, they'll end
up with a mix of OABI and EABI objects.)

> Once build.sh is fixed, the default will be MACHINE_ARCH=evbearm-el.
>
> > 3) BUILDING doesn't address the eabi/oabi in the examples.  Maybe
> > that's ok, but I don't see it in http://wiki.netbsd.org/ports/evbarm/
>
> It would be nice if somebody documented all this.

I had a start at doing this in PR 48741 that I submitted back in
April. It's now out of date as far as evbarm options are concerned,
but also adds other details -- the documentation for these options
in general was way out of date.

Regards,

Dave



Re: Preparation for creating netbsd-7 branch

2014-07-23 Thread David H. Gutteridge
Hi all,

To give a user's perspective, I'd like to comment on a couple of
items in the thread so far.

Christos Zoulas wrote:
> Yes, I've been trying to follow that thread. Can you please summarize
> the problem and propose a solution? Is it a backwards compatibility
> issue? Or do we need to worry about binaries produced with sf on hf
> able machines in the future? Why would one do that? To be compatible
> with old machines?

Someone like myself could need to move from the old evbarm to
something else, e.g. on a Raspberry Pi, from evbarm to
evbearmv6hf-el, which means they're going from soft float to hard
float, and from OABI to EABI. The method to do so isn't documented
anywhere (to my knowledge). There are also users trying to run
current pkgsrc builds for evbarm on other variants.[1] (That could of
course disappear if pkgsrc builds changed to match the new defaults.)

I'd seen correspondence on port-arm about compatibility being offered
via COMPAT_NETBSD32, though it was unclear what the extent of that
is.[2] I think I'd misunderstood things, as I'd tried to help another
user and it wasn't working as I'd thought. I'd opened a PR[3] seeking
to improve the COMPAT_NETBSD32 man page, with the intent of
documenting just what it does for ARM, but I hadn't yet approached
any of the developers for help with the explanation.

Nick Hudson wrote:
> > Something (build.sh/wiki/both) can document each evbarm board to the
> > correct MACHINE_ARCH variant based could then be provided.
>
> build.sh already contains useful information here

From what I can see, traditionally the BUILDING document covers this
information. (Well, not matching boards to aliases, but listing the
various aliases.) However, it is rather out of date compared to
build.sh, and not just for ARM ports. (As I think everyone here is
already aware.) I'd submitted a PR[4] to try and bring the existing
extent of the documentation up to date (as well as provide a few
unrelated amendments).

To avoid the overhead of maintaining documentation separately, the
existing information in BUILDING could be removed and an option and
function could be added to the build.sh script that would output the
table, with users being directed there from BUILDING. (Or something
else, but the existing documentation is inadequate for a number of
ports.)

I understand that writing documentation can often lag development
efforts, but it would be unfortunate if users can't appreciate all
the work developers have put in. (Well, I realize no one's doing
this for fame, but...)

References:

1. http://mail-index.netbsd.org/current-users/2014/07/17/msg025274.html
2. http://mail-index.netbsd.org/port-arm/2014/04/07/msg002350.html
3. http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=48968
4. http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=48741

Regards,

Dave



Re: pkg_add packages for evbearmv6hf-el: Cannot execute ELF binary

2014-07-18 Thread David H. Gutteridge
On Wed, 16 Jul 2014, at 21:25:45 -0400, William D. Jones wrote:
> Hello all,
>
> I am not sure whether this is user error or a legitimate bug, so I
> will post here before filing a PR. The following thread may be
> related:
>
> http://mail-index.netbsd.org/current-users/2013/12/21/msg023935.html
>
> I am attempting to get my NTFS hard disk recognized under Raspberry
> Pi so that I may transfer files to and from the device as secondary
> storage. Currently, any attempt to mount the device using mount_ntfs
> returns "Operation not supported by device", even using the "ro"
> option. At least on FreeBSD as of August 2013, mount_ntfs is broken,
> so I figured installing ntfs-3g is the solution. Compiling ntfs-3g
> from scratch is not ideal on my Pi currently (see below), so I
> attempted to use pkg_add (with -f option) to install from:
>
> ftp://ftp.netbsd.org/pub/pkgsrc/packages/NetBSD/evbarm/6.1/filesystems/fuse-ntfs-3g-1.1120.tgz
>
> pkg_add works correctly and installs the package, but all attempts
> to run the program end with the following error message:
>
> rpi-ptrain# ntfs-3g
> -sh: Cannot execute ELF binary /usr/pkg/bin/ntfs-3g
>
> This also applies to other packages, such as python.
>
> Is the expected behavior due to ARM family mismatch, intentional
> changes in the kernel facilities between 6.1 and 6.99 that make the
> packages out of date, or is this a legitimate program loader bug
> that I may have found? Source that I have manually compiled and
> placed in /usr/local (including wget) works perfectly fine, and for
> the time being, I suppose the following git repository is an
> unofficial workaround:
> https://github.com/ebijun/NetBSD/tree/master/RPI/RPIimage/Image

If I'm understanding you correctly, you're running an evbearmv6hf-el
(ARM EABI, hard float) kernel and you're trying to use packages that
are evbarm (ARM OABI, soft float). Does your kernel have option
COMPAT_NETBSD32 enabled? If not, my understanding is you'd have an
ABI incompatibility problem.
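
One way to see which ABI a given binary was built for is to look at
the e_flags field in its ELF header. Below is a rough Python sketch
that does this for 32-bit ARM ELF files. The constants are the standard
ARM ELF flag values; the function and helper names are made up for
illustration, and I'm not claiming this is the exact check the NetBSD
loader performs.

```python
import struct

EM_ARM = 40                         # e_machine value for 32-bit ARM
EF_ARM_EABIMASK = 0xFF000000        # top byte of e_flags holds the EABI version
EF_ARM_ABI_FLOAT_SOFT = 0x00000200  # only meaningful for EABI version 5
EF_ARM_ABI_FLOAT_HARD = 0x00000400  # only meaningful for EABI version 5


def describe_arm_elf(header):
    """Decode EABI version and float ABI from the first 40 bytes of an ELF32 file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 1:
        raise ValueError("only 32-bit ELF is handled in this sketch")
    endian = "<" if header[5] == 1 else ">"
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    (e_flags,) = struct.unpack_from(endian + "I", header, 36)
    if e_machine != EM_ARM:
        raise ValueError("not an ARM binary")
    eabi = (e_flags & EF_ARM_EABIMASK) >> 24
    if eabi == 5 and e_flags & EF_ARM_ABI_FLOAT_HARD:
        float_abi = "hard"
    elif eabi == 5 and e_flags & EF_ARM_ABI_FLOAT_SOFT:
        float_abi = "soft"
    else:
        float_abi = "unknown"       # version 0 means a pre-EABI (OABI) binary
    return {"eabi_version": eabi, "float_abi": float_abi}


def fake_header(e_flags):
    """Build a synthetic little-endian ELF32 ARM header (hypothetical test data)."""
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    # e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags
    return ident + struct.pack("<HHIIIII", 2, EM_ARM, 1, 0, 0, 0, e_flags)


if __name__ == "__main__":
    print(describe_arm_elf(fake_header(0x05000400)))
    print(describe_arm_elf(fake_header(0x00000000)))
```

For a real file, something like
describe_arm_elf(open("/usr/pkg/bin/ntfs-3g", "rb").read(40)) would
apply; a result with eabi_version 0 would be consistent with an OABI
package.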

Dave



Re: i386 DRMKMS results, 28 Jun 2014

2014-06-29 Thread David H. Gutteridge
On Sun, 29 Jun 2014, 12:02:26 +0200, Stephan wrote:
> Hi all,
>
> is that DRMKMS stuff enabled in the daily builds or is it necessary
> to build a custom kernel?
>
> Regards,
>
> Stephan

You need to use the DRMKMS kernel config, which isn't among those
included in the daily builds, from what I see. The relevant options
aren't enabled in GENERIC. I built mine from source.

Dave



i386 DRMKMS results, 28 Jun 2014

2014-06-28 Thread David H. Gutteridge
Hi all,

Following an earlier report today of success with DRMKMS on amd64,
I've just tested on i386 and can confirm it works for me. I haven't
tried anything too demanding, but I'm able to boot, the console
works, I'm able to switch VTs, and have run various applications in
a Blackbox WM session without issue. dmesg excerpt is below. (The
graphics chipset is an Intel 945GM.)

Regards,

Dave

agp0 at pchb0: i915-family chipset
agp0: detected 7932k stolen memory
agp0: aperture at 0xc000, size 0x1000
i915drmkms0 at pci0 dev 2 function 0
: vendor 0x8086 product 0x27ae (rev. 0x03)
drmkms0 at i915drmkms0
drm: Memory usable by graphics device = 256M
drm: MTRR allocation failed.  Graphics performance may suffer.
drm: Supports vblank timestamp caching Rev 1 (10.10.2010).
drm: Driver supports precise vblank timestamp query.
i915drmkms0: unable to map ROM
drm: failed to find VBIOS tables
drm kern warning: composite sync not supported
drm: initialized overlay support
drmkms0: interrupting at ioapic0 pin 16 (i915)
drm kern warning: composite sync not supported
i915drmkms0: framebuffer at 0xd9ecd000, size 1024x600, depth 32, stride 4096
wsdisplay0 at i915drmkms0 kbdmux 1: console (default, vt100 emulation), using 
wskbd0
wsmux1: connecting to wsdisplay0
fixme: max PWM is zero
drmkms0: info: registered panic notifier



Re: HEADS UP: riastradh-drm2 branch merged

2014-04-06 Thread David H . Gutteridge
On 2014-03-23, at 6:06 PM, David H. Gutteridge wrote:
 On 2014-03-20, at 6:27 PM, David H. Gutteridge wrote:
 On Tue, 18 Mar 2014 at 19:17:01, Taylor R Campbell wrote:
 I merged the riastradh-drm2 branch to HEAD today.  This shouldn't
 cause any problems for anyone, because it touched very little outside
 sys/external/bsd/drm2 -- it's not hooked into any kernels other than
 the new amd64/DRMKMS one.  But let me know if you observe any fallout.
 
 Update to userland X.org should be coming soon, so that userlands can
 take advantage of the new DRM/KMS drivers.
 
 Hello,
 
 I doubt I'm telling you anything you don't already know, but I tried
 compiling a DRMKMS kernel for both amd64 and i386 to test, and
 neither compiled.
 
 With i386, I hit this first:
 
 In file included from 
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/include/drm/drmP.h:52:0,
from 
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_agpsupport.c:34:
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/pci.h: In 
 function 'pci_bus_alloc_resource':
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/pci.h:255:6:
  error: large integer implicitly truncated to unsigned type
 *** [drm_agpsupport.o] Error code 1
 
 I realize you only provided an amd64 kernel, the implication being
 i386 might not yet be supported, but I tried it anyway, as the machine
 I'd test with isn't capable of running 64-bit code.
 
 From looking at the code, it's clear you're already aware of the
 issue, given your XXX notation.
 
 error = bus_space_alloc(bst, start, 0xULL /* XXX */,
  size, align, 0, 0, &resource->start, &resource->r_bsh);
 
 
 If this is of interest to anyone, I opened a PR detailing some issues
 that prevent this code from being used on i386. (The PR is 48676.)
 
 Dave

Recent changes by riastradh@ have made i386 kernels with drm2 enabled
buildable, so I've now tested on my somewhat aged machine with an
i945GME chipset. Details follow for the curious.

With is_console=1 set in src/sys/external/bsd/drm2/i915drm/i915_pci.c
the kernel tries to probe/attach to the graphics chipset (I'm not
sure exactly how far it gets, the messages flash by too fast) and
then fails, causing an apparent kernel panic (or at least a freeze).
I'm not able to get any dmesg output saved from it, and the screen
simply goes black. The machine doesn't respond to network activity.

With is_console=0 set, the kernel boots successfully, but the console
is unusable. (A small, fixed white cursor appears in the top left of
the screen, and I cannot switch VTs to other text consoles.) The
machine does boot multi-user, though, and responds to network
activity, so I've gleaned the following dmesg details.

Regards,

Dave

: vendor 0x8086 product 0x27ac (rev. 0x03)
agp0 at pchb0: detected 7932k stolen memory
agp0: aperture at 0xc000, size 0x1000
i915drmkms0 at pci0 dev 2 function 0
: vendor 0x8086 product 0x27ae (rev. 0x03)
drmkms0 at i915drmkms0
drm: Memory usable by graphics device = 256M
drm: MTRR allocation failed.  Graphics performance may suffer.
drm: Supports vblank timestamp caching Rev 1 (10.10.2010).
drm: Driver supports precise vblank timestamp query.
i915drmkms0: unable to map ROM
drm: failed to find VBIOS tables
i915drmkms0: unable to map VGA registersdrm kern warning: composite sync not 
supported
drm: initialized overlay support
drmkms0: interrupting at ioapic0 pin 16 (i915)

snip

drm kern warning: composite sync not supported
render error detected, EIR: 0x0010
page table error
  PGTBL_ER: 0x0100
DRM error in i915_report_and_clear_eir: EIR stuck: 0x0010, masking
render error detected, EIR: 0x0010
page table error
  PGTBL_ER: 0x0100
i915drmkms0: framebuffer at 0xda82b000, size 1024x600, depth 32, stride 4096
wsdisplay1 at i915drmkms0 kbdmux 1
wsmux1: connecting to wsdisplay1
fixme: max PWM is zero
drmkms0: info: registered panic notifier




Re: RPI kernels in -current expected to work?

2014-04-06 Thread David H. Gutteridge
On Sun, 06 Apr 2014 at 18:19:19, Frank Kardel wrote:
> Great ! Thanks - works again.
>
> Frank
>
> On 04/06/14 14:43, Nick Hudson wrote:
> > On 04/06/14 13:01, Frank Kardel wrote:
> > > Hi,
> > >
> > > I see a long stream of
> > >
> > > fixup: pd 
> > > fixup: pde ... nothing to do
> > >
> > > lines scrolling (forever?) after the initial boot kernel messages.
> > >
> > > The boot process does not seem to make any reasonably observable
> > > progress at that point.
> > >
> > > This happens with self compiled kernels (as of 2014-04-06) and
> > > kernels fetched from nyftp for 20140403 and 20140406.
> > >
> > > The following older kernel works:
> > > NetBSD rpi 6.99.38 NetBSD 6.99.38 (RPI) #0: Sat Mar 29 06:14:39 UTC
> > > 2014
> > > builds%b44.netbsd.org@localhost:/home/builds/ab/HEAD/evbarm-earmhf/201403290440Z-obj/home/builds/ab/HEAD/src/sys/arch/evbarm/compile/RPI
> > > evbarm
> > >
> > > Best regards,
> > >   Frank
> >
> > cvs update :)
> >
> > Nick

I was seeing the same thing, and now have a working kernel again too,
though I've noticed userland tools that report kernel memory use seem
a bit confused after recent commits. (Not that that's a big deal to
me, I'm just mentioning it in case it's unexpected.)

ps(1) gives:

USER PID %CPU  %MEM VSZ     RSS TTY STAT STARTED    TIME COMMAND
root   0  0.0 -10.0   0 4148560 ?   DKl   8:28PM 0:04.66 [system]

top(1) gives:

  PID USERNAME PRI NICE   SIZE     RES STATE      TIME   WCPU    CPU COMMAND
    0 root      96    0     0K -47808K mmctaskq   0:27  0.00%  0.00% [system]

Regards,

Dave

Re: HEADS UP: riastradh-drm2 branch merged

2014-03-23 Thread David H . Gutteridge
On 2014-03-20, at 6:27 PM, David H. Gutteridge wrote:
 On Tue, 18 Mar 2014 at 19:17:01, Taylor R Campbell wrote:
 I merged the riastradh-drm2 branch to HEAD today.  This shouldn't
 cause any problems for anyone, because it touched very little outside
 sys/external/bsd/drm2 -- it's not hooked into any kernels other than
 the new amd64/DRMKMS one.  But let me know if you observe any fallout.
 
 Update to userland X.org should be coming soon, so that userlands can
 take advantage of the new DRM/KMS drivers.
 
 Hello,
 
 I doubt I'm telling you anything you don't already know, but I tried
 compiling a DRMKMS kernel for both amd64 and i386 to test, and
 neither compiled.

 With i386, I hit this first:
 
 In file included from 
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/include/drm/drmP.h:52:0,
 from 
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_agpsupport.c:34:
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/pci.h: In 
 function 'pci_bus_alloc_resource':
 /usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/pci.h:255:6:
  error: large integer implicitly truncated to unsigned type
 *** [drm_agpsupport.o] Error code 1
 
 I realize you only provided an amd64 kernel, the implication being
 i386 might not yet be supported, but I tried it anyway, as the machine
 I'd test with isn't capable of running 64-bit code.
 
 From looking at the code, it's clear you're already aware of the
 issue, given your XXX notation.
 
 error = bus_space_alloc(bst, start, 0xULL /* XXX */,
   size, align, 0, 0, &resource->start, &resource->r_bsh);
 

If this is of interest to anyone, I opened a PR detailing some issues
that prevent this code from being used on i386. (The PR is 48676.)

Dave



Re: HEADS UP: riastradh-drm2 branch merged

2014-03-21 Thread David H. Gutteridge
On 2014-03-20, at 6:27 PM, David H. Gutteridge wrote:
> On Tue, 18 Mar 2014 at 19:17:01, Taylor R Campbell wrote:
> > I merged the riastradh-drm2 branch to HEAD today.  This shouldn't
> > cause any problems for anyone, because it touched very little outside
> > sys/external/bsd/drm2 -- it's not hooked into any kernels other than
> > the new amd64/DRMKMS one.  But let me know if you observe any fallout.
> >
> > Update to userland X.org should be coming soon, so that userlands can
> > take advantage of the new DRM/KMS drivers.
>
> Hello,
>
> I doubt I'm telling you anything you don't already know, but I tried
> compiling a DRMKMS kernel for both amd64 and i386 to test, and
> neither compiled.
>
> With amd64, I hit this:
>
> /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c: In
> function 'do_cvt_mode':
> /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c:1399:13:
>  error: 'width' may be used uninitialized in this function
> [-Werror=maybe-uninitialized]
>  newmode = drm_cvt_mode(dev, width, height,
>  ^
> In file included from
> /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c:30:0:
> /usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c:1375:25:
>  note: 'width' was declared here
>    int uninitialized_var(width), height;
>  ^
> /usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/kernel.h:47:30:
>  note: in definition of macro 'uninitialized_var'
>  #define uninitialized_var(x) x
>   ^
> cc1: all warnings being treated as errors
> *** [drm_edid.o] Error code 1
>
> I assume this must simply be fallout from GCC being upgraded to 4.8.3.

After the commit referenced below, I'm able to build the amd64 kernel.

Regards,

Dave

Module Name:    src
Committed By:   riastradh
Date:           Fri Mar 21 02:25:05 UTC 2014

Modified Files:
        src/sys/external/bsd/drm2/include/linux: kernel.h

Log Message:
Make uninitialized_var kludge expand to `x = 0'.

Forgot to commit this the other day.


To generate a diff of this commit:
cvs rdiff -u -r1.2 -r1.3 src/sys/external/bsd/drm2/include/linux/kernel.h
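
For anyone unfamiliar with the kludge, the effect of that commit can be sketched roughly as below. This is an illustration only, not the code from kernel.h or drm_edid.c: the macro names UNINIT_VAR_OLD and UNINIT_VAR_NEW and the function pick_width() are hypothetical stand-ins, chosen so that both variants of the macro can sit side by side in one file.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the two definitions of the macro discussed
 * above. The real macro is called uninitialized_var; these names are
 * illustrative only.
 */
#define UNINIT_VAR_OLD(x) x        /* expands to a bare declarator */
#define UNINIT_VAR_NEW(x) x = 0    /* expands to an initialized declarator */

/*
 * Loosely mimics the shape of the flagged code: 'width' is assigned only
 * on some paths through the switch. With UNINIT_VAR_NEW the declaration
 * becomes "int width = 0, height;", so every path reads a defined value
 * and -Wmaybe-uninitialized has nothing to report.
 */
static int
pick_width(int code)
{
	int UNINIT_VAR_NEW(width), height;

	height = 480;
	switch (code) {
	case 0:
		width = height * 4 / 3;
		break;
	case 1:
		width = height * 16 / 9;
		break;
	/* No default case: width keeps its initializer. */
	}
	return width;
}
```

Swapping UNINIT_VAR_NEW for UNINIT_VAR_OLD in pick_width() should bring back the same class of diagnostic on compilers that perform this analysis, since the path with no matching case would then read an indeterminate value.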



Re: HEADS UP: riastradh-drm2 branch merged

2014-03-20 Thread David H. Gutteridge
On Tue, 18 Mar 2014 at 19:17:01, Taylor R Campbell wrote:
> I merged the riastradh-drm2 branch to HEAD today.  This shouldn't
> cause any problems for anyone, because it touched very little outside
> sys/external/bsd/drm2 -- it's not hooked into any kernels other than
> the new amd64/DRMKMS one.  But let me know if you observe any fallout.
>
> Update to userland X.org should be coming soon, so that userlands can
> take advantage of the new DRM/KMS drivers.

Hello,

I doubt I'm telling you anything you don't already know, but I tried
compiling a DRMKMS kernel for both amd64 and i386 to test, and
neither compiled.

With amd64, I hit this:

/usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c: In 
function 'do_cvt_mode':
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c:1399:13:
 error: 'width' may be used uninitialized in this function 
[-Werror=maybe-uninitialized]
 newmode = drm_cvt_mode(dev, width, height,
 ^
In file included from 
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c:30:0:
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_edid.c:1375:25:
 note: 'width' was declared here
   int uninitialized_var(width), height;
 ^
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/kernel.h:47:30:
 note: in definition of macro 'uninitialized_var'
 #define uninitialized_var(x) x
  ^
cc1: all warnings being treated as errors
*** [drm_edid.o] Error code 1

I assume this must simply be fallout from GCC being upgraded to 4.8.3.

With i386, I hit this first:

In file included from 
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/include/drm/drmP.h:52:0,
 from 
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/dist/drm/drm_agpsupport.c:34:
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/pci.h: In 
function 'pci_bus_alloc_resource':
/usr/builds/netbsd-current/src/sys/external/bsd/drm2/include/linux/pci.h:255:6: 
error: large integer implicitly truncated to unsigned type
*** [drm_agpsupport.o] Error code 1

I realize you only provided an amd64 kernel, the implication being
i386 might not yet be supported, but I tried it anyway, as the machine
I'd test with isn't capable of running 64-bit code.

From looking at the code, it's clear you're already aware of the
issue, given your XXX notation.

error = bus_space_alloc(bst, start, 0xULL /* XXX */,
    size, align, 0, 0, &resource->start, &resource->r_bsh);

I don't know if it's the preferred NetBSD way to handle this, but I'd
be inclined to add a macro that defines that literal value differently
depending on whether it's an amd64/i386 PAE build or a plain i386
build.
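
A minimal sketch of that idea follows, under stated assumptions. BUS_ADDR_64BIT is a hypothetical configuration knob standing in for "amd64 or i386 PAE", demo_bus_addr_t is a stand-in for the kernel's bus_addr_t, and DRM_BUS_ADDR_MAX is an invented name; none of these exist in the tree as far as I know. The second macro shows a variant that derives the limit from the type itself, which avoids the conditional altogether.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: BUS_ADDR_64BIT is a hypothetical knob that would be defined
 * for builds with 64-bit bus addresses. It is not a real NetBSD option.
 */
#ifdef BUS_ADDR_64BIT
typedef uint64_t demo_bus_addr_t;
#define DRM_BUS_ADDR_MAX UINT64_C(0xffffffffffffffff)
#else
typedef uint32_t demo_bus_addr_t;
#define DRM_BUS_ADDR_MAX UINT32_C(0xffffffff)
#endif

/*
 * Width-independent alternative: all bits set in whatever width the
 * address type has, so no literal can be truncated.
 */
#define DRM_BUS_ADDR_MAX_ALT ((demo_bus_addr_t)~(demo_bus_addr_t)0)

/* Returns the upper bound a caller would pass as the end of the range. */
static demo_bus_addr_t
alloc_range_end(void)
{
	return DRM_BUS_ADDR_MAX;
}
```

Either form keeps the constant within the range of the address type, which is what the truncation error is complaining about on plain i386.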

Regards,

Dave