re: FYI: new X server in -current, among other X things

2022-07-16 Thread matthew green
> TL;DR: after upgrading via the sets available from releng builds from
> July 16th (http://releng.netbsd.org/builds/HEAD/202207160630Z) I'm not
> able to start X on amd64 with i915 graphics. Separately, there may be
> issues with libX11 1.8.1 where clients will hang due to recursive locks
> occurring.

the libX11 thing is pretty terrible.  upstream says that
_not_ enabling it means other things are broken.  i don't
know anything better than fixing the clients i guess,
which is pretty terrible for backwards compat code/binaries.

> [   378.033] (EE) 0: /usr/X11R7/bin/X (xorg_backtrace+0x44) [0x1467d46d5]
> [   378.033] (EE) 1: /usr/X11R7/bin/X (os_move_fd+0x79) [0x1467d0465]
> [   378.033] (EE) 2: /usr/lib/libc.so.12 (__sigtramp_siginfo_2+0x0) [0x75b46379c930]
> [   378.034] (EE) 
> [   378.034] (EE) Segmentation fault at address 0x0
> 
> This happens with ctwm as part of the base installation, as well as with
> other pre-existing window managers and such from pkgsrc built against
> 9.99.97.

can you configure X to generate a core dump or run it
under GDB and get the real stack trace?  i thought we'd
fixed this problem in libexecinfo, but it's still not
tracing through the SEGV above, so finding what is
crashing where is what we need next.

does it happen when X starts up?  maybe it crashes with
plain running "X" without any arguments (ie, not using
some frontend that will also fire up clients etc.)

can you post the whole Xorg.0.log somewhere?  most of
my i915 systems have become non-functional the last few
years, but i have one system to test.


.mrg.


Re: FYI: new X server in -current, among other X things

2022-07-16 Thread David H. Gutteridge


On Fri, 15 Jul 2022 at 15:12:07 +1000, matthew green wrote:
> i've updated most of xsrc to their latest versions.
> fontconfig and Mesa are remaining.  i've tested the
> new code on amd64 and arm64, and built several ports
> to confirm they still build.  the biggest change is
> the new xorg-server.
> 
> there are probably a few build issues left to find
> across all ports, and perhaps some run-time ones too
> but basic testing looks fine for me.
> 
> please send-pr or email here if you find problems.

TL;DR: after upgrading via the sets available from releng builds from
July 16th (http://releng.netbsd.org/builds/HEAD/202207160630Z) I'm not
able to start X on amd64 with i915 graphics. Separately, there may be
issues with libX11 1.8.1 where clients will hang due to recursive locks
occurring.

I haven't had time to look into this in any detail, but after upgrading
kernel and userland to the July 16th sets (and running etcupgrade), I'm
now unable to start any window manager. I get the following:

[   378.027] (EE) 
[   378.027] (EE) Backtrace:
[   378.033] (EE) 0: /usr/X11R7/bin/X (xorg_backtrace+0x44) [0x1467d46d5]
[   378.033] (EE) 1: /usr/X11R7/bin/X (os_move_fd+0x79) [0x1467d0465]
[   378.033] (EE) 2: /usr/lib/libc.so.12 (__sigtramp_siginfo_2+0x0) [0x75b46379c930]
[   378.034] (EE) 
[   378.034] (EE) Segmentation fault at address 0x0
[   378.034] (EE) 
Fatal server error:
[   378.034] (EE) Caught signal 11 (Segmentation fault). Server aborting
[   378.034] (EE) 
[   378.034] (EE) 
Please consult the The X.Org Foundation support 
 at http://wiki.x.org
 for help. 
[   378.034] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   378.034] (EE) 
[   378.053] (EE) Server terminated with error (1). Closing log file.

This happens with ctwm as part of the base installation, as well as with
other pre-existing window managers and such from pkgsrc built against
9.99.97.

Separately, libX11 added a feature called "thread safety constructor"
which we have enabled. It can cause hangs with X11 clients that aren't
coded safely. This did include xfce4-settings from Xfce until the
version I pushed to pkgsrc a couple of days ago (4.16.3). I believe
LXDE is also affected, but haven't had time to deal with it yet. Not
sure about any other DEs or X clients. (I'm not able to test at the
moment, of course.)

Regards,

Dave



daily CVS update output

2022-07-16 Thread NetBSD source update


Updating src tree:
P src/distrib/sets/lists/xserver/md.ibmnws
P src/distrib/sets/lists/xserver/md.prep
P src/doc/CHANGES
P src/sbin/gpt/gpt.h
P src/share/man/man4/mfii.4
P src/sys/arch/x68k/dev/powsw.c
P src/sys/arch/x68k/x68k/machdep.c
P src/sys/arch/x86/x86/genfb_machdep.c
P src/sys/dev/ic/mfireg.h
P src/sys/dev/pci/mfii.c
P src/sys/dev/wscons/wsdisplay_vcons.c
P src/sys/kern/subr_pool.c
P src/tests/usr.bin/xlint/lint1/msg_135.c
P src/usr.bin/vmstat/vmstat.c
P src/usr.bin/xlint/lint1/debug.c
P src/usr.bin/xlint/lint1/err.c
P src/usr.bin/xlint/lint1/tree.c

Updating xsrc tree:


Killing core files:




Updating file list:
-rw-rw-r--  1 srcmastr  netbsd  40074773 Jul 17 03:04 ls-lRA.gz


Re: iscsi target on a zfs zvol?

2022-07-16 Thread Brad Spencer
Brian Buhrow  writes:

>   hello.  Yes, I was vaguely aware of the lack of extended attributes
> for NetBSD-Zfs, but what I was suggesting was just using a flat file,
> exported via iscsi through istgt or your initiator of choice, on top of
> zfs, rather than a zvol, because you'll find the read/write speed to be
> so much faster.  Unfortunately, it seems the upstream zfs maintainers
> have decided that zvols are not worth the time to optimize, so while
> they're functional, they're not performant under any openzfs-using
> implementation.  This makes me sad because zvols are such a tidy way to
> manage so many different kinds of things.
>
> -thanks
> -Brian


I freely admit that I don't use zvols very much in NetBSD, but did you
mess with the volblocksize any on the volume??


-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


Re: iscsi target on a zfs zvol?

2022-07-16 Thread Brian Buhrow
hello.  Yes, I was vaguely aware of the lack of extended attributes
for NetBSD-Zfs, but what I was suggesting was just using a flat file,
exported via iscsi through istgt or your initiator of choice, on top of
zfs, rather than a zvol, because you'll find the read/write speed to be
so much faster.  Unfortunately, it seems the upstream zfs maintainers
have decided that zvols are not worth the time to optimize, so while
they're functional, they're not performant under any openzfs-using
implementation.  This makes me sad because zvols are such a tidy way to
manage so many different kinds of things.

-thanks
-Brian



Re: iscsi target on a zfs zvol?

2022-07-16 Thread Hauke Fath
At 9:15 -0700 on 13.07.2022, Brian Buhrow wrote:
> [...] you'll get much better read-write performance if you create a standard
> zfs filesystem for your time machine backup, then create a regular file in
> it which you export via iscsi.

To wrap up the issue, I don't even care much about which side is at fault,
initiator or target. It's just the experience was nothing I would like to
deal with daily.

For this home setup, while it would have been nice to use the server's
raid1, I found 2 TB of spinning rust for 50 EUR which does the job nicely.

In a similar setting at work, three latest-model iMacs are happily running
their Time Machine against Samba shares on a FreeBSD server. Since NetBSD's
zfs does not support extended attributes, that wasn't an option here.

Cheerio,
Hauke


--
"It's never straight up and down" (DEVO)




Re: FYI: new X server in -current, among other X things

2022-07-16 Thread Michael
Hello,

On Fri, 15 Jul 2022 15:12:07 +1000
matthew green  wrote:

> i've updated most of xsrc to their latest versions.
> fontconfig and Mesa are remaining.  i've tested the
> new code on amd64 and arm64, and built several ports
> to confirm they still build.  the biggest change is
> the new xorg-server.
> 
> there are probably a few build issues left to find
> across all ports, and perhaps some run-time ones too
> but basic testing looks fine for me.

Alright, I'll check all my weirdo drivers!

have fun
Michael


Re: Weird clock behaviour with current (amd64) kernel

2022-07-16 Thread Robert Elz
Date: Sat, 16 Jul 2022 00:20:41 +0000 (UTC)
From: RVP
Message-ID:  

  | On Fri, 15 Jul 2022, Robert Elz wrote:
  | > If that is all it is, it is barely worth fixing ... though this
  | > must have happened sometime in the 9.99.9[78] series (sometime
  | > after early last Dec).

  | Farther back than that I think: 9.2_STABLE does the same thing.

Just as likely the relevant change has been pulled up to -9 to get
that result, I haven't booted a -9 with a local kernel recently to
know (but doing that, for other reasons, is possibly going to happen).

But until the graphics update last Dec, I was running HEAD on my
laptop, and updating its kernel frequently, and its cyan console was
100% cyan (no yellow/brown/...) anywhere in sight.   I stopped after
those changes, as my (quite old by then) X server dumped core with the
new drivers, and that wasn't useful...   I believe that issue was fixed,
but before I got around to testing it, the laptop decided to suffer heat
exhaustion (or similar) and really doesn't want to run for more than a
couple of minutes (and right now, is lacking a boot device anyway, I
pulled it).   So I am fairly sure that up until then, all was good
(not using EFI booting though, which now seems to be a difference, it
used EFI partitioning, but using biosboot, with EFI booting, NetBSD seemed
to not find half the hardware ... it does have a fairly early EFI
prom implementation though).

But it appears as if mlelstv@ might have fixed the colour issue now anyway.
(Thanks Michael).   Not that it really was much of an issue.  I will
verify later (maybe tomorrow sometime) - a system with that change
is built already.

kre



readlink(1) realpath(1) and POSIX

2022-07-16 Thread Robert Elz
POSIX is planning to add readlink(1) in the next version.   Nothing
special to say about that (makes no real difference to us, we have it
already, they will specify only the common options.)

But while doing that, they looked at the -f option, and saw that the
coreutils man page says to use realpath(1) instead of readlink -f.

(They never even got as far as detecting that our readlink -f and the
coreutils readlink -f don't act the same).

So, it was asked whether other systems have realpath(1) - we do, kamil@
added it back in Feb 2020, with the comment:

   Port realpath(1) from FreeBSD

   realpath(1) wraps realpath(3) and returns resolved physical path.

   This utility shipped with GNU and FreeBSD is sometimes
   used in scripts in the wild.

It is currently in HEAD only - it will be in 10 when that gets released.

So, POSIX has more or less decided to skip the -f option of readlink,
and require realpath(1) instead (realpath(3) has been around in POSIX for ages,
but is an XSI option ... realpath(1) won't be, just mandatory (probably)).

However, FreeBSD's realpath(1) (now also ours) and the coreutils realpath(1)
are substantially different beasts - the FreeBSD version is (as kamil said)
just a wrapper around realpath(3) and is quite simple.

coreutils realpath is a monstrous mess.   Fortunately, POSIX aren't
proposing standardising almost any of that, just the basic functionality
which replaces readlink -f.

Unfortunately, for POSIX (and us) basic realpath (as in "realpath file")
has the same basic operational difference as readlink -f has between the
BSD & GNU implementations.   Ours is literally: "call realpath(3), if it
returns something, print that, otherwise it is an error".   Theirs allows
the final component in the expanded and canonicalized path to not exist.
(Their doc does not say what "not exist" really means in the hard cases,
but from testing their implementation, it is clear that if namei() returns
ENOENT for the final component, that is an allowed case, any other error
return is not).
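
To make the BSD behaviour concrete, here is a minimal sketch of "call
realpath(3), print the result, otherwise it is an error".  The function
name strict_realpath is made up for illustration; this is not the actual
realpath(1) source.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * strict_realpath: hypothetical helper name, not taken from the NetBSD
 * source.  Writes the canonical path to out and returns 0 if realpath(3)
 * succeeds; otherwise reports the error on stderr and returns 1.
 */
int
strict_realpath(const char *path, FILE *out)
{
	char resolved[PATH_MAX];

	if (realpath(path, resolved) == NULL) {
		perror(path);	/* any failure, including ENOENT, is an error */
		return 1;
	}
	fprintf(out, "%s\n", resolved);
	return 0;
}
```

Calling strict_realpath("/", stdout) should print "/" and return 0, while
a path with a missing component returns 1.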

The people who use this demand that the functionality remain.   I'm still
unclear on why: if the file is not to be created, who cares what its
canonical path would be; if it is, creating it first using the known name
and canonicalizing later should work, I would have thought.   But they
don't agree - they say that if we want to know if it exists, we can
canonicalise first, then test -e.   For a long time I wasn't sure how that
was a rational counter argument, and I'm still not.

For a while I thought we could just do (in C, not exactly this) if
realpath($FILE) fails:
    echo "$(realpath "$(dirname "$FILE")")/$(basename "$FILE")"
(with appropriate tests for when $FILE has no '/' etc), but that doesn't
work - it is not just the last component of the $FILE arg which is allowed not
to exist (though that case is part of it) but where that component exists,
and is a symlink, and the last component of that doesn't exist, or exists
and is another symlink for which ... this can go on (almost) forever.
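
For reference, the naive fallback described above might look roughly like
this in C.  It is only a sketch of the idea that does not work in general:
the name naive_realpath and its return convention are invented here, and
it is not the code that went into realpath(1).  When the path is a
dangling symlink it returns the directory plus the link name rather than
following the link, which is exactly the case that defeats it.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * naive_realpath: hypothetical helper, for illustration only.
 * Canonicalizes path into out.  If realpath(3) fails with ENOENT, it
 * canonicalizes just the directory part and appends the last component.
 * Returns 0 on success, -1 on failure.
 */
int
naive_realpath(const char *path, char *out, size_t outlen)
{
	char resolved[PATH_MAX], dbuf[PATH_MAX], bbuf[PATH_MAX];
	const char *dir, *base;
	int n;

	if (realpath(path, resolved) != NULL) {
		n = snprintf(out, outlen, "%s", resolved);
		return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
	}
	if (errno != ENOENT)
		return -1;

	/* dirname(3) and basename(3) may modify their argument: use copies. */
	if (strlen(path) >= sizeof(dbuf))
		return -1;
	strcpy(dbuf, path);
	strcpy(bbuf, path);
	dir = dirname(dbuf);
	base = basename(bbuf);

	/* The directory itself must exist and resolve. */
	if (realpath(dir, resolved) == NULL)
		return -1;

	/* Avoid a doubled slash when the directory is the root. */
	n = snprintf(out, outlen, "%s/%s",
	    strcmp(resolved, "/") == 0 ? "" : resolved, base);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A missing file directly under an existing directory resolves, but a
missing intermediate directory still fails, and no symlink in the final
component is ever followed.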

The current POSIX proposal is to specify "realpath -e" (which is a coreutils
arg which makes theirs act just like ours) and also invent a new -E
arg, which would make ours work like theirs.   It would be unspecified
which was the default - ie: all scripts would need to use one of those
options to be portable.   The allowed result when neither option is given is
made even more bizarre to cater for a built in realpath in mksh, which
is even wackier in its default (and only) behaviour (inexplicable in some
cases) than the coreutils version - but the mksh one takes exactly 1 arg,
the path name, and simply execs realpath from the filesystem if anything
different is passed to it, so "realpath -[Ee] file" will bypass that
implementation and run a real one instead.

I have added -E support to our realpath(1) (that is, to the .c, haven't
gotten around to the man page yet) and of course -e (which is more or
less a no-op).   For now, I have made the default be -E if neither option
is given, which returns the same result as we currently get in cases
we do not currently produce an error, and makes our implementation more
compatible with (the small part that is sane of) the coreutils implementation.

I am not proposing adding any of their myriad other useless options, with
the sole possible exception of -z (which causes their realpath to use \0
rather than \n between output paths, and makes it a little safer in the
possible presence of paths containing newline chars when more than one
path arg is given ... the POSIX version (currently) will only specify
realpath working with a required single file arg .. our version (the FreeBSD
version), defaults to "." if no file is given, coreutils don't do that,
and both versions process as many file args as are given).

The source file size about doubles with these changes, which means about
3 times as much actual code (since about half of the current source is the
boilerplate noise).

Any 

Re: Weird clock behaviour with current (amd64) kernel

2022-07-16 Thread Robert Elz
Date: Sat, 16 Jul 2022 06:09:26 -0000 (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | For the green color it doesn't matter if the order is BGR or RGB.
  | For cyan, the wrong order gives "brown" which is a dark yellow.

I tossed up what to call the colour - certainly not bright yellow,
I considered orange (but that would bring to mind products from
Valencia, and that's not it) or polished copper (too artsy for me)
so I just decided to stick with the rgbcym (plus bw) palette,
and yellow was closest...   brown would normally bring to mind
something darker, but otherwise it works too.

This is not important, I mentioned it only on the off chance that
it might be a useful clue to someone looking for more serious
problems.   It obviously isn't, so I think we can forget it now.

The clock issue is more interesting however, something is clearly
not being handled quite right there.

kre


Re: Weird clock behaviour with current (amd64) kernel

2022-07-16 Thread Michael van Elst
r...@sdf.org (RVP) writes:

>Unsurprisingly, EFI also has a colour-index similar to VGA (see:
>/usr/src/sys/external/bsd/gnu-efi/dist/inc/eficon.h). I tried fixing the
>indexes like this, but it doesn't work for some (autoconfig?) reason. Can
>only look into this after I come back from my road-trip.

That color index is used by text mode, but booting from EFI uses
a graphics framebuffer (nowadays mostly 24bit or 32bit per pixel).


But all this color shift is unrelated to the color indexes; it comes
from how the framebuffer pixels are organized.


The early console code has no information about byte order in
the framebuffer. rasops then initializes e.g. for 32bit pixels:

    if (ri->ri_rnum == 0) {
        ri->ri_rnum = ri->ri_gnum = ri->ri_bnum = 8;

        ri->ri_rpos = 0;
        ri->ri_gpos = 8;
        ri->ri_bpos = 16;
    }

which is 0x00BBGGRR.

When genfb actually attaches it carries information about
the byte ordering and rasops gets initialized with the
right values.

For the green color it doesn't matter if the order is BGR or RGB.
For cyan, the wrong order gives "brown" which is a dark yellow.
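
The effect can be reproduced with a small sketch.  The struct and
function names below are invented for illustration (they only mirror the
ri_rpos/ri_gpos/ri_bpos fields), and the 0xaa intensity for a dim cyan is
an assumed palette value, not something taken from rasops.

```c
#include <stdint.h>

/* Hypothetical layout description mirroring ri_rpos/ri_gpos/ri_bpos. */
struct pixel_layout {
	unsigned rpos, gpos, bpos;	/* bit offset of each 8-bit channel */
};

/* Pack 8-bit channels into a 32-bit pixel according to the layout. */
uint32_t
pack_pixel(const struct pixel_layout *l, uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)r << l->rpos) | ((uint32_t)g << l->gpos) |
	    ((uint32_t)b << l->bpos);
}

/* Recover the channels as a display using layout l would interpret px. */
void
unpack_pixel(const struct pixel_layout *l, uint32_t px,
    uint8_t *r, uint8_t *g, uint8_t *b)
{
	*r = (px >> l->rpos) & 0xff;
	*g = (px >> l->gpos) & 0xff;
	*b = (px >> l->bpos) & 0xff;
}
```

With the early-console layout {0, 8, 16} (0x00BBGGRR) and an assumed
hardware layout {16, 8, 0} (0x00RRGGBB), green (0, 0xaa, 0) packs to the
same value either way.  Cyan (0, 0xaa, 0xaa) packed with the early layout
is 0x00aaaa00, which the hardware layout reads back as (0xaa, 0xaa, 0),
i.e. a dark yellow.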