Re: NetBSD 9.3 to 10.0 upgrade failure - check for DOS fs

2024-04-09 Thread Robert Elz
Date:Tue, 9 Apr 2024 22:28:46 +0200
From:Riccardo Mottola 
Message-ID:  <9f0bd479-42ef-7842-90fb-0d6a503cf...@libero.it>

  | no "e" of course... and no MS-DOS in sight. It was already a fully 
  | BSD-ized system.

What does fdisk show?   (ie: the MBR label).

kre



Re: Is use of 'binary' mode necessary to open files on NetBSD?

2023-12-01 Thread Robert Elz
Date:Sat, 2 Dec 2023 09:18:56 +0530
From:Mayuresh 
Message-ID:  

  | On NetBSD, the fopen man page clearly says 'b' is ignored. So wonder if
  | gcc layer introduces the need to use it in above usage pattern.

It is in stdio (see  src/lib/libc/stdio/flags.c) - that is in
fopen() and its siblings.   What some higher level library might
do is however entirely up to it.
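
A quick way to see the effect from user code is sketched below.  This is
only an illustration, not anything from libc: the function name and the
temporary file name are made up.  On a system where stdio ignores 'b',
reading the same file with "r" and with "rb" gives identical bytes, even
when the data contains "\r\n".

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Illustrative only: write a few bytes including "\r\n", then read
 * them back once with "r" and once with "rb".  Returns 1 if both
 * reads yield identical bytes, 0 if they differ, -1 on I/O error.
 */
int
text_and_binary_match(void)
{
	static const char data[] = "line one\r\nline two\n";
	char path[] = "/tmp/fopen-b-XXXXXX";	/* hypothetical temp name */
	char a[64], b[64];
	size_t na, nb;
	FILE *fp;
	int fd, rv = -1;

	if ((fd = mkstemp(path)) == -1)
		return -1;
	if ((fp = fdopen(fd, "wb")) == NULL) {
		close(fd);
		unlink(path);
		return -1;
	}
	fwrite(data, 1, sizeof(data) - 1, fp);
	fclose(fp);

	if ((fp = fopen(path, "r")) != NULL) {
		na = fread(a, 1, sizeof(a), fp);
		fclose(fp);
		if ((fp = fopen(path, "rb")) != NULL) {
			nb = fread(b, 1, sizeof(b), fp);
			fclose(fp);
			rv = (na == nb && memcmp(a, b, na) == 0);
		}
	}
	unlink(path);
	return rv;
}
```

A result of 1 only says that stdio treated the two modes alike; it says
nothing about what a higher level library layered on top might do.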

kre


Re: athn0 interface not showing up after detection

2023-11-25 Thread Robert Elz
Date:Sat, 25 Nov 2023 15:21:13 -0500
From:tiny.sock7...@fastmail.com
Message-ID:  <43591e56-bbb6-4de5-bec5-37468ddb9...@app.fastmail.com>

  | I have those firmware files in my /libdata/firmware/if_athn,
  | I believe they came with the default install.

Yes, they do.   Your setup looks normal in that regard.   But I'm
afraid that means I can't assist any more, I know nothing about the
USB system or athn devices in particular.

You might get better results sending a PR describing what happens, than
just sending here to netbsd-users (where some of the more technically
oriented people don't necessarily read the messages).

kre



Re: athn0 interface not showing up after detection

2023-11-25 Thread Robert Elz
Date:Sat, 25 Nov 2023 12:51:56 -0500
From:tiny.sock7...@fastmail.com
Message-ID:  <9cc985c4-f389-4aa2-8315-6394bc697...@app.fastmail.com>

  | Is there an additional driver or command needed to load them into
  | kernel memory?

No, the kernel should simply load whatever it needs (into the usb device)
when it is recognised.   You might want to check that

sysctl hw.firmware.path

starts with /libdata/firmware:...   (or at the very least has /libdata/firmware
in it).

kre

ps: you still did not say which version of NetBSD you're using.




Re: athn0 interface not showing up after detection

2023-11-25 Thread Robert Elz
Date:Sat, 25 Nov 2023 01:59:36 -0500
From:tiny.sock7...@fastmail.com
Message-ID:  <0cda2184-60f3-4568-8900-0845a093e...@app.fastmail.com>

You didn't say which version of NetBSD you're using, that might be
important.

  | What might be causing this?

I personally know nothing of athn devices, but the man page does say:

 For USB devices, the driver needs at least version 1.1 of the following
 firmware files, which are loaded when an interface is attached:

   /libdata/firmware/athn-ar7010
   /libdata/firmware/athn-ar7010-11
   /libdata/firmware/athn-ar9271

Not sure if you need all three, or only the last one - do you have those
installed?   If not, that would be where I would start.

Also read athn(4).

kre



Re: buildworld failure due to md5 not supported by openssl3

2023-11-25 Thread Robert Elz
Date:Fri, 24 Nov 2023 19:45:00 +0100
From:Ede Wolf 
Message-ID:  <6b8e764f-6f34-4f4e-82e1-2c7e7b724...@nebelschwaden.de>

  | I am having somewhat cosmetic wm(4) issues though, but that is more for 
  | the alpha-port list, as at least on vbox - the only other machine I have 
  | with wm drivers - those issues do not appear.

As I understand it (which isn't very much, hardware issues aren't really
my thing) many hardware wm devices have errata that need working around,
and there's no guarantee that we have workarounds in the tree for all the
varieties that exist.  Things generally basically work anyway, but there
can be issues.   A software implementation (like on vbox) is less likely
to have problems like that, as any issues that arise can just get fixed in
the next vbox update - that's like many many orders of magnitude harder to
do with hardware that's already been sold and doesn't rely upon loadable
firmware.

  | Talking about virtualbox. On vbox 7.0.12 (running on a linux host) I was 
  | unable to install amd64 10RC1 with virtio-net.

Sorry, can't help with that one either, or certainly without a lot more
debugging info as to what was actually happening.

  | No issues with the emulated Pro/1000MT. Good enough for me

Yes, when I did use virtualbox (needed on my laptop previously
so I could run NetBSD) there were several cases like that - just find
the emulated hardware that works, and use that - forget anything that
doesn't...

The change to the order of the lines in that libsaslc/lib/Makefile
has been made in the HEAD sources - it needs fixing in NetBSD 10 as well,
and it looks as if Christos didn't send a pullup for the change, so I
will ...

kre



Re: buildworld failure due to md5 not supported by openssl3

2023-11-24 Thread Robert Elz
Date:Fri, 24 Nov 2023 17:21:42 +0100
From:Ede Wolf 
Message-ID:  <5b8928a4-32b5-4015-8eb1-2432d3eb6...@nebelschwaden.de>

  | For what it is worth, as you have probably known it before, here my 
  | confirmation: Swapping those lines and disabling kerberos the build 
  | finished without problems.

Thanks - I hadn't actually tested it, but I was fairly confident that
would be what happened.

  | So I cannot comment on how usable this build is.

Aside from not having Kerberos, it should be identical to the previous
one - if you haven't tested that either, ie: you haven't run HEAD at
all on your alpha yet, then please do send a message if there are any
issues when you do get a chance to try it.

I have asked if those 2 lines can be swapped around in the distribution
sources, and I suspect that will happen soon, unless there was some
obscure reason (rather than just an editing error) for the positioning.

kre



Re: buildworld failure due to md5 not supported by openssl3

2023-11-23 Thread Robert Elz


I just did an alpha build as follows.   The -V's are what's in your mk.conf I
believe (with MKKERBEROS=yes and USE_KERBEROS=yes) and it worked without issue.

The build host is close enough to yours (kernel is 10.99.10 but that's 
irrelevant - userland is HEAD from before -10 was branched, but not all
that long before).

The sources don't have the Makefile change that would allow building with
MKKERBEROS=no so I didn't try that.

 build.sh command:    build.sh -j 16 -V MKATF=no -V MKCLEANSRC=yes
     -V MKCLEANVERIFY=yes -V MKCOMPAT=no -V MKCVS=yes -V MKDEBUGLIB=no
     -V MKDOC=yes -V MKDTRACE=no -V MKGDB=no -V MKHOSTOBJ=no -V MKHTML=no
     -V MKINFO=no -V MKIPFILTER=no -V MKISCSI=yes -V MKKERBEROS=yes
     -V USE_KERBEROS=yes -V MKLDAP=no -V USE_LDAP=no -V MKLVM=no
     -V MKMANZ=yes -V MKMDNS=no -V MKNOUVEAUFIRMWARE=no -V MKNPF=yes
     -V MKPF=no -V MKPOSTFIX=yes -V MKPROFILE=no -V MKRADEONFIRMWARE=no
     -V MKREPRO=yes -V MKRUMP=no -V MKX11=no -V MKX11FONTS=no
     -V MKX11MOTIF=no -V MKZFS=no -V MKYP=no -V USE_YP=no -V MKHESIOD=no
     -V USE_HESIOD=no -V MKPAM=yes -V USE_PAM=yes -V MKSKEY=no
     -V USE_SKEY=no -m alpha -D /release/testing/alpha
     -O /usr/obj/testing/alpha
     -R /local/snap/20231123-testing-10.99.10-alpha
     -T /usr/obj/testing/tools -X /readonly/release/testing/src/xsrc
     -u -x iso-image
 build.sh started:    Thu Nov 23 20:04:16 +07 2023
 NetBSD version:      10.99.10
 MACHINE:             alpha
 MACHINE_ARCH:        alpha
 Build platform:      NetBSD 10.99.10 amd64
 HOST_SH:             /bin/sh
 getenv MAKECONF:     /dev/null
 MAKECONF file:       /dev/null
 TOOLDIR path:        /usr/obj/testing/tools
 DESTDIR path:        /release/testing/alpha
 RELEASEDIR path:     /local/snap/20231123-testing-10.99.10-alpha
 Updated makewrapper: /usr/obj/testing/tools/bin/nbmake-alpha
 MKREPRO_TIMESTAMP    Wed Nov 22 14:51:55 UTC 2023
 Successful make iso-image
 build.sh ended:      Thu Nov 23 20:06:04 +07 2023

While I use '-u' (update build) this is the first alpha build I've
done in decades I think (certainly the first on this system) so
everything was clean to start with (no .o files, no .d (dependency)
files, I had to actually mkdir the target (DESTDIR) directory - so
that was certainly empty).

So one thing to check is that while you didn't seem to be doing an
update build, so everything should have been cleaned before it
started, you might want to try making certain of that by manually
cleaning it all (rm -fr on relevant directories) and trying again.
It is possible that the change of MKKERBEROS allowed something to
not get properly cleaned, which later messed up the build.

kre

ps: I actually did a release build - rather than the same as yours,
not really by design, but just because that's what I normally always
do, and I didn't think to change it!   I was expecting the build to
fail the way you reported, so I didn't think it would make any
real difference.



Re: buildworld failure due to md5 not supported by openssl3

2023-11-23 Thread Robert Elz
Date:Thu, 23 Nov 2023 12:13:42 +0100
From:Ede Wolf 
Message-ID:  <77602506-626c-4fff-90ec-48e2f4aaf...@nebelschwaden.de>

  | Ok, I did not see this as yet verified, because, as with MKKERBEROS=yes 
  | and USE_KERBEROS=yes the build fails as well. Even though at a slightly 
  | different place, but still crypto related.
  |
  | But very likely that those are related. I'll just sit back, relax and wait.

I'll do a build in a minute, using your mk.conf settings, and see if I
can work out what other hidden dependency that we have that is causing
the problem.   This one isn't as obvious as the last one.

If you want to do something other than wait (and perhaps, just perhaps,
get a usable build) you can try altering

src/crypto/external/bsd/libsaslc/lib/Makefile

Swap the order of the two lines:

COPTS.crypto.c+=-Wno-error=deprecated-declarations
.endif

(so you get:

.endif
COPTS.crypto.c+=-Wno-error=deprecated-declarations

instead) and then go back to your original mk.conf with MKKERBEROS=no (etc)
and see what happens.

kre





Re: buildworld failure due to md5 not supported by openssl3

2023-11-22 Thread Robert Elz
Date:Wed, 22 Nov 2023 16:21:01 +0100
From:Ede Wolf 
Message-ID:  


  | # cat /etc/mk.conf

  | MKKERBEROS=no


That one is the problem, the COPTS.crypto.c entry that Martin mentioned
is not included if MKKERBEROS is "no".

I have no idea why.

kre



Re: buildworld failure due to md5 not supported by openssl3

2023-11-22 Thread Robert Elz
Date:Wed, 22 Nov 2023 16:21:01 +0100
From:Ede Wolf 
Message-ID:  


  | My command:
  |
  | ./build.sh -a alpha -m alpha -j 4 -r -M /data/obj -D /data/destdir -R 
  | /data/release distsets
  |
  | My mk.conf should be rather unspectacular as well:
  |
  |
  | # cat /etc/mk.conf

That all looks clean enough - but what about your environment?  Do you
have CFLAGS or COPTS or anything similar in the environment?

kre



Re: iscsid - lfs and ipv6 issues

2023-11-18 Thread Robert Elz
Date:Sat, 18 Nov 2023 10:46:18 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

And wrt this part:

  | The address string is later used in iscsid_driverif.c, a name
  | is resolved with gethostbyname(), so while an ipv6 address might
  | be accepted, the code lacks ipv6 support.

That's probably correct by default, but it looks to me that if you have
"options inet6" in /etc/resolv.conf then gethostbyname_r (which gethostbyname
calls to do all of the work) does ...

	if (res->options & RES_USE_INET6) {
		struct hostent *nhp = gethostbyname_internal(name, AF_INET6,
		    res, hp, buf, buflen, he);
		if (nhp) {
			__res_put_state(res);
			return nhp;
		}
	}
	hp = gethostbyname_internal(name, AF_INET, res, hp, buf, buflen, he);
	__res_put_state(res);

"options inet6" sets RES_USE_INET6 in res->options.

gethostbyname_internal() does all the real work of gethostbyname(), looking
up "name" for an address in the AF given by the 2nd param.

ie: if the inet6 option is set, then gethostbyname() will first look for
an IPv6 address (or addresses) and if found, return those.  If there are
none (or if inet6 is not set in the options) then it will look for an IPv4
address (AF_INET).

So, it might be possible to use iscsi with IPv6 without further changes.
Doing it that way would cause other gethostbyname() users to also get
given v6 addresses, which their code might not be expecting, so YMMV.
(ie: caveat emptor).

Using getaddrinfo() would be much better of course.
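
For comparison, a minimal sketch of the getaddrinfo() approach follows.
It is not the iscsid code; the function name is hypothetical and the
choice of SOCK_STREAM simply reflects that iSCSI runs over TCP.  With
AF_UNSPEC in the hints, the caller gets whichever address family the
name resolves to, without any resolver option affecting other programs.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/*
 * Illustrative only (not the iscsid code): resolve host/port with
 * getaddrinfo() and report the address family of the first result.
 * Returns AF_INET, AF_INET6, ... or -1 if resolution failed.
 */
int
first_family(const char *host, const char *port)
{
	struct addrinfo hints, *res;
	int family;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;		/* v4 or v6, whatever exists */
	hints.ai_socktype = SOCK_STREAM;	/* iSCSI runs over TCP */

	if (getaddrinfo(host, port, &hints, &res) != 0)
		return -1;
	family = res->ai_family;
	freeaddrinfo(res);
	return family;
}
```

Real code would walk the whole result list and try each address in turn
rather than looking only at the first entry.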

kre




Re: iscsid - lfs and ipv6 issues

2023-11-18 Thread Robert Elz
Date:Sat, 18 Nov 2023 18:25:58 +0700
From:Robert Elz 
Message-ID:  <28754.1700306...@jacaranda.noi.kre.to>

  | one way to do that might be
  | if (sp2 = strchr(str, ']'))


And in that, sp2 isn't needed, just use sp instead, leading to

sp = strchr(sp, ':');

(etc).

kre



Re: iscsid - lfs and ipv6 issues

2023-11-18 Thread Robert Elz
Date:Sat, 18 Nov 2023 11:26:50 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | k...@munnari.oz.au (Robert Elz) writes:
  |
  | >That looks to me as if it should work, and is a lot cleaner, though
  | >I doubt there's a great need to remove the [] if they were given.
  |
  | getaddrinfo() doesn't strip or handle brackets.

As I said before, I haven't looked at all at how the saved address string
is handled after that routine returns it - I assumed that something else
must be processing those (and probably in a way that allowed your 'x'
workaround to work) - but if not, by all means, remove them there, I don't
see any harm in doing that, as the user isn't being required to give them.

kre



Re: iscsid - lfs and ipv6 issues

2023-11-18 Thread Robert Elz
Actually, no, I don't think that will work after all - in an address
like

[fe80::1]:1234

the
+   sp = strchr(str, ':');
+   if (sp != NULL) {
+           if (strchr(sp + 1, ':') != NULL) {

code is going to happen, and set the port to 0 (instead of the
intended 1234) - it needs to ignore :'s inside [] the way the
old code was doing - one way to do that might be

	if (sp2 = strchr(str, ']'))
		sp2++;
	else
		sp2 = str;

	sp = strchr(sp2, ':');
	if (sp != NULL) {
		/* etc as it is in your patch */

That's very very very crude, but I think will do the right thing
for valid addresses.
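
Put together as a complete function, the idea might look like the sketch
below.  This is an assumption about how the pieces fit, not the actual
patch: the name split_port() is made up, and the "two or more ':' means
a bare v6 literal" test is taken from the fragment quoted above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the iscsictl code: split "host:port",
 * "[v6addr]:port" or a bare "v6addr" in place.
 * Returns a pointer to the port text, or NULL if no port was given.
 */
char *
split_port(char *str)
{
	char *sp, *start;

	/* skip past a bracketed literal so the ':'s inside are ignored */
	start = strchr(str, ']');
	start = (start != NULL) ? start + 1 : str;

	sp = strchr(start, ':');
	if (sp == NULL)
		return NULL;		/* no ':' left: no port */
	if (strchr(sp + 1, ':') != NULL)
		return NULL;		/* several ':' and no []: bare v6 */

	*sp++ = '\0';			/* truncate: str is now the address */
	return sp;
}
```

Like the fragment it is based on, this makes no attempt to reject
malformed input such as an unmatched '[' or ']'.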

kre



Re: iscsid - lfs and ipv6 issues

2023-11-18 Thread Robert Elz
That looks to me as if it should work, and is a lot cleaner, though
I doubt there's a great need to remove the [] if they were given.

kre



Re: iscsid - lfs and ipv6 issues

2023-11-18 Thread Robert Elz
Date:Fri, 17 Nov 2023 22:22:24 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | The address parser looks broken.

It certainly is, it is horrid.

  | For some reason the first character is skipped when it tries
  | to identify IPv6,

At the relevant point it doesn't really seem to care which addr
family, but is trying to deal with v6 address literals

  | I was successful with
  | iscsictl add_send_target -a 'x[ipv6-address]'

I can't imagine how that would work (how it avoids the current problem
is clear) - the relevant function simply copies the address (as a string)
to be processed later - for current purposes I didn't look to see how
that is processed into an actual sockaddr type address, and how that
can possibly work with that 'x' there, but if it does, there is very likely
more dubious code.

The actual arg parsing of that -a option (for add_send_target, maybe
other commands as well) is in

src/sbin/iscsictl/iscsic_parse.c

and the relevant function is get_address()

After checking that the address is present (not NULL or "") the code
does ...

	/* is there a port? don't check inside square brackets (IPv6 addr) */
	for (sp = str + 1, val = 0; *sp && (*sp != ':' || val); sp++) {
		if (*sp == '[')
			val = 1;
		else if (*sp == ']')
			val = 0;
	}

That "sp = str + 1" is your "skips the first character" - the problem is
that if the '[' is the first char (which you'd normally expect it to be,
if present at all) then it will never be seen, so val will remain 0, the
first ':' that is seen will then appear to be a ':' that separates the
address from the port number (rather than just part of the syntax of the
v6 address) and it is all downhill after that.   (When you put that 'x'
there, the '[' is seen, and everything up to the ']' is just treated as
the v6 addr, so this part of the code works.)

Simply removing that ' + 1' from the init of sp should fix that one.
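
The difference can be seen by running the same loop with both starting
offsets.  The wrapper below is only a sketch for experimenting: the
function name and the "skip" parameter are made up, the loop body is the
one quoted above.

```c
#include <stddef.h>

/*
 * Same loop shape as the quoted get_address() code; "skip" is the
 * offset the scan starts from (1 in the original, 0 with the fix).
 * Returns the index of the ':' taken as the port separator, or -1.
 * The caller must pass a non-empty string, as the original code
 * checks for that before the loop.
 */
int
port_separator(const char *str, int skip)
{
	const char *sp;
	int val = 0;

	for (sp = str + skip; *sp && (*sp != ':' || val); sp++) {
		if (*sp == '[')
			val = 1;
		else if (*sp == ']')
			val = 0;
	}
	return *sp ? (int)(sp - str) : -1;
}
```

For "[fe80::1]:1234" a skip of 1 stops at index 5 (a ':' inside the
brackets), while a skip of 0 stops at index 9 (the ':' before the port).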

But that's just the beginning of the problems...

The code goes on:

if (*sp) {

If that's true, we know that *sp == ':' from the loop above, but there
are two cases possible, one is an addr, followed by ':' and a port number.
The other is a literal IPv6 addr which isn't enclosed in [ ] (in which case
no port number would be possible, but that's the user's choice).

The code needs to work out which case we have, and it does that by:

for (sp2 = sp + 1; *sp2 && *sp2 != ':'; sp2++);

That is, simply look at the string following the ':' and see if there's
another ':' later, if there is, then the assumption is that this is
a v6 addr which didn't include the [ ] as protection.   That's OK (though
there are a million ways this stuff can fail to correctly handle various
broken input formats).

if (!*sp2) {

That is, there are no more ':' in the address, so the code assumes that
what follows the first one (*sp) is a port number, and parses that.

/* truncate source, that's the address */
*sp++ = '\0';

Now sp points past the ':' (which has been obliterated) at the port
number itself, and the string pointed to by str is the address, from which
the port number (and anything which follows) has been removed.

This is followed by code that parses the value of the port number,
and while it isn't particularly resilient against errors [aside: scanf
makes for crappy parsers, use strtol() instead], it isn't relevant to
anything here, so I'll skip that.
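
For what the strtol() alternative might look like, here is a minimal
sketch.  It is not the iscsictl code: the function name, the -1 error
return and the accepted range of 1 to 65535 are all assumptions made for
the example.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal port number at the start of s.
 * On success returns the port and, if endp is not NULL, stores a
 * pointer to the first character after the digits there.
 * Returns -1 if there are no digits or the value is out of range.
 */
long
parse_port(const char *s, const char **endp)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || errno == ERANGE || v < 1 || v > 65535)
		return -1;
	if (endp != NULL)
		*endp = end;
	return v;
}
```

The end pointer is what lets the caller check directly whether the
number was followed by '\0' or ',' without a separate digit-skipping
loop.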

}

After this, the code wants to move past the just parsed port number,
and see if that was terminated by '\0' or ',' - in the latter case
a group tag (whatever that is, and no, I don't need to be told) can
follow.

for (; isdigit((unsigned char)*sp); sp++);

That's skipping the digits that were the port number - but note that is
being done, whether or not there was a port number given (oops) - in
the case above where *sp2 == ':' (that is, we have a v6 addr with no
[ ] around it) then this is just nonsense (this is what's happening in
the case that was reported).

if (*sp && *sp != ',')
arg_error(arg, "Bad address format: Extra character(s) '

When that loop is done, we either stopped on the ',' or the '\0' (or
that's what was expected) - if it stopped on anything else, that's
an error.   In the reported case it is stopping at a later ':' in the
v6 addr, which implies that what came between the first ':' and the
second was a string of entirely decimal digits (or nothing, in a case
like fe80::...) with none of the alpha hex chars that can make up an
IPv6 addr.   Eg: mine is currently:
2001:fb1:12a:
in that case it would complain that the 'f' is an invalid "extra character".

Next the code goes on to the case where there was no port number (no ':'
was seen outside [ ] in the address) ..

} else
  

Re: Meaning of size of /dev/pts/ files

2023-09-25 Thread Robert Elz
Date:Mon, 25 Sep 2023 09:42:36 +0200
From:rockyho...@firemail.cc
Message-ID:  <1354f06f549eb36716bca02777cb7...@firemail.cc>

  | The /dev/pts/ files seem to have each their own size, as if they were
  | regular files.

Everything which has an inode (or equivalent) has a size (everything
that stat() can be applied to must have one, as a size field is in the
resulting structure).

  | First curious fact: `ls -l' doesn't show the size in
  | bytes of such files (for some reason).

Because it is meaningless.   POSIX says (in the definition of
the <sys/stat.h> header):

off_t st_size  For regular files, the file size in bytes.
   For symbolic links, the length in bytes of the
   pathname contained in the symbolic link.
   For a shared memory object, the length in bytes.
   For a typed memory object, the length in bytes.
   For other file types, the use of this field is
   unspecified.

The final sentence is the relevant one.
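
In application code that rule comes down to checking the file type
before looking at st_size.  The sketch below is only an illustration
(the function name is made up) and it leaves out the shared and typed
memory object cases for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Illustrative only: return 1 and store st_size if POSIX gives it a
 * meaning for this file type, 0 if its use is unspecified, -1 if
 * lstat() fails.  The shared/typed memory object cases are left out.
 */
int
meaningful_size(const char *path, off_t *size)
{
	struct stat st;

	if (lstat(path, &st) == -1)
		return -1;
	if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode))
		return 0;	/* device, directory, fifo, ...: ignore st_size */
	*size = st.st_size;
	return 1;
}
```

For a terminal device such as a /dev/pts entry this returns 0, whatever
number happens to be in the size field.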

  | Instead, `exa' shows their sizes:

RVP already indicated how you misinterpreted that.

  | So, second curious fact: the sizes of these pts files are not
  | related to the number of characters received by them as output of some
  | command.

Not curious, what you're looking at isn't the size field.

  | Any clue about what these sizes actually represent?

RVP answered that for what you're looking at.   The actual size, which is
in the stat() results (which applications should always simply ignore for
anything which isn't a regular file, symlink, or one of the memory types,
as it is unspecified - and which both ls and exa (whatever that is) are
doing, correctly) is irrelevant (and as RVP indicated, should always be
0, as nothing ever sets it to anything different).   Terminal type devices
don't get bigger (which is what the size represents) as you write data to
them, they just pass the data through to someplace else, and forget it.
They do tend to count how much they processed, but that's not a size, and
is terminal dependent data, so not available via stat() (so ls will certainly
never tell you that number).

kre



Re: segfault in libterminfo with ncurses with nethack

2023-09-02 Thread Robert Elz
Date:Fri, 1 Sep 2023 23:32:03 + (UTC)
From:RVP 
Message-ID:  

  | So, something like this:
  |
  | PREFER.curses=  pkgsrc
  | .include "../../mk/curses.buildlink3.mk"
  | .if ! ${PREFER.curses:U} == "pkgsrc"
  | .include "../../mk/termcap.buildlink3.mk"
  | .endif
  | .include "../../mk/bsd.pkg.mk"

Wouldn't it be better to just delete the include of the termcap
buildlink file entirely?   That is unless the application is
actually using termcap functionality directly itself.   If the
only reason it is there is because the NetBSD curses requires it,
surely the curses buildlink file should be adding it, when it is
needed (and not otherwise).

kre



Re: UEFI installation

2023-08-14 Thread Robert Elz


It would be possible to add a manual override in the installer, but
currently there is no such thing.

A better solution would probably be to simply set up all
possible boot methods (for the way the system is being
configured) without caring which method happened to be
used to boot the install image.

kre


Re: Using 'groff'

2023-06-17 Thread Robert Elz
Date:Sat, 17 Jun 2023 15:18:35 +
From:Todd Gruhn 
Message-ID:  


  | This works:
  | groff -man /usr/pkg/man/man1/man.1  -Tascii 2> /dev/null  |   more

I'm surprised, I would have expected it to need to be

groff -man -Tascii /usr/pkg/man/man1/man.1 ...

though with GNU tools one can never tell what they might allow.
(the order of the -m and -T options isn't important).

  | OR ,  does 'groff -man ...' always need to have a full dir-name
  | (/usr/pkg/man/man1/* )?

It needs to be given a path to the file(s) it is to format, yes.
groff is not a manual page reader, it is a document formatter.
It works for man pages because man pages are documents, but groff
itself has no idea that what is being formatted is a manual page,
nor where such things might be stored.

Note that "-man" is not an option - -m is the option, it says
which macro package to use, and "an" is the name of the manual macros.
In practice no-one ever does "-m an" (though you could) with the -m
arg to *roff - the macro package name is (by convention) always given
joined to the -m (as above, as -man in this case).   There are a whole
bunch of other possibilities, for documents written for those macros
(you have to use the right macros for the document) and usually the 'm'
is considered part of the macro set name: the manuscript macros ('s')
are -ms, memorandum macros ('m') -mm, Eric's macros ('e') -me, the
manual macros ('an') -man, the doc (new form macros) 'doc' -mdoc,
and the macros that work out which of man/mdoc is needed and use either
-man or -mdoc ("andoc") are -mandoc.

I would also suggest not redirecting stderr to /dev/null - if anything
is being printed to stderr, you (or someone) probably wants to investigate,
as it usually indicates some kind of error.

kre



Re: Advice for new travelling server: Intel Z690 chipset?

2023-05-05 Thread Robert Elz
Date:Fri, 5 May 2023 14:57:17 +0200
From:Johan Stenstam 
Message-ID:  <9a50686b-bd7c-4a00-84b9-3434395d0...@ihren.org>

  | But I’m concerned about the Intel Z690 chipset

No need, that works fine - I have a setup with that (definitely not in
a travelling system though - I can barely lift it) and it works just fine.

You said you didn't care, but if you were considering using (in any way
at all) the on-cpu graphics (assuming the CPU in the system you're looking
at has that) that is unlikely to be supported in NetBSD - not even sure if
it is recognised as a graphics device suitable for running wsfb or even a text
based console.

  | * disk performance from the multiple M.2 PCIe X4 Gen4 slots (PCH) devices?

Should be very good - for me enough that I had to add extra column width
in iostat output to make the results (transfers/sec in particular)
look reasonable...

Capacity is limited (I think there may be 4TB M.2 devices around now, but
common is just 2TB (or less, each)).

  | * networking: the NUC 12 has 10GbE (AQC113) + Intel® i225-LM.

That one I can't help with.

  | * USB keyboard: can this still be an issue?

No, that all just works.  I have used nothing else for years now (KB & mouse).
I use a wireless KB/mouse combo with just one dongle in a USB port for both.
Wired USB keyboards with a hub and a mouse plug in port on them also exist.

  | * a working console (there is no VGA, but 2xHDMI?) and 

Again, that might be an issue, depending what it is using for generating
the graphics.   But a cheap (old, perhaps even pre-loved) PCIe graphics
card is very likely to work OK, provided the system has a slot to put it
in, and sufficient power to drive it (some of the old ones were hungry).
Even if for some reason the DRM stuff doesn't work, it should at least
appear as a frame buffer, and be usable as the console, and for wsfb.

Any modern monitor/TV without HDMI support isn't worth considering, so
HDMI should be no issue (and if you need to run a VGA based old monitor,
or OHP, I think that HDMI->VGA converters exist).

I have just gone and had a look at the specs of the NUC12 Extreme -- apart
from having way less SATA availability than I have, that's almost identical
to my system.   Same CPU (or almost) (blindingly fast) and much of the rest
of the specs look similar (I could have more RAM than that supports, but I
do have just the 64GB that can have, and if anything, it is sometimes too
much - never seen any swap space used, even when doing a -j16 build of NetBSD).
The integrated WiFi is unlikely to be supported (until the new WiFi branch is
finished anyway - that's still some time away yet I believe).

In mine, I think:

Intel product 7af0 (miscellaneous network, revision 0x11) at pci0 dev 20 function 3 not configured

is probably that.   (I haven't used it, but the bluetooth which is on the
same board/chipset is recognised by NetBSD).

My Intel network (LAN) chip is I219V which works fine.
I also have an rge (2.5Gb - Realtek Semiconductor Killer E3000) which I
have no current use for, but is supported.

I have the i9-12900 graphics disabled in the BIOS, so that doesn't appear
in the dmesg output.

The specs I saw are not terribly clear (I didn't bother downloading the
datasheet) but it may be there's only a single PCIe slot available, so you
might need to choose between graphics & network expansions (near term support,
if it isn't there already, I don't know, for the i225 intel LAN is more
likely than for the integrated graphics).

kre



Re: Keeping NetBSD disklabel up to date

2023-01-27 Thread Robert Elz
Date:Thu, 26 Jan 2023 15:32:59 -0700
From:beaker 
Message-ID:  <14c843c8-9069-d45c-e103-fd1502c67...@lavabit.com>

  | I don't believe GPT is supported on the system in question.

If you're referring to old OS installations, you might be
right, without knowing the versions no-one could say (and
I couldn't in any case for linux).

But if you are referring to the hardware and BIOS, then it
will not care, GPT was designed to be sufficiently backwards
compatible that it should all just work.   You need
OSs and boot code that understand GPT, but that's just
software.   All the BIOS needs is an MBR (GPT retains
that) with boot code installed in it which can handle
the actual OS finding and loading, and you can install a GPT
version of that.

kre


Re: sending/receiving UTF-8 characters from terminal to program

2023-01-20 Thread Robert Elz
Date:Fri, 20 Jan 2023 08:55:45 + (UTC)
From:RVP 
Message-ID:  <4dd21c1f-f5c3-c3ba-96d8-cab73a0b...@sdf.org>

  | Both /bin/sh and bash output UTF-8 if given Unicode code-
  | points in the form `\u'. So,

I believe bash will take your current locale into account
when doing that, whereas neither /bin/sh nor /usr/bin/printf
do, they simply emit UTF-8 unconditionally.   This kind of
difference is (partly) why POSIX is not including the \u (or \U)
escape sequences in $'...' quoted strings in Issue 8.

Another is how the end of the hex digit sequence is detected: is it always
exactly 4 hex digits (or 8 for \U), or any number up to 4 (or
8) if followed by a non-hex char, or using as many hex chars
as exist?  To be portable (as input) such a string needs to
be exactly 4 (8) hex digits, and be followed by something
which is not a hex digit - the closing ' is often useful
there, it can always be followed immediately by $' to
resume quoting again (or just ' or " if those are adequate).
But that's just the input, you also need to be using a
locale using UTF-8 char encoding to get predictable output.
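
As a rough picture of what "emit UTF-8 unconditionally" amounts to, a
code point can be turned into bytes with no reference to the locale at
all.  The sketch below is not the /bin/sh or printf source; the function
name and the choice to reject surrogates are assumptions made for the
example.

```c
#include <stddef.h>

/*
 * Illustrative only: encode one Unicode code point as UTF-8,
 * unconditionally (no locale lookup).  Returns the number of bytes
 * stored in out[] (1 to 4), or 0 for a surrogate or a value above
 * U+10FFFF.
 */
size_t
utf8_encode(unsigned long cp, unsigned char out[4])
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;			/* surrogates are not characters */
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

For U+00E9 this gives the two bytes c3 a9, the same pair that shows up
in the hexdump output quoted in this message.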

kre
  |
  | $ printf 'néz' | hexdump -C
  |   6e c3 a9 7a   |n..z|
  | 0004
  | $ printf $'n\uE9z' | hexdump -C
  |   6e c3 a9 7a   |n..z|
  | 0004
  | $
  |
  | If that works, then check those UTF-8 bytes against whatever the
  | terminal emulator generated from your keystrokes for the `é'
  | in `néz'.
  |
  | -RVP


Re: -10.0_BETA panics when system is rebooting

2023-01-06 Thread Robert Elz
Date:Fri, 6 Jan 2023 22:04:26 +0100
From:=?UTF-8?Q?BERTRAND_Jo=c3=abl?= 
Message-ID:  <85d8d94d-7cd6-8f8c-3b67-8e97a7c00...@systella.fr>

I can't help with the panic cause, but:

  | [ 856605,000596] acpiout5 at acpivga0 (DD.5961966] dump device bad
  |
  | I don't understand last line as dmesg indicates :

That's because it isn't really a line, somewhere in the "DD.5961966"
string, one message has been overwritten by another.   The last line
is really just a (not all there) timestamp, and "dump device bad"

Do we do crash dumps onto raidsets?

kre



Re: 'cd' if HOME is unset

2022-12-26 Thread Robert Elz
Date:Mon, 26 Dec 2022 10:41:25 -0800
From:Michael Cheponis 
Message-ID:  


  | Well, as a zsh user:

What zsh does you'd need to take up with zsh developers.   But it is
one of 2 shells I tested which don't require HOME to be set for "cd".
zsh I am not all that surprised about.  It tends not to concentrate
a lot on conformance with other shells, but rather on what its designers
believe is better for its users.   (The other that does not error is dash.)

  | $ echo $HOME
  | /usr/mac
  | $ unset HOME
  | $ echo $HOME
  |
  | $ zsh -c cd
  | $(no change in directory, no error msg)

No error message, yes, but are you sure there was no change in
directory?   (Even if that change was into the directory it started from).

jacaranda$ (cd /; unset HOME; zsh -c 'cd; pwd')
/home/kre

Looks like it changed directory to me (from / to /home/kre - my normal home).

  | in all cases, directory does not change when HOME is not defined.

Note that in the test cases (like "(unset HOME; zsh -c cd)") the subshell
(parentheses) are so the "unset HOME" doesn't affect the shell from which
the command is run (you won't need to set HOME again after the test), and
the cd is run only in the context of the shell that runs it, so only that
process has its directory changed - in this test, that shell exits
immediately after, so doing the change this way is normally pointless,
here, it was done solely for the purpose of viewing the error message
(if any).

In the form I use above, rather than simply exit, the shell that ran
the "cd" then ran "pwd" after, to reveal what its directory was now.
You could change the command string to "pwd; cd; pwd" to see before
and after directories.
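
The "pwd; cd; pwd" variant can be wrapped in a function so that several
shells can be compared.   This is only a sketch: cd_probe is a name
invented here, and which shells you pass to it depends on what happens
to be installed.

```sh
# cd_probe SHELL - run "pwd; cd; pwd" in SHELL, started in / with HOME unset.
# (hypothetical helper, not part of any shell)
# The outer subshell keeps the "cd /" and "unset HOME" away from the caller.
cd_probe() {
	(cd / && unset HOME && "$1" -c 'pwd; cd; pwd')
}
```

For example: for s in sh ksh zsh dash; do echo "== $s"; cd_probe "$s"; done
The first line printed for each shell is the starting directory (/), the
last is wherever that shell ended up, with any error message in between.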

kre



Re: 'cd' if HOME is unset

2022-12-26 Thread Robert Elz
Date:Sun, 25 Dec 2022 15:33:57 -0800
From:Michael Cheponis 
Message-ID:  


  | Maybe it should print "$HOME is not set" in that case?

Did you try it?

It is easy...

(unset HOME; sh -c cd)

or use ksh (or some other shell) instead of sh to test it.

Script started on Tue Dec 27 00:01:34 2022
jacaranda$ (unset HOME; sh -c cd)
cd: HOME not set
jacaranda$ (unset HOME; ksh -c cd)
ksh: cd: no home directory (HOME not set)
jacaranda$ exit

Script done on Tue Dec 27 00:03:06 2022


Note that it is HOME, not $HOME, that is not set.   $HOME is a pathname
if HOME is set, or "" if HOME is not set; neither of those makes any
sense to describe as 'not set'.
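
To make that distinction concrete, here is a small sketch (describe_home
is a name made up for illustration) which uses the standard ${HOME+set}
expansion to tell an unset HOME apart from one set to the empty string:

```sh
# describe_home is a hypothetical helper, not part of any shell.
# ${HOME+set} expands to "set" when HOME is set (even to ""),
# and to nothing at all when HOME is unset.
describe_home() {
	if [ -z "${HOME+set}" ]; then
		echo "HOME is not set"
	elif [ -z "$HOME" ]; then
		echo "HOME is set, but empty"
	else
		echo "HOME is set to $HOME"
	fi
}
```

Run it as (unset HOME; describe_home) so that the unset does not affect
the shell you are typing into.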

kre


Re: 'cd' if HOME is unset

2022-12-24 Thread Robert Elz
Date:Sat, 24 Dec 2022 22:32:22 -0500
From:Jan Schaumann 
Message-ID:  

  | I happily admit that it's a rare edge case. I simply
  | find it surprising that 'cd' gives up if HOME is
  | unset.  Seems unintuitive to me.

It is how it is defined to work, and always has been.
Only dash and zsh seem to handle that case, no other
shells bother.   (I am a little surprised that dash does,
their general philosophy tends towards minimalist implementation,
with almost nothing that isn't required).

Better is just to always have HOME set.

For /bin/sh ~ works without HOME, so you could define

cd() {
	case "$#" in
	0)  set -- ~ ;;
	esac
	command cd "$@"
}

if you wanted to.   But that's not required to work either (I'm
not surprised that dash doesn't expand ~ when HOME is not set,
that's more in line with what I'd expect ... though tilde expansion
working is, in general, more useful, than cd with no args when
HOME is unset, so if a shell was to do just one, I'd generally do
it the way we do (as does bash)).

kre

ps: I am a little surprised that csh acts this way though, it
started from the Thompson sh (ie: pre 7th edition) and back then
there was no environment, and while csh had vars (incl home)
it couldn't have cd depend upon what was in the environment, and
had to either use the passwd db, either for "cd" or to init "home"
or both.   I guess that has been changed since.




Re: 'cd' if HOME is unset

2022-12-24 Thread Robert Elz
Why bother?   It is already clear that one cannot depend upon this
working, and nothing normally should ever have HOME unset, unless that
is done deliberately (perhaps even to prevent a simple "cd" from
going there).

kre



Re: NetBSD-10.0_BETA: clock: unknown CMOS layout

2022-12-24 Thread Robert Elz
Date:Fri, 23 Dec 2022 10:20:27 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | The message says that no century information is found in the CMOS RAM,
  | the hardware clock itself seems to keep only 2 year digits. The century
  | is then deduced as 1900 if the year number is less than 70 and 2000
  | otherwise.

I would hope that it is the other way around, if >=70, assume 1900,
and if < 70, assume 2000 (which is why 22 now will produce 2022).

  | This heuristic will fail 2030.

2070 would be more likely.   But given how very unlikely it is now
that anyone is ever going to (legitimately - people doing weird things
can deal with the issues) boot a system in the 20th century, ever again,
then perhaps we should be altering the heuristic to assume all years
are 21st century for now, and then in another 50 years or so, if systems
still exist with this issue, and we are still measuring civil time the same
way, change the heuristic again so that 22nd century years will work
with a similar boundary to what was used for 20th/21st century years,
until sometime into the 22nd century, where it can start assuming all
boots occur in that period (and so on, for as long as this is needed).
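
The two heuristics can be sketched in sh as below.   This only illustrates
the arithmetic, it is not the kernel's code; the function names are
invented here, and the argument is assumed to be a plain decimal number
from 0 to 99 with no leading zero.

```sh
# Pivot heuristic as I would hope it works: two-digit years >= 70 are
# taken as 19xx, anything smaller as 20xx.  (hypothetical helper)
year_pivot70() {
	if [ "$1" -ge 70 ]; then
		echo $((1900 + $1))
	else
		echo $((2000 + $1))
	fi
}

# Suggested alternative: assume every two-digit year is 20xx.
# (hypothetical helper)
year_assume_2000s() {
	echo $((2000 + $1))
}
```

With the pivot, 69 is the last two-digit year that lands in 20xx, which is
why the first failure would come in 2070.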

kre



Re: timers slow (sleep 1 taking five seconds)

2022-12-03 Thread Robert Elz
What you're seeing clearly isn't the TSC calibration problem that I
was having, and Michael fixed (but which would not be fixed in 9.2,
so it was a possible source) as what I was seeing was time actually
running slow, and you're clearly not seeing that - just internal
sleep timers running slow.

However, when I switched from using the (miscalibrated) TSC to using
a different timer hardware source, after having started with a badly
running TSC, I did observe behaviour much like you are reporting.

That is, time was running at the right rate, but internal sleeps were
slow - still running at the incorrect frequency.   That was mentioned
somewhere in the "Weird clock behaviour with current (amd64) kernel" thread.

This one I never bothered looking for, as it seemed (to me) likely to be
just a side effect of the miscalibrated TSC, but perhaps not - perhaps there
is some other bug in the internal timing that is being triggered by
something (if I had to guess, now, I'd say perhaps some overflow, just
because of how long your system has been running without a reboot).

I'd also assume this is something in (relatively) new code - clearly in
the 9 series, but I have had older vintage NetBSD systems running much
longer than 90 days without any issues, but I have never really run a
9.x system, except when that was HEAD - the long running systems tend to be
older (early 8 at best), and on HEAD, I tend to reboot much more frequently
to keep up to date.

If this is what it is, a reboot would almost certainly fix things, for a
while - but would also lose the ability to debug what is currently happening.

kre



Re: Upgrade 9.2->9.3 amd64 issues - device not configured

2022-11-09 Thread Robert Elz
Date:Wed, 9 Nov 2022 21:36:30 +0100
From:Riccardo Mottola 
Message-ID:  <60792a38-6644-2687-0a3d-38e9f8a50...@libero.it>


  | What do you suggest to do now?

Unless there is something wrong, nothing.   On a 9.x system you don't
need any boot code updates with upgrades, so not having one done is
harmless.

The error you saw is harmless (to you), there is nothing to fix.

kre



Re: tm(3) vs "double leap second"

2022-10-22 Thread Robert Elz
Date:Sat, 22 Oct 2022 23:21:04 -0400
From:Jan Schaumann 
Message-ID:  

  | I believe the notion of tm_sec allowing for 0-61 to
  | account for a possible "double leap second" was a
  | mistake

It was indeed, somehow the notion of "possibly two leap
seconds a year" was interpreted as "possibly two leap
seconds at once".

  | I believe this should be updated, but I don't know
  | whether this is a documentation/comment fix only, or
  | if we are actually somewhere using a value of 61?

I very much doubt that anyone is (any more) assuming that
tm_sec can ever be 61, or that it ever jumps from 57 to 0
without going through (at least) 58 first.  The patches you
proposed look fine to me (even if there were some code
assuming otherwise somewhere, those changes wouldn't affect it).

kre



Re: Backing up "stuff"

2022-10-18 Thread Robert Elz
Date:Tue, 18 Oct 2022 07:03:08 -0400
From:Todd Gruhn 
Message-ID:  


  | DVD+DL?  I have not heard this name.
  | What is DVD-DL?

Dual Layer.   Capacity about twice as much as a regular DVD (BluRay discs
hold much more however).

Needs dual layer blank discs, and a dual layer capable writer.

kre



Re: vi -r crash, netbsd-9 amd64

2022-10-15 Thread Robert Elz
Date:Sat, 15 Oct 2022 15:38:14 +1100
From:Paul Ripke 
Message-ID:  

  | Interestingly, the file was last updated about 10 days before the crash...
  | and I do see fsync calls from vi on the recno recovery file, too.

I'm not sure that crashes are necessarily the cause of corrupted vi
recovery files - it is possible that some sequence of editing mods is
what makes bad ones.   In any case, occasional corrupt recovery files
have been a vi "feature" for as long as recovery files have existed.

It would be nice to see this fixed, but as no-one really knows what
causes it, no-one is able to make it happen at will, and finding the
cause is not an easy task.


  | I must admit I haven't experienced this - and I either crash my system
  | or suffer accidental power loss every month or so.

You're running 9.3_STABLE right?   I'm running HEAD.  There is very
likely a difference there.   I also have quite a lot of (unused mostly)
RAM which can hold a lot of buffers, which rarely ever actually require
flushing in normal operations of my system.

  | The last corruption I saw was back in Aug 2019:

Note that the corruption I meant was file content corruption,
the kind you refer to here:

  | https://mail-index.netbsd.org/current-users/2019/08/19/msg036431.html

is meta-data corruption, which is far more rare (lots of work has gone
into making sure that doesn't happen, as if it does, and isn't corrected,
things just get worse and worse).   On the other hand a file with corrupted
data inside is simply a file with corrupted data inside, and only affects
users of that file.

  | I have postgres and mongodb running, but they both do the right thing
  | with fsync, etc,

None of the files I referred to would have been subject to any kind of
sync.   Certainly not fsync, but no general sync either (I have now taken
to running a
	while sleep N; do sync; done
loop, where I pick N more or less at random after each reboot - more or
less replicating what the old update program used to do.)

kre



Re: vi -r crash, netbsd-9 amd64

2022-10-12 Thread Robert Elz
Date:Wed, 12 Oct 2022 22:57:00 +1100
From:Paul Ripke 
Message-ID:  

  | "vi -r", etc, and it seemed to work fine. The recovery file that causes the
  | crash was left behind after a kernel panic.

The recovery file will be corrupted.   I have seen that kind of thing
from time to time - vi -r really shouldn't crash when it attempts to
recover the mangled file, but it really doesn't matter, nothing is likely
to ever recover it, vi core dumping is just its weird way of telling you
that.

The bigger issue is that we have an issue with file system flushing, on
panic, everything is normally supposed to be written, but that can't be
guaranteed ... a bigger issue is that we aren't flushing file data
almost ever.   A while ago, I had a power failure (no chance for the
kernel to flush anything before the system died - so no surprise there
were corrupted files).   What was a surprise was that files that had
last been touched 12 hours previously hadn't been updated.   That's not
really acceptable.

kre



Re: set -o emacs ; stty -echo

2022-10-08 Thread Robert Elz
Date:Tue, 19 Jul 2022 21:34:27 -0400
From:Andrew Cagney 
Message-ID:  


  | should, like for bash, this put the terminal into -echo mode?
  |
  | arm64$ echo $SHELL
  | /bin/sh
  | arm64$ set -o emacs ; stty -echo
  | arm64$ pwd
  | /home/cagney
  |
  | other combinations are equally puzzling.  for instance:
  |
  | set -o emacs ; stty -echo ; set +o emacs
  | doesn't flip to -echo mode either

Sorry this has taken so long for (my part of) looking into this.

I am now confident that sh has nothing to do with this at all, in
fact, sh never makes any changes (by itself) to any of the terminal
operating modes -- it will check to see if the line discipline happens
to be the old one (which I don't think exists on NetBSD any more, and
probably not anywhere else relevant either) but if it is, all it does is
refuse to enable job control, it doesn't even attempt to alter it.

Aside from that, the only tty/sh interactions are to change the terminal's
process group as needed as the foreground job alters, turn off O_NONBLOCK
if someone stupidly set it on the shell's input stream (whether a terminal
or otherwise) and monitor the window width of stdout (if a tty) for some
output to be able to be wrapped better.

Everything else related to tty modes is handled entirely by libedit.
Hence, I am passing this to Christos to look at.

kre




Re: GPT on RAID

2022-09-28 Thread Robert Elz
Date:Wed, 28 Sep 2022 13:53:42 +0300
From:Dima Veselov 
Message-ID:  <4fd66e90-86f4-9fc3-aaf8-27ce417b1...@lich.phys.spbu.ru>

  | I put GPT on RAID device (because disk is large) and it seems no good 
  | way to root autoconfig.

That's probably true, with the emphasis on "good" - but there is a way.

  | If there any way to autoconfig or tell kernel via bootloader that
  | my root reside on certain GPT partition which is on RAID device which
  | is on GPT of two disks?

I've had a setup essentially like that for years - you need to configure
the raid with "-A root" to tell raidframe to claim the root partition
(and autoconfigure itself), and the tricky (not good) part, the GPT
partition that is to be the root must have a wedge name of raidNa (where
raidN is the raid set).

So,
gpt label -i M -l raidNa raidN

(M is the relevant partition index).   Don't forget to change any
relevant NAME= entries (in fstab, or elsewhere) to match.

raidctl -A root raidN

It is somewhat bizarre, but works.

My system has:

NAME=raid7a /   ffs rw,log   1 1

in fstab, and raid7 includes...

raidctl -s raid7
Components:
   /dev/dk8: optimal
   /dev/dk18: optimal
No spares.
Component label for /dev/dk8:
   [...]
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid7
[and the same for dk18]

and

gpt show -l raid7
   start     size  index  contents
       0        1         PMBR
       1        1         Pri GPT header
       2       32         Pri GPT table
      34      990         
    1024  1047552      1  GPT part - "raid7a"
(etc).

I don't think it is important (or even relevant) that the root partition
happens to be the first one in the gpt on the raidframe.


kre



Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-04 Thread Robert Elz
Date:Sun, 4 Sep 2022 10:52:31 +0200
From:Matthias Petermann 
Message-ID:  <06b5d183-36c7-30bf-56be-8e507dffd...@petermann-it.de>

  | Hi Robert,
  |
  | please allow me one more question

Sure, but this one I cannot answer, I know nothing about the
module build system, so I am punting this to Paul Goyette,
the expert on all things module related.

Paul?

  |
  | On 04.09.22 10:42, Matthias Petermann wrote:
  | > Hi Robert,
  | >
  | > On 04.09.22 02:58, Robert Elz wrote:
  | >> if that implies that you rebuilt the kernel with HZ=1000 and then used
  | >> the zfs module built with HZ=100 then I think the first thing I would try
  | >> would be to rebuild the module(s?) with HZ=1000
  | >>
  | >
  | > Good point... I'll try that right away. This might coincide with my
  | > observation (race condition when initializing the ZPOOL, mail from just
  | > now).
  | I did build the kernel with build.sh as follows:
  |
  | ```
  | $ cd /build/netbsd-93-1000hz/usr/src/sys/arch/amd64/conf
  | $ cp GENERIC VHOST
  | $ vi VHOST
  |
  |  options HZ=1000
  |
  | $ cd /build/netbsd-93-1000hz/usr/src/
  | $ mkdir ../obj
  | $ ./build.sh -O ../obj -j 4 -U tools
  | $ ./build.sh -O ../obj -j 4 -U kernel=VHOST
  | $ ./build.sh -O ../obj -U releasekernel=VHOST
  | ```
  |
  | ...and picked it up from
  |
  | While for the *kernel* / *releasekernel* target the name of the kernel
  | configuration to be used can be provided, I don't see such an option for
  | the *modules* target. How can I make sure the modules are built with the
  | HZ option set in VHOST config? Or does it simply adapt these from a
  | previous run of the *kernel* target?
  |
  | Kind regards
  | Matthias
  |
  |

Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-03 Thread Robert Elz
Date:Sat, 3 Sep 2022 13:51:25 +0200
From:Matthias Petermann 
Message-ID:  <8c9bbdbc-5583-7f2d-4e04-ab550b6ee...@petermann-it.de>

I cannot really help with zfs issues, I know nothing about it, but:

  | The zfs module was loaded though, I also built the kernel with exactly
  | the same sources as the "original" one, so I assume for now that the
  | modules are compatible.

if that implies that you rebuilt the kernel with HZ=1000 and then used
the zfs module built with HZ=100 then I think the first thing I would try
would be to rebuild the module(s?) with HZ=1000

Long ago there was much work done to get rid of the constant HZ from the
kernel, and replace it with the variable hz (which is initialised to HZ).

However, I am by no means sure that this has consistently been maintained
in all the intervening years, and in particular with external modules, and
there might be places that are still (or again) using HZ (the constant)
rather than hz (the variable) (but beware just doing a grep, in many contexts
there is a #define HZ hz in scope that can defeat that simple way of checking).

While you definitely need new modules if the kernel version changes, I
don't believe that the converse applies.   I suspect you need new modules
(or might) when kernel options change as well (whether you do or not
depends upon whether the module is affected by the option that has been
changed - which is not always easy to detect, so just rebuilding is
generally safer.)

kre



Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-08-31 Thread Robert Elz
ps: arithmetic has never really been my thing, the 10.1ms I mentioned
should probably have been 11ms instead.

kre



Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-08-31 Thread Robert Elz
Date:Wed, 31 Aug 2022 13:42:13 +0200
From:Matthias Petermann 
Message-ID:  

  | I'm also curious about the effect on energy consumption - i.e., whether 
  | it's measurable.

I'm sure it's measurable, but I suspect you'd need a highly accurate
and very precise ammeter to do that.

kre



Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-08-31 Thread Robert Elz
Date:Wed, 31 Aug 2022 11:29:06 +0200
From:Matthias Petermann 
Message-ID:  

  | The guests' time increasingly lags behind with continued operation. Also 
  | the ntpd seems to have no compensating effect in the guests here.

This is a well known issue, ntp in the guest cannot help, the clock
drift is too much for its algorithms to handle.

  | What could be the reason for this? Can estd be a source of interference?

On estd, no.   The problem is that the qemu guest is running using time at
100 Hz, it expects a clock interrupt every 10ms.   To make that happen,
qemu (effectively) sleeps for 10ms, then signals a clock interrupt to the
guest.   But the host running qemu is also running with a clock at 100Hz.
When a process asks to sleep for 10ms, it needs 2 clock ticks to occur,
one which might happen an instant after the sleep request (be just 1us
or something since the request) and so would not be long enough, and another
10ms later - then we know that 10ms has passed.

If it happens (and always happened) just that way, all would be close enough
(and NTP would cope in the host with any minor time shift that occurs) but
what actually happens is that qemu wakes up from its 10ms sleep (or gets
a 10 ms SIGALRM - that difference doesn't matter) and as well as signalling
its guest, immediately requests a new 10ms sleep, for next time.

Here, rather than being 1us before the next clock intr happens, it is
more likely to be 1us after the previous one happened, ie: 9.999 ms
until the next one happens - we wait that long, then 10ms more, and the
sleep finishes - just about 20ms elapses to do one 10ms sleep.   The
guest is getting one clock interrupt every 20ms, but believing that 10ms
has elapsed, as that's what it requested.

Or that is how I understand it from what has been explained to me - the
details might not be exactly right (and I've never looked inside qemu)
but that's more or less the effect and its underlying cause.
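
The arithmetic can be sketched as below.   This is only a rough model of
my understanding, not anything taken from qemu or the kernel: it assumes
a sleep can only end on a host clock tick, that the request is made just
after a tick, and that one extra tick is needed to cover the partial
first one.   The function name is invented here.

```sh
# usage: worst_case_sleep_us HOST_HZ REQUEST_US   (hypothetical helper)
# Prints roughly how long (in microseconds) a sleep of REQUEST_US takes
# when it is requested just after a clock tick on a host running at HOST_HZ.
worst_case_sleep_us() {
	tick=$((1000000 / $1))			# host tick length, microseconds
	ticks=$(( ($2 + tick - 1) / tick + 1 ))	# whole ticks, plus one for the partial tick
	echo $((ticks * tick))
}
```

"worst_case_sleep_us 100 10000" gives 20000, the 20ms described above,
while "worst_case_sleep_us 1000 10000" gives 11000.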

  | All I could find so far is [1]. It is recommended to add the rtc switch 
  | to the qemu command. Is there any recommendation here in the meantime 
  | which setting works best with NetBSD?

About the rtc, no, no idea.   But to deal with the problem, aside from
major NetBSD code rewrites (the so called tickless kernel) the one
solution that should work is to run the host with HZ set a lot higher,
and leave the guest(s) at 100Hz.

For any modern host (anything you'd really want to use to run a qemu
guest in production) running with HZ=1000 will be fine (you'll never
notice the tiny extra overhead).   Some of the NetBSD ports already
run at that kind of rate - alpha has been at 1024Hz forever (and these
days, alphas are slow processors - though they weren't compared to
others when that change was made).  With this, the 10ms interrupts might
actually occur about 10.1 ms apart, but that much drift NTP should be
able to handle.   If not, run the host with an even higher HZ rate,
even 1 should work with a modern amd64 CPU (though I have never
tested that, nor heard of anyone who has - but 2000 should not be an issue).

If for some reason you cannot change the clock rate of the host (that is,
compile a new kernel with "options HZ=1000" in the config file) then make the
guests run with a much slower clock rate - nothing faster than 50Hz.

That should be acceptable (pdp-11's used to run at 50 or 60hz, and worked
OK) but needs to be even slower for clock drift issues.   The problem
is that if the OS clock rate is too slow, it will start to impact upon
(perceived) performance, and some application capabilities.

kre




Re: updating direct from 5 to 9?

2022-08-22 Thread Robert Elz
Date:Mon, 22 Aug 2022 21:59:28 +0700
From:Robert Elz 
Message-ID:  <8246.1661180...@jacaranda.noi.kre.to>

That is, I am replying to myself...   (sad that).

  | And second, find out why the existence of wedges has any effect on
  | mounting wd0a (would be different if you were using dk0 for some
  | reason on a filesystem not intended to have wedges).

And of course (I should think before pressing "send") that's because
the wedge has the block device open, and those are single use devices
(when the wedge has it open, nothing else can open it).

The question of why wedges were being created at all remains though.

kre



Re: updating direct from 5 to 9?

2022-08-22 Thread Robert Elz
Date:Mon, 22 Aug 2022 16:19:03 +0200
From:Martin Husemann 
Message-ID:  <20220822141903.ga13...@mail.duskware.de>


  | Booting a 9.3 install CD and digging around a bit I found the 9.3 kernel
  |
  |  - auto-creates bogus wedges dk0 (for the FFSv1 at /) and dk1 (for the
  |swap partition.

You might want to try to find out why it is making wedges for a
disklabelled drive at all?   DKWEDGE_METHOD_BSDLABEL (and the MBR
form) are supposed to be off by default.

And second, find out why the existence of wedges has any effect on
mounting wd0a (would be different if you were using dk0 for some
reason on a filesystem not intended to have wedges).

This has no bearing on why wedges created would not have the
proper settings of course.

  | I dimly recall the disklabel moved into the type 169 MBR partition
  | a long time ago - I bet 4.0 was before that change and this is what
  | now causes the broken wedge auto-detection.

I doubt you will win that bet.   I don't recall a time when the label
was ever not in the NetBSD partition, which means if it was, we're talking
about sometime in the early 1.x versions, or before.   (I started with 1.3,
but didn't really ever use that on an x86 system, more on sparc, which is
MBR free).

But that is easy for you to check - just hunt for the disklabel on the
"drive" you have configured the 4.0 system on, and see where it is.
Maybe somehow there are two, perhaps different ones?

kre



Re: updating direct from 5 to 9?

2022-08-21 Thread Robert Elz
Date:Sun, 21 Aug 2022 07:55:45 -0400
From:Greg Troxel 
Message-ID:  

  | But interesting that 9.2 build.sh works on 6.1.

Not relevant to the actual topic, but that stopped working for me,
I think even before -9 was released.   The basic system would build,
but something in X required tools that couldn't be compiled using
the -6 version of gcc.

But if you just meant the script, rather than a complete
successful build, then yes, it was designed (not by me, in
case that is not clear) to work almost anywhere.

kre



Re: set -o emacs ; stty -echo

2022-07-20 Thread Robert Elz
Date:Wed, 20 Jul 2022 07:43:28 -0400
From:Andrew Cagney 
Message-ID:  


  | Do you want a bug report?

It isn't needed, but you can make a PR for it if you like.

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-20 Thread Robert Elz
Date:Wed, 20 Jul 2022 08:25:00 +0200
From:Matthias Petermann 
Message-ID:  

  | Unfortunaly, the kernel panics shortly before it passes control to init:
  |
  | ```
  | [] panic: cnopen: no console device


What kind of console interface does that setup give you?   Emulated
serial port?   Emulated graphics interface?

One of the virtio devices (1043) is described in pcidevs as a virtio console
but it doesn't look like we have any kind of driver for that one (whatever
that actually means).   If their setup emulates some kind of standard
com port (serial) or vga, then it should be possible to attach to that,
but the boot code would need to tell the kernel which of those to use.

You'd probably do better asking on current-users (or perhaps tech-kern
but just pick one of those) than netbsd-users for this kind of info.

kre



Re: set -o emacs ; stty -echo

2022-07-19 Thread Robert Elz
I will take a look, but I suspect you're seeing the interaction
between what editline (libedit) wants, and its settings, and how
the shell interacts with it to preserve sane settings for other
commands that also use the terminal.

bash and readline are much more tightly coupled than sh and libedit.

kre

ps: when I look at this I am much more likely to use vi mode
than emacs ... should make no difference to the interaction
just to how libedit edits (key bindings).


Re: Re: LTFS support for HP tape drive devices

2022-07-16 Thread Robert Elz
Date:Fri, 15 Jul 2022 23:41:13 +0200
From:Colo Colo 
Message-ID:  <20220715234113.840de...@pobox.sk>

  |  But I am new to BSD, and the question is, if it is possible to combine
  |  LTFS for NetBSD & LTFS for FreeBSD & Linux source packages to make
  |  LTFS works with HP drives on NetBSD or FreeBSD

As long as it is software (and you pay attention to licensing issues, so
no GPL'd code tries to get into our kernel - and as little as possible in
the distributed userland) almost anything is possible.

If you're asking whether if you do the work to make it happen, would we
accept that, I'd say probably yes (assuming licensing is BSD compat,
ideally the code style is compat, and it works without breaking something
else) and packages from anywhere (which work) tend to be accepted in pkgsrc,
so by all means, go for it.

If you're asking for someone else to make that happen for you, then you'd
need to find someone with the desire to make it happen, or who can be
convinced to have such a desire.

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-11 Thread Robert Elz
Date:Mon, 11 Jul 2022 21:43:15 +0530
From:Mayuresh 
Message-ID:  <20220711161315.hoakmn5fgz76gtov@localhost>

  | Hetzner agreed to set a compatible chipset for my instance. So I finally
  | got the configuration I needed and have just installed NetBSD 9.2 on that.

Good.

  | Shouldn't qemu with the chipset setting they mentioned suffice for
  | testing?

Yes, I guess it should ... I'm not a qemu user, don't even have it
installed (though that is in progress now) so I may need some advice
on how to operate it to get the desired effect...

  | There are some difficulties in testing on Hetzner such as
  |
  | - As their reply suggests, there is no guarantee about which chipset
  |   your new instances will get. It's a bit random.

Yes, though they did say (from what you posted) that all the AMD cpu
instances have the one which has the problem, so that might not have
been a problem.

  | - I doubt whether they'd allow an arbitrary image. They have provided
  |   NetBSD 9.2 released image on their platform.

But that's a different issue...   And as you suggest, testing would
be much easier done locally than exporting ISO images for you to try
anyway.   Of course, even if we find something to fix, it isn't likely
to get into their provided image any time soon.

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-11 Thread Robert Elz
Date:Mon, 11 Jul 2022 15:25:42 +0530
From:Mayuresh 
Message-ID:  <20220711095542.mnyb4o54j5kd476c@localhost>

  | Following is a reply from Hetzner (they have quoted freebsd link, not sure
  | how relevant):

It looks to be in the area I already thought was perhaps related,
but that's old (2019) and we supposedly already have support for
revision 1, which FreeBSD apparently didn't (back then) if I read
all of that correctly (really just skimmed so far).

  | This is most likely due too a bug within BSD:
  | https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D236922

(for anyone else who goes looking, the =3D is a QP encoded = of
course, so just omit the "3D").

In what form would you need a new NetBSD 9.2_STABLE to appear in in order
to test the driver with some diagnostics added, so we can see what is
really going on?   Is just a kernel enough, or do you need an ISO
image (since unless something changes, it is unlikely to work any better
than the previous attempt - just provide more info - it doesn't need
to contain install sets etc, with no accessible disks, those are useless,
so it could be quite a small ISO - perhaps the one intended for installing
from the net - not that that could work either - no working net or disks).

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-10 Thread Robert Elz
Date:Sun, 10 Jul 2022 11:09:40 +0200
From:Martin Husemann 
Message-ID:  <20220710090940.ga16...@mail.duskware.de>

  | Yeah, I noticed that we already have support for vioscsi* at virtio?
  | [which is what the spec draft I linked ended in] and vioif* at virtio?
  | (at least in current), so it can't be this simple.

It has been there a while, I used to use it via virtualbox a laptop
or two ago...

It actually might be simpler than I thought - I was basing my "no support"
on a grep not finding the product ID anywhere but in the pcidevs file
(and files built from it).

But dev/pci/virtio_pci.c does this ...

    /*
     * Non-transitional devices SHOULD have a PCI Revision
     * ID of 1 or higher.  Drivers MUST match any PCI
     * Revision ID value.
     */
    if (((PCI_PRODUCT_QUMRANET_VIRTIO_1040 <=
          PCI_PRODUCT(pa->pa_id)) &&
         (PCI_PRODUCT(pa->pa_id) <=
          PCI_PRODUCT_QUMRANET_VIRTIO_107F)) &&
          /* XXX: TODO */
          PCI_REVISION(pa->pa_class) == 1)
            return 1;

and all the devices in question should be between 1040 & 107f, so
the only issue might be if the revision is not 1.   Given that the
comment says that as long as the rev is >=1 is supposed to work
(there's an earlier test for rev 0 - transitional devices) it
might be that that is the problem - the devices being configured
just might be rev > 1.   In that case, if nothing in our drivers
is affected by the rev bump, all that might be needed is to adjust that
final test (the one with the XXX comment...).   "If".

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-10 Thread Robert Elz
Date:Sun, 10 Jul 2022 09:58:48 +0200
From:Martin Husemann 
Message-ID:  <20220710075848.gc25...@mail.duskware.de>

  | Is this the spec for the virtual devices?
  | https://lists.gnu.org/archive/html/qemu-devel/2011-06/msg00754.html

No idea.   That's 11 years old, and says TBD for the PCI ID to be used.

It is also just the SCSI interface, the net, and I assume probably,
console interfaces (and maybe more of them) are likely to be needed
as well.

  | Might be worth to ping the tech-kern mailing list with the unconfigured
  | dmesg lines and that pointer, maybe someone has done some work on this
  | already or in a related area. Since it is easily available in qemu,
  | testing is easy.

Might be.

It isn't impossible that these are the same basic virtio interfaces as
Virtualbox/VMware/... use, just with a different manufacturer ID.   But
I'm not sure how to find out.

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-09 Thread Robert Elz
Date:Sun, 10 Jul 2022 09:46:26 +0530
From:Mayuresh 
Message-ID:  <20220710041626.zvnntquax4w7jnwq@localhost>

With this:

  | [ 1.236385] sd0 at scsibus0 target 0 lun 0:  disk fixed

you should do as I indicated in the earlier mail, see the config
for scsibus0 and then for whatever that is attached to (but perhaps
just scsibus0 will be enough) - you want the "at pciN" config line,
which should reveal the pci vendor and device code being used.

Then you need to find out how to configure the bigger drive to use
the same ones, rather than linux specials.

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-09 Thread Robert Elz
Date:Sun, 10 Jul 2022 09:39:08 +0530
From:Mayuresh 
Message-ID:  <20220710040908.exdwph7o7wuf3xrn@localhost>

  | Yes. Both network and disk are appearing not configured. PFA.

Vendor 1af4 is (in our pcidevs) QUMRANET - the web says it is Red Hat,
and used for virtio devices (which corresponds with our pcidevs, which
has device 1041 listed as "Virtio Network", 1043 as "Virtio Console",
1048 as "Virtio SCSI").

The latter corresponds with your linux boot dmesg...

[1.682859] scsi host2: Virtio SCSI HBA
[1.705095] scsi 2:0:0:0: Attached scsi generic sg0 type 0
[1.705167] sd 2:0:0:0: Power-on or device reset occurred
[1.706859] sd 2:0:0:0: [sda] 320004096 512-byte logical blocks: (164 GB/153 GiB)



At the minute I can see no NetBSD drivers for these devices, and I
have no idea what their interface is like (that would require examining
linux sources I suspect).

Check your 48GB version, my guess is that it isn't using those linux
devices.

You may have the 160GB system configured expressly for linux - the
host most likely doesn't have a special case for NetBSD (sad as that
is) but they will certainly have an option for windows.   Use that
instead (linux should still work just fine).

kre



Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-09 Thread Robert Elz
Date:Sun, 10 Jul 2022 01:14:05 +0530
From:Mayuresh 
Message-ID:  <20220709194405.towbk55owqgg3xsb@localhost>

  | If you could give me some words to grep that will help.

Look for "not configured" first.

Then ata wd[0-9] ld[0-9] and sd[0-9]  (you could probably just use wd0 ld0
and sd0) - but those assume a relatively normal x86 type install, there
are lots of names for disc drivers that might appear.
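
A rough sketch of that first pass, as a small sh function.  The name
scan_dmesg and the exact pattern are only illustrative, and the pattern
covers just the driver names listed above (other platforms use other
names):

```sh
# Show lines for unconfigured devices and for the usual x86 disk drivers.
# $1 = file holding dmesg output
scan_dmesg() {
    # "(^|[^a-z])" keeps "ld0" from matching inside a longer word.
    grep -E 'not configured|(^|[^a-z])(ata|(wd|ld|sd)[0-9])' "$1"
}
```

On a running system it could be used as "scan_dmesg /var/run/dmesg.boot".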

What would help would be to look at the 48GB install that works, see what
the drive is called in that one, look for that name in dmesg output
(probably in /var/run/dmesg.boot in a running system) then see what it
connects to, look for that, see what it connects to, etc.

Eg: I have

ld0 at nvme0 nsid 1
ld0: 1863 GB, 243201 cyl, 255 head, 63 sec, 512 bytes/sect x 3907029168 sectors
ld0: GPT GUID: fdb094e2-f6fc-45de-827a-106c6748e9c4
dk0 at ld0: "NetBSD_EFI", 522240 blocks at 2048, type: msdos
(and several more dkN at ld0: lines - those aren't important).

jacaranda$ grep nvme0 /var/run/dmesg.boot
nvme0 at pci2 dev 0 function 0: Samsung Electronics (3rd vendor ID) product a80a (rev. 0x00)
nvme0: NVMe 1.3
nvme0: for admin queue interrupting at msix1 vec 0
nvme0: Samsung SSD 980 PRO 2TB, firmware 3B2QGXA7, serial S69ENF0RA54347E
nvme0: for io queue 1 interrupting at msix1 vec 1 affinity to cpu0
[lots more interrupt related lines]
ld0 at nvme0 nsid 1

jacaranda$ grep pci2 /var/run/dmesg.boot
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
nvme0 at pci2 dev 0 function 0: Samsung Electronics (3rd vendor ID) product a80a (rev. 0x00)

jacaranda$ grep ppb1 /var/run/dmesg.boot
ppb1 at pci0 dev 6 function 0: Intel Alder Lake PCIe G4 Root Port 2 (x4) (rev. 0x02)
ppb1: PCI Express capability version 2  x4 @ 16.0GT/s
pci2 at ppb1 bus 2
[plus a bunch of false matches - ppb10 (and more) matches as well...]

jacaranda$ grep pci0 /var/run/dmesg.boot
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
[various things connected, including...]
ppb1 at pci0 dev 6 function 0: Intel Alder Lake PCIe G4 Root Port 2 (x4) (rev. 0x02)
Intel product 7af0 (miscellaneous network, revision 0x11) at pci0 dev 20 function 3 not configured


When you reach mainbus0 (or anything that looks like it) you can stop.

The "not configured" there I included just as an example, that's (what will
one day be) an iwl WiFi interface, but NetBSD doesn't have a driver for it yet.

If you get that info for the system that works, you can search for the
same things (except start in the reverse order) in the one that doesn't.
I can't think of any reason simply configuring a bigger drive should make
any difference - it is likely there are other config differences between
the two systems than just that.
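
The manual "grep, then grep for the parent" walk shown above can be
scripted.  This is only a sketch: attach_chain is a made-up name, and it
assumes the usual "child at parent ..." autoconfiguration lines, with an
optional "[ time]" prefix as printed by newer kernels.

```sh
# Print the "X at Y" lines from a device up towards the root.
# $1 = file holding dmesg output, $2 = device name (eg: ld0)
attach_chain() {
    file=$1 dev=$2
    while [ -n "$dev" ]; do
        # Strip any "[ 1.234567] " prefix, then anchor on "^dev at "
        # so that ppb1 does not also match ppb10.
        line=$(sed 's/^\[[^]]*\] *//' "$file" | grep "^$dev at " | head -n 1)
        [ -n "$line" ] || break
        printf '%s\n' "$line"
        # The third word of "dev at parent ..." is the parent.
        dev=$(printf '%s\n' "$line" | awk '{ print $3 }')
        dev=${dev%:}
    done
}
```

For example "attach_chain /var/run/dmesg.boot ld0" would stop once it
reaches a device (such as mainbus0) that has no "at" line of its own.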

kre



Re: Setting keyboard layout on xterm

2022-06-29 Thread Robert Elz
Date:Wed, 29 Jun 2022 17:56:09 +0200
From:Martin Husemann 
Message-ID:  <20220629155609.ga21...@mail.duskware.de>

  | I don't know if the in-tree xterm supports unicode

In systems that have it (HEAD does, I haven't checked earlier)
uxterm does.

uxterm is just "xterm -class UXTerm" under the hood, so yes, xterm
(or a recent enough xterm anyway) does support unicode, if started
in the appropriate way.

For me, it has no trouble showing me CJK/cyrillic/... spam messages from
nmh in an xterm.   Very decorative!

I haven't tried typing, with X, that usually requires an input method
to be installed, and I haven't done that yet (on other systems I allow
Thai input, which has worked fine ... my keyboards have all the symbols
on the keycaps, in addition to ascii ... quite crowded keycaps!)

kre



Re: how to turn off devices that monitor sensors

2022-06-21 Thread Robert Elz
Date:Tue, 21 Jun 2022 10:29:54 +0900
From:Henry 
Message-ID:  


  | Thank you for the ideas.  The manufacture date of this HP Pavillion
  | Notebook 15-au123d was 07/01/2017.  NetBSD is installed UEFI.

That should all be new enough that ACPI should work fine, and if the
other OS's (well windows) can shut down, then I'd assume that entering
S5 state should make that happen for NetBSD as well.

What other hardware exists in that system?

Does reboot (or shutdown -r) work correctly?

  | I tried `boot -2' but the startup stopped at the following.  I don't
  | know how to proceed.
  | boot device: 
  | root device:

At that point you should be able to type ? and get a list
of possible root device values, pick the right one, and type it.

But it is possible that without ACPI the disk isn't being seen by
NetBSD at all, and there will be nothing appropriate in that list.

This is about as far as I can take it, I don't know the x86 architecture
or the MD x86 code nearly well enough to suggest anything else that you
can try.

kre



Re: how to turn off devices that monitor sensors

2022-06-14 Thread Robert Elz
You might try "boot -2 netbsd" to disable ACPI completely,
in which case NetBSD would not be able to request ACPI S5 state
to shut down and power off (would need to use older BIOS
interfaces).

If your system is an older one (HP model numbers mean nothing
to me) then it should work OK without ACPI.

You could also confirm whether other OS's are able to
power off that particular system (the test needs to be
of that exact system, as minor variations like what BIOS
rev is installed, what other hardware, and the BIOS
config all could alter the results).  Installing other
systems should not be needed, "live" cd/dvd boots (or
USB stick) should be sufficient to test this.

But testing NetBSD with ACPI disabled first is the quick
test to perform.

kre


Re: how to turn off devices that monitor sensors

2022-06-12 Thread Robert Elz
Date:Sun, 12 Jun 2022 18:18:26 +0900
From:Henry 
Message-ID:  


  | The machine freezes with the last messages to the console:
  | acpi0: entering state S5

S5 is "off" (more or less), the system should be doing nothing except
waiting for someone (or something) to request that it be turned on again.

It looks as if your system is one of those which NetBSD doesn't know how
to really shut down, or the BIOS has bugs which are preventing that from
happening.   Do you have any "Wake on" type events configured in the BIOS?
If so, you might want to try disabling those and see if that might make
a difference - the BIOS might be keeping the system more alive than you
want so that wake on lan, or wake on usb keyboard, or something can work.

  | acpitz0: workqueue busy: updates stopped
  | coretemp0: workqueue busy: updates stopped
  | coretemp1: workqueue busy: updates stopped

As Martin said, that's just noise, because the BIOS hasn't reset enough to
stop that stuff from interrupting, and is apparently keeping enough power
enabled to the ram (or at least caches) that enough of NetBSD is still
around to report that stuff, I agree with Martin, those are almost certainly
not related to your issue (they're a symptom caused by it, not causing it).
[I also would agree that there's potentially a driver bug, once the system
is off (or supposed to be off) nothing should be being processed at all.]

kre

ps: this might, or might not, make it direct from me to gmail, so I hope
you're subscribed to netbsd-users so you can get the reply that way.  gmail
doesn't like me, and tends to bounce mail I send.




Re: Adding Raidframe to existent GPT system

2022-05-09 Thread Robert Elz
Date:Mon, 9 May 2022 18:02:47 +0200
From:Martin Husemann 
Message-ID:  <20220509160247.gd2...@mail.duskware.de>

  |    34         2014     Unused
  |  2048        65536  1  GPT part - EFI System

There's no good reason to align an EFI partition is there?
Those 2014 unused blocks would serve a more useful purpose
being included in that partition.

  | 67584  11720976384  2  GPT part - NetBSD RAIDFrame component

That one should be (and is) aligned.
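
A quick way to check a start sector for alignment, as a sketch.  The
helper name is_aligned is made up, and the default unit of 2048 sectors
(1 MiB with 512-byte sectors) is an assumption, not something taken from
the gpt output:

```sh
# Succeed if start sector $1 is a multiple of $2 sectors (default 2048).
is_aligned() {
    [ $(( $1 % ${2:-2048} )) -eq 0 ]
}
```

With that, "is_aligned 67584" succeeds (67584 is 33 * 2048) while
"is_aligned 34" fails.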

  | The two (identical) EFI partitions are not strictly needed (one would
  | do), but my theory was to allow booting from either disk if one of them
  | fails completely.

That makes sense, assuming you're using EFI booting.  But even if you
had only one, you'd want something of that size on the other drive
to keep them approx identical.   A second efi partition is better
than just more unused space.

It would be nice if the EFI partition could be in the raidframe,
but GPT doesn't allow overlapping partitions - if it did we could
put the EFI partition inside the raidframe, and then just make
the outer GPT partition table EFI partition refer to the same
section of the drive.

kre


Re: Adding Raidframe to existent GPT system

2022-05-09 Thread Robert Elz
Date:Mon, 09 May 2022 10:01:47 -0400
From:=?UTF-8?Q?C=C3=A9sar_Catri=C3=A1n_C=2E?= 
Message-ID:  

  | Got three drives, two for a RAID-1 array and one more for backup. 

ok, that's good.  At least last time I looked raidframe was unable
to autoconfigure hot spare drives - they need to be added after
each boot, which is a minor inconvenience, as it can easily be scripted
in rc.local ... just watch out for trying to add a drive as a spare
after an earlier failure has caused the spare to already have been
incorporated into the raid set.

It is also no great problem if you don't set up a hot spare, raidframe
runs fine (without redundancy) if a drive fails (had that happen
several times) then you can add the replacement manually after a
failure.  Just keep an eye on things so you detect failures quickly,
it is easy to get complacent after years of nothing happening.
Which reminds me...

  | Got it enabled for MBR, but it seems the GPT adds complexity
  | for Raidframe due that each GPT partition is offered as a new
  | disk/wedge to the system.

One man's complexity is another's flexibility.   You can partition
the drive, and make separate raid sets from partitions on different
drives (with more drives to play with, I do things like that,
using parts of different drives for different raid sets).

  | Should be created only one wedge at first, using the entire disk,
  | then apply raidframe to it

You can do that, or break the drives into smaller pieces and
make each of those a separate raid.  It all depends upon your
needs.   It appears from your current GPT that you are using
legacy (BIOS) booting, rather than EFI - that's fine, and NetBSD's
boot allows booting from and root on a raid1 (though someone else
will need to provide the recipe if you want to do that, I don't
run things that way) but the firmware will not understand
raidframe, so EFI booting needs an EFI partition in the physical
drive's GPT partition table, not in a raid partition in that.

  | (don't know if raidframe is ready for GPT?), 

It is.   Raidframe just gives you a simulated drive.  Other
than BIOS access you can do anything with a raidframe that
you can with any other drive.

  | then do again a GPT layout into the raid0 device to deploy the
  | filesystems?

That works, if you take care of booting (assuming you plan
on booting from this drive/drives).

The one thing to watch for is partition alignment and stripe
sizes, relative to your filesystem block (newfs -b value)
size, to avoid getting lots of read/modify/write cycles
happening when all that should have happened is to write
a block (once to each drive with RAID 1).   There are other
people better able to explain the issues here than me.

Much better to wait a day or so, get the correct info,
and set things up properly, than do everything, init the
raid, set up filesystems, and then discover that the config
forces poor performance, and you need to start over (if
write performance matters, it doesn't always).

kre


Re: Adding Raidframe to existent GPT system

2022-05-09 Thread Robert Elz
oh, I forgot to say, that if you do have multiple drives,
and are going to use raidframe, change the GPT partition type
from ffs to raid.  You can put a new GPT (or disklabel if
it is small enough) inside the created raidframe, which appears
to the system as a drive.

kre


Re: GPT and UEFI booting

2022-04-05 Thread Robert Elz
Date:Tue, 5 Apr 2022 12:54:40 +0200
From:Martin Husemann 
Message-ID:  <20220405105440.ga20...@mail.duskware.de>

  | This is meant for expert use firmware bugs workarounds, and there seems
  | to be no official way to toggle it off again.

If I am reading the gpt sources correctly, using gpt biosboot
without giving the -A flag should turn off the PMBR "active" bit.

kre


Re: is /bin/sh the almquist shell?

2022-03-29 Thread Robert Elz
Date:Tue, 29 Mar 2022 23:34:08 GMT
From:Mayuresh Kathe 
Message-ID:  <202203292334.22tny8vp027...@sdf.org>

  | should i start a separate thread asking for information
  | regarding netbsd's /bin/sh support for recursion?

New thread?   Probably not needed.

To actually answer the question depends exactly what you mean/need.

But as a simple (possibly incorrect) interpretation, the
original Bourne sh had no functions, so the only way it
could do anything recursive was by having a script run itself,
either as a standalone command, or via the '.' command.

All modern shells have functions (they are part of the POSIX
sh spec) and all shell functions have always supported recursion.
Not all shells support local vars in functions however, they are
not in posix.  Without them some recursive techniques can be
more difficult.

I believe that the original Almquist shell, and all descended
from it (which includes dash incidentally) support functions
and local variables.
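
As a small illustration only, a recursive function that relies on a
local variable.  It assumes a shell that provides "local" (which, as
noted, is not in POSIX); sum_to and result are names made up for the
example:

```sh
# Leave 1 + 2 + ... + n in $result, computed recursively.
sum_to() {
    local n
    n=$1
    if [ "$n" -le 0 ]; then
        result=0
        return
    fi
    sum_to $(( n - 1 ))
    # n still holds this call's value, because it is local.
    result=$(( result + n ))
}
```

After "sum_to 4" the variable result holds 10.  Without "local" the
inner calls would overwrite n, and the addition would use the wrong
value.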


Please read the sh man page.

kre


Re: manpage section-names

2021-10-30 Thread Robert Elz
Date:Sat, 30 Oct 2021 20:32:23 -0500 (CDT)
From:"Jeremy C. Reed" 
Message-ID:  

  | n I didn't search for definition of "n"

"new", back in about 1980...

kre



Re: proposed change to getty

2021-10-13 Thread Robert Elz
Date:Tue, 12 Oct 2021 09:42:55 -0400
From:matthew sporleder 
Message-ID:  


  | Do you mean modem like a telephone modem or modem like a serial port?

I meant telephone modems - they're what mostly use DTR as a functional
signal, and it is disabling that signal that all this is about.   (There
were other devices that behaved similarly, but they're even less likely to
be seen now.)

The serial port is the system interface to the modem (that is the interface
in question - modems that are implemented on ISA/EISA/PCI/PCIe/... cards
are entirely different beasts, though I believe that some of those, probably
even most, present a system interface that looks like a serial port, and
most likely manipulating the fake DTR on that fake serial port would have
similar effects).

And yes, I know that these things are not quite as commonly used these
days as they were in the 1970's and 80's ...

kre



Re: proposed change to getty

2021-10-12 Thread Robert Elz
Date:Mon, 11 Oct 2021 18:09:23 -0300 (ADT)
From:Jared McNeill 
Message-ID:  <5ab793c9-8cab-2e79-e6ba-8017d924b...@invisible.ca>

  | There's a 2 second sleep in getty before opening the tty
  | that has been there since before NetBSD

I don't recall if that was there when this
version of getty was created or not -- probably,
in which case it was probably also in the 7th
edition getty.   That all got done about 40
years ago, far too long to remember.

2 secs back then was the smallest sleep that
was guaranteed to be > 0 (the delay for sleep(n)
was between n-1 and n secs).  Since that is
no longer true, the smallest change that should
happen is s/2/1/, or use usleep() or nanosleep()
and make the delay even smaller, 200ms should
be enough for any modem.

Doing it in the driver is OK as well, but
probably needs to remain in getty until we are
sure that all drivers do this correctly.
Since you are handling this by blocking open
until long enough after the close has passed,
also delaying the open in getty should have
no real effect.

kre


Re: FreeRADIUS instability

2021-09-30 Thread Robert Elz
Date:Thu, 30 Sep 2021 08:37:44 -0400
From:Christos Zoulas 
Message-ID:  <49c53880-d427-489d-92fa-881cd01b5...@zoulas.com>

  | I have committed it to head, but I want to make sure that everything is
  | ok and that people don't prefer to fix it via a fork hook,

There's nothing wrong with that as a fix for the DNS resolver issue, but
I suspect that the underlying issue isn't fixed this way - any process that
has a kqueue open (by some code in some library, so not known to the
application, as here) will face the same problem and so need a similar
solution.

I'd suggest that when a fork happens, rather than closing the kqueue fd
in the child, it be left open, but redirected to a nothing object
(one which simply returns errors on almost all operations but ones that
only affect the fd (eg: dup) and close().)

That would still need something like your fix if the kqueue is desired
to work (again) in the child, but would avoid issues like the one in
question where the fd is recorded somewhere, and used after that same
fd has been reassigned elsewhere.

Alternatively of course, simply make kqueue remain open across fork, it
already needs to be able to handle multiple fd's aimed at the same queue,
right?   After all the fd can still be dup'd.

kre



Re: FreeRADIUS instability

2021-09-29 Thread Robert Elz
Date:Wed, 29 Sep 2021 13:48:51 -0700
From:"Pawel S. Veselov" 
Message-ID:  <72a4f226-78dc-22f9-4d4b-90e434b76...@gmail.com>

  | I think the only way to fix this is to have the resolver state
  | cleaned up thoroughly after fork(). I can't see how this can be
  | worked around by applications.

Maybe put a call to res_init() in the child, immediately after a successful 
fork (before any other fd manipulation).

That will try to close the old kqueue, which will fail, but no-one cares,
and then open a new one.

kre



Re: /bin/sh fd 12

2021-09-14 Thread Robert Elz
Date:Tue, 14 Sep 2021 21:26:43 +0700
From:Robert Elz 
Message-ID:  <14651.1631629...@jinx.noi.kre.to>

  | the fix is almost done.

And it was committed.

But then it was just almost almost done ... now it should really be
almost done.

What I committed last night is OK, for normal sh use, but can
be made to exhibit somewhat weird behaviour if subjected to
exotic testing (trying to see how much of this really works, etc).

I have a much better fix (for the "fd 13" issue) being built for
testing now, but it won't be committed until much later today.

kre



Re: /bin/sh fd 12

2021-09-14 Thread Robert Elz
Date:Tue, 14 Sep 2021 09:12:17 -0400
From:Jan Schaumann 
Message-ID:  <20210914131217.gk6...@netmeister.org>

  | Do you want me to send-pr the redirection to fds
  | 12/13?

You can if you want, but the fix is almost done.   (My test
build is just completing now, then I need to run tests to make
sure there are no regressions).

As I said in an earlier message, the fix for 12 (any fd the sh has
opened for its own needs) is trivial, but dealing with the issue with
13 (temporarily moved user fds) is messier, but I think I have it almost
done (there are still one or two minor issues to deal with, which I will
get to in a later fix).

kre





Re: /bin/sh fd 12

2021-09-14 Thread Robert Elz
Date:Tue, 14 Sep 2021 06:14:31 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | /bin/sh uses /dev/tty for job control which is enabled automatically
  | when running as interactive shell. But there is a -m option where
  | you can enable/disable it, i.e. 'sh +m' runs a shell with job control
  | disabled and descriptor 12 not open.

And that's correct, rather than what I thought when I replied to the
original message earlier.

  | Maybe kre@ knows if a shell should allow redirection to its
  | own internal file descriptors.

It shouldn't.  Fixing that one is trivial.   Fixing the fd 13 one is
trickier (but will happen).

  | (Our) ksh only supports the single digit descriptors 0..9 for redirection

All ksh versions, I believe.

kre



Re: /bin/sh fd 12

2021-09-14 Thread Robert Elz
Date:Mon, 13 Sep 2021 23:32:39 -0400
From:Jan Schaumann 
Message-ID:  <20210914033238.gj6...@netmeister.org>

  | 0, 1, and 2 are obvious, but fd 12 did not seem
  | obvious to me.
  |
  | Descriptor 12 being open to the current terminal means
  | I can do this:
  |
  | $ echo foo >&12
  | foo
  | $ 

That's a bug, I will fix it.

  | But I can also:
  |
  | $ echo foo >&13
  | foo
  | $

That's also a bug, a similar one, the same fix should apply.

  | even though fd 13 did not show up under /proc/$$/fd/.

No, it is created for the echo command.

  | Where does that fd come from, and why is not shown
  | under /proc/$$/fd?

When you redirect standard output of a built-in command, the
existing standard output needs to be moved somewhere else (saved)
before the new one can be opened (dup'd in this case).  13 is
the next available fd, so that's where it is moved to - just in
time for the dup() back to fd 1...   When echo is done, fd 13
is moved back to fd 1, so it is closed again before you get a
chance to look.
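
The same save / redirect / restore sequence can be imitated by hand,
which may make it easier to picture.  This is only a sketch of the idea,
not the shell's internal code: fd 7 is an arbitrary choice here (the
shell itself picks the next free fd, 13 in the case above), and
redirect_demo is a made-up name.

```sh
# Hand-written equivalent of "echo foo > file" for a built-in.
# $1 = file to write to
redirect_demo() {
    exec 7>&1          # save the current stdout on fd 7
    exec 1>"$1"        # open the new stdout
    echo foo           # the built-in runs with fd 1 redirected
    exec 1>&7 7>&-     # put the saved fd back on 1, close the copy
}
```

Between the first and last exec the saved fd is open, and it is gone
again by the time the function returns.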

  | And what's the purpose of fds 12 and 13?

12 is the script input (which, when there is no script, is a
copy of stdin, which is the terminal ... strange as it seems,
when a tty is stdin/stdout/stderr it is open read/write on all
three fd's).

13 is as above.   But those numbers are not fixed, in various
circumstances others might be used.

  | When using /bin/ksh, I see a different extraneous fd, fd 10,

Same thing as fd 12 in /bin/sh - all Bourne shell clones will have
something similar.

  | but I can't write to it:

/bin/ksh only allows you to reference fd's 0..9 (so do many shells
incidentally, that's all that's guaranteed by POSIX).  That's why
the "illegal file descriptor name".

  | Is this documented anywhere?

No.   Aside from /proc/*/fd these things are supposed to be invisible
(an internal implementation detail) - you won't see them via the fdflags
sh built-in for example.

kre



Re: backspace in wscons console sends ^H to processes

2021-07-19 Thread Robert Elz
Date:Mon, 19 Jul 2021 08:49:31 -0400
From:Greg Troxel 
Message-ID:  

  | As additional background, IMHO all of this confusion arose from the
  | differing setups of DEC computers and the IBM PC.

It is older than that.

  | On a real terminal as
  | one would have used with a PDP-11 or VAX in the 70s/80s/early-90s,

"Real terminals" existed long before PDP-11's or VAXen.  They had keys
that struck paper and made marks.

On those backspace moved backwards a character, and typing something else
overstruck the previous character (sometimes intentionally, often just
making a mess).   Delete did nothing at all.

Some of those paper terminals also had paper tape punches and readers.
This allowed "off line" preparation of input, or messages (these terminals
were used for telex / telegraph type operations, more than computers,
computers just needed something for interactive input, that is something
different than punched cards, and these were available).

When preparing a punched paper tape, to erase a mistaken character, one
relied on "delete does nothing" along with "delete is 0x7f", which when
converted to even parity format, is 0xFF, and since a "1" was recorded on
the tape as a hole, punching a delete changed whatever was there before
into the delete code (all rows punched), which, as above, did nothing.
But to overstrike the previous character, one needed first to move backwards
to be over it - that was what a "backspace" accomplished.
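
The parity arithmetic can be checked with a few lines of sh.  This is
just a sketch, assuming a 7-bit code with the parity bit added as the
eighth (0x80) bit; even_parity is a made-up name:

```sh
# Print the 8-bit even-parity form of a 7-bit code.
# $1 = code, decimal or 0x.. hex
even_parity() {
    c=$(( $1 ))
    v=$c bits=0
    # Count the 1 bits in the code.
    while [ "$v" -gt 0 ]; do
        bits=$(( bits + (v & 1) ))
        v=$(( v >> 1 ))
    done
    # An odd count needs the parity bit set to make the total even.
    if [ $(( bits % 2 )) -eq 1 ]; then
        c=$(( c | 0x80 ))
    fi
    printf '0x%02X\n' "$c"
}
```

"even_parity 0x7f" prints 0xFF: DEL has seven 1 bits, so the parity bit
is set as well and every row on the tape ends up punched.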

So, on "real terminals" the sequence to erase the previous character was
"backspace delete", and yes, the user had to type both of them - and to
erase two characters, one needed backspace backspace delete delete (etc).

When "glass ttys" started to become popular, their manufacturers
initially provided both keys - but computer systems wouldn't require
users to type both chars, but different systems picked differently.
Some used DEL (most Digital OS's did that), others used BSP (most
other systems, since that was the thing that worked easiest on a glass
tty - overstriking was destructive, so to replace one char with another
one simply did "backspace, replacement").  But DEC had settled on DEL
as the preferred choice before glass ttys (or the "modern" form, Tektronix
had a storage scope kind of thing, which was more like a paper terminal,
just without paper) were invented.

Unix used # for the erase character, as on a paper terminal you could see
that, count how many # chars, and mentally erase that number of previous
other characters (and @ was the line kill character, DEL was "interrupt").

Many unix users (but not all) had come from a DEC background, and in particular
a lot of BSD users, where there was "VMS vs BSD unix" type competition all over
the place.   So when BSD changed the defaults from # and @, they picked the
DEC convention (DEL ^U ^C) - which irritated non-DEC users a bit (like me)
who used non-DEC glass terminals, with BSP in a convenient location, and DEL
somewhere obscure (more obscure, usually smaller key).   Never mind, it was
user configurable (had been since the very early days, at least for erase and
kill - interrupt and quit were not configurable initially - which was one
reason many people in the early days used BSP for erase, DEL remained hard
wired as interrupt).

  | That led, I think,  i386 unix (386BSD, then early NetBSD) to let the key
  | send ^H and configure erase to ^H, breaking emacs

That alone IMO is the biggest feature.   Anything that breaks emacs, even in
trivial ways like this, is GOOD.

  | I don't see that as possible, and I have no idea why you would want
  | that.  Once you have the key that is logically the delete key sending
  | DEL as original ASCII intended,

It certainly intended nothing of the kind, DEL was "deleted" just as NUL
was "never entered" (empty space on the paper tape where nothing had been
punched).   Both were simply ignored and had no effect whatever (and so
could be used for padding characters after sending something which the
terminal would take a long time to execute, like carriage-return or line
feed.

If you ever actually used paper tape (I did) the last thing you'd ever want
was for DEL to start being interpreted as anything other than "nothing here".
On the other hand, as backspace never (usually) ever got onto the tape,
it changed its position instead, that one turns out to be a nice choice for
the erase character (paper tapes don't need it - the erasing is already
done).

That using ^H (BSP) as erase screws emacs users is just a bonus point.

  | What are you trying to
  | accomplish with this?   Or are you asking "is there some way to have
  | multiple characters function as erase in the tty"?

But that's a good question.   The tty driver (which is all you get when
you're in cat - unlike in a program which uses libedit or readline, or
similar where all this is done in those libraries, or in the program itself)
has exactly one erase character - you can set it to anything you 

Re: where is device manufacturer/model kept?

2021-06-28 Thread Robert Elz
Date:Mon, 28 Jun 2021 12:18:50 + (UTC)
From:RVP 
Message-ID:  <556bb7f-3792-635e-86ed-6d7c6b752...@sdf.org>

  | echo $(sysctl -n machdep.dmi.system-vendor)

That's a convoluted way of writing
sysctl -n machdep.dmi.system-vendor
and one which could fail if the string just happened to contain
the "wrong" characters (depending upon which version of echo is
being used for which are "wrong" for this purpose).

kre




Re: procfs difference between NetBSD and Linux

2021-06-08 Thread Robert Elz
Date:Tue, 8 Jun 2021 06:45:28 +
From:David Holland 
Message-ID:  

  | No such luck, 1000+ atf failures with a supposedly clean tree,
  | something's badly borked. Might take a while :-(

Something's locally not quite right ... I see nothing like that, and
all the failures I'm seeing now are either the "normal" ones that
happen everywhere (various ptrace test failures etc) or ones that are
caused by the kernel I'm running not being GENERIC (it has no audio
at all, not even pad, so lots of audio tests fail .. they should skip
rather than fail, but that's a different problem).   I don't have MODULAR
turned on, so several tests which want to load kernel modules fail (again
should skip) and I don't have COMPAT32 so everything that wants to try
running a 32 bit binary also fails (also should skip, or simply not be
attempted at all).

Aside from that I have a bunch of c++ test failures, not sure whether
those are normal or not, but I can't imagine that changing namei should
have any effect there.

So, my guess would be some trivial problem in the recent editing -- even
with your original patch, untouched, I didn't get anything like that number
of test failures.

kre

ps: send me (off list) updated files (or patches against HEAD) if you would
like me to take another look.




Re: procfs difference between NetBSD and Linux

2021-06-06 Thread Robert Elz
Date:Sun, 06 Jun 2021 00:28:50 +0700
From:Robert Elz 
Message-ID:  <28802.1622914...@jinx.noi.kre.to>

Once more, into the self-reply...

  | (all the rest of the files
  | your patch modified are as you modified them).

It turns out there is another fix needed, in vfs_vnops.c

In that one, the patch did ...

-	if (fmode & O_CREAT) {
+	/*
+	 * 20210604 dholland ditto
+	 */
+	if ((fmode & O_CREAT) != 0 && ndp->ni_dvp != NULL) {

which means that we only get into the following code (the if only
succeeds now) when we have a parent vnode, which now only happens
when the target node doesn't exist.   When it exists, we won't be
creating anything, so no parent gets returned.

The code that followed did several things.   It either actually created
the file (or at least attempted to) - that part is still fine, and still
works - or, if the target exists (no create needed), released the parent
vnode (skipping that part is fine, since we don't have it).   Also, if
O_EXCL is set, it returned EEXIST - that's OK, as if O_EXCL is set, we
don't do this modified code; the EEXIST will come from namei() instead
of this code in some cases, but no-one cares where it comes from.
But the code also cleared the O_CREAT bit, and no longer does.

This means that, as O_CREAT remains set, we don't bother with
vn_openchk(), which means things like O_REGULAR no longer work.
(The permission checks are all in there too!)

I found this when I saw that the fopen("/dev/null", "wf") (and other
similar) tests in tests/lib/libc/stdio/t_fopen failed in the ATF test run
when I got time to go through the failures (OK, in reality I stopped at that
point, I'll run them all again with the fix for this).

I think that all that might be needed is to clear O_CREAT in the else
case of this if .. that was pointless before as we never got there with
O_CREAT set, but now we can.   Once that's done, the t_fopen test succeeds
(or as much as it can for me, I don't have MODULAR in my kernels, so a
couple of the sub-tests are skipped, but those are unrelated to these
changes).

I have done some testing with that change made, but I need to run all the
ATF tests, and make sure there's nothing else that's now failing and shouldn't
be.   All (quick & dirty) tests I have run on the various situations related
to what looked like they might be problems here are working now.

I also need to think more on the possible permutations.

kre



Re: procfs difference between NetBSD and Linux

2021-06-05 Thread Robert Elz
Date:Sat, 05 Jun 2021 23:03:05 +0700
From:Robert Elz 
Message-ID:  <2011.1622908...@jinx.noi.kre.to>

  | Replying to my own message again (draw your own conclusions)...

And now I am replying to my reply to my message.   All remaining hope is lost!

  | Building now and then will test this version soon (I had already run
  | the AFS tests, which don't test this particular scenario, apparently,

And of course I meant ATF tests.   Did you need any more proof?

  | as they all worked about as well as they typically do for me, certainly
  | no kernel crash from them).

I will run them again on this version, and won't reply again unless there
is something worth saying ("they worked, as well as usual" isn't it).

But for now, my modified version of dholland@'s patch is looking good:

netbsd# >/
-sh: cannot create /: is a directory
netbsd# >/bin
-sh: cannot create /bin: is a directory

No more panic, and the results that we want.   Further, with /proc
mounted, and this (modified from the original in this thread to
supply a little more info) test program:

#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

int main ()
{
   int fd, new_fd, err;
   char buf[PATH_MAX];

   fd = openat (AT_FDCWD, "foo.txt",  O_RDONLY|O_NOFOLLOW|O_DIRECT);
   err = errno;
   printf ("fd = %d (flags %#x) err=%d\n", fd, fcntl(fd, F_GETFL, 0),
(fd == -1 ? err : 0));
   sprintf(buf, "/proc/self/fd/%d", fd);
   sleep(2);
   new_fd = openat(AT_FDCWD, buf, O_RDWR|O_CREAT|O_NONBLOCK, 0744);
   err = errno;
   printf ("new_fd = %d (flags %#x) err=%d\n", new_fd,
fcntl(new_fd, F_GETFL, 0), (new_fd == -1 ? err : 0));
}

(and after creating "foo.txt" of course), the results are...

fd = 3 (flags 0x80000) err=0
new_fd = 4 (flags 0x6) err=0

The "O_DIRECT" in the first open I added just so the result from
F_GETFL wouldn't be 0, it is otherwise meaningless (that is the 0x80000).
The flags value is O_DIRECT|O_RDONLY, as expected, O_NOFOLLOW is an operation,
not a mode, and isn't saved.

For new_fd, flags 6 is O_RDWR|O_NONBLOCK (O_CREAT isn't saved, naturally,
that's also an operation, not a mode).

I suspect this is what we want.

Without /proc mounted ... and yes, initially I forgot it was needed,
we get:

fd = 3 (flags 0x80000) err=0
new_fd = -1 (flags 0xffffffff) err=2

which also looks correct, not that anything there is in any way related
to these changes in that case.
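
The split between open flags that are "operations" and flags that are
"modes" can be checked from userland.   What follows is only a sketch:
flags_after_open() is a made-up helper name, and it is meant to be called
with O_NONBLOCK rather than O_DIRECT so that it does not depend on the
filesystem supporting direct I/O.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` with `oflags` (mode 0644 if the file gets created) and
 * return what fcntl(F_GETFL) reports for the new descriptor, or -1 if
 * the open itself failed.  The descriptor is closed again before
 * returning.  (flags_after_open is a hypothetical helper name.)
 */
int
flags_after_open(const char *path, int oflags)
{
	int fd, fl;

	fd = open(path, oflags, 0644);
	if (fd == -1)
		return -1;
	fl = fcntl(fd, F_GETFL, 0);
	close(fd);
	return fl;
}
```

Called as flags_after_open(path, O_RDWR|O_CREAT|O_NONBLOCK), the result
should have (fl & O_ACCMODE) equal to O_RDWR and O_NONBLOCK set, but not
O_CREAT.   Some systems add bits of their own to the result, so it is safer
to test individual bits than to compare the whole value.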

Now I'll leave it for David to turn my mangling into something sane,
but I think we probably have (in perhaps a messy way) this one solved.

kre

ps: David, I'll send the vfs_lookup.c I used in an off-list message, to
save you recreating it out of the e-mail...



Re: procfs difference between NetBSD and Linux

2021-06-05 Thread Robert Elz
Date:Sat, 05 Jun 2021 20:13:53 +0700
From:Robert Elz 
Message-ID:  <16349.1622898...@jinx.noi.kre.to>

Replying to my own message again (draw your own conclusions)...

  | It applies, compiled, and builds a release with no problems, running
  | tests now.

Unfortunately, it doesn't work, kernel segv in vn_open().

I believe the cause is this code (in namei()):

	if (cnp->cn_nameiop != LOOKUP &&
	    (searchdir == NULL ||
	     searchdir->v_mount != foundobj->v_mount)) {
		if (searchdir) {
			/*... irrelevant for now */
		}
		vrele(foundobj);
		foundobj = NULL;
		ndp->ni_dvp = NULL;
		ndp->ni_vp = NULL;
		state->attempt_retry = 1;

which is followed by the code that changed:

	switch (cnp->cn_nameiop) {
	case CREATE:
		if (cnp->cn_flags & NONEXCLHACK) {

(etc).

The problem (of course) is those foundobj = NULL; and ndp->ni_vp = NULL;
lines, neither of which we want to happen in this case.

Then when we return to vn_open() (without an error) ndp->ni_vp == NULL
and kaboom.

I am trying a fix for this by making the initial test shown above be:

	if (cnp->cn_nameiop != LOOKUP &&
	    (cnp->cn_flags & NONEXCLHACK) == 0 &&
	    (searchdir == NULL ||
	     searchdir->v_mount != foundobj->v_mount)) {

which of course then makes the test of NONEXCLHACK inside "case CREATE:"
meaningless, but harmless, so I just left that for now.   This change
makes a NONEXCLHACK CREATE op function identically to a LOOKUP op, which
I believe is what we want in this case.

Then because we're now no longer doing the

ndp->ni_dvp = NULL;

and the code in vn_open() relies on that, I added

	if (foundobj != NULL && cnp->cn_flags & NONEXCLHACK) {
		if (searchdir != NULL) {
			if (searchdir_locked) {
				VOP_UNLOCK(searchdir);
				searchdir_locked = false;
			}
			vrele(searchdir);
		}
		searchdir = NULL;
	}

which might be overly complicated, but seems to fit with what is needed
(or done anyway) in what comes later when searchdir != NULL.
(searchdir is later placed into ndp->ni_dvp).

Building now and then will test this version soon (I had already run
the AFS tests, which don't test this particular scenario, apparently,
as they all worked about as well as they typically do for me, certainly
no kernel crash from them).

kre

ps: The rhialto@ suggested test:
echo >/usr
made it really easy to test this.   Thanks.

My test setup has no (extra) mount points, or not until I get around to
mounting a procfs to test the code that failed anyway, so I can't use /usr
- but / works just as well -- / is a mount point.   Using an 8.1 kernel
(the relevant code hasn't changed in a decade - until today - so anything
vaguely recent should give the same results):

$ echo >/
sh: cannot create /: file exists
$ echo >/bin
sh: cannot create /bin: is a directory

shows it is just as good to use for the test as any other mount point.
(the "echo" isn't needed, just ">/" works as a test, but that's immaterial).




Re: procfs difference between NetBSD and Linux

2021-06-05 Thread Robert Elz
Date:Fri, 4 Jun 2021 20:09:14 +
From:David Holland 
Message-ID:  

  | The patch below has not even been compile-tested and so may need some
  | adjustments (and might conceivably break rump) but should address the
  | problem in a way that will, with luck, not explode anything else.

It applies, compiled, and builds a release with no problems, running
tests now.

kre



Re: procfs difference between NetBSD and Linux

2021-06-04 Thread Robert Elz
Date:Fri, 04 Jun 2021 10:32:24 +1000
From:Simon Burge 
Message-ID:  <20210604003224.cc9b44e...@thoreau.thistledown.com.au>

  | https://pubs.opengroup.org/onlinepubs/007908799/xsh/open.html doesn't
  | mention anything about what filesystem types back the path being opened.

No, but there are lots of other things also not mentioned that also
affect what posix requires.   Eg: and somewhat bizarrely, if the process
in question was started with one (or more) of fds 0 1 or 2 closed, then
what happens would also be unspecified.   There the underlying issue is
that the open might (in fact, would) return the lowest of the closed ones
of those which isn't what applications expect, and so bizarre things happen.
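
The "lowest closed descriptor" behaviour is easy to demonstrate.   This is
a minimal sketch, assuming /dev/null exists; open_with_fd_closed() is a
made-up helper name, and it puts the original descriptor back afterwards
so the demonstration does not wreck the caller's stdio.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Close `victim` (0, 1 or 2), open /dev/null, and report which
 * descriptor number open() handed back.  The original descriptor is
 * restored before returning.  (Hypothetical helper, for illustration.)
 */
int
open_with_fd_closed(int victim)
{
	int saved = dup(victim);	/* -1 if victim was not open at all */
	int got;

	close(victim);
	got = open("/dev/null", O_RDWR);	/* gets the lowest free number */
	if (got != -1)
		close(got);
	if (saved != -1) {
		dup2(saved, victim);	/* put the original back */
		close(saved);
	}
	return got;
}
```

With fd 0 closed, the open() should hand back 0, so anything that later
reads "standard input" is really reading whatever file the application
happened to open first.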

But POSIX has no notion of types of bizarre, there is just unspecified
(something happens, but implementations get to decide), undefined (anything
goes, reasonable or not) and specified.   As soon as you move out of the
POSIX defined environment, everything becomes unspecified or undefined.

Obviously, it wouldn't be useful to take liberties, we wouldn't want
open("/dev/null", 0) to call abort() just because the system has a
procfs mounted, and particularly if the application isn't using it.
But POSIX doesn't say we cannot.   This is simply outside the standard.

  | It does say that O_CREAT without O_EXCL should have no effect if the
  | file exists.

Yes, and obviously, wherever possible, that's what should happen, it is
just that you cannot say "required for POSIX conformance" when the file
is on a filesystem that doesn't conform with POSIX.

  | That this particular instance is related to procfs
  | shouldn't make a difference, right?

I'm not aware of any discussions related to procfs type filesystems
related to POSIX (doesn't mean there have never been any) but this
type of issue comes up from time to time related to NFS, which also
has slightly different semantics than "normal" filesystems - and I
believe the answer has always been that as soon as you step away from
a POSIX environment, the requirements no longer apply.

Files and operations on an NFS filesystem aren't required to behave the
same way as files on a normal filesystem (which is good, as they don't).

kre

ps: if we were to be overly cynical, we could also say that to conform
to POSIX all that is required (of the implementation, leaving aside for
now all the paperwork etc required of the implementors) is that the system
pass the POSIX conformance tests.   Those have no procfs (or NFS) because
those things are not POSIX.   Hence testing O_CREAT on a /proc/$$/fd/N
type file name will never be done (or it could be, but it would just be
a regular file in a regular directory, and irrelevant here), and so cannot
cause a system to fail, whatever it does.




Re: procfs difference between NetBSD and Linux

2021-06-04 Thread Robert Elz
Date:Thu, 3 Jun 2021 18:45:53 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | procfs will answer EOPNOTSUPP on VOP_CREATE. But it never comes that
  | far.

No, it doesn't.

What I was suggesting doesn't come close to fitting the way things
actually work, I should have considered it more before sending.

  | On the other hand, the logic in namei() might not be correct.

I'm not sure it is that simple (that's what I thought a half hour or so ago).

  | It looks like a check to prevent CREATE operations on a mountpoint,
  | but that's neither necessary nor compatible when the object
  | already exists.

The issue (which is easier to see in much older versions of namei() than
the current one) is that a parent vnode pointer is required for
CREATE (and DELETE and RENAME) vnode ops, but across a mount point that
makes no sense (or does it?   Could we simply return the previous vnode
in the path regardless of the filesys - or would that wreck the locking
somewhere?)

If the CREATE is for a mkdir() or link() (or mknod() mkfifo() ...) then
all of this makes sense, the EEXIST is correct, and simply returning the
existing vnode as it is might not be.

But open(path, O_CREAT|..., ...) is different, it is only a CREATE if the
path doesn't exist, otherwise it is simply an open.   It could do 2 lookups,
one to discover if the path exists (returning if it does), and then a
second CREATE lookup if it doesn't - but that would be full of races
or locking nightmares.
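
On an ordinary filesystem the "only a CREATE if the path doesn't exist"
behaviour looks like the sketch below.   try_open() is a made-up helper
name; it reports nothing more than whether the open worked and, if not,
which errno it failed with.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try open(path, oflags, 0644) and close the descriptor again
 * immediately.  Returns 0 if the open succeeded, otherwise the errno
 * value it failed with.  (try_open is a hypothetical helper name.)
 */
int
try_open(const char *path, int oflags)
{
	int fd = open(path, oflags, 0644);

	if (fd == -1)
		return errno;
	close(fd);
	return 0;
}
```

For a regular file that already exists, try_open(path, O_RDONLY|O_CREAT)
should return 0 (a plain open, nothing created), while adding O_EXCL
should give EEXIST.   The case being discussed in this thread is the
first call returning EEXIST when the path crosses a mount point.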

kre




Re: procfs difference between NetBSD and Linux

2021-06-04 Thread Robert Elz
Replying to myself...

  | I think I am going to experiment with simply removing that error case
  | and see if anything breaks.

but that cannot work, the issue is that the operations in question return
the parent vnode, which, when a mount point has been crossed, isn't possible.
Simply returning success in that case won't work at all for all the other
uses of the CREATE vnop, which expect that parent vnode.

I considered dealing with EEXIST in open() (where it makes no sense,
unless O_EXCL is set) but that is unlikely to work, as namei()
when it returns an error isn't also going to be properly returning the
target vnode.

My guess at the minute is that to fix this we need a new vnop, OCREATE
(optional create) (or something), which works identically to the CREATE
operation, except that it doesn't fail if it cannot return the parent
vnode - and then callers (which would probably only be open()) using
this new op would need to deal with that when it happens.

But that's beyond my pay grade here, someone who has worked a lot on
namei() and the vnops needs to consider all this a lot more.

kre



Re: procfs difference between NetBSD and Linux

2021-06-04 Thread Robert Elz
Date:Fri, 4 Jun 2021 14:29:51 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | We need to understand why namei() does this check and how it can be
  | corrected.

Yes, I was wondering about that, it seems to make no sense to me.

A mountpoint, by definition, must exist, so the O_CREAT flag (without
O_EXCL) will never be creating anything, so if we hit a mountpoint
boundary, just at the resolution of the name, the result cannot be
affected by O_CREAT (alone - O_CREAT|O_EXCL is always going to fail,
mountpoint or not, if the target name exists).

Simply removing whatever the test is should (hypothetically anyway)
make no difference to anything, so discovering why the check was added
would be useful.

I've been taking a bit of a look at the history, and while the error
wasn't always EEXIST (that's only from 2011) the test has been there
for ages.   At the minute I'm thinking it might be a deficit in the
design of the vnops ... the error comes from a "create" operation on
a mount point, which obviously is going to fail (as do delete and rename).
The problem is that O_CREAT isn't always a create op, it can simply be
a lookup, it only turns into an actual create operation if the target
doesn't exist.

Perhaps that means the way the create vnop works needs to be altered, or
perhaps this test doesn't really need to be there, as if it is really
intended to be a create (as in mkdir() or link()) it should simply fail
when it detects the target exists (mount point or not) and if it is an
"optional create" (as on O_CREAT on open) then if the target exists
it isn't really a create at all.

I think I am going to experiment with simply removing that error case
and see if anything breaks.

kre



Re: procfs difference between NetBSD and Linux

2021-06-03 Thread Robert Elz
Date:Thu, 3 Jun 2021 09:12:52 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | namei() return EEXIST when it works on a CREATE operation and
  | crosses a mountpoint.

Could we perhaps simply have procfs remove O_CREAT from the flags
passed by the user?   It is never going to work to create a file
inside a procfs mount, is it?

kre

ps: But I'm not sure this is a POSIX problem, POSIX has no procfs,
and so anything that uses one is outside the bounds of what POSIX
specifies, and into the great vastness of beyond all knowledge -
ie: for POSIX, anything on a procfs is an unspecified operation.




Re: procfs difference between NetBSD and Linux

2021-06-01 Thread Robert Elz
Date:Tue, 1 Jun 2021 14:32:19 +0200
From:Martin Husemann 
Message-ID:  <20210601123219.ga16...@mail.duskware.de>

  | Good idea - and raise an upstream issue pointing at the non-portable
  | procfs assumption.

And while doing that, ask them what they're possibly trying to achieve
with the O_CREAT flag - if /proc/$$/fd/N doesn't exist, how is creating
(what would be a normal file, if procfs allowed it) going to possibly
do anything useful?   It is hard to believe that they're intending that
creating a file there will magically cause the fd to open (open to
what underlying object?)   If they know the fd is open (which they
seem to do here) then they know that /proc/$$/fd/N already exists, in
which case O_CREAT is useless (in the best of cases).

kre



Re: `man` cannot find any entry

2021-05-13 Thread Robert Elz
Date:Thu, 13 May 2021 15:29:25 -0700
From:J C 
Message-ID:  


  | Any idea how I can fix this?

unset MANPATH

Then find where it is being set, and
make that stop happening.

Use of MANPATH is for unusual situations,
it should not normally be required.

kre


Re: toupper and warnings

2021-05-07 Thread Robert Elz
Date:Thu, 06 May 2021 12:52:36 -0700
From:"Greg A. Woods" 
Message-ID:  


  | Yeah, "Undefined Behaviour" should be undefined -- i.e. removed from the
  | spec -- i.e. become either fully defined or at least implementation
  | defined.  It is not helpful at all -- it was a very VERY bad idea.

Not really possible.   To become implementation defined, the implementation
needs to be able to specify what happens (even if different from what other
implementations specify for the same thing).   Sometimes that's not possible,
and what happens depends upon things outside the control of the implementation.
Eg: accessing an array out of bounds might just return random data from some
other data structure, or it might generate a segmentation violation - it all
depends upon how far out of bounds the access was, and where in the memory map
the array in question happened to be placed.   There's no way to define what
will happen - even worse on an embedded system, running with no memory 
management or privilege separation, the access might hit on memory mapped I/O
control, or CPU control registers, and do almost anything.

  | E.g. for ctype.h interfaces the spec should just say that values outside
  | the recognized range will simply be truncated as if by assignment to an
  | unsigned char.

That might have been a good idea, perhaps, if it had been specified that
way initially - only perhaps because it means penalizing good code with
meaningless extra checks or no-op data manipulations (&0xFF or whatever)
that do nothing for it except make the code run slower, just so bad
code behaves in some kind of predictable (but probably still incorrect) way.

But it wasn't specified like that.
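
As things stand, the usual way for application code to stay inside the
defined range is to convert to unsigned char at the call site.   A minimal
sketch (upper_char() and upper_string() are names invented here, not
anything from libc):

```c
#include <ctype.h>

/*
 * Upper-case one char.  The cast matters on platforms where plain char
 * is signed: without it a byte such as 0xE9 reaches toupper() as a
 * negative int, which is outside the range the ctype functions accept.
 */
int
upper_char(char c)
{
	return toupper((unsigned char)c);
}

/* Upper-case a NUL-terminated string in place. */
void
upper_string(char *s)
{
	for (; *s != '\0'; s++)
		*s = (char)upper_char(*s);
}
```

The cast costs nothing at run time on common platforms, and it is also
the kind of change that makes the compiler warnings in question go away.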

And standards bodies are not legislatures - they don't (or shouldn't) go
defining how things should be, and then attempt to force implementations
to obey.

Rather, they set out what is known to work on all implementations (just
omitting ones with admitted bugs which should be fixed), so that
applications will know what they need to do to correctly use the interfaces
provided, and what they should not do, as the results would either be
unspecified (or implementation defined) or even simply undefined.
They also make it clear what a new implementation needs to implement in
order to be compatible with the other existing implementations, so that
applications which work with other implementations will also work with
the new one.

  | What I am pretty sure of though is that there's a vast difference
  | between the massive number of warnings spit out by the compiler vs. the
  | relatively low number of actual cases of passing values outside of -1..255.
  | We certainly wouldn't want to claim UB and abort for all of the warnings!

It is certainly true that the compiler is guessing when it issues one of
these warnings, in some cases it cannot know what the range of values will
be at run time, in others its analysis functionality is simply not up to
the task.   So a lot of false warnings occur - for some of the warnings
the vast majority look to be bogus (which is annoying), for others a warning
most commonly means a problem exists.

kre



Re: reboot hangs at "uhid2 at uhidev9 report id 7 ..."

2021-05-05 Thread Robert Elz
Date:Wed, 5 May 2021 18:55:38 +0200
From:Rhialto 
Message-ID:  

  | On Wed 05 May 2021 at 15:18:03 +0900, Henry wrote:
  | > The system kept booting into single user mode, but searching around I
  | > finally figured out that I needed to edit /etc/rc.conf.  I thought I
  | > had successfully changed to rc_configured=YES.
  |
  | The installer is also supposed to do that for you, so there must have
  | been something weird there.

Which might also explain the current problem.  If the final stages of the
system setup weren't done correctly, /dev might not have been set up either.
In that case, no-one is going to be able to open /dev/console to output
any further (from userlevel) messages.  That last message you're seeing is
often one of the last from the kernel before you start getting messages from
running /etc/rc.

That is, it is entirely possible that the system is up and running, but
there is simply no way to communicate with it (if rc.conf wasn't set up,
the network probably isn't enabled either).

Boot in single user mode, check what is in /dev - if what's there doesn't
look correct (ordinary file for /dev/null or missing, no /dev/console or
not a char device, ...) then delete everything (except MAKEDEV if that is
in /dev on your version) and "cd /dev; sh MAKEDEV std" (or sh /etc/MAKEDEV
or wherever it is to be found).
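
If eyeballing "ls -l /dev" is not convincing enough, the "is it really a
char device" part of that check can be done with stat(2).   This is just
a sketch, and is_char_device() is a name made up for the purpose:

```c
#include <sys/stat.h>

/*
 * Return 1 if `path` exists and is a character device, 0 if it exists
 * but is something else (eg: an ordinary file created by accident),
 * -1 if it cannot be examined at all (missing, no permission, ...).
 */
int
is_char_device(const char *path)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return -1;
	return S_ISCHR(st.st_mode) ? 1 : 0;
}
```

On a healthy system is_char_device("/dev/null") and
is_char_device("/dev/console") should both give 1; a 0 or -1 for either
suggests /dev needs to be rebuilt as described above.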

If what is in /dev looks to be correct, check /etc/ttys next, but incorrect
config there is less likely to explain things.

kre



Re: IPv6: in6_setscope: can't set scope for not loopback interface

2021-04-22 Thread Robert Elz
Actually, you can ignore the "-s1600" request, looks as if someone has
finally made tcpdump's default snap length somewhat bigger...   It won't
hurt if you have done it using that option, but it also should no longer
(I have no idea for how long back into the past) be needed.

kre



Re: IPv6: in6_setscope: can't set scope for not loopback interface

2021-04-22 Thread Robert Elz
Date:Thu, 22 Apr 2021 20:50:09 +0200
From:=?UTF-8?Q?J=C3=B6rn_Clausen?= 
Message-ID:  


  | $ ifconfig -a

That all looks OK,

  | I have configured the IPv4 part of vioif0 via /etc/ifconfig.vioif0:

Now I'm going to suggest that you (at least temporarily) configure
a v6 address on that interface.

My suspicion is that something on your system is seeing those v6
incoming multicast packets, and is attempting to reply (with its own
multicast packets).   But you have no global address - the only
non link-local v6 address it can find is ::1.   If all of this
goes away when you have a v6 address configured, then we'll be much
closer to finding out what is going on.

Just add
inet6 2a04:52c0:101:162::1/64
at the end of /etc/ifconfig.vioif0  (the '1' could be any 16 bit hex value
you like, there are more ways to config v6 addrs, but this will do for now).
The 2a04:52c0:101:162 is what you said your ISP assigned you.

  | and define the default route in /etc/rc.conf.

don't bother with that for ipv6 for now (no default v6 route).

  | According to my ISP, he doesn't see the bogus packets with ::1 source, so
  | indeed they seem to be a product of my machine.

Assuming that's correct, which this test should verify (those packets should
go away and be replaced by packets from 2a04:52c0:101:162::1 if my guess is
correct) then we need to try and work out why the network stack is allowing
that to happen.

  | the resulting PCAP file is at
  |
  | 
https://drive.google.com/file/d/1b_QlSW_oqYb2lMe4m_FO-DU7mAQdd86c/view?usp=sharing

I'm unable to fetch that (or rather, I can connect to that page, but all it
ever does is show a "rotating circle" kind of thing).

Can you just send the pcap file (or perhaps a new version) to me
(not the list) via e-mail?

A second tcpdump pcap file after the v6 global addr is configured might
help as well.  And please, use -s 1600 on the tcpdump command that writes
the file - I'm not certain that it is required when -w is used, but it
certainly won't hurt (without that, only the packet headers tend to be
captured, and sometimes not even all that, 1600 is bigger than the MTU (plus
ethernet headers) so should get everything).

kre



Re: IPv6: in6_setscope: can't set scope for not loopback interface

2021-04-22 Thread Robert Elz
Date:Thu, 22 Apr 2021 11:06:04 +0200
From:=?UTF-8?Q?J=C3=B6rn_Clausen?= 
Message-ID:  


  | BTW: This is all happening on the actual network interface,
  | not the loopback interface.

Yes, I knew that, but the NetBSD network stack uses the loopback
interface for local packet delivery, it has to be configured correctly
or (some) things won't work.

  | I can see a constant stream of these packets:
  |
  | 10:31:46.504046 IP6 2a04:52c0:101:7b1::.5344 > ff15::efc0:988f.6771: UDP,
  | length 138

Those are multicast packets.   Multicast is one of the packet types for
which the interface scopes are important.

What port 6771 is being used for I'm not sure, /etc/services says it is
"plysrv-https" (yes, including for UDP) but it might easily be something
else.   Maybe someone else here can recognise it.   Or you might check,
initially using netstat, and then perhaps fstat, whether your host has
anything listening on that port.
 
  | 2a04:52c0:101:7b1 is on the same network as my machine

That would be a network prefix, the source addr is 2a04:52c0:101:7b1::
(those extra colons are important, and indicate a host part of all zeroes,
which is unusual, but I don't think actually incorrect).
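
The difference between the prefix and the full address can be seen by
parsing the text.   A small sketch, assuming a /64 prefix so that the host
part is the low 64 bits; host_part_is_zero() is an invented helper name:

```c
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Return 1 if the low 64 bits (the host part, assuming a /64 prefix)
 * of the textual IPv6 address are all zero, 0 if not, and -1 if the
 * text does not parse as an IPv6 address at all.
 */
int
host_part_is_zero(const char *text)
{
	struct in6_addr a;
	int i;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return -1;
	for (i = 8; i < 16; i++)
		if (a.s6_addr[i] != 0)
			return 0;
	return 1;
}
```

"2a04:52c0:101:7b1::" parses and has an all zero host part, whereas
"2a04:52c0:101:7b1" without the trailing colons is only four of the eight
groups and is not accepted as an address.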

  | (technically, my ISP gave me the address 2a04:52c0:101:162::/64,

That's also a network prefix (a block of 2^64 addresses).   A different
one than the prefix of the sender of those packets, though it is unclear
what that prefix (the one assigned to you) is intended for - most likely
for your internal network (if you have one, which for your usage you
probably don't) rather than for the link between the ISP and you, which
might be the 2a04:52c0:101:7b1 prefix.

  | but I don't use it and haven't configured the interface with it).

That won't stop multicast packets arriving, the switch shouldn't be
sending them unless something has joined the multicast group, but without
knowing a lot more about how your ISP has configured the connections to
its kvm guests, it is hard to say that anything wrong is happening.

  | Every now and then I see this:
  |
  | 10:31:49.689606 IP6 ::1.52736 > ff15::efc0:988f.6771: UDP, length 139
  | 10:31:49.690455 IP6 ::1.6771 > ff15::efc0:988f.6771: UDP, length 139
  | 10:31:51.690739 IP6 ::1.52736 > ff15::efc0:988f.6771: UDP, length 139
  | 10:31:51.691180 IP6 ::1.6771 > ff15::efc0:988f.6771: UDP, length 139

Those are simply wrong.   That ::1 source addr should never be attempting
to send any packets off its host - and if they're arriving over the vioif0
interface, rather than being sent, then some other host out there is
horribly broken (I'd tend to suspect your config first though).

  | and this correlates perfectly with /var/log/messages:
  |
  | [Thu Apr 22 10:31:49 CEST 2021 <  27.000723>] in6_setscope: can't set scope
  | for not loopback interface vioif0 and loopback address ::1

Yes, it would.   Those packets are nonsense.

  | So I see packets on my network interface (i.e. not the loopback interface)
  | with a source of ::1. I am waiting for a reply from my ISP if I am seeing
  | pink elephants or if there are actually such packets on the network.

If there are, the sender of them needs to be fixed, but I wouldn't be
surprised if something on your host is trying to send those.

  | Do you know if port 6771 is some well-known port in IPv6 for housekeeping?

No, it is not a port I recognise.   But that means nothing.

  | The information I found seem to lean more to malware, and 2a04:52c0:101:7b1
  | might not be acting in good faith...?

I don't think I'd be assuming malware, when mistakes are far more likely.

The two most likely possibilities are some kind of mis-config on your host,
or some kind of mis-config on some other host running in a different KVM guest
on the same server.

kre



Re: IPv6: in6_setscope: can't set scope for not loopback interface

2021-04-22 Thread Robert Elz
Date:Wed, 21 Apr 2021 22:50:40 +0200
From:=?UTF-8?Q?J=C3=B6rn_Clausen?= 
Message-ID:  


  | I am mostly ignorant to everything IPv6, so I have no clue what that
  | message means, and I was not able to find any enlightenment online.

IPv6 link local (and multicast, and sometimes some other) addresses
have a "scope" in addition to the address itself.  That's because there
is nothing in the address which indicates which interface it belongs
to (no sub-net identifier or anything like that).

The reference to ::1 in the messages is interesting, that's the v6
equivalent of 127.0.0.1 in V4 - the loopback address, and should only
be assigned to lo0 (but needs to be there).
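
To see which kind of address is which, the standard macros from
<netinet/in.h> can be applied to the parsed address.   This is only a
sketch: the enum and classify_in6() are names invented for illustration.

```c
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>

enum addr_kind {
	KIND_INVALID,		/* text is not an IPv6 address */
	KIND_LOOPBACK,		/* ::1 */
	KIND_LINKLOCAL,		/* fe80::/10, needs a scope to be usable */
	KIND_MULTICAST,		/* ff00::/8, also scoped */
	KIND_OTHER		/* everything else, eg: global unicast */
};

/* Classify a textual IPv6 address (no "%interface" suffix). */
enum addr_kind
classify_in6(const char *text)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return KIND_INVALID;
	if (IN6_IS_ADDR_LOOPBACK(&a))
		return KIND_LOOPBACK;
	if (IN6_IS_ADDR_LINKLOCAL(&a))
		return KIND_LINKLOCAL;
	if (IN6_IS_ADDR_MULTICAST(&a))
		return KIND_MULTICAST;
	return KIND_OTHER;
}
```

Note that inet_pton() only deals with the 128 bit address.   The scope is
not part of those bits, it is carried separately (sin6_scope_id in a
struct sockaddr_in6), which is why an address and an interface can end up
mismatched as in the message being reported.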

  | Is this something I can fix from inside the OS?

Almost certainly.  There's probably something mis-configured.

What is the status of the loopback interface (lo0) ?

Mine looks like:

lo0: flags=0x8049 mtu 33624
inet 127.0.0.1/8 flags 0x0
inet6 ::1/128 flags 0x20
inet6 fe80::1%lo0/64 flags 0x0 scopeid 0x3


  | $ ifconfig vioif0
  | vioif0: flags=0x8843 mtu 1500
  | ec_capabilities=1
  | ec_enabled=0
  | address: 00:16:3e:b3:00:8a
  | inet 5.2.76.44/24 broadcast 5.2.76.255 flags 0x0
  | inet6 fe80::216:3eff:feb3:8a%vioif0/64 flags 0x0 scopeid 0x1

Nothing looks wrong there.

fe80::216:3eff:feb3:8a

is your link local address on that interface, the "%vioif0" is the
scope (and the /64 is essentially the netmask of course).
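
To make the three pieces concrete, here is a minimal sketch (not anything
the kernel or ifconfig actually runs) that splits the string from the
ifconfig output above into address, scope and prefix using Python's
standard ipaddress module.  The helper name split_scoped is made up for
this illustration.

```python
import ipaddress

def split_scoped(text):
    """Split 'addr%zone/prefixlen' into (IPv6Address, zone, IPv6Network).

    Hypothetical helper for illustration only; the zone is returned as the
    plain interface name that followed the '%'.
    """
    addr_part, _, prefix = text.partition("/")
    host, _, zone = addr_part.partition("%")
    address = ipaddress.IPv6Address(host)
    # strict=False masks off the host bits, leaving just the /64 network.
    network = ipaddress.ip_network(f"{host}/{prefix or 128}", strict=False)
    return address, zone, network

address, zone, network = split_scoped("fe80::216:3eff:feb3:8a%vioif0/64")
print(address.is_link_local, zone, network)

# ::1 is the loopback address, and it is not a link local one.
loopback = ipaddress.IPv6Address("::1")
print(loopback.is_loopback, loopback.is_link_local)
```

The point is only that the address alone (fe80::...) says nothing about
which interface it lives on; that information is carried by the scope.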

While the changes at your ISP may have triggered something, and of
course it is possible they're doing something incorrect or unusual, it
is probably more likely that it is just different.

You might want to capture a short sequence of packets on that interface
to see what is happening.   Since the timestamps you included show the
messages appearing several times a minute, capturing packets for just
a minute or two should be enough to see if there's anything strange.

tcpdump -i vioif0 -s 1600 -w /tmp/packets.pcap ip6

should do it, simply interrupt it after a couple of minutes.   Then you
can use tcpdump -r or wireshark to look at the packets, or put the file
somewhere it can be fetched.
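
If you would rather not read the whole dump by eye, something like the
following rough sketch can pick out the frames whose IPv6 source is the
loopback address.  It assumes the file is in the classic pcap format with
an Ethernet link type and no VLAN tags, which is what I'd expect from the
tcpdump command above but have not verified for your setup.  The function
names are invented for this example.

```python
import ipaddress
import struct

ETHERTYPE_IPV6 = 0x86DD
LINKTYPE_ETHERNET = 1

def ipv6_pairs(pcap_bytes):
    """Yield (source, destination) for each IPv6 frame in a classic pcap blob."""
    magic = pcap_bytes[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"   # file written little-endian
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"   # file written big-endian
    else:
        raise ValueError("not a classic pcap file")
    # The link type is the last 32-bit field of the 24-byte file header.
    linktype = struct.unpack_from(endian + "I", pcap_bytes, 20)[0]
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("this sketch only handles Ethernet captures")
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: seconds, sub-seconds, captured length, wire length.
        _sec, _frac, incl_len, _orig = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 14 + 40:
            continue   # too short for Ethernet + fixed IPv6 header
        if struct.unpack_from("!H", frame, 12)[0] != ETHERTYPE_IPV6:
            continue
        # In the IPv6 header the source is bytes 8..23, the destination 24..39.
        src = ipaddress.IPv6Address(frame[14 + 8:14 + 24])
        dst = ipaddress.IPv6Address(frame[14 + 24:14 + 40])
        yield src, dst

def loopback_sources(pcap_bytes):
    """Return the (source, destination) pairs whose source is ::1."""
    return [(s, d) for s, d in ipv6_pairs(pcap_bytes) if s.is_loopback]
```

Usage would be along the lines of
loopback_sources(open("/tmp/packets.pcap", "rb").read()); an empty list
means nothing in the capture claimed ::1 as its source.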

kre



Re: ispell-british (for 9.0/amd64) has broken hash files

2021-02-03 Thread Robert Elz
Date:Tue, 02 Feb 2021 22:15:07 -0800
From:"Greg A. Woods" 
Message-ID:  

  | I'm wondering if anyone can reproduce this
  | with the standard packages,

yes.   I have given up on ispell because of
this and now mostly use aspell instead (sometimes
just spell, but more often nothing as my typical
e-mail would indicate...)

kre


Re: Creating a GPT tab

2021-01-24 Thread Robert Elz
Date:Sun, 24 Jan 2021 14:12:20 -0800
From:John Nemeth 
Message-ID:  <202101242212.10omckhx022...@server.cornerstoneservice.ca>

  |  The tools won't replicate this, nor should they, as it is a
  | seriously broken setup.  To fix this setup, delete MBR partition 0.

Actually, it would take more than that...  The PMBR should cover the
entire drive, the one shown only accounted for the (unused except for
the partitioning headers) section before the start of the windows
partition.
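
For reference, a rough sketch of what "cover the entire drive" means for
the 16-byte MBR partition record.  It assumes the usual layout: type 0xEE,
starting at LBA 1, length equal to the rest of the disk (capped at what
fits in 32 bits).  The CHS bytes are placeholders, and the helper names
and the disk size used below are made up for the illustration.

```python
import struct

PMBR_TYPE = 0xEE
RECORD = "<B3sB3sII"   # boot flag, CHS start, type, CHS end, LBA start, sectors

def protective_entry(disk_sectors):
    """Build the partition record a protective MBR would be expected to hold."""
    size = min(disk_sectors - 1, 0xFFFFFFFF)
    # CHS fields are placeholder values; only the LBA fields matter here.
    return struct.pack(RECORD, 0x00, b"\x00\x02\x00", PMBR_TYPE,
                       b"\xff\xff\xff", 1, size)

def covers_whole_disk(entry, disk_sectors):
    """Check that a record is type 0xEE and spans LBA 1 to the end of the disk."""
    _boot, _chs0, ptype, _chs1, start, size = struct.unpack(RECORD, entry)
    return (ptype == PMBR_TYPE and start == 1
            and size == min(disk_sectors - 1, 0xFFFFFFFF))

DISK = 250000000   # hypothetical drive size in sectors
good = protective_entry(DISK)
# A record that stops just before the first real partition, as described above.
short = struct.pack(RECORD, 0x00, b"\x00\x02\x00", PMBR_TYPE,
                    b"\xff\xff\xff", 1, 32787)
print(covers_whole_disk(good, DISK), covers_whole_disk(short, DISK))
```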

This is even worse than when I looked at it before.   If this works
for anything at all (except perhaps ignoring the GPT partitions
entirely, and simply allowing access via the MBR to the windows
partition) then I suspect that indicates a bug somewhere.  It shouldn't.

kre



Re: Creating a GPT tab

2021-01-24 Thread Robert Elz
Hmm, in my previous reply I missed the 0 MBR partition.   That one is
weird.   That's a duplicate spec of the first GPT partition (the windows
GPT partition) - which I assume is there to allow ancient windows
systems, and other things that understand MBRs, to find that partition
without understanding GPT.

I am now guessing this is not a PC (for which that would make this an
invalid PMBR) but an ARM system perhaps?

I am not sure that we have tools that can make something like that.

Perhaps someone else will need to answer.

kre



Re: Creating a GPT tab

2021-01-24 Thread Robert Elz
Date:Sun, 24 Jan 2021 11:49:09 -0700
From:Brook Milligan 
Message-ID:  <818cc659-27b2-4207-94e2-a14c9579f...@nmsu.edu>

  | I am trying to create GPT partitions that are the same as the following:

  | The complicating factor is that there is an MBR in sector 0 and the
  | primary GPT begins at sector 1.

That's normal.   The MBR is a "protective MBR" - it exists entirely so
that old systems that don't understand GPT won't think the drive is empty
and overwrite its contents without verifying first.

  | I cannot figure out how to make the tools replicate this and would
  | appreciate help.

Assuming you mean to use the tools to make a similar structure, rather
than make a literal copy of what is on the drive, then

gpt create sd0
gpt add -b 32788 -s 163840 -t windows -l Windows_Data sd0
gpt add -s 33554432 -t ffs -l NetBSD_Root sd0
gpt add -s 207618048 -t ffs -l NetBSD-Data sd0

You didn't show the existing partition labels, so I just made some
up; use whatever is appropriate (but they should be different from
the ones on the existing drive, if you are making a new one, unless
the two drives will (absolutely 100% for certain) never be connected
to the same system at the same time).

That's it.   gpt will create the PMBR for you as part of gpt create,
don't try and use fdisk to make that one appear.

kre




Re: On upgrade: NetBSD ?.? (UNKNOWN)

2021-01-05 Thread Robert Elz
As an alternative to what Herbert suggested, you can
just edit /etc/motd and put whatever you like there.

It is intended to give a (brief, hopefully) message
to people when they login.

If you have update_motd set to YES in rc.conf
(or do not have it there at all, YES is the default)
then at each boot the first line of /etc/motd is
made to contain the system version string (by the
auto run at startup of /etc/rc.d/motd ... Herbert
just suggested doing it manually).

kre


