Re: Prob watching DVDs

2023-06-05 Thread Matthias Petermann

Hello Todd,

On 05.06.23 09:29, Todd Gruhn wrote:

I use both Xine and VLC Multimedia.

Both work with certain DVDs but fail to play others.
Why?  Is this because the DVDs are so old? Has this been fixed?
Is there information on how DVDs differ depending on their age?


Thanks for any help.


It is possible that the failing DVDs are CSS encrypted. Do you have 
libdvdcss[1] installed?


Looks like this library is only available in source form due to license 
restrictions. So you might need to compile it yourself with pkgsrc.


Kind regards
Matthias


[1] 
https://cdn.netbsd.org/pub/pkgsrc/current/pkgsrc/multimedia/libdvdcss/index.html


smime.p7s
Description: S/MIME Cryptographic Signature


Re: nvmm users - experience

2023-05-24 Thread Matthias Petermann

Hi Mathew,

On 23.05.23 15:11, Mathew, Cherry G.* wrote:


 MP> I came across Qemu/NVMM more or less out of necessity, as I had
 MP> been struggling for some time to set up a proper Xen
 MP> configuration on newer NUCs (UEFI only). The issue I encountered
 MP> was with the graphics output on the virtual host, meaning that
 MP> the screen remained black after switching from Xen to NetBSD
 MP> DOM0. Since the device I had at my disposal lacked a serial
 MP> console or a management engine with Serial over LAN
 MP> capabilities, I had to look for alternatives and therefore got
 MP> somewhat involved in this topic.

 MP> I'm using the combination of NetBSD 9.3_STABLE + Qemu/NVMM on
 MP> small low-end servers (Intel NUC7CJYHN), primarily for classic
 MP> virtualization, which involves running multiple independent
 MP> virtual servers on a physical server. The setup I have come up
 MP> with works stably and with acceptable performance.

I have a follow-on question about this - Xen has some config tooling
related to startup - so you can say something like

'xendomains = dom1, dom2' in /etc/rc.conf, and these domains will be
started during bootup.

If you did want that for nvmm, what do you use ?


Unfortunately, I didn't find anything suitable, and I was in a hurry to 
get things under control. So I wrote a quick-and-dirty shell script. It 
wraps starting VMs from the command line and from an rc script, and 
creates Unix domain sockets for the guest's serial terminal and the Qemu 
frontend's monitor console. If you want to have a look at it, I have 
uploaded it here (unfortunately without documentation, and with a big 
warning that it was all thrown together in haste):


https://forge.petermann-it.de/mpeterma/vmctl



 MP> Scenario:

 MP> I have a small root filesystem with FFS on the built-in SSD, and
 MP> the backing store for the VMs is provided through ZFS ZVOLs. The
 MP> ZVOLs are replicated alternately every night (full and
 MP> incremental) to an external USB hard drive.

Are these 'zfs send' style backups ? or is the state on the backup USB
hard drive ready for swapping, if the primary fails for eg ?


Yes, I use zfs send and save its stream to files on the USB drive for my 
regular backups, so they are not directly usable. The idea is 
interesting, though. I chose this approach back then because I do 
something quite similar on my FFS systems with dump, and the incremental 
aspect was important to me. On the other hand, I've also tested pulling 
a zfs send of all ZVOLs from the mini-server to my laptop and then 
playing around locally with Qemu/nvmm on a "production copy".




 MP> There are a total of 5 VMs:

 MP> net (DHCP server, NFS and SMB server, DNS server) app
 MP> (Apache/PHP-FPM/PostgreSQL hosting some low-traffic web apps)
 MP> comm (ZNC) iot (Grafana, InfluxDB for data collection from two
 MP> smart meters every 10 seconds) mail (Postfix/Cyrus IMAP for a
 MP> handful of mailboxes)

 MP> Most of the time, the CPU usage of the host with this
 MP> "load" is around 20%. The provided services consistently respond
 MP> quickly.

Ok - and these are accounted as the container qemu processes' quota
scheduling time, I assume ? What about RAM ? Have you had a situation
where the host OS has to swap out ? Does this cause trouble ? Or does
qemu/nvmm only use pinned memory ?


I sized the VMs' RAM so that a few hundred MB remain as a buffer on the 
host. Memory has run out in the past, especially when zfs send makes use 
of the buffer cache. Swapping then set in as well, and together with the 
I/O load already raised by zfs send, the system slowed down so badly 
that response times were no longer acceptable. In this state, only a 
restart of the host brought a complete recovery. I got this under 
control with a tip someone gave me in #netbsd: I now pipe the output of 
zfs send into dd, which has the oflag "direct" set and takes over 
writing the file. Apparently this bypasses some of the caching and 
avoids the situation.
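
A minimal sketch of that pipeline, for illustration only. The helper name send_to_file, the snapshot name, and the target path are made up, and the block size is a starting point to tune rather than a tested value:

```sh
#!/bin/sh
# send_to_file SNAPSHOT OUTFILE
# Streams a ZFS snapshot into OUTFILE. dd does the writing and is asked,
# via oflag=direct, to bypass caching where the system allows it.
# The exit status is that of dd, the last command in the pipeline.
send_to_file() {
    zfs send "$1" | dd of="$2" bs=1048576 oflag=direct
}

# Hypothetical usage (names are placeholders):
#   send_to_file pool/vm/net@nightly /mnt/usb/net-nightly.zfs
```

An incremental stream would use the same helper shape with zfs send -i; only the zfs send arguments change.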


Regarding pinned memory I can't say anything. The memory consumption of 
the VMs is stable from the host's point of view, and I haven't really 
tried ballooning with it yet.




 MP> However, I have noticed that depending on the load, the clocks
 MP> of the VMs can deviate significantly. This can be compensated
 MP> for by using a higher HZ in the host kernel (HZ=1000) and
 MP> a tolerant ntpd configuration in the guests. I have also tried
 MP> various settings with schedctl, especially with the FIFO
 MP> scheduler, which helped in certain scenarios with high I/O
 MP> load. However, this came at the expense of stability.

I assume this is only *within* your VMs, right ? Do you see this across
guest Operating Systems, or just specific ones ?


The deviation 

Re: nvmm users - experience

2023-05-24 Thread Matthias Petermann

Hello everyone,

as a follow-up to yesterday's conversation, I've uploaded the script to my 
private Git[1] in case anyone wants to take a look at it.


It consists of the actual tool itself and an rc-script that supports 
start/stop functionality during system boot-up and shutdown. The 
configuration of the VMs needs to be provided in a (hard-coded) conf.d 
directory.


However, I want to mention upfront that it was put together hastily and 
only meets the minimum requirements. Nevertheless, it might serve as 
inspiration for more sophisticated tooling in the future.


Kind regards
Matthias

[1] https://forge.petermann-it.de/mpeterma/vmctl

On 23.05.23 16:38, Matthias Petermann wrote:

Hi Robert,

On 23.05.23 15:51, Robert Nestor wrote:

Commenting on your follow-up on Xen startup and how I considered doing 
this in NVMM.  I have a rudimentary script that I use to define/create 
guest systems which also includes some hooks for starting up guest 
systems. There’s not currently any way I found in NVMM to send in 
shutdown commands or instructions though which seem to be available in 
Xen and KVM.  I did put hooks in my script to backup and restore guest 
systems.


Assuming you are using Qemu as the frontend to nvmm, you can redirect 
Qemu's "monitor" console to a Unix domain socket.


I use this command in my scripts:

```
  MONITOR_SOCKET=/tmp/$VM_ID.monitor
  CONSOLE_SOCKET=/tmp/$VM_ID.console

  nohup qemu-system-x86_64 -name $VM_ID -machine pc-q35-7.0 \
     -smp $VM_CORES -m $VM_RAM -accel nvmm \
     -device virtio-balloon-pci,id=balloon0 \
     -k de -boot cd -cdrom $VM_CDROM \
     -machine graphics=off -display none -vga none \
     -object rng-random,filename=/dev/urandom,id=viornd0 \
     -device virtio-rng-pci,rng=viornd0 \
     -object iothread,id=t0 \
     $BLK \
     -device virtio-net-pci,netdev=vioif0,mac=$VM_MAC \
     -netdev tap,id=vioif0,ifname=$VM_NETIF,script=no,downscript=no \
     -chardev socket,id=monitor,path=$MONITOR_SOCKET,server=on,wait=off \
     -monitor chardev:monitor \
     -chardev socket,id=serial0,path=$CONSOLE_SOCKET,server=on,wait=off \
     -serial chardev:serial0 \
     -pidfile /tmp/$VM_ID.pid \
     2>&1 | logger -p local0.notice &
```

Shutdown is possible by sending the system_powerdown Qemu monitor command:

```
MONITOR_SOCKET=/tmp/$VM_ID.monitor
echo "system_powerdown" | nc -N -U $MONITOR_SOCKET
echo ""
```

On my systems this results in an ACPI poweroff event, which triggers the 
shutdown procedure in the NetBSD guest.


Kind regards
Matthias




Re: nvmm users - experience

2023-05-23 Thread Matthias Petermann

Hi Robert,

On 23.05.23 15:51, Robert Nestor wrote:


Commenting on your follow-up on Xen startup and how I considered doing this in 
NVMM.  I have a rudimentary script that I use to define/create guest systems 
which also includes some hooks for starting up guest systems. There’s not 
currently any way I found in NVMM to send in shutdown commands or instructions 
though which seem to be available in Xen and KVM.  I did put hooks in my script 
to backup and restore guest systems.


Assuming you are using Qemu as the frontend to nvmm, you can redirect 
Qemu's "monitor" console to a Unix domain socket.


I use this command in my scripts:

```
 MONITOR_SOCKET=/tmp/$VM_ID.monitor
 CONSOLE_SOCKET=/tmp/$VM_ID.console

 nohup qemu-system-x86_64 -name $VM_ID -machine pc-q35-7.0 \
    -smp $VM_CORES -m $VM_RAM -accel nvmm \
    -device virtio-balloon-pci,id=balloon0 \
    -k de -boot cd -cdrom $VM_CDROM \
    -machine graphics=off -display none -vga none \
    -object rng-random,filename=/dev/urandom,id=viornd0 \
    -device virtio-rng-pci,rng=viornd0 \
    -object iothread,id=t0 \
    $BLK \
    -device virtio-net-pci,netdev=vioif0,mac=$VM_MAC \
    -netdev tap,id=vioif0,ifname=$VM_NETIF,script=no,downscript=no \
    -chardev socket,id=monitor,path=$MONITOR_SOCKET,server=on,wait=off \
    -monitor chardev:monitor \
    -chardev socket,id=serial0,path=$CONSOLE_SOCKET,server=on,wait=off \
    -serial chardev:serial0 \
    -pidfile /tmp/$VM_ID.pid \
    2>&1 | logger -p local0.notice &
```

Shutdown is possible by sending the system_powerdown Qemu monitor command:

```
MONITOR_SOCKET=/tmp/$VM_ID.monitor
echo "system_powerdown" | nc -N -U $MONITOR_SOCKET
echo ""
```

On my systems this results in an ACPI poweroff event, which triggers the 
shutdown procedure in the NetBSD guest.
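
A sketch of how a stop routine could wait for the guest to finish shutting down after sending system_powerdown. It reuses the /tmp/$VM_ID.monitor and /tmp/$VM_ID.pid paths from the start command above; the function name vm_shutdown and the 60-second default timeout are assumptions made for illustration:

```sh
#!/bin/sh
# vm_shutdown VM_ID [TIMEOUT_SECONDS]
# Sends system_powerdown to the Qemu monitor socket and waits until the
# Qemu process recorded in the pidfile has exited.
# Returns 0 once the process is gone, 1 if it is still running at timeout.
vm_shutdown() {
    vm_id=$1
    timeout=${2:-60}                      # default is an arbitrary choice
    monitor_socket=/tmp/$vm_id.monitor
    pidfile=/tmp/$vm_id.pid

    [ -r "$pidfile" ] || return 0         # no pidfile: treat as not running
    pid=$(cat "$pidfile")

    echo "system_powerdown" | nc -N -U "$monitor_socket" >/dev/null

    # Poll once per second until the process has disappeared.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$timeout" -gt 0 ] || return 1
        sleep 1
        timeout=$((timeout - 1))
    done
    return 0
}
```

A caller such as an rc script could treat a return value of 1 as the signal to fall back to a harder stop, but that policy is left open here.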


Kind regards
Matthias




Limitations regarding number of VNDs as Xen Backing Store for DomUs (consider partitionless image approach vs. GPT in image approach)

2023-05-17 Thread Matthias Petermann

Hello,

motivated by the other Xen thread here (I will also share a few 
experiences soon), I am currently weighing which approach would be most 
advantageous for my use case in terms of storage.


Once again, we have a small NUC system as a Xen host. I exclusively use 
PV virtualization in the DomUs. In the Dom0, there is a large FFS 
filesystem on which the DomUs' storage is to be kept as image files. I 
would like to set it up this way for reasons that I can explain if 
necessary.


In doing so, I would like to keep the image files as compact as 
possible, meaning they should only contain the base system and selected 
packages suitable for the workload. The actual payload should be located 
in a separate image file within the VM, which will be linked as a 
separate block device in the DomU configuration.


A question that arises in this context is whether it would be sensible 
to run the images without partitions, i.e., without GPT, and instead 
have the FFS directly on the block device. I have a sense of the 
advantages and disadvantages of each option in terms of flexibility:


++ GPT: VM image can also be booted using HVM/Qemu/nvmm
++ GPT: An image can contain multiple partitions, i.e., root and swap 
(thus saving VND devices)

-- GPT: Image creation is more complex
-- GPT: Image handling is more complex when the image is used exclusively 
for VMs (take care to avoid duplicate GUIDs, for example)


The partitionless approach would consequently require attaching the swap 
as another image file, which would result in 3 VND devices per DomU 
(base system, payload, and swap).


This leads to the main reason for my question: apart from the VND nodes 
in /dev, which I can easily generate in sufficient numbers using MAKEDEV, 
what other limitations that I am not yet aware of might I run into 
sooner with this approach than with one that requires fewer VND devices? 
An assessment for roughly 20-30 VMs on a system with 8GB of RAM (Dom0 = 
512MB) would help me greatly.
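
To put numbers on the difference, a small back-of-the-envelope sketch. It assumes 3 images per DomU for the partitionless layout (base system, payload, swap) and 2 per DomU for the GPT layout (base system including swap, plus payload); the second figure is a reading of the list above, not something stated outright:

```sh
#!/bin/sh
# vnd_needed VMS IMAGES_PER_VM
# Prints the total number of VND devices the Dom0 has to provide.
vnd_needed() {
    echo $(( $1 * $2 ))
}

for vms in 20 30; do
    echo "$vms VMs: partitionless $(vnd_needed "$vms" 3), GPT $(vnd_needed "$vms" 2)"
done
```

For 30 VMs this comes to 90 VND devices without partitions versus 60 with GPT, so the open question is whether anything besides the /dev nodes becomes a limit somewhere in that range.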


Thank you very much for your input.

Best regards,
Matthias




Re: Advice for new travelling server: Intel Z690 chipset?

2023-05-10 Thread Matthias Petermann

Hi Johan,

On 10.05.23 10:26, Johan Stenstam wrote:

The only fly in the ointment is that running NetBSD 9.3 with Xen 4.13 the 
hypervisor complains about the CPU being unrecognised. When trying Xen 4.15 the 
CPU is recognised, but then the hypervisor crashes just as it is about to 
handover to NetBSD. But that’s a topic for another list, and I’m sure it will 
get sorted out.


Thank you for the interesting information and for letting us take part 
in the hardware discovery. The note about Xen caught my interest. Is the 
device a UEFI-only system, or can you also boot via BIOS/CSM? The 
background is that I recently also tried to boot with Xen 4.15 on a 
UEFI-only system (NetBSD 10.0_BETA) and hit exactly the same problem: I 
still see the messages from Xen, but then the NetBSD kernel does not 
start. Have you already posted about this on another list? If you still 
plan to, could you please put me on CC?


Thanks & many greetings
Matthias




Devpubd - possible to automatically adjust permissions for zvol device nodes?

2022-12-16 Thread Matthias Petermann

Hello,

under /dev/zvol/rdsk/pool/ there are device nodes for zvols created 
under ZFS. These are created automatically when the ZFS module is loaded 
(if not present) and are given the 600 root/wheel permission by default.


How can I set the permission, owner, and group during creation?

Devpubd looked like an obvious option - I had put a hook script there. 
However, apparently creating the ZVOLs doesn't generate a devpubd event, 
even though /etc/rc.d/zfs depends on it. Can anyone confirm this, or 
have I done something wrong?


In case this is not feasible with devpubd - what other options are 
there? My workaround for now would be to build a quick and dirty 
rc-script that loads after ZFS and overwrites the permissions in said 
/dev subdirectory at system startup.
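
A sketch of the core of such a workaround, written as a plain shell function that an rc script ordered after zfs could call. The function name set_zvol_perms and the owner, group, and mode in the usage comment are placeholders, not values from an existing setup:

```sh
#!/bin/sh
# set_zvol_perms DIR OWNER GROUP MODE
# Applies owner, group, and mode to every node directly inside DIR.
# Not recursive: zvols of nested datasets sit in subdirectories and
# would need one call per directory.
set_zvol_perms() {
    dir=$1 owner=$2 group=$3 mode=$4
    [ -d "$dir" ] || return 0          # directory absent: nothing to do
    for node in "$dir"/*; do
        [ -e "$node" ] || continue     # glob matched nothing
        [ -d "$node" ] && continue     # skip nested dataset directories
        chown "$owner:$group" "$node" && chmod "$mode" "$node"
    done
}

# Hypothetical call (user, group, and mode are placeholders):
#   set_zvol_perms /dev/zvol/rdsk/pool qemu qemu 660
```

This only covers nodes that exist when the script runs; a zvol created later would keep the default permissions until the next run.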


Many greetings
Matthias




Re: Mystical issue with NetBSD and opening Russian Bank site chelinvest.ru

2022-11-01 Thread Matthias Petermann

Hello Dmitrii,

this sounds really mystical and I don't have a concrete idea why this is 
happening in your case. However, I have had a related problem before 
that caused me a lot of trouble. It was about the name resolution in 
mixed networks of ipv4 and ipv6. You could check whether some or all of 
these points apply to your network:



- Are both systems using the same DNS server?

- Are there other DNS servers on the network?

- If the DNS server is local (e.g. on your router), does it have a DNS 
cache?


- Is the network (and DNS) pure ipv4 or a mixture of ipv6 and ipv4?

- If both systems have both an ipv4 and an ipv6 address, do both reach 
other systems behind the router via both ipv4 and ipv6?



I could imagine that the scenario you describe could occur under the 
following conditions:


 - There is more than one DNS server in your local network. One (1) of 
them provides ipv4 addresses, one (2) of them ipv6 addresses.


 - The DNS servers delegate the resolution of foreign names (i.e. outside 
the local domain) to upstream DNS servers, with the upstream of (1) 
always responding slightly later than the upstream of (2).


 - NetBSD is set to query both ipv6 addresses and ipv4 addresses via DNS.

 - FreeBSD is set to query only ipv4 addresses via DNS.

 - In the initial state, the cache of the DNS servers does not yet know 
the IP of the bank.


 - When NetBSD starts the DNS query, it first receives an ipv6 address 
from (2) and may not be able to reach it due to the router 
configuration. The slightly delayed response from (1) is ignored because 
the ipv6 address has already been received.


 - When FreeBSD then starts the DNS query, it receives an ipv4 address 
from (1).


 - Now when NetBSD starts the DNS query again, the ipv4 address of the 
bank is already in the cache of (1), so the delay of the upstream query 
is avoided. If (1) is the "faster" one within your local network once 
the IPs are cached, then in this case NetBSD receives the ipv4 address 
of the bank first and can possibly reach it.


 - After a while, the cache expires (depending on how the TTL of the 
bank's domain is set) and the whole game repeats itself.


As I said, I ran into a similar situation myself. It helped to set 
a policy on the NetBSD host with ip6addrctl(8) to prefer ipv4. This 
meant that the ipv6 DNS responses were discarded. The easiest way to do 
this is with the rc.d script of the same name and the command prefer_ipv4.
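
A sketch of what a persistent setting might look like in /etc/rc.conf. The two variable names and the policy value are an assumption about how the ip6addrctl rc.d script is configured and should be checked against rc.conf(5) on the installed release; only the prefer_ipv4 command in the comment comes from the text above:

```sh
# /etc/rc.conf fragment (variable names are an assumption, see rc.conf(5))
ip6addrctl=YES
ip6addrctl_policy="ipv4_prefer"

# One-off alternative using the command named above, run as root:
#   /etc/rc.d/ip6addrctl prefer_ipv4
```

Running ip6addrctl without arguments afterwards should list the active policy table, which is a quick way to see whether the ipv4 prefix now ranks first.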


Kind regards
Matthias


On 02.11.22 at 03:53, Dmitrii Postolov wrote:
Topic: Mystical issue with NetBSD and opening Russian Bank site 
chelinvest.ru


Hi! Sorry for my bad English...

The NetBSD version: 9.3_STABLE 02 Nov 06:28 +05 2022

I am a client of the Russian bank Chelyabinvestbank. Its website is 
https://chelinvest.ru
This website opens successfully on FreeBSD with FF-ESR in its default 
settings, and a redirect to https://chelinvest.ru/?ckattempt=1 occurs.

On NetBSD 9.3_STABLE with FF-ESR (102.1.0, latest version from the binary 
repository cdn.netbsd.org/.../9.0_current), the progress indicator spins 
endlessly when I try to open this site, and the site does not open.


In rare instances, if I force the site address to 
'https://chelinvest.ru/?ckattempt=1',
the site opens successfully on NetBSD, but this is very rare. 
Basically, this site never opens under NetBSD.


The mystical part is this: if I open the site on another PC running 
FreeBSD and then go to the separate PC running NetBSD and try to open it 
there, https://chelinvest.ru opens successfully on NetBSD with FF-ESR for 
a while, but after some time the problem comes back.


The package mozilla-rootcerts-openssl is installed on both systems, and I 
tried manually installing the package openssl on NetBSD, but this did not 
solve the problem.


Please, help me to solve this problem...

P.S. OpenBSD has a similar problem with this site, so only FreeBSD 
successfully opens it with default settings.






Re: Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-23 Thread Matthias Petermann

Hi,

On 21.10.2022 at 14:36, ignat...@cs.uni-bonn.de wrote:

On Thu, Oct 20, 2022 at 10:30:12PM +0100, Mr Roooster wrote:

On Fri, 7 Oct 2022 at 14:19, Matthias Petermann  wrote:



- Can what I have in mind already be solved (differently or more
elegantly) with existing tools from the base system?


It's not elegant, but depending on your shell you can abuse tee to do
something like:

$ echo -e "alert|alert thing\ninfo|less important\n" |
  tee >(grep "^info|" | cut -c6- | logger -puser.notice) |
  grep "^alert|" | cut -c7- | logger -plocal0.notice


Uh... much easier than that, just using the standard shell's
builtin capabilities (and /usr/bin/logger):

$ printf "alert alert thing\ninfo less important\n" |
   (while read typeofmsg rest; do
        case $typeofmsg in
        alert)  prio="user.notice";;
        info)   prio="local0.notice";;
        esac
        logger -p $prio $rest
    done)

You can save the case...esac when you put the raw prio value as first
word of the line (whitespace separated from the rest of it):

$ printf "user.notice alert thing\nlocal0.notice less important\n" |
   (while read typeofmsg rest; do
        logger -p $typeofmsg $rest
    done)


(Btw: If you insist on using cut, use

...|cut -d\| -f2-|...

instead; it's more robust because you don't have to count the characters
each time you edit the code.)

Regards
-is



Thank you all very much for the suggestions and food for thought. I'm 
gradually coming to the realization that I need to separate the use 
cases more precisely after all. I have the ideas of a 12-factor app[1] 
in the back of my mind all the time and am trying to reconcile them with 
the requirements of command line tools / scripts. Part of the motivation 
behind this comes from the fact that I've written a number of server 
apps that have multiple modes of operation - such as a built-in web 
server (logging via stderr) and a CLI (user input via stdin, payload 
output via stdout, logging/error messages via stderr). Since the 
scenarios that can be triggered via the web server are exactly the same 
as those that can be triggered via the CLI, I try to treat all 
diagnostic output (including error messages and warnings as well as all 
non-payload confirmations and information) as similar as possible for 
both operating modes. Furthermore I don't want to implement any 
specialities of the logging framework of the respective runtime 
environment in my applications (implemented in Golang, deployment on 
different platforms including Windows). Long story short... I should 
reconsider my motivation ;-) I realize that the kind of change to the 
logger tool I prototyped is based on very specific assumptions that 
don't belong on the generic level of the base system.


I'm currently thinking about a standalone tool, a kind of "logger-shell" 
that starts an arbitrary binary (12f app) with popen.
Stdin/stdout of the binary would be passed through directly to the 
stdin/stdout of the "logger-shell". In contrast, stderr would be logged 
via (configurable) rules to the respective logger (e.g. syslog()) with 
the appropriate log levels. This would allow me to wrap the above kind 
of server applications in web server mode while using them directly in 
CLI mode. I'll keep thinking about it though ;-)
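
The idea can be approximated in plain sh before writing a dedicated tool. The following is only a sketch under assumptions: the name log_wrap, the "level|message" line format on stderr, and the LOGGER override variable (there so the logger call can be swapped out for a dry run) are all made up for illustration:

```sh
#!/bin/sh
# log_wrap FACILITY COMMAND [ARGS...]
# Runs COMMAND with stdin and stdout passed through untouched. Each line
# on stderr is sent to logger; a leading "level|" selects the log level,
# anything else is logged as notice.
# Limitation: the return status is that of the logging loop, not COMMAND.
log_wrap() {
    facility=$1
    shift
    {
        # fd 3 carries the command's stdout past the pipe to the real stdout
        "$@" 2>&1 1>&3 3>&- | while IFS= read -r line; do
            level=notice
            msg=$line
            case $line in
            *"|"*)
                case ${line%%|*} in
                emerg|alert|crit|err|warning|notice|info|debug)
                    level=${line%%|*}
                    msg=${line#*|}
                    ;;
                esac
                ;;
            esac
            ${LOGGER:-logger} -p "$facility.$level" -- "$msg"
        done
    } 3>&1
}

# Hypothetical usage:
#   log_wrap local2 /usr/local/bin/myapp --serve
```

Unlike a single long-running logger process, this starts logger once per line, which is acceptable for low-volume diagnostics but not for chatty programs.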


For my example - the backup shell script - in the meantime I think that 
the approach with the "standard shell's builtin capabilities" is the one 
I like best.


Kind regards
Matthias


[1] https://12factor.net/logs


Re: Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-16 Thread Matthias Petermann
[re-sent because the wrong patch was included and the netbsd-userlevel 
address was wrong; apologies for duplicates]


Hello,

On 07.10.22 21:39, Mouse wrote:

Is there some reason you don't actually syslog() the log messages,
then, rather than sending them down a pipe?  It sounds to me as though
you are going to have to make your log generator logging-aware, but,
then, I don't see what benefit you get from piping the output to a tool
instead of just logging it directly.  (The obvious (to me) benefit is
that you can control facility and priority with the logging tool
instead of wiring it into the code, but here you're pushing it back
into the log-generation code anyway.)


Besides the things I already wrote in my last reply, this passage got me 
thinking again. You raised an important point: accepting the syslog 
protocol via stdin implies that the logging program has to "speak" the 
syslog protocol. But this is exactly what I want to avoid with the whole 
thing ;-) I would rather set a fixed facility via a logger parameter and 
read only the log level from stdin. This should then also be encoded in 
a form that can be reproduced very easily and intuitively, e.g. by a 
script.


I've implemented a working draft prototype of what I'd consider useful, 
and I'm attaching the patch in case anyone is interested.


From the user's point of view, the change consists of an additional 
command line parameter (-F).


With this parameter, one can set the facility for the logger (unlike -p, 
which sets facility + log level). The parameter also activates special 
handling of all lines read via stdin. In this particular mode, the first 
characters read per line are expected to contain the log level in plain 
text (info, alert, err, emerg...), terminated by the pipe character. The 
pipe character separates the log level from the actual text to be logged.


This then makes the following possible, for example:

```
❯ echo "info|This is an informational message"|logger -F local2
❯ echo "alert|This is an alert message"|logger -F local2
```

and

```
#minute	hour	mday	month	wday	command
#
0	*/6	*	*	*	/home/mpeterma/zdump backup 2>&1 | logger -F local2
```

I understand that my use case may be a rather specialized one. Using the 
example above (crontab), the advantage of the solution is that the 
called program doesn't need deep knowledge of the logging protocol used. 
The only point of contact is the naming of the log levels, which follows 
the syslog standard. Otherwise, the only communication is via stderr, a 
channel that is available in every conceivable programming language. The 
program itself can detect at runtime whether its stderr is a terminal 
or a pipe and adjust its output accordingly. In the terminal case, for 
example, I have it prepend a timestamp and color-code the log level. In 
the pipe case, I have it prepend the log level + "|" symbol, which lets 
the modified logger log each line with the matching log level and the 
configured facility. Another advantage is that the logger process is 
started only once when the program is started, instead of every time a 
log line is written.
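
On the script side, the terminal-versus-pipe switch could look roughly like this. It is a sketch, not the helper from the actual backup script: the function name log, the timestamp format, and the use of bold instead of per-level colors are assumptions:

```sh
#!/bin/sh
# log LEVEL MESSAGE...
# Writes one diagnostic line to stderr. On a terminal the line gets a
# timestamp and a highlighted level; otherwise it is emitted in the
# "level|message" form expected by the modified logger.
log() {
    level=$1
    shift
    if [ -t 2 ]; then
        # stderr is a terminal: human-readable form
        printf '%s \033[1m%s\033[0m %s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" >&2
    else
        # stderr is a pipe or file: machine-readable form
        printf '%s|%s\n' "$level" "$*" >&2
    fi
}

# Hypothetical usage inside a script:
#   log info "starting backup"
#   log err "snapshot failed"
```

Run from cron with 2>&1 | logger -F local2, every call takes the second branch, so the script needs no knowledge of syslog beyond the level names.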


For my use case this approach is the most practical way at the moment. 
In contrast, I can't think of a practical use case for what the comment 
in the original logger code envisions (parsing syslog-protocol messages 
from stdin). Did I miss something here?


I am very grateful for any suggestions, criticism and food for thought.

Kind regards
Matthias

--- /build/netbsd-current/src/usr.bin/logger/logger.c	2012-04-27 08:30:48.0 +0200
+++ journal/netbsd/logger/logger.c	2022-10-16 10:16:02.649411028 +0200
@@ -66,7 +66,7 @@
 int
 main(int argc, char *argv[])
 {
-	int ch, logflags, pri;
+	int ch, logflags, fac, pri, prioflag;
 	const char *tag;
 	const char *sd = "-";
 	const char *msgid = "-";
@@ -75,7 +75,8 @@
 	tag = NULL;
 	pri = LOG_NOTICE;
 	logflags = 0;
-	while ((ch = getopt(argc, argv, "cd:f:im:np:st:")) != -1)
+	prioflag = 0;
+	while ((ch = getopt(argc, argv, "cd:f:F:im:np:st:")) != -1)
 		switch((char)ch) {
 		case 'c':	/* log to console */
 			logflags |= LOG_CONS;
@@ -99,6 +100,12 @@
 		case 'p':		/* priority */
 			pri = pencode(optarg);
 			break;
+		case 'F':		/* set fixed facility */
+			fac = decode(optarg, facilitynames);
+			if (fac < 0)
+				errx(EXIT_FAILURE, "unknown facility name: %s", optarg);
+			prioflag = 1;
+			break;
 		case 's':		/* log to standard error */
 			logflags |= LOG_PERROR;
 			break;
@@ -138,13 +145,39 @@
 		}
 		if (p != buf)
 			syslogp(pri, msgid, sd, "%s", buf);
-	} else	/* TODO: allow syslog-protocol messages from file/stdin
-		 *   but that will require parsing the line to split
-		 *   it into three fields.
-		 */
-		while (fgets(buf, sizeof(buf), stdin) != NULL)
-			syslogp(pri, msgid, sd, "%s", buf);
-
+	} else {	
+		while (fgets(buf, sizeof(buf), stdin) != NULL) {
+			if (prioflag == 1) {
+char 

Re: Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-16 Thread Matthias Petermann

Hello,

On 07.10.22 21:39, Mouse wrote:

Is there some reason you don't actually syslog() the log messages,
then, rather than sending them down a pipe?  It sounds to me as though
you are going to have to make your log generator logging-aware, but,
then, I don't see what benefit you get from piping the output to a tool
instead of just logging it directly.  (The obvious (to me) benefit is
that you can control facility and priority with the logging tool
instead of wiring it into the code, but here you're pushing it back
into the log-generation code anyway.)


Besides the things I already wrote in my last reply, this passage got me 
thinking again. You raised an important point: accepting the syslog 
protocol via stdin implies that the logging program has to "speak" the 
syslog protocol. But this is exactly what I want to avoid with the whole 
thing ;-) I would rather set a fixed facility via a logger parameter and 
read only the log level from stdin. This should then also be encoded in 
a form that can be reproduced very easily and intuitively, e.g. by a 
script.


I've implemented a working draft prototype of what I'd consider useful, 
and I'm attaching the patch in case anyone is interested.


From the user's point of view, the change consists of an additional 
command line parameter (-F).


With this parameter, one can set the facility for the logger (unlike -p, 
which sets facility + log level). The parameter also activates special 
handling of all lines read via stdin. In this particular mode, the first 
characters read per line are expected to contain the log level in plain 
text (info, alert, err, emerg...), terminated by the pipe character. The 
pipe character separates the log level from the actual text to be logged.


This then makes the following possible, for example:

```
❯ echo "info|This is an informational message"|logger -F local2
❯ echo "alert|This is an alert message"|logger -F local2
```

and

```
#minute	hour	mday	month	wday	command
#
0	*/6	*	*	*	/home/mpeterma/zdump backup 2>&1 | logger -F local2
```

I understand that my use case may be a rather specialized one. Using the 
example above (crontab), the advantage of the solution is that the 
called program doesn't need deep knowledge of the logging protocol used. 
The only point of contact is the naming of the log levels, which follows 
the syslog standard. Otherwise, the only communication is via stderr, a 
channel that is available in every conceivable programming language. The 
program itself can detect at runtime whether its stderr is a terminal 
or a pipe and adjust its output accordingly. In the terminal case, for 
example, I have it prepend a timestamp and color-code the log level. In 
the pipe case, I have it prepend the log level + "|" symbol, which lets 
the modified logger log each line with the matching log level and the 
configured facility. Another advantage is that the logger process is 
started only once when the program is started, instead of every time a 
log line is written.


For my use case this approach is the most practical way at the moment. 
In contrast, I can't think of a practical use case for what the comment 
in the original logger code envisions (parsing syslog-protocol messages 
from stdin). Did I miss something here?


I am very grateful for any suggestions, criticism and food for thought.

Kind regards
Matthias
--- journal/netbsd/logger/logger.c	2022-10-16 10:16:02.649411028 +0200
+++ /build/netbsd-current/src/usr.bin/logger/logger.c	2012-04-27 08:30:48.0 +0200
@@ -66,7 +66,7 @@
 int
 main(int argc, char *argv[])
 {
-	int ch, logflags, fac, pri, prioflag;
+	int ch, logflags, pri;
 	const char *tag;
 	const char *sd = "-";
 	const char *msgid = "-";
@@ -75,8 +75,7 @@
 	tag = NULL;
 	pri = LOG_NOTICE;
 	logflags = 0;
-	prioflag = 0;
-	while ((ch = getopt(argc, argv, "cd:f:F:im:np:st:")) != -1)
+	while ((ch = getopt(argc, argv, "cd:f:im:np:st:")) != -1)
 		switch((char)ch) {
 		case 'c':	/* log to console */
 			logflags |= LOG_CONS;
@@ -100,12 +99,6 @@
 		case 'p':		/* priority */
 			pri = pencode(optarg);
 			break;
-		case 'F':		/* set fixed facility */
-			fac = decode(optarg, facilitynames);
-			if (fac < 0)
-				errx(EXIT_FAILURE, "unknown facility name: %s", optarg);
-			prioflag = 1;
-			break;
 		case 's':		/* log to standard error */
 			logflags |= LOG_PERROR;
 			break;
@@ -145,39 +138,13 @@
 		}
 		if (p != buf)
 			syslogp(pri, msgid, sd, "%s", buf);
-	} else {	
-		while (fgets(buf, sizeof(buf), stdin) != NULL) {
-			if (prioflag == 1) {
-char *ptr;
-int lev;
-
-for (ptr = buf; *ptr != '\0' && *ptr != '|'; ptr++);
-if (*ptr != '\0') {
-	/* found encoded log level, decode */
-	*ptr = '\0';
-	lev = decode(buf, prioritynames);
-	if (lev == -1) { 
-		/* in case the encoded log level is unknown,
-		 * we first log this fact, then set the log 
-		 * level to a 

Re: Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-07 Thread Matthias Petermann

Hello,

thank you very much for the suggestions and questions.

Am 07.10.2022 um 21:39 schrieb Mouse:

With a lot of scripts and tools written, I have gotten into the habit
of logging all logging output to stderr, as well as any form of
payload to stdout.


What do you do with error messages, then?


The error messages also go into the logger via stderr. Strictly 
speaking, from the stderr-output perspective I try to separate the use 
cases: either the tool runs interactively in the shell, or it runs 
autonomously as part of an automated pipeline. Nevertheless, I want to 
produce all messages generically for both cases, without code 
duplication.



Is there some reason you don't actually syslog() the log messages,
then, rather than sending them down a pipe?  It sounds to me as though
you are going to have to make your log generator logging-aware, but,
then, I don't see what benefit you get from piping the output to a tool
instead of just logging it directly.  (The obvious (to me) benefit is
that you can control facility and priority with the logging tool
instead of wiring it into the code, but here you're pushing it back
into the log-generation code anyway.)


Please let me illustrate using one of my backup scripts as an example. 
This script can be run interactively if necessary. Normally, however, 
it is driven autonomously via cron.


It does not generate regular output on stdout but logs all steps 
performed on stderr. There is a small helper function in the script for 
this purpose:


```
## Log Function ##
log() {
    local LEVEL=$1
    local MESSAGE="$2"

    if [ -t 2 ]; then
        # stderr is a terminal: print a colour-coded level tag
        echo -n "$TC_NORM[" > /dev/stderr
        case $LEVEL in
            info)
                echo -n "$TC_GREEN" > /dev/stderr
                echo -n "INFO" > /dev/stderr
                ;;
            warn)
                echo -n "$TC_YELLOW" > /dev/stderr
                echo -n "WARN" > /dev/stderr
                ;;
            err)
                echo -n "$TC_RED" > /dev/stderr
                echo -n "ERR " > /dev/stderr
                ;;
            *)
                echo -n "" > /dev/stderr
                ;;
        esac
        echo -n "$TC_NORM" > /dev/stderr
        echo -n "] " > /dev/stderr
        echo "$MESSAGE" > /dev/stderr
    else
        # stderr is a pipe or file: plain message only
        echo "$MESSAGE" > /dev/stderr
    fi
}
```

So, depending on whether stderr is connected to a terminal or a pipe, a 
different output format is chosen. I didn't intend to emphasize the use 
of colours; it was just an example, and I can understand your concerns.


In the interactive case, it looks like this:

```
$ ./zdump backup
[WARN] Target changed (76213cde-31a8-497a-a016-9775f2a12549 -> 
c3629a98-17b5-4751-b25e-a9fa338d20ed)

[INFO] Cleaning target ...
[INFO] Created snapshot tank/vol/net@0
[INFO] Starting full dump to tank_vol_...@0.full.zfs ...
[INFO] Completed in 37.324 seconds, 2.2 GB transferred (61 MB/sec)
```

In contrast, the raw output in the autonomous case will look like this:

```
Target changed (76213cde-31a8-497a-a016-9775f2a12549 -> 
c3629a98-17b5-4751-b25e-a9fa338d20ed)

Cleaning target ...
Created snapshot tank/vol/net@0
Starting full dump to tank_vol_...@0.full.zfs ...
Completed in 37.324 seconds, 2.2 GB transferred (61 MB/sec)
```

In the crontab, the backup script is connected to the logger:

```
#minute hour    mday    month   wday    command
#
0       */6     *       *       *       /home/mpeterma/zdump backup 2>&1 | logger -p local2.notice

```

As a result, everything is logged to local2.notice, independently of 
the script's application log level.


In the example above, I could of course call the logger command directly 
with the appropriate priority instead of echoing to stderr. For scripts, 
however, the motivation behind the idea is to avoid starting a new logger 
process for each individual output line. The advantage would be a lower 
system load, although on today's systems this probably won't matter much 
unless you log hundreds of lines.
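For comparison, the per-line alternative mentioned above might look roughly like the following sketch. The facility local2 is taken from the crontab example; the function names `level_to_priority` and `log_direct` are hypothetical, and the mapping of the script's levels to syslog level names is an assumption.

```sh
#!/bin/sh
# Sketch: one logger(1) process per log line, with the priority derived
# from the script's own log level. Names and mapping are illustrative.

level_to_priority() {
    case $1 in
        info) echo "local2.info" ;;
        warn) echo "local2.warning" ;;
        err)  echo "local2.err" ;;
        *)    echo "local2.notice" ;;   # fallback for unknown levels
    esac
}

# Each call starts a new logger process, which is the overhead
# discussed above.
log_direct() {
    logger -p "$(level_to_priority "$1")" "$2"
}
```

With this variant the priority is correct per line, at the cost of one process start per message, whereas the pipe variant starts logger only once.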


In the case of tools or compiled programmes in general, it's more a 
concern of the API. My understanding is that syslog() from the C 
library is only directly available to C/C++ programmes, and to other 
languages at most via FFI. This means that the direct call is not 
possible in some cases. Instead, a connection to syslog would have to be 
established via a network library that speaks the syslog protocol. This 
requires that the programme knows where to find the syslogd service 
(usually a configuration file pointing to the syslogd IP + port / Unix 
domain socket to use). This requires a lot of attention if you have 
several environments, so as not to mix up logging servers from development / 
production. The advantage here would be a simpler configuration, and 
also that I get exactly the same output on the console at development 
time as I get in the log file 

Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-07 Thread Matthias Petermann

Hello all,

following a short discussion on IRC, I would like to ask a question or 
make a suggestion.


Having written a lot of scripts and tools, I have gotten into the habit of 
sending all log output to stderr and any form of payload to stdout. To 
change the logging destination depending on the operating environment, I 
just have to connect stderr to the desired logging tool at the end of the 
pipeline. In my scripts and tools, on the other hand, I don't have to 
worry about the logging target at all, but can work as neutrally as 
possible. This is similarly recommended for twelve-factor (12F) apps[1].


In my production NetBSD environments, this logging tool is then usually 
logger(1), which is at the end of the pipeline and sends the output to 
syslogd.


logger(1) takes a parameter -p, which I can use to set the facility and 
the priority. This is where it starts to get a bit uncomfortable. I have 
no way to influence the facility / priority from the producing side of 
the pipeline. This means that I lose the capabilities of syslogd, e.g. to 
use different log files depending on the priority.


It would be nice if one could, for example, optionally omit the -p 
parameter and instead specify the facility and priority via the standard 
input of logger in coded form (coded with angle brackets as in the raw 
syslog protocol) *)
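As a reminder of how the angle-bracket coding works: in the syslog protocol the PRI value is the facility number multiplied by 8 plus the severity number. The sketch below computes it; the function names `syslog_pri` and `prefix_line` are hypothetical, and the numeric codes used in the usage note (local2 = 18, notice = 5, warning = 4) are the standard syslog values.

```sh
#!/bin/sh
# Sketch: compute a syslog PRI value and prepend it to a message.
# PRI = facility * 8 + severity

syslog_pri() {
    echo $(( $1 * 8 + $2 ))
}

# prefix_line FACILITY SEVERITY MESSAGE -> "<PRI>MESSAGE"
prefix_line() {
    printf '<%d>%s\n' "$(syslog_pri "$1" "$2")" "$3"
}
```

For example, `prefix_line 18 5 "Cleaning target ..."` would produce a line starting with `<149>`, which corresponds to local2.notice.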


In src/usr.bin/logger/logger.c there is a TODO in the else branch of the 
message handling (taken when no message is given on the command line and 
the input is read from stdin) which pretty much mirrors that idea:


```
	} else	/* TODO: allow syslog-protocol messages from file/stdin
		 *       but that will require parsing the line to split
		 *       it into three fields.
		 */
		while (fgets(buf, sizeof(buf), stdin) != NULL)
			syslogp(pri, msgid, sd, "%s", buf);
```

In other implementations of logger (under [2], for example, the Linux 
variant), there is a --prio-prefix parameter that explicitly enables 
this type of handling.
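The receiving side of such a prefix scheme has to do the reverse: strip a leading `<N>` from each line and split N into facility (N / 8) and severity (N % 8). The following is only a sketch of that parsing step, not the util-linux implementation; the function name `parse_prio_prefix` is hypothetical, and it uses plain global shell variables for simplicity.

```sh
#!/bin/sh
# Sketch: parse a "<PRI>message" line.
# Prints "FACILITY SEVERITY MESSAGE" and returns 0 on success;
# returns 1 if the line has no valid numeric prefix.

parse_prio_prefix() {
    line=$1
    case $line in
        "<"*">"*)
            pri=${line#"<"}       # drop the leading "<"
            pri=${pri%%">"*}      # keep what precedes the first ">"
            msg=${line#*">"}      # everything after the first ">"
            case $pri in
                ""|*[!0-9]*)
                    ;;            # not a plain number: treat as no prefix
                *)
                    printf '%d %d %s\n' "$((pri / 8))" "$((pri % 8))" "$msg"
                    return 0
                    ;;
            esac
            ;;
    esac
    return 1
}
```

A line without a valid prefix would then be logged with the default given via -p, which keeps the behaviour backwards compatible for existing users.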


My questions on this:

- With reference to the TODO - would that be a welcome addition? If yes, 
what should be paid particular attention to?


- Can what I have in mind already be solved (differently or more 
elegantly) with existing tools from the base system?



Kind regards
Matthias


*) You could then decide in your scripts, depending on whether stderr is 
connected to a terminal or a pipe, whether you want to output nice 
coloured terminal logging or logging optimised for syslog with a prefix



[1] https://12factor.net/logs
[2] https://man7.org/linux/man-pages/man1/logger.1.html


Re: Multiple instances of the same mount point in fstab not possible? (want to layer multiple filesystems into one namespace via union mount option)

2022-09-24 Thread Matthias Petermann

Hi Michael,

On 24.09.22 10:42, Michael van Elst wrote:

m...@petermann-it.de (Matthias Petermann) writes:


NAME=data1 /export   ffs rw,log,union  0 0
NAME=data2 /export   ffs rw,log,union  0 0
NAME=data3 /export   ffs rw,log,union  0 0



Here, only the first file system is mounted, the other two are ignored.
I suspect that multiple occurrences of the same mount point are not
supported in fstab.


It's a feature. mount -a skips filesystems that are already mounted,
so you don't get failures from the attempt to mount a filesystem twice.
(mount -A forces all mounts).

But your example has another issue, there is no information about the
order of mounts, so you may end with filesystems stacked incompletely
or even the wrong order.

The easiest is probably to declare the filesystem 'noauto' and add
a custom rc script that creates and removes your stacked filesystems.



Thanks - I didn't know the difference between mount -a and mount -A... I 
have skimmed the man page several times but this detail had slipped my 
mind.


With the order of the mounts I assumed that the fstab is processed from 
top to bottom - but that is certainly not quite safe to assume. I was 
afraid that I would have to write my own rc.d script - but the advantage 
would be that I could at least influence that the mounts have to be 
ready before Samba and the NFS server are started.
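The core of such a custom script could be as small as the sketch below, which only prints the mount commands in a fixed order rather than running them. The function name `union_mount_cmds` and the device names in the usage note are hypothetical; the mount options are the ones from the fstab lines. Wrapping this into a proper rc.d script (rc.subr, dependency ordering before Samba and NFS) is left out here.

```sh
#!/bin/sh
# Sketch: emit mount commands for a stack of union mounts, bottom layer
# first. Nothing is mounted; the commands are only printed.

# union_mount_cmds MOUNTPOINT DEVICE...
union_mount_cmds() {
    mountpoint=$1
    shift
    for dev in "$@"; do
        echo "mount -t ffs -o rw,log,union $dev $mountpoint"
    done
}
```

Calling `union_mount_cmds /export /dev/dk1 /dev/dk2 /dev/dk3` would print three commands in exactly the order given, which makes the stacking order explicit instead of relying on how fstab is processed.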


Many greetings
Matthias






Multiple instances of the same mount point in fstab not possible? (want to layer multiple filesystems into one namespace via union mount option)

2022-09-22 Thread Matthias Petermann

Hello all,

I have a possibly somewhat strange question. In addition to the root 
file system, I have three other FFS file systems here. I want to layer 
them on top of each other using the union mount option on one and the 
same mount point.


This works without any problems on the command line. However, I cannot 
find a way to automate this via fstab:



```
NAME=data1 /export   ffs rw,log,union  0 0
NAME=data2 /export   ffs rw,log,union  0 0
NAME=data3 /export   ffs rw,log,union  0 0
```

Here, only the first file system is mounted, the other two are ignored. 
I suspect that multiple occurrences of the same mount point are not 
supported in fstab.


Is my understanding of this correct, and is there perhaps a halfway 
elegant way to implement this?


Kind regards
Matthias


Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-19 Thread Matthias Petermann

Hello Michael and Paul,

Am 05.09.2022 um 07:29 schrieb Michael van Elst:

p...@whooppee.com (Paul Goyette) writes:


You would have to modify the appropriate module's Makefile to add
the HZ=1000 definition.


ZFS doesn't use HZ, but in osnet/sys/sys/time.h it uses a value hz=100
to compute the lbolt time.

Does this help ?

Index: external/cddl/osnet/sys/sys/time.h
===
RCS file: /cvsroot/src/external/cddl/osnet/sys/sys/time.h,v
retrieving revision 1.13
diff -p -u -r1.13 time.h
--- external/cddl/osnet/sys/sys/time.h  29 Aug 2021 08:43:12 -  1.13
+++ external/cddl/osnet/sys/sys/time.h  5 Sep 2022 05:28:33 -
@@ -82,7 +82,7 @@ static inline int64_t
  ddi_get_lbolt64(void)
  {
 struct timespec ts;
-   const int hz = 100;
+   extern int hz;
  
     getnanouptime(&ts);

 return (int64_t)(SEC_TO_TICK(ts.tv_sec) + NSEC_TO_TICK(ts.tv_nsec));



I was able to test the module today. With the above patch, the ZPOOL is 
recognised and correctly integrated directly at boot time with an 
HZ=1000 kernel.
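To illustrate why the hard-coded value matters, here is a small arithmetic sketch of the tick computation. It assumes that the two conversion macros simply scale seconds by hz and divide nanoseconds by (1000000000 / hz); that is an assumption about the macros, not something shown in the diff. The function name `lbolt_ticks` is hypothetical.

```sh
#!/bin/sh
# Sketch: uptime -> tick count for a given hz, under the assumption
# ticks = sec * hz + nsec / (1000000000 / hz)

# lbolt_ticks SECONDS NANOSECONDS HZ
lbolt_ticks() {
    echo $(( $1 * $3 + $2 / (1000000000 / $3) ))
}
```

For the same uptime, a module computing with hz=100 and a kernel running at HZ=1000 would disagree about the tick count by a factor of ten, which is consistent with timeouts in the module behaving differently on the HZ=1000 kernel.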


Many greetings
Matthias


Re: NetBSD iSCSI target on ZVOL used as block device for Qemu - iSCSI: NOP timeout

2022-09-19 Thread Matthias Petermann

Hello Michael,


Am 19.09.2022 um 10:52 schrieb Michael van Elst:

m...@petermann-it.de (Matthias Petermann) writes:

There is a userland implementation (iscsi-target / iscsi-initiator)
that is basically the reference (or maybe example) code plus some
integration as userland filesystem.

The code lacks features and isn't compatible with other iscsi
implementations. It's stable but there is no development.


Then there is the kernel iscsi initiator (iscsictl / iscsid).
It's more compatible, much faster, but probably still a little buggy.
Most bugs and fixes in the last years came from me.

I use the kernel iscsi initiator together with istgt or the Linux
iSCSI target.
[...]



thanks for the clarification and the classification of the supposed 
advantages of an implementation in kernel space.


Your combination - kernel iscsi initiator and userspace istgt target 
sounds like a good roadmap :-) Fortunately, since both are available and 
working in one way or another, there is no rush. In perspective, of 
course, it would be nice to have the best possible implementation in the 
base system (or at least to remove the target implementation from the 
base system altogether - this would eliminate it as a source of errors 
for new users like me).


Kind regards
Matthias


Re: NetBSD iSCSI target on ZVOL used as block device for Qemu - iSCSI: NOP timeout

2022-09-19 Thread Matthias Petermann

Hello Chavdar and Michael,

Am 18.09.2022 um 16:28 schrieb Chavdar Ivanov:



On 18 September 2022 13:46:33 (+01:00), Matthias Petermann wrote:

 > Hi,
 >
 > Am 18.09.2022 um 14:22 schrieb Michael van Elst:
 > > m...@petermann-it.de (Matthias Petermann) writes:
 > > > thanks for your suggestion. Do you mean this package:
 > > > http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/net/netbsd-iscsi-target/
 > > > ? I'm not sure if you mean this one, because the last change here was 9
 > > > years ago.
 > > Use the net/istgt package.
 > >
 > thanks - will build it right away and give it a try :)
That's what I meant, I was out and forgot the name of the package.
Years ago I had a NetBSD target serving a number of (mainly Windows) 
initiators; using the NetBSD built-in target I couldn't get them to 
reconnect after a restart of the server or of the workstations (I had to 
disconnect and reconnect every time). With net/istgt this wasn't the 
case and the connection worked reliably. More recently I used to use 
several zvols served by NetBSD-current (istgt) and from FreeBSD, the 
physical connection being through WiFi-PowerLine-Ethernet, and it worked 
(not particularly fast because of the PowerLine link).

 > Kind regards
 > Matthias
 >

Chavdar



You have both helped me a lot with your advice. With net/istgt it works 
perfectly! The speed is also completely sufficient for me - even over 54 
MBit/s Wifi.


Just out of interest - how can the state of the iscsi implementation in 
NetBSD-Base be assessed in general? Is the implementation outdated 
and/or unstable, or are there (compatibility?) reasons for this? 
According to the description, istgt is the same as the FreeBSD iscsi 
implementation. As far as I can tell, both target implementations run in 
user space, i.e. operate on the same logical level, and neither can 
claim advantages such as running directly in kernel space and thus 
saving context switches. Of course, I realise that there are more 
important issues and that such a complex protocol does not port itself. 
But would it be a far-future option to adopt the istgt implementation in 
NetBSD-base?


Anyway, another knowledge gap closed and a piece of the puzzle placed. 
Very cool: my continuously running NUC-based energy-saving server 
replicates its ZVOLs (Qemu backing stores) to my occasionally running 
(not quite as economical) NAS with RAIDZ2. With iSCSI, I can now boot 
the replicas exported from the NAS directly to my laptop for diagnostic 
or recovery purposes if needed. If you know the stumbling blocks, 
NetBSD/ZFS/NVMM can be used to create a virtualisation platform that is 
quite suitable for small enterprise usage. I think it's about time that 
I write an article for the NetBSD Wiki :)


Thanks again, and kind regards
Matthias


Re: NetBSD iSCSI target on ZVOL used as block device for Qemu - iSCSI: NOP timeout

2022-09-18 Thread Matthias Petermann

Hi,

Am 18.09.2022 um 14:22 schrieb Michael van Elst:

m...@petermann-it.de (Matthias Petermann) writes:


thanks for your suggestion. Do you mean this package:



http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/net/netbsd-iscsi-target/
? I'm not sure if you mean this one, because the last change here was 9
years ago.


Use the net/istgt package.



thanks - will build it right away and give it a try :)

Kind regards
Matthias


Re: NetBSD iSCSI target on ZVOL used as block device for Qemu - iSCSI: NOP timeout

2022-09-18 Thread Matthias Petermann

Hi,

Am 18.09.2022 um 14:11 schrieb Chavdar Ivanov:
My first suggestion would be to try using the iscsi server package from 
pkgsrc. I have certainly used iscsi shared zvol but from a windows 10 
initiator without a problem.




thanks for your suggestion. Do you mean this package:

http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/net/netbsd-iscsi-target/

? I'm not sure if you mean this one, because the last change here was 9 
years ago.


Kind regards
Matthias



NetBSD iSCSI target on ZVOL used as block device for Qemu - iSCSI: NOP timeout

2022-09-18 Thread Matthias Petermann

Hello all,

here again a small (or big?) problem in connection with virtualisation ;-)

The following scenario is given: There is a NetBSD 9.3 server with ZFS, 
on it a ZVOL. The server makes the ZVOL available via iSCSI. There is 
also a NetBSD 9.3 client with Qemu/nvmm. The client boots from the ZVOL 
provided via iSCSI.


I use the following test configuration for this:

## Server

```
saturn$ cat /etc/iscsi/targets

extent0 /dev/zvol/rdsk/tank/backup/vhost/vol/iot0   16GB
target0 rw  extent0 0.0.0.0/0
```

## Client

```
HOSTNAME=netbsd
CORES=1
RAM=1G

qemu-system-x86_64 -nodefaults -machine pc-i440fx-7.0 -smp $CORES -m $RAM \
    -monitor stdio \
    -k de -vga std -usbdevice tablet -boot c \
    -object iothread,id=t0 \
    -drive file=iscsi://192.168.2.20:3260/iqn.1994-04.org.netbsd.iscsi-target:target0/0,format=raw \
    -netdev user,id=vioif0 -device virtio-net-pci,netdev=vioif0 \
    -iscsi initiator-name=iqn.1994-04.org.netbsd.iscsi-target:target0,timeout=0 \
    -accel nvmm
```

## Observation

To my delight, booting on the client works quite well at first. However, 
there are long pauses when loading the kernel (when the spinner is 
displayed on the console). The spinner stops for a few seconds and then 
continues to spin. At the moments when the spinner continues to spin, a 
message appears on the Qemu console:


```
qemu-system-x86_64: iSCSI: NOP timeout. Reconnecting...
qemu-system-x86_64: iSCSI: NOP timeout. Reconnecting...
...
```

On the server, there is no indication of the cause of the timeouts - 
only an output in the syslog that a reconnect has taken place with some 
regularity:


```
...
Sep 18 08:24:03 saturn iscsi-target: > iSCSI Normal login  successful from iqn.1994-04.org.netbsd.iscsi-target:target0 on 192.168.2.140 disk 0, ISID 140969396928512, TSIH 182
Sep 18 08:24:28 saturn iscsi-target: > iSCSI Normal login  successful from iqn.1994-04.org.netbsd.iscsi-target:target0 on 192.168.2.140 disk 0, ISID 141669559369728, TSIH 183
Sep 18 08:24:53 saturn iscsi-target: > iSCSI Normal login  successful from iqn.1994-04.org.netbsd.iscsi-target:target0 on 192.168.2.140 disk 0, ISID 141210014318592, TSIH 184
Sep 18 08:25:18 saturn iscsi-target: > iSCSI Normal login  successful from iqn.1994-04.org.netbsd.iscsi-target:target0 on 192.168.2.140 disk 0, ISID 140802084110336, TSIH 185

```

While at the time of booting these irregularities do not seem to matter 
much (I assume that the BIOS routines in Qemu are tolerant enough here), 
later on when initialising the emulated ATA controller this leads first 
to a downgrade from DMA to PIO4, and finally to a series of "lost 
interrupt", which leads to "device timeout" on wd0 and finally to a 
system that is caught in a never-ending retry loop.


What I can rule out:

- Problems with the network quality - the devices involved are wired 
with 1GB/s LAN and have no problems in other network-heavy scenarios.


- There is no firewall

- Even during the "hangs" there is no high CPU load on the systems involved.

What else I noticed:

- During the "hangs", the iscsi-target process on the server is stuck in 
the "netio/0" state. When the system has recovered and data is flowing, 
it switches between "netio/0" and "netio/1" every second or so.


This is certainly a very special scenario and I suspect that I will have 
to test the whole thing without ZFS involvement (i.e. with a VND). 
However, if anyone has a tip or even experience with this, I would be 
very grateful.


Kind regards
Matthias


Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-12 Thread Matthias Petermann

Hello all,

On 12.09.22 15:00, Matthias Petermann wrote:


CONCLUSION

- Due to the root cause (well explained in [1]) the clocks in the VMs 
are potentially running slow.


- The root cause can be mitigated by a higher HZ in the host kernel 
compared to the guest kernel.


- ntpd can be configured to cope well with the deviations that remain, 
so that installation of 3rd party software (chrony) is not necessarily 
required.


- The clocks run to the second with the described ntpd configuration

- With chrony I had initially obtained a similarly good result, but then 
preferred to get by with the means of the base system (which is why I 
gave ntpd another try)


It remains exciting... the joy about the stable clock in the VMs didn't 
last long this time, too, unfortunately. Today I ran a backup again for 
the first time since the clock has been stable. For this I basically run 
a "zfs send | nc" on the host.


While this process is running, the clocks of the VMs continuously start 
to slow down again. The rate of change is so variable that ntpd seems to 
suspend stepping, now for two hours.


With renice I have now adjusted the priority of the processes:

 - "zfs send" + "nc": +20
 - qemu VM processes: -20

Furthermore I changed the scheduler algorithm for the qemu processes to 
SCHED_FIFO with schedctl (in the manpage this mode is associated with 
real time - I thought this can't be wrong for my problem ;-))


This seems to have stabilized the situation, stepping resumes and 
despite running backup the clocks are accurate. In the iostat running on 
the host, I can see about 15% less bandwidth in terms of I/O throughput 
which is fine for me.


Now, however, I have violated the principles of experimental theory by 
changing two parameters at once. Until I have the opportunity to repeat 
the experiment - does anyone have a guess which of the two changes might 
be decisive?


Kind regards
Matthias





Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-12 Thread Matthias Petermann

Hello all,

Am 31.08.2022 um 11:29 schrieb Matthias Petermann:

Hello all,

I have a NetBSD 9.3 host that hosts multiple virtual machines (Qemu with 
nvmm acceleration). One peculiarity: I use estd on the host to reduce 
the power consumption of the whole system via frequency scaling 
when there is no load.


The time of the host is synchronized via ntpd (with default settings), 
as well as the guests.


The host's time is correct.

The guests' time increasingly lags behind with continued operation. Also 
the ntpd seems to have no compensating effect in the guests here.


What could be the reason for this? Can estd be a source of interference?

I start the Qemu instances like this:

```
nohup qemu-system-x86_64 -machine pc-q35-7.0 -smp $VM_CORES -m $VM_RAM \
    -accel nvmm \
    -device virtio-balloon-pci,id=balloon0 \
    -k de -boot cd -cdrom $VM_CDROM \
    -machine graphics=off -display none -vga none \
    -object rng-random,filename=/dev/urandom,id=viornd0 \
    -device virtio-rng-pci,rng=viornd0 \
    -object iothread,id=t0 \
    $BLK \
    -device virtio-net-pci,netdev=vioif0,mac=$VM_MAC \
    -netdev tap,id=vioif0,ifname=$VM_NETIF,script=no,downscript=no \
    -chardev socket,id=monitor,path=$MONITOR_SOCKET,server=on,wait=off \
    -monitor chardev:monitor \
    -chardev socket,id=serial0,path=$CONSOLE_SOCKET,server=on,wait=off \
    -serial chardev:serial0 \
    -pidfile /tmp/$VM_ID.pid \
    2>&1 | logger -p local0.notice &
```

Timecounter settings in the guests:

```
net$ sysctl -a |grep timecounter
kern.timecounter.choice = TSC(q=3000, f=199680 Hz) 
clockinterrupt(q=0, f=100 Hz) ichlpcib0(q=1000, f=3579545 Hz) 
hpet0(q=2000, f=1 Hz) ACPI-Safe(q=900, f=3579545 Hz) 
lapic(q=-100, f=1920 Hz) i8254(q=100, f=1193182 Hz) 
dummy(q=-100, f=100 Hz)

kern.timecounter.hardware = TSC
kern.timecounter.timestepwarnings = 0
```

All I could find so far is [1]. It is recommended to add the rtc switch 
to the qemu command. Is there any recommendation here in the meantime 
which setting works best with NetBSD?


I would be very happy about a short recommendation or a field report. As 
it is, this is the last remaining problem on my new virtual host, 
powered by NetBSD, Qemu, NVMM and ZFS ZVOLS.


Kind regards
Matthias


[1] http://mail-index.netbsd.org/port-amd64/2021/05/09/msg003459.html



After a few days of intermittent successes and just as many setbacks, I 
have now found a configuration that really works reliably. For all those 
who may now face the same problem in the future, I would like to share 
my notes on it.


SUMMARY

1) The host kernel is compiled with HZ=1000, the guest kernel stays at 
HZ=100, so the guest clock drifts away slowly, and ntpd can compensate.


2) On the host I run ntpd as client + server. It serves as the only and 
primary time source for the guest VMs. The ntpd.conf is the same as the 
default configuration except for a restrict directive that allows the 
local subnet of the VMs to access it.


3) In the guest VMs, ntpd runs as a client with only one time source 
(namely, the host). The following non-default settings in ntpd.conf are 
worth mentioning:


```
tinker panic 0 stepout 30
```

 - panic 0: disables the panic threshold, so ntpd does not give up even 
with larger time jumps

 - stepout: the default value is 900, I lower it to 30 so that large 
deviations that qualify for stepping are also fixed relatively quickly


```
#tos minsane 2
```

 - since I only use one time source, I comment this parameter out (you 
can just as well set it to 1)


```
server 192.168.2.10 burst minpoll 4 maxpoll 6 true
```

 - only use the one server (the IP address of the host)

 - the settings for minpoll and maxpoll have proven to be appropriate 
(minpoll 4 instead of the default of 6, maxpoll 6 instead of the default 
of 10). This setting results in the time source being polled at a higher 
frequency and therefore potential deviations are detected earlier.
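The minpoll and maxpoll values are exponents of two, in seconds. The following one-liner, with the hypothetical name `poll_seconds`, shows what the settings above mean in practice.

```sh
#!/bin/sh
# Sketch: convert an ntpd minpoll/maxpoll exponent to seconds (2^n).

poll_seconds() {
    echo $(( 1 << $1 ))
}
```

So minpoll 4 and maxpoll 6 mean the host is polled every 16 to 64 seconds, compared with 64 to 1024 seconds for the defaults of 6 and 10.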



CONCLUSION

- Due to the root cause (well explained in [1]) the clocks in the VMs 
are potentially running slow.


- The root cause can be mitigated by a higher HZ in the host kernel 
compared to the guest kernel.


- ntpd can be configured to cope well with the deviations that remain, 
so that installation of 3rd party software (chrony) is not necessarily 
required.


- The clocks run to the second with the described ntpd configuration

- With chrony I had initially obtained a similarly good result, but then 
preferred to get by with the means of the base system (which is why I 
gave ntpd another try)



OPEN ITEMS

- Build host modules with HZ=1000 [2] and test if this fixes ZFS 
initializa

Re: Access FFS partition on GPT on ZVOL

2022-09-07 Thread Matthias Petermann

Hello Michael,

Am 07.09.2022 um 16:37 schrieb Michael van Elst:

m...@petermann-it.de (Matthias Petermann) writes:


saturn$ doas dmsetup create net-export --table "34 2147483581 linear
/dev/zvol/rdsk/tank/backup/vhost/vol/net-export 0"
create and load called


Please try:

--table "0 2147483581 linear /dev/zvol/dsk/tank/backup/vhost/vol/net-export 34"

left side is the offset + size of a chunk of your new mapped device

right side is the backing store (must be the block device) and the offset on 
the backing store.



This worked great! This helps me tremendously - I can now mount the 
replicated ZVOL snapshots of my VMs directly on the NAS. This is much 
more convenient and faster than replicating the ZVOL back to the vhost 
if the worst comes to the worst.
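For reference, the working table line can be derived mechanically from the partition's start and size as reported by gpt show. The sketch below only builds the table string; the function name `dm_linear_table` is hypothetical, and nothing is passed to dmsetup here.

```sh
#!/bin/sh
# Sketch: build a device-mapper "linear" table line for one partition.
# Left side:  offset (0) and size of the chunk on the new mapped device.
# Right side: backing block device and the offset on that device.

# dm_linear_table PART_START PART_SIZE BLOCK_DEVICE
dm_linear_table() {
    echo "0 $2 linear $3 $1"
}
```

With a start of 34 and a size of 2147483581 sectors, this yields the table suggested above. Note that the backing store has to be the block device (dsk), not the raw device (rdsk).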


Kind regards
Matthias


Re: Access FFS partition on GPT on ZVOL

2022-09-07 Thread Matthias Petermann

Hello Michael,

Am 03.09.2022 um 17:59 schrieb Michael van Elst:

m...@petermann-it.de (Matthias Petermann) writes:


saturn$ doas dkctl /dev/zvol/rdsk/tank/backup/vhost/vol/net-export addwedge myexport 34 2147483581 ffs
dkctl: /dev/zvol/rdsk/tank/backup/vhost/vol/net-export: addwedge: Inappropriate ioctl for device


A zvol is no disk and doesn't support partitions or wedges.



/dev/zvol/rdsk/tank/backup/vhost/vol/net-export
ccdconfig: ioctl (CCDIOCSET): /dev/rccd0d: Block device required
saturn$ doas ccdconfig -cv ccd0 0 /dev/zvol/dsk/tank/backup/vhost/vol/net-export
ccdconfig: ioctl (CCDIOCSET): /dev/rccd0d: Inappropriate ioctl for device


ccd learned about wedges in 9.99.74, it wasn't pulled up to netbsd-9.


Maybe the diskmapper (LVM driver) can help, without LVM metadata you
need to manually create a mapping table and install it with 'dmsetup create'.



Thanks for the tip. I didn't know that the device mapper exists in such 
an accessible form in NetBSD (at least I saw it only as a necessary 
accessory for LVM and not as a standalone feature). However, I am not 
getting anywhere here either, and I can't yet see what is wrong with my 
attempts. Does anyone have any ideas?


Here is what I do:


- using NetBSD 9.3

```
saturn$ uname -a
NetBSD saturn.local 9.3_STABLE NetBSD 9.3_STABLE (GENERIC) #0: Tue Aug 9 04:04:53 UTC 2022 root@jupiter.local:/build/netbsd-9/obj/sys/arch/amd64/compile/GENERIC amd64

```

- list GPT of ZVOL

```
saturn$ doas gpt show /dev/zvol/rdsk/tank/backup/vhost/vol/net-export
       start        size  index  contents
           0           1         PMBR
           1           1         Pri GPT header
           2          32         Pri GPT table
          34  2147483581      1  GPT part - NetBSD FFSv1/FFSv2
  2147483615          32         Sec GPT table
  2147483647           1         Sec GPT header
```

- create a device

```
saturn$ doas dmsetup create net-export --table "34 2147483581 linear /dev/zvol/rdsk/tank/backup/vhost/vol/net-export 0"
create and load called

```

- check if it is there

```
saturn$ ls -la /dev/mapper/
total 92
drwxr-xr-x   2 root  wheel512 Sep  7 09:22 .
drwxr-xr-x  11 root  wheel  43520 Sep  7 09:22 ..
crw-rw   1 root  operator  194, 0 Sep  7 09:22 control
brw-r-   1 root  operator  169, 1 Sep  7 09:22 net-export
crw-r-   1 root  operator  194, 1 Sep  7 09:22 rnet-export
```

- retrieve further information from DM

```
saturn$ doas dmsetup info
Name:  net-export
State: ACTIVE
Read Ahead:0
Tables present:None
Open count:0
Event number:  0
Major, minor:  169, 1
Number of targets: 0

saturn$ doas dmsetup status
net-export:

saturn$ doas dmsetup ls
net-export  (169, 1)
```

- try to mount

```
saturn$ doas mount -t ffs -o ro /dev/mapper/net-export /mnt/
mount_ffs: /dev/mapper/net-export on /mnt: incorrect super block
```

- try to dump first bytes of device

```
saturn$ doas hexdump -C -n 32768 /dev/mapper/rnet-export
saturn$
```

Since the hexdump does not spit out any data, I suspect that the 
DM-Device is not set up correctly. The message during the mount attempt 
appears to me only as a subsequent error.


Kind regards
Matthias


Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-04 Thread Matthias Petermann

Hi Robert,

please allow me one more question

On 04.09.22 10:42, Matthias Petermann wrote:

Hi Robert,

On 04.09.22 02:58, Robert Elz wrote:

if that implies that you rebuilt the kernel with HZ=1000 and then used
the zfs module built with HZ=100 then I think the first thing I would try
would be to rebuild the module(s?) with HZ=1000



Good point... I'll try that right away. This might coincide with my 
observation (race condition when initializing the ZPOOL, mail from just 
now).



I did build the kernel with build.sh as follows:

```
$ cd /build/netbsd-93-1000hz/usr/src/sys/arch/amd64/conf
$ cp GENERIC VHOST
$ vi VHOST

options HZ=1000

$ cd /build/netbsd-93-1000hz/usr/src/
$ mkdir ../obj
$ ./build.sh -O ../obj -j 4 -U tools
$ ./build.sh -O ../obj -j 4 -U kernel=VHOST
$ ./build.sh -O ../obj -U releasekernel=VHOST
```

...and picked it up from

/build/netbsd-93-1000hz/usr/src/obj/releasedir/amd64/binary/kernel/netbsd-VHOST.gz

for deployment on the actual vhost.

While the *kernel* / *releasekernel* targets accept the name of the
kernel configuration to be used, I don't see such an option for the
*modules* target. How can I make sure the modules are built with the HZ
option set in the VHOST config? Or does it simply pick this up from a
previous run of the *kernel* target?


Kind regards
Matthias





Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-04 Thread Matthias Petermann

Hi Robert,

On 04.09.22 02:58, Robert Elz wrote:

if that implies that you rebuilt the kernel with HZ=1000 and then used
the zfs module built with HZ=100 then I think the first thing I would try
would be to rebuild the module(s?) with HZ=1000



Good point... I'll try that right away. This might coincide with my 
observation (race condition when initializing the ZPOOL, mail from just 
now).


Kind regards
Matthias





Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-04 Thread Matthias Petermann

Hello,

On 03.09.22 13:51, Matthias Petermann wrote:
it took a while - in the meantime I was able to test the 1000HZ kernel on
my system. Unfortunately I ran into a problem in the very first test,
which forced me to roll back the change for now. I hope I'll get around
to reproducing this on a less critical system early next week. What
happened: the 1000HZ kernel could not activate the ZFS pool for some
reason.


The zfs module was loaded, though, and I built the kernel from exactly
the same sources as the "original" one, so I assume for now that the
modules are compatible.


In the meantime I was able to get the original problem under control
with chrony (see the post from another user), but the time jumps are
enormous, so I would like to try the solution with the 1000HZ kernel.




here is a small update to my mail from yesterday.

Short recap: after booting the new kernel with HZ=1000 I had noticed
that my ZPOOL was not available. Consequently I could not test the effect
on the VMs. As a stopgap I had compensated for the deviations with
net/chrony, but the corrections are very erratic - the deviations are
simply too large with HZ=100.


Today I made another attempt. The workaround for the ZPOOL problem at 
the moment is not to start ZFS at boot time (zfs=NO in rc.conf) but only 
after the system is completely booted (doas service zfs onestart; doas 
zfs mount -a). It almost seems to me that there is a race condition at 
HZ=1000 that causes ZFS to try to initialize the ZPOOL even though the 
wedge of the VDEV is not ready. Unfortunately I could not analyze this 
in detail yet. But I can live with the workaround for the moment, and 
the good news is that the clocks in the VMs now run exactly to the 
second. Btw, on my low end NUC I cannot measure a significant increase 
in power consumption :-)
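The workaround above could be wrapped in a small helper, for example to be called once the system is fully up. This is only a sketch: the function names `run` and `late_zfs_start` are made up, it assumes zfs=NO stays in rc.conf, and it has to run as root (or be prefixed with doas as in the manual commands).

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start ZFS late and mount all datasets. "onestart" is needed because
# zfs=NO in rc.conf would make a plain "start" do nothing.
late_zfs_start() {
    run service zfs onestart && run zfs mount -a
}
```

Calling `DRY_RUN=1; late_zfs_start` only prints the two commands, which makes it easy to check the sequence before using it for real.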


At this point once again a big thank you to everyone who answered me.

Many greetings & have a nice Sunday
Matthias





Access FFS partition on GPT on ZVOL

2022-09-03 Thread Matthias Petermann

Hello all,

I have a ZVOL that is used by a VM as backing storage. Accordingly, it
carries a GPT which defines, among other things, an FFS partition. I
want to access this partition from the host side (i.e. from outside the
VM).


Listing the GPT already works:

```
saturn$ doas gpt show /dev/zvol/rdsk/tank/backup/vhost/vol/net-export
       start        size  index  contents
           0           1         PMBR
           1           1         Pri GPT header
           2          32         Pri GPT table
          34  2147483581      1  GPT part - NetBSD FFSv1/FFSv2
  2147483615          32         Sec GPT table
  2147483647           1         Sec GPT header
```

My thought now was that I could create a wedge out of it, but no go:

```
saturn$ doas dkctl /dev/zvol/rdsk/tank/backup/vhost/vol/net-export addwedge myexport 34 2147483581 ffs
dkctl: /dev/zvol/rdsk/tank/backup/vhost/vol/net-export: addwedge: Inappropriate ioctl for device
```

I then read under [1] that ccd might help here, however I get a very 
similar error message:


```
saturn$ doas ccdconfig -cv ccd0 0 /dev/zvol/rdsk/tank/backup/vhost/vol/net-export
ccdconfig: ioctl (CCDIOCSET): /dev/rccd0d: Block device required
saturn$ doas ccdconfig -cv ccd0 0 /dev/zvol/dsk/tank/backup/vhost/vol/net-export
ccdconfig: ioctl (CCDIOCSET): /dev/rccd0d: Inappropriate ioctl for device
```

Am I doing something fundamentally wrong here? Or does this require a 
kernel newer than 9.3?


Kind regards
Matthias


[1] https://wiki.netbsd.org/zfs/





Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-09-03 Thread Matthias Petermann

Hello Robert,

On 31.08.22 16:57, Robert Elz wrote:

 Date:Wed, 31 Aug 2022 13:42:13 +0200
 From:Matthias Petermann 
 Message-ID:  

   | I'm also curious about the effect on energy consumption - i.e., whether
   | it's measurable.

I'm sure it's measurable, but I suspect you'd need a highly accurate
and very precise ammeter to do that.

kre



it took a while - in the meantime I was able to test the 1000HZ kernel on
my system. Unfortunately I ran into a problem in the very first test,
which forced me to roll back the change for now. I hope I'll get around
to reproducing this on a less critical system early next week. What
happened: the 1000HZ kernel could not activate the ZFS pool for some
reason.


The zfs module was loaded, though, and I built the kernel from exactly
the same sources as the "original" one, so I assume for now that the
modules are compatible.


In the meantime I was able to get the original problem under control
with chrony (see the post from another user), but the time jumps are
enormous, so I would like to try the solution with the 1000HZ kernel.


Kind regards
Matthias





Re: Low power server ideas

2022-09-02 Thread Matthias Petermann

Hello Andy,

Am 02.09.2022 um 16:38 schrieb Andy Ruhl:

Hello all,

I've been running a NetBSD server on i386 for about 20 odd years, I
should go back and check when I actually started it. I sort of
accidentally upgraded it to amd64 a while back but it worked.

Anyways, it seems like time to move to something else, maybe lower
power if possible.

I found this which is very interesting:

https://blog.netbsd.org/tnf/entry/making_rockpro64_a_netbsd_server

Using a 128gig internal MMC would be plenty for OS and some local
storage then I would add some other disks, possibly SSD.

Looking for other ideas if anyone has any.

Thanks.

Andy


if staying on amd64 is an option, the "cheap" Intel NUCs are
worthwhile in my opinion. The power consumption of my NUC7CJYHN
including 8 GB RAM + 2 TB SSD is about 4 W in idle and about 9 W under 
full load. As a barebone, you can currently get the device for about 130 
EUR. I'm currently rebuilding my small business server on this platform, 
and I'm pretty excited about how well NetBSD 9.3 with Qemu/nvmm runs on 
it. I'm running it as a virtual host for four VMs with ZFS ZVOLs as 
backing storage. Apart from a few minor problems that are solvable or 
under control, the system runs very reliably so far.


Kind regards
Matthias


Re: Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-08-31 Thread Matthias Petermann

Hello Robert,

Am 31.08.2022 um 13:03 schrieb Robert Elz:
[...]

About the rtc, no idea.   But to deal with the problem, aside from
major NetBSD code rewrites (the so called tickless kernel) the one
solution that should work is to run the host with HZ set a lot higher,
and leave the guest(s) at 100Hz.

For any modern host (anything you'd really want to use to run a qemu
guest in production) running with HZ=1000 will be fine (you'll never
notice the tiny extra overhead).   Some of the NetBSD ports already
run at that kind of rate - alpha has been at 1024Hz forever (and these
days, alphas are slow processors - though they weren't compared to
others when that change was made).  With this, the 10ms interrupts might
actually occur about 10.1 ms apart, but that much drift NTP should be
able to handle.   If not, run the host with an even higher HZ rate,
even 10000 should work with a modern amd64 CPU (though I have never
tested that, nor heard of anyone who has - but 2000 should not be an issue).

If for some reason you cannot change the clock rate of the host (that is,
compile a new kernel with "options HZ=1000" in the config file) then make the
guests run with a much slower clock rate - nothing faster than 50Hz.

That should be acceptable (pdp-11's used to run at 50 or 60hz, and worked
OK) but needs to be even slower for clock drift issues.   The problem
is that if the OS clock rate is too slow, it will start to impact upon
(perceived) performance, and some application capabilities.



Thank you for the detailed description and the solutions. You have
helped me a lot to understand the cause - I think I can manage to
increase the tick rate and can hopefully report success soon :-)
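To put numbers on this, as far as I understand the explanation: the tick period is simply 1/HZ, so a guest at 100 Hz expects a clock interrupt every 10 ms, while a host at HZ=1000 handles its timers at 1 ms granularity instead of 10 ms. A small sketch of that arithmetic (the function name `tick_us` is made up for illustration):

```shell
# Length of one clock tick in microseconds for a given HZ value.
tick_us() {
    echo $((1000000 / $1))
}

for hz in 100 1000; do
    echo "HZ=${hz}: one tick every $(tick_us "$hz") us"
done
```

On the host side the change itself is just the "options HZ=1000" line in the kernel config, as described above.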


I'm also curious about the effect on energy consumption - i.e., whether 
it's measurable.


Kind regards
Matthias


Qemu/nvmm - time in NetBSD guest system lags behind (with estd on host)

2022-08-31 Thread Matthias Petermann

Hello all,

I have a NetBSD 9.3 host that hosts multiple virtual machines (Qemu with
nvmm acceleration). One peculiarity: I use estd on the host to reduce
the power consumption of the whole system via frequency scaling when
there is no load.


The host's time is synchronized via ntpd (with default settings), and so
is the guests' time.


The host's time is correct.

The guests' time falls further behind the longer they run, and ntpd in
the guests does not seem to compensate for this.


What could be the reason for this? Can estd be a source of interference?

I start the Qemu instances like this:

```
nohup qemu-system-x86_64 -machine pc-q35-7.0 -smp $VM_CORES -m $VM_RAM -accel nvmm \
    -device virtio-balloon-pci,id=balloon0 \
    -k de -boot cd -cdrom $VM_CDROM \
    -machine graphics=off -display none -vga none \
    -object rng-random,filename=/dev/urandom,id=viornd0 \
    -device virtio-rng-pci,rng=viornd0 \
    -object iothread,id=t0 \
    $BLK \
    -device virtio-net-pci,netdev=vioif0,mac=$VM_MAC \
    -netdev tap,id=vioif0,ifname=$VM_NETIF,script=no,downscript=no \
    -chardev socket,id=monitor,path=$MONITOR_SOCKET,server=on,wait=off \
    -monitor chardev:monitor \
    -chardev socket,id=serial0,path=$CONSOLE_SOCKET,server=on,wait=off \
    -serial chardev:serial0 \
    -pidfile /tmp/$VM_ID.pid \
    2>&1 | logger -p local0.notice &
```

Timecounter settings in the guests:

```
net$ sysctl -a |grep timecounter
kern.timecounter.choice = TSC(q=3000, f=199680 Hz) clockinterrupt(q=0, f=100 Hz) ichlpcib0(q=1000, f=3579545 Hz) hpet0(q=2000, f=1 Hz) ACPI-Safe(q=900, f=3579545 Hz) lapic(q=-100, f=1920 Hz) i8254(q=100, f=1193182 Hz) dummy(q=-100, f=100 Hz)
kern.timecounter.hardware = TSC
kern.timecounter.timestepwarnings = 0
```

All I could find so far is [1]. It recommends adding the rtc switch to
the qemu command. Is there any recommendation in the meantime as to
which setting works best with NetBSD?


I would be grateful for a short recommendation or a report from
experience. As it is, this is the last remaining problem on my new
virtual host, powered by NetBSD, Qemu, NVMM and ZFS ZVOLs.


Kind regards
Matthias


[1] http://mail-index.netbsd.org/port-amd64/2021/05/09/msg003459.html


Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)

2022-08-18 Thread Matthias Petermann

Hi,

On 18.08.22 09:10, B. Atticus Grobe wrote:

ZFS (at least on Solaris and FreeBSD) will use any uncommitted RAM as an
I/O buffer, which likely explains why it was keeping up with the single
core runs, pushing everything to RAM instead of to disk. I would expect
if you push enough data to fill that buffer up, you'll see an equivalent
drop in write speed.



This does not seem to be the case, at least for NetBSD 9.3. I just
changed my test case so that all VMs use only one core each. I am now
using ZVOLs again for storage. I achieve stable, high throughput,
comparable to my first test. The free memory of the host (as displayed
by top) hardly decreases: of 8192 MB in total, more than 5000 MB are
free, and this does not change significantly during the test runs. Mixed
workloads hardly worsen the performance either; for example, I ran the
following in parallel in the three VMs:


vm1: dd if=/dev/zero of=test.img bs=4m
vm2: extraction of pkgsrc.tar.gz
vm3: pkgsrc build of lang/go118

For my "low end" system I draw these conclusions for the time being:

 - each VM should use only one SMP core at a time
 - ZVOLs are no problem and offer better performance than QCOW2 on
ZFS or FFS


It would be interesting to test a system with a much higher number of
physical cores, e.g. to start a VM with 2 virtual cores on an 8-core
system, and see what this means for the I/O performance. Unfortunately
I don't have such a system immediately available, but I could have a
look in the attic. An old AMD FX-8350 that I discarded months ago to
save energy could be suitable.


Kind regards
Matthias





Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)

2022-08-18 Thread Matthias Petermann

Hi,

On 18.08.22 07:14, B. Atticus Grobe wrote:

Forgive me if I've misread, but it seems like you're running 3 VMs with
2 vCPUs each on a dual-core processor. ZFS itself is also processor and RAM
heavy (on every OS I've used it on, never have used it on NetBSD).

So, it seems to me like you've taken an already not particularly fast
processor and applied a really heavy load to it, especially if you're doing
any zRAID or anything else.


that's right - I have a habit of getting the most out of low-end
hardware ;-) This device in particular is interesting - when SpeedStep
kicks in and clocks the CPU down to 800 MHz, the whole device has a power
consumption of less than 4 watts. That's not insignificant for a
full-fledged open source x86_64 server these days.


I don't use a ZRAID but an ordinary striped ZPOOL with only a single
SSD, but that probably doesn't change the fact that I'm pushing the
limits of what's possible with this limited hardware.




Someone actually familiar with the NetBSD kernel can correct me on this,
but I'm fairly sure that the kernel itself is multithreaded, and handles
interrupts on multiple cores.

I would guess that with the single vCPU VMs, the processor was able to
keep up, but now you've doubled down on the workload, and considerably
increased the number of context switches that have to happen.


This realization is slowly coming to me as well - especially based on 
the concerns expressed about my setup.


I have made another attempt - analogous to my previous tests, with the
only difference being that I have completely disabled ZFS and the QCOW2
files are on an FFSv2 file system with WAPBL.


## 1 Core per VM

1: ~45 MB/s  (55% sys, 2% user)
2: ~86 MB/s  (94% sys, 4% user)
3: ~80 MB/s  (95% sys, 5% user)

## 2 Cores per VM

1: ~35 MB/s  (67% sys, 5% user)
2: <1 MB/s  (99% sys, 0.2% user)
3: <1 MB/s  (100% sys, 0.0% user)

In any case, the results with "2 Cores" already confirm that in the
cases where the performance comes to a standstill, >99% of the CPU load
is on the system side, and thus (probably?) a lot of expensive context
switches or other "organizational stuff" takes place, i.e. absolute
overload.


I still can't explain why the performance in the "1 Core" case for the
otherwise identical setup is significantly below the values of ZVOL or
QCOW2 on ZFS. Could this be caching effects, or does iostat count certain
things twice in the case of ZFS?




Or maybe I'm completely wrong on every point. I'd certainly take a look
at your CPU usage though.



I think you are not wrong... as I now realize, I probably need to lower
my expectations for the device. On this limited hardware it is probably
better to use only one CPU core for each VM to reduce context switches,
as these seem to be the reason for the performance drops. I will
concentrate my tests on this for now and try to find out where the
respective limits of ZVOL, QCOW2 on ZFS and QCOW2 on FFS are in this
configuration.


Kind regards
Matthias





Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-20 Thread Matthias Petermann

Hello,

On 19.07.22 15:02, Matthias Petermann wrote:
Further, I'd like to follow up on what the "incompatible" chipset is, 
i.e. how I could test it locally in Qemu. Has anyone at Hetzner ever 
figured this out? I just wanted to save the dmesg and copy it via SSH to 
my home - the next disappointment - it seems that no network interface 
was detected. So in the existing configuration, the VPS is completely 
useless for me.


to add one more data point: I just used the Linux-based rescue system to
upload a "NetBSD-9.99.98-amd64-install.img.gz" (built 2022-06-30) to a
tmp partition on the VPS. After extracting it and writing it to the hard
disk with dd, I reset the system and it starts to boot. The boot loader
works and the kernel begins its initialization sequence. The results so
far:

- In the messages written to the console, I can clearly see a "sd0" 
device recognized as the local disk, as well as several wedges detected 
from the GPT layout on it.


- Also there is a vioif0 device listed as network device

So it looks like the previous issues (no disk + no network on Q35-based
Hetzner VPS) are already addressed in NetBSD-current.


Unfortunately, the kernel panics shortly before it passes control to init:

```
[] panic: cnopen: no console device
```

Any ideas?

Kind regards
Matthias





Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-19 Thread Matthias Petermann

On 11.07.22 11:55, Mayuresh wrote:



This is most likely due to a bug within BSD:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236922

Most of our Systems and all AMD Hosts already use Q35 as default virtual
chipset.  It seems like you had luck to get a CX (Intel) Host that still
uses i440FX.




...seems like I missed this information from your initial mail. This
answers some of my questions ;-)


Kind regards
Matthias





Re: NetBSD 9.2 installer can't detect disk of some Hetzner VPSes

2022-07-19 Thread Matthias Petermann

Hello,

Am 11.07.2022 um 18:27 schrieb Robert Elz:

 Date:Mon, 11 Jul 2022 21:43:15 +0530
 From:Mayuresh 
 Message-ID:  <20220711161315.hoakmn5fgz76gtov@localhost>

   | Hetzner agreed to set a compatible chipset for my instance. So I finally
   | got the configuration I needed and have just installed NetBSD 9.2 on that.

Good.

   | Shouldn't qemu with the chipset setting they mentioned suffice for
   | testing?

Yes, I guess it should ... I'm not a qemu user, don't even have it
installed (though that is in progress now) so I may need some advice
on how to operate it to get the desired effect...

   | There are some difficulties in testing on Hetzner such as
   |
   | - As their reply suggests, there is no guarantee about which chipset
   |   your new instances will get. It's a bit random.

Yes, though they did say (from what you posted) that all the AMD cpu
instances have the one which has the problem, so that might not have
been a problem.

   | - I doubt whether they'd allow an arbitrary image. They have provided
   |   NetBSD 9.2 released image on their platform.

But that's a different issue...And as you suggest, testing would
be much easier done locally than exporting ISO images for you to try
anyway.   Of course, even if we find something to fix, it isn't likely
to get into their provided image any time soon.

kre


since today I have the same problem with a newly created VPS at Hetzner.

This is not the only VPS with NetBSD that I manage there, so for me, the 
first question is which *chipset* I would have to request there for it 
to work? I especially want to make sure that after a hard reboot of my 
existing VPS, it doesn't happen to use this incompatible chipset as well.


Further, I'd like to follow up on what the "incompatible" chipset is, 
i.e. how I could test it locally in Qemu. Has anyone at Hetzner ever 
figured this out? I just wanted to save the dmesg and copy it via SSH to 
my home - the next disappointment - it seems that no network interface 
was detected. So in the existing configuration, the VPS is completely 
useless for me.
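For a local reproduction, QEMU can emulate both chipsets mentioned in this thread: `-machine q35` selects Q35 and `-machine pc` the older i440FX. A minimal sketch that only prints the two command lines; the ISO file name and the memory size are placeholders, the helper name `qemu_cmd` is made up, and whether this matches the actual Hetzner configuration (disk and network devices in particular) is an open assumption.

```shell
# Placeholder path to an installer image; adjust to the image at hand.
iso=NetBSD-9.2-amd64.iso

# Print a qemu command line for the given machine type (q35 or pc).
qemu_cmd() {
    echo "qemu-system-x86_64 -machine $1 -m 1024 -cdrom ${iso} -boot d"
}

qemu_cmd q35   # chipset reported as problematic in this thread
qemu_cmd pc    # i440FX, reported as working
```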


@Mayuresh: if you don't mind, could you please tell me the ticket# of
your request at Hetzner and quote the answer you got? I would also like
to contact their support and refer to your ticket.


Kind regards
Matthias


Re: Keybord seems to be attached to ttyE0 after wdm starts

2022-06-06 Thread Matthias Petermann

Hi,

Am 05.06.2022 um 23:11 schrieb RVP:

On Sun, 5 Jun 2022, BERTRAND Joël wrote:

When NetBSD starts, wdm is launched, mouse works fine, but keyboard
not! I cannot switch from X console to ttyE1, but from a remote server, if
I kill X or wdm, I see that keyboard has sent characters to ttyE1 (login
I have enter in wdm is written after 'login: ' on ttyE1).

If I restart wdm, keyboard is active in wdm.

I don't understand why wdm loses keyboard when it is started during
boot. Explanation will be welcome.



Those config. files look OK. Can you post your Xorg config. file and the
/var/log/Xorg.0.log file?   

-RVP


The symptom is familiar to me. If I remember correctly, you have to start
the X server on a tty that is not already occupied by a getty, otherwise
the keystrokes are not passed on.


If you start wdm via /etc/ttys, something like this (ttyE4):

```
ttyE0   "/usr/libexec/getty Pc"         wsvt25  off secure
ttyE1   "/usr/libexec/getty Pc"         wsvt25  on secure
ttyE2   "/usr/libexec/getty Pc"         wsvt25  on secure
ttyE3   "/usr/libexec/getty Pc"         wsvt25  on secure
ttyE4   "/usr/pkg/bin/wdm -nodaemon"    wsvt25  on secure
```

then I had to put the following into /usr/pkg/etc/wdm/Xservers:

```
:0 local /usr/X11R7/bin/X vt5
```
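The pairing in the two snippets above is ttyE4 with vt5. Assuming the usual wscons numbering, where the X server counts virtual terminals from 1 while the ttyE devices count from 0, the vt argument is the ttyE index plus one. A tiny helper to illustrate (the function name `tty_to_vt` is hypothetical):

```shell
# Map a wscons tty name (ttyE0, ttyE1, ...) to the X server's vtN argument.
tty_to_vt() {
    idx=${1#ttyE}          # strip the "ttyE" prefix, leaving the index
    echo "vt$((idx + 1))"
}

tty_to_vt ttyE4   # matches the vt5 in the Xservers line above
```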

I hope this helps.

Many greetings
Matthias




Re: NetBSD 10 -- new way of attaching virtual interfaces to bridge?

2022-05-25 Thread Matthias Petermann



Hi Manuel,

Am 25.05.2022 um 14:53 schrieb Manuel Bouyer:

On Wed, May 25, 2022 at 02:45:50PM +0200, Matthias Petermann wrote:


so is my understanding correct - for NetBSD 10 as Xen Dom0 it is required to
migrate from tap to vether? May I ask how to tell xentools415 to create a
vether device instead of a tap?


If you're talking about the devices used for qemu's emulated ethernet
devices (in HVM guests), these are real tap devices so don't need to
be migrated.
vether is for the case where a tap device was used without the
"other end" (and in the Xen case, qemu is the other end)



Thanks for the clarification. You are right - I messed this up. Actually 
I was concerned about qemu emulated devices in HVM guests.


Kind regards
Matthias


Re: NetBSD 10 -- new way of attaching virtual interfaces to bridge?

2022-05-25 Thread Matthias Petermann



Hi,

Am 23.05.2022 um 13:02 schrieb s...@mailbox.org:

On 2022/05/23 11:55 Manuel Bouyer  wrote:


How are vether interfaces created and connected to a bridge?


The same way you do with tap I guess:
ifconfig vether0 create
ifconfig vether0 up
brconfig bridge0 add vether0 up


Well, that's just as straightforward as tap was. I was expecting something a 
lot more complicated but I had forgotten how elegantly simple NetBSD is.

Thank you, Manuel.


so is my understanding correct - for NetBSD 10 as Xen Dom0 it is 
required to migrate from tap to vether? May I ask how to tell 
xentools415 to create a vether device instead of a tap?


Kind regards
Matthias


Re: USB headphones

2022-01-17 Thread Matthias Petermann

Hi Todd,

Am 17.01.2022 um 23:27 schrieb Todd Gruhn:

I found a nice set of headphones -- but they ARE USB.

Is there a way to send sound to a/the USB port?


yes, of course you can. USB headphones have a built-in D/A converter
which is connected to the computer via the USB port. Where it gets
interesting is driver support. NetBSD can do this in principle with the
"uaudio" driver (man 4 uaudio), but there could be problems in the
following cases:


- uaudio currently supports only release 1.0 of the USB Audio Spec.
According to the manpage, release 2.0 is not backward compatible and is
not supported. It might be difficult to find out which release your
headphones require; the only way may be to try it out.


- if the PC has only USB 3.0 ports, uaudio seems to work only with
NetBSD-current (that's what I recently got as an answer to a similar
question here).


Many greetings
Matthias



Re: How to bind bozohttpd / inetd to port 8080?

2021-12-18 Thread Matthias Petermann

Hi,

Am 18.12.2021 um 11:47 schrieb Ignatios Souvatzis:

On Sat, Dec 18, 2021 at 11:34:12AM +0100, Matthias Petermann wrote:


I am currently trying to have bozohttpd listen on port 8080 instead of port
80 via inetd.

In /etc/services there is an entry "http-alt" for this.


yes, but in the distributed version there are two others (591 and
8008), both for TCP and UDP.  I guess you'll have to edit your
/etc/services and put comment signs before the two you don't want.

Regards,
-is


You are right:

```
extranet$ getent services http-alt
http-alt   591/tcp
```

I had not counted on that - thanks for the tip :-)
There were two more (8008) and after commenting them out it looks good:

```
extranet$ doas vi /etc/services
extranet$ doas services_mkdb
extranet$ getent services http-alt
http-alt   8080/tcp
```
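The background, as far as I can tell from the output above: with several lines carrying the same service name, a lookup by name returns only one of them (here 591/tcp). A small sketch to list all entries for a name before editing the file; `service_entries` is a made-up helper, not a system tool.

```shell
# List every port/protocol registered under a service name or alias.
# $1 = service name, $2 = services file (defaults to /etc/services)
service_entries() {
    awk -v name="$1" '
        {
            sub(/#.*/, "")       # drop comments (whole-line and trailing)
            # field 1 is the name, field 2 port/proto, the rest are aliases
            for (i = 1; i <= NF; i++)
                if (i != 2 && $i == name) { print $2; break }
        }
    ' "${2:-/etc/services}"
}
```

`service_entries http-alt` then prints one port/protocol pair per active line, so leftover duplicates are easy to spot before running services_mkdb.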

Many greetings
Matthias


How to bind bozohttpd / inetd to port 8080?

2021-12-18 Thread Matthias Petermann



Hello all,

I am currently trying to have bozohttpd listen on port 8080 instead of 
port 80 via inetd.


In /etc/services there is an entry "http-alt" for this.

However, when I set in the /etc/inetd.conf:

```
http-alt stream tcp nowait:600 _httpd /usr/libexec/httpd httpd -L wol /var/www/wol.lua /var/www
```

...I still cannot access port 8080.

Starting inetd in Debug mode shows:

```
extranet$ doas inetd -d /etc/inetd.conf
ADD : http-alt proto=tcp, wait.max=0.600, user:group=_httpd:(null) builtin=0 server=/usr/libexec/httpd policy=
registered /usr/libexec/httpd on 6
```

Did I miss something? If it matters, it is NetBSD 9.2_STABLE.

Many greetings
Matthias


Re: Release

2021-12-17 Thread Matthias Petermann



Am 17.12.2021 um 14:33 schrieb Martin Husemann:

On Fri, Dec 17, 2021 at 08:27:56PM +0800, Piper H wrote:
solutions than OpenBSD. NetBSD also has a very strong commitment to
binary compatibility with old releases *and* the slowest release cycle
(which also means active support for old release lasts pretty long). This
is bad for you if you are waiting for support for something new in an
official release, but it is helpful if you run several machines and don't
want to change things when you can avoid it.


...and it makes NetBSD an actual option as a foundation for long term 
supported appliances, even if the vendor is only a one man company or a 
small company. That is my experience at least.



And of course NetBSD has the most friendly and welcoming community ;-)


In today's world, that's worth more than running after every technical 
"innovation" ;-)


Kind regards
Matthias


Re: Release

2021-12-16 Thread Matthias Petermann



Hi Martin,

Am 16.12.2021 um 10:32 schrieb Martin Husemann:

On Wed, Dec 15, 2021 at 07:48:18PM +0000, Todd Gruhn wrote:

When is the next official NetBSD release?


The branch for 10 will need to happen "soon", but there is no fixed date
yet. Details at:

https://wiki.NetBSD.org/releng/netbsd-10/

(which is only outdated by ~2 weeks now - will update the page later this week)

Martin


Thanks for bringing this up. Just out of curiosity - I've recently seen
some updates in current related to NVIDIA/Radeon graphics cards. Are
these already the first signs of "it looks like the DRM branch can be
merged before the branch"?


No matter when it comes, NetBSD 10 will be a great release - not least 
because it supports ACLs :-)


Kind regards
Matthias


Re: Release / NetBSD as mobile OS

2021-12-16 Thread Matthias Petermann



Am 16.12.2021 um 07:33 schrieb Piper H:

Is there a mobile OS based on BSD, besides OSX?


That depends on how you define mobile OS. Basically there is everything 
you need in NetBSD to make it a usable OS for mobile devices. To get an 
idea of this, I recommend this blog post from 2017. Under "Device Driver 
Support", it goes into particular detail about many aspects that are 
relevant for use on mobile devices:


https://blog.netbsd.org/tnf/entry/netbsd_on_allwinner_socs_update

And yes - I also wish that one day I can have NetBSD on my cell phone :-)

Kind regards
Matthias


Re: Accessing BIOS Partition from NetBSD

2021-11-11 Thread Matthias Petermann

Hi,


Am 10.11.21 um 21:27 schrieb evil cRaftKnife:

Hi

I am currently planning the install of NetBSD on my new notebook. My 
plan is to install the OS onto the first BIOS partition and use a second 
partition to be /home on ZFS.


I created the two partitions at install time in my practice run but 
couldn't find any comprehensive instructions on how to access the second 
BIOS partition or even what it's device name is under /dev.


If the notebook is new enough to support EFI, I would boot in EFI mode
and install to GPT instead of BIOS partitions. This is relatively easy
with NetBSD 9.2. I would first use sysinst to configure only the EFI
system partition (FAT, 512 MB), the root partition (FFS, >= 16 GB) and
the swap partition, leaving the space at the end of the disk
unconfigured.


I would perform the installation according to documentation. After 
installation, the remaining space can be allocated to another GPT 
partition and initialized as ZFS pool. The following steps roughly 
describe what should be done next:


1) Make sure ZFS is started on boot

```
# echo 'zfs=YES' >> /etc/rc.conf
```

2) Create GPT partition (assuming drive is wd0)

```
# gpt add -l tank -a 1m -t fbsd-zfs wd0
```

3) Create pool and dataset for home (use dk device resulting from step 2)

```
# zpool create tank /dev/dk<...>
# zfs create tank/home
```

4) Optional: migrate home directory

```
# cp -R -v /home/* /tank/home
```

5) Set mountpoint for new home (will overlay existing /home)

```
# zfs set mountpoint=/home tank/home
```

This setup works fine with NetBSD 9.2 on my Thinkpad from 2013.

Kind regards
Matthias


Re: RAIDframe boot issues on amd64 (BIOS boot)

2021-10-24 Thread Matthias Petermann

On 22.10.21 07:58, Matthias Petermann wrote:

## Questions

  - does anyone know what the output of @'s means? Unfortunately I could
not derive this from the source code of bootxx_ffsv2, and I also did not
find anything in the error code explanations in the wiki[2]


  - could it be a BIOS limitation of my specific model and if so - how 
to explain it?


  - if the workaround is to have the wedges dk0/dk1 start in sector 63 - 
what effect does this have on the alignment? (I had read that an offset 
of 2048 is favorable).


Actually, my third question is answered for now. According to [1], an 
offset of 63 is not desirable at all. So I guess using this offset to 
work around the boot issues is not a good idea.
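
The alignment point can be checked with a little arithmetic. This is a minimal sketch, assuming 512-byte logical sectors and a 4096-byte boundary; both values are assumptions and were not read from the disks in question, and is_aligned is a hypothetical helper.

```sh
# Report whether a partition start sector falls on an alignment boundary.
# Usage: is_aligned START_SECTOR [SECTOR_BYTES] [BOUNDARY_BYTES]
# Defaults (assumed): 512-byte sectors, 4096-byte boundary.
is_aligned() {
    start=$1
    sector_bytes=${2:-512}
    boundary=${3:-4096}
    if [ $(( start * sector_bytes % boundary )) -eq 0 ]; then
        echo aligned
    else
        echo misaligned
    fi
}

is_aligned 63      # 63 * 512 = 32256 bytes, not a multiple of 4096
is_aligned 2048    # 2048 * 512 = 1048576 bytes, exactly 1 MiB
```

Under these assumptions, a wedge starting at sector 63 is misaligned, while one starting at sector 2048 is aligned even to a 1 MiB boundary (`is_aligned 2048 512 1048576`).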


So the key to the issue might still be to find out what the @'s mean :-/ 
I would also be very thankful if someone could review my installation 
procedure (see initial mail) in case I missed an important step or 
did something unusual that leads to different results depending on 
the actual hardware configuration.


Kind regards
Matthias


[1] 
https://www.thomas-krenn.com/en/wiki/Partition_Alignment_detailed_explanation


RAIDframe boot issues on amd64 (BIOS boot)

2021-10-21 Thread Matthias Petermann

Hello all,

I have a non-UEFI PC with BIOS boot. The PC has two hard disks of the 
same size that I use to configure a RAIDframe RAID 1 on which the root 
filesystem lives and where I want to boot from. Depending on the start 
sector of the partitions that form the RAID components, there is a 
different behavior at boot for which I am looking for the cause.


## Preconditions / Setup

The system I use is NetBSD/amd64 9.2_STABLE, built from the sources 
around 2021-09-14.


Creating and partitioning the RAID 1 through the sysinst menus works 
largely well. I have tried various combinations of partitioning schemes 
and have found the following to be the most convenient for my purpose:


1) Partitioning the raw disks wd0/wd1 via GPT partition table (a single 
RAID partition on each - wedge dk0/dk1).

2) Creation of the RAID from wedges dk0/dk1 as components
3) partitioning of RAID by NetBSD disklabel (raid0a is root, raid0b swap 
and so on)


These three steps, as well as the installation of the sets / 
configuration of network, user account and services to be started at 
boot time can be done completely in sysinst. There are only a few small 
things to refine in the shell after the installation is complete:


1) Set the GPT partitions to active and write the GPT-MBR boot code to 
sector 0.


```
gpt biosboot -A -i 1 wd0
gpt biosboot -A -i 1 wd1
```

2) write the primary bootstrap code PBR to the beginning of the RAID 
partitions (writing via sysinst should be skipped, because sysinst tries 
to write to raid0a which is not correct according to wiki[1])


```
installboot -o timeout=30 -v /dev/rdk0 /usr/mdec/bootxx_ffsv2
installboot -o timeout=30 -v /dev/rdk1 /usr/mdec/bootxx_ffsv2
```

3) set raid0 to forceroot

```
raidctl -A forceroot raid0
```

## Observations

There are two cases depending on the start sector of the RAID wedges

a) if wedges dk0/dk1 each start in sector 63

 - the system boots without problems.

b) if wedges dk0/dk1 each start in sector 2048

 - the following output appears:

```
NetBSD/x86 ffsv2 Primary Bootstrap

@
>> NetBSD/x86 BIOS Boot, Revision 5.11 
>> Memory: 610/...
Press return to boot.
booting hd0a:netbsd - starting in
open netbsd: input/output error
```

 - the @'s appear one by one, roughly one per second. During this process, 
the drive wd0 LED flashes, so it looks like it is seeking something


 - finally it seems to find the (secondary?) boot program, but this 
doesn't seem to find raid0a


 - anyway, at this stage I can enter the boot prompt. From there, I can 
get a successful boot by entering "raid0a:netbsd"


## Further Detail

Interesting fact: I did the identical installation in a VirtualBox. 
Here, both cases a) and b) work flawlessly.


## Questions

 - does anyone know what the output of @'s means? Unfortunately, I could 
not derive this from the source code of bootxx_ffsv2, and I did not find 
anything in the error code explanations in the wiki[2]


 - could it be a BIOS limitation of my specific model and if so - how 
to explain it?


 - if the workaround is to have the wedges dk0/dk1 start in sector 63 - 
what effect does this have on the alignment? (I had read that an offset 
of 2048 is favorable).



Many thanks in advance & kind regards
Matthias


[1] https://www.netbsd.org/docs/guide/en/chap-rf.html#chap-rf-moving-files

[2] https://wiki.netbsd.org/tutorials/how_netbsd_boots_on_x86/


Re: 2-year old change not pulled into 9.2

2021-09-21 Thread Matthias Petermann

Hello Jan,

On 22.09.21 03:05, Jan Schaumann wrote:

Hello,

Back in 2019, I reported a bug in cp(1):
https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=54564

This was promptly fixed in src/bin/cp/utils.c rev 1.47
on 2019-09-23, but it looks like this change is not
included in cp(1) in NetBSD 9.2:

$ ident `which cp` | grep utils.c
/bin/cp:
  $NetBSD: utils.c,v 1.46 2018/07/17 13:04:58 darcy Exp $
$ uname -a
NetBSD apue 9.2 NetBSD 9.2 (GENERIC) #0: Wed May 12 13:15:55 UTC 2021
mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
$

I see that the last revision to be tagged
netbsd-9-2-RELEASE was 1.46.  But 9.2 was released on
2021-05-12 -- wouldn't it be reasonable to expect that
release to include changes a bit more recent than from
1.5 years prior?  How come later versions were not
tagged for this release?

-Jan



your question reminds me of a similar one I had a few years ago. With 
the exception that there are currently no "teeny" releases, I find the 
presentation in the Release Glossary helpful:


https://www.netbsd.org/releases/release-map.html#graph1

In your case, the release branch "netbsd-9" was probably already created 
when your patch was merged into current. This means that a transfer to 
the release branch requires a pull-up request:


https://www.netbsd.org/developers/releng/pullups.html

I am not a NetBSD developer, but as I understand it, this does not 
happen automatically: the pull-up request is made by the NetBSD 
developer who commits the patch to current. Whether a pull-up 
request is made is decided on a risk/benefit basis, and the developer may 
not always be able to decide this immediately. It might help to include 
a risk/benefit assessment when submitting the patch, which opens the 
discussion about a potential pull-up request. For a minor issue, I have 
also asked in a PR for a pull-up request when I saw that it was already 
fixed in current, a situation similar to yours:


http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=52630

That had worked well back then and it would be worth a try to aim for 
that for 9.3.
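
To check which revision of a source file an installed binary was built from, the ident output quoted above can be filtered. This is a minimal sketch, assuming a POSIX shell and awk; the helper name rcs_rev is made up for illustration.

```sh
# Print the RCS revision of one source file from ident(1) output on stdin.
# $1 is the file name as it appears in the $NetBSD$ tag, e.g. utils.c.
# rcs_rev is a hypothetical helper, not an existing tool.
rcs_rev() {
    awk -v f="$1,v" '$1 == "$NetBSD:" && $2 == f { print $3 }'
}

# Sample line copied from the quoted message.
printf '%s\n' '$NetBSD: utils.c,v 1.46 2018/07/17 13:04:58 darcy Exp $' | rcs_rev utils.c
```

On a live system the input would come from `ident /bin/cp`. A result below 1.47 means the fix from PR 54564 is not in that binary.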


Kind regards
Matthias





Re: Modern Cyrus IMAP and Apache Guacamole on NetBSD?

2021-09-09 Thread Matthias Petermann



Hi Carl,

On 09.09.21 05:50, Carl Brewer wrote:


G'day,

Looking at pkgsrc, the version of cyrus imap seems to be pretty ancient, 
2.4.20.something. I want to run a much more recent version, but don't 
have the skills or time to maintain anything in pkgsrc.  Has anyone here 
looked at, or is, running a more current version?  Maybe just compiled 
by hand? Any war stories?




A while ago I added version 3.0.13 to pkgsrc-wip[1] because I needed the 
CalDAV/CardDAV functionality. This package should currently also be 
buildable to evaluate it. I can help with the configuration. In the 
future, I would like to bring this to a newer version - 3.4.2 is now 
stable. I will certainly tackle this in autumn, but I am always grateful 
for help as I am not yet a pkgsrc expert myself and have to learn a lot 
from other packages ;-)


Best regards
Matthias


[1] 
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=tree;f=cyrus-imapd30;hb=HEAD


Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-17 Thread Matthias Petermann

Hello all,

The story is slowly coming to a conclusion and I would like to describe 
my observations for the sake of completeness.


According to [1], SATA/ATA on NetBSD does not support hot swap. 
Therefore, I shut down the NAS and swapped the disk with the power off.


I installed the drive as it came out of the box, i.e. I did not make 
any special preparations.


Because the new drive did not have any partitions when the system booted, 
the wedge of the last (non-defective) hard disk, previously "dk3", moved 
up and was assigned "dk2". After I created the partition on wd2, it was 
recognised as "dk3". The result was this:


```
# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            dk0                   ONLINE       0     0     0
            dk1                   ONLINE       0     0     0
            12938104637987333436  OFFLINE      0     0     0  was /dev/dk2
            11417607113939770484  UNAVAIL      0     0     0  was /dev/dk3

errors: No known data errors
```

After another reboot, the order was correct again:

```
saturn$ doas zpool status
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 140K in 0h0m with 0 errors on Sat Jul 17 08:14:34 2021
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            dk0                   ONLINE       0     0     0
            dk1                   ONLINE       0     0     0
            12938104637987333436  OFFLINE      0     0     0  was /dev/dk2
            dk3                   ONLINE       0     0     1

errors: No known data errors
```

However, a "1" now appears under CKSUM in the dk3 statistics.

I then initiated the replacement of the ZFS component as follows:

```
saturn$ doas zpool replace tank /dev/dk2
```

With the result:

```
saturn$ doas zpool status
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jul 17 08:18:56 2021
        16.0G scanned out of 5.69T at 123M/s, 13h24m to go
        3.87G resilvered, 0.27% done
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            dk0                     ONLINE       0     0     0
            dk1                     ONLINE       0     0     0
            replacing-2             OFFLINE      0     0     0
              12938104637987333436  OFFLINE      0     0     0  was /dev/dk2/old
              dk2                   ONLINE       0     0     0  (resilvering)
            dk3                     ONLINE       0     0     1

errors: No known data errors
```

So things are looking good for the time being. I'll keep an eye on 
whether the CKSUM error also clears up in the course of this, or whether 
another problem is waiting for me here ;-)
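
Keeping an eye on the counters can be scripted. This is a minimal sketch, assuming the column layout of the `zpool status` output shown above (name, state, READ, WRITE, CKSUM); the helper name zpool_errors is made up for illustration.

```sh
# List vdevs whose READ, WRITE or CKSUM counter is non-zero.
# Reads `zpool status` output on stdin. zpool_errors is a hypothetical helper.
zpool_errors() {
    awk '$2 ~ /^(ONLINE|DEGRADED|OFFLINE|UNAVAIL|FAULTED)$/ &&
         ($3 + $4 + $5) > 0 { print $1, "read=" $3, "write=" $4, "cksum=" $5 }'
}
```

It would be used as `zpool status tank | zpool_errors`. For the status output above it would report only dk3 with cksum=1. Empty output means all counters are zero.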


I still have one small question: when initialising the RAID, I had based 
it on GPT partitions so that I could use the full storage capacity of 
the disks (instead of the 2 TB limit with disklabel) and also leave some 
buffer space free in case a replacement drive has a few sectors fewer 
than the existing ones. Now it looks as if the dynamic numbering of the 
wedges at boot time unnecessarily endangers the RAID when a disk is 
replaced. Hence the question: is there a better option than referring to 
the wedges by number? I remember that I also tried the variant with the 
label, NAME=zfs2, when I created it (as it works with newfs, for 
example), but it didn't work. As a workaround I could have prepared 
the disk on another system; I will know that for next time.
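
The 2 TB figure can be reproduced with a short calculation. This is a minimal sketch, assuming 32-bit sector counts in the disklabel and 512-byte sectors, so that 2^32 sectors correspond to 2 TiB. The helper name fits_32bit is made up, and the sector counts are typical values for 2 TB and 4 TB drives, not read from the NAS.

```sh
# Check whether a disk of SECTORS sectors fits into a 32-bit sector count.
# Assumption: 512-byte sectors, so 2^32 sectors correspond to 2 TiB.
# fits_32bit is a hypothetical helper.
fits_32bit() {
    if [ "$1" -lt 4294967296 ]; then
        echo yes
    else
        echo no
    fi
}

fits_32bit 3907029168   # typical 2 TB drive
fits_32bit 7814037168   # typical 4 TB drive
```

Under these assumptions, the 4 TB case does not fit, which is why GPT is needed for the disks in this pool.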



Kind regards
Matthias



[1] https://mail-index.netbsd.org/netbsd-users/2011/01/28/msg007735.html





Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann

Hi,

On 16.07.21 23:21, RVP wrote:

On Fri, 16 Jul 2021, Matthias Petermann wrote:

I will overwrite the disk with zeros once as a test. According to the 
S.M.A.R.T. values, the number of "pending" sectors has already 
decreased - from 18 to 15.


```
197 200    0 no  online  positive    Current pending sector  15
```



I would replace that drive, mate. 18 bad sectors on a fairly new drive
(some propagated all the way up to the OS) is a bit worrisome...



After overwriting the whole drive with zeros, there are no pending 
sectors anymore:


```
saturn$ doas atactl wd2 smart status
SMART supported, SMART enabled
id value thresh crit collect reliability description                 raw
  1 197   51 yes online  positive    Raw read error rate         39550
  3 176   21 yes online  positive    Spin-up time                6158
  4 100    0 no  online  positive    Start/stop count            510
  5 200  140 yes online  positive    Reallocated sector count    0
  7 200    0 no  online  positive    Seek error rate             0
  9  64    0 no  online  positive    Power-on hours count        26807
 10 100    0 no  online  positive    Spin retry count            0
 11 100    0 no  online  positive    Calibration retry count     0
 12 100    0 no  online  positive    Device power cycle count    506
192 200    0 no  online  positive    Power-off retract count     99
193 200    0 no  online  positive    Load cycle count            2679
194 118    0 no  online  positive    Temperature                 32
196 200    0 no  online  positive    Reallocated event count     0
197 200    0 no  online  positive    Current pending sector      0
198 100    0 no  offline positive    Offline uncorrectable       0
199 200    0 no  online  positive    Ultra DMA CRC error count   0
200 100    0 no  offline positive    Write error rate            0
```

This reflects exactly what Greg mentioned and it is good to know.

Of course, I don't trust this drive anymore and will replace it.
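
To track a single attribute such as the pending sector count over time, the raw value can be cut out of the atactl output. This is a minimal sketch, assuming the raw value is always the last column as in the listings in this thread; the helper name smart_raw is made up for illustration.

```sh
# Print the raw value (last column) of one SMART attribute.
# Reads `atactl <disk> smart status` output on stdin; $1 is the attribute id.
# smart_raw is a hypothetical helper, not part of atactl.
smart_raw() {
    awk -v id="$1" '$1 == id { print $NF }'
}

# Sample line copied from this thread.
printf '%s\n' '197 200    0 no  online  positive    Current pending sector  15' | smart_raw 197
```

On the NAS this would be used as `atactl wd2 smart status | smart_raw 197`, run with the same privileges as the atactl calls above.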

Kind regards
Matthias





Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann

Hi Michael,

On 16.07.21 16:46, Michael van Elst wrote:


smartmontools has more features and also understands rare setups
with e.g. RAID controllers, early USB enclosures or vendor-specific
(usually undocumented) parameters. It also comes with smartd to
monitor drives continuously.

For plain SATA drives atactl is fine. It can also pass SMART commands
to most USB drives that attach as sdX.



I was lucky - in my case, atactl is apparently sufficient for the 
basics. Not so lucky, but also no surprise, is the result of the self-test:


```
saturn$ doas atactl wd2 smart selftest-log
SMART supported, SMART enabled
Log entry: 21
Name: Off-line
Status: No error
Log entry: 0
Name: Short off-line
Status: Read element of test failed
LBA first error: 1375756400
saturn$
```

I will overwrite the disk with zeros once as a test. According to the 
S.M.A.R.T. values, the number of "pending" sectors has already decreased 
- from 18 to 15.


```
197 200    0 no  online  positive    Current pending sector  15
```

Kind regards
Matthias





Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann

Hello Greg,

On 14.07.21 14:10, Greg Troxel wrote:

I think you may have uncovered a bug in zfs statistics.

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            dk0     ONLINE       0     0     0
            dk1     ONLINE       0     0     0
            dk2     ONLINE       0     0     0
            dk3     ONLINE       0     0     0


It really seems like dk2 (assuming dk2 == wd2) should have some read errors.


...


I bet zfs got a read failed and did the reconstruction but didn't log
it.   But I'm guessing that that's a good thing to figure out.

...

Probably if you take that drive out and put it in a test box and write
zeros to the whole drive and then read back it will be sort of ok, but I
wouldn't trust it.



Thank you very much for all the background information. I'll pick out 
the topic of ZFS logs here, because from my point of view this is the 
most exciting question at the moment. If it is the case that ZFS has 
corrected the error - then it would be really great if it also displays 
this somewhere. At the moment, I haven't restarted the system and 
haven't changed the defective disk yet. Is there any chance to get this 
information? Would a core dump help?


Many greetings
Matthias






Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann

Hello all,

Thank you very much for your valuable advice! I will add the 
smartmontools to my custom repository today so that I can install it on 
the NAS. In the meantime, I had another look at atactl - it seems to 
offer the possibility of reading out the error memory or starting a 
self-test. I have now done that, but I suspect that smartmontools 
probably offers more features.


```
saturn$ doas atactl wd2 smart error-log
SMART supported, SMART enabled
No errors have been logged
saturn$ doas atactl wd2 smart selftest-log
SMART supported, SMART enabled
No self-tests have been logged
saturn$ doas atactl wd2 smart offline 1
SMART supported, SMART enabled
saturn$
```

Best regards
Matthias






ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-14 Thread Matthias Petermann

Hello all,

I run a NetBSD-based NAS at home. It is currently running on NetBSD 9.1. 
The system is booted from a USB stick on which the root file system is 
also located. The storage is on 4 x 4 TB magnetic hard disks, configured 
as ZFS RAIDZ2.


Earlier I noticed that the I/O performance of the system suddenly 
collapsed drastically. A look at the syslog gives a pretty clear 
indication of the reason:


```
[ 87240.313853] wd2: (uncorrectable data error)
[ 87240.313853] wd2d: error reading fsbn 5707914328 of 5707914328-5707914455 (wd2 bn 5707914328; cn 5662613 tn 6 sn 46)
[ 87465.637977] wd2d: error reading fsbn 5710464152 of 5710464152-5710464215 (wd2 bn 5710464152; cn 5665143 tn 0 sn 8), xfer 338, retry 0
[ 87465.637977] wd2: (uncorrectable data error)
[ 87475.561683] wd2: soft error (corrected) xfer 338
[ 87506.393194] wd2d: error reading fsbn 5710555128 of 5710555128-5710555255 (wd2 bn 5710555128; cn 5665233 tn 4 sn 12), xfer 40, retry 0
[ 87506.393194] wd2: (uncorrectable data error)
[ 87515.156465] wd2d: error reading fsbn 5710555128 of 5710555128-5710555255 (wd2 bn 5710555128; cn 5665233 tn 4 sn 12), xfer 40, retry 1
```
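
The distinct block numbers in a log like the one above can be pulled out to see how many places on the disk are affected. This is a minimal sketch, assuming the message format shown here ("error reading fsbn N of ..."); the helper name failing_blocks is made up for illustration.

```sh
# Print each distinct block number from "error reading fsbn N" kernel
# messages, in ascending order. Reads log lines on stdin.
# failing_blocks is a hypothetical helper.
failing_blocks() {
    sed -n 's/.*error reading fsbn \([0-9][0-9]*\) .*/\1/p' | sort -n -u
}
```

It would be used as `dmesg | failing_blocks`. Retries of the same block collapse into one line.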

The whole syslog is full of these messages. What surprises me is that 
there are "uncorrectable" data errors in the syslog. Nevertheless, the 
data can still be read - albeit very slowly. My assumption was that the 
redundancies of RAIDZ2 are being used to compensate for the defects. To 
my surprise, ZFS does not seem to have noticed any of these defects:



```
saturn$ doas zpool status
  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            dk0     ONLINE       0     0     0
            dk1     ONLINE       0     0     0
            dk2     ONLINE       0     0     0
            dk3     ONLINE       0     0     0

errors: No known data errors
```

Another indication that ZFS has not yet noticed the error: with top, 
there is no significant CPU load during I/O, neither in the user nor the 
system area. I would have expected this at least in the case when ZFS 
works with redundancies.


So it looks like the hardware error can still be corrected as far as 
possible at the level of the device driver, which makes me doubt the 
truth of the statement "uncorrectable data error".


Does anyone know what would have to happen for ZFS to notice the 
hardware defect?


Next, I will try to take the wd2 (dk2) component offline.

For the sake of completeness, here is the S.M.A.R.T. output, even though 
I find it difficult to interpret:


```
saturn$ doas atactl wd2 smart status
SMART supported, SMART enabled
id value thresh crit collect reliability description                 raw
  1 197   51 yes online  positive    Raw read error rate         38669
  3 176   21 yes online  positive    Spin-up time                6158
  4 100    0 no  online  positive    Start/stop count            510
  5 200  140 yes online  positive    Reallocated sector count    0
  7 200    0 no  online  positive    Seek error rate             0
  9  64    0 no  online  positive    Power-on hours count        26740
 10 100    0 no  online  positive    Spin retry count            0
 11 100    0 no  online  positive    Calibration retry count     0
 12 100    0 no  online  positive    Device power cycle count    506
192 200    0 no  online  positive    Power-off retract count     99
193 200    0 no  online  positive    Load cycle count            2672
194 117    0 no  online  positive    Temperature                 33
196 200    0 no  online  positive    Reallocated event count     0
197 200    0 no  online  positive    Current pending sector      18
198 100    0 no  offline positive    Offline uncorrectable       0
199 200    0 no  online  positive    Ultra DMA CRC error count   0
200 100    0 no  offline positive    Write error rate            0
```

Kind regards
Matthias





Re: NetBSD hdaudio @ HDMI

2021-07-12 Thread Matthias Petermann

Hello Jared and Nia,

Thank you very much for both your answers - they were the decisive hints 
:-) With the addition of the HDAUDIO_ENABLE_DISPLAYPORT option, I now 
have at least partial audio output, which is a huge improvement over the 
previous state.


There are still some small limitations at the moment, which I can live 
with using workarounds:


1) The output only works on the left channel, the right one remains 
silent. The whole cable route is ok, tested with another OS.


2) The main volume cannot be controlled with mixerctl. Regardless of the 
settings in mixerctl, the output is always at perceived maximum volume.


Have you experienced this before?

Many greetings
Matthias

```
mpeterma@workstation ~> audiocfg list
0: [*] audio0 @ hdafg0: Intel product 280b
   playback: 2ch, 48000Hz
   record:   2ch, 48000Hz
   (P-) slinear_le 16/16, 2ch, { 48000 }
   (P-) slinear_le 16/16, 4ch, { 48000 }
   (P-) slinear_le 16/16, 6ch, { 48000 }
   (P-) slinear_le 16/16, 8ch, { 48000 }
   (PR) slinear_le 16/16, 2ch, 48000-48000Hz


mpeterma@workstation ~> audioctl -a
name=Intel
version=product 280b
config=01h
encodings=mulaw:8*,alaw:8*,slinear:8*,ulinear:8*,slinear_le:16*,ulinear_le:16*,slinear_be:16*,ulinear_be:16*,slinear_le:32*,ulinear_le:32*,slinear_be:32*,ulinear_be:32*
properties=full_duplex,mmap,independent
full_duplex=1
fullduplex=1
blocksize=2048
hiwat=32
lowat=24
monitor_gain=0
mode=
play.rate=48000
play.channels=2
play.precision=32
play.encoding=slinear_le
play.gain=0
play.balance=32
play.port=0x0
play.avail_ports=0x0
play.seek=0
play.samples=0
play.eof=0
play.pause=0
play.error=0
play.waiting=0
play.open=0
play.active=1
play.buffer_size=65536
record.rate=8000
record.channels=1
record.precision=8
record.encoding=mulaw
record.gain=127
record.balance=32
record.port=0x0
record.avail_ports=0x0
record.seek=0
record.samples=0
record.eof=0
record.pause=0
record.error=0
record.waiting=0
record.open=0
record.active=0
record.buffer_size=65536
record.errors=0


mpeterma@workstation ~> mixerctl -a
outputs.master=0,0
outputs.master.mute=off
outputs.dacsel=DP00
```


On 08.07.21 01:48, Jared McNeill wrote:

On Mon, 5 Jul 2021, Matthias Petermann wrote:

Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0: DP00 8ch: 
Digital Out [Jack]


[...]

I already have a self-built kernel with options HDAUDIO_ENABLE_HDMI in 
use[1]. However - this option is only documented in HDAUDIO(4) in 
current. I myself use NetBSD 9.2_STABLE here. Should this also work 
with it, or is this already my misunderstanding?


"DP00" means it reports itself as a display port connection and you need 
options HDAUDIO_ENABLE_DISPLAYPORT to attach an audio device to it.


Take care,
Jared






uaudio audiocfg: write: Resource temporarily unavailable (Re: NetBSD hdaudio @ HDMI)

2021-07-07 Thread Matthias Petermann

On 07.07.21 14:51, Matthias Petermann wrote:


For now, I'm going to try a workaround with a USB audio interface and 
try my luck. If I have misunderstood anything else, please correct me.




...which unfortunately doesn't work as well :-(

```
[ 16802.542625] uaudio0 at uhub1 port 5 configuration 1 interface 0
[ 16802.542625] uaudio0: C-Media Electronics Inc. (0xd8c) USB Audio Device (0x14), rev 1.10/1.00, addr 8
[ 16802.542625] uaudio0: audio rev 1.00
[ 16802.542625] audio0 at uaudio0: playback, capture, full duplex, independent
[ 16802.542625] audio0: slinear_le:16 2ch 48000Hz, blk 11520 bytes (60ms) for playback
[ 16802.542625] audio0: slinear_le:16 1ch 48000Hz, blk 6000 bytes (62.5ms) for recording
[ 16802.542625] spkr1 at audio0: PC Speaker (synthesized)
[ 16802.542625] wsbell at spkr1 not configured
[ 16802.542625] uhidev6 at uhub1 port 5 configuration 1 interface 3
[ 16802.542625] uhidev6: C-Media Electronics Inc. (0xd8c) USB Audio Device (0x14), rev 1.10/1.00, addr 8, iclass 3/0
[ 16802.552622] uhid10 at uhidev6: input=4, output=4, feature=0


root@workstation /h/mpeterma# audiocfg list
0: [ ] audio0 @ uaudio0: USB audio
   playback: 2ch, 48000Hz
   record:   1ch, 48000Hz
   (P-) slinear_le 16/16, 2ch, { 48000, 44100 }
   (-R) slinear_le 16/16, 1ch, { 48000, 44100 }

root@workstation /h/mpeterma# audiocfg test 0
0: [ ] audio0 @ uaudio0: USB audio
   playback: 2ch, 48000Hz
   record:   1ch, 48000Hz
   (P-) slinear_le 16/16, 2ch, { 48000, 44100 }
   (-R) slinear_le 16/16, 1ch, { 48000, 44100 }
  testing channel 0...audiocfg: write: Resource temporarily unavailable
```

Is this a known problem, or do I just have bad luck?

Off the top of my head, my NUC7 only has USB 3.0 ports. Maybe that's the 
problem and I'll try it again with another PC that also has USB 2.0. 
Otherwise, I'm at my wit's end here.


Kind regards
Matthias





Re: NetBSD hdaudio @ HDMI

2021-07-07 Thread Matthias Petermann

Hello all,

In response to my basic question of whether HDMI audio also works in 
NetBSD 9.2_STABLE, I received a positive experience report by private 
mail. Many thanks for that!


As far as my specific problem is concerned, I have to admit that I was 
far too naive. I should have looked at the PCI vendor and product IDs in 
the hdaudio driver - the selection of supported Intel chips is very 
limited and the ID of my chipset is not included:


```
/* Intel */
product INTEL   Q57_HDMI        0x0054  Q57 HDMI
product INTEL   G45_HDMI_1      0x2801  G45 HDMI/1
product INTEL   G45_HDMI_2      0x2802  G45 HDMI/2
product INTEL   G45_HDMI_3      0x2803  G45 HDMI/3
product INTEL   G45_HDMI_4      0x2804  G45 HDMI/4
product INTEL   G45_HDMI_FB     0x29fb  G45 HDMI/FB
```

So it can't work at all, and I'm afraid that adding the ID of my device 
to the G45 driver is not enough.


For now, I'm going to try a workaround with a USB audio interface and 
try my luck. If I have misunderstood anything else, please correct me.


Many greetings
Matthias



On 05.07.21 01:08, Matthias Petermann wrote:

Hello everybody,

I have an Intel NUC7 that does not have an analog output for audio. 
Consequently, I have to rely on the output via HDMI. So far, I did not 
have luck with this. The following audio devices are recognised:


```
workstation$ audiocfg list
0: [ ] audio0 @ uaudio0: USB audio
    playback: unavailable
    record:   1ch, 48000Hz
    (-R) slinear_le 16/16, 1ch, { 8000, 11025, 16000, 22050, 32000, 44100, 48000 }

1: [ ] audio1 @ uaudio1: USB audio
    playback: unavailable
    record:   1ch, 48000Hz
    (-R) slinear_le 16/16, 1ch, { 16000 }
    (-R) slinear_le 16/16, 1ch, { 24000 }
    (-R) slinear_le 16/16, 1ch, { 32000 }
    (-R) slinear_le 16/16, 1ch, { 48000 }
```

The two devices detected are USB microphones (webcam and another 
microphone). The hdaudio device is missing.


In the kernel log, at least the following is output:

```
workstation# grep hd messages
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdaudio0 at pci0 dev 31 function 3: HD Audio Controller
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdaudio0: interrupting at msi4 vec 0
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0 at hdaudio0: Intel product 280b
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0: DP00 8ch: Digital Out [Jack]
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0: 8ch/0ch 48000Hz PCM16*
```

I already have a self-built kernel with options HDAUDIO_ENABLE_HDMI in 
use[1]. However - this option is only documented in HDAUDIO(4) in 
current. I myself use NetBSD 9.2_STABLE here. Should this also work with 
it, or is this already my misunderstanding?



Kind regards
Matthias



[1] 
https://www.netbsd.org/docs/guide/en/chap-audio.html#chap-audio-hdaudio-hdmi 










NetBSD hdaudio @ HDMI

2021-07-04 Thread Matthias Petermann

Hello everybody,

I have an Intel NUC7 that does not have an analog output for audio. 
Consequently, I have to rely on the output via HDMI. So far, I did not 
have luck with this. The following audio devices are recognised:


```
workstation$ audiocfg list
0: [ ] audio0 @ uaudio0: USB audio
   playback: unavailable
   record:   1ch, 48000Hz
   (-R) slinear_le 16/16, 1ch, { 8000, 11025, 16000, 22050, 32000, 44100, 48000 }

1: [ ] audio1 @ uaudio1: USB audio
   playback: unavailable
   record:   1ch, 48000Hz
   (-R) slinear_le 16/16, 1ch, { 16000 }
   (-R) slinear_le 16/16, 1ch, { 24000 }
   (-R) slinear_le 16/16, 1ch, { 32000 }
   (-R) slinear_le 16/16, 1ch, { 48000 }
```

The two devices detected are USB microphones (webcam and another 
microphone). The hdaudio device is missing.


In the kernel log, at least the following is output:

```
workstation# grep hd messages
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdaudio0 at pci0 dev 31 function 3: HD Audio Controller
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdaudio0: interrupting at msi4 vec 0
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0 at hdaudio0: Intel product 280b
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0: DP00 8ch: Digital Out [Jack]
Jul  4 23:23:02 localhost /netbsd: [   1.0088516] hdafg0: 8ch/0ch 48000Hz PCM16*
```

I already have a self-built kernel with options HDAUDIO_ENABLE_HDMI in 
use[1]. However - this option is only documented in HDAUDIO(4) in 
current. I myself use NetBSD 9.2_STABLE here. Should this also work with 
it, or is this already my misunderstanding?



Kind regards
Matthias



[1] 
https://www.netbsd.org/docs/guide/en/chap-audio.html#chap-audio-hdaudio-hdmi






Re: NetBSD as IPSEC client with NAT-T / PSK / XAUTH - can't get it to work :-/

2021-06-20 Thread Matthias Petermann

Hi Dima,

I just wanted to drop you a line and thank you again for your reply. It 
definitely helped me to understand IPSEC better. Unfortunately, or 
fortunately, depending on how you look at it, the requirements for my 
project have changed again, which means OpenVPN is back in the running. 
If that remains the case, I will put the IPSEC topic on hold for the time 
being. I would still like to try it out for my own infrastructure and 
will pick it up again when the opportunity arises. I hope this will be 
before the start of autumn.


Many greetings
Matthias

On 10.06.21 at 15:41, Dima Veselov wrote:

On Thu, Jun 10, 2021 at 02:41:23PM +0200, Matthias Petermann wrote:

First of all you forgot to build (or to mention) ipsec tunnel specification
which usually is set in /etc/ipsec.conf. Tunnel specification includes
external addresses, internal addresses and protocols. Most of firewalls
will not answer if offered tunnel specification is not expected.


You are absolutely right - I completely ignored that part. And now I 
think I understand that it is essential. That is, if I have not 
misunderstood, Racoon in its role as key manager does nothing other 
than distribute the keys between the hosts and configure IPSec via 
setkey so that the traffic is encrypted with these keys?


Yes, that's right, IPSEC is fast symmetric packet encryption done
by the kernel, racoon does high security non-symmetric key exchange and
encryption (ISAKMP or IKE) to keep IPSEC keys safe. This combination
make ISAKMP/IPSEC fast and safe on all stages.






Re: web-camera

2021-06-12 Thread Matthias Petermann

Hi Todd,

You could try fswebcam from pkgsrc-wip. This is a command line tool that 
can record from the v4l devices. It works out of the box with the 
built-in webcam on my Thinkpad x230.


Many greetings
Matthias

On 12.06.21 13:07, Todd Gruhn wrote:

I installed Cheese. It does not work on my system.
Any other options for web-cameras?








Re: NetBSD as IPSEC client with NAT-T / PSK / XAUTH - can't get it to work :-/

2021-06-10 Thread Matthias Petermann

Hello Dima,

thank you for your answer. One thing first: I think I underestimated 
the complexity of an IPSEC setup like this. It has too many components 
that I only half understand so far.


On 08.06.21 12:05, Dima Veselov wrote:

On Tue, Jun 08, 2021 at 10:04:44AM +0200, Matthias Petermann wrote:
First of all you forgot to build (or to mention) ipsec tunnel specification
which usually is set in /etc/ipsec.conf. Tunnel specification includes
external addresses, internal addresses and protocols. Most of firewalls
will not answer if offered tunnel specification is not expected.


You are absolutely right - I completely ignored that part. And now I 
think I understand that it is essential. That is, if I have not 
misunderstood, Racoon in its role as key manager does nothing other than 
distribute the keys between the hosts and configure IPSec via setkey so 
that the traffic is encrypted with these keys?



Your second problem is the tunnel itself. Your setup is more likely
site-to-site clean IPSEC, but site-to-site must be configured on both
sides. Client-to-site VPN use different approach.
It is very likely you can't build both from behind same NAT. If your
task is not the digging up into IPSEC you may want to use
security/openvpn from pkgsrc.


I have always preferred OpenVPN in the past because I understood the 
setup there :-) This time IPSEC is a given, because the remote 
station (FritzBox) is an appliance for which OpenVPN is not available as 
an option. I was misled by the fact that the connection setup with the 
Shrew Soft VPN Client was quite easy, but obviously this client also 
hides a lot of fine-grained negotiation with the endpoint from the user.



We have to know what specification router do expect. It should have
some configured options to look at.


I thought I had listed the specification of the router (FritzBox 7490) 
in my original email. The admin interface of the router does not show me 
any further information. It is consumer hardware whose user interface is 
trimmed for simplicity.


On the other hand - if I export the working client configuration of the 
Shrew Soft Client and open it in a text editor, it reveals hardly 
anything special - many parameters are set to "auto" there:


```
n:version:4
n:network-ike-port:500
n:network-mtu-size:1380
n:client-addr-auto:1
n:network-natt-port:4500
n:network-natt-rate:15
n:network-frag-size:540
n:network-dpd-enable:1
n:client-banner-enable:1
n:network-notify-enable:1
n:client-dns-used:1
n:client-dns-auto:0
n:client-dns-suffix-auto:1
n:client-splitdns-used:1
n:client-splitdns-auto:1
n:client-wins-used:1
n:client-wins-auto:1
n:phase1-dhgroup:2
n:phase1-life-secs:86400
n:phase1-life-kbytes:0
n:vendor-chkpt-enable:0
n:phase2-life-secs:3600
n:phase2-life-kbytes:0
n:policy-nailed:0
n:policy-list-auto:1
s:network-host:vpn.example.com
s:client-auto-mode:pull
s:client-iface:virtual
s:network-natt-mode:enable
s:network-frag-mode:enable
s:client-dns-addr:192.168.111.1
s:auth-method:mutual-psk-xauth
s:ident-client-type:keyid
s:ident-server-type:address
s:ident-client-data:user
b:auth-mutual-psk:xx==
s:phase1-exchange:aggressive
s:phase1-cipher:auto
s:phase1-hash:auto
s:phase2-transform:auto
s:phase2-hmac:auto
s:ipcomp-transform:disabled
n:phase2-pfsgroup:-1
s:policy-level:auto
```


There is some gap about IPSEC knowledge. You have tried to use isakmp
daemon (e.g. racoon) only, no IPSEC was ever used. This how it works:

1. Stage 1. isakmp daemon connect to other isakmp daemon and they talk
to each other. Upon success they have mutual security association and
now they can exchange keys. This is called phase 1 or ISAKMP security
association (also known as IKE).

2. Stage 2. Via encrypted channel they exchange keys for IPSEC encryption.
Upon success they reach Phase 2, which is called IPSEC. isakmp daemon
put exchanged keys into kernel and kernel now can encrypt flowing traffic.


Does this mean that the fact that my Racoon does not get an answer from 
the remote station has nothing to do with IPSEC, but is a pure 
UDP-over-NAT issue? And possibly not even that - if I understand 
correctly, one reason for the non-response of the remote station may be 
that the proposal of my racoon is inappropriate or invalid?



You do not need IPSEC_DEBUG because IPSEC work when it is set, kernel do
not know anything before phase 2 is reached. It may call isakmp daemon
if you would have any rules in /etc/ipsec.conf either manually installed
via setkey(8), but its not about the kernel to negotiate the other side,
its all about racoon.
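
For illustration, the kind of tunnel specification mentioned above can be written in setkey(8) policy syntax. The following is only a minimal sketch, not a tested FritzBox setup: the helper name `ipsec_tunnel_policies` is hypothetical, the addresses are taken from the topology in the original post, and with mode_cfg the inner address handed out by the gateway would normally take the place of the LAN address used here.

```shell
#!/bin/sh
# Hypothetical helper: print a pair of ESP tunnel policies in setkey(8) syntax.
#   $1 local address, $2 remote gateway, $3 remote network
ipsec_tunnel_policies() {
    local_ip=$1 remote_gw=$2 remote_net=$3
    cat <<EOF
spdadd ${local_ip}/32 ${remote_net} any -P out ipsec esp/tunnel/${local_ip}-${remote_gw}/require;
spdadd ${remote_net} ${local_ip}/32 any -P in ipsec esp/tunnel/${remote_gw}-${local_ip}/require;
EOF
}

# Print the policies for the topology from the original post.
ipsec_tunnel_policies 192.168.2.177 111.111.111.111 192.168.1.0/24

# On the NetBSD client (as root) they could be loaded and inspected with:
#   ipsec_tunnel_policies ... | setkey -c
#   setkey -DP    # dump the policy database (SPD)
#   setkey -D     # dump the SAs; stays empty until phase 2 has completed
```

Printing the policies instead of installing them directly keeps the sketch harmless to run and makes it easy to review what would be handed to setkey.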

4) What "classic" errors does one usually make in such a setup as 
above and how can I check this? As I said - I'm sure my racoon.conf is 
far from finished. But the fact that no response comes back (or 
doesn't find its way back - NAT-T?) irritates me a lot.


By the way your racoon.conf have sam

NetBSD as IPSEC client with NAT-T / PSK / XAUTH - can't get it to work :-/

2021-06-08 Thread Matthias Petermann

Hello all,

I would like to connect a NetBSD client to a remote LAN via IPSEC. 
Unfortunately, I am stuck at a very early point and would like to know 
if I am making a basic mistake or if I am expecting something from 
NetBSD that it cannot do.



1) Network Topology


[NetBSD-Client IP 192.168.2.177]  [Windows-Client IP 192.168.2.176]
                |                                  |
                |                                  |
(====================== LAN 192.168.2.0/24 =======================)
                                 |
                                 |
['local' Router/NAT IP 192.168.2.254 / external IP 222.222.222.222]
                                 |
                                 |
(============================== WAN ==============================)
                                 |
                                 |
['remote' Router/NAT IP 192.168.1.254 / external IP 111.111.111.111]
                                 |
                                 |
(====================== LAN 192.168.1.0/24 =======================)
                                 |
                                 |
                      [Server IP 192.168.1.1]


2) Remote Endpoint Spec


The 'remote' router is a FritzBox 7490, which is the IPSEC access point. 
According to the manufacturer, the FritzBox supports:


 - VPN connections according to the IPSec standard with ESP, IKEv1 and
   Pre-Shared Keys. Authentication Header (AH) and Perfect Forward
   Secrecy (PFS) are not supported.

 - Supported IPSec algorithms for IKE phase 1:
     Encryption methods: AES with 256, 192, 128 bits, Triple-DES with
     168 bits or DES with 56 bits.
     Hash algorithm: SHA2-512, SHA1 or MD5-96
     The FRITZ!Box initially uses 1024 bits (DH group 2) for key
     exchange via Diffie-Hellman. However, it subsequently also accepts
     768, 1536, 2048 and 3072 bits (DH group 1, 5, 14 and 15).

 - Supported IPSec algorithms for IKE phase 2:
     Encryption methods: AES with 256, 192, 128 bits, Triple-DES with
     168 bits or DES with 56 bits.
     Hash algorithm: SHA2-512, SHA1 or MD5-96
     Diffie-Hellman group is determined by IKE phase 1
     Compression: none


3) Configuration


For the client configuration, I essentially followed:

https://www.netbsd.org/docs/network/ipsec/rasvpn.html#nat_t

FILE: /etc/racoon/racoon.conf:

```
path pre_shared_key "/etc/racoon/psk.txt" ;
log debug2;

remote anonymous
{
        exchange_mode main,base,aggressive;
        my_identifier address 111.111.111.111;
        nat_traversal force;
        ike_frag on;
        esp_frag 540;
        dpd_delay 20;
        mode_cfg on;
        lifetime time 24 hour ; # sec,min,hour
        script "/etc/racoon/phase1-up.sh" phase1_up;
        script "/etc/racoon/phase1-down.sh" phase1_down;

        proposal {
                encryption_algorithm aes;
                hash_algorithm sha1;
                authentication_method xauth_psk_client;
                dh_group 2 ;
        }

        xauth_login "user";
        proposal_check strict;
}

sainfo anonymous
{
        pfs_group 2;
        lifetime time 12 hour ;
        encryption_algorithm aes ;
        authentication_algorithm hmac_sha1, hmac_md5 ;
        compression_algorithm deflate ;
}

```

FILE: /etc/racoon/psk.txt:

```
111.111.111.111 myshareds3cre7
user myxau7hp4ssw0rd
```


4) Test Environment


I am using a NetBSD 9.2_STABLE with a custom kernel that has IPSEC_DEBUG 
enabled.


```
armv7# uname -a
NetBSD armv7 9.2_STABLE NetBSD 9.2_STABLE (GENERIC) #0: Mon Jun  7 07:17:03 CEST 2021  mpeterma@x230Mk10.local:/tank/dev/netbsd/9-STABLE/obj/sys/arch/evbarm/compile/GENERIC evbarm
```

```
armv7# sysctl -a|grep ipsec
net.inet.ipsec.def_policy = 1
net.inet.ipsec.esp_trans_deflev = 1
net.inet.ipsec.esp_net_deflev = 1
net.inet.ipsec.ah_trans_deflev = 1
net.inet.ipsec.ah_net_deflev = 1
net.inet.ipsec.ah_cleartos = 1
net.inet.ipsec.ah_offsetmask = 0
net.inet.ipsec.dfbit = 2
net.inet.ipsec.ecn = 0
net.inet.ipsec.debug = 1
net.inet.ipsec.ipip_spoofcheck = 1
net.inet.ipsec.enabled = 1
net.inet.ipsec.used = 1
net.inet.ipsec.ah_enable = 1
net.inet.ipsec.esp_enable = 1
net.inet.ipsec.ipcomp_enable = 1
net.inet.ipsec.crypto_support = 0
net.inet.ipsec.test_replay = 0
net.inet.ipsec.test_integrity = 0
net.inet6.ipsec6.def_policy = 1
net.inet6.ipsec6.esp_trans_deflev = 1
net.inet6.ipsec6.esp_net_deflev = 1
net.inet6.ipsec6.ah_trans_deflev = 1
net.inet6.ipsec6.ah_net_deflev = 1
net.inet6.ipsec6.ecn = 0
net.inet6.ipsec6.debug = 1
net.inet6.ipsec6.enabled = 1
net.inet6.ipsec6.used = 1
```


5) Test Execution


Starting racoon in foreground to get debug log:

```
armv7# racoon -dd -F -f /etc/racoon/racoon.conf
Foreground mode.
2021-06-08 07:29:22: INFO: @(#)ipsec-tools cvs (http://ipsec-tools.sourceforge.net)
2021-06-08 07:29:22: INFO: @(#)This product linked OpenSSL 1.1.1k  25 Mar 2021 (http://www.openssl.org/)
2021-06-08 07:29:22: INFO: Reading configuration from "/etc/racoon/racoon.conf"
2021-06-08 07:29:22: DEBUG: call pfkey_send_register for AH
2021-06-08 07:29:22: DEBUG: call pfkey_send_register for ESP
2021-06-08 07:29:22: DEBUG: call pfkey_send_register for IPCOMP
2021-06-08 07:29:22: DEBUG: reading config file 
```

Re: Finding out at runtime which IPSEC options are built into the kernel (IPSEC_NAT_T?)

2021-06-07 Thread Matthias Petermann

Hi Andy,

On 06.06.21 14:53, Andy Ruhl wrote:




Hopefully this helps someone searching:

The options(4) man page shows this line:

strings netbsd | sed -n 's/^_CFG_//p' | unvis (note that "netbsd" is
the kernel file, usually at /netbsd)

This will work if the kernel has the INCLUDE_CONFIG_FILE option which
I believe is on by default.

It shows all options compiled into the kernel. I've used it many times
to figure out what I did on some kernel.

Andy



many thanks - this was helpful for me!
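
Applied to my original question, the pipeline quoted above can be narrowed down to the IPSEC options. A minimal sketch: the function name `embedded_config` is made up for this example, and /netbsd is simply the usual kernel location, not something checked on this system.

```shell
#!/bin/sh
# Keep only the embedded config lines from `strings` output on stdin
# and strip the _CFG_ prefix (same sed expression as in options(4)).
embedded_config() {
    sed -n 's/^_CFG_//p'
}

# On NetBSD, with a kernel built with INCLUDE_CONFIG_FILE:
#   strings /netbsd | embedded_config | unvis | grep -i ipsec
```

If the grep prints nothing, either the option is absent or the kernel was built without the embedded config file, so an empty result alone is not conclusive.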

Kind regards
Matthias





Re: Finding out at runtime which IPSEC options are built into the kernel (IPSEC_NAT_T?)

2021-06-06 Thread Matthias Petermann
...looks like the IPSEC_NAT_T option no longer exists, but is included 
in IPSEC instead.



OPTIONS(4):

"
 options IPSEC
 Includes support for the IPsec protocol, using the implementation
 derived from OpenBSD, relying on opencrypto(9) to carry out
 cryptographic operations.  See ipsec(4) for details.

 options IPSEC_DEBUG
 Enables debugging code in IPsec stack.  See ipsec(4) for details.  The
 IPSEC option includes support for IPsec Network Address Translator
 traversal (NAT-T), as described in RFCs 3947 and 3948.  This feature
 might be patent-encumbered in some countries.
"



On 06.06.21 11:28, Matthias Petermann wrote:

Hello,

the subject probably already summarises the question - here is just a 
brief background: I would like to establish an IPSEC connection from a 
NetBSD box behind a NAT router to an IPSEC VPN. My understanding is that 
the kernel must have the appropriate IPSEC_NAT_T option for this. Can I 
somehow find this out reliably at runtime?

I have a NetBSD 9.2_STABLE with GENERIC kernel on evbarm.

Small additional question: Does anyone here happen to have general 
experience with whether and how a VPN connection to a FritzBox can be 
established with NetBSD on-board means (racoon)? I have already done a 
lot of research on this - most of the tutorials and blogs on this are 
already over 5 years old, and there have already been several firmware 
updates of the FritzBoxes in the meantime, so it is not easy to narrow 
down where the error lies.


Kind regards
Matthias







Finding out at runtime which IPSEC options are built into the kernel (IPSEC_NAT_T?)

2021-06-06 Thread Matthias Petermann

Hello,

the subject probably already summarises the question - here is just a 
brief background: I would like to establish an IPSEC connection from a 
NetBSD box behind a NAT router to an IPSEC VPN. My understanding is that 
the kernel must have the appropriate IPSEC_NAT_T option for this. Can I 
somehow find this out reliably at runtime?


I have a NetBSD 9.2_STABLE with GENERIC kernel on evbarm.

Small additional question: Does anyone here happen to have general 
experience with whether and how a VPN connection to a FritzBox can be 
established with NetBSD on-board means (racoon)? I have already done a 
lot of research on this - most of the tutorials and blogs on this are 
already over 5 years old, and there have already been several firmware 
updates of the FritzBoxes in the meantime, so it is not easy to narrow 
down where the error lies.


Kind regards
Matthias





Re: Installing NetBSD 9.1 on USB stick, not booting

2021-04-05 Thread Matthias Petermann

Hi Mayuresh,

I recently set up a NetBSD 9.1 for my NAS with BIOS boot in the same 
way. My problem was that both the installer USB stick and the target USB 
stick for the installation were connected at the same time during the 
installation (logical ;-)). During the installation, the target USB 
stick was detected as unit sd1, and the installer as sd0. When booting 
without the installer after the installation, the target USB stick then 
takes the place of the installer USB stick and becomes sd0 itself. As a 
result, the root filesystem could not be mounted because the entry in 
fstab was still set to sd1. Since then I mount the target partition 
again r/w before the first reboot and correct the fstab if necessary. I 
don't know if this helps you. As partitioning scheme I used disklabel. 
In this case at least the bootloader was installed. I did not try with GPT.
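
The fstab correction described above could look roughly like this. It is a sketch under assumptions: the helper name `retarget_fstab` is hypothetical, it only rewrites classic /dev/sdNx entries, and it prints the result instead of editing the file in place.

```shell
#!/bin/sh
# Hypothetical helper: rewrite /dev/<old><partition letter> entries at the
# start of fstab lines to /dev/<new><partition letter>; prints to stdout.
#   $1 old unit (e.g. sd1), $2 new unit (e.g. sd0), $3 fstab path
retarget_fstab() {
    old=$1 new=$2 file=$3
    sed "s|^/dev/${old}\([a-p]\)|/dev/${new}\1|" "$file"
}

# Before the first reboot, with the target root mounted read-write on /mnt:
#   retarget_fstab sd1 sd0 /mnt/etc/fstab > /tmp/fstab.new
#   cp /tmp/fstab.new /mnt/etc/fstab
```

Matching the partition letter right after the unit number keeps an entry such as /dev/sd10a from being rewritten by accident.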


Many greetings
Matthias

On 05.04.21 19:36, Mayuresh wrote:

I have installed NetBSD 9.1 on a USB stick. I think earlier installers
used to ask about bootcode. 9.1 installer didn't ask me about it. The
installation seems fine but the stick is not booting.

I tried manually setting the boot manager (fdisk -B) though it didn't
work. I do not have a trace of all my attempts but finally it rendered the
USB stick read-only. Wasn't able to make it writable again, also tried on
Linux.

I got a new replacement stick now and I am back into the same situation -
installation is complete, but it won't boot. What exactly do I need to do
to make it bootable. I don't need it to work with the hdd of the host or
grub etc. Just need plain simple bootable usb stick that would boot into
NetBSD.







Re: Using a USB MIDI controller as input for LMMS

2021-03-25 Thread Matthias Petermann

Hi Nia,

On 25.03.21 10:38, nia wrote:

Are you sure it isn't /dev/rmidi1?
The last line in your dmesg output indicates it has attached as midi1.


You are right:

```
mpeterma@x230Mk10 ~ [SIGINT|SIGINT]> cat /dev/rmidi1|hexdump
0000000 01b0 b072 7101 01b0 b070 6f01 01b0 b06e
0000010 6d01 01b0 b06c 6b01 01b0 b06a 6901 01b0
0000020 b068 6701 01b0 b066 6501 01b0 b064 6301
0000030 01b0 b062 6101 01b0 b060 5e01 01b0 b05d
0000040 5c01 01b0 b05a 5901 01b0 b058 5701 01b0
0000050 b056 5501 01b0 b053 5201 01b0 b051 5001
```

( I tried this with /dev/rmidi0 before and got a "Device not configured" )

Anyway, LMMS still doesn't receive any input from the controller.
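
A side note on reading the dump above: hexdump without options prints 16-bit words in host byte order, so on this little-endian machine every pair of bytes appears swapped. `01b0 b072 7101` is therefore the byte stream b0 01 72 b0 01 71, which are MIDI Control Change messages (status 0xB0) for controller 1 with falling values. The following small sketch prints the bytes in stream order; `midi_bytes` is just a name chosen for this example.

```shell
#!/bin/sh
# Print stdin as single hex bytes in stream order, space separated.
midi_bytes() {
    od -An -v -tx1 | tr -s ' \t\n' '\n\n\n' | grep . | paste -sd ' ' -
}

# Two Control Change messages as they appear in the dump above.
printf '\260\001\162\260\001\161' | midi_bytes

# With the controller attached (reads the first 48 bytes, then exits):
#   head -c 48 /dev/rmidi1 | midi_bytes
```

Reading a fixed number of bytes with head avoids waiting for an end of file that a live MIDI device never sends.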



The code "looks like it should work". Sadly, I haven't had an opportunity
to test LMMS with a real MIDI keyboard - I use a computer keyboard.
The code and support in fluidsynth is very similar.


I'll check out fluidsynth... maybe there is a pattern.

Kind regards
Matthias





Using a USB MIDI controller as input for LMMS

2021-03-25 Thread Matthias Petermann

Hello all,

today a hobby question :-) As the subject already reveals, I want to 
make music with NetBSD.


LMMS is quite comfortable for me and now only the USB MIDI controller 
would have to work. It also almost seems like I'm only missing one teeny 
tiny detail.


That is the kernel output I get when I plug in the controller:

```
[ 1131610,270409] uhidev2 at uhub1 port 1 configuration 1 interface 0
[ 1131610,270409] uhidev2: AKAI (0x9e8) MPK Mini Mk II (0x26), rev 1.10/0.00, addr 19, iclass 3/0
[ 1131610,270409] uhid4 at uhidev2: input=32, output=32, feature=0
[ 1131610,270409] uaudio0 at uhub1 port 1 configuration 1 interface 1
[ 1131610,270409] uaudio0: AKAI (0x9e8) MPK Mini Mk II (0x26), rev 1.10/0.00, addr 19
[ 1131610,270409] uaudio0: autoconfiguration error: audio descriptors make no sense, error=4
[ 1131610,270409] umidi0 at uhub1 port 1 configuration 1 interface 2
[ 1131610,270409] umidi0: AKAI (0x9e8) MPK Mini Mk II (0x26), rev 1.10/0.00, addr 19
[ 1131610,270409] umidi0: (genuine USB-MIDI)
[ 1131610,270409] umidi0: out=1, in=1
[ 1131610,270409] midi1 at umidi0: <0 >0 on umidi0
```

Am I correct in assuming that I can infer that /dev/rmidi0 is the 
associated device? If so, how can I tell exactly? The access rights of 
all rmidi devices seem at least ok:


```
mpeterma@x230Mk10 ~> ls -la /dev/rmidi*
crw-rw-rw-  1 root  wheel  58, 0 Feb 17 11:35 /dev/rmidi0
crw-rw-rw-  1 root  wheel  58, 1 Feb 17 11:35 /dev/rmidi1
crw-rw-rw-  1 root  wheel  58, 2 Feb 17 11:35 /dev/rmidi2
crw-rw-rw-  1 root  wheel  58, 3 Feb 17 11:35 /dev/rmidi3
crw-rw-rw-  1 root  wheel  58, 4 Feb 17 11:35 /dev/rmidi4
crw-rw-rw-  1 root  wheel  58, 5 Feb 17 11:35 /dev/rmidi5
crw-rw-rw-  1 root  wheel  58, 6 Feb 17 11:35 /dev/rmidi6
crw-rw-rw-  1 root  wheel  58, 7 Feb 17 11:35 /dev/rmidi7
```

Same for the higher level devices:

```
mpeterma@x230Mk10 ~> ls -la /dev/music
crw-rw-rw-  1 root  wheel  59, 0 Feb 17 11:35 /dev/music
mpeterma@x230Mk10 ~> ls -la /dev/sequencer
crw-rw-rw-  1 root  wheel  59, 128 Feb 17 11:35 /dev/sequencer
```

A kind of "recording" is also possible with midirecord (playing a C chord):

```
mpeterma@x230Mk10 ~ [SIGPIPE|SIGINT]> midirecord -|hexdump
0000000 544d 6468 0000 0600 0100 0100 1800 544d
0000010 6b72 0000 1d00 ff00 0351 a107 0020 4090
0000020 906e 5843 3c90 0458 4080 0100 3c80 8000

Now to LMMS: here the MIDI device /dev/rmidi0 is set as default. I also 
read somewhere in the CVS logs that this was patched not too long ago to 
use the correct device by default on NetBSD. Unfortunately, I am not 
able to close the remaining gap.


Regardless of whether the controller is connected or not, LMMS offers me 
"Input" and "Output" under "MIDI" when I click on the small cogwheel of 
an instrument track. However, no controller signals arrive.


Are there any LMMS users here who may be able to help me?

Kind regards
Matthias





Using FSS with WAPBL - Best practises?

2021-03-07 Thread Matthias Petermann

Hello NetBSD-Users,

after trying to backup from my WAPBL-enabled FFSv2 filesystems with dump 
every now and then, my memories of using FSS (-X option of dump) are not 
the very best. I had problems with it interacting with LVM a few years 
ago, resulting in sporadic crashes during backups and inconsistent 
filesystem state after the reboot. This was difficult for me to recreate 
at the time, and since then I've been forced to use dump without 
snapshots to keep my head above water, knowing that I might run into 
consistency problems.


My small home server is now reinstalled and instead of LVM I now use 
VNDs as backend storage for my Xen VMs. Also, my main filesystem with 
the network drives is no longer in Dom0 but is provided via Samba as 
DomU. So from the DomU point of view I don't have to deal with LVM any 
longer, instead it is a normal FFS on a block device.


Practically, to backup my DomU's I call dump from Dom0 over SSH and have 
the data returned to stdout so that I can save it to the USB hard drive 
mounted on Dom0, e.g.:


$ ssh user@192.168.2.54 doas /sbin/dump -h 0 -b 64 -$NEXT_LEVEL''auf - / 
> $DUMP_NAME


Now I would like to ensure consistency as well and use the -X option. I 
have done some research on this and came across [1], among others. The 
thread is from 2010 and was about exactly how WAPBL and FSS play 
together. At that time the combination was probably still experimental, 
with an explicit warning in the man page. In the current NetBSD I can't 
find this warning anymore, which makes me hopeful. The same thread also 
reports that it might make sense to increase the WAPBL log size so that 
WAPBL and FSS work together optimally. How do I determine the optimal 
log size? Is there a guideline or an upper/lower limit? And can someone 
briefly explain how the correct functioning of FSS and WAPBL depend on 
each other? What exactly goes wrong when the log size is too small?
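
For reference, this is roughly how I would extend the dump call from above with -X. It is only a sketch: the helper name `snapshot_dump_cmd` is made up, and whether -X behaves well together with WAPBL is exactly the open question of this mail.

```shell
#!/bin/sh
# Hypothetical helper: build the remote dump command with -X added, so that
# dump reads from a file system internal snapshot (see dump(8)).
#   $1 dump level (0-9), $2 file system to dump
snapshot_dump_cmd() {
    level=$1 fs=$2
    printf '/sbin/dump -h 0 -b 64 -X -%sauf - %s\n' "$level" "$fs"
}

snapshot_dump_cmd 0 /

# From Dom0, mirroring the command above:
#   ssh user@192.168.2.54 doas $(snapshot_dump_cmd "$NEXT_LEVEL" /) > "$DUMP_NAME"
```

Building the command as a string first makes it easy to log exactly what was run for each backup level.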



Kind regards
Matthias


[1] https://mail-index.netbsd.org/netbsd-users/2010/05/27/msg006288.html





Access to GPT and partitions within a VND image

2021-02-15 Thread Matthias Petermann

Hello all,

I am using file-backed vnds as storage for my Xen domains. In each of 
the vnds there is a GPT with the corresponding partitions.


Is there an easy way to access the GPT and the partitions from outside, 
i.e. preferably from the host to the image file? I would like to 
implement a kind of thin-provisioning, where I copy a base image as a 
sparse file over and over again, and then change a few small things in 
it before the first start of the domain - like the fixed IP address.
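
To make the idea more concrete, this is the sequence I have in mind, written as a dry-run sketch. The helper names `run` and `customize_image`, the image path and the wedge /dev/dk5 are placeholders, and I am assuming that the wedges inside the image show up once the vnd is configured; the real wedge name has to be looked up first.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical helper: attach an image, mount one wedge, detach again.
#   $1 image file, $2 vnd unit (e.g. vnd0), $3 wedge device (e.g. /dev/dk5)
customize_image() {
    image=$1 vnd=$2 wedge=$3
    run vnconfig "$vnd" "$image"     # attach the image file to the vnd unit
    run gpt show "$vnd"              # print the GPT inside the image
    run dkctl "$vnd" listwedges      # list the dk(4) wedges found on it
    run mount "$wedge" /mnt
    # ... edit files below /mnt here, e.g. the guest's network configuration ...
    run umount /mnt
    run vnconfig -u "$vnd"           # detach the image again
}

# Only print the commands; nothing is attached or mounted.
DRY_RUN=1 customize_image /path/to/domU.img vnd0 /dev/dk5
```

The dry-run switch is there so the sequence can be reviewed on any machine before it touches a real image.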


Kind regards
Matthias




Strange boot behavior on NetBSD/amd64 (NetBSD/Xen) - boot.cfg question

2021-02-03 Thread Matthias Petermann

Hello all,

I think I had already mentioned this in another context, but since I 
just stumbled across it again I would like to bring it up again.


It appears that the bootloader's boot menu behaves strangely in some 
cases. Given is the following boot.cfg:


```
menu=Boot Xen:load /netbsd-XEN3_DOM0.gz console=pc;multiboot /xen.gz 
dom0_mem=512M dom0_max_vcpus=1 dom0_vcpus_pin console=vga

menu=Boot normally:rndseed /var/db/entropy-file;boot
menu=Boot single user:rndseed /var/db/entropy-file;boot -s
menu=Drop to boot prompt:prompt
default=1
timeout=5
clear=1
```
(there is no line break in the Boot Xen line in the original file).

This produces a nice boot menu similar to this:

1) Boot Xen
2) Boot normally
3) Boot single user


...and properly executes the "Boot Xen" option when I select it (press 
the 1 key). However, for some reason, when I select the option "Boot 
normally" (press the 2 key) the behavior is rather strange. This is the 
transcript of what is printed to the screen:


```
command(s): load /netbsd-XEN3_DOM0.gz console=pc;multiboot /xen.gz dom0_mem=512M dom0_max_vcpus=1 dom0_vcpus_pin console=vga
2419744+1319904=0x3910ec
Loading /var/db/entropy-file
Loading /netbsd-XEN3_DOM0.gz
heap full (0x6b868+327678)
```

Eventually the boot attempt fails.

When I select "Boot single user" (press the 3 key), it appears to do what 
is expected: it boots into single user mode and prompts for the pathname 
of the shell.


What could cause "Boot normally" to use the configuration of "Boot Xen" 
and then fail on boot? Does my boot.cfg have a syntax error, or is this 
a bug?


Kind regards
Matthias







Re: RAIDframe write performance below expectations on a RAID-1 of two magnetic disks on NetBSD/amd64 9.1

2020-12-31 Thread Matthias Petermann

Hello all,

in the meantime a few days have passed, and after some back and forth I 
have now found a parameterization with which my RAID-1 achieves very 
satisfactory throughput rates (approx. 90 MB/s and more). I still can't 
completely rule out that the hardware is slightly damaged, because I 
believe I had tested the current configuration at the beginning as well, 
with worse results then. Anyway, for the sake of completeness, here is a 
short summary of what finally led to success. And of course thanks again 
for all the hints here on the mailing list; they were a big help.


By the way: I have read in many places - including RAIDCTL(8) - that it 
is not safe to create a partition or filesystem on the RAID device until 
initialization with "-i" is 100% complete. If someone has a link to a 
documentation that goes into this in more detail, I would be very 
interested. Especially because I would like to understand what exactly 
is so dangerous about it and under which circumstances - if any - 
certain operations may be performed already during initialization. I 
also think that it would be enough for me to know what exactly happens 
on block level at -i. For the understanding of the source code I lack 
the experience, and the very good paper under


https://www.pdl.cmu.edu/RAIDframe/raidframebook.pdf

does not seem to address this.

Kind regards
Matthias


1) Create 4k aligned partitions

```
# gpt destroy wd2
# gpt destroy wd3
# gpt create wd2
# gpt create wd3
# gpt add -l raid1cmp0 -a 4k -t raid wd2
# gpt add -l raid1cmp1 -a 4k -t raid wd3
```

2) Create RAIDframe configuration file

```
# cat <<EOF > /tmp/raid1.conf
START array
1 2 0

START disks
NAME=raid1cmp0
NAME=raid1cmp1

START layout
128 1 1 1

START queue
fifo 100
EOF
```

3) Initialize RAID

```
# raidctl -C /tmp/raid1.conf raid1
# raidctl -I 2020122802 raid1
# raidctl -i raid1
# raidctl -A yes raid1
```

4) After reconstruction / parity-rewrite has finished: create partition 
on RAID and format filesystem


```
# gpt create raid1
# gpt add -l data -a 4k -t ffs raid1
# newfs -O 2 NAME=data
```
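
Step 4 should only start once the parity rewrite from step 3 has finished. A minimal sketch of such a check follows; the function name `parity_is_clean` is made up, and the matched string is my assumption about what `raidctl -s` prints for a clean set, so it should be compared with the real output first.

```shell
#!/bin/sh
# Succeeds if the status text on stdin reports clean parity.
# The matched string is an assumption about the output of `raidctl -s`.
parity_is_clean() {
    grep -q 'Parity status: clean'
}

# On the NetBSD host:
#   raidctl -S raid1     # progress of the parity rewrite
#   raidctl -s raid1 | parity_is_clean && echo "safe to continue with step 4"
```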



RAIDframe write performance below expectations on a RAID-1 of two magnetic disks on NetBSD/amd64 9.1

2020-12-27 Thread Matthias Petermann

Hello all,

this is about the write performance of RAIDframe. There is a lot to read 
about this on these mailing lists and I have been very busy trying out 
everything I could get my hands on, i.e. different alignment methods, 
manipulating the write strategy of the drives, experimenting with the 
file system parameters. Unfortunately, I have now reached a point where 
I have been before. At that time my "solution" was to abandon NetBSD and 
use FreeBSD with ZFS instead. This time I don't want to give up so fast 
:-) So I'll give it a try and try to describe my setup as detailed as 
possible. Maybe someone sees my obvious mistake and can give me the 
crucial tip.


The root filesystem is on a separate disk set (also on RAIDframe but SSD 
storage) and is not the subject of this problem. The problem refers to 
two identical magnetic hard disks I have (each 1 TB, 4 kb sector size), 
from which I want to form a RAID-1 with RAIDframe. To do this, I first 
created a partition for RAIDframe on each of the two disks via GPT:


# gpt create wd2
# gpt create wd3
# gpt add -l raid1cmp0 -a 4k -t raid wd2
# gpt add -l raid1cmp1 -a 4k -t raid wd3

Then I initialized the RAID with the following parameter file:

START array
1 2 0

START disks
NAME=raid1cmp0
NAME=raid1cmp1

START layout
128 1 1 1

START queue
fifo 100

The speed of the parity rewrite had given me hope at first. I had 
already made several attempts with obviously wrong alignment and run 
times of approx. 10 hours were the result. With the correct alignment, 
the parity re-write runs in about 2 hours which, according to my 
research, should be a good average for the disk size.


On the RAID device (/dev/raid1 for me) I then created another GPT 
partition table and created a 4k-aligned partition in it as well:


# gpt create raid1
# gpt add -l data -a 4k -t ffs raid1
# newfs -O 2 -b 16k -f 2k NAME=data

This was formatted with an FFS filesystem (with the recommended 
parameters from [1]) and mounted with the mount option "log".


However, the write throughput remains well below my expectations and I 
am despairing. When writing a 1 GB file, I achieve write rates of about 
2 MB/s.


To me, this looks a bit like the hard drives are operating in the wrong 
mode in general. I suspected that PIO mode is used instead of DMA, but 
I haven't found a reliable way to check that. Regardless of this, the 
disks achieve significantly higher write rates (80 MB/s and more) on 
their own (i.e. without RAIDframe). The dmesg says:


```
jupiter$ dmesg|grep wd2
[ 2.660025] wd2 at atabus2 drive 0
[ 2.660025] wd2: 
[ 2.660025] wd2: drive supports 16-sector PIO transfers, LBA48 addressing
[ 2.660025] wd2: 931 GB, 1938021 cyl, 16 head, 63 sec, 512 bytes/sect x 1953525168 sectors (0 bytes/physsect; first aligned sector: 8)
[ 2.850025] wd2: GPT GUID: 01d01c56-2caf-4370-ac48-634c4c211de7
[ 2.850025] dk3 at wd2: "raid1cmp0", 1953525088 blocks at 40, type: raidframe
[ 3.370025] wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), WRITE DMA FUA, NCQ (32 tags)
[ 3.400024] wd2(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA), WRITE DMA FUA EXT

jupiter$ dmesg|grep wd3
[ 3.400024] wd3 at atabus3 drive 0
[ 3.400024] wd3: 
[ 3.400024] wd3: drive supports 16-sector PIO transfers, LBA48 addressing
[ 3.400024] wd3: 931 GB, 1938021 cyl, 16 head, 63 sec, 512 bytes/sect x 1953525168 sectors (0 bytes/physsect; first aligned sector: 8)
[ 3.520024] wd3: GPT GUID: aabb5ee0-c30f-4654-9380-3ab8ca81cd9b
[ 3.520024] dk4 at wd3: "raid1cmp1", 1953525088 blocks at 40, type: raidframe
[ 3.530024] wd3: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), WRITE DMA FUA, NCQ (32 tags)
[ 3.560024] wd3(ahcisata0:3:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA), WRITE DMA FUA EXT
```
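
The dmesg lines above report each wedge starting at block 40 with 512-byte logical sectors, i.e. at byte offset 40 * 512 = 20480 = 5 * 4096, which is a multiple of the 4 KB physical sector size. The same arithmetic as a small sketch (the function name `alignment` is chosen for this example):

```shell
#!/bin/sh
# Report whether a partition start is aligned to the physical sector size.
#   $1 start block, $2 logical sector size in bytes, $3 physical sector size
alignment() {
    start=$1 logical=$2 physical=${3:-4096}
    if [ $(( (start * logical) % physical )) -eq 0 ]; then
        echo aligned
    else
        echo misaligned
    fi
}

alignment 40 512 4096   # wedge start reported by dmesg above
alignment 63 512 4096   # classic MBR-era start, for comparison
```

This only checks the outer wedge; the offset that RAIDframe and the inner GPT add on top is not covered by the sketch.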

Doesn't look bad at first. The hard disks are identified as follows:

```
jupiter$ doas atactl wd2 identify
Model: ST1000LM048-2E7172, Rev: SDM1, Serial #: WES22ZJS
World Wide Name: 5000C5009D54CC56
Device type: ATA, fixed
Capacity 1000 Gbytes, 1953525168 sectors, 512 bytes/sector
Cylinders: 16383, heads: 16, sec/track: 63
Physical sector size: 4096 bytes
First physically aligned sector: 8
Command queue depth: 32
Device capabilities:
DMA
LBA
ATA standby timer values
IORDY operation
IORDY disabling
Device supports following standards:
ATA-4 ATA-5 ATA-6 ATA-7 ATA-8
Command set support:
NOP command (enabled)
READ BUFFER command (enabled)
WRITE BUFFER command (enabled)
Host Protected Area feature set (enabled)
Look-ahead (enabled)
Write cache (disabled)
Power Management feature set (enabled)
Security 
```

Re: NetBSD/Xen prompting for root filesystem although provided in boot.cfg

2020-12-21 Thread Matthias Petermann

Hello Manuel and Cherry,

On 21.12.2020 12:21, Manuel Bouyer wrote:

Actually, if you have root-on-raid, you shouldn't need the
root/bootdev parameter at all, '-A root' should be enough.
I don't have it on systems where I have root on raid



Now it has actually worked. The combination of:

menu=Boot Xen:load /netbsd-XEN3_DOM0.gz;multiboot /xen.gz dom0_mem=512M 
dom0_max_vcpus=1 dom0_vcpus_pin console=com2 com2=9600,8n1


and -A root leads to an automatic boot from raid0a.

If I set -A softroot (which I had so far except for one test you 
recommended), it tries to take the root of dk0 when booting (which fails).


So the key was to omit the explicit bootdev/root specification in 
boot.cfg AND set -A root (and NOT -A softroot).


Reading raidctl(8) again, I now understand my mistake better. The 
boot device is wd0, which is not part of the RAID set (since the set 
consists of dk0/dk1). Therefore the root device is not set 
automatically. Does this make sense, and have I interpreted it correctly?



Many greetings
Matthias




Re: NetBSD/Xen prompting for root filesystem although provided in boot.cfg

2020-12-21 Thread Matthias Petermann

Hi Cherry,

thank you very much for the explanation.

On 21.12.2020 11:49, Mathew, Cherry G. wrote:
hi - the boot code sequence assumes root device precedence based on a 
bunch of archaic rules.


I'm wondering if having dk in the sequence makes any changes to the 
assumption, especially since the xen boot path is slightly different 
from the native one.


an easy way to verify this would be to create a new raid set using the 
underlying disk device nodes (/dev/wdxx ?) and retry booting from that 
raid set.


if it succeeds, then the xen boot code definitely needs further inspection.

it's a long shot, and I haven't looked at the codebase in a year, so I'm 
totally guessing here!




The reason I put GPT partitions on the RAID components is that they 
are differently sized SSDs and I would like to mirror my root filesystem. 
One is 112 GB in size and the other is 1 TB. Therefore, there is a 112 GB 
partition on each. The larger of the two also holds a second GPT 
partition with a non-RAIDframe file system. At least "architecturally", 
this setup must have worked before, around summer 2018, when I 
got the tip from Manuel that I have to include the root/bootdev 
parameter in boot.cfg. But at that time it was under NetBSD 8.0 and Xen 
4.11.


I will first recreate the current setup on two other hard disks to make 
sure that the problem occurs there in the same way.


As a next step, I would try an MBR+disklabel-in-disklabel combination 
instead of the GPT partitioning, as well as the underlying devices you 
suggested.


Out of interest, where would the anchor points be if I wanted to compare 
the executed code from NetBSD 8.0 to 9.1?


Many greetings
Matthias


Re: NetBSD/Xen prompting for root filesystem although provided in boot.cfg

2020-12-21 Thread Matthias Petermann

Hello Manuel,

On 21.12.2020 08:53, Manuel Bouyer wrote:

Can anyone give me a hint as to what I am doing wrong?


Did you set '-A root' on the raid0 ?



Thanks for your quick response. Originally I had set it to "softroot". I 
just tried "root", but unfortunately there was no change...


Kind regards
Matthias


NetBSD/Xen prompting for root filesystem although provided in boot.cfg

2020-12-20 Thread Matthias Petermann

Hello,

I am trying to multiboot my NetBSD 9.1/Xen 4.13 system with Xen, and it 
is causing me some headaches :-( It seems that the root file system is 
not mounted automatically. As for the setup, the root filesystem is 
located in a disklabel, which in turn is located on a RAIDframe device, 
which consists of two components, each of which is a GPT partition on 
one of two physical disks.


In my boot.cfg I have:

```
menu=Boot Xen:load /netbsd-XEN3_DOM0.gz bootdev=raid0a;multiboot /xen.gz dom0_mem=512M dom0_max_vcpus=1 dom0_vcpus_pin console=com2 com2=9600,8n1
```

but for some reason the kernel still prompts for the root device, the 
dump device, the file system and later for the init location. The 
provided defaults are correct and can be accepted as they are, but the 
prompts still require manual intervention on every boot:


```
[   3.3500231] raid0: RAID Level 1
[   3.3500231] raid0: Components: /dev/dk0 /dev/dk1
[   3.3500231] raid0: Total Sectors: 234434432 (114469 MB)
[   3.3500231] WARNING: 1 error while detecting hardware; check system log.
[   3.3500231] boot device: raid0
[   3.3500231] unknown device major 0x
[   3.3500231] root device (default raid0a):
[   3.4089966] dump device (default raid0b):
[   3.9617492] file system (default generic):
...
```

In the case above it is sufficient to take over the defaults by simply 
pressing the enter key. The system will boot without any problem after 
this. But of course, the manual intervention is inappropriate for a 
production system.


I did some further experiments and found that when I omit the bootdev 
parameter in boot.cfg, dk0 is offered as the default for the root 
device. dk0 is the GPT partition that holds one of the RAIDframe 
components on which the root file system is based. Naturally, it does 
not contain a valid root filesystem itself, so booting from it fails.


So the bootdev parameter already seems to have some effect - namely to 
set the defaults accordingly. I am just wondering why I am still being 
asked, although everything is actually known...


From what I've seen, it reminds me a bit of my similar topic from 
August 2018[1].


Can anyone give me a hint as to what I am doing wrong?

Kind regards
Matthias

[1] http://mail-index.netbsd.org/port-xen/2018/08/23/msg009290.html


No HDMI audio with NetBSD 9.1 on NUC7i5DNKE

2020-12-13 Thread Matthias Petermann

Hello all,

my new NUC7i5DNKE is not providing audio output at the moment. The 
built-in sound chip is recognised as hdaudio and apparently also 
(partially) configured. Nevertheless, I do not get any usable audio 
devices displayed with audiocfg list.


The NUC7i5DNKE has no line-out output. The only way to get audio out of 
it is through the HDMI interface. I have read elsewhere that 
HDAUDIO_ENABLE_HDMI must first be enabled as an option in a custom 
kernel. I did that, in parallel with the option HDAUDIOVERBOSE. 
Unfortunately, audiocfg still lists no usable audio devices.


I post the relevant part of the kernel log below.

Am I missing something obvious here? In which direction should I 
investigate? I would be very happy to receive some tips on this.


Kind regards
Matthias


[ 1,008964] hdaudio0 at pci0 dev 31 function 3: HD Audio Controller
[ 1,008964] hdaudio0: interrupting at msi4 vec 0
[ 1,008964] hdaudio0: High Definition Audio version 1.0
[ 1,008964] hdaudio0: OSS 9 ISS 7 BSS 0 SDO 1 64-bit
[ 1,008964] hdaudio0: using 1024 byte CORB (cap 4)
[ 1,008964] hdaudio0: using 2048 byte RIRB (cap 4)
[ 1,008964] hdaudio0: cmd  : request 0F00  (00)
[ 1,008964] hdaudio0: cmd  : response 8086280B 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0004 (00)
[ 1,008964] hdaudio0: cmd  : response 00010001 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0002 (00)
[ 1,008964] hdaudio0: cmd  : response 0010 0002
[ 1,008964] hdaudio0: Codec02: 8086:280B HDA 1.0 rev 0 stepping 0
[ 1,008964] hdaudio0: cmd  : request 0F00 0005 (01)
[ 1,008964] hdaudio0: cmd  : response 0001 0002
[ 1,008964] hdafg0 at hdaudio0: Intel product 280b
[ 1,008964] hdafg0: parsing widgets
[ 1,008964] hdaudio0: cmd  : request 0F00 0004 (01)
[ 1,008964] hdaudio0: cmd  : response 00020002 0002
[ 1,008964] hdafg0: afg start 02 end 04 nwidgets 2
[ 1,008964] hdafg0: powering up widgets
[ 1,008964] hdaudio0: cmd  : request 0705  (01)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0705  (02)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0705  (03)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0008 (01)
[ 1,008964] hdaudio0: cmd  : response 0004 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000B (01)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000A (01)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0012 (01)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000D (01)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000F (01)
[ 1,008964] hdaudio0: cmd  : response C009 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0011 (01)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdafg0: afg widgets 0xe866eb2afa00-0xe866eb2afc30
[ 1,008964] hdaudio0: cmd  : request 0F00 0009 (02)
[ 1,008964] hdaudio0: cmd  : response 6611 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0009 (03)
[ 1,008964] hdaudio0: cmd  : response 0040778D 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0009 (02)
[ 1,008964] hdaudio0: cmd  : response 6611 0002
[ 1,008964] hdaudio0: cmd  : request 0F1C  (02)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000E (02)
[ 1,008964] hdaudio0: cmd  : response  0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000B (02)
[ 1,008964] hdaudio0: cmd  : response 0005 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000A (02)
[ 1,008964] hdaudio0: cmd  : response 001A07F0 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 0009 (03)
[ 1,008964] hdaudio0: cmd  : response 0040778D 0002
[ 1,008964] hdaudio0: cmd  : request 0F1C  (03)
[ 1,008964] hdaudio0: cmd  : response 18560010 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000E (03)
[ 1,008964] hdaudio0: cmd  : response 0001 0002
[ 1,008964] hdaudio0: cmd  : request 0F02  (03)
[ 1,008964] hdaudio0: cmd  : response 0002 0002
[ 1,008964] hdafg0: add connection 03->02
[ 1,008964] hdaudio0: cmd  : request 0F00 0012 (03)
[ 1,008964] hdaudio0: cmd  : response 8000 0002
[ 1,008964] hdaudio0: cmd  : request 0F00 000C (03)
[ 1,008964] hdaudio0: cmd  : response 0994 

Re: Mini-PC for NetBSD

2020-12-13 Thread Matthias Petermann

Hello,

On 10.12.20 15:10, Matthias Petermann wrote:
Thank you very much for both your comments on my question. To be on the 
safe side, I have now decided on a 7th generation NUC (Q3 2017) with 
Kaby Lake. According to the specification, it has Intel HD Graphics 620 
which should be supported. As soon as the device is delivered, I will 
install NetBSD on it and gladly write a summary of my experience here. 
Until then, thanks again.


Here comes the promised summary. The NUC7i5DNKE works well with NetBSD 
9.1. UEFI boot, GBit Ethernet, WIFI and accelerated graphics work. I 
still have problems with suspend to RAM and - which I'd best describe in 
more detail in a separate thread - with HDAUDIO in connection with 
output via HDMI. That does not work yet.


Many greetings
Matthias


Re: Speed up unbundling of hg bundles

2020-12-12 Thread Matthias Petermann

Hello Martin,

On 12.12.20 13:35, Martin Husemann wrote:

It sounds more like you are hitting a python (or NetBSD library) bug here
(unless your machine has too little RAM and is quite slow).


that seems to be the right conclusion. One thing in advance: the problem 
is now solved.


For documentation purposes: what were the circumstances?

I am using NetBSD 9.1 on amd64. Both CPU and RAM are sufficient 
(i5-3320M, 16 GB). I was running Mercurial in the Python 2.7 variant 
from pkgsrc-2020Q3. The reason for Python 2.7 was that I wanted to use 
TortoiseHg, which requires Python 2.7. TortoiseHg didn't work (that's 
another topic) but I continued to use Mercurial with Python 2.7 as a result.


The unbundle always stopped at this point:

# hg unbundle ../77d2a2ece3a06d837da45acd0fda80086ab4113c.zstd.hg
adding changesets
adding manifests 


adding file changes

I have now removed the Python 2.7 variant along with TortoiseHg, and 
installed the Python 3 variant instead. With this, it works in a 
perfectly acceptable time (something around 1 h I believe):


# hg unbundle ../77d2a2ece3a06d837da45acd0fda80086ab4113c.zstd.hg
adding changesets
adding manifests 

adding file changes 

added 931876 changesets with 2425841 changes to 439702 files (+417 heads)

new changesets 8cec458d70ff:77d2a2ece3a0 (931876 drafts)
(run 'hg heads' to see heads)

So it does seem to be a bug. Since Python 2.7 has had its day anyway, I 
wouldn't pursue it any further for the time being. After this bumpy 
start, I am now happy to use Mercurial and am glad that my first 
suspicion (ZFS) did not prove to be true.


Thanks again for all the suggestions on this topic, and have a nice 
Saturday.


Matthias


Re: Speed up unbundling of hg bundles

2020-12-12 Thread Matthias Petermann

Hi Thomas,

On 12.12.20 12:32, Thomas Mueller wrote:

Is it not possible to use "hg clone ..." like "git clone" and not have to run 
for 20 hours or more?

Such a slow download would make users give up on NetBSD.

Where do you get the long name of that file to download?

You shouldn't have to use ftp for hg any more than you'd use ftp for a 
git-clone.


I remember that I had discussed the problem here before. At that time I 
had tried direct cloning, which had taken 11 hours or more. The cause 
then was my unstable internet connection, which apparently made the 
cloning mechanism get out of step repeatedly. As a possible solution, a 
user recommended the approach just described, i.e. to download an 
initial bundle from the CDN and unpack it via hg unbundle. I must admit 
that it is slowly dawning on me what the problem could be today. Back 
then, unbundling did not take nearly as long, as far as I remember. This 
time the circumstances are different: I am trying to unbundle onto a ZFS 
dataset. If ZFS were the bottleneck, however, I would have expected top 
to show close to 100% system time. Instead, it is the user-space Python 
process that is busy at the moment, which is why I thought that some 
computationally intensive transformations are being carried out here 
and that these simply take this long.


Kind regards
Matthias


Speed up unbundling of hg bundles

2020-12-12 Thread Matthias Petermann

Hello all,

what is the fastest way to get a working local clone of the NetBSD-src 
Mercurial repository?


I am currently using the following scheme:

# ftp https://cdn.netbsd.org/_bundles/src/77d2a2ece3a06d837da45acd0fda80086ab4113c.zstd.hg

# hg init src
# cd src
# hg unbundle ../77d2a2ece3a06d837da45acd0fda80086ab4113c.zstd.hg

The last command has now been running for 20 hours at 100% CPU 
utilisation of one core by a Python process. I know from previous 
attempts that it may well take that long. Given these times, would it 
make sense to provide a .tar.gz of an already "unbundled" repository as 
a basis? Or does that already exist and I haven't found it yet?


Thank you and best regards
Matthias


Re: Mini-PC for NetBSD

2020-12-10 Thread Matthias Petermann

Hello Nia and Benny,

On 08.12.2020 17:26, nia wrote:


The state of acceleration for Intel GPU is "anything up to Kaby Lake works".
That's... 7th generation? 8th?

IIRC a more generic UEFI framebuffer driver is used rather than vesa on modern
x86 hardware. It's not too bad, depending on your workload.



Thank you very much for both your comments on my question. To be on the 
safe side, I have now decided on a 7th generation NUC (Q3 2017) with 
Kaby Lake. According to the specification, it has Intel HD Graphics 620 
which should be supported. As soon as the device is delivered, I will 
install NetBSD on it and gladly write a summary of my experience here. 
Until then, thanks again.


Many greetings
Matthias



Re: Mini-PC for NetBSD

2020-12-08 Thread Matthias Petermann

On 08.12.2020 15:22, Matthias Petermann wrote:
> [...]
problem, I am more concerned about the support of the integrated 
graphics. In intel(4) the Intel Iris and Intel Iris Pro chipsets and 
Intel HD are mentioned as supported. Not included are Intel UHD, which 
is used in some of the newer NUCs. Is there any way to find out in 
advance if the respective integrated graphics will be supported by 
NetBSD's Xorg? I would like to use the Intel driver and I would not like 
to switch to the less performant VESA driver. I would also be grateful 
for a concrete model name, if one of you has such a NUC with a 
specification of my choice and can confirm that everything works fine.


Answering myself: apparently DRM is the hurdle to clear. The 
release announcement for NetBSD 9.0 stated that it is at the 
level of Linux 4.4. The announcement of NetBSD 9.1 does not 
mention it, so I assume it has not changed. Now a matrix would 
be good that shows which GPUs are fully supported at this level :-) Is 
there such a matrix for NetBSD? Or can somebody point me to the 
source files that show the current state? Or, even better, is 
there a wiki page or something like that (as exists for Xen, for example)?


Kind regards
Matthias


Mini-PC for NetBSD

2020-12-08 Thread Matthias Petermann

Hello everybody,

today a small hardware question. I would like to buy a Mini-PC, on which 
NetBSD should be used as primary desktop operating system. I thought 
about an Intel NUC. Such a device from the lower end of the performance 
scale with Celeron-CPU I already have in use as a home server (model 
NUC5PPYH). But for the desktop I would like a bit more power. An i5 with 
SSD and 8 GB RAM or more seems to be a good choice. While I assume that 
the often used Realtek-LAN-chipsets and USB 3.0 are no longer a big 
problem, I am more concerned about the support of the integrated 
graphics. In intel(4) the Intel Iris and Intel Iris Pro chipsets and 
Intel HD are mentioned as supported. Not included are Intel UHD, which 
is used in some of the newer NUCs. Is there any way to find out in 
advance if the respective integrated graphics will be supported by 
NetBSD's Xorg? I would like to use the Intel driver and I would not like 
to switch to the less performant VESA driver. I would also be grateful 
for a concrete model name, if one of you has such a NUC with a 
specification of my choice and can confirm that everything works fine.


Thanks & best regards
Matthias


Re: (graphical) SSH console client

2020-12-03 Thread Matthias Petermann

Hello Michael,

On 02.12.2020 19:41, Michael Parson wrote:

Responding to myself, bad form, I know, but you'll also want/need the
pkgsrc/net/remmina-plugins to get the full functionality you're looking
for.



thanks for the recommendation... this looks very promising, although the 
ssh client seems to crash on my system. I will investigate further and 
report back once I have found a solution or am able to provide more 
substantial debugging information.


Kind regards
Matthias


Re: NetBSD 8 VPS server refusing to reboot: please help

2020-12-01 Thread Matthias Petermann

Hello Mayuresh and all who are interested in a NetBSD VPS in Europe

On 30.11.2020 11:55, Mayuresh wrote:

On Mon, Nov 30, 2020 at 11:33:55AM +0100, Matthias Petermann wrote:

The name Hetzner was mentioned there, and someone also wrote that they have
provided the NetBSD-ISO for installation there on request.


Yes I have a couple of VPS instances of NetBSD on Hetzner.

NetBSD isn't a stock OS at Hetzner. But you can request to mount a custom
ISO by providing its URL. Very few cloud providers seem to provide custom
iso.  Some provide but charge for it. In that respect Hetzner appears
really the best to me. The prices are competitive as well.

Only feature I sometimes miss is nested virtualization. I guess some large
cloud providers provide it, but they are costlier as well.



after my enquiry and a short correspondence this morning, Hetzner 
support confirmed that the NetBSD 9.1 ISO is now available to all 
customers by default and no longer needs to be requested separately for 
each project :-)


Many greetings
Matthias


Re: (graphical) SSH console client

2020-11-30 Thread Matthias Petermann

Hi Michael,

On 30.11.2020 14:35, Michael van Elst wrote:

m...@petermann-it.de (Matthias Petermann) writes:


can any of you recommend a (graphical) SSH client for NetBSD? I am
looking for something similar to Putty / mRemoteNG that I use in the
Windows world. It would be important to me:



* Management of connection targets and possibility to configure
passwords and SSH keys to use for each connection
* Advanced functions like configuration of remote / local port forwarding
* Unicode support


While all this is just possible with the command line client (and a Unicode
terminal like xterm), you can simply use Putty. The last time I tried the
version from pkgsrc, it just worked.



I didn't even know that there was a PuTTY version for Unix :-) Funny: 
it looks exactly like the Windows version, and I can confirm that it 
works in pkgsrc-2020Q3. Thanks for the tip!


You're right though: what PuTTY can do out of the box is not much 
different from what is possible in a normal UTF-8 xterm with the 
ssh client and a well-maintained .ssh/config.


So I'll keep looking - the bar is now set at mRemoteNG ;-)

Many greetings
Matthias


(graphical) SSH console client

2020-11-30 Thread Matthias Petermann

Hello everybody,

can any of you recommend a (graphical) SSH client for NetBSD? I am 
looking for something similar to Putty / mRemoteNG that I use in the 
Windows world. It would be important to me:


* Management of connection targets and possibility to configure 
passwords and SSH keys to use for each connection

* Advanced functions like configuration of remote / local port forwarding
* Unicode support

There's probably already something in pkgsrc, but I didn't really 
find anything.


Thanks & best wishes
Matthias


Re: NetBSD 8 VPS server refusing to reboot: please help

2020-11-30 Thread Matthias Petermann

Hello Mayuresh,


On 28.11.2020 16:05, Mayuresh wrote:

On Sat, Nov 28, 2020 at 06:51:04PM +0530, Mayuresh wrote:

I accidentally powered off a Hetzner VPS server running NetBSD 8 and it
is refusing to reboot.


Ok, all solved - panic withdrawn! It just required rebooting it a number
of times and arbitrarily once it started. Probably something to do with
arbitrariness of sequence of detecting disks or something else?



thank you very much for your contribution here. I happen to be looking 
into a similar topic and have also read in one of the forums (I 
think even in connection with your name, but I can't find the URL 
anymore...) which VPS providers offer NetBSD (even if they don't 
officially support it). The name Hetzner was mentioned there, and 
someone also wrote that they provided the NetBSD ISO for 
installation there on request. Since I already had a Hetzner account, I 
tried that right away. However, I can only find various Linuxes, FreeBSD 
and OpenBSD as well as the Microsoft collection in the extensive 
selection of ISOs. How did you get NetBSD installed there?


Independently of that, I have opened a ticket with Hetzner requesting 
that a NetBSD 9.1 ISO be made generally and permanently available. 
Once I have an answer to this request, I will gladly post an update 
here.


Best regards
Matthias


Re: sponsor NetBSD for 2020 https://github.com/sponsors/NetBSD

2020-11-10 Thread Matthias Petermann

On 10.11.2020 14:32, matthew sporleder wrote:

Indeed -- casting a wide net is in our interest.  I hope you are able
to use one of our many potential donation offerings -- paypal, stripe,
amazon smile, github sponsorship.. any I am missing?



So far my monthly subscription via Paypal has worked well. And the 
background of my question has already been answered - obviously the 
payment via Github is an advantage for the NetBSD Foundation at least in 
the first year, because Microsoft adds a dollar for every dollar 
donated? That would be a clear reason to change the payment provider at 
least for one year ;-)


Kind regards
Matthias


Re: sponsor NetBSD for 2020 https://github.com/sponsors/NetBSD

2020-11-10 Thread Matthias Petermann

Hello Matthew,

On 10.11.2020 05:35, matthew sporleder wrote:

Hey -- the end of the year is coming up fast.  Wouldn't you feel
better about yourself if you added a github sponsorship to balance out
your incredible year? :)

How does this type of donation compare to a PayPal monthly subscription? 
Is it just a different means of transfer, or are there advantages / 
disadvantages compared to PayPal?


Kind regards
Matthias


What administrative user interfaces for Xen are available for NetBSD?

2020-08-18 Thread Matthias Petermann

Hello, everyone,

since I set up my lab environment on NetBSD/Xen I am very satisfied with 
the handling of Xen. I use a mixture of sparse image files and LVM 
volumes as storage. The guests are mostly NetBSD (PVM) and some Windows 
7 VMs. Apart from snapshots (which I would probably get if I switched to 
ZFS volumes) I don't really miss anything and am also satisfied with the 
performance.


But what I am asking myself right now... is there a higher-level 
administration interface besides the command line tools (xl)? Libvirt is 
supposed to support Xen (I don't know about nvmm?), but there would need 
to be some kind of user interface on top of it. If someone has experience 
with this or knows a reasonably recommendable configuration on a NetBSD 
Dom0, I would be grateful for a tip. In pkgsrc I did not find 
anything obvious in this regard.


Kind regards
Matthias


Re: NetBSD/Xen samba performance low (compared to NetBSD/amd64)

2020-08-03 Thread Matthias Petermann

Hi Greg and others,

On 03.08.2020 21:31, Greg Troxel wrote:

Other than the 1 cpu vs ? cpus, no.   I tested xen performance long ago,
in 2006 with a setup

   NetBSD dom0
   disk file in filesystem
   NetBSD domU with xbd0 from the file

  and found that reading with dd:

the dom0 raw disk was just about the same as bare metal

the file was maybe 5-10% slower (maybe not quite; it was noticeable but
not a big deal)

the xbd0d "raw disk" was also 5-10 % slower than reading the file in
the dom0

Now, this isn't what you asked, but I find the difference you found seem
like a bug.

I would definitely do dd from the raw disk in your case 1 and 2,
followed by dd of the iso.

Also, I would repeat your tests and run "systat vmstat" during each
case, and also netstat to see if the network interface is not keeping
up.   Then I would run iperf, ttcp or whatever to test network separate
from samba and disk.




many thanks to you for the helpful hints. First of all it was important 
for me to get feedback if the performance losses I noticed are common. 
As I understood it, this is rather not the case and a detailed 
investigation is appropriate. This is what I will tackle next.
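
The read tests suggested in the quoted message (dd from the raw disk, then dd of the ISO) could be sketched roughly like this. It is only a sketch under assumptions: rwd0d as the raw whole-disk device and the ISO path are placeholders, run and read_test are made-up helper names, bs=1m is the NetBSD dd spelling, and the script only prints the commands unless RUN=1 is set.

```sh
#!/bin/sh
# Sketch: print the read tests by default; set RUN=1 to execute them as root.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Hypothetical helper name, used for illustration only.
read_test() {
    # $1 = what to read (raw disk device or file); read 1 GByte and discard it
    run dd if="$1" of=/dev/null bs=1m count=1024
}

# Assumptions: rwd0d is the raw whole-disk device on amd64, and the ISO path
# is hypothetical. Substitute the names used on the machine under test.
read_test /dev/rwd0d
read_test /srv/samba/test.iso
```

Running the same two reads once under the plain kernel and once under the Xen Dom0 kernel should show whether the disk path alone already differs, before Samba and the network are involved.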


By the way, I noticed just today that even when using the "pure" NetBSD 
kernel without Xen the performance is also bad in some cases. The ISO 
file mentioned in the last posting is written at about 60 MByte/s. 
Even if I trigger a sync every second, the performance hardly drops. 
However, if I copy the same file back the same way (after a reboot to 
clear the file cache; a "read" from NetBSD's point of view), I 
only get a throughput of about 20 MByte/s. I thought that reading 
should be faster than writing in any case.


But there is one detail I have just now become aware of... when I bought 
these NUCs a few years ago, I installed hybrid hard disks (Seagate 
Firecuda SSHD 2TB). This type has a mechanical/magnetic mass storage and 
a (relatively small) SSD on the side. But from the point of view of the 
operating system the device appears as a whole. The on-disk-controller 
decides when and how the SSD is used. I don't know this in detail 
either, only that booting from this disk works quite fast. It's quite 
possible that here simply the blocks that are read first after power on 
are moved to the SSD. To cut a long story short - I think I first have 
to convert to either a purely mechanical hard disk or a pure SSD to 
avoid measurement inaccuracies due to the unpredictable behavior of the 
SSHD. Then I will approach the whole thing scientifically and consider 
your advice.


This week will be very exhausting and I can't promise that I will 
make quick progress. But I will stay on it and write my results here as 
soon as I have them.


Best regards
Matthias


NetBSD/Xen samba performance low (compared to NetBSD/amd64)

2020-08-03 Thread Matthias Petermann

Hello everybody,

on a small Intel NUC with Realtek network chip I want to operate
a Xen host. My NetBSD Xen guests are supposed to host different web apps 
as well as an Asterisk PBX.


I also want to provide network drives via Samba as performant as 
possible. To keep the overhead of hard drive access low I thought it 
could be a good idea to let Samba - although not architecturally clean - 
operate from Dom0.


When trying to do this, I noticed a significant difference in network speed.

Constellation 1 (Pure NetBSD kernel):

* NetBSD/amd64 9.0 Kernel + Samba: throughput ~60 MByte/s

Constellation 2 (NetBSD/Xen Domain 0):

* Xen 4.11 + NetBSD/Xen Dom0 + Samba: throughput ~12 MByte/s

I measured this by copying an 8 GB ISO file from a Windows host.
In constellation 2, no guests had started and the full main memory of 
Dom0 was assigned. In my view, the only significant difference is that 
NetBSD can only use one of the two CPU cores under Xen. Since the CPU 
was idle on average at 20% during copying, that doesn't seem to be the 
bottleneck?
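
To put the two figures in perspective, here is a rough calculation of the copy time. It assumes the 8 GB ISO is counted as 8 * 1024 MByte and that the rate stays constant; transfer_seconds is a made-up helper name.

```sh
#!/bin/sh
# Sketch: estimate how long the 8 GB copy takes at each measured rate.
transfer_seconds() {
    # $1 = size in MByte, $2 = throughput in MByte/s; integer division, rounds down
    echo $(( $1 / $2 ))
}

size_mb=$(( 8 * 1024 ))   # assumption: the 8 GB ISO is counted as 8 * 1024 MByte

echo "NetBSD/amd64: $(transfer_seconds "$size_mb" 60) s"
echo "Xen Dom0:     $(transfer_seconds "$size_mb" 12) s"
```

Under these assumptions the copy takes about 136 seconds on the plain kernel and about 682 seconds in Dom0, i.e. five times as long.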


Are such differences in I/O performance to be expected?

Thank you & best regards
Matthias


Boot selection of boot.cfg doesn't work as expected (with UEFI boot loader)

2020-08-03 Thread Matthias Petermann

Hello everyone,

on NetBSD/amd64 9.0 Release I am setting up a Xen Host. Therefore, I 
added the first line to my boot.cfg:


menu=Boot Xen:load /netbsd-XEN3_DOM0.gz root=NAME=root;multiboot /xen.gz 
dom0_mem=512M dom0_max_vcpus=1 dom0_vcpus_pin

menu=Boot normally:rndseed /var/db/entropy-file;boot
menu=Boot single user:rndseed /var/db/entropy-file;boot -s
menu=Drop to boot prompt:prompt
default=1
timeout=5
clear=1

(there is no line break in the "Boot Xen" line in the original file).

This results in a boot menu:

1. Boot Xen
2. Boot normally
3. Boot single user
4. Drop to boot prompt

When Xen had a (here unrelated) problem booting, I wanted to select 
Option 2 (Boot normally) to boot a standard kernel instead. To my 
surprise, Option 2 also tries to boot the Xen kernel. On the other hand, 
Option 3 boots into the single user environment as expected.


Have I made an obvious mistake in my boot.cfg, or does this look like 
a bug?
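
To double-check which commands sit behind which menu number, a small helper along these lines may be useful. It is a sketch only: list_boot_menu is a made-up name, and it assumes each entry has the form menu=label:command;command as in the boot.cfg shown above. It reads a boot.cfg on standard input and does not explain the behaviour of the boot loader itself.

```sh
#!/bin/sh
# Sketch: print each menu= entry of a boot.cfg with the number the boot menu
# would show for it, followed by the commands behind that entry.
list_boot_menu() {
    awk '
        /^menu=/ {
            n++
            entry = substr($0, 6)          # drop the leading "menu="
            sep = index(entry, ":")        # the first colon ends the label
            printf "%d. %s => %s\n", n, substr(entry, 1, sep - 1), substr(entry, sep + 1)
        }
    '
}

# Example with two of the entries from the boot.cfg above:
printf '%s\n' \
    'menu=Boot normally:rndseed /var/db/entropy-file;boot' \
    'menu=Boot single user:rndseed /var/db/entropy-file;boot -s' |
    list_boot_menu
```

On a live system the input would be the real file, for example list_boot_menu < /boot.cfg (the path is an assumption and may differ on a UEFI setup).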


Kind regards
Matthias

