Configuring a loopback interface for an additional rdomain N via
/etc/hostname.loN causes /etc/netstart to misconfigure any
real network interface intended to be in that rdomain; it is
left in rdomain 0 instead. Arguably it's a bug in the kernel
rather than in netstart.
I stumbled across the problem in a real configuration, but
one in which I later realized I didn't need that loopback
interface anyway, so there is no urgency on my part.
====
A. Demonstration, using unmodified OpenBSD 6.7:
On a freshly booted system where em1 is an unused,
not-yet-configured Ethernet interface, run as super-user:
echo 192.168.100.1/24 rdomain 10 >/etc/hostname.em1
echo 127.0.0.1/8 rdomain 10 >/etc/hostname.lo10
sh /etc/netstart
Expected result:
em1 and lo10 are brought up, with the addresses and
netmasks shown, both in rdomain 10.
Actual, unexpected result:
netstart prints an error message:
ifconfig: SIOCSIFRDOMAIN: Invalid argument
em1 is given the expected address and netmask, but is left
in rdomain 0: ifconfig em1 says
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr xx:xx:xx:xx:xx:xx
index 4 priority 0 llprio 3
media: Ethernet autoselect (none)
status: no carrier
inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
lo10 is configured as expected:
lo10: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> rdomain 10 mtu 32768
index 7 priority 0 llprio 3
groups: lo
inet6 ::1 prefixlen 128
inet6 fe80::1%lo10 prefixlen 64 scopeid 0x7
inet 127.0.0.1 netmask 0xff000000
These things happen whether /etc/netstart is run manually, as
shown, or at boot time once those two hostname.* files exist.
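To check the result without reading the flags line by eye, a
small helper along these lines may be useful. It is only a
sketch: rdomain_of is a name made up here, and it assumes
(as the output above suggests) that ifconfig prints an
"rdomain N" field on the first line only when the interface
is outside rdomain 0.

```shell
# Print the rdomain shown in `ifconfig IF` output read from stdin.
# Assumption: the "rdomain N" field is absent for rdomain 0, so
# default to 0 when it is not found on the first line.
rdomain_of() {
	awk 'NR == 1 {
		rd = 0
		for (i = 1; i < NF; i++)
			if ($i == "rdomain")
				rd = $(i + 1)
		print rd
	}'
}
```

On a live system this would be used as "ifconfig em1 | rdomain_of"
and "ifconfig lo10 | rdomain_of"; in the failing case described
above the two results differ.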
If one then does
rm /etc/hostname.lo10
reboot
the system comes up without printing the SIOCSIFRDOMAIN error;
em1 is configured as expected (first line for em1 also says
`rdomain 10'); lo10 is left unconfigured, as expected.
Likewise if you remove both new hostname.* files, reboot
to get a clean start, and do
echo 192.168.100.1/24 rdomain 10 >/etc/hostname.em1
sh /etc/netstart
em1 is correctly configured (in rdomain 10). The problem
shows only if lo10 is also given a hostname.lo10 file.
====
B. Slightly deeper analysis: by instrumenting a copy of
netstart, I figured out what it is doing. (netstart -n
doesn't tell the whole story--it doesn't say when it would
ifconfig XX create.)
When hostname.em1 and hostname.lo10 both exist, netstart does
ifconfig lo10 create
ifconfig em1 192.168.100.1/24 rdomain 10
ifconfig lo10 127.0.0.1/8 rdomain 10
If, on a freshly booted system with neither hostname.em1
nor hostname.lo10, I run those commands by hand, I get
the same erroneous results: the ifconfig em1 command
produces the SIOCSIFRDOMAIN message. The second ifconfig
lo10 command isn't necessary to demonstrate the problem,
of course; it is configuring em1 that goes wrong.
But if only hostname.em1 exists, not hostname.lo10,
netstart just does
ifconfig em1 192.168.100.1/24 rdomain 10
and there is no error message and em1 is correctly
configured.
Apparently netstart believes it must pre-create any
explicitly-configured loN devices (because ifconfig -C
lists lo as a pseudo-device), but doing so somehow
interferes with later assigning a real device to the
corresponding rdomain.
====
C. Related thoughts
Arguably feature requests rather than bugs:
-- sh /etc/netstart -n lo10 (with hostname.lo10 extant)
doesn't report that it would do ifconfig lo10 create.
Maybe it should say so.
-- Maybe netstart -n shouldn't require a list of
interface names; if none is given, just print everything
it would do to set up all interfaces.
Neither of these makes a difference in normal operation,
but they would speed up debugging setup problems.
I'd be glad to write and submit a patch for those
things if you folks think they're worth doing.
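To make the two requests concrete, here is a rough sketch of
the output I have in mind. It is not the real netstart code:
dryrun_all is a hypothetical name, the directory and the list
of cloner names are passed in as arguments (on a real system
they would be /etc and the output of ifconfig -C), and every
non-empty line of a hostname.* file is treated as plain
ifconfig arguments, which ignores most of what hostname.if(5)
allows.

```shell
# Hypothetical dry run: print what would be done for every
# hostname.* file when no interface names are given.
# $1 = directory holding the hostname.* files
# $2 = space-separated cloner names (e.g. from `ifconfig -C`)
dryrun_all() {
	dir=$1
	cloners=$2
	# Pass 1: report creation of cloned (pseudo) interfaces first,
	# matching the order observed in section B.
	for f in "$dir"/hostname.*; do
		[ -f "$f" ] || continue
		ifname=${f##*/hostname.}
		# Strip the unit number to get the driver name (lo10 -> lo).
		driver=${ifname%%[0-9]*}
		for c in $cloners; do
			if [ "$c" = "$driver" ]; then
				echo "ifconfig $ifname create"
			fi
		done
	done
	# Pass 2: report the configuration lines themselves.
	for f in "$dir"/hostname.*; do
		[ -f "$f" ] || continue
		ifname=${f##*/hostname.}
		while IFS= read -r line; do
			if [ -n "$line" ]; then
				echo "ifconfig $ifname $line"
			fi
		done <"$f"
	done
}
```

With the two files from section A in place, dryrun_all /etc "lo"
would print the same three ifconfig commands listed in section B,
including the create step that netstart -n currently omits.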
Thanks,
Norman Wilson
PS: I'm really pleased with rdomains. I did something
vaguely similar a long time ago in an ancient version of
UNIX, but never worked out all the details. The OpenBSD
implementation feels complete and makes me very happy
when building firewalls and routers. Thanks!