>Number: 6382
>Category: kernel
>Synopsis: Changes in broadcast handling cause diskless booting to fail
>Confidential: yes
>Severity: serious
>Priority: medium
>Responsible: bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: unknown
>Arrival-Date: Thu May 20 00:20:01 GMT 2010
>Closed-Date:
>Last-Modified:
>Originator:
>Release:
>Organization:
>Environment:
System : OpenBSD 4.7
Details : OpenBSD 4.7 (GENERIC.MP) #0: Sun May 2 13:16:00 EDT 2010
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
Architecture: OpenBSD.amd64
Machine : amd64
>Description:
Since 4.7 (more precisely, revision 1.56 of src/sys/netinet/in.c),
depending on network configuration, booting a diskless OpenBSD client
fails just past the RARP stage, when it needs to contact a
portmap/rpc.bootparamd server.
Digging deeper, I found that portmap/rpc.bootparamd never received
the broadcast packets from the client (running portmap -d showed
nothing, whereas a 4.6 server shows the requests).
My network is 172.16.5.0/25, so the broadcast address is 172.16.5.127.
However, the client does not learn its netmask during the RARP
stage, so, based on its 172.16.X.X address, it seems to assume the
broadcast address of a class B network, i.e. 172.16.255.255.
Before rev 1.56 of netinet/in.c, the requests could be seen when running
portmap -d, but not after rev 1.56, which leads me to believe the
kernel now only accepts broadcasts that match its own configured
broadcast address, whereas before it would also accept those for
its "class".
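The mismatch between the two broadcast addresses can be reproduced with
a little shell arithmetic. This is only an illustrative sketch: the
function names are made up for this report and are not part of OpenBSD,
and it assumes a shell with at least 32-bit unsigned-safe (in practice
64-bit) arithmetic.

```sh
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Convert a dotted-quad IPv4 address to an integer.
# The subshell body keeps the IFS change local to this function.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( (($1 << 24) | ($2 << 16) | ($3 << 8) | $4) & 0xffffffff ))
)

# Convert an integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Broadcast address for a given address and netmask (both dotted quads).
broadcast() {
    addr=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    int_to_ip $(( $addr | (~$mask & 0xffffffff) ))
}

# Classful netmask (as an integer) chosen from the first octet:
# class A below 128, class B below 192, otherwise treated as class C.
classful_mask() {
    first=$(( $(ip_to_int "$1") >> 24 ))
    if [ "$first" -lt 128 ]; then
        echo $(( 0xff000000 ))
    elif [ "$first" -lt 192 ]; then
        echo $(( 0xffff0000 ))
    else
        echo $(( 0xffffff00 ))
    fi
}

# Broadcast address a host would assume without knowing its netmask.
classful_broadcast() {
    broadcast "$1" "$(int_to_ip "$(classful_mask "$1")")"
}

echo "configured: $(broadcast 172.16.5.20 255.255.255.128)"
echo "classful:   $(classful_broadcast 172.16.5.20)"
```

The first line is the broadcast address the server is configured with,
the second is the one the client falls back to after RARP.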
>How-To-Repeat:
Following diskless(8), set up a two-machine network with IPs from the
private class B network range (172.16-31.X.X/16), but make sure the
netmask is narrower than /16.
# cat << EOF > /etc/dhcpd.conf
option subnet-mask 255.255.255.128;
option routers 172.16.5.1;
subnet 172.16.5.0 netmask 255.255.255.128 {
range 172.16.5.100 172.16.5.105;
}
host tc0 {
hardware ethernet 00:E0:C5:59:25:25;
fixed-address 172.16.5.20;
filename "pxeboot";
}
EOF
# dhcpd bge0 (or else ...)
# echo "00:E0:C5:59:25:25 tc0" > /etc/ethers
# rarpd -a
# echo "tftp dgram udp wait root /usr/libexec/tftpd tftpd -s \
/var/tftpboot" >> /etc/inetd.conf
# inetd
# mkdir /var/tftpboot/etc
# cd /var/tftpboot
# ftp ftp://ftp.openbsd.org/pub/OpenBSD/4.7/i386/bsd
# ftp ftp://ftp.openbsd.org/pub/OpenBSD/4.7/i386/pxeboot
# echo "boot /bsd" > etc/boot.conf
# echo "tc0 root=172.16.5.5:/var/diskless/tc47/tc0" > /etc/bootparams
# portmap
# rpc.bootparamd
Regardless of the netmask configured above, the RARP client should
assume a netmask of /16. Past this point, it should start broadcasting
bootparamd requests to learn about the NFS server. On the client's
console, messages like these should appear after a few seconds:
PXE boot MAC address 00:e0:c5:59:25:25, interface vr0
nfs_boot: using interface vr0, with revarp & bootparams
nfs_boot: client_addr=172.16.5.20
RPC timeout for server 172.16.255.255 (0xac10ffff) prog 100000
RPC timeout for server 172.16.255.255 (0xac10ffff) prog 100000
...
And of course, the rpc.bootparamd server never receives the request,
thus never sends the answer.
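One way to check where the requests get lost is to watch the wire on
the server while the client boots. This is a sketch, assuming the
server's interface is bge0 as in the dhcpd invocation above:

```sh
# On the server: show portmap traffic (UDP port 111) addressed to the
# classful broadcast address the client is using.
tcpdump -n -i bge0 udp port 111 and dst host 172.16.255.255
```

If the requests show up here while portmap -d stays silent, the packets
reach the interface and are presumably dropped before they are
delivered to the socket.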
>Fix:
Fix is unknown. As a workaround, one can force the server's broadcast
address to match the classful broadcast that the client assumes for
its address range (172.16.255.255 in this example).
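A minimal sketch of that workaround, assuming the server's interface is
bge0 and its address is 172.16.5.5 (both inferred from the setup above,
adjust to the actual configuration):

```sh
# Assumed interface and address. Keeps the /25 netmask but overrides
# the broadcast address with the classful one the client sends to.
ifconfig bge0 inet 172.16.5.5 netmask 255.255.255.128 broadcast 172.16.255.255
```

Other hosts on the same subnet still use 172.16.5.127, so this is only
suitable as a stopgap on the boot server.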
>Release-Note:
>Audit-Trail:
>Unformatted: