On 2 July 2015 at 07:57, Ted Unangst <[email protected]> wrote:
> this has been an ongoing problem, but I think it's gotten worse.
>
> When I change networks, I run dhclient again. It tells me I have a lease. For
> instance:
> DHCPREQUEST on em0 to 255.255.255.255
> DHCPREQUEST on em0 to 255.255.255.255
> DHCPDISCOVER on em0 - interval 3
> DHCPOFFER from 192.168.1.1 (48:f8:b3:05:5e:09)
> DHCPREQUEST on em0 to 255.255.255.255
> DHCPACK from 192.168.1.1 (48:f8:b3:05:5e:09)
> bound to 192.168.1.137 -- renewal in 43200 seconds.
>
> But then it doesn't actually assign the IP and I have no routes:
> Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
> 127/8              127.0.0.1          UGRS       0        0 32768     8 lo0
> 127.0.0.1          127.0.0.1          UHl        1    26262 32768     1 lo0
> 224/4              127.0.0.1          URS        0        0 32768     8 lo0
>
> So I have to run it again.
> DHCPREQUEST on em0 to 255.255.255.255
> DHCPREQUEST on em0 to 255.255.255.255
> DHCPACK from 192.168.1.1 (48:f8:b3:05:5e:09)
> bound to 192.168.1.137 -- renewal in 43200 seconds.
>
> Now I have an IP and routes:
> default            192.168.1.1        UGS        0        0     -     8 em0
> ...
>
> There appears to be a race where dhclient changes my IP address, then decides
> to delete the old address, but actually deletes the new address.
>
Best guess based on this info: It gets the ACK, deletes the old address and routes, and then adds the new address and routes. It gets the routing message reporting the expected address has been added, emits "bound to ...", and then gets another routing message that tells it somebody is messing with the interface and decides to exit.

If you turn out to be the first person able to reproduce this often enough, and willing to run some of the many diagnostic diffs I have cast upon the waters for past reports of this issue, I would be delighted to make another attempt to find and avoid the race.

To begin, some of

1) /var/log/daemon entries during an event.

2) /etc/dhclient.conf

3) /var/db/dhclient* files before and after switching networks

4) ifconfig and netstat before and after switching networks

5) tcpdump -i em0 -vv -X -s 2000 host 192.168.1.1

6) define 'gotten worse' -- % of 'failures'?

7) dmesg is always nice

8) running 'dhclient -L <path>' to record the actual leases offered

9) output of 'route -n monitor' during dhclient run on new network

10) compile dhclient with #define DEBUG turned on in dhcpd.h, and run 'dhclient -d' on new network

11) what if any M's are in the kernel you are running, especially any of the recent network stack ones

12) a detailed description of how you are changing networks, especially the timing between leaving one and joining the other and the timing between running the various dhclient instances

13) 'pgrep -l -f dhclient' before and after each run of dhclient

Be warned, given the current dhclient architecture and the way routing messages are produced, there are very likely unsolvable races involved. In particular, routing messages do not contain the PID of the program that caused the ADD or DELETE message to be issued, which makes it theoretically impossible to guarantee the correct thing is done with them. Each dhclient instance must guess whether a routing message was caused by itself or by a competing instance.
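
To make the guessing problem concrete, here is a toy model in Python. It is only a sketch and not dhclient code: RouteMsg, ToyClient and replay are made-up names, 10.0.0.5 is a made-up "old" address, and the rule "anything other than the expected ADD means exit" is my simplification of the best guess above. The point is that a message with no PID looks the same whoever caused it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteMsg:
    """Toy routing message. Deliberately has no sender PID field."""
    kind: str   # "ADD" or "DELETE"
    addr: str   # address the message refers to


class ToyClient:
    """Hypothetical stand-in for one dhclient instance."""

    def __init__(self, expected_addr):
        self.expected_addr = expected_addr
        self.bound = False
        self.running = True

    def handle(self, msg):
        if not self.running:
            return "ignored"
        if msg.kind == "ADD" and msg.addr == self.expected_addr:
            # The address we asked for showed up: report "bound to ...".
            self.bound = True
            return "bound"
        # Assumed rule: any other change looks like somebody else
        # messing with the interface, so this instance gives up.
        self.running = False
        return "exit"


def replay(client, msgs):
    """Feed messages to the client in order and collect its reactions."""
    return [client.handle(m) for m in msgs]


# Two DELETE messages for the (made-up) old address: one caused by this
# instance's own cleanup, one caused by a competing instance.
own_delete = RouteMsg("DELETE", "10.0.0.5")
foreign_delete = RouteMsg("DELETE", "10.0.0.5")

client = ToyClient("192.168.1.137")
print(replay(client, [RouteMsg("ADD", "192.168.1.137"), own_delete]))
print(own_delete == foreign_delete)
```

In this model the client reports "bound" and then exits on the very next message, and the two DELETE messages compare equal, so nothing in the message itself can tell the client which case it is in.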

Another question is why do you run dhclient again? The link up/down (and I assume there is a link up/down pair of routing messages) should trigger a lease renewal.

.... Ken
