carp init delay
In some cases when a network port comes up, it does not indicate that the network is ready. But on linkup, carp(4) will try to get out of the INIT state as soon as possible. And because all is quiet it will decide to become master. This then leads to master-master situations.

Here are two examples when this can happen, there are probably more:

(1) spanning tree may be in effect, and not yet forwarding
(2) a powering-up or rebooting switch that activates its ports immediately, but does not forward anything while not completely up yet (this may be an OpenBSD bridge too)

I wonder if carp(4) needs an extra knob (*shudder*) to pause in the INIT state while the rest of the network gets ready after a linkup.

I see in the source code there are already two mechanisms/workarounds that are related, but a pause may be a bit more generic:

- sc_suppress
- sc_delayed_arp

Anyone else observe/fix this by other means? Opinions?
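The timing problem described above can be made concrete with a toy model. This is only a sketch under stated assumptions, not how carp(4) is implemented: the function name, the `init_delay` parameter (standing in for the proposed pause in INIT), the 1-second advertisement interval, the 3-second master-down timeout, and the 30-second forwarding delay in the usage note are all illustrative values chosen for the example.

```python
def claims_master_wrongly(link_up_at, forwarding_at, init_delay=0.0,
                          adv_interval=1.0, master_down_timeout=3.0):
    """Toy model: does a rejoining node take over while a master already exists?

    Assumptions (hypothetical, for illustration only):
    - an existing master advertises every adv_interval seconds;
    - the switch port reports link at link_up_at but only forwards
      frames from forwarding_at onwards;
    - the node leaves INIT after init_delay and claims master if it
      hears nothing for master_down_timeout seconds.
    """
    listen_from = link_up_at + init_delay
    takeover_at = listen_from + master_down_timeout
    # Worst case: the first advertisement arrives one full interval after
    # the node is both listening and reachable through the switch.
    first_advert_heard = max(forwarding_at, listen_from) + adv_interval
    return takeover_at < first_advert_heard
```

With a port that reports link at t=0 but only forwards at t=30, `claims_master_wrongly(0.0, 30.0)` is true in this model, a 5-second pause does not change that, and a pause at least as long as the forwarding delay does.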
Re: carp init delay
On 2013/04/03 14:54, Camiel Dobbelaar wrote:
> In some cases when a network port comes up, it does not indicate that the network is ready. But on linkup, carp(4) will try to get out of the INIT state as soon as possible. And because all is quiet it will decide to become master.
> [...]
> Anyone else observe/fix this by other means? Opinions?

slightly messy, though at least this also applies to the case with things other than carp which could also have problems: add !sleep 5 or something in hostname.if for the physical interface...
Re: carp init delay
my 2 cents: timing is always a problem. Maybe you could arping the next hop and only then activate carp?

On Wed, Apr 3, 2013 at 9:34 AM, Stuart Henderson wrote:
> slightly messy, though at least this also applies to the case with things other than carp which could also have problems: add !sleep 5 or something in hostname.if for the physical interface...
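The "probe the next hop, then activate carp" idea can be sketched as a bounded retry loop. This is a minimal illustration, not an existing tool: `wait_until_reachable` and its parameters are hypothetical names, and `probe` stands in for whatever check is used (for example, a wrapper that runs arping against the next hop and reports success).

```python
import time
from typing import Callable


def wait_until_reachable(probe: Callable[[], bool],
                         attempts: int = 10,
                         interval: float = 1.0,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() up to `attempts` times, pausing `interval` seconds between tries.

    Returns True as soon as a probe succeeds, False if every attempt fails.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    for attempt in range(attempts):
        if probe():
            return True
        # No point sleeping after the final failed attempt.
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

A caller would bring the carp interface up only once this returns True. As noted later in the thread, this only covers the case where the host itself is booting, not link events that happen afterwards.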
Re: carp init delay
On 2013/04/03 15:43, Camiel Dobbelaar wrote:
> On 4/3/13 3:34 PM, Stuart Henderson wrote:
>>> In some cases when a network port comes up, it does not indicate that the network is ready. But on linkup, carp(4) will try to get out of the INIT state as soon as possible. And because all is quiet it will decide to become master.
>>>
>>> Anyone else observe/fix this by other means? Opinions?
>>
>> slightly messy, though at least this also applies to the case with things other than carp which could also have problems: add !sleep 5 or something in hostname.if for the physical interface...
>
> Yes, I already use that. That solves the case where the system with the carp interfaces itself is rebooted. But not the other cases. Hence I'd like that sleep applied always, instead of only when /etc/netstart is run. :-)

Ah, I see what you mean. Still there are things other than carp where this might also apply - for example the pfsync initial_bulk that gets handled via if_linkstatehooks (but maybe also userland things)..
Re: carp init delay
When the system with the carp interfaces comes up, a sleep in the hostname.if file works. An arping might be an optimization of that.

But I'd like carp to react properly to events *outside* the system. When I unplug/plug a network cable, spanning tree can kick in again on the switch. *Then* I'd like carp to pause. Or the example I mentioned earlier when a switch is powered off and on.

I think handling that belongs in the kernel, and not some userland voodoo (ifstated/cron scripts) to clean it up. :-)

On 4/3/13 3:37 PM, sven falempin wrote:
> my 2 cents: timing is always a problem, maybe you could arping the next hop and then activate the carp ?
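The difference between a boot-time sleep and a pause that applies to every link-up event can be pictured with a small hold-down timer that is re-armed on each link-up. This is a sketch of the idea only, written in Python for readability; the class and method names are hypothetical and do not correspond to anything in the carp(4) source.

```python
class InitHoldDown:
    """Toy hold-down timer: stay in INIT for `delay` seconds after every link-up."""

    def __init__(self, delay):
        self.delay = delay
        self.hold_until = None  # None means the link is down

    def link_up(self, now):
        # Re-armed on every link-up event, not just the first one at boot.
        self.hold_until = now + self.delay

    def link_down(self):
        self.hold_until = None

    def may_leave_init(self, now):
        # Leaving INIT is allowed only with link up and the hold-down expired.
        return self.hold_until is not None and now >= self.hold_until
```

Unplugging and replugging a cable calls `link_down()` and then `link_up()` again, so the pause restarts, which a one-off sleep in hostname.if cannot do.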
Re: carp init delay
On 4/3/13 3:54 PM, Stuart Henderson wrote:
> On 2013/04/03 15:43, Camiel Dobbelaar wrote:
>> Yes, I already use that. That solves the case where the system with the carp interfaces itself is rebooted. But not the other cases. Hence I'd like that sleep applied always, instead of only when /etc/netstart is run. :-)
>
> Ah, I see what you mean. Still there are things other than carp where this might also apply - for example the pfsync initial_bulk that gets handled via if_linkstatehooks (but maybe also userland things)..

Pausing carp may help the pfsync case too? Wasn't the major problem there caused by the freshly booted backup going to master too soon and cancelling the bulk update?