** Description changed:
It was brought to my attention (by others) that ifupdown runs into race
conditions in some specific cases.
[Impact]
This happens when deploying many servers at once (where it is more likely)
or only from time to time, like any other intermittent race condition.
Interfaces are not brought up as they should be, which has a big impact
on servers that cannot rely on network start scripts.
The problem is caused by a race condition when init (upstart) starts up
network interfaces in parallel.
[Test Case]
Use the attached script to reproduce the error (on a single virtual
machine it might take some hours for the error to occur).
- (example 1)
+ * Please note that my bonding examples use eth1 and eth2 as slave
+ interfaces.
- *** sequence to trigger race-condition ***
+ Some ifupdown race conditions are explained below:
- (a) ifup eth0 (b) ifup -a for eth0
+ !!!!
+ case 1)
+ (a) ifup eth0 (b) ifup -a for eth0
-----------------------------------------------------------------
1-1. Lock ifstate.lock file.
- 1-1. Wait for locking ifstate.lock
- file.
+ 1-1. Wait for locking ifstate.lock
+ file.
1-2. Read ifstate file to check
- the target NIC.
+ the target NIC.
1-3. close(=release) ifstate.lock
- file.
+ file.
1-4. Judge that the target NIC
- isn't processed.
- 1-2. Read ifstate file to check
- the target NIC.
- 1-3. close(=release) ifstate.lock
- file.
- 1-4. Judge that the target NIC
- isn't processed.
+ isn't processed.
+ 1-2. Read ifstate file to check
+ the target NIC.
+ 1-3. close(=release) ifstate.lock
+ file.
+ 1-4. Judge that the target NIC
+ isn't processed.
2. Lock and update ifstate file.
- Release the lock.
- 2. Lock and update ifstate file.
- Release the lock.
+ Release the lock.
+ 2. Lock and update ifstate file.
+ Release the lock.
+ !!!
- (example 2)
- Bonding device using eth0.
- ifenslave for eth0 is also executed in parallel, eth0 remains down.
-
- *** sequence to trigger race-condition ***
-
- (a) ifenslave of eth0 (b) ifenslave of eth0
+ !!!
+ case 2)
+ (a) ifenslave of eth0 (b) ifenslave of eth0
------------------------------------------------------------------
- 3. Execute ifenslave of eth0. 3. Execute ifenslave of eth0.
+ 3. Execute ifenslave of eth0. 3. Execute ifenslave of eth0.
4. Link down the target NIC.
5. Write NIC id to
- /sys/class/net/bond0/bonding
+ /sys/class/net/bond0/bonding
/slaves then NIC gets up
- 4. Link down the target NIC.
- 5. Fails to write NIC id to
- /sys/class/net/bond0/bonding/
+ 4. Link down the target NIC.
+ 5. Fails to write NIC id to
+ /sys/class/net/bond0/bonding/
slaves because it is already written.
-
- (example 3)
-
- bonding is not set to active-backup as defined in config file: When the
- init(upstart) executes "if-pre-up.d/ifenslave" script and "if-pre-
- up.d/vlan" script for bond0 device in parallel, the "if-pre-
- up.d/ifenslave" script fails to change the bonding mode with an error
- message, "bonding: unable to update mode of bond0 because interface is
- up.".
-
- *** sequence to trigger race-condition ***
-
- (a)ifup bond0 (b)ifup -a
- -----------------------------------------------------------------------
- 1. Update statefile about bond0.
- 1. Does nothing about bond0
- because statefile is already
- updated about it.
- 2. ifenslave::setup_master()
- sysfs_change_down mode 1
- and link down bond0.
- 2. Link up bond0 by the vlan
- script on the processing
- for linking up bond0.201(*1).
- 3. "echo 1 > .../mode" fails.
-
- [ /etc/network/if-pre-up.d/vlan ]
-
- if [ -n "$IF_VLAN_RAW_DEVICE" ] && [ ! -d /sys/class/net/$IFACE ]; then
-     if [ ! -x /sbin/vconfig ]; then
-         exit 0
-     fi
-     if ! ip link show dev "$IF_VLAN_RAW_DEVICE" > /dev/null; then
-         echo "$IF_VLAN_RAW_DEVICE does not exist, unable to create $IFACE"
-         exit 1
-     fi
-     ip link set up dev $IF_VLAN_RAW_DEVICE   # <-- (*1)
-     vconfig add $IF_VLAN_RAW_DEVICE $VLANID
- fi
-
-
- [Regression Potential]
-
- * Attaching proposed patch (for upstream as well) and describing
- potential later on today.
-
- [Other Info]
-
- Example: [ /etc/network/interfaces ]
-
- auto lo
- iface lo inet loopback
-
- auto eth0
- iface eth0 inet manual
- bond-master bond0
-
- auto eth1
- iface eth1 inet manual
- bond-master bond0
-
- auto bond0
- iface bond0 inet dhcp
- bond-slaves eth0 eth1
- hwaddress 11:22:33:44:55:66
- bond-primary eth0
- bond-mode 1
- bond-miimon 100
- bond-updelay 200
- bond-downdelay 200
-
- auto bond0.201
- iface bond0.201 inet dhcp
- hwaddress 11:22:33:44:55:66
- vlan-raw-device bond0
- ...
-
- auto bond0.205
- iface bond0.205 inet dhcp
- hwaddress 11:22:33:44:55:66
- vlan-raw-device bond0
+ !!!
** Description changed:
It was brought to my attention (by others) that ifupdown runs into race
conditions in some specific cases.
[Impact]
This happens when deploying many servers at once (where it is more likely)
or only from time to time, like any other intermittent race condition.
Interfaces are not brought up as they should be, which has a big impact
on servers that cannot rely on network start scripts.
The problem is caused by a race condition when init (upstart) starts up
network interfaces in parallel.
[Test Case]
Use the attached script to reproduce the error (on a single virtual
machine it might take some hours for the error to occur).
* Please note that my bonding examples use eth1 and eth2 as slave
- interfaces.
+ interfaces.
Some ifupdown race conditions are explained below:
!!!!
case 1)
(a) ifup eth0 (b) ifup -a for eth0
-----------------------------------------------------------------
1-1. Lock ifstate.lock file.
- 1-1. Wait for locking ifstate.lock
- file.
+ 1-1. Wait for locking ifstate.lock
+ file.
1-2. Read ifstate file to check
- the target NIC.
+ the target NIC.
1-3. close(=release) ifstate.lock
- file.
+ file.
1-4. Judge that the target NIC
- isn't processed.
- 1-2. Read ifstate file to check
- the target NIC.
- 1-3. close(=release) ifstate.lock
- file.
- 1-4. Judge that the target NIC
- isn't processed.
+ isn't processed.
+ 1-2. Read ifstate file to check
+ the target NIC.
+ 1-3. close(=release) ifstate.lock
+ file.
+ 1-4. Judge that the target NIC
+ isn't processed.
2. Lock and update ifstate file.
- Release the lock.
- 2. Lock and update ifstate file.
- Release the lock.
+ Release the lock.
+ 2. Lock and update ifstate file.
+ Release the lock.
!!!
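
Case 1 is a check-then-act race: both processes release ifstate.lock
after reading (step 1-3) and only take it again to update (step 2), so
both can conclude that the NIC is unprocessed. As an illustration only
(this is not the ifupdown code or the proposed patch), a minimal Python
sketch that holds one exclusive lock across the read, the check and the
update could look like the block below. The function name
claim_interface, the temporary paths and the one-name-per-line state
format are assumptions made for the example.

```python
import fcntl
import os
import tempfile


def claim_interface(state_path, lock_path, iface):
    """Return True if this caller claimed iface, False if it was already claimed.

    Hypothetical helper: the state file here holds one interface name per
    line, which is a simplification and not the real ifstate format.
    """
    with open(lock_path, "a") as lock_file:
        # Blocks until the exclusive lock is ours; it is kept for the whole
        # read-check-update sequence instead of being dropped after the read.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            claimed = set()
            if os.path.exists(state_path):
                with open(state_path) as state:
                    claimed = {line.strip() for line in state if line.strip()}
            if iface in claimed:
                return False
            with open(state_path, "a") as state:
                state.write(iface + "\n")
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Two callers asking for the same NIC: only the first one should proceed.
workdir = tempfile.mkdtemp()
state_file = os.path.join(workdir, "ifstate")
lock_file_path = os.path.join(workdir, "ifstate.lock")
first = claim_interface(state_file, lock_file_path, "eth0")
second = claim_interface(state_file, lock_file_path, "eth0")
```

With the lock held throughout, the second caller sees eth0 already
recorded and backs off instead of configuring the interface again.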
-
!!!
case 2)
(a) ifenslave of eth0 (b) ifenslave of eth0
------------------------------------------------------------------
3. Execute ifenslave of eth0. 3. Execute ifenslave of eth0.
4. Link down the target NIC.
5. Write NIC id to
- /sys/class/net/bond0/bonding
- /slaves then NIC gets up
- 4. Link down the target NIC.
- 5. Fails to write NIC id to
- /sys/class/net/bond0/bonding/
- slaves it is already written.
+ /sys/class/net/bond0/bonding
+ /slaves then NIC gets up
+ 4. Link down the target NIC.
+ 5. Fails to write NIC id to
+ /sys/class/net/bond0/bonding/
+ slaves because it is already written.
!!!
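
To make the case 2 outcome easier to follow, here is a small in-memory
Python model of the sequence. It is only a sketch under assumptions:
FakeBond, enslave_unguarded and enslave_guarded are hypothetical names,
nothing here touches /sys, and the guarded variant is one possible way
to serialize the steps, not the fix proposed for ifenslave.

```python
import threading


class FakeBond:
    """In-memory stand-in for a bond's slave list and the NIC link states."""

    def __init__(self):
        self.slaves = set()
        self.link_up = {}


def enslave_unguarded(bond, nic):
    """Model of steps 4 and 5 for a caller that already decided to enslave."""
    bond.link_up[nic] = False      # step 4: link down the target NIC
    if nic in bond.slaves:         # step 5: the write fails, already a slave
        return False               # the NIC is left down
    bond.slaves.add(nic)           # step 5: the write succeeds
    bond.link_up[nic] = True       # and the NIC gets up
    return True


def enslave_guarded(bond, nic, lock):
    """Same steps, but the membership check and the steps share one lock."""
    with lock:
        if nic in bond.slaves:
            return False           # already enslaved: leave the link alone
        return enslave_unguarded(bond, nic)


# Both (a) and (b) passed step 3, so both run steps 4 and 5 for eth0.
racy = FakeBond()
enslave_unguarded(racy, "eth0")    # (a): eth0 ends up enslaved and up
enslave_unguarded(racy, "eth0")    # (b): links eth0 down, then fails to write

guarded = FakeBond()
guard = threading.Lock()
enslave_guarded(guarded, "eth0", guard)
enslave_guarded(guarded, "eth0", guard)
```

In the unguarded run eth0 stays enslaved but down, which matches the
symptom described above; in the guarded run the second caller returns
before touching the link.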
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1337873
Title:
Precise, Trusty, Utopic - ifupdown initialization problems caused by
race condition
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1337873/+subscriptions