Hi,
I upgraded a machine from SmartOS release 20160804T173241Z to
20160818T234814Z today. After rebooting, there was no network
connectivity. After some debugging it turned out that LACP link
aggregation didn't come up.
The server is an old Sun X4170, using the built-in four-port igb card,
connected to a Juniper SRX firewall. In the failed state, all links
were marked as physically up but LACP didn't negotiate:
root@anto:~ # uname -a
SunOS anto.nym.se 5.11 joyent_20160818T234814Z i86pc i386 i86pc
root@anto:~ # dladm show-link
LINK         CLASS     MTU    STATE    BRIDGE     OVER
igb0         phys      1500   up       --         --
igb2         phys      1500   up       --         --
igb1         phys      1500   up       --         --
igb3         phys      1500   up       --         --
aggr0        aggr      1500   up       --         igb0 igb1 igb2 igb3
net0         vnic      1500   ?        --         aggr0
net0         vnic      1500   ?        --         aggr0
eth0         vnic      1500   ?        --         aggr0
eth0         vnic      1500   ?        --         aggr0
eth0         vnic      1500   ?        --         aggr0
net0         vnic      1500   ?        --         aggr0
net0         vnic      1500   ?        --         aggr0
net0         vnic      1500   ?        --         aggr0
root@anto:~ # dladm show-aggr -L
LINK        PORT         AGGREGATABLE SYNC COLL DIST DEFAULTED EXPIRED
aggr0       igb0         yes          no   no   no   yes       no
--          igb1         yes          no   no   no   yes       no
--          igb2         yes          no   no   no   yes       no
--          igb3         yes          no   no   no   yes       no
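
For anyone who wants to check a host for the same state, here is a minimal
sketch that lists the ports whose SYNC column is not "yes". It assumes the
column order shown in the output above; the function name
lacp_unsynced_ports is made up for illustration and is not part of SmartOS.

```shell
# Hypothetical helper, not a SmartOS tool.
# Reads `dladm show-aggr -L` output on stdin and prints every port whose
# SYNC column is not "yes". Assumes the column order
#   LINK PORT AGGREGATABLE SYNC COLL DIST DEFAULTED EXPIRED
# so PORT is field 2 and SYNC is field 4. The header line is skipped.
lacp_unsynced_ports() {
    awk 'NR > 1 && NF >= 8 && $4 != "yes" { print $2 }'
}

# Usage on the affected host:
#   dladm show-aggr -L | lacp_unsynced_ports
```

Fed the output above, it should list all four igb ports; on a healthy
aggregation it should print nothing.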
On the firewall side, interfaces were up but indicating no LACP traffic:
jb@hlv-srx240> show lacp interfaces ae0
Aggregated interface: ae0
    LACP state:     Role     Exp  Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-0/0/12     Actor    No   Yes  No    No   No   Yes   Fast     Active
      ge-0/0/12     Partner  No   Yes  No    No   No   Yes   Fast     Passive
      ge-0/0/13     Actor    No   Yes  No    No   No   Yes   Fast     Active
      ge-0/0/13     Partner  No   Yes  No    No   No   Yes   Fast     Passive
      ge-0/0/14     Actor    No   Yes  No    No   No   Yes   Fast     Active
      ge-0/0/14     Partner  No   Yes  No    No   No   Yes   Fast     Passive
      ge-0/0/15     Actor    No   Yes  No    No   No   Yes   Fast     Active
      ge-0/0/15     Partner  No   Yes  No    No   No   Yes   Fast     Passive
    LACP protocol:  Receive State  Transmit State  Mux State
      ge-0/0/12     Defaulted      Fast periodic   Detached
      ge-0/0/13     Defaulted      Fast periodic   Detached
      ge-0/0/14     Defaulted      Fast periodic   Detached
      ge-0/0/15     Defaulted      Fast periodic   Detached
The configuration is fairly straightforward:
root@anto:~ # egrep aggr\|admin /usbkey/config
aggr0_aggr=0:14:4f:e7:39:6,0:14:4f:e7:39:7,0:14:4f:e7:39:8,0:14:4f:e7:39:9
aggr0_lacp_mode=active
# admin_nic is the nic admin_ip will be connected to for headnode zones.
admin_nic=aggr0
admin_ip=172.16.32.3
admin_ip6=2001:470:deeb:32::3/64
admin_netmask=255.255.255.0
admin_network=...
admin_gateway=172.16.32.11
internal_nic=aggr0
After rebooting back into 20160804T173241Z, everything is fine again -
LACP comes up immediately.
Any ideas? The changelog didn't mention anything specifically relevant
(searching for "lacp", "igb") that I could see...
//jb