http://defect.opensolaris.org/bz/show_bug.cgi?id=13999
Summary: nwam sometimes fails to configure bge interfaces
Classification: Development
Product: nwam
Version: nwam1_130
Platform: ANY/Generic
OS/Version: OpenSolaris
Status: NEW
Status Whiteboard: sst-osp
Severity: normal
Priority: P3
Component: ON daemon
AssignedTo: nwam-dev at opensolaris.org
ReportedBy: Martin.Horcicka at Sun.COM
QAContact: nwam-dev at opensolaris.org
CC: sst-defect at sun.com
--- Comment #0 from Martin Horcicka <Martin.Horcicka at Sun.COM> 2010-01-20 18:42:21 CET ---
During SST testing of OpenSolaris build 130 we found that nwam
intermittently fails to configure network interfaces properly. So far we have
seen this only on platforms with bge interfaces, specifically:
- Sun Fire V20z
- Sun Fire V40z
- Sun Fire V210
After a reboot on which the failure occurs, the affected machine is left in
the state shown below.
# dladm show-ether
LINK   PTYPE    STATE  AUTO  SPEED-DUPLEX  PAUSE
bge2   current  down   no    0M            none
bge0   current  up     yes   1G-f          bi
bge1   current  down   no    0M            none
bge3   current  down   no    0M            none
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:14:4f:4d:87:4d
bge1: flags=1000802<BROADCAST,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:14:4f:4d:87:4e
bge2: flags=1000802<BROADCAST,MULTICAST,IPv4> mtu 1500 index 2
        inet 0.0.0.0 netmask 0
        ether 0:14:4f:4d:87:4f
bge3: flags=1004803<UP,BROADCAST,MULTICAST,DHCP,IPv4> mtu 1500 index 5
        inet 0.0.0.0 netmask ff000000
        ether 0:14:4f:4d:87:50
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
bge0: flags=2004841<UP,RUNNING,MULTICAST,DHCP,IPv6> mtu 1500 index 3
        inet6 fe80::214:4fff:fe4d:874d/10
        ether 0:14:4f:4d:87:4d
# svcs -xv
svc:/network/service:default (layered network services)
 State: offline since Tue Jan 12 14:51:58 2010
Reason: Start method is running.
   See: http://sun.com/msg/SMF-8000-C4
   See: man -M /usr/share/man -s 1M ifconfig
   See: /var/svc/log/network-service:default.log
Impact: 22 dependent services are not running:
        svc:/network/dns/client:default
        svc:/milestone/name-services:default
        svc:/milestone/multi-user:default
        svc:/system/boot-config:default
        svc:/system/intrd:default
        svc:/milestone/multi-user-server:default
        svc:/system/zones:default
        svc:/application/graphical-login/gdm:default
        svc:/system/rad:default
        svc:/system/filesystem/autofs:default
        svc:/system/system-log:default
        svc:/network/smtp:sendmail
        svc:/system/fpsd:default
        svc:/network/sendmail-client:default
        svc:/system/dumpadm:default
        svc:/system/fmd:default
        svc:/network/ssh:default
        svc:/application/pkg/update:default
        svc:/network/inetd:default
        svc:/system/cron:default
        svc:/application/opengl/ogl-select:default
        svc:/network/iscsi/initiator:default
svc:/network/rpc/smserver:default (removable media management)
 State: uninitialized since Tue Jan 12 14:51:17 2010
Reason: Restarter svc:/network/inetd:default is not running.
   See: http://sun.com/msg/SMF-8000-5H
   See: man -M /usr/share/man -s 1M rpc.smserverd
Impact: 2 dependent services are not running:
        svc:/milestone/multi-user-server:default
        svc:/system/zones:default
/var/svc/log/network-service:default.log:
[ Jan 12 14:51:15 Enabled. ]
[ Jan 12 14:51:58 Executing start method ("/lib/svc/method/net-svc start"). ]
[ Jan 12 14:51:58 Timeout override by svc.startd. Using infinite timeout. ]
WARNING: Timed out waiting for NIS to come up
# svcs physical
STATE          STIME    FMRI
disabled       14:51:14 svc:/network/physical:default
online         14:51:17 svc:/network/physical:nwam
# svcs -l nwam
fmri         svc:/network/physical:nwam
name         physical network interface autoconfiguration
enabled      true
state        online
next_state   none
state_time   Tue Jan 12 14:51:17 2010
logfile      /var/svc/log/network-physical:nwam.log
restarter    svc:/system/svc/restarter:default
contract_id  8
dependency   require_all/none svc:/network/loopback (online)
dependency   require_all/none svc:/network/datalink-management (online)
/var/svc/log/network-physical:nwam.log:
[ Jan 12 14:51:14 Enabled. ]
[ Jan 12 14:51:16 Executing start method ("/lib/svc/method/net-nwam start"). ]
[ Jan 12 14:51:17 Method "start" exited with status 0. ]
ifconfig: bge3: wait timed out, operation still pending...
ifconfigifconfig: unable to start : unable to start
/sbin/dhcpagent/sbin/dhcpagent
After a manual restart of nwam, the following lines are added to
/var/svc/log/network-physical:nwam.log (and the interfaces are configured
correctly):
[ Jan 13 13:13:46 Stopping because service restarting. ]
[ Jan 13 13:13:46 Executing stop method ("/lib/svc/method/net-nwam stop"). ]
[ Jan 13 13:13:46 Method "stop" exited with status 0. ]
[ Jan 13 13:13:46 Executing start method ("/lib/svc/method/net-nwam start"). ]
[ Jan 13 13:13:46 Method "start" exited with status 0. ]
ifconfig: bge2: interface not in appropriate state for command
ifconfig: bge3: interface not in appropriate state for command
Setting netmask of lo0 to 255.0.0.0
sh: line 1: /etc/nwam/ulp/check-conditions: not found
Unfortunately, we have not been able to reproduce the problem since enabling
nwam debug logging.
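One way to spot the failed state automatically after each reboot is sketched
below. This is only an illustration and not part of the SST tooling: the
function name pending_dhcp_interfaces is made up here, and the check assumes
the failure always shows the pattern seen on bge3 above (DHCP flag set on the
IPv4 interface while its address is still 0.0.0.0).

```python
import re

# Matches an ifconfig interface header such as
# "bge3: flags=1004803<UP,BROADCAST,MULTICAST,DHCP,IPv4> mtu 1500 index 5"
HEADER = re.compile(r"^(\S+): flags=[0-9a-f]+<([^>]*)>")
# Matches an IPv4 address line ("inet ..."), but not "inet6 ..."
INET = re.compile(r"^\s*inet (\S+)")


def pending_dhcp_interfaces(ifconfig_output):
    """Return IPv4 interfaces flagged DHCP that still have address 0.0.0.0.

    Hypothetical helper: the input is the text printed by `ifconfig -a`.
    """
    pending = []
    current = None  # (name, flags) of the interface block being read
    for line in ifconfig_output.splitlines():
        header = HEADER.match(line)
        if header:
            current = (header.group(1), set(header.group(2).split(",")))
            continue
        inet = INET.match(line)
        if inet and current:
            name, flags = current
            if {"DHCP", "IPv4"} <= flags and inet.group(1) == "0.0.0.0":
                pending.append(name)
    return pending
```

Given the ifconfig -a output shown earlier in this report, the check would
flag bge3 only; bge0 is not flagged because its IPv4 entry carries no DHCP
flag.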