http://defect.opensolaris.org/bz/show_bug.cgi?id=10377
amaguire <alan.maguire at sun.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ACCEPTED |CAUSEKNOWN
--- Comment #5 from amaguire <alan.maguire at sun.com> 2009-08-10 02:37:34 ---
(In reply to comment #4)
> part of the problem here is that it seems like known_wlan_add_to_known_wlans()
> will core if it's passed a NULL BSSID, causing nwamd to restart and the entry
> not to be added to the known list. From there, we've already connected, but
> perhaps due to a refresh or something similar, we re-init state and end up in
> "need selection". I need to investigate that part further, but I've got a fix
> for the known WLAN coredump coming soon.
In trying to reproduce this, I'm seeing a different pattern of behaviour
unfortunately. I let nwam connect, then I delete the known WLAN entry and
switch off the AP for a few minutes. When I switch it back on, I get a link
down/link up event, wifi is still connected and the IP addresses configured are
still there and traffic flows. So I end up in the correct state. The thing I
couldn't figure out is why nwamd decided to refresh link:ath0, bringing it back
to the initialized state, from which point we end up in the "offline*,need
wireless network selection" state/aux state. That's the part I didn't see at my
end. The portion of the log at the start describes this:
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 956038 daemon.debug] 2: signal
Alarm Clock caught
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 207793 daemon.debug] 2: enqueueing
event 4 (OBJECT_STATE) for object (80f9748) link:ath0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 941650 daemon.debug] 1: dequeueing
event of type 4 (OBJECT_STATE) for object link:ath0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 715742 daemon.debug] 1: (80f9748)
link:ath0: running method for event 4 (OBJECT_STATE)
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 938006 daemon.debug] 1:
nwamd_ncu_handle_state_event: changing state for NCU link:ath0 to (offline* ,
(re)initialized but not configured)
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 549847 daemon.debug] 1: enqueueing
event 4 (OBJECT_STATE) for object (80f0048) link:ath0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 477446 daemon.debug] 1:
nwamd_dlpi_add_link: ncu 80b4bc8 (link:ath0) type 0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 941650 daemon.debug] 1: dequeueing
event of type 4 (OBJECT_STATE) for object link:ath0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 713878 daemon.debug] 1: (80f0048)
link:ath0: running method for event 4 (OBJECT_STATE)
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 693018 daemon.debug] 1:
nwamd_ncu_handle_state_event: changing state for NCU link:ath0 to (offline* ,
scanning for available networks on WiFi link)
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 980078 daemon.debug] 1:
nwamd_wlan_scan: WLAN scan for ath0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 649249 daemon.debug] 1: delay
enqueueing event PERIODIC_SCAN for object (80f9748) link:ath0 for 120 sec
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 171865 daemon.debug] 473:
wlan_scan_thread: initiating scan on ath0
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 767910 daemon.debug] 14:
door_switch: retrieving active priority-group: 1
Jul 30 09:35:13 starbug-mac nwamd[411]: [ID 196689 daemon.debug] 14:
door_switch: retrieving state for link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 144003 daemon.debug] 473:
enqueueing event 7 (WLAN_SCAN_REPORT) for object (80f0048) link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 378737 daemon.debug] 1: dequeueing
event of type 7 (WLAN_SCAN_REPORT) for object link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 126147 daemon.debug] 1: (80f0048)
link:ath0: no matching method for event 7 (WLAN_SCAN_REPORT)
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 316842 daemon.debug] 473:
wlan_scan_thread: no known WLANs - ask user
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 160749 daemon.debug] 473:
enqueueing event 4 (OBJECT_STATE) for object (80f0048) link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 941650 daemon.debug] 1: dequeueing
event of type 4 (OBJECT_STATE) for object link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 272716 daemon.debug] 14:
door_switch: NCU link:ath0 is offline*
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 713878 daemon.debug] 1: (80f0048)
link:ath0: running method for event 4 (OBJECT_STATE)
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 347995 daemon.debug] 1:
nwamd_ncu_handle_state_event: changing state for NCU link:ath0 to (offline* ,
need WiFi wireless network selection)
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 160045 daemon.debug] 1: enqueueing
event 8 (WLAN_NEED_CHOICE) for object (80f9408) link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 142482 daemon.debug] 1: dequeueing
event of type 8 (WLAN_NEED_CHOICE) for object link:ath0
Jul 30 09:35:14 starbug-mac nwamd[411]: [ID 448243 daemon.debug] 1: (80f9408)
link:ath0: no matching method for event 8 (WLAN_NEED_CHOICE)
The fact that we are responding to a SIGALARM suggests the state
reinitialization is from a timed event. The only place that could be from is
ncu.c, responding to a link down event. Do you have logs prior to this? If so,
I'd suspect we'd find the following 30 seconds before:
"nwamd_ncu_state_machine: wifi disconnect - start over after 30sec interval",
So basically there seems to be some difference in how iwk (which I was using)
and ath (which you were using) respond to an AP shutting down - either that
or the APs themselves work differently. Regardless, the problems to deal with
here are:
1) APs are not being added to the known WLAN list.
2) link down events like this aren't being propogated to IP. Also, link down
events like this need to reset the wifi connected state for the ncu data
structure so that nwamd forces a disconnect in order to retry connecting.
2 wouldn't have been so noticeable if 1 hadn't happened. I'm trying to get the
GUI working to assess if 1 is fixed, and I've got a fix in progress for 2.
Changing (tentatively) to cause known as I'm reasonably sure I've pinned down
what's happening, even though I can't fully reproduce it.
--
Configure bugmail: http://defect.opensolaris.org/bz/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.