Hi Michel,

Looks good to me. It looks like this thread didn't make it to the list, so I'll try to forward it.

On Tue, Dec 17, 2013 at 4:23 AM, Stam, Michel [FINT] <[email protected]> wrote:
> Hello again,
>
> As promised, I reworked the patch a little; most of the code from
> new_unauthenticated_peer( ) in linux/meshd-nl80211.c has been moved to
> peer_created( ) in the same file. As peer_created( ) is invoked before
> new_unauthenticated_peer( ) both when a beacon is received or when no
> beacon is received, this should work.
>
> Let me know if this works for you.
>
> Kind regards,
>
> Michel Stam
>
> -----Original Message-----
> From: Stam, Michel [FINT]
> Sent: Monday, December 16, 2013 1:10 PM
> To: 'Thomas Pedersen'
> Cc: [email protected]
> Subject: RE: Question regarding AuthSAE
>
> Hello Thomas,
>
> Let me respond to your email;
>
>>> What I derive from this is that the NEW_PEER_CANDIDATE command, which
>>> is invoked every time a beacon is received, is sent -after- the SAE
>>> exchange. Because of this, there's no record in the kernel of the
>>> peer, and while the SAE exchange succeeds, the keys are not installed
>>> (there's no peer to associate them with in the kernel), and the
>>> connection does not work.
>>
>>So the station is only ever inserted in response to a NEW_PEER_CANDIDATE
>>event, I guess as long as this happens before key installation we're ok.
>>What we do in wpa_supplicant is just defer station insertion until
>>receiving the first OPEN frame. Note only the userspace-level station
>>struct is created on receiving a NEW_PEER_CANDIDATE event.
>
> So long as at least one beacon has been received before the exchange
> occurs, it seems to work well enough. I did see some 'invalid argument'
> errors, but they do not seem to keep the mesh link from functioning.
> It is basically a race condition; technically it could happen if someone
> starts up meshd-nl80211 during the bootup of a system (as is in the case
> of a router).
>
> With regard to meshd-nl80211 creating only the userspace-level struct:
> what you say is true. However, NL80211_CMD_NEW_PEER_CANDIDATE is sent by
> the kernel (net/wireless/nl80211.c cfg80211_notify_new_peer_candidate( ),
> reached from net/mac80211/mesh_plink.c mesh_sta_info_alloc( ) via
> mesh_sta_info_get( ) and mesh_neighbour_update( ), which is called from
> ieee80211_mesh_rx_bcn_presp( )). That last function is invoked when
> IEEE80211_STYPE_BEACON or IEEE80211_STYPE_PROBE_RESP management frames
> are received.
> AuthSAE's create_candidate( ) is invoked from process_mgmt_frame( ) when
> a beacon is received (IEEE802_11_FC_STYPE_BEACON). So basically the
> beacon triggers 2 reactions in meshd-nl80211: one that creates the user
> struct, and one that causes the kernel to send a message to
> meshd-nl80211, which in turn tells the kernel to add the candidate via
> NL80211_CMD_NEW_STATION.
>
> (Note that I traced only the path that has probably caused this
> behaviour in meshd-nl80211).
>
> When no beacon is received, create_candidate( ) is invoked either from
> reauth( ) in sae.c, or on SAE_AUTH_COMMIT in process_mgmt_frame( ). Given
> that reauth( ) is not the issue here (both mesh nodes have just
> started), I think this is where things go wrong. NL80211_CMD_NEW_STATION
> is not called; this only happens on receipt of a beacon.
>
>>> Would you mind looking at this, to see if I did not do anything which
>>> might cause drastic failures? Note that this does need cleaning up,
>>> but I'd like to do that once I am certain that this is a viable
>>> solution.
>>
>>It looks ok. Care to send a patch to [email protected] or send
>>a github pull-request once you're happy with it?
>
> What I intend to do is take the duplicated code from
> new_unauthenticated_peer( ) in linux/meshd-nl80211.c and
> communicate new candidates to the kernel when peer_created( ) is
> invoked. I haven't tested this, but I think this will work equally well,
> without any -EEXIST errors.
>
>>What kind of patch format is that anyway? :)
> It's a context diff. For some reason I stuck with the predecessor of the
> unified diff, but never mind, I'll send a unified diff next time. Old
> habits die hard.
>
> I'll send a (reworked) patch later on this week when I've had a little
> time to rework it.
>
> Regards,
>
> Michel Stam
>
> -----Original Message-----
> From: Thomas Pedersen [mailto:[email protected]]
> Sent: Sunday, December 15, 2013 9:18 PM
> To: Stam, Michel [FINT]
> Cc: [email protected]
> Subject: Re: Question regarding AuthSAE
>
> Hi Michel,
>
> CCing o11s-devel, as maybe someone there can help you as well.
>
> On Fri, Dec 13, 2013 at 04:33:59PM +0100, Stam, Michel [FINT] wrote:
>> Hello,
>>
>> I came across your email address because I think I have found
>> something in AuthSAE (the latest GIT release available on
>> https://github.com/cozybit/authsae/), but I was unable to find any
>> maintainer. Can you direct me to the maintainer, or perhaps help me?
>
> I think all the cozybit folks responsible are on the open80211s list.
>
>> What I recently discovered;
>>
>> I have 2 mesh nodes running mac80211s from kernel 3.12.1 (stock
>> kernel). Both nodes are Dell Precision M6500 units with an Intel i7
>> (there's 8 cores here, mind). The only non-default hardware change is
>> the addition of an ATH9K AR9285 radio card (so that hardware MFP should
>> work). See also the attached authsae.sample.cfg used for the test
>> setup. The setup is created after a clean boot as:
>>
>> mount none /sys/kernel/debug -t debugfs
>> iw dev wlan0 del
>> iw dev mesh0 del
>> iw phy phy0 interface add mesh0 type mp
>> ifconfig mesh0 up
>> ./meshd-nl80211 -c ./authsae.sample.cfg
>>
>> When the meshd-nl80211 process is started on both units (almost
>> simultaneously, one just a fraction earlier than the other), the node
>> that starts first has issues installing the encryption keys. The
>> second node has no such problem.
>>
>> Further investigation showed that the final establish is done in
>> estab_peer_link( ) in linux/meshd-nl80211.c, specifically using calls
>> to the install_key( ) function, and the set_supported_rates( )
>> function. These are the ones that fail.
>>
>> All of these make calls to the kernel function sta_info_get( ) /
>> sta_info_get_bss( ) in net/mac80211/sta_info.c which expects that the
>> station has been allocated in the kernel, and returns NULL otherwise.
>> Functions calling these return -ENOENT if this happens.
>>
>> As meshd-nl80211 has taken control of station allocation from the
>> kernel this should have been done by the time the nodes establish the
>> mesh link. However, this is not always the case.
>>
>> See doesnotwork.log, in particular this snippet:
>>
>> estab with 48:5d:60:c0:27:20
>> set auth flag (seq num=1386330674)
>> mesh plink with 48:5d:60:c0:27:20 established
>> nlerror, cmd 18, seq 1386330673: No such file or directory
>> nlerror, cmd 18, seq 1386330674: No such file or directory
>> nlerror, cmd 11, seq 1386330675: No such file or directory
>> nlerror, cmd 11, seq 1386330676: No such file or directory
>> nlerror, cmd 11, seq 1386330677: No such file or directory
>> nlerror, cmd 18, seq 1386330678: No such file or directory
>> NL80211_CMD_NEW_PEER_CANDIDATE(1386330666.49663)
>> new unauthed sta (seq num=1386330679)
>> NL80211_CMD_NEW_STATION (1386330666.50351)
>> Mesh plink timer for 48:5d:60:c0:27:20 fired on state ESTAB
>> Timeout for peer 48:5d:60:c0:27:20 in state 4
>>
>> Compare this to a working exchange, see doeswork.log, in particular
>> the snippet:
>>
>> estab with 48:5d:60:c0:27:20
>> set auth flag (seq num=1386334964)
>> mesh plink with 48:5d:60:c0:27:20 established
>> nlerror, cmd 18, seq 1386334968: Invalid argument
>> Mesh plink timer for 48:5d:60:c0:27:20 fired on state ESTAB
>> Timeout for peer 48:5d:60:c0:27:20 in state 4
>>
>> What I derive from this is that the NEW_PEER_CANDIDATE command, which
>> is invoked every time a beacon is received, is sent -after- the SAE
>> exchange. Because of this, there's no record in the kernel of the
>> peer, and while the SAE exchange succeeds, the keys are not installed
>> (there's no peer to associate them with in the kernel), and the
>> connection does not work.
>
> So the station is only ever inserted in response to a NEW_PEER_CANDIDATE
> event, I guess as long as this happens before key installation we're ok.
> What we do in wpa_supplicant is just defer station insertion until
> receiving the first OPEN frame. Note only the userspace-level station
> struct is created on receiving a NEW_PEER_CANDIDATE event.
>
>> I managed to create a (very ugly) fix for this which resolves the
>> issue by executing NL80211_CMD_NEW_STATION when the peer_created( )
>> function in linux/meshd-nl80211.c is called by create_candidate( ) in
>> sae.c. This may cause the kernel to return EEXIST if the beacon
>> arrives, but at least the connection is established. This code was
>> mostly taken from linux/meshd-nl80211.c new_unauthenticated_peer( ).
>>
>> Would you mind looking at this, to see if I did not do anything which
>> might cause drastic failures? Note that this does need cleaning up,
>> but I'd like to do that once I am certain that this is a viable
>> solution.
>
> It looks ok. Care to send a patch to [email protected] or send
> a github pull-request once you're happy with it?
>
> What kind of patch format is that anyway? :)
>
> Thanks,
> Thomas
_______________________________________________
Devel mailing list
[email protected]
http://lists.open80211s.org/cgi-bin/mailman/listinfo/devel
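
For readers following the thread, the ordering problem and the proposed workaround can be sketched in plain C without touching netlink at all. This is only an illustration under stated assumptions: mock_kernel, mock_new_station( ), mock_install_key( ) and ensure_station( ) are hypothetical names made up for this sketch, not functions from AuthSAE, libnl or the kernel. The mock returns -ENOENT for a key install on an unknown station and -EEXIST for a duplicate station insert, which is the behaviour described in the messages above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_STA 8
#define MAC_LEN 6

/* Hypothetical stand-in for the kernel's station table (not a real API). */
struct mock_kernel {
    unsigned char sta[MAX_STA][MAC_LEN];
    int count;
};

/* Return the table index of a station, or -1 if it is unknown. */
static int mock_find(const struct mock_kernel *k, const unsigned char *mac)
{
    assert(k != NULL && mac != NULL);
    for (int i = 0; i < k->count; i++)
        if (memcmp(k->sta[i], mac, MAC_LEN) == 0)
            return i;
    return -1;
}

/* Models a station insert: a second insert of the same peer gives -EEXIST. */
static int mock_new_station(struct mock_kernel *k, const unsigned char *mac)
{
    if (mock_find(k, mac) >= 0)
        return -EEXIST;
    if (k->count == MAX_STA)
        return -ENOSPC;
    memcpy(k->sta[k->count++], mac, MAC_LEN);
    return 0;
}

/* Models key installation: it fails with -ENOENT if no station exists yet. */
static int mock_install_key(const struct mock_kernel *k, const unsigned char *mac)
{
    return mock_find(k, mac) >= 0 ? 0 : -ENOENT;
}

/*
 * Sketch of the workaround: insert the station as soon as the userspace
 * peer is created, and treat -EEXIST as success so that a later
 * beacon-triggered insert of the same peer is harmless.
 */
static int ensure_station(struct mock_kernel *k, const unsigned char *mac)
{
    int ret = mock_new_station(k, mac);
    return ret == -EEXIST ? 0 : ret;
}
```

With an empty table, mock_install_key( ) fails with -ENOENT, which mirrors the "No such file or directory" errors in doesnotwork.log. Calling ensure_station( ) first makes the key install succeed, and calling it again for the same peer (as a beacon arriving later would) changes nothing. Whether the real patch should swallow -EEXIST or avoid the second insert altogether is a design choice this sketch does not settle.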
