#275: Scan for non-ESSID-broadcasting access point always fails
------------------------------------+---------------------------------------
  Reporter:  [EMAIL PROTECTED]      |       Owner:  mrenzmann
      Type:  defect                 |      Status:  assigned
  Priority:  minor                  |   Milestone:  version 1.0.0 - first stable release
 Component:  madwifi: 802.11 stack  |     Version:  trunk
Resolution:                         |    Keywords:
------------------------------------+---------------------------------------
Comment (by [EMAIL PROTECTED]):

 Yes, I agree - the simple, single mdelay() call is inelegant and isn't
 guaranteed to work in all cases.

 Unfortunately I haven't been able to convince myself that there is any
 method which can (even in principle) work "correctly" in all cases.  Every
 method seems to have shortcomings.

 Fundamentally, there seems to be a basic resource-crunch here.  There
 could be multiple threads of code (different wireless tools - e.g. a
 daemon and a GUI panel) which wish to initiate active scans for various
 purposes.  If I understand things correctly (and perhaps I do not), the
 underlying firmware is capable of handling only one active scan at a time.
 This creates a conflict:  if two or more entities try to scan at once,
 then one of them is going to lose, in one way or another.  Any of several
 things can happen - its scan is never started (as is currently the case in
 the main-trunk code), or it's delayed in its ability to start a scan until
 the previous scanner is done, or its scan is started but then canceled
 "behind its back" without warning or notice.  Somebody loses;  I'd guess
 that the goal is to be reasonably fair in how this happens, and ensure
 that no code or thread becomes "stuck" indefinitely.

 The simple one-time mdelay() call makes an attempt to shut down a previous
 scan semi-gracefully before starting its own.  On the bad side, this isn't
 guaranteed to work:  the previous scan might take longer to shut down, and
 another party might come in after the cancellation takes effect and start
 another scan ("jumping the queue" in one way or another).  On the good
 side, this approach wouldn't seem to be capable of causing the calling
 thread to hang indefinitely.

 The method used in ticket #228 is another way of doing it.  On the good
 side, it'll proceed more quickly after the scan cancellation takes effect,
 and it's more positive about making sure that the cancellation did take
 effect.  On the bad side, it looks to me as if it could hang the calling
 thread for quite a while - if another thread "jumps the queue" and starts
 another scan after the cancellation takes effect and before this thread
 wakes up and checks, then the queue-jumper's scan wins out and the
 original canceller has to wait an indefinite amount of time.

 A safer compromise would be to use the method in #228, but with an
 iteration count and a timeout after perhaps 50 - 100 milliseconds.  If the
 interface isn't out of active-scan mode by then, the code could either re-
 cancel and wait again ("shooting the claim-jumper") or just bail out
 gracefully.  Either is probably better than being stuck indefinitely.
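
 To make the compromise concrete, here is a rough user-space sketch of a
 bounded "cancel, then poll" loop.  It is not driver code:  struct
 scan_state, cancel_scan(), sleep_step() and cancel_and_wait() are all
 hypothetical names invented for illustration, and sleep_step() only
 simulates the scanner shutting down after a few polls.  In the driver the
 sleep would be a real delay call and the flag would be the interface's own
 scanning state.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-interface scan state. */
struct scan_state {
    bool scanning;         /* true while an active scan is in progress */
    int  polls_until_idle; /* simulation only: sleeps left before the
                              cancellation takes effect (0 = never) */
};

enum wait_result { SCAN_IDLE = 0, SCAN_TIMED_OUT = -1 };

/* Hypothetical: ask the current scanner to stop.  A real driver would
 * signal its scan machinery here; the simulation has nothing to do. */
static void cancel_scan(struct scan_state *s)
{
    (void)s;
}

/* Simulation of one short sleep: the scan ends once the countdown
 * reaches zero.  A real driver would sleep or delay for ~1 ms here. */
static void sleep_step(struct scan_state *s)
{
    if (s->polls_until_idle > 0 && --s->polls_until_idle == 0)
        s->scanning = false;
}

/* Cancel the running scan, then poll the flag at most max_polls times.
 * Returns SCAN_IDLE as soon as the interface has left scan mode, or
 * SCAN_TIMED_OUT so the caller can re-cancel or bail out gracefully. */
static enum wait_result cancel_and_wait(struct scan_state *s, int max_polls)
{
    cancel_scan(s);
    for (int i = 0; i < max_polls; i++) {
        if (!s->scanning)
            return SCAN_IDLE;
        sleep_step(s);
    }
    /* One final check, in case the last sleep was the one that did it. */
    return s->scanning ? SCAN_TIMED_OUT : SCAN_IDLE;
}
```

 With a 1 ms sleep per poll, max_polls = 100 would correspond to the
 100 ms upper bound mentioned above.  On SCAN_TIMED_OUT the caller decides
 whether to call cancel_and_wait() again or to give up and report failure;
 either way it cannot wait forever.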

 A fancier approach would be to maintain some sort of explicit queue of
 active scans which had been requested, but not yet actually initiated.
 Some piece of code (perhaps a separate kernel thread which managed the
 interface, or perhaps the driver bottom-half) would terminate one scan and
 start the next, as appropriate.  This would be a much more complex and
 invasive change to the driver. It's beyond what I'd want to tackle myself
 at this stage, and frankly I'm not sure if it's really worth the effort.
 I would hope that the higher-level software which is asking for scans
 (e.g. wpa_supplicant, GUIs, etc.) would simply treat the results of a scan
 conflict the way that they'd treat any other scan which didn't find the
 desired APs - they'd idle for a while and then re-scan.
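
 For comparison, the bookkeeping behind the explicit-queue idea might look
 roughly like the fixed-size FIFO below.  This is only a sketch under my
 own assumptions:  struct scan_queue, SCAN_QUEUE_LEN, scan_queue_push() and
 scan_queue_pop() are hypothetical names, requesters are reduced to integer
 IDs, and the locking a real driver would need around the queue is left
 out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

#define SCAN_QUEUE_LEN 4 /* hypothetical capacity */

/* Ring buffer of scan requests that have been accepted but not started. */
struct scan_queue {
    int    requester[SCAN_QUEUE_LEN]; /* hypothetical requester IDs */
    size_t head;                      /* index of the oldest request */
    size_t count;                     /* number of pending requests */
};

/* Queue a scan request.  Returns false when the queue is full, so the
 * requester can idle for a while and retry, as it would after any scan
 * that found nothing. */
static bool scan_queue_push(struct scan_queue *q, int requester_id)
{
    if (q->count == SCAN_QUEUE_LEN)
        return false;
    q->requester[(q->head + q->count) % SCAN_QUEUE_LEN] = requester_id;
    q->count++;
    return true;
}

/* Called by whichever single piece of code manages the interface once
 * the previous scan has ended.  Returns false if nothing is pending;
 * otherwise stores the oldest requester in *requester_id. */
static bool scan_queue_pop(struct scan_queue *q, int *requester_id)
{
    if (q->count == 0)
        return false;
    *requester_id = q->requester[q->head];
    q->head = (q->head + 1) % SCAN_QUEUE_LEN;
    q->count--;
    return true;
}
```

 Serving requests strictly in arrival order removes the "queue-jumping"
 problem, but only because a single manager is the sole caller of
 scan_queue_pop().  That manager is exactly the invasive part of the
 change.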

 So... I'm quite willing to redo my patch, replacing the simple mdelay()
 call with an iteration-limited "check flag, sleep if it's still scanning"
 loop and a graceful bailout after a reasonable time (100 ms?).  Would that
 be satisfactory to all concerned?  If so, perhaps it would be wise to have
 the #228 patch use the same technique?

 Is there a call other than mdelay() which would be preferable?

-- 
Ticket URL: <http://madwifi.org/ticket/275>
MadWifi <http://madwifi.org/>
Multiband Atheros Driver for Wireless Fidelity
