Re: [RFT] net80211 TX serialisation, take #4

2013-02-23 Thread PseudoCylon
 --

 Message: 12
 Date: Fri, 22 Feb 2013 17:18:44 -0800
 From: Adrian Chadd adr...@freebsd.org
 To: freebsd-wireless@freebsd.org
 Subject: Re: [RFT] net80211 TX serialisation, take #4
 Message-ID:
 CAJ-Vmom8HNLNc-=CogiF1v2pJHcn73rB0w9EOHoBtTjAp=j...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 On 22 February 2013 15:25, Adrian Chadd adr...@freebsd.org wrote:

 So, this is all pretty terrible. The only sane solution for now is to
 make my VAP TX lock an IC TX lock,and grab said IC TX lock for all
 VAPs. That way the driver can grab the IC TX lock when it's doing
 deferred sends and it'll be sure the lock is held when it decides to
 grab/increment sequence numbers from ni-ni_txseqs[].

I don't think
  lock();
  ni-ni_txseqs[]++;
  unlock();
can fix the problem because no one can guarantee which process/thread
grabs the lock next.

i.e. concurrently, without lock,
- Thread A is processing packet seq#1
- Thread B is processing packet seq#2
When the time to do ni_txseqs[]++, how can we guarantee thread A grabs
the lock before thread B? In other word, if thread A always grabs the
lock first, we don't need to serialize the Tx, do we?

 * .. and do this without tearing my hair out.

The sequence will be messed up during moving packets from one queue to
another, i.e from driver queue to hardware queue. As long as packets
are in a queue (in a linked list) sequence won't change unless we
explicitly write such code. So...

Saving your hair option 1
tx()
{
for() {
lock();
dequeue(m);/* assuming queue is in order */
ni_txseqs[]++
enqueue_working_queue(m);
unlock();
...
process m
...
lock();
/*
 * m may change here.
 * Whichever the thread does dequeues, m0 will be
 * the head of the queue, so sequence will be kept intact.
 * But, need to test if processing of m0 has been completed.
 */
dequeue_working_queue(m0);
enqueue_driver_queue(m0);  /* or hardware_queue() */
unlock();
}
}
This will keep sequence intact.

Saving your hair option 2
Shave your head until you fix the problem. Hair will grow again, you
know. No bad hair day is a bonus.


AK
___
freebsd-wireless@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-wireless
To unsubscribe, send any mail to freebsd-wireless-unsubscr...@freebsd.org


Re: [RFT] net80211 TX serialisation, take #4

2013-02-23 Thread Adrian Chadd
On 23 February 2013 22:30, PseudoCylon moonlightak...@yahoo.ca wrote:
 So, this is all pretty terrible. The only sane solution for now is to
 make my VAP TX lock an IC TX lock,and grab said IC TX lock for all
 VAPs. That way the driver can grab the IC TX lock when it's doing
 deferred sends and it'll be sure the lock is held when it decides to
 grab/increment sequence numbers from ni-ni_txseqs[].

 I don't think
   lock();
   ni-ni_txseqs[]++;
   unlock();
 can fix the problem because no one can guarantee which process/thread
 grabs the lock next.

Yup, that's definitely a problem.

The problem here is that packets coming into net80211 - either via
vap-input() (bpf injection) or vap-transmit() (802.3) , have to do
this:

* processed in order;
* the same order as they're then pushed into the power save / age queue;
* the same order that these frames get assigned sequence numbers and
enforce state (eg considering things for ampdu, power save, etc);
* the same order that they're queued to the driver;
* the same order that they're dequeued from the driver;
* the same order that they're processed (eg crypto encaps);
* the same order that it's then passed down into the driver - either
via direct dispatch or deferred queue.


 * .. and do this without tearing my hair out.

 The sequence will be messed up during moving packets from one queue to
 another, i.e from driver queue to hardware queue. As long as packets
 are in a queue (in a linked list) sequence won't change unless we
 explicitly write such code. So...

 Saving your hair option 1
 tx()
 {
 for() {
 lock();
 dequeue(m);/* assuming queue is in order */
 ni_txseqs[]++
 enqueue_working_queue(m);
 unlock();
 ...
 process m
 ...
 lock();
 /*
  * m may change here.
  * Whichever the thread does dequeues, m0 will be
  * the head of the queue, so sequence will be kept intact.
  * But, need to test if processing of m0 has been completed.
  */
 dequeue_working_queue(m0);
 enqueue_driver_queue(m0);  /* or hardware_queue() */
 unlock();
 }
 }
 This will keep sequence intact.

Right. This is similar to my idea (or two.)

There's a few other issues though!

ic_raw_xmit() is called by a bunch of places to generate both raw
frames and management frames. This bypasses the vap queue and calls
the driver direct. As an example, injected EAPOL frames have CCMP IV
numbers (as they're encrypted!) but not necessarily sequence numbers.
Because the CCMP IV gets calculated int he driver at some point after
it's been queued, ANY slight race here causes some other frame to get
queued with a CCMP IV -after- the EAPOL frame, and it will get
dropped. Then you get your session dropped. :-)

ic_raw_xmit() is also an ic method, not a vap method. Yes, it bypasses
the vap queue.

There's deferred transmission going on (eg ath_start() getting called
from TX completion, as an example.) Should that be called under the
above lock() in your example?

And the TX sequence number stuff in iwn and ath, because the
aggregation code reuses the ni_txseqs[] array. So yes, the locking
wouldn't clash with the rest of net80211 (as only either the non-agg
TX path in net80211 or the TX path in ath/iwn would be using that
variable) but it's still a pain in the ass.

Grr.

The ridiculously complicated, broken-down-into-bits version:

* add a new mbuf tag to mbufs (optional) which lets us tag an mbuf
with a bpf TX params struct (ie, what gets optionally passed into the
raw xmit call);
* calls to ic_raw_xmit() become a VAP raw xmit;
* the VAP raw xmit pushes mbufs into the same ifnet queue that the VAP
frames have (ie right now the VAP ifnet queue) - ie, serialise the raw
and non-raw VAP frames before it gets processed for 802.11 state;
* .. then move the VAP transmit to your model - which is
  + vap transmit and raw transmit just push frames into the vap ifnet queue;
  + vap transmit then schedules a deferred task or a direct dispatch +
deferred task if we cant grab the VAP TX lock;
  + the vap transmit task/dispatch routine handles the frames one at a
time - either inside a task queue or inside the VAP TX lock - thus all
of the 802.3 - 802.11 encapsulation and state handling is correctly
serialised _and_ the order of sequence numbers and frame handling is
maintained;
  + then those frames have to be queued to the driver in _exactly_ the
same order - so either the VAP TX task/process queues them into a
driver staging queue and kicks the driver side ,or it goes into a
separate task to do driver dispatch;
  + .. and then it's up to the driver to dequeue each frame, handle it
and push them into the hardware in the same order.

Now, that's totally, ridiculously complicated but it does avoid a
bunch of reentrant