[PATCH 0/4]: Revised support for passive-close

2007-11-28 Thread Gerrit Renker
This is the revised and tested patch set for supporting passive-close. Testing
was impaired by a nasty bug (fixed in patch #1) which led to the final Reset not
being sent. Once that had been fixed, things worked out very smoothly.

Other than that, I think the patch set has benefited much from the revision.
All possible combinations of server/client close have been tested over an
emulated IPv6 satellite link, and with different orderings (client closes
earlier than server and vice versa).

There is one new point, simultaneous-close, that the patch set now also deals
with. It is possible (and was observed several times during testing) that client
and server both perform an active-close, in which case both will retransmit
their Close and wait for a Reset ad infinitum. The patch deals with this
situation by letting the client act as tie-breaker, which was shown to work
very satisfactorily:
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/closing_states/#simultaneous_close
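
To illustrate the tie-breaker idea, here is a minimal user-space sketch. It is
not the kernel code: it assumes a much simplified model in which TIMEWAIT and
CLOSED are collapsed into a single DEMO_DONE state, and all demo_* names are
hypothetical.

```c
#include <stdbool.h>

enum demo_role  { DEMO_CLIENT, DEMO_SERVER };
enum demo_state { DEMO_CLOSING, DEMO_DONE };

struct demo_node {
	enum demo_role  role;
	enum demo_state state;
};

/*
 * A node receives the peer's Close. Returns true if it answers with a Reset.
 * Only a client in CLOSING answers; a server keeps waiting for the Reset.
 */
static bool demo_rcv_close(struct demo_node *n)
{
	if (n->state != DEMO_CLOSING || n->role != DEMO_CLIENT)
		return false;
	n->state = DEMO_DONE;	/* the client breaks the tie */
	return true;
}

/* A node in CLOSING receives the Reset it was waiting for. */
static void demo_rcv_reset(struct demo_node *n)
{
	if (n->state == DEMO_CLOSING)
		n->state = DEMO_DONE;
}

/*
 * Crossing Closes: each side processes the other's Close once.
 * Returns the number of Resets that were generated.
 */
static int demo_simultaneous_close(struct demo_node *client,
				   struct demo_node *server)
{
	int resets = 0;

	if (demo_rcv_close(server)) {
		resets++;
		demo_rcv_reset(client);
	}
	if (demo_rcv_close(client)) {
		resets++;
		demo_rcv_reset(server);
	}
	return resets;
}
```

In this model exactly one side reacts to a crossing Close, so the exchange
ends after a single Reset instead of both sides retransmitting.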

Patch #1: Fixes a bug in the dccp_send_reset() routine.

Patch #2: Is an updated/revised version of the patch introducing the
  closing-states.
  Changes:   i) combined with other patch (as suggested by Arnaldo);
            ii) better/revised documentation;
           iii) uses the new naming scheme (also thanks to Arnaldo).

Patch #3: Revised and updated patch to support passive-close in terms of state
  transitions. The changes to the previous variant are:
  * explicit control over transitions into passive-intermediate states;
  * the leaked-skb issue has been fixed by tracking whether the skb is queued;
  * support for simultaneous active-close by client and server;
  * a detailed account of what happens in each state.

Patch #4: Removes a redundant test (consequence of patch #3).

I have updated the test tree with regard to these patches. Before the end of the
week I will take Arnaldo's changes for 2.6.25, backport them to the mainline
basis that the test tree is based on, and update the test tree with regard to
the latest changes (also updating the subtrees).
-
To unsubscribe from this list: send the line unsubscribe dccp in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4]: Use AF-independent rebuild_header routine

2007-11-28 Thread Gerrit Renker
[DCCP]: Use AF-independent rebuild_header routine

This fixes a nasty bug: dccp_send_reset() is called by both DCCPv4 and DCCPv6,
but uses inet_sk_rebuild_header() in each case. This leads to unpredictable and
weird behaviour: under some conditions DCCPv6 Resets were sent, in others not.

The fix is to use the AF-independent rebuild_header routine.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/output.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -391,7 +391,7 @@ int dccp_send_reset(struct sock *sk, enu
 	 * FIXME: what if rebuild_header fails?
 	 * Should we be doing a rebuild_header here?
 	 */
-	int err = inet_sk_rebuild_header(sk);
+	int err = inet_csk(sk)->icsk_af_ops->rebuild_header(sk);
 
 	if (err != 0)
 		return err;
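
For illustration, the following user-space sketch shows why dispatching through
a per-socket operations table avoids this kind of bug. All demo_* types and
functions are hypothetical stand-ins for struct sock, icsk_af_ops and the
IPv4/IPv6 rebuild_header routines; only the call pattern mirrors the one-line
change in the diff.

```c
#include <stddef.h>

struct demo_sock;

/* Per-address-family operations, chosen once when the socket is created. */
struct demo_af_ops {
	int (*rebuild_header)(struct demo_sock *sk);
};

struct demo_sock {
	const struct demo_af_ops *af_ops;
	int last_family;	/* records which routine ran: 4 or 6 */
};

static int demo_v4_rebuild_header(struct demo_sock *sk)
{
	sk->last_family = 4;
	return 0;
}

static int demo_v6_rebuild_header(struct demo_sock *sk)
{
	sk->last_family = 6;
	return 0;
}

static const struct demo_af_ops demo_ipv4_ops = {
	.rebuild_header = demo_v4_rebuild_header,
};

static const struct demo_af_ops demo_ipv6_ops = {
	.rebuild_header = demo_v6_rebuild_header,
};

/* Shared code path: never names an address family directly. */
static int demo_send_reset(struct demo_sock *sk)
{
	int err = sk->af_ops->rebuild_header(sk);

	if (err != 0)
		return err;
	/* ... a real implementation would build and send the Reset here ... */
	return 0;
}
```

Because the shared routine only goes through the table, an IPv6 socket can
never end up in the IPv4 routine by accident.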


[PATCH 4/4]: Remove duplicate test for CloseReq

2007-11-28 Thread Gerrit Renker
[DCCP]: Remove duplicate test for CloseReq

This removes a redundant test for unexpected packet types. In
dccp_rcv_state_process() it is tested twice whether a DCCP-server has received
a CloseReq (Step 7):

 * first in the combined if-statement,
 * then in the call to dccp_rcv_closereq().

The latter is necessary since dccp_rcv_closereq() is also called from
__dccp_rcv_established().

This patch removes the duplicate test.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/input.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -616,16 +616,14 @@ int dccp_rcv_state_process(struct sock *
 		return 0;
 		/*
 		 *   Step 7: Check for unexpected packet types
-		 *      If (S.is_server and P.type == CloseReq)
-		 *	    or (S.is_server and P.type == Response)
+		 *      If (S.is_server and P.type == Response)
 		 *	    or (S.is_client and P.type == Request)
 		 *	    or (S.state == RESPOND and P.type == Data),
 		 *	  Send Sync packet acknowledging P.seqno
 		 *	  Drop packet and return
 		 */
 	} else if ((dp->dccps_role != DCCP_ROLE_CLIENT &&
-		    (dh->dccph_type == DCCP_PKT_RESPONSE ||
-		     dh->dccph_type == DCCP_PKT_CLOSEREQ)) ||
+		    dh->dccph_type == DCCP_PKT_RESPONSE) ||
 		    (dp->dccps_role == DCCP_ROLE_CLIENT &&
 		     dh->dccph_type == DCCP_PKT_REQUEST) ||
 		    (sk->sk_state == DCCP_RESPOND &&
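
As a rough sketch of the result, the Step 7 check after this patch can be
modelled by the predicate below, with the server-side CloseReq test left to the
CloseReq handler alone. The demo_* enums and function names are hypothetical
and heavily simplified; they are not the kernel's types.

```c
#include <stdbool.h>

enum demo_role  { DEMO_ROLE_CLIENT, DEMO_ROLE_SERVER };
enum demo_state { DEMO_STATE_OPEN, DEMO_STATE_RESPOND };
enum demo_pkt   {
	DEMO_PKT_REQUEST, DEMO_PKT_RESPONSE, DEMO_PKT_DATA, DEMO_PKT_CLOSEREQ
};

/* Step 7 as it reads after the patch: CloseReq is no longer tested here. */
static bool demo_step7_unexpected(enum demo_role role, enum demo_state state,
				  enum demo_pkt type)
{
	return (role != DEMO_ROLE_CLIENT && type == DEMO_PKT_RESPONSE) ||
	       (role == DEMO_ROLE_CLIENT && type == DEMO_PKT_REQUEST)  ||
	       (state == DEMO_STATE_RESPOND && type == DEMO_PKT_DATA);
}

/*
 * The remaining CloseReq test sits in the handler itself, because the
 * handler is reached from more than one call site.
 */
static bool demo_closereq_unexpected(enum demo_role role)
{
	return role != DEMO_ROLE_CLIENT;
}
```

A server receiving a CloseReq now passes the combined test but is still
rejected by the handler, so the behaviour is unchanged.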


[PATCH 2/4]: Dedicated auxiliary states to support passive-close

2007-11-28 Thread Gerrit Renker
[DCCP]: Dedicated auxiliary states to support passive-close

This adds two auxiliary states to deal with passive closes:
  * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
  * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
as internal intermediate states.

These states are used to allow a receiver to process unread data before 
acknowledging the received connection-termination-request (the Close/CloseReq).

Without such support, it will happen that passively-closed sockets enter CLOSED
state while there is still unprocessed data in the queue; leading to unexpected
and erratic API behaviour.

PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will
seamlessly work with inet_accept() (which tests for this state).
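
The effect of this mapping can be sketched as follows. The numeric values
follow the usual TCP state numbering, and the set of states accepted by the
check is an assumption made for illustration; the exact test in inet_accept()
may differ. All demo_* names are hypothetical.

```c
#include <stdbool.h>

/* Assumed TCP-style state numbers; DCCP states reuse them where possible. */
enum demo_state {
	DEMO_ESTABLISHED = 1,
	DEMO_CLOSE       = 7,
	DEMO_CLOSE_WAIT  = 8,
	DEMO_CLOSING     = 11,

	DEMO_DCCP_OPEN          = DEMO_ESTABLISHED,
	DEMO_DCCP_PASSIVE_CLOSE = DEMO_CLOSE_WAIT,
	DEMO_DCCP_CLOSING       = DEMO_CLOSING
};

/* One flag bit per state, in the style of the TCPF_ and DCCPF_ constants. */
#define DEMO_F(state)	(1u << (state))

/* An accept-style sanity check expressed as a bitmask test. */
static bool demo_accept_state_ok(int state)
{
	const unsigned int ok = DEMO_F(DEMO_ESTABLISHED) |
				DEMO_F(DEMO_CLOSE_WAIT)  |
				DEMO_F(DEMO_CLOSE);

	return (DEMO_F(state) & ok) != 0;
}
```

Since PASSIVE_CLOSE shares its number with CLOSE_WAIT, a socket that was
passively closed before accept() still passes such a check without any
DCCP-specific code.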

The state names are thanks to Arnaldo, who suggested this naming scheme
following an earlier revision of this patch.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 include/linux/dccp.h |   56 +++
 net/dccp/proto.c |   22 ++--
 2 files changed, 51 insertions(+), 27 deletions(-)

--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -227,29 +227,51 @@ struct dccp_so_feat {
 #include <net/tcp_states.h>
 
 enum dccp_state {
-   DCCP_OPEN   = TCP_ESTABLISHED,
-   DCCP_REQUESTING = TCP_SYN_SENT,
-   DCCP_LISTEN = TCP_LISTEN,
-   DCCP_RESPOND= TCP_SYN_RECV,
-   DCCP_CLOSING= TCP_CLOSING,
-   DCCP_TIME_WAIT  = TCP_TIME_WAIT,
-   DCCP_CLOSED = TCP_CLOSE,
-   DCCP_PARTOPEN   = TCP_MAX_STATES,
+   DCCP_OPEN            = TCP_ESTABLISHED,
+   DCCP_REQUESTING      = TCP_SYN_SENT,
+   DCCP_LISTEN          = TCP_LISTEN,
+   DCCP_RESPOND         = TCP_SYN_RECV,
+   /*
+    * States involved in closing a DCCP connection:
+    * 1) ACTIVE_CLOSEREQ is entered by a server sending a CloseReq.
+    *
+    * 2) CLOSING can have three different meanings (RFC 4340, 8.3):
+    *  a. Client has performed active-close, has sent a Close to the server
+    *     from state OPEN or PARTOPEN, and is waiting for the final Reset
+    *     (in this case, SOCK_DONE == 1).
+    *  b. Client is asked to perform passive-close, by receiving a CloseReq
+    *     in (PART)OPEN state. It sends a Close and waits for final Reset
+    *     (in this case, SOCK_DONE == 0).
+    *  c. Server performs an active-close as in (a), keeps TIMEWAIT state.
+    *
+    * 3) The following intermediate states are employed to give passively
+    *    closing nodes a chance to process their unread data:
+    *    - PASSIVE_CLOSE    (from OPEN => CLOSED) and
+    *    - PASSIVE_CLOSEREQ (from (PART)OPEN to CLOSING; case (b) above).
+    */
+   DCCP_ACTIVE_CLOSEREQ = TCP_FIN_WAIT1,
+   DCCP_PASSIVE_CLOSE   = TCP_CLOSE_WAIT,  /* any node receiving a Close */
+   DCCP_CLOSING         = TCP_CLOSING,
+   DCCP_TIME_WAIT       = TCP_TIME_WAIT,
+   DCCP_CLOSED          = TCP_CLOSE,
+   DCCP_PARTOPEN        = TCP_MAX_STATES,
+   DCCP_PASSIVE_CLOSEREQ,                  /* clients receiving CloseReq */
    DCCP_MAX_STATES
 };
 
-#define DCCP_STATE_MASK 0xf
+#define DCCP_STATE_MASK 0x1f
 #define DCCP_ACTION_FIN (1<<7)
 
 enum {
-   DCCPF_OPEN   = TCPF_ESTABLISHED,
-   DCCPF_REQUESTING = TCPF_SYN_SENT,
-   DCCPF_LISTEN = TCPF_LISTEN,
-   DCCPF_RESPOND= TCPF_SYN_RECV,
-   DCCPF_CLOSING= TCPF_CLOSING,
-   DCCPF_TIME_WAIT  = TCPF_TIME_WAIT,
-   DCCPF_CLOSED = TCPF_CLOSE,
-   DCCPF_PARTOPEN   = (1 << DCCP_PARTOPEN),
+   DCCPF_OPEN            = TCPF_ESTABLISHED,
+   DCCPF_REQUESTING      = TCPF_SYN_SENT,
+   DCCPF_LISTEN          = TCPF_LISTEN,
+   DCCPF_RESPOND         = TCPF_SYN_RECV,
+   DCCPF_ACTIVE_CLOSEREQ = TCPF_FIN_WAIT1,
+   DCCPF_CLOSING         = TCPF_CLOSING,
+   DCCPF_TIME_WAIT       = TCPF_TIME_WAIT,
+   DCCPF_CLOSED          = TCPF_CLOSE,
+   DCCPF_PARTOPEN        = (1 << DCCP_PARTOPEN),
 };
 
 static inline struct dccp_hdr *dccp_hdr(const struct sk_buff *skb)
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -60,8 +60,7 @@ void dccp_set_state(struct sock *sk, con
 {
    const int oldstate = sk->sk_state;
 
-   dccp_pr_debug("%s(%p) %-10.10s -> %s\n",
-                 dccp_role(sk), sk,
+   dccp_pr_debug("%s(%p)  %s  -->  %s\n", dccp_role(sk), sk,
                  dccp_state_name(oldstate), dccp_state_name(state));
    WARN_ON(state == oldstate);
 
@@ -134,14 +133,17 @@ EXPORT_SYMBOL_GPL(dccp_packet_name);
 const char *dccp_state_name(const int state)
 {
    static char *dccp_state_names[] = {
-   [DCCP_OPEN]       = "OPEN",
-   [DCCP_REQUESTING] = "REQUESTING",
-   [DCCP_PARTOPEN]   = "PARTOPEN",
-   [DCCP_LISTEN]     = "LISTEN",
-   [DCCP_RESPOND]    = "RESPOND",
-   [DCCP_CLOSING]    = "CLOSING",
-   [DCCP_TIME_WAIT]  = "TIME_WAIT",
- 

[PATCH 3/4]: Integrate state transitions for passive-close

2007-11-28 Thread Gerrit Renker
[DCCP]: Integrate state transitions for passive-close

This adds the necessary state transitions for the two forms of passive-close

 * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
 * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.

Here is a detailed account of what the patch does in each state.
  
1) Receiving CloseReq
---------------------
  The pseudo-code in RFC 4340, 8.5 says:

     Step 13: Process CloseReq
          If P.type == CloseReq and S.state < CLOSEREQ,
              Generate Close
              S.state := CLOSING
              Set CLOSING timer.

  This means we need to address what to do in CLOSED, LISTEN, REQUEST, RESPOND,
  PARTOPEN, and OPEN.

   * CLOSED:         silently ignore - it may be a late or duplicate CloseReq;
   * LISTEN/RESPOND: will not appear, since Step 7 is performed first (we know
                     we are the client);
   * REQUEST:        perform Step 13 directly (no need to enqueue packet);
   * OPEN/PARTOPEN:  enter PASSIVE_CLOSEREQ so that the application has a
                     chance to process unread data.

  When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any other
  state, the CloseReq is ignored. I think that this offers some robustness
  against rare and pathological cases: e.g. a simultaneous close where the
  client sends a Close and the server a CloseReq. The client will then be
  retransmitting its Close until it gets the Reset, so ignoring the CloseReq
  while in state CLOSING is sane.
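
The per-state decisions listed above can be summarised in a small table-like
sketch. This is a user-space model under simplifying assumptions, not the
patch's code: the demo_* names are hypothetical, and the function only returns
which action is chosen instead of performing it.

```c
enum demo_state {
	DEMO_CLOSED, DEMO_LISTEN, DEMO_REQUESTING, DEMO_RESPOND, DEMO_PARTOPEN,
	DEMO_OPEN, DEMO_CLOSING, DEMO_PASSIVE_CLOSEREQ
};

enum demo_action {
	DEMO_IGNORE,			/* drop the CloseReq silently */
	DEMO_STEP13_NOW,		/* send Close, enter CLOSING at once */
	DEMO_ENTER_PASSIVE_CLOSEREQ	/* queue it; application reads first */
};

/* What a client does with a received CloseReq, depending on its state. */
static enum demo_action demo_on_closereq(enum demo_state state)
{
	switch (state) {
	case DEMO_REQUESTING:
		return DEMO_STEP13_NOW;
	case DEMO_PARTOPEN:
	case DEMO_OPEN:
		return DEMO_ENTER_PASSIVE_CLOSEREQ;
	default:
		/* CLOSED, CLOSING, PASSIVE_CLOSEREQ and anything else */
		return DEMO_IGNORE;
	}
}
```

LISTEN and RESPOND also fall into the default branch here, although on a
server they would already have been filtered out by Step 7.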
  
2) Receiving Close
------------------
  The code below from RFC 4340, 8.5 is unconditional.

     Step 14: Process Close
          If P.type == Close,
              Generate Reset(Closed)
              Tear down connection
              Drop packet and return

  Thus we need to consider all states:
   * CLOSED:           silently ignore, since this can happen when a
                       retransmitted or late Close arrives;
   * LISTEN:           dccp_rcv_state_process() will generate a Reset
                       (No Connection);
   * REQUEST:          perform Step 14 directly (no need to enqueue packet);
   * RESPOND:          dccp_check_req() will generate a Reset (Packet Error)
                       -- left it at that;
   * OPEN/PARTOPEN:    enter PASSIVE_CLOSE so that application has a chance to
                       process unread data;
   * CLOSEREQ:         server performed active-close -- perform Step 14;
   * CLOSING:          simultaneous-close: use a tie-breaker to avoid message
                       ping-pong (see comment);
   * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq
                       and now a Close);
   * TIMEWAIT:         packet is ignored.

   Note that the condition of receiving a packet in state CLOSED here is
   different from the condition "there is no socket for such a connection":
   the socket still exists, but its state indicates it is unusable.

   Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING =
   TCP_CLOSING, so that sk_stream_wait_close() will wait for the final Reset
   (which will trigger CLOSING => CLOSED).
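
As a compact summary of the list above, the handling of a received Close can be
modelled as below. This is a simplified user-space sketch with hypothetical
demo_* names; LISTEN and RESPOND are left to the default branch because,
according to the list, their Resets are generated elsewhere before this
decision is reached.

```c
#include <stdbool.h>

enum demo_state {
	DEMO_CLOSED, DEMO_LISTEN, DEMO_REQUESTING, DEMO_RESPOND, DEMO_PARTOPEN,
	DEMO_OPEN, DEMO_ACTIVE_CLOSEREQ, DEMO_CLOSING, DEMO_PASSIVE_CLOSE,
	DEMO_PASSIVE_CLOSEREQ, DEMO_TIMEWAIT
};

enum demo_action {
	DEMO_IGNORE,			/* drop the Close without reacting */
	DEMO_RESET_AND_TEARDOWN,	/* Step 14: Reset(Closed), tear down */
	DEMO_ENTER_PASSIVE_CLOSE	/* queue it; application reads first */
};

/* What a node does with a received Close, depending on state and role. */
static enum demo_action demo_on_close(enum demo_state state, bool is_client)
{
	switch (state) {
	case DEMO_CLOSING:
		/* simultaneous-close: only the client breaks the tie */
		if (!is_client)
			return DEMO_IGNORE;
		/* fall through */
	case DEMO_REQUESTING:
	case DEMO_ACTIVE_CLOSEREQ:
		return DEMO_RESET_AND_TEARDOWN;
	case DEMO_PARTOPEN:
	case DEMO_OPEN:
		return DEMO_ENTER_PASSIVE_CLOSE;
	default:
		/* CLOSED, PASSIVE_CLOSEREQ, TIMEWAIT and the rest */
		return DEMO_IGNORE;
	}
}
```

Routing the CLOSING case for clients into the same branch as Step 14 is what
lets one Reset end a simultaneous close.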

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 include/linux/dccp.h |1 
 net/dccp/input.c |   88 ++-
 net/dccp/proto.c |   88 +--
 3 files changed, 131 insertions(+), 46 deletions(-)

--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -32,16 +32,56 @@ static void dccp_fin(struct sock *sk, st
 	sk->sk_data_ready(sk, 0);
 }
 
-static void dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
+static int dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
 {
-	dccp_send_reset(sk, DCCP_RESET_CODE_CLOSED);
-	dccp_fin(sk, skb);
-	dccp_set_state(sk, DCCP_CLOSED);
-	sk_wake_async(sk, 1, POLL_HUP);
+	int queued = 0;
+
+	switch (sk->sk_state) {
+	/*
+	 * We ignore Close when received in one of the following states:
+	 *  - CLOSED            (may be a late or duplicate packet)
+	 *  - PASSIVE_CLOSEREQ  (the peer has sent a CloseReq earlier)
+	 *  - RESPOND           (already handled by dccp_check_req)
+	 */
+	case DCCP_CLOSING:
+		/*
+		 * Simultaneous-close: receiving a Close after sending one. This
+		 * can happen if both client and server perform active-close and
+		 * will result in an endless ping-pong of crossing and retrans-
+		 * mitted Close packets, which only terminates when one of the
+		 * nodes times out (min. 64 seconds). Quicker convergence can be
+		 * achieved when one of the nodes acts as tie-breaker.
+		 * This is ok as both ends are done with data transfer and each
+		 * end is just waiting for the other to acknowledge termination.
+		 */
+		if (dccp_sk(sk)->dccps_role != DCCP_ROLE_CLIENT)
+			break;
+		/* fall through */
+	case DCCP_REQUESTING:
+	case DCCP_ACTIVE_CLOSEREQ:
+   

Re: [PATCH 1/4]: Use AF-independent rebuild_header routine

2007-11-28 Thread Arnaldo Carvalho de Melo
On Wed, Nov 28, 2007 at 08:35:08AM +, Gerrit Renker wrote:
> [DCCP]: Use AF-independent rebuild_header routine
>
> This fixes a nasty bug: dccp_send_reset() is called by both DCCPv4 and
> DCCPv6, but uses inet_sk_rebuild_header() in each case. This leads to
> unpredictable and weird behaviour: under some conditions DCCPv6 Resets
> were sent, in others not.
>
> The fix is to use the AF-independent rebuild_header routine.
>
> Signed-off-by: Gerrit Renker [EMAIL PROTECTED]

Thanks, applied.

- Arnaldo


Re: [PATCH 4/4]: Remove duplicate test for CloseReq

2007-11-28 Thread Arnaldo Carvalho de Melo
On Wed, Nov 28, 2007 at 08:35:11AM +, Gerrit Renker wrote:
> [DCCP]: Remove duplicate test for CloseReq
>
> This removes a redundant test for unexpected packet types. In
> dccp_rcv_state_process() it is tested twice whether a DCCP-server has
> received a CloseReq (Step 7):
>
>  * first in the combined if-statement,
>  * then in the call to dccp_rcv_closereq().
>
> The latter is necessary since dccp_rcv_closereq() is also called from
> __dccp_rcv_established().
>
> This patch removes the duplicate test.
>
> Signed-off-by: Gerrit Renker [EMAIL PROTECTED]

Thanks, applied.

- Arnaldo


Re: [PATCH 3/4]: Integrate state transitions for passive-close

2007-11-28 Thread Arnaldo Carvalho de Melo
On Wed, Nov 28, 2007 at 08:35:10AM +, Gerrit Renker wrote:
> [DCCP]: Integrate state transitions for passive-close
>
> This adds the necessary state transitions for the two forms of passive-close
>
>  * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
>  * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.
>
> [...]
>
> Signed-off-by: Gerrit Renker [EMAIL PROTECTED]

Applied


Re: [PATCH 2/4]: Dedicated auxiliary states to support passive-close

2007-11-28 Thread Arnaldo Carvalho de Melo
On Wed, Nov 28, 2007 at 08:35:09AM +, Gerrit Renker wrote:
> [DCCP]: Dedicated auxiliary states to support passive-close
>
> This adds two auxiliary states to deal with passive closes:
>   * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
>   * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
> as internal intermediate states.
>
> These states are used to allow a receiver to process unread data before
> acknowledging the received connection-termination-request (the
> Close/CloseReq).
>
> Without such support, it will happen that passively-closed sockets enter
> CLOSED state while there is still unprocessed data in the queue; leading to
> unexpected and erratic API behaviour.
>
> PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will
> seamlessly work with inet_accept() (which tests for this state).
>
> The state names are thanks to Arnaldo, who suggested this naming scheme
> following an earlier revision of this patch.
>
> Signed-off-by: Gerrit Renker [EMAIL PROTECTED]

Thanks, applied.

- Arnaldo
-
To unsubscribe from this list: send the line unsubscribe dccp in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] [TFRC]: Migrate TX history to singly-linked list

2007-11-28 Thread Arnaldo Carvalho de Melo
This patch is based on another one by Gerrit Renker; his changelog was:

--
The patch set migrates TFRC TX history to a singly-linked list.

The details are:
 * use of a consistent naming scheme (all TFRC functions now begin with
   `tfrc_');
 * allocation and cleanup are taken care of internally;
 * provision of a lookup function, which is used by the CCID TX infrastructure
   to determine the time a packet was sent (in turn used for RTT sampling);
 * integration of the new interface with the present use in CCID3.
--

Simplifications I made:

. Removed the tfrc_tx_hist_head, which held a pointer to the list head and
  another to the slabcache.
. There is no need to create a slabcache for each CCID that wants to use the
  TFRC tx history routines; a single slabcache is created when the
  dccp_tfrc_lib module init routine is called.
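
The singly-linked history described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: it uses
malloc()/free() instead of a slab cache, takes the send time as a parameter,
and all demo_* names are hypothetical rather than the tfrc_* functions of the
patch.

```c
#include <stdint.h>
#include <stdlib.h>

struct demo_tx_entry {
	struct demo_tx_entry *next;
	uint64_t seqno;		/* sequence number of the sent packet */
	uint64_t stamp_us;	/* send time in microseconds */
};

/* Push a new entry at the head; returns 0 on success, -1 if out of memory. */
static int demo_tx_hist_add(struct demo_tx_entry **headp, uint64_t seqno,
			    uint64_t now_us)
{
	struct demo_tx_entry *entry = malloc(sizeof(*entry));

	if (entry == NULL)
		return -1;
	entry->seqno    = seqno;
	entry->stamp_us = now_us;
	entry->next     = *headp;
	*headp = entry;
	return 0;
}

/* Walk from newest to oldest; returns NULL if seqno is not in the history. */
static struct demo_tx_entry *demo_tx_hist_find(struct demo_tx_entry *head,
					       uint64_t seqno)
{
	while (head != NULL && head->seqno != seqno)
		head = head->next;
	return head;
}

/* Free every entry starting at *headp and leave an empty list behind. */
static void demo_tx_hist_purge(struct demo_tx_entry **headp)
{
	struct demo_tx_entry *head = *headp;

	while (head != NULL) {
		struct demo_tx_entry *next = head->next;

		free(head);
		head = next;
	}
	*headp = NULL;
}
```

Because new entries are pushed at the head, everything after a matched entry
is older, so purging from the matched entry's next pointer drops exactly the
entries that are no longer needed for RTT sampling.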

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c  |   57 --
 net/dccp/ccids/ccid3.h  |3 +-
 net/dccp/ccids/lib/loss_interval.c  |   12 ++--
 net/dccp/ccids/lib/packet_history.c |  138 ---
 net/dccp/ccids/lib/packet_history.h |   79 
 5 files changed, 102 insertions(+), 187 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 964ec91..2668de8 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -49,7 +49,6 @@ static int ccid3_debug;
 #define ccid3_pr_debug(format, a...)
 #endif
 
-static struct dccp_tx_hist *ccid3_tx_hist;
 static struct dccp_rx_hist *ccid3_rx_hist;
 
 /*
@@ -389,28 +388,18 @@ static void ccid3_hc_tx_packet_sent(struct sock *sk, int more,
 				    unsigned int len)
 {
 	struct ccid3_hc_tx_sock *hctx = ccid3_hc_tx_sk(sk);
-	struct dccp_tx_hist_entry *packet;
 
 	ccid3_hc_tx_update_s(hctx, len);
 
-	packet = dccp_tx_hist_entry_new(ccid3_tx_hist, GFP_ATOMIC);
-	if (unlikely(packet == NULL)) {
+	if (tfrc_tx_hist_add(&hctx->ccid3hctx_hist, dccp_sk(sk)->dccps_gss))
 		DCCP_CRIT("packet history - out of memory!");
-		return;
-	}
-	dccp_tx_hist_add_entry(&hctx->ccid3hctx_hist, packet);
-
-	packet->dccphtx_tstamp = ktime_get_real();
-	packet->dccphtx_seqno  = dccp_sk(sk)->dccps_gss;
-	packet->dccphtx_rtt    = hctx->ccid3hctx_rtt;
-	packet->dccphtx_sent   = 1;
 }
 
 static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 {
 	struct ccid3_hc_tx_sock *hctx = ccid3_hc_tx_sk(sk);
 	struct ccid3_options_received *opt_recv;
-	struct dccp_tx_hist_entry *packet;
+	struct tfrc_tx_hist_entry *packet;
 	ktime_t now;
 	unsigned long t_nfb;
 	u32 pinv, r_sample;
@@ -425,16 +414,19 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 	switch (hctx->ccid3hctx_state) {
 	case TFRC_SSTATE_NO_FBACK:
 	case TFRC_SSTATE_FBACK:
-		/* get packet from history to look up t_recvdata */
-		packet = dccp_tx_hist_find_entry(&hctx->ccid3hctx_hist,
-					      DCCP_SKB_CB(skb)->dccpd_ack_seq);
-		if (unlikely(packet == NULL)) {
-			DCCP_WARN("%s(%p), seqno %llu(%s) doesn't exist "
-				  "in history!\n",  dccp_role(sk), sk,
-			    (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq,
-				dccp_packet_name(DCCP_SKB_CB(skb)->dccpd_type));
+		/* estimate RTT from history if ACK number is valid */
+		packet = tfrc_tx_hist_find_entry(hctx->ccid3hctx_hist,
+						 DCCP_SKB_CB(skb)->dccpd_ack_seq);
+		if (packet == NULL) {
+			DCCP_WARN("%s(%p): %s with bogus ACK-%llu\n", dccp_role(sk), sk,
+				  dccp_packet_name(DCCP_SKB_CB(skb)->dccpd_type),
+				  (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq);
 			return;
 		}
+		/*
+		 * Garbage-collect older (irrelevant) entries
+		 */
+		tfrc_tx_hist_purge(&packet->next);
 
 		/* Update receive rate in units of 64 * bytes/second */
 		hctx->ccid3hctx_x_recv = opt_recv->ccid3or_receive_rate;
@@ -451,7 +443,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 		/*
 		 * Calculate new RTT sample and update moving average
 		 */
-		r_sample = dccp_sample_rtt(sk, ktime_us_delta(now, packet->dccphtx_tstamp));
+		r_sample = dccp_sample_rtt(sk, ktime_us_delta(now, packet->stamp));
 		hctx->ccid3hctx_rtt = tfrc_ewma(hctx->ccid3hctx_rtt, r_sample, 9);
 
if 

[PATCH 4/5] [DCCP]: Integrate state transitions for passive-close

2007-11-28 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

This adds the necessary state transitions for the two forms of passive-close

 * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
 * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.

Here is a detailed account of what the patch does in each state.

1) Receiving CloseReq

  The pseudo-code in RFC 4340, 8.5 says:

     Step 13: Process CloseReq
          If P.type == CloseReq and S.state < CLOSEREQ,
              Generate Close
              S.state := CLOSING
              Set CLOSING timer.

  This means we need to address what to do in CLOSED, LISTEN, REQUEST, RESPOND,
  PARTOPEN, and OPEN.

   * CLOSED:         silently ignore - it may be a late or duplicate CloseReq;
   * LISTEN/RESPOND: will not appear, since Step 7 is performed first (we know
                     we are the client);
   * REQUEST:        perform Step 13 directly (no need to enqueue packet);
   * OPEN/PARTOPEN:  enter PASSIVE_CLOSEREQ so that the application has a
                     chance to process unread data.

  When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any other
  state, the CloseReq is ignored. I think that this offers some robustness
  against rare and pathological cases: e.g. a simultaneous close where the
  client sends a Close and the server a CloseReq. The client will then be
  retransmitting its Close until it gets the Reset, so ignoring the CloseReq
  while in state CLOSING is sane.

2) Receiving Close

  The code below from RFC 4340, 8.5 is unconditional.

     Step 14: Process Close
          If P.type == Close,
              Generate Reset(Closed)
              Tear down connection
              Drop packet and return

  Thus we need to consider all states:
   * CLOSED:           silently ignore, since this can happen when a
                       retransmitted or late Close arrives;
   * LISTEN:           dccp_rcv_state_process() will generate a Reset
                       (No Connection);
   * REQUEST:          perform Step 14 directly (no need to enqueue packet);
   * RESPOND:          dccp_check_req() will generate a Reset (Packet Error)
                       -- left it at that;
   * OPEN/PARTOPEN:    enter PASSIVE_CLOSE so that application has a chance to
                       process unread data;
   * CLOSEREQ:         server performed active-close -- perform Step 14;
   * CLOSING:          simultaneous-close: use a tie-breaker to avoid message
                       ping-pong (see comment);
   * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq
                       and now a Close);
   * TIMEWAIT:         packet is ignored.

   Note that the condition of receiving a packet in state CLOSED here is
   different from the condition "there is no socket for such a connection":
   the socket still exists, but its state indicates it is unusable.

   Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING =
   TCP_CLOSING, so that sk_stream_wait_close() will wait for the final Reset
   (which will trigger CLOSING => CLOSED).

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/dccp.h |1 -
 net/dccp/input.c |   88 +
 net/dccp/proto.c |   88 +-
 3 files changed, 131 insertions(+), 46 deletions(-)

diff --git a/include/linux/dccp.h b/include/linux/dccp.h
index 8b3f9ad..312b989 100644
--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -260,7 +260,6 @@ enum dccp_state {
 };
 
 #define DCCP_STATE_MASK 0x1f
-#define DCCP_ACTION_FIN (17)
 
 enum {
DCCPF_OPEN= TCPF_ESTABLISHED,
diff --git a/net/dccp/input.c b/net/dccp/input.c
index ef299fb..fe4b0fb 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -32,16 +32,56 @@ static void dccp_fin(struct sock *sk, struct sk_buff *skb)
 	sk->sk_data_ready(sk, 0);
 }
 
-static void dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
+static int dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
 {
-	dccp_send_reset(sk, DCCP_RESET_CODE_CLOSED);
-	dccp_fin(sk, skb);
-	dccp_set_state(sk, DCCP_CLOSED);
-	sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
+	int queued = 0;
+
+	switch (sk->sk_state) {
+	/*
+	 * We ignore Close when received in one of the following states:
+	 *  - CLOSED            (may be a late or duplicate packet)
+	 *  - PASSIVE_CLOSEREQ  (the peer has sent a CloseReq earlier)
+	 *  - RESPOND           (already handled by dccp_check_req)
+	 */
+	case DCCP_CLOSING:
+		/*
+		 * Simultaneous-close: receiving a Close after sending one. This
+		 * can happen if both client and server perform active-close and
+		 * will result in an endless ping-pong of crossing and retrans-
+		 * mitted Close packets, which only terminates when one of the
+		 * nodes times out (min. 64 seconds). Quicker convergence can be
+		 * achieved when one of the nodes acts as

[PATCH 3/5] [DCCP]: Dedicated auxiliary states to support passive-close

2007-11-28 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

This adds two auxiliary states to deal with passive closes:
  * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
  * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
as internal intermediate states.

These states are used to allow a receiver to process unread data before
acknowledging the received connection-termination-request (the Close/CloseReq).

Without such support, it will happen that passively-closed sockets enter CLOSED
state while there is still unprocessed data in the queue; leading to unexpected
and erratic API behaviour.

PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will
seamlessly work with inet_accept() (which tests for this state).

The state names are thanks to Arnaldo, who suggested this naming scheme
following an earlier revision of this patch.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/dccp.h |   56 ++---
 net/dccp/proto.c |   22 ++-
 2 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/include/linux/dccp.h b/include/linux/dccp.h
index a007326..8b3f9ad 100644
--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -227,29 +227,51 @@ struct dccp_so_feat {
 #include <net/tcp_states.h>
 
 enum dccp_state {
-   DCCP_OPEN   = TCP_ESTABLISHED,
-   DCCP_REQUESTING = TCP_SYN_SENT,
-   DCCP_LISTEN = TCP_LISTEN,
-   DCCP_RESPOND= TCP_SYN_RECV,
-   DCCP_CLOSING= TCP_CLOSING,
-   DCCP_TIME_WAIT  = TCP_TIME_WAIT,
-   DCCP_CLOSED = TCP_CLOSE,
-   DCCP_PARTOPEN   = TCP_MAX_STATES,
+   DCCP_OPEN            = TCP_ESTABLISHED,
+   DCCP_REQUESTING      = TCP_SYN_SENT,
+   DCCP_LISTEN          = TCP_LISTEN,
+   DCCP_RESPOND         = TCP_SYN_RECV,
+   /*
+    * States involved in closing a DCCP connection:
+    * 1) ACTIVE_CLOSEREQ is entered by a server sending a CloseReq.
+    *
+    * 2) CLOSING can have three different meanings (RFC 4340, 8.3):
+    *  a. Client has performed active-close, has sent a Close to the server
+    *     from state OPEN or PARTOPEN, and is waiting for the final Reset
+    *     (in this case, SOCK_DONE == 1).
+    *  b. Client is asked to perform passive-close, by receiving a CloseReq
+    *     in (PART)OPEN state. It sends a Close and waits for final Reset
+    *     (in this case, SOCK_DONE == 0).
+    *  c. Server performs an active-close as in (a), keeps TIMEWAIT state.
+    *
+    * 3) The following intermediate states are employed to give passively
+    *    closing nodes a chance to process their unread data:
+    *    - PASSIVE_CLOSE    (from OPEN => CLOSED) and
+    *    - PASSIVE_CLOSEREQ (from (PART)OPEN to CLOSING; case (b) above).
+    */
+   DCCP_ACTIVE_CLOSEREQ = TCP_FIN_WAIT1,
+   DCCP_PASSIVE_CLOSE   = TCP_CLOSE_WAIT,  /* any node receiving a Close */
+   DCCP_CLOSING         = TCP_CLOSING,
+   DCCP_TIME_WAIT       = TCP_TIME_WAIT,
+   DCCP_CLOSED          = TCP_CLOSE,
+   DCCP_PARTOPEN        = TCP_MAX_STATES,
+   DCCP_PASSIVE_CLOSEREQ,                  /* clients receiving CloseReq */
    DCCP_MAX_STATES
 };
 
-#define DCCP_STATE_MASK 0xf
+#define DCCP_STATE_MASK 0x1f
 #define DCCP_ACTION_FIN (1<<7)
 
 enum {
-   DCCPF_OPEN   = TCPF_ESTABLISHED,
-   DCCPF_REQUESTING = TCPF_SYN_SENT,
-   DCCPF_LISTEN = TCPF_LISTEN,
-   DCCPF_RESPOND= TCPF_SYN_RECV,
-   DCCPF_CLOSING= TCPF_CLOSING,
-   DCCPF_TIME_WAIT  = TCPF_TIME_WAIT,
-   DCCPF_CLOSED = TCPF_CLOSE,
-   DCCPF_PARTOPEN   = 1 << DCCP_PARTOPEN,
+   DCCPF_OPEN= TCPF_ESTABLISHED,
+   DCCPF_REQUESTING  = TCPF_SYN_SENT,
+   DCCPF_LISTEN  = TCPF_LISTEN,
+   DCCPF_RESPOND = TCPF_SYN_RECV,
+   DCCPF_ACTIVE_CLOSEREQ = TCPF_FIN_WAIT1,
+   DCCPF_CLOSING = TCPF_CLOSING,
+   DCCPF_TIME_WAIT   = TCPF_TIME_WAIT,
+   DCCPF_CLOSED  = TCPF_CLOSE,
+   DCCPF_PARTOPEN= (1 << DCCP_PARTOPEN),
 };
 
 static inline struct dccp_hdr *dccp_hdr(const struct sk_buff *skb)
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 73006b7..3489d3f 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -60,8 +60,7 @@ void dccp_set_state(struct sock *sk, const int state)
 {
const int oldstate = sk->sk_state;
 
-   dccp_pr_debug("%s(%p) %-10.10s -> %s\n",
- dccp_role(sk), sk,
+   dccp_pr_debug("%s(%p)  %s  -->  %s\n", dccp_role(sk), sk,
  dccp_state_name(oldstate), dccp_state_name(state));
WARN_ON(state == oldstate);
 
@@ -134,14 +133,17 @@ EXPORT_SYMBOL_GPL(dccp_packet_name);
 const char *dccp_state_name(const int state)
 {
static char *dccp_state_names[] = {
-   [DCCP_OPEN]   = "OPEN",
-   [DCCP_REQUESTING] = 
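
A note on the DCCPF_* flags in the include/linux/dccp.h hunk above: each flag
is one bit at the position given by the state's number, so a set of states
can be tested with a single mask operation. The following is only a minimal
sketch of that convention. All demo_/DEMO_ names and the numeric values are
hypothetical stand-ins chosen for illustration; they are not taken from the
kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
enum demo_state {
	DEMO_OPEN            = 1,
	DEMO_ACTIVE_CLOSEREQ = 4,
	DEMO_CLOSING         = 11,
	DEMO_PARTOPEN        = 12,
	DEMO_PASSIVE_CLOSEREQ,	/* implicitly DEMO_PARTOPEN + 1 */
	DEMO_MAX_STATES
};

/* One bit per state, following the (1 << STATE) convention of the flags. */
#define DEMO_F(state)	(1u << (state))

/* Returns true if 'state' is a member of the bit set 'mask'. */
static bool demo_state_in(enum demo_state state, unsigned int mask)
{
	assert(state < DEMO_MAX_STATES);
	return (DEMO_F(state) & mask) != 0;
}
```

With these stand-ins, demo_state_in(sk_state, DEMO_F(DEMO_CLOSING) |
DEMO_F(DEMO_ACTIVE_CLOSEREQ)) asks "is the socket in either active closing
state?" in one test. An enumerator without an explicit value, such as
DEMO_PASSIVE_CLOSEREQ here, simply takes the next integer after its
predecessor.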

[PATCH 2/5] Use AF-independent rebuild_header routine

2007-11-28 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

[DCCP]: Use AF-independent rebuild_header routine

This fixes a nasty bug: dccp_send_reset() is called by both DCCPv4 and DCCPv6,
but uses inet_sk_rebuild_header() in each case. This leads to unpredictable and
weird behaviour: under some conditions DCCPv6 Resets were sent, in others not.

The fix is to use the AF-independent rebuild_header routine.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/output.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/dccp/output.c b/net/dccp/output.c
index 33ce737..7caa7f5 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -391,7 +391,7 @@ int dccp_send_reset(struct sock *sk, enum dccp_reset_codes code)
 * FIXME: what if rebuild_header fails?
 * Should we be doing a rebuild_header here?
 */
-   int err = inet_sk_rebuild_header(sk);
+   int err = inet_csk(sk)->icsk_af_ops->rebuild_header(sk);
 
if (err != 0)
return err;
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe dccp in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
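
For readers following the rebuild_header fix above, the pattern it relies on
is a per-address-family table of function pointers stored with the socket.
The caller dispatches through that table instead of naming the IPv4 routine
directly. Below is a minimal userspace sketch of the idea; every demo_ name
is hypothetical and the structures are heavily simplified, so this is not
the kernel's actual layout.

```c
#include <assert.h>

struct demo_sock;

/* Hypothetical per-address-family operations table. */
struct demo_af_ops {
	int (*rebuild_header)(struct demo_sock *sk);
};

struct demo_sock {
	const struct demo_af_ops *af_ops;	/* set when the socket is created */
	int last_family;	/* records which routine ran; illustration only */
};

static int demo_v4_rebuild_header(struct demo_sock *sk)
{
	sk->last_family = 4;
	return 0;
}

static int demo_v6_rebuild_header(struct demo_sock *sk)
{
	sk->last_family = 6;
	return 0;
}

static const struct demo_af_ops demo_ipv4_ops = {
	.rebuild_header = demo_v4_rebuild_header,
};

static const struct demo_af_ops demo_ipv6_ops = {
	.rebuild_header = demo_v6_rebuild_header,
};

/* Dispatch through the socket's own table, not a hard-coded IPv4 routine. */
static int demo_send_reset(struct demo_sock *sk)
{
	int err;

	assert(sk->af_ops && sk->af_ops->rebuild_header);
	err = sk->af_ops->rebuild_header(sk);
	if (err != 0)
		return err;
	/* ... a real implementation would build and send the Reset here ... */
	return 0;
}
```

A socket initialised with &demo_ipv6_ops then reaches the IPv6 routine
without demo_send_reset() needing to know the address family, which is the
property the one-line fix depends on.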


[PATCHES 0/5]: DCCP patches for 2.6.25

2007-11-28 Thread Arnaldo Carvalho de Melo
Hi Herbert,

Please consider pulling from:

master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25

Best Regards,

- Arnaldo

 b/include/linux/dccp.h|   56 +
 b/net/dccp/ccids/ccid3.c  |   57 --
 b/net/dccp/ccids/ccid3.h  |3 
 b/net/dccp/ccids/lib/loss_interval.c  |   12 +-
 b/net/dccp/ccids/lib/packet_history.c |  138 +++---
 b/net/dccp/ccids/lib/packet_history.h |   80 +++
 b/net/dccp/input.c|   88 +
 b/net/dccp/output.c   |3 
 b/net/dccp/proto.c|   23 ++---
 include/linux/dccp.h  |1 
 net/dccp/input.c  |7 -
 net/dccp/proto.c  |   89 ++---
 12 files changed, 287 insertions(+), 270 deletions(-)


[PATCH 5/5] [DCCP]: Remove duplicate test for CloseReq

2007-11-28 Thread Arnaldo Carvalho de Melo
From: Gerrit Renker [EMAIL PROTECTED]

This removes a redundant test for unexpected packet types. In
dccp_rcv_state_process() it is tested twice whether a DCCP-server has received
a CloseReq (Step 7):

 * first in the combined if-statement,
 * then in the call to dccp_rcv_closereq().

The latter is necessary since dccp_rcv_closereq() is also called from
__dccp_rcv_established().

This patch removes the duplicate test.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/input.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/dccp/input.c b/net/dccp/input.c
index fe4b0fb..decf2f2 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -629,16 +629,14 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
return 0;
/*
 *   Step 7: Check for unexpected packet types
-*  If (S.is_server and P.type == CloseReq)
-*  or (S.is_server and P.type == Response)
+*  If (S.is_server and P.type == Response)
 *  or (S.is_client and P.type == Request)
 *  or (S.state == RESPOND and P.type == Data),
 *Send Sync packet acknowledging P.seqno
 *Drop packet and return
 */
} else if ((dp->dccps_role != DCCP_ROLE_CLIENT &&
-   (dh->dccph_type == DCCP_PKT_RESPONSE ||
-dh->dccph_type == DCCP_PKT_CLOSEREQ)) ||
+   dh->dccph_type == DCCP_PKT_RESPONSE) ||
(dp->dccps_role == DCCP_ROLE_CLIENT &&
 dh->dccph_type == DCCP_PKT_REQUEST) ||
(sk->sk_state == DCCP_RESPOND &&
-- 
1.5.3.4
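
The Step 7 condition as it reads after this patch can be written as a
standalone predicate, which makes it easier to see that a CloseReq arriving
at a server is no longer caught here and is left to dccp_rcv_closereq().
This is a sketch only: the demo_ types and the in_respond flag are
hypothetical simplifications of the role, packet type and socket state used
in the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the role and packet-type fields. */
enum demo_role { DEMO_ROLE_CLIENT, DEMO_ROLE_SERVER };
enum demo_pkt  { DEMO_PKT_REQUEST, DEMO_PKT_RESPONSE,
		 DEMO_PKT_DATA, DEMO_PKT_CLOSEREQ };

/*
 * Step 7 after the patch: true if the packet type is unexpected, in which
 * case a Sync is sent and the packet is dropped.
 */
static bool demo_step7_unexpected(enum demo_role role, enum demo_pkt type,
				  bool in_respond)
{
	return (role != DEMO_ROLE_CLIENT && type == DEMO_PKT_RESPONSE) ||
	       (role == DEMO_ROLE_CLIENT && type == DEMO_PKT_REQUEST) ||
	       (in_respond && type == DEMO_PKT_DATA);
}

/* Small self-check of the cases discussed in the commit message. */
static void demo_step7_selfcheck(void)
{
	/* Server receiving a Response: still rejected in Step 7. */
	assert(demo_step7_unexpected(DEMO_ROLE_SERVER, DEMO_PKT_RESPONSE, false));
	/* Server receiving a CloseReq: no longer rejected in Step 7. */
	assert(!demo_step7_unexpected(DEMO_ROLE_SERVER, DEMO_PKT_CLOSEREQ, false));
	/* Client receiving a Request: rejected as before. */
	assert(demo_step7_unexpected(DEMO_ROLE_CLIENT, DEMO_PKT_REQUEST, false));
}
```

Calling demo_step7_selfcheck() exercises the three cases. The second one is
the behavioural point of the patch: the server-side CloseReq test now lives
in one place only.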
