[PATCH 0/4]: Revised support for passive-close
This is the revised and tested patch set for supporting passive-close. Testing was impaired by a nasty bug (patch #1) which led to the final Reset not being sent. Once that had been fixed, things worked out very smoothly.

Other than that, I think the patch set has much benefited from the revision. All possible combinations of server/client close have been tested, over an emulated IPv6 satellite link, and with different constellations (client closing earlier than the server, and vice versa).

There is one new point, simultaneous-close, which the patch now also deals with. It is possible (and was observed several times during testing) that client and server both perform an active-close, in which case both will retransmit their Close / wait for a Reset ad infinitum. The patch deals with this situation by letting the client act as tie-breaker, which was shown to work very satisfactorily:
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/closing_states/#simultaneous_close

Patch #1: Fixes a bug in the dccp_send_reset() routine.

Patch #2: Is an updated/revised version of the patch introducing the closing states. Changes:
  i)   combined with other patch (as suggested by Arnaldo);
  ii)  better/revised documentation;
  iii) uses the new naming scheme (also thanks to Arnaldo).

Patch #3: Revised and updated patch to support passive-close in terms of state transitions. The changes to the previous variant are:
  * explicit control over transitions into passive-intermediate states;
  * the leaked-skb issue has been fixed by tracking whether the skb is queued;
  * support for simultaneous active-close by client and server;
  * a detailed account of what happens in each state.

Patch #4: Removes a redundant test (a consequence of patch #3).

I have updated the test tree with regard to these patches.
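As a standalone illustration of the simultaneous-close tie-breaker described above (plain user-space C, not kernel code; all names here are made up for the sketch), the decision boils down to a single role check: a node in CLOSING that receives a crossing Close either tears the connection down with a Reset(Closed), or keeps ignoring the Close and waits for the peer's Reset.

```c
#include <assert.h>
#include <stdbool.h>

enum dccp_role { ROLE_CLIENT, ROLE_SERVER };

/*
 * Returns true if this node, being in CLOSING (it has sent its own Close)
 * and now receiving a crossing Close from the peer, should answer with
 * Reset(Closed) and tear down the connection. Only the client does so;
 * the server ignores the crossing Close and waits for the client's Reset.
 * This asymmetry is what ends the retransmission ping-pong quickly.
 */
static bool breaks_tie_on_simultaneous_close(enum dccp_role role)
{
        return role == ROLE_CLIENT;
}
```

With both roles hard-wired, exactly one side backs down, so convergence no longer depends on a (minimum 64 second) timeout.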
Before the end of the week I will take Arnaldo's changes for 2.6.25, backport them to the mainline basis that the test tree is based on, and update the test tree with regard to the latest changes (also updating the subtrees).

-
To unsubscribe from this list: send the line "unsubscribe dccp" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[PATCH 1/4]: Use AF-independent rebuild_header routine
[DCCP]: Use AF-independent rebuild_header routine

This fixes a nasty bug: dccp_send_reset() is called by both DCCPv4 and
DCCPv6, but uses inet_sk_rebuild_header() in each case. This leads to
unpredictable and weird behaviour: under some conditions, DCCPv6 Resets
were sent, in others not.

The fix is to use the AF-independent rebuild_header routine.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -391,7 +391,7 @@ int dccp_send_reset(struct sock *sk, enu
 		 * FIXME: what if rebuild_header fails?
 		 * Should we be doing a rebuild_header here?
 		 */
-		int err = inet_sk_rebuild_header(sk);
+		int err = inet_csk(sk)->icsk_af_ops->rebuild_header(sk);

 		if (err != 0)
 			return err;
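The shape of the fix above is ordinary function-pointer dispatch: the socket carries a per-address-family operations table, and the Reset path must call through it rather than hard-code the IPv4 helper. A minimal user-space sketch (hypothetical types and names, not the kernel's) of why this makes dccp_send_reset() work for both families:

```c
#include <assert.h>

/* Per-address-family operations table, analogous to icsk_af_ops. */
struct af_ops {
        int (*rebuild_header)(void *sk);
};

/* Toy socket: the ops pointer is set to v4 or v6 ops at creation time. */
struct fake_sock {
        const struct af_ops *icsk_af_ops;
};

static int v4_rebuild(void *sk) { (void)sk; return 4; }
static int v6_rebuild(void *sk) { (void)sk; return 6; }

static const struct af_ops v4_ops = { .rebuild_header = v4_rebuild };
static const struct af_ops v6_ops = { .rebuild_header = v6_rebuild };

/* AF-independent call, mirroring
 * inet_csk(sk)->icsk_af_ops->rebuild_header(sk): the same call site
 * rebuilds a v4 or v6 header depending on how the socket was created. */
static int rebuild_header(struct fake_sock *sk)
{
        return sk->icsk_af_ops->rebuild_header(sk);
}
```

Calling the v4 helper directly, as the buggy code did, is the equivalent of ignoring the table and always invoking v4_rebuild, which is why DCCPv6 Resets only went out when the v4 routine happened to succeed.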
[PATCH 4/4]: Remove duplicate test for CloseReq
[DCCP]: Remove duplicate test for CloseReq

This removes a redundant test for unexpected packet types. In
dccp_rcv_state_process it is tested twice whether a DCCP-server has
received a CloseReq (Step 7):
 * first in the combined if-statement,
 * then in the call to dccp_rcv_closereq().
The latter is necessary since dccp_rcv_closereq() is also called from
__dccp_rcv_established(). This patch removes the duplicate test.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 net/dccp/input.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -616,16 +616,14 @@ int dccp_rcv_state_process(struct sock *
 		return 0;
 		/*
 		 *   Step 7: Check for unexpected packet types
-		 *      If (S.is_server and P.type == CloseReq)
-		 *	    or (S.is_server and P.type == Response)
+		 *      If (S.is_server and P.type == Response)
 		 *	    or (S.is_client and P.type == Request)
 		 *	    or (S.state == RESPOND and P.type == Data),
 		 *	   Send Sync packet acknowledging P.seqno
 		 *	   Drop packet and return
 		 */
 	} else if ((dp->dccps_role != DCCP_ROLE_CLIENT &&
-		    (dh->dccph_type == DCCP_PKT_RESPONSE ||
-		     dh->dccph_type == DCCP_PKT_CLOSEREQ)) ||
+		    dh->dccph_type == DCCP_PKT_RESPONSE) ||
		   (dp->dccps_role == DCCP_ROLE_CLIENT &&
		    dh->dccph_type == DCCP_PKT_REQUEST) ||
		   (sk->sk_state == DCCP_RESPOND &&
[PATCH 2/4]: Dedicated auxiliary states to support passive-close
[DCCP]: Dedicated auxiliary states to support passive-close

This adds two auxiliary states to deal with passive closes:
 * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
 * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
as internal intermediate states.

These states are used to allow a receiver to process unread data before
acknowledging the received connection-termination-request (the
Close/CloseReq). Without such support, it will happen that
passively-closed sockets enter the CLOSED state while there is still
unprocessed data in the queue, leading to unexpected and erratic API
behaviour.

PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code
will seamlessly work with inet_accept() (which tests for this state).

The state names are thanks to Arnaldo, who suggested this naming scheme
following an earlier revision of this patch.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 include/linux/dccp.h | 56 +++
 net/dccp/proto.c     | 22 ++--
 2 files changed, 51 insertions(+), 27 deletions(-)

--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -227,29 +227,51 @@ struct dccp_so_feat {

 #include <net/tcp_states.h>

 enum dccp_state {
-	DCCP_OPEN	= TCP_ESTABLISHED,
-	DCCP_REQUESTING	= TCP_SYN_SENT,
-	DCCP_LISTEN	= TCP_LISTEN,
-	DCCP_RESPOND	= TCP_SYN_RECV,
-	DCCP_CLOSING	= TCP_CLOSING,
-	DCCP_TIME_WAIT	= TCP_TIME_WAIT,
-	DCCP_CLOSED	= TCP_CLOSE,
-	DCCP_PARTOPEN	= TCP_MAX_STATES,
+	DCCP_OPEN	     = TCP_ESTABLISHED,
+	DCCP_REQUESTING	     = TCP_SYN_SENT,
+	DCCP_LISTEN	     = TCP_LISTEN,
+	DCCP_RESPOND	     = TCP_SYN_RECV,
+	/*
+	 * States involved in closing a DCCP connection:
+	 * 1) ACTIVE_CLOSEREQ is entered by a server sending a CloseReq.
+	 *
+	 * 2) CLOSING can have three different meanings (RFC 4340, 8.3):
+	 *    a. Client has performed active-close, has sent a Close to the
+	 *       server from state OPEN or PARTOPEN, and is waiting for the
+	 *       final Reset (in this case, SOCK_DONE == 1).
+	 *    b. Client is asked to perform passive-close, by receiving a
+	 *       CloseReq in (PART)OPEN state. It sends a Close and waits for
+	 *       the final Reset (in this case, SOCK_DONE == 0).
+	 *    c. Server performs an active-close as in (a), keeps TIMEWAIT
+	 *       state.
+	 *
+	 * 3) The following intermediate states are employed to give passively
+	 *    closing nodes a chance to process their unread data:
+	 *    - PASSIVE_CLOSE    (from OPEN => CLOSED) and
+	 *    - PASSIVE_CLOSEREQ (from (PART)OPEN to CLOSING; case (b) above).
+	 */
+	DCCP_ACTIVE_CLOSEREQ = TCP_FIN_WAIT1,
+	DCCP_PASSIVE_CLOSE   = TCP_CLOSE_WAIT,	/* any node receiving a Close */
+	DCCP_CLOSING	     = TCP_CLOSING,
+	DCCP_TIME_WAIT	     = TCP_TIME_WAIT,
+	DCCP_CLOSED	     = TCP_CLOSE,
+	DCCP_PARTOPEN	     = TCP_MAX_STATES,
+	DCCP_PASSIVE_CLOSEREQ,			/* clients receiving CloseReq */
 	DCCP_MAX_STATES
 };

-#define DCCP_STATE_MASK 0xf
+#define DCCP_STATE_MASK 0x1f
 #define DCCP_ACTION_FIN (1<<7)

 enum {
-	DCCPF_OPEN	  = TCPF_ESTABLISHED,
-	DCCPF_REQUESTING  = TCPF_SYN_SENT,
-	DCCPF_LISTEN	  = TCPF_LISTEN,
-	DCCPF_RESPOND	  = TCPF_SYN_RECV,
-	DCCPF_CLOSING	  = TCPF_CLOSING,
-	DCCPF_TIME_WAIT	  = TCPF_TIME_WAIT,
-	DCCPF_CLOSED	  = TCPF_CLOSE,
-	DCCPF_PARTOPEN	  = (1 << DCCP_PARTOPEN),
+	DCCPF_OPEN	      = TCPF_ESTABLISHED,
+	DCCPF_REQUESTING      = TCPF_SYN_SENT,
+	DCCPF_LISTEN	      = TCPF_LISTEN,
+	DCCPF_RESPOND	      = TCPF_SYN_RECV,
+	DCCPF_ACTIVE_CLOSEREQ = TCPF_FIN_WAIT1,
+	DCCPF_CLOSING	      = TCPF_CLOSING,
+	DCCPF_TIME_WAIT	      = TCPF_TIME_WAIT,
+	DCCPF_CLOSED	      = TCPF_CLOSE,
+	DCCPF_PARTOPEN	      = (1 << DCCP_PARTOPEN),
 };

 static inline struct dccp_hdr *dccp_hdr(const struct sk_buff *skb)
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -60,8 +60,7 @@ void dccp_set_state(struct sock *sk, con
 {
 	const int oldstate = sk->sk_state;

-	dccp_pr_debug("%s(%p) %-10.10s -> %s\n",
-		      dccp_role(sk), sk,
+	dccp_pr_debug("%s(%p) %s --> %s\n", dccp_role(sk), sk,
 		      dccp_state_name(oldstate), dccp_state_name(state));
 	WARN_ON(state == oldstate);

@@ -134,14 +133,17 @@ EXPORT_SYMBOL_GPL(dccp_packet_name);
 const char *dccp_state_name(const int state)
 {
 	static char *dccp_state_names[] = {
-	[DCCP_OPEN]	  = "OPEN",
-	[DCCP_REQUESTING] = "REQUESTING",
-	[DCCP_PARTOPEN]	  = "PARTOPEN",
-	[DCCP_LISTEN]	  = "LISTEN",
-	[DCCP_RESPOND]	  = "RESPOND",
-	[DCCP_CLOSING]	  = "CLOSING",
-	[DCCP_TIME_WAIT]  = "TIME_WAIT",
-
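The reason mapping PASSIVE_CLOSE onto TCP_CLOSE_WAIT works transparently is the kernel's state-flag convention: each DCCPF_*/TCPF_* value is a one-hot bit (1 << state number), so membership in a *set* of states is a single AND, and inet_accept()'s existing TCPF_CLOSE_WAIT check picks up the new state for free. A small sketch of that convention (illustrative state numbers, not the kernel's actual values):

```c
#include <assert.h>

/* Illustrative state numbers -- the real values come from tcp_states.h. */
enum state { ST_OPEN = 1, ST_CLOSE_WAIT = 8, ST_PARTOPEN = 12 };

/* One-hot flag for a state, like TCPF_CLOSE_WAIT == (1 << TCP_CLOSE_WAIT). */
#define F(st) (1u << (st))

/* Test whether 'state' is in the set described by 'allowed_mask';
 * this is how inet_accept() checks a socket's state in one AND. */
static int state_in_set(unsigned int state, unsigned int allowed_mask)
{
        return (F(state) & allowed_mask) != 0;
}
```

A caller accepting sockets in OPEN or CLOSE_WAIT simply passes `F(ST_OPEN) | F(ST_CLOSE_WAIT)` as the mask; any state aliased onto ST_CLOSE_WAIT passes the same check without the caller changing.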
[PATCH 3/4]: Integrate state transitions for passive-close
[DCCP]: Integrate state transitions for passive-close

This adds the necessary state transitions for the two forms of passive-close:
 * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
 * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.

Here is a detailed account of what the patch does in each state.

1) Receiving CloseReq
---------------------
The pseudo-code in 8.5 says:

   Step 13: Process CloseReq
      If P.type == CloseReq and S.state < CLOSEREQ,
         Generate Close
         S.state := CLOSING
         Set CLOSING timer.

This means we need to address what to do in CLOSED, LISTEN, REQUEST,
RESPOND, PARTOPEN, and OPEN.

 * CLOSED:         silently ignore - it may be a late or duplicate CloseReq;
 * LISTEN/RESPOND: will not appear, since Step 7 is performed first
                   (we know we are the client);
 * REQUEST:        perform Step 13 directly (no need to enqueue packet);
 * OPEN/PARTOPEN:  enter PASSIVE_CLOSEREQ so that the application has a
                   chance to process unread data.

When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any
other state, the CloseReq is ignored. I think that this offers some
robustness against rare and pathological cases: e.g. a simultaneous close
where the client sends a Close and the server a CloseReq. The client will
then be retransmitting its Close until it gets the Reset, so ignoring the
CloseReq while in state CLOSING is sane.

2) Receiving Close
------------------
The code below from 8.5 is unconditional.

   Step 14: Process Close
      If P.type == Close,
         Generate Reset(Closed)
         Tear down connection
         Drop packet and return

Thus we need to consider all states:
 * CLOSED:          silently ignore, since this can happen when a
                    retransmitted or late Close arrives;
 * LISTEN:          dccp_rcv_state_process() will generate a Reset
                    ("No Connection");
 * REQUEST:         perform Step 14 directly (no need to enqueue packet);
 * RESPOND:         dccp_check_req() will generate a Reset ("Packet
                    Error") -- left it at that;
 * OPEN/PARTOPEN:   enter PASSIVE_CLOSE so that the application has a
                    chance to process unread data;
 * CLOSEREQ:        server performed active-close -- perform Step 14;
 * CLOSING:         simultaneous-close: use a tie-breaker to avoid message
                    ping-pong (see comment);
 * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq
                    and now a Close);
 * TIMEWAIT:        packet is ignored.

Note that the condition of receiving a packet in state CLOSED here is
different from the condition "there is no socket for such a connection":
the socket still exists, but its state indicates it is unusable.

Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING
(== TCP_CLOSING), so that sk_stream_wait_close() will wait for the final
Reset (which will trigger CLOSING => CLOSED).
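The per-state handling of an incoming Close described above can be condensed into one dispatch function. The sketch below is an illustration of that decision table in plain C (state and action names are made up for the example; the kernel encodes this in the switch inside dccp_rcv_close()):

```c
#include <assert.h>

enum st   { CLOSED, LISTEN, REQUEST, RESPOND, OPEN, PARTOPEN,
            ACTIVE_CLOSEREQ, PASSIVE_CLOSE, PASSIVE_CLOSEREQ,
            CLOSING, TIMEWAIT };
enum role { CLIENT, SERVER };
enum act  { IGNORE, STEP14_TEARDOWN, ENTER_PASSIVE_CLOSE };

/* What to do with an incoming Close, given current state and role. */
static enum act on_close(enum st s, enum role r)
{
        switch (s) {
        case OPEN:
        case PARTOPEN:
                /* Let the application drain unread data first. */
                return ENTER_PASSIVE_CLOSE;
        case REQUEST:
        case ACTIVE_CLOSEREQ:
                /* Perform Step 14 directly: Reset(Closed) + teardown. */
                return STEP14_TEARDOWN;
        case CLOSING:
                /* Simultaneous-close: only the client breaks the tie. */
                return r == CLIENT ? STEP14_TEARDOWN : IGNORE;
        default:
                /* CLOSED, PASSIVE_CLOSEREQ, TIMEWAIT, ...: ignore. */
                return IGNORE;
        }
}
```

LISTEN and RESPOND never reach this function in the kernel (dccp_rcv_state_process() and dccp_check_req() have already generated their Resets), which is why the default arm can lump them with the ignore cases here.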
Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
---
 include/linux/dccp.h |  1
 net/dccp/input.c     | 88 ++-
 net/dccp/proto.c     | 88 +--
 3 files changed, 131 insertions(+), 46 deletions(-)

--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -32,16 +32,56 @@ static void dccp_fin(struct sock *sk, st
 	sk->sk_data_ready(sk, 0);
 }

-static void dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
+static int dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
 {
-	dccp_send_reset(sk, DCCP_RESET_CODE_CLOSED);
-	dccp_fin(sk, skb);
-	dccp_set_state(sk, DCCP_CLOSED);
-	sk_wake_async(sk, 1, POLL_HUP);
+	int queued = 0;
+
+	switch (sk->sk_state) {
+	/*
+	 * We ignore Close when received in one of the following states:
+	 *  - CLOSED		(may be a late or duplicate packet)
+	 *  - PASSIVE_CLOSEREQ	(the peer has sent a CloseReq earlier)
+	 *  - RESPOND		(already handled by dccp_check_req)
+	 */
+	case DCCP_CLOSING:
+		/*
+		 * Simultaneous-close: receiving a Close after sending one. This
+		 * can happen if both client and server perform active-close and
+		 * will result in an endless ping-pong of crossing and retrans-
+		 * mitted Close packets, which only terminates when one of the
+		 * nodes times out (min. 64 seconds). Quicker convergence can be
+		 * achieved when one of the nodes acts as tie-breaker.
+		 * This is ok as both ends are done with data transfer and each
+		 * end is just waiting for the other to acknowledge termination.
+		 */
+		if (dccp_sk(sk)->dccps_role != DCCP_ROLE_CLIENT)
+			break;
+		/* fall through */
+	case DCCP_REQUESTING:
+	case DCCP_ACTIVE_CLOSEREQ:
+
Re: [PATCH 1/4]: Use AF-independent rebuild_header routine
Em Wed, Nov 28, 2007 at 08:35:08AM +, Gerrit Renker escreveu:
> [DCCP]: Use AF-independent rebuild_header routine
[...]

Thanks, applied.

- Arnaldo
Re: [PATCH 4/4]: Remove duplicate test for CloseReq
Em Wed, Nov 28, 2007 at 08:35:11AM +, Gerrit Renker escreveu:
> [DCCP]: Remove duplicate test for CloseReq
[...]

Thanks, applied.

- Arnaldo
Re: [PATCH 3/4]: Integrate state transitions for passive-close
Em Wed, Nov 28, 2007 at 08:35:10AM +, Gerrit Renker escreveu:
> [DCCP]: Integrate state transitions for passive-close
[...]

Applied
Re: [PATCH 2/4]: Dedicated auxiliary states to support passive-close
Em Wed, Nov 28, 2007 at 08:35:09AM +, Gerrit Renker escreveu:
> [DCCP]: Dedicated auxiliary states to support passive-close
[...]

Thanks, applied.

- Arnaldo
[PATCH 1/5] [TFRC]: Migrate TX history to singly-linked list
This patch was based on another made by Gerrit Renker, his changelog was:

--
The patch set migrates TFRC TX history to a singly-linked list.
The details are:
 * use of a consistent naming scheme (all TFRC functions now begin
   with `tfrc_');
 * allocation and cleanup are taken care of internally;
 * provision of a lookup function, which is used by the CCID TX
   infrastructure to determine the time a packet was sent (in turn
   used for RTT sampling);
 * integration of the new interface with the present use in CCID3.
--

Simplifications I did:

 . removing the tfrc_tx_hist_head that had a pointer to the list head
   and another for the slabcache.
 . No need for creating a slabcache for each CCID that wants to use the
   TFRC tx history routines, create a single slabcache when the
   dccp_tfrc_lib module init routine is called.

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c              |  57 --
 net/dccp/ccids/ccid3.h              |   3 +-
 net/dccp/ccids/lib/loss_interval.c  |  12 ++-
 net/dccp/ccids/lib/packet_history.c | 138 ---
 net/dccp/ccids/lib/packet_history.h |  79
 5 files changed, 102 insertions(+), 187 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 964ec91..2668de8 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -49,7 +49,6 @@ static int ccid3_debug;
 #define ccid3_pr_debug(format, a...)
 #endif

-static struct dccp_tx_hist *ccid3_tx_hist;
 static struct dccp_rx_hist *ccid3_rx_hist;

 /*
@@ -389,28 +388,18 @@ static void ccid3_hc_tx_packet_sent(struct sock *sk, int more,
 				    unsigned int len)
 {
 	struct ccid3_hc_tx_sock *hctx = ccid3_hc_tx_sk(sk);
-	struct dccp_tx_hist_entry *packet;

 	ccid3_hc_tx_update_s(hctx, len);

-	packet = dccp_tx_hist_entry_new(ccid3_tx_hist, GFP_ATOMIC);
-	if (unlikely(packet == NULL)) {
+	if (tfrc_tx_hist_add(&hctx->ccid3hctx_hist, dccp_sk(sk)->dccps_gss))
 		DCCP_CRIT("packet history - out of memory!");
-		return;
-	}
-	dccp_tx_hist_add_entry(&hctx->ccid3hctx_hist, packet);
-
-	packet->dccphtx_tstamp = ktime_get_real();
-	packet->dccphtx_seqno  = dccp_sk(sk)->dccps_gss;
-	packet->dccphtx_rtt    = hctx->ccid3hctx_rtt;
-	packet->dccphtx_sent   = 1;
 }

 static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 {
 	struct ccid3_hc_tx_sock *hctx = ccid3_hc_tx_sk(sk);
 	struct ccid3_options_received *opt_recv;
-	struct dccp_tx_hist_entry *packet;
+	struct tfrc_tx_hist_entry *packet;
 	ktime_t now;
 	unsigned long t_nfb;
 	u32 pinv, r_sample;
@@ -425,16 +414,19 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 	switch (hctx->ccid3hctx_state) {
 	case TFRC_SSTATE_NO_FBACK:
 	case TFRC_SSTATE_FBACK:
-		/* get packet from history to look up t_recvdata */
-		packet = dccp_tx_hist_find_entry(&hctx->ccid3hctx_hist,
-						 DCCP_SKB_CB(skb)->dccpd_ack_seq);
-		if (unlikely(packet == NULL)) {
-			DCCP_WARN("%s(%p), seqno %llu(%s) doesn't exist "
-				  "in history!\n", dccp_role(sk), sk,
-				  (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq,
-				  dccp_packet_name(DCCP_SKB_CB(skb)->dccpd_type));
+		/* estimate RTT from history if ACK number is valid */
+		packet = tfrc_tx_hist_find_entry(hctx->ccid3hctx_hist,
+						 DCCP_SKB_CB(skb)->dccpd_ack_seq);
+		if (packet == NULL) {
+			DCCP_WARN("%s(%p): %s with bogus ACK-%llu\n",
+				  dccp_role(sk), sk,
+				  dccp_packet_name(DCCP_SKB_CB(skb)->dccpd_type),
+				  (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq);
 			return;
 		}
+		/*
+		 * Garbage-collect older (irrelevant) entries
+		 */
+		tfrc_tx_hist_purge(&packet->next);

 		/* Update receive rate in units of 64 * bytes/second */
 		hctx->ccid3hctx_x_recv = opt_recv->ccid3or_receive_rate;
@@ -451,7 +443,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 		/*
 		 * Calculate new RTT sample and update moving average
 		 */
-		r_sample = dccp_sample_rtt(sk, ktime_us_delta(now, packet->dccphtx_tstamp));
+		r_sample = dccp_sample_rtt(sk, ktime_us_delta(now, packet->stamp));
 		hctx->ccid3hctx_rtt = tfrc_ewma(hctx->ccid3hctx_rtt, r_sample, 9);

 		if
[PATCH 4/5] [DCCP]: Integrate state transitions for passive-close
From: Gerrit Renker [EMAIL PROTECTED]

This adds the necessary state transitions for the two forms of passive-close:
 * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
 * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.

Here is a detailed account of what the patch does in each state.

1) Receiving CloseReq

The pseudo-code in 8.5 says:

   Step 13: Process CloseReq
      If P.type == CloseReq and S.state < CLOSEREQ,
         Generate Close
         S.state := CLOSING
         Set CLOSING timer.

This means we need to address what to do in CLOSED, LISTEN, REQUEST,
RESPOND, PARTOPEN, and OPEN.

 * CLOSED:         silently ignore - it may be a late or duplicate CloseReq;
 * LISTEN/RESPOND: will not appear, since Step 7 is performed first
                   (we know we are the client);
 * REQUEST:        perform Step 13 directly (no need to enqueue packet);
 * OPEN/PARTOPEN:  enter PASSIVE_CLOSEREQ so that the application has a
                   chance to process unread data.

When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any
other state, the CloseReq is ignored. I think that this offers some
robustness against rare and pathological cases: e.g. a simultaneous close
where the client sends a Close and the server a CloseReq. The client will
then be retransmitting its Close until it gets the Reset, so ignoring the
CloseReq while in state CLOSING is sane.

2) Receiving Close

The code below from 8.5 is unconditional.

   Step 14: Process Close
      If P.type == Close,
         Generate Reset(Closed)
         Tear down connection
         Drop packet and return

Thus we need to consider all states:
 * CLOSED:          silently ignore, since this can happen when a
                    retransmitted or late Close arrives;
 * LISTEN:          dccp_rcv_state_process() will generate a Reset
                    ("No Connection");
 * REQUEST:         perform Step 14 directly (no need to enqueue packet);
 * RESPOND:         dccp_check_req() will generate a Reset ("Packet
                    Error") -- left it at that;
 * OPEN/PARTOPEN:   enter PASSIVE_CLOSE so that the application has a
                    chance to process unread data;
 * CLOSEREQ:        server performed active-close -- perform Step 14;
 * CLOSING:         simultaneous-close: use a tie-breaker to avoid message
                    ping-pong (see comment);
 * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq
                    and now a Close);
 * TIMEWAIT:        packet is ignored.

Note that the condition of receiving a packet in state CLOSED here is
different from the condition "there is no socket for such a connection":
the socket still exists, but its state indicates it is unusable.

Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING
(== TCP_CLOSING), so that sk_stream_wait_close() will wait for the final
Reset (which will trigger CLOSING => CLOSED).
Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/dccp.h |  1 -
 net/dccp/input.c     | 88 +
 net/dccp/proto.c     | 88 +-
 3 files changed, 131 insertions(+), 46 deletions(-)

diff --git a/include/linux/dccp.h b/include/linux/dccp.h
index 8b3f9ad..312b989 100644
--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -260,7 +260,6 @@ enum dccp_state {
 };

 #define DCCP_STATE_MASK 0x1f
-#define DCCP_ACTION_FIN (1<<7)

 enum {
 	DCCPF_OPEN	      = TCPF_ESTABLISHED,
diff --git a/net/dccp/input.c b/net/dccp/input.c
index ef299fb..fe4b0fb 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -32,16 +32,56 @@ static void dccp_fin(struct sock *sk, struct sk_buff *skb)
 	sk->sk_data_ready(sk, 0);
 }

-static void dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
+static int dccp_rcv_close(struct sock *sk, struct sk_buff *skb)
 {
-	dccp_send_reset(sk, DCCP_RESET_CODE_CLOSED);
-	dccp_fin(sk, skb);
-	dccp_set_state(sk, DCCP_CLOSED);
-	sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
+	int queued = 0;
+
+	switch (sk->sk_state) {
+	/*
+	 * We ignore Close when received in one of the following states:
+	 *  - CLOSED		(may be a late or duplicate packet)
+	 *  - PASSIVE_CLOSEREQ	(the peer has sent a CloseReq earlier)
+	 *  - RESPOND		(already handled by dccp_check_req)
+	 */
+	case DCCP_CLOSING:
+		/*
+		 * Simultaneous-close: receiving a Close after sending one. This
+		 * can happen if both client and server perform active-close and
+		 * will result in an endless ping-pong of crossing and retrans-
+		 * mitted Close packets, which only terminates when one of the
+		 * nodes times out (min. 64 seconds). Quicker convergence can be
+		 * achieved when one of the nodes acts as
[PATCH 3/5] [DCCP]: Dedicated auxiliary states to support passive-close
From: Gerrit Renker [EMAIL PROTECTED]

This adds two auxiliary states to deal with passive closes:
 * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
 * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
as internal intermediate states.

These states are used to allow a receiver to process unread data before
acknowledging the received connection-termination-request (the
Close/CloseReq). Without such support, it will happen that
passively-closed sockets enter the CLOSED state while there is still
unprocessed data in the queue, leading to unexpected and erratic API
behaviour.

PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code
will seamlessly work with inet_accept() (which tests for this state).

The state names are thanks to Arnaldo, who suggested this naming scheme
following an earlier revision of this patch.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/dccp.h | 56 ++---
 net/dccp/proto.c     | 22 ++-
 2 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/include/linux/dccp.h b/include/linux/dccp.h
index a007326..8b3f9ad 100644
--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -227,29 +227,51 @@ struct dccp_so_feat {

 #include <net/tcp_states.h>

 enum dccp_state {
-	DCCP_OPEN	= TCP_ESTABLISHED,
-	DCCP_REQUESTING	= TCP_SYN_SENT,
-	DCCP_LISTEN	= TCP_LISTEN,
-	DCCP_RESPOND	= TCP_SYN_RECV,
-	DCCP_CLOSING	= TCP_CLOSING,
-	DCCP_TIME_WAIT	= TCP_TIME_WAIT,
-	DCCP_CLOSED	= TCP_CLOSE,
-	DCCP_PARTOPEN	= TCP_MAX_STATES,
+	DCCP_OPEN	     = TCP_ESTABLISHED,
+	DCCP_REQUESTING	     = TCP_SYN_SENT,
+	DCCP_LISTEN	     = TCP_LISTEN,
+	DCCP_RESPOND	     = TCP_SYN_RECV,
+	/*
+	 * States involved in closing a DCCP connection:
+	 * 1) ACTIVE_CLOSEREQ is entered by a server sending a CloseReq.
+	 *
+	 * 2) CLOSING can have three different meanings (RFC 4340, 8.3):
+	 *    a. Client has performed active-close, has sent a Close to the
+	 *       server from state OPEN or PARTOPEN, and is waiting for the
+	 *       final Reset (in this case, SOCK_DONE == 1).
+	 *    b. Client is asked to perform passive-close, by receiving a
+	 *       CloseReq in (PART)OPEN state. It sends a Close and waits for
+	 *       the final Reset (in this case, SOCK_DONE == 0).
+	 *    c. Server performs an active-close as in (a), keeps TIMEWAIT
+	 *       state.
+	 *
+	 * 3) The following intermediate states are employed to give passively
+	 *    closing nodes a chance to process their unread data:
+	 *    - PASSIVE_CLOSE    (from OPEN => CLOSED) and
+	 *    - PASSIVE_CLOSEREQ (from (PART)OPEN to CLOSING; case (b) above).
+	 */
+	DCCP_ACTIVE_CLOSEREQ = TCP_FIN_WAIT1,
+	DCCP_PASSIVE_CLOSE   = TCP_CLOSE_WAIT,	/* any node receiving a Close */
+	DCCP_CLOSING	     = TCP_CLOSING,
+	DCCP_TIME_WAIT	     = TCP_TIME_WAIT,
+	DCCP_CLOSED	     = TCP_CLOSE,
+	DCCP_PARTOPEN	     = TCP_MAX_STATES,
+	DCCP_PASSIVE_CLOSEREQ,			/* clients receiving CloseReq */
 	DCCP_MAX_STATES
 };

-#define DCCP_STATE_MASK 0xf
+#define DCCP_STATE_MASK 0x1f
 #define DCCP_ACTION_FIN (1<<7)

 enum {
-	DCCPF_OPEN	  = TCPF_ESTABLISHED,
-	DCCPF_REQUESTING  = TCPF_SYN_SENT,
-	DCCPF_LISTEN	  = TCPF_LISTEN,
-	DCCPF_RESPOND	  = TCPF_SYN_RECV,
-	DCCPF_CLOSING	  = TCPF_CLOSING,
-	DCCPF_TIME_WAIT	  = TCPF_TIME_WAIT,
-	DCCPF_CLOSED	  = TCPF_CLOSE,
-	DCCPF_PARTOPEN	  = 1 << DCCP_PARTOPEN,
+	DCCPF_OPEN	      = TCPF_ESTABLISHED,
+	DCCPF_REQUESTING      = TCPF_SYN_SENT,
+	DCCPF_LISTEN	      = TCPF_LISTEN,
+	DCCPF_RESPOND	      = TCPF_SYN_RECV,
+	DCCPF_ACTIVE_CLOSEREQ = TCPF_FIN_WAIT1,
+	DCCPF_CLOSING	      = TCPF_CLOSING,
+	DCCPF_TIME_WAIT	      = TCPF_TIME_WAIT,
+	DCCPF_CLOSED	      = TCPF_CLOSE,
+	DCCPF_PARTOPEN	      = (1 << DCCP_PARTOPEN),
 };

 static inline struct dccp_hdr *dccp_hdr(const struct sk_buff *skb)
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 73006b7..3489d3f 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -60,8 +60,7 @@ void dccp_set_state(struct sock *sk, const int state)
 {
 	const int oldstate = sk->sk_state;

-	dccp_pr_debug("%s(%p) %-10.10s -> %s\n",
-		      dccp_role(sk), sk,
+	dccp_pr_debug("%s(%p) %s --> %s\n", dccp_role(sk), sk,
 		      dccp_state_name(oldstate), dccp_state_name(state));
 	WARN_ON(state == oldstate);

@@ -134,14 +133,17 @@ EXPORT_SYMBOL_GPL(dccp_packet_name);
 const char *dccp_state_name(const int state)
 {
 	static char *dccp_state_names[] = {
-	[DCCP_OPEN]	  = "OPEN",
-	[DCCP_REQUESTING] =
[PATCH 2/5] Use AF-independent rebuild_header routine
From: Gerrit Renker [EMAIL PROTECTED]

[DCCP]: Use AF-independent rebuild_header routine

This fixes a nasty bug: dccp_send_reset() is called by both DCCPv4 and
DCCPv6, but uses inet_sk_rebuild_header() in each case. This leads to
unpredictable and weird behaviour: under some conditions, DCCPv6 Resets
were sent, in others not.

The fix is to use the AF-independent rebuild_header routine.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/output.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/dccp/output.c b/net/dccp/output.c
index 33ce737..7caa7f5 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -391,7 +391,7 @@ int dccp_send_reset(struct sock *sk, enum dccp_reset_codes code)
 		 * FIXME: what if rebuild_header fails?
 		 * Should we be doing a rebuild_header here?
 		 */
-		int err = inet_sk_rebuild_header(sk);
+		int err = inet_csk(sk)->icsk_af_ops->rebuild_header(sk);

 		if (err != 0)
 			return err;
--
1.5.3.4
[PATCHES 0/5]: DCCP patches for 2.6.25
Hi Herbert,

	Please consider pulling from:

master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25

Best Regards,

- Arnaldo

 b/include/linux/dccp.h                |  56 +
 b/net/dccp/ccids/ccid3.c              |  57 --
 b/net/dccp/ccids/ccid3.h              |   3
 b/net/dccp/ccids/lib/loss_interval.c  |  12 +-
 b/net/dccp/ccids/lib/packet_history.c | 138 +++---
 b/net/dccp/ccids/lib/packet_history.h |  80 +++
 b/net/dccp/input.c                    |  88 +
 b/net/dccp/output.c                   |   3
 b/net/dccp/proto.c                    |  23 ++---
 include/linux/dccp.h                  |   1
 net/dccp/input.c                      |   7 -
 net/dccp/proto.c                      |  89 ++---
 12 files changed, 287 insertions(+), 270 deletions(-)
[PATCH 5/5] [DCCP]: Remove duplicate test for CloseReq
From: Gerrit Renker [EMAIL PROTECTED]

This removes a redundant test for unexpected packet types. In
dccp_rcv_state_process it is tested twice whether a DCCP-server has
received a CloseReq (Step 7):
 * first in the combined if-statement,
 * then in the call to dccp_rcv_closereq().
The latter is necessary since dccp_rcv_closereq() is also called from
__dccp_rcv_established(). This patch removes the duplicate test.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/input.c | 6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/dccp/input.c b/net/dccp/input.c
index fe4b0fb..decf2f2 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -629,16 +629,14 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 		return 0;
 		/*
 		 *   Step 7: Check for unexpected packet types
-		 *      If (S.is_server and P.type == CloseReq)
-		 *	    or (S.is_server and P.type == Response)
+		 *      If (S.is_server and P.type == Response)
 		 *	    or (S.is_client and P.type == Request)
 		 *	    or (S.state == RESPOND and P.type == Data),
 		 *	   Send Sync packet acknowledging P.seqno
 		 *	   Drop packet and return
 		 */
 	} else if ((dp->dccps_role != DCCP_ROLE_CLIENT &&
-		    (dh->dccph_type == DCCP_PKT_RESPONSE ||
-		     dh->dccph_type == DCCP_PKT_CLOSEREQ)) ||
+		    dh->dccph_type == DCCP_PKT_RESPONSE) ||
		   (dp->dccps_role == DCCP_ROLE_CLIENT &&
		    dh->dccph_type == DCCP_PKT_REQUEST) ||
		   (sk->sk_state == DCCP_RESPOND &&
--
1.5.3.4