Re: [Announce]: Test results with latest CCID3 patch set
Here is my list of points, hoping that the others will add theirs, too:
 * it would be good to have a standardised set of scripts, for comparison/benchmarking
 * the built-in VoIP module only works for UDP -- is it possible to port this to DCCP?
 * as per the previous email, more complex traffic scenarios would be good, in particular
   - switching background traffic on/off at times, to observe TCP-/flow-friendliness
   - running multiple DCCP flows in parallel and at overlapping times

The scenario that I mostly use is limiting the bandwidth with a middlebox running TBF. However, all of the recent trees except 2.6.20final_dccp (2.6.20 patched with Ian's modifications) that I have tested fail to achieve acceptable transfer rates.

The test setup is as follows, unless otherwise mentioned:
- Two Linux boxes are connected through another Linux box (middlebox) running netem. The boxes at the edge are identical machines with a P4 2.4GHz and 512MB memory. The middlebox is a P4 2.6GHz with 512MB memory.
- TBF is used to limit the bandwidth. The TBF buffer is set as 1 bytes and the limit as 3 bytes.
- A 20ms constant delay is present in both the sender-receiver and the receiver-sender direction.
- iperf in bytestream mode is used in these tests, and the streaming duration is set to 60 seconds.
- DCCPv4 is used in the tests.

Below are the results for the two different trees:

Kernel: 2.6.20final_dccp (2.6.20 davem tree with Ian's patches, except for the best_packet_next patch. The patches applied are not the latest ones, which were updated on the 2nd of December, but the ones before those.)
Results:
Bottleneck=1000Kbps, tx_qlen=5:                 0.0-60.3 sec   6.86 MBytes   955 Kbits/sec
Bottleneck=2000Kbps, tx_qlen=5:                 0.0-60.1 sec   13.3 MBytes   1.86 Mbits/sec
Bottleneck=5000Kbps, tx_qlen=5:                 0.0-60.0 sec   27.2 MBytes   3.80 Mbits/sec
Bottleneck=10000Kbps, limit=5bytes, tx_qlen=5:  0.0-60.0 sec   43.6 MBytes   6.09 Mbits/sec

Kernel: 2.6.24-rc4 (Gerrit's latest tree):
Results (Bottleneck bandwidth, tx_qlen: iperf server output):
Bottleneck=1000Kbps, tx_qlen=5:   0.0-168.2 sec   107 KBytes    5.21 Kbits/sec
Bottleneck=1000Kbps, tx_qlen=0:   0.0-147.8 sec   157 KBytes    8.71 Kbits/sec
Bottleneck=2000Kbps, tx_qlen=5:   0.0-89.0 sec    138 KBytes    12.7 Kbits/sec
Bottleneck=2000Kbps, tx_qlen=0:   0.0-103.9 sec   203 KBytes    16.0 Kbits/sec
Bottleneck=5000Kbps, tx_qlen=5:   0.0-66.6 sec    222 KBytes    27.3 Kbits/sec
Bottleneck=5000Kbps, tx_qlen=0:   0.0-100.2 sec   314 KBytes    25.7 Kbits/sec
Bottleneck=10000Kbps, tx_qlen=5:  0.0-60.5 sec    1.30 MBytes   181 Kbits/sec
Bottleneck=10000Kbps, tx_qlen=0:  0.0-65.3 sec    922 KBytes    116 Kbits/sec

While writing this mail, I noticed that Ian has updated his patch set for 2.6.20. I will use this set and repeat the tests this week, hopefully. Moreover, I also tested Ian's patches on other trees (2.6.22 and 2.6.24), but the results were not as good as the 2.6.20final_dccp results, if I remember correctly. I can go over them again, if necessary.

Gerrit

Burak

-
To unsubscribe from this list: send the line "unsubscribe dccp" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[PATCH 2/7] [TFRC]: Loss interval code needs the macros/inlines that were moved
From: Gerrit Renker [EMAIL PROTECTED]

This moves the inlines (which were previously declared as macros) back into
packet_history.h, since the loss detection code needs to be able to read
entries from the RX history in order to create the relevant loss entries: it
needs at least tfrc_rx_hist_loss_prev() and tfrc_rx_hist_last_rcv(), which in
turn require the definition of the other inlines (macros).

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/packet_history.c |   35 ---------------------------------
 net/dccp/ccids/lib/packet_history.h |   35 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/net/dccp/ccids/lib/packet_history.c b/net/dccp/ccids/lib/packet_history.c
index 727b17d..dd2cf2d 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -151,23 +151,6 @@ void tfrc_rx_packet_history_exit(void)
 	}
 }
 
-/**
- * tfrc_rx_hist_index - index to reach n-th entry after loss_start
- */
-static inline u8 tfrc_rx_hist_index(const struct tfrc_rx_hist *h, const u8 n)
-{
-	return (h->loss_start + n) & TFRC_NDUPACK;
-}
-
-/**
- * tfrc_rx_hist_last_rcv - entry with highest-received-seqno so far
- */
-static inline struct tfrc_rx_hist_entry *
-			tfrc_rx_hist_last_rcv(const struct tfrc_rx_hist *h)
-{
-	return h->ring[tfrc_rx_hist_index(h, h->loss_count)];
-}
-
 void tfrc_rx_hist_add_packet(struct tfrc_rx_hist *h,
 			     const struct sk_buff *skb,
 			     const u32 ndp)
@@ -183,24 +166,6 @@ void tfrc_rx_hist_add_packet(struct tfrc_rx_hist *h,
 }
 EXPORT_SYMBOL_GPL(tfrc_rx_hist_add_packet);
 
-/**
- * tfrc_rx_hist_entry - return the n-th history entry after loss_start
- */
-static inline struct tfrc_rx_hist_entry *
-			tfrc_rx_hist_entry(const struct tfrc_rx_hist *h, const u8 n)
-{
-	return h->ring[tfrc_rx_hist_index(h, n)];
-}
-
-/**
- * tfrc_rx_hist_loss_prev - entry with highest-received-seqno before loss was detected
- */
-static inline struct tfrc_rx_hist_entry *
-			tfrc_rx_hist_loss_prev(const struct tfrc_rx_hist *h)
-{
-	return h->ring[h->loss_start];
-}
-
 /* has the packet contained in skb been seen before? */
 int tfrc_rx_hist_duplicate(struct tfrc_rx_hist *h, struct sk_buff *skb)
 {
diff --git a/net/dccp/ccids/lib/packet_history.h b/net/dccp/ccids/lib/packet_history.h
index 3dfd182..e58b0fc 100644
--- a/net/dccp/ccids/lib/packet_history.h
+++ b/net/dccp/ccids/lib/packet_history.h
@@ -84,6 +84,41 @@ struct tfrc_rx_hist {
 #define rtt_sample_prev		loss_start
 };
 
+/**
+ * tfrc_rx_hist_index - index to reach n-th entry after loss_start
+ */
+static inline u8 tfrc_rx_hist_index(const struct tfrc_rx_hist *h, const u8 n)
+{
+	return (h->loss_start + n) & TFRC_NDUPACK;
+}
+
+/**
+ * tfrc_rx_hist_last_rcv - entry with highest-received-seqno so far
+ */
+static inline struct tfrc_rx_hist_entry *
+			tfrc_rx_hist_last_rcv(const struct tfrc_rx_hist *h)
+{
+	return h->ring[tfrc_rx_hist_index(h, h->loss_count)];
+}
+
+/**
+ * tfrc_rx_hist_entry - return the n-th history entry after loss_start
+ */
+static inline struct tfrc_rx_hist_entry *
+			tfrc_rx_hist_entry(const struct tfrc_rx_hist *h, const u8 n)
+{
+	return h->ring[tfrc_rx_hist_index(h, n)];
+}
+
+/**
+ * tfrc_rx_hist_loss_prev - entry with highest-received-seqno before loss was detected
+ */
+static inline struct tfrc_rx_hist_entry *
+			tfrc_rx_hist_loss_prev(const struct tfrc_rx_hist *h)
+{
+	return h->ring[h->loss_start];
+}
+
 extern void tfrc_rx_hist_add_packet(struct tfrc_rx_hist *h,
 				    const struct sk_buff *skb, const u32 ndp);
-- 
1.5.3.4
Re: [PATCH 8/8] [PATCH v2] [CCID3]: Interface CCID3 code with newer Loss Intervals Database
| This time around I'm not doing any reordering, just trying to use your
| patches as is, but adding this patch as-is produces a kernel that will
| crash, no?
|
| The loss history and the RX/TX packet history slabs are all created in
| tfrc.c using the three different __init routines of the dccp_tfrc_lib.
|
| Yes, the init routines are called and in turn they create the slab
| caches, but up to the patch "[PATCH 8/8] [PATCH v2] [CCID3]: Interface
| CCID3 code with newer Loss Intervals Database" the new li slab is not
| being created, no? See what I'm talking about?
|
Sorry, there is some weird kind of mix-up going on. Can you please check your
patch set: it seems this email exchange refers to an older variant. In the
most recent patch set, the slab is introduced in the patch
"[TFRC]: Ringbuffer to track loss interval history":

--- a/net/dccp/ccids/lib/loss_interval.c
+++ b/net/dccp/ccids/lib/loss_interval.c
@@ -27,6 +23,54 @@ struct dccp_li_hist_entry {
 	u32		 dccplih_interval;
 };
 
+static struct kmem_cache  *tfrc_lh_slab  __read_mostly;	/* <=== */

+/* Loss Interval weights from [RFC 3448, 5.4], scaled by 10 */
+static const int tfrc_lh_weights[NINTERVAL] = { 10, 10, 10, 10, 8, 6, 4, 2 };

// ... And this is 6/8, i.e. before 8/8, cf.
http://www.mail-archive.com/dccp@vger.kernel.org/msg03000.html

I don't know which tree you are working off; would it be possible to check
against the test tree git://eden-feed.erg.abdn.ac.uk/dccp_exp  [dccp]
Re: [PATCHES 0/7]: DCCP patches for 2.6.25
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Wed, 12 Dec 2007 14:36:46 -0200

 Please consider pulling from:

 master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25

Pulled and pushed out to net-2.6.25, thanks!
[PATCH 5/7] [TFRC]: CCID3 (and CCID4) needs to access these inlines
From: Gerrit Renker [EMAIL PROTECTED]

This moves two inlines back to packet_history.h: these are not private to
packet_history.c, but are needed by CCID3/4 to detect whether a new loss is
indicated, or whether a loss is already pending.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/packet_history.c |   26 --------------------------
 net/dccp/ccids/lib/packet_history.h |   35 +++++++++++++++++++++++++++++----
 2 files changed, 31 insertions(+), 30 deletions(-)

diff --git a/net/dccp/ccids/lib/packet_history.c b/net/dccp/ccids/lib/packet_history.c
index 5b10a1e..20af1a6 100644
--- a/net/dccp/ccids/lib/packet_history.c
+++ b/net/dccp/ccids/lib/packet_history.c
@@ -191,32 +191,6 @@ int tfrc_rx_hist_duplicate(struct tfrc_rx_hist *h, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(tfrc_rx_hist_duplicate);
 
-/* initialise loss detection and disable RTT sampling */
-static inline void tfrc_rx_hist_loss_indicated(struct tfrc_rx_hist *h)
-{
-	h->loss_count = 1;
-}
-
-/* indicate whether previously a packet was detected missing */
-static inline int tfrc_rx_hist_loss_pending(const struct tfrc_rx_hist *h)
-{
-	return h->loss_count;
-}
-
-/* any data packets missing between last reception and skb ? */
-int tfrc_rx_hist_new_loss_indicated(struct tfrc_rx_hist *h,
-				    const struct sk_buff *skb, u32 ndp)
-{
-	int delta = dccp_delta_seqno(tfrc_rx_hist_last_rcv(h)->tfrchrx_seqno,
-				     DCCP_SKB_CB(skb)->dccpd_seq);
-
-	if (delta > 1 && ndp < delta)
-		tfrc_rx_hist_loss_indicated(h);
-
-	return tfrc_rx_hist_loss_pending(h);
-}
-EXPORT_SYMBOL_GPL(tfrc_rx_hist_new_loss_indicated);
-
 static void tfrc_rx_hist_swap(struct tfrc_rx_hist *h, const u8 a, const u8 b)
 {
 	const u8 idx_a = tfrc_rx_hist_index(h, a),
diff --git a/net/dccp/ccids/lib/packet_history.h b/net/dccp/ccids/lib/packet_history.h
index 24edd8d..c7eeda4 100644
--- a/net/dccp/ccids/lib/packet_history.h
+++ b/net/dccp/ccids/lib/packet_history.h
@@ -118,16 +118,43 @@ static inline struct tfrc_rx_hist_entry *
 	return h->ring[h->loss_start];
 }
 
+/* initialise loss detection and disable RTT sampling */
+static inline void tfrc_rx_hist_loss_indicated(struct tfrc_rx_hist *h)
+{
+	h->loss_count = 1;
+}
+
+/* indicate whether previously a packet was detected missing */
+static inline int tfrc_rx_hist_loss_pending(const struct tfrc_rx_hist *h)
+{
+	return h->loss_count;
+}
+
+/* any data packets missing between last reception and skb ? */
+static inline int tfrc_rx_hist_new_loss_indicated(struct tfrc_rx_hist *h,
+						  const struct sk_buff *skb,
+						  u32 ndp)
+{
+	int delta = dccp_delta_seqno(tfrc_rx_hist_last_rcv(h)->tfrchrx_seqno,
+				     DCCP_SKB_CB(skb)->dccpd_seq);
+
+	if (delta > 1 && ndp < delta)
+		tfrc_rx_hist_loss_indicated(h);
+
+	return tfrc_rx_hist_loss_pending(h);
+}
+
 extern void tfrc_rx_hist_add_packet(struct tfrc_rx_hist *h,
 				    const struct sk_buff *skb, const u32 ndp);
 extern int tfrc_rx_hist_duplicate(struct tfrc_rx_hist *h, struct sk_buff *skb);
-extern int tfrc_rx_hist_new_loss_indicated(struct tfrc_rx_hist *h,
-					   const struct sk_buff *skb, u32 ndp);
+
 struct tfrc_loss_hist;
-extern int tfrc_rx_handle_loss(struct tfrc_rx_hist *, struct tfrc_loss_hist *,
+extern int tfrc_rx_handle_loss(struct tfrc_rx_hist *h,
+			       struct tfrc_loss_hist *lh,
 			       struct sk_buff *skb, u32 ndp,
-			       u32 (*first_li)(struct sock *), struct sock *);
+			       u32 (*first_li)(struct sock *sk),
+			       struct sock *sk);
 extern u32 tfrc_rx_hist_sample_rtt(struct tfrc_rx_hist *h,
 				   const struct sk_buff *skb);
 extern int tfrc_rx_hist_alloc(struct tfrc_rx_hist *h);
-- 
1.5.3.4
[PATCH 4/7] [CCID3]: Redundant debugging output / documentation
From: Gerrit Renker [EMAIL PROTECTED]

Each time feedback is sent, two lines are printed:

	ccid3_hc_rx_send_feedback: client ... - entry
	ccid3_hc_rx_send_feedback: Interval ...usec, X_recv=..., 1/p=...

The first line is redundant and thus removed. Further, the documentation of
ccid3_hc_rx_sock (capitalisation) is made consistent.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c |    2 --
 net/dccp/ccids/ccid3.h |    4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 60fcb31..b92069b 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -685,8 +685,6 @@ static void ccid3_hc_rx_send_feedback(struct sock *sk,
 	ktime_t now;
 	s64 delta = 0;
 
-	ccid3_pr_debug("%s(%p) - entry \n", dccp_role(sk), sk);
-
 	if (unlikely(hcrx->ccid3hcrx_state == TFRC_RSTATE_TERM))
 		return;
 
diff --git a/net/dccp/ccids/ccid3.h b/net/dccp/ccids/ccid3.h
index 3c33dc6..6ceeb80 100644
--- a/net/dccp/ccids/ccid3.h
+++ b/net/dccp/ccids/ccid3.h
@@ -135,9 +135,9 @@ enum ccid3_hc_rx_states {
  *
  *  @ccid3hcrx_x_recv  -  Receiver estimate of send rate (RFC 3448 4.3)
  *  @ccid3hcrx_rtt  -  Receiver estimate of rtt (non-standard)
- *  @ccid3hcrx_p  -  current loss event rate (RFC 3448 5.4)
+ *  @ccid3hcrx_p  -  Current loss event rate (RFC 3448 5.4)
  *  @ccid3hcrx_last_counter  -  Tracks window counter (RFC 4342, 8.1)
- *  @ccid3hcrx_state  -  receiver state, one of %ccid3_hc_rx_states
+ *  @ccid3hcrx_state  -  Receiver state, one of %ccid3_hc_rx_states
  *  @ccid3hcrx_bytes_recv  -  Total sum of DCCP payload bytes
  *  @ccid3hcrx_tstamp_last_feedback  -  Time at which last feedback was sent
  *  @ccid3hcrx_tstamp_last_ack  -  Time at which last feedback was sent
-- 
1.5.3.4
[PATCHES 0/7]: DCCP patches for 2.6.25
Hi David,

Please consider pulling from:

master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25

Best Regards,

- Arnaldo

 b/net/dccp/ccids/ccid3.c              |    2 
 b/net/dccp/ccids/ccid3.h              |    5 
 b/net/dccp/ccids/lib/loss_interval.c  |  161 ++-
 b/net/dccp/ccids/lib/loss_interval.h  |   56 ++
 b/net/dccp/ccids/lib/packet_history.c |   68 +++-
 b/net/dccp/ccids/lib/packet_history.h |   36 
 b/net/dccp/ccids/lib/tfrc.c           |   32 ++-
 b/net/dccp/ccids/lib/tfrc.h           |    4 
 b/net/dccp/dccp.h                     |    8 
 net/dccp/ccids/ccid3.c                |   72 +++-
 net/dccp/ccids/ccid3.h                |   10 -
 net/dccp/ccids/lib/loss_interval.c    |  284 +-
 net/dccp/ccids/lib/loss_interval.h    |   11 -
 net/dccp/ccids/lib/packet_history.c   |  279 +
 net/dccp/ccids/lib/packet_history.h   |   47 -
 net/dccp/ccids/lib/tfrc.c             |   10 -
 16 files changed, 643 insertions(+), 442 deletions(-)
[PATCH 6/7] [CCID3]: Interface CCID3 code with newer Loss Intervals Database
From: Gerrit Renker [EMAIL PROTECTED]

This hooks up the TFRC Loss Interval database with CCID 3 packet reception.
In addition, it makes the CCID-specific computation of the first loss
interval (which requires access to all the guts of CCID3) local to ccid3.c.

The patch also fixes an omission in the DCCP code, that of a default /
fallback RTT value (defined in section 3.4 of RFC 4340 as 0.2 sec); while at
it, the upper bound of 4 seconds for an RTT sample has been reduced to match
the initial TCP RTO value of 3 seconds from [RFC 1122, 4.2.3.1].

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c             |   72 ++--
 net/dccp/ccids/ccid3.h             |   10 ++--
 net/dccp/ccids/lib/loss_interval.c |   18 
 net/dccp/ccids/lib/tfrc.c          |   10 ++--
 net/dccp/dccp.h                    |    7 ++-
 5 files changed, 84 insertions(+), 33 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index b92069b..a818a1e 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -1,6 +1,7 @@
 /*
  *  net/dccp/ccids/ccid3.c
  *
+ *  Copyright (c) 2007   The University of Aberdeen, Scotland, UK
  *  Copyright (c) 2005-7 The University of Waikato, Hamilton, New Zealand.
  *  Copyright (c) 2005-7 Ian McDonald [EMAIL PROTECTED]
  *
@@ -33,11 +34,7 @@
  *  along with this program; if not, write to the Free Software
  *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
-#include "../ccid.h"
 #include "../dccp.h"
-#include "lib/packet_history.h"
-#include "lib/loss_interval.h"
-#include "lib/tfrc.h"
 #include "ccid3.h"
 #include <asm/unaligned.h>
@@ -757,6 +754,46 @@ static int ccid3_hc_rx_insert_options(struct sock *sk, struct sk_buff *skb)
 	return 0;
 }
 
+/** ccid3_first_li  -  Implements [RFC 3448, 6.3.1]
+ *
+ * Determine the length of the first loss interval via inverse lookup.
+ * Assume that X_recv can be computed by the throughput equation
+ *		            s
+ *		X_recv = --------
+ *		         R * fval
+ * Find some p such that f(p) = fval; return 1/p (scaled).
+ */
+static u32 ccid3_first_li(struct sock *sk)
+{
+	struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk);
+	u32 x_recv, p, delta;
+	u64 fval;
+
+	if (hcrx->ccid3hcrx_rtt == 0) {
+		DCCP_WARN("No RTT estimate available, using fallback RTT\n");
+		hcrx->ccid3hcrx_rtt = DCCP_FALLBACK_RTT;
+	}
+
+	delta = ktime_to_us(net_timedelta(hcrx->ccid3hcrx_tstamp_last_feedback));
+	x_recv = scaled_div32(hcrx->ccid3hcrx_bytes_recv, delta);
+	if (x_recv == 0) {		/* would also trigger divide-by-zero */
+		DCCP_WARN("X_recv==0\n");
+		if ((x_recv = hcrx->ccid3hcrx_x_recv) == 0) {
+			DCCP_BUG("stored value of X_recv is zero");
+			return ~0U;
+		}
+	}
+
+	fval = scaled_div(hcrx->ccid3hcrx_s, hcrx->ccid3hcrx_rtt);
+	fval = scaled_div32(fval, x_recv);
+	p = tfrc_calc_x_reverse_lookup(fval);
+
+	ccid3_pr_debug("%s(%p), receive rate=%u bytes/s, implied "
+		       "loss rate=%u\n", dccp_role(sk), sk, x_recv, p);
+
+	return p == 0 ? ~0U : scaled_div(1, p);
+}
+
 static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 {
 	struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk);
@@ -794,6 +831,14 @@ static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 	/*
 	 * Handle pending losses and otherwise check for new loss
 	 */
+	if (tfrc_rx_hist_loss_pending(&hcrx->ccid3hcrx_hist) &&
+	    tfrc_rx_handle_loss(&hcrx->ccid3hcrx_hist,
+				&hcrx->ccid3hcrx_li_hist,
+				skb, ndp, ccid3_first_li, sk)) {
+		do_feedback = CCID3_FBACK_PARAM_CHANGE;
+		goto done_receiving;
+	}
+
 	if (tfrc_rx_hist_new_loss_indicated(&hcrx->ccid3hcrx_hist, skb, ndp))
 		goto update_records;
 
@@ -803,7 +848,7 @@ static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 	if (unlikely(!is_data_packet))
 		goto update_records;
 
-	if (list_empty(&hcrx->ccid3hcrx_li_hist)) { /* no loss so far: p = 0 */
+	if (!tfrc_lh_is_initialised(&hcrx->ccid3hcrx_li_hist)) {
 		const u32 sample = tfrc_rx_hist_sample_rtt(&hcrx->ccid3hcrx_hist, skb);
 		/*
 		 * Empty loss history: no loss so far, hence p stays 0.
@@ -812,6 +857,13 @@ static void ccid3_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)
 		 */
 		if (sample != 0)
 			hcrx->ccid3hcrx_rtt = tfrc_ewma(hcrx->ccid3hcrx_rtt, sample, 9);
+
+	} else if (tfrc_lh_update_i_mean(&hcrx->ccid3hcrx_li_hist, skb)) {
+		/*
+		 * Step (3) of
[PATCH 3/7] [TFRC]: Ringbuffer to track loss interval history
From: Gerrit Renker [EMAIL PROTECTED]

A ringbuffer-based implementation of loss interval history is easier to
maintain, allocate, and update.

The `swap' routine to keep the RX history sorted is due to and was written by
Arnaldo Carvalho de Melo, simplifying an earlier macro-based variant.

Details:
 * access to the Loss Interval Records via macro wrappers (with safety checks);
 * simplified, on-demand allocation of entries (no extra memory consumption
   on lossless links); cache allocation is local to the module / exported as
   service;
 * provision of RFC-compliant algorithm to re-compute average loss interval;
 * provision of comprehensive, new loss detection algorithm
   - support for all cases of loss, including re-ordered/duplicate packets;
   - waiting for NDUPACK=3 packets to fill the hole;
   - updating loss records when a late-arriving packet fills a hole.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/lib/loss_interval.c  |  161 +-
 net/dccp/ccids/lib/loss_interval.h  |   56 +-
 net/dccp/ccids/lib/packet_history.c |  218 ++-
 net/dccp/ccids/lib/packet_history.h |   11 +-
 net/dccp/ccids/lib/tfrc.h           |    3 +
 5 files changed, 435 insertions(+), 14 deletions(-)

diff --git a/net/dccp/ccids/lib/loss_interval.c b/net/dccp/ccids/lib/loss_interval.c
index c0a933a..39980d1 100644
--- a/net/dccp/ccids/lib/loss_interval.c
+++ b/net/dccp/ccids/lib/loss_interval.c
@@ -1,6 +1,7 @@
 /*
  *  net/dccp/ccids/lib/loss_interval.c
  *
+ *  Copyright (c) 2007   The University of Aberdeen, Scotland, UK
  *  Copyright (c) 2005-7 The University of Waikato, Hamilton, New Zealand.
  *  Copyright (c) 2005-7 Ian McDonald [EMAIL PROTECTED]
  *  Copyright (c) 2005 Arnaldo Carvalho de Melo [EMAIL PROTECTED]
@@ -10,12 +11,7 @@
  *  the Free Software Foundation; either version 2 of the License, or
  *  (at your option) any later version.
  */
-
-#include <linux/module.h>
 #include <net/sock.h>
-#include "../../dccp.h"
-#include "loss_interval.h"
-#include "packet_history.h"
 #include "tfrc.h"
 
 #define DCCP_LI_HIST_IVAL_F_LENGTH  8
@@ -27,6 +23,54 @@ struct dccp_li_hist_entry {
 	u32		 dccplih_interval;
 };
 
+static struct kmem_cache  *tfrc_lh_slab  __read_mostly;
+/* Loss Interval weights from [RFC 3448, 5.4], scaled by 10 */
+static const int tfrc_lh_weights[NINTERVAL] = { 10, 10, 10, 10, 8, 6, 4, 2 };
+
+/* implements LIFO semantics on the array */
+static inline u8 LIH_INDEX(const u8 ctr)
+{
+	return (LIH_SIZE - 1 - (ctr % LIH_SIZE));
+}
+
+/* the `counter' index always points at the next entry to be populated */
+static inline struct tfrc_loss_interval *tfrc_lh_peek(struct tfrc_loss_hist *lh)
+{
+	return lh->counter ? lh->ring[LIH_INDEX(lh->counter - 1)] : NULL;
+}
+
+/* given i with 0 <= i <= k, return I_i as per the rfc3448bis notation */
+static inline u32 tfrc_lh_get_interval(struct tfrc_loss_hist *lh, const u8 i)
+{
+	BUG_ON(i >= lh->counter);
+	return lh->ring[LIH_INDEX(lh->counter - i - 1)]->li_length;
+}
+
+/*
+ * On-demand allocation and de-allocation of entries
+ */
+static struct tfrc_loss_interval *tfrc_lh_demand_next(struct tfrc_loss_hist *lh)
+{
+	if (lh->ring[LIH_INDEX(lh->counter)] == NULL)
+		lh->ring[LIH_INDEX(lh->counter)] = kmem_cache_alloc(tfrc_lh_slab,
+								    GFP_ATOMIC);
+	return lh->ring[LIH_INDEX(lh->counter)];
+}
+
+void tfrc_lh_cleanup(struct tfrc_loss_hist *lh)
+{
+	if (!tfrc_lh_is_initialised(lh))
+		return;
+
+	for (lh->counter = 0; lh->counter < LIH_SIZE; lh->counter++)
+		if (lh->ring[LIH_INDEX(lh->counter)] != NULL) {
+			kmem_cache_free(tfrc_lh_slab,
+					lh->ring[LIH_INDEX(lh->counter)]);
+			lh->ring[LIH_INDEX(lh->counter)] = NULL;
+		}
+}
+EXPORT_SYMBOL_GPL(tfrc_lh_cleanup);
+
 static struct kmem_cache *dccp_li_cachep __read_mostly;
 
 static inline struct dccp_li_hist_entry *dccp_li_hist_entry_new(const gfp_t prio)
@@ -98,6 +142,65 @@ u32 dccp_li_hist_calc_i_mean(struct list_head *list)
 EXPORT_SYMBOL_GPL(dccp_li_hist_calc_i_mean);
 
+static void tfrc_lh_calc_i_mean(struct tfrc_loss_hist *lh)
+{
+	u32 i_i, i_tot0 = 0, i_tot1 = 0, w_tot = 0;
+	int i, k = tfrc_lh_length(lh) - 1; /* k is as in rfc3448bis, 5.4 */
+
+	for (i = 0; i <= k; i++) {
+		i_i = tfrc_lh_get_interval(lh, i);
+
+		if (i < k) {
+			i_tot0 += i_i * tfrc_lh_weights[i];
+			w_tot  += tfrc_lh_weights[i];
+		}
+		if (i > 0)
+			i_tot1 += i_i * tfrc_lh_weights[i-1];
+	}
+
+	BUG_ON(w_tot == 0);
+	lh->i_mean = max(i_tot0, i_tot1) / w_tot;
+}
+
+/**
+ *
Re: [PATCH 8/8] [PATCH v2] [CCID3]: Interface CCID3 code with newer Loss Intervals Database
| > +static struct kmem_cache  *tfrc_lh_slab  __read_mostly;	/* <=== */
|
| Yup, this one, is introduced as above but is not initialized at the
| module init routine, please see, it should be OK and we can move on:
|
| http://git.kernel.org/?p=linux/kernel/git/acme/net-2.6.25.git;a=commitdiff;h=a925429ce2189b548dc19037d3ebd4ff35ae4af7
|
Sorry for the confusion - you were right, the initialisation was sitting in
the wrong patch, not in the one named in the subject line. In your online
version the problem is fixed.

Thanks a lot for all the work and for the clarification.

Gerrit
Re: [PATCH 8/8] [PATCH v2] [CCID3]: Interface CCID3 code with newer Loss Intervals Database
Em Wed, Dec 12, 2007 at 04:56:32PM +0000, Gerrit Renker escreveu:
| | This time around I'm not doing any reordering, just trying to use your
| | patches as is, but adding this patch as-is produces a kernel that will
| | crash, no?
| |
| | The loss history and the RX/TX packet history slabs are all created in
| | tfrc.c using the three different __init routines of the dccp_tfrc_lib.
| |
| | Yes, the init routines are called and in turn they create the slab
| | caches, but up to the patch "[PATCH 8/8] [PATCH v2] [CCID3]: Interface
| | CCID3 code with newer Loss Intervals Database" the new li slab is not
| | being created, no? See what I'm talking about?
| |
| Sorry, there is some weird kind of mix-up going on. Can you please check
| your patch set: it seems this email exchange refers to an older variant.
| In the most recent patch set, the slab is introduced in the patch
| "[TFRC]: Ringbuffer to track loss interval history":
|
| --- a/net/dccp/ccids/lib/loss_interval.c
| +++ b/net/dccp/ccids/lib/loss_interval.c
| @@ -27,6 +23,54 @@ struct dccp_li_hist_entry {
|  	u32		 dccplih_interval;
|  };
|
| +static struct kmem_cache  *tfrc_lh_slab  __read_mostly;	/* <=== */

Yup, this one, is introduced as above but is not initialized at the
module init routine, please see, it should be OK and we can move on:

http://git.kernel.org/?p=linux/kernel/git/acme/net-2.6.25.git;a=commitdiff;h=a925429ce2189b548dc19037d3ebd4ff35ae4af7

| +/* Loss Interval weights from [RFC 3448, 5.4], scaled by 10 */
| +static const int tfrc_lh_weights[NINTERVAL] = { 10, 10, 10, 10, 8, 6, 4, 2 };

// ... And this is 6/8, i.e. before 8/8, cf.
http://www.mail-archive.com/dccp@vger.kernel.org/msg03000.html

| I don't know which tree you are working off; would it be possible to check
| against the test tree git://eden-feed.erg.abdn.ac.uk/dccp_exp  [dccp]

I'm doing a fresh clone now. But I think that everything is OK after
today's merge request I sent to David.

- Arnaldo
Re: [Announce]: Test results with latest CCID3 patch set
| | The scenario that I mostly use is limiting the bandwidth with a middlebox running TBF. However, all of the recent
| | trees except 2.6.20final_dccp (2.6.20 patched with Ian's modifications) that I have tested fail to achieve
| | acceptable transfer rates.
|
| Thank you for the report, but the material as-is does not help us very much. What is missing is the link speed of
| the ethernet cards that lead to the middlebox; and have you monitored the loss rate p? Using a TBF in this setting
| means slowing down the link speed by a factor of 10 .. 20, and unless you are using large FIFOs in addition to the
| TBF, the loss rate will very soon reach high values.

Link speed is 100 Mbps, and I tested DCCP under various bottleneck bandwidths: 1000, 2000, 5000 and 10000 Kbps. I
will repeat the tests with dccp_probe enabled and show the results on a website as soon as I have time.

| Therefore, can you please clarify what you mean by "acceptable transfer rates": maybe the scenario is not supposed
| to warrant any high transfer rates at all. Which would mean expecting something where not much can be expected - as
| said, without more detailed knowledge about how p reacts, these figures don't tell us very much.

By "acceptable transfer rates" I was referring to the rates achieved by 2.6.20final_dccp. But you are right, we
cannot be sure of the goodness of the results by comparing two DCCP stacks, so I am giving TCP-Reno streaming results
below under the same conditions, which should be a solid benchmark:

Bottleneck=1000Kbps:                 0.0-60.6 sec   6.91 MBytes   958 Kbits/sec
Bottleneck=2000Kbps:                 0.0-60.1 sec   13.7 MBytes   1.91 Mbits/sec
Bottleneck=5000Kbps:                 0.0-60.1 sec   33.6 MBytes   4.69 Mbits/sec
Bottleneck=10000Kbps, limit=5bytes:  0.0-60.1 sec   65.4 MBytes   9.13 Mbits/sec

The average transfer rates in the rightmost column show that the configured bottleneck rates are achievable; hence I
think the transfer rates that 2.6.20final_dccp reaches are acceptable, whereas the rates of the other trees are not.

| Didn't you have a web page with further information?

I will have soon, hopefully.

| Yes, please: the cleanest comparison would be to take a 2.6.24 tree, and compare the patch sets on the same basis.
| I almost expected that the results are not as good as with 2.6.20final_dccp -- it is an almost sure indication that
| the difference is not due to the patches, and that there are other factors at work. By carefully looking at these
| differences, we will be able to see more clearly what is happening in the above.

I agree with you. 2 or 3 weeks ago I applied Ian's patches to more recent trees (2.6.22 and 2.6.24) and the results
were not as good as on 2.6.20.

| Again thanks a lot for posting results, hope you will be back with further information soon,
| Gerrit

I am planning to repeat the tests focusing on 2.6.24 and post the results - with dccp_probe figures - on my website.

Burak
Re: [Announce]: Test results with latest CCID3 patch set
Gerrit Renker ha scritto: | I am new of this mailing list and I am really interested in the | measurements you are performing with DCCP. This was more of a regression test, as there had been recent changes in the test tree, to see that the kernel (not userspace) still performs in a predictable way. | Which tool are you using ? Are you using Iperf for such measurements ? The setup is the one from http://www.linux-foundation.org/en/Net:DCCP_Testing#Regression_testing and, yes, it uses iperf. | Have you ever heard about D-ITG ? | You can find more information here: | http://www.grid.unina.it/software/ITG | | I am one of the authors of such platform and I have also | performed some very preliminary tests with DCCP. | | I would be very glad to have your opinion on that and I'm very interested | in improving its features, also with specific regard to the support of | transport protocols. | It is a very nice tool with many features. I only ran simple tests with it (version 2.6), again only as basic sanity tests -- the throughput result was similar to the one tested with iperf. I think that the tool has more to offer and can help improve/extend DCCP testing. Here is my list of points, hoping that the others will add theirs, too: * would be good to have a standardised set of scripts, for comparison/benchmarking I can provide you the necessary support to verify the scripts and also to start to set up them. However, I need to know what you would like to generate. I have seen on the web page reported above the kind of tests you are performing with iperf, and I would like to provide you some some pieces of useful information regarding D-ITG. This tool allows to set two random variables that control the characteristics of traffic you generate. One of these variables models the Inter Departure Time (IDT) and the other one models the Size (PS) of the Packets. As of now, we support 6 random variables that are exponential, gamma, normal, pareto, cauchy, and poisson. 
Constant IDT and PS can of course also be set. From the web page mentioned above I saw that you generate CBR traffic with iperf, but I could not find the rate. Moreover, in order to standardise the tests I need to know how you measure the obtained bit-rate. D-ITG can log different information at both the sender and the receiver side, for each packet sent and received respectively, so after the tests you can extract various statistics by analysing the log files (with ITGDec). Regarding the bit-rate, you can obtain an average value over the complete generation period as well as over smaller time intervals.

|  * the built-in VoIP module only works for UDP -- is it possible to
|    port this to DCCP?

The VoIP traffic is produced by setting an appropriate IDT and PS, which are basically constant and are printed on standard output when a VoIP traffic flow is about to be generated. Therefore, you can already perform VoIP tests with DCCP. For more information you can have a look at the D-ITG manual: http://www.grid.unina.it/software/ITG/codice/D-ITG2.6-manual.pdf If you cannot find all the necessary information, just ask me.

|  * as per previous email, more complex traffic scenarios would be good,
|    in particular
|    - switching on/off background traffic at times to observe
|      TCP/flow-friendliness

D-ITG supports a multi-flow operation mode. In this case it reads its input from a script file instead of standard input. It uses one thread per requested flow, so the flows can have completely different characteristics in terms of kind of traffic (IDT and PS models), duration, start time, and transport protocol.

|    - running multiple DCCP flows in parallel and at overlapping times

This is already possible thanks to the script mode.

| Gerrit

Alessio
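For reference, a multi-flow script file for the mode described above might look like the fragment below, with one flow per line. The option names follow my reading of the D-ITG 2.6 manual (`-a` destination, `-T` transport protocol, `-C` constant packet rate in pkt/s, `-c` constant payload size in bytes, `-t` duration in ms, `-d` start delay in ms), so treat this as an illustrative sketch and check the manual before use:

```
# one flow per line; each flow runs in its own thread
# DCCP flow for the full 60 s test
-a 10.0.0.2 -T DCCP -C 1000 -c 512  -t 60000
# TCP background traffic switched on after 15 s, off after 45 s
-a 10.0.0.2 -T TCP  -C 500  -c 1024 -t 30000 -d 15000
```

Staggering the `-d` start delays is what makes the "switching background traffic on/off" and "overlapping DCCP flows" scenarios from Gerrit's list expressible in a single run.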
Re: [Announce]: Test results with latest CCID3 patch set
Arnaldo Carvalho de Melo wrote:
| On Tue, Dec 11, 2007 at 02:21:28PM +, Gerrit Renker wrote:
| | | I am new to this mailing list and I am really interested in the
| | | measurements you are performing with DCCP.
| | This was more of a regression test, as there had been recent changes
| | in the test tree, to check that the kernel (not userspace) still
| | performs in a predictable way.
| |
| | | Which tool are you using? Are you using Iperf for such
| | | measurements?
| | The setup is the one from
| | http://www.linux-foundation.org/en/Net:DCCP_Testing#Regression_testing
| | and, yes, it uses iperf.
| |
| | | Have you ever heard about D-ITG?
| | | You can find more information here:
| | | http://www.grid.unina.it/software/ITG
| | |
| | | I am one of the authors of this platform and I have also performed
| | | some very preliminary tests with DCCP.
| | |
| | | I would be very glad to have your opinion on it, and I am very
| | | interested in improving its features, in particular with regard to
| | | the support of transport protocols.
| | It is a very nice tool with many features. I only ran simple tests
| | with it (version 2.6), again only as basic sanity tests -- the
| | throughput result was similar to the one measured with iperf. I
| | think that the tool has more to offer and can help improve/extend
| | DCCP testing.
| |
| | Here is my list of points, hoping that the others will add theirs,
| | too:
| |  * would be good to have a standardised set of scripts, for
| |    comparison/benchmarking
| |  * the built-in VoIP module only works for UDP -- is it possible to
| |    port this to DCCP?
| |  * as per previous email, more complex traffic scenarios would be
| |    good, in particular
| |    - switching on/off background traffic at times to observe
| |      TCP/flow-friendliness
| |    - running multiple DCCP flows in parallel and at overlapping times
|
| Does this tool record results in a database keyed by kernel
| version/buildid, for us to use it as a regression tool?

No, it does not record the results in a database. That feature would have to be added with some Perl-like scripts.
| Something that would produce results along these lines:
|
|   WARNING: test #23 counter #3 variance bigger than specified since
|   the last kernel tested (git cset 55ed793afb4a8025d33a8e6a5f2f89d5ac4d8432)!

Yes, we can work on building some scripts for this.

| - Arnaldo

Alessio
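A minimal sketch of what such a script could look like, assuming results are simply kept keyed by kernel version and a per-test variance check emits the warning. All names here (`RegressionDB`, `record`, `check_variance`) are invented for illustration, not part of any existing tool:

```python
import statistics

# Hypothetical regression store along the lines Arnaldo suggests:
# samples keyed by kernel version and test id, with a variance check
# against the previously tested kernel.
class RegressionDB:
    def __init__(self):
        self.runs = {}  # kernel version -> {test_id: [samples]}

    def record(self, kernel, test_id, samples):
        self.runs.setdefault(kernel, {})[test_id] = list(samples)

    def check_variance(self, prev, curr, test_id, max_ratio=2.0):
        """Return a warning when the sample variance grew by more than
        max_ratio since the previously tested kernel, else None."""
        v_prev = statistics.pvariance(self.runs[prev][test_id])
        v_curr = statistics.pvariance(self.runs[curr][test_id])
        if v_prev and v_curr / v_prev > max_ratio:
            return (f"WARNING: test {test_id} variance bigger than "
                    f"specified since kernel {prev}")
        return None

db = RegressionDB()
# Illustrative throughput samples (Kbits/sec) for one bottleneck setting
db.record("2.6.20final_dccp", "iperf-1000k", [955, 950, 948])
db.record("2.6.24-rc4", "iperf-1000k", [5, 181, 27])
msg = db.check_variance("2.6.20final_dccp", "2.6.24-rc4", "iperf-1000k")
print(msg)
```

Keying the store by a git changeset id instead of a version string, and persisting it in e.g. SQLite, would be straightforward extensions of the same idea.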