Re: SFQ qdisc crashes with limit of 2 packets
From: Alexey Kuznetsov [EMAIL PROTECTED] Date: Fri, 21 Sep 2007 19:55:22 +0400 Hello! Remove artificial limitation for sfq queue limit. This is followup to Patrick's patch. A little optimization to enqueue routine allows to remove artificial limitation on queue length. Plus, testing showed that hash function used by SFQ is too bad or even worse. It does not even sweep the whole range of hash values. Switched to Jenkins' hash. Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED] Applied, thanks Alexey. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SFQ qdisc crashes with limit of 2 packets
Hello! Remove artificial limitation for sfq queue limit. This is followup to Patrick's patch. A little optimization to enqueue routine allows to remove artificial limitation on queue length. Plus, testing showed that hash function used by SFQ is too bad or even worse. It does not even sweep the whole range of hash values. Switched to Jenkins' hash. Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED] diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 3a23e30..b542c87 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -19,6 +19,7 @@ #include linux/init.h #include linux/ipv6.h #include linux/skbuff.h +#include linux/jhash.h #include net/ip.h #include net/netlink.h #include net/pkt_sched.h @@ -95,7 +96,7 @@ struct sfq_sched_data /* Variables */ struct timer_list perturb_timer; - int perturbation; + u32 perturbation; sfq_index tail; /* Index of current slot in round */ sfq_index max_depth; /* Maximal depth */ @@ -109,12 +110,7 @@ struct sfq_sched_data static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 h1) { - int pert = q-perturbation; - - /* Have we any rotation primitives? If not, WHY? */ - h ^= (h1pert) ^ (h1(0x1F - pert)); - h ^= h10; - return h 0x3FF; + return jhash_2words(h, h1, q-perturbation) (SFQ_HASH_DIVISOR - 1); } static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb) @@ -256,6 +252,13 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch) q-ht[hash] = x = q-dep[SFQ_DEPTH].next; q-hash[x] = hash; } + /* If selected queue has length q-limit, this means that +* all another queues are empty and that we do simple tail drop, +* i.e. drop _this_ packet. +*/ + if (q-qs[x].qlen = q-limit) + return qdisc_drop(skb, sch); + sch-qstats.backlog += skb-len; __skb_queue_tail(q-qs[x], skb); sfq_inc(q, x); @@ -294,6 +297,19 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch) } sch-qstats.backlog += skb-len; __skb_queue_head(q-qs[x], skb); + /* If selected queue has length q-limit+1, this means that +* all another queues are empty and we do simple tail drop. +* This packet is still requeued at head of queue, tail packet +* is dropped. +*/ + if (q-qs[x].qlen q-limit) { + skb = q-qs[x].prev; + __skb_unlink(skb, q-qs[x]); + sch-qstats.drops++; + sch-qstats.backlog -= skb-len; + kfree_skb(skb); + return NET_XMIT_CN; + } sfq_inc(q, x); if (q-qs[x].qlen == 1) { /* The flow is new */ if (q-tail == SFQ_DEPTH) { /* It is the first flow */ @@ -370,12 +386,10 @@ static void sfq_perturbation(unsigned long arg) struct Qdisc *sch = (struct Qdisc*)arg; struct sfq_sched_data *q = qdisc_priv(sch); - q-perturbation = net_random()0x1F; + get_random_bytes(q-perturbation, 4); - if (q-perturb_period) { - q-perturb_timer.expires = jiffies + q-perturb_period; - add_timer(q-perturb_timer); - } + if (q-perturb_period) + mod_timer(q-perturb_timer, jiffies + q-perturb_period); } static int sfq_change(struct Qdisc *sch, struct rtattr *opt) @@ -391,7 +405,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt) q-quantum = ctl-quantum ? : psched_mtu(sch-dev); q-perturb_period = ctl-perturb_period*HZ; if (ctl-limit) - q-limit = min_t(u32, ctl-limit, SFQ_DEPTH - 2); + q-limit = min_t(u32, ctl-limit, SFQ_DEPTH - 1); qlen = sch-q.qlen; while (sch-q.qlen q-limit) @@ -400,8 +414,8 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt) del_timer(q-perturb_timer); if (q-perturb_period) { - q-perturb_timer.expires = jiffies + q-perturb_period; - add_timer(q-perturb_timer); + mod_timer(q-perturb_timer, jiffies + q-perturb_period); + get_random_bytes(q-perturbation, 4); } sch_tree_unlock(sch); return 0; @@ -423,12 +437,13 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt) q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH; q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH; } - q-limit = SFQ_DEPTH - 2; + q-limit = SFQ_DEPTH - 1; q-max_depth = 0; q-tail = SFQ_DEPTH; if (opt == NULL) { q-quantum = psched_mtu(sch-dev); q-perturb_period = 0; + get_random_bytes(q-perturbation, 4); } else { int err = sfq_change(sch, opt); if (err) - To unsubscribe from this list: send the line unsubscribe netdev in the body of
Re: SFQ qdisc crashes with limit of 2 packets
Hello! OK the off-by-one prevents an out-of-bounds array access, Yes, this is not off-by-one (off-by-two, to be more exact :-)). Maximal queue length is really limited by SFQ_DEPTH-2, because: 1. SFQ keeps list of queue lengths in array of length SFQ_DEPTH. This means length of queue must be in range 0..SFQ_DEPTH-1. 2. SFQ enqueues incoming packet first, and drops something after this. This means it always needs a free slot in queue. So, SFQ_DEPTH-2. CCed Alexey just to be safe, but I think the patch should be fine. Yes. But what'a about limit == 1? tc prohibits this case, but it is still not nice to have the bug in kernel. Plus, code remains creepy, the check + if (++sch-q.qlen q-limit) { still looks as a scary off-by-one. I would go all the way: replace this with natural: if (++sch-q.qlen = q-limit) { and maxed q-limit at SFQ_DEPTH-2. Patch enclosed. Seems, it is easy to relax the limit to SFQ_DEPTH-1, item #2 is an artificial limitation. If current queue already has SFQ_DEPTH-1 packets, we can drop from tail of this queue and skip update of all the state. It is even an optimization for the case when we have single flow. I am not 100% sure about this, I'll try and report. diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 9579573..cbf8089 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -270,7 +270,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch) q-tail = x; } } - if (++sch-q.qlen q-limit-1) { + if (++sch-q.qlen = q-limit) { sch-bstats.bytes += skb-len; sch-bstats.packets++; return 0; @@ -306,7 +306,7 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch) q-tail = x; } } - if (++sch-q.qlen q-limit - 1) { + if (++sch-q.qlen = q-limit) { sch-qstats.requeues++; return 0; } @@ -391,10 +391,10 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt) q-quantum = ctl-quantum ? : psched_mtu(sch-dev); q-perturb_period = ctl-perturb_period*HZ; if (ctl-limit) - q-limit = min_t(u32, ctl-limit, SFQ_DEPTH); + q-limit = min_t(u32, ctl-limit, SFQ_DEPTH - 2); qlen = sch-q.qlen; - while (sch-q.qlen = q-limit-1) + while (sch-q.qlen q-limit) sfq_drop(sch); qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen); @@ -423,7 +423,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt) q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH; q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH; } - q-limit = SFQ_DEPTH; + q-limit = SFQ_DEPTH - 2; q-max_depth = 0; q-tail = SFQ_DEPTH; if (opt == NULL) { - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SFQ qdisc crashes with limit of 2 packets
Alexey Kuznetsov wrote: Hello! CCed Alexey just to be safe, but I think the patch should be fine. Yes. But what'a about limit == 1? tc prohibits this case, but it is still not nice to have the bug in kernel. Plus, code remains creepy, the check + if (++sch-q.qlen q-limit) { still looks as a scary off-by-one. I would go all the way: replace this with natural: if (++sch-q.qlen = q-limit) { and maxed q-limit at SFQ_DEPTH-2. Patch enclosed. Thats even better, thanks :) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SFQ qdisc crashes with limit of 2 packets
From: Patrick McHardy [EMAIL PROTECTED] Date: Wed, 19 Sep 2007 15:08:22 +0200 Alexey Kuznetsov wrote: Hello! CCed Alexey just to be safe, but I think the patch should be fine. Yes. But what'a about limit == 1? tc prohibits this case, but it is still not nice to have the bug in kernel. Plus, code remains creepy, the check + if (++sch-q.qlen q-limit) { still looks as a scary off-by-one. I would go all the way: replace this with natural: if (++sch-q.qlen = q-limit) { and maxed q-limit at SFQ_DEPTH-2. Patch enclosed. Thats even better, thanks :) I'll put this into my tree and wait while Alexey does his tests. Thanks. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
SFQ qdisc crashes with limit of 2 packets
Limit of 1 is forbidden, crashes with 2, works with 3: https://bugzilla.redhat.com/show_bug.cgi?id=219895 = If the defect is produced at a console (as in ctrl-alt-f0-6) a kernel stack trace can be seen the moment ping is invoked. Since the stack trace is not written to the /var/log/messages here's part of it (manually copied): syscall_call(() sys_socketcall() sys_sendmsg() sock_sendmsg() inet_sendmsg() raw_sendmsg() ip_push_pending_frames() ip_output() neigh_resolve_output() dev_queue_xmit() __qdisc_run() The location given in __qdisc_run() is 0x30/0x19b. The value given for EIP is sfq_dequeue+0xf6/0x179 in the sch_sfq module. From disassembling sch_sfq.ko it seems that it is on line 360 of sch_sfq.c: sch-qstats.backlog -= skb-len; where skb is an invalid pointer: net/sched/sch_sfq.c:360 194: ff 4d 28decl 0x28(%ebp) 197: 8b 14 24mov(%esp),%edx 19a: 8b 42 60mov0x60(%edx),%eax ** crash ** 19d: 29 45 58sub%eax,0x58(%ebp) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SFQ qdisc crashes with limit of 2 packets
Chuck Ebbert wrote: Limit of 1 is forbidden, crashes with 2, works with 3: From disassembling sch_sfq.ko it seems that it is on line 360 of sch_sfq.c: sch-qstats.backlog -= skb-len; where skb is an invalid pointer: Is it a NULL pointer or something random? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SFQ qdisc crashes with limit of 2 packets
Patrick McHardy wrote: Chuck Ebbert wrote: Limit of 1 is forbidden, crashes with 2, works with 3: From disassembling sch_sfq.ko it seems that it is on line 360 of sch_sfq.c: sch-qstats.backlog -= skb-len; where skb is an invalid pointer: Is it a NULL pointer or something random? Never mind, I found the reason. When enqueuing the packet, sfq_enqueue contains an off-by-one in the limit check (which IIRC is there for a reason, but I can't remember right now) and drops the packet again. dev_queue_xmit() calls qdisc_run() anyway and the empty qdisc is dequeued, which is not handled by SFQ. I see three possibilities to fix this (in my preferred order): 1) figure out why the off-by-one is there, if not needed remove 2) don't dequeue qdiscs even once if empty 3) check for NULL in sfq_dequeue So I'll try to remeber why the off-by-one is there .. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: SFQ qdisc crashes with limit of 2 packets
Patrick McHardy wrote: Never mind, I found the reason. When enqueuing the packet, sfq_enqueue contains an off-by-one in the limit check (which IIRC is there for a reason, but I can't remember right now) and drops the packet again. dev_queue_xmit() calls qdisc_run() anyway and the empty qdisc is dequeued, which is not handled by SFQ. I see three possibilities to fix this (in my preferred order): 1) figure out why the off-by-one is there, if not needed remove 2) don't dequeue qdiscs even once if empty 3) check for NULL in sfq_dequeue So I'll try to remeber why the off-by-one is there .. OK the off-by-one prevents an out-of-bounds array access, which would cause a crash itself. Despite what I said above, sfq does try to handle dequeues while empty, but forgets to update q-tail when dropping the last packet from the only active queue, probably because it wasn't expected that the queue length is too small to queue even a single packet (and that really doesn't make much sense). So one possibility for fixing this is to update q-tail in sfq_drop when dropping the last packet, but that would still leave the qdisc non-functional because of the off-by-one. I chose a different way: cap the limit at SFQ_DEPTH-1 and remove the off-by-one, which should have no effect on the max (still 127), but prevents the crash since we can now queue at least a single packet and q-tail is properly updated in sfq_dequeue(). CCed Alexey just to be safe, but I think the patch should be fine. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 9579573..cbf8089 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -270,7 +270,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch) q-tail = x; } } - if (++sch-q.qlen q-limit-1) { + if (++sch-q.qlen q-limit) { sch-bstats.bytes += skb-len; sch-bstats.packets++; return 0; @@ -306,7 +306,7 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch) q-tail = x; } } - if (++sch-q.qlen q-limit - 1) { + if (++sch-q.qlen q-limit) { sch-qstats.requeues++; return 0; } @@ -391,10 +391,10 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt) q-quantum = ctl-quantum ? : psched_mtu(sch-dev); q-perturb_period = ctl-perturb_period*HZ; if (ctl-limit) - q-limit = min_t(u32, ctl-limit, SFQ_DEPTH); + q-limit = min_t(u32, ctl-limit, SFQ_DEPTH - 1); qlen = sch-q.qlen; - while (sch-q.qlen = q-limit-1) + while (sch-q.qlen = q-limit) sfq_drop(sch); qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen); @@ -423,7 +423,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt) q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH; q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH; } - q-limit = SFQ_DEPTH; + q-limit = SFQ_DEPTH - 1; q-max_depth = 0; q-tail = SFQ_DEPTH; if (opt == NULL) {
Re: SFQ qdisc crashes with limit of 2 packets
From: Patrick McHardy [EMAIL PROTECTED] Date: Tue, 18 Sep 2007 21:15:28 +0200 OK the off-by-one prevents an out-of-bounds array access, which would cause a crash itself. Despite what I said above, sfq does try to handle dequeues while empty, but forgets to update q-tail when dropping the last packet from the only active queue, probably because it wasn't expected that the queue length is too small to queue even a single packet (and that really doesn't make much sense). So one possibility for fixing this is to update q-tail in sfq_drop when dropping the last packet, but that would still leave the qdisc non-functional because of the off-by-one. I chose a different way: cap the limit at SFQ_DEPTH-1 and remove the off-by-one, which should have no effect on the max (still 127), but prevents the crash since we can now queue at least a single packet and q-tail is properly updated in sfq_dequeue(). CCed Alexey just to be safe, but I think the patch should be fine. Signed-off-by: Patrick McHardy [EMAIL PROTECTED] I've applied this to net-2.6, thanks Patrick. I'll hold off merging this to Linus until later today so that if some issue is found we can address it. Thanks. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html