Re: tc-ipt v0.2: Extension does not know id 1504083504

2017-10-01 Thread Corey Hickey

On 2017-10-01 07:25, Sergey K. wrote:

Hi Corey.

I did it on your recommendation, replaced xtables.h file from my
version of iptables 1.6.0, and replaced the file netfilter.h.

Now it's works, but new construction doesn't:

# tc filter add dev eth0 parent : u32 match u32 0 0 action xt -j
SET --map-set WORLD_QoS dst
xt: unrecognized option '--map-set'
failed to find target (null)


I looked into this for a bit, and it seems to be due to the SET 
extension not supporting the x6-style options (I don't know the proper 
name for it).


From the iptables source:

$ grep -c x6 extensions/libxt_MARK.c
9

$ grep -c x6 extensions/libxt_SET.c
0

I got lost in gdb for a while and didn't come up with an easy answer. 
The options in are in the struct, just not where tc is looking.


(gdb) print *m->extra_opts
$3 = {name = 0x76f1dad8 "add-set", has_arg = 1, flag = 0x0, val = 49}

(gdb) print *(m->extra_opts+4)
$7 = {name = 0x76f1db03 "map-set", has_arg = 1, flag = 0x0, val = 53}


I don't know if the problem is in iptables or in tc. Somebody familiar 
with the code and its history could probably answer it easily, but I'm 
not at all familiar with it. Sorry. Hopefully somebody else will jump in 
and help you out.


-Corey


Re: tc-ipt v0.2: Extension does not know id 1504083504

2017-09-30 Thread Corey Hickey

On 2017-09-29 23:34, Sergey K. wrote:

Hello.

I have to apply this patch

https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/commit/?h=v4.10.0=97a02cabefb2e2dcfe27f89943709afa84be5525

to my version of iproute2.
But now I have a new information message, when I'm using construction like this:

tc filter add dev eth0 parent : u32 match u32 0 0 action xt -j
MARK --set-mark 0

message text:
tc-ipt v0.2: Extension does not know id 1504083504


I'm using Debian Stretch, kernel 4.9.0-3-amd64, iptables 1.6.0 and
patched iproute 4.9.0

How to solve?


Funny, I just ran into this too and subscribed here to report it. The 
error occurs during parsing of any options to the jump target; if the 
target has no options, there is no error.


The problem seems to be an outdated version of struct xtables_target in 
include/xtables.h. The version in iptables has an additional member 
"udata" that makes the offsets in the struct different for anything 
following.


A quick fix for this particular problem is to copy include/xtables.h from:
git://git.netfilter.org/iptables
...into include/ in the iproute2 source, then recompile after a 'make 
clean'.


As for a comprehensive fix, I don't know--presumably other headers in 
include/ may be out of date, but I don't want to just blindly send a 
patch unless someone who knows the ramifications says it's ok. This 
seems like it would need maintainer oversight. If there's something I 
can do, though, let me know.


-Corey


Re: [NET_SCHED 00/04]: External SFQ classifiers/flow classifier

2008-02-04 Thread Corey Hickey
Patrick McHardy wrote:
 You're missing protocol, handle etc. Try something like this:
 
 tc filter add dev eth0 protocol ip pref 1 parent 1: handle 1 \ 
 
  flow hash keys dst divisor 1024

Thanks, the kernel accepts that. I guess I understand tc filter usage
less than I thought I did Time to teach myself better.

-Corey
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel panic on 2.6.24 with esfq patch applied

2008-02-02 Thread Corey Hickey
Denys Fedoryshchenko wrote:
 Hi
 
 Probably bug related to ESFQ, now i will unload module and will test more. 
 But probably not related, so if not difficult, please take a look.
 
 Feb  1 09:08:50 SERVER [12380.067104] BUG: unable to handle kernel NULL 
 pointer dereference
 Feb  1 09:08:50 SERVER at virtual address 0008
 Feb  1 09:08:50 SERVER [12380.067140] printing eip: c01f10ed
 Feb  1 09:08:50 SERVER *pde = 
 Feb  1 09:08:50 SERVER
 Feb  1 09:08:50 SERVER [12380.067162] Oops:  [#1]
 Feb  1 09:08:50 SERVER SMP
 Feb  1 09:08:50 SERVER
 Feb  1 09:08:50 SERVER [12380.067181] Modules linked in:
 Feb  1 09:08:50 SERVER netconsole
 Feb  1 09:08:50 SERVER configfs
 Feb  1 09:08:50 SERVER iTCO_wdt
 Feb  1 09:08:50 SERVER nf_nat_pptp
 Feb  1 09:08:50 SERVER nf_conntrack_pptp
 Feb  1 09:08:50 SERVER nf_conntrack_proto_gre
 Feb  1 09:08:50 SERVER nf_nat_proto_gre
 Feb  1 09:08:50 SERVER sch_esfq

I'd rather you were using my recent patches to SFQ instead of ESFQ. I
was able to crash a 2.6.24 user-mode Linux with ESFQ as well; I don't
know if that's what you encountered, but the SFQ patches should be
better anyway.

http://fatooh.org/esfq-2.6/sfq-2.6.24.tar.bz2
http://fatooh.org/esfq-2.6/

-Corey
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel panic on 2.6.24 with esfq patch applied

2008-02-02 Thread Corey Hickey
Jarek Poplawski wrote:
 Hi Corey!
 
 I've just had a look at your site, and see you're a bit disappointed about
 the reception of your patches here. I don't use nor know about your work
 (maybe some time...), and wish you better public here, but maybe you'll
 find this story interesting: a few years ago, when I didn't even think of
 reading netdev, I quite often visited pages of such projects as HTB (not
 in every distro yet), IMQ or Julian Anastasov's. They were quite popular
 subjects on admins lists, and I really admired people around these
 projects, while the main kernel was something big and anonymous (no idea
 e.g. about D. Miller). For the same reason I knew more about people from
 iptables project - only because there was a need to visit this for some
 non standard modules, and about A. Kuznetsov - only because of patching
 iproute for HTB!!! (Only later I found that Alexey did more than needed
 for several such HTBs.) So, it seems there could be good side of such
 unofficial status too - if you only have something really useful for
 others...

Thank you for the words of encouragement, but don't worry: I'm not going
to run away and hide just yet. :)  I was a bit discouraged about not
getting reviews for the more recent patch submissions, though, and I
hope I don't come across as bitter about that. Patch review is a lot of
work, and I definitely appreciate the comments I received earlier--I was
able to make many improvements as a result.

In any case, kernel development progresses, and I now have to update my
patches so they can apply to current git. The ball has rolled back into
my court of its own accord; perhaps the playing field isn't level. ;)

-Corey
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [NET_SCHED 00/04]: External SFQ classifiers/flow classifier

2008-02-02 Thread Corey Hickey
Patrick McHardy wrote:
 These patches add support for external classifiers to SFQ and add a
 new flow classifier, which can do hashing based on user-specified
 keys or deterministic mapping of keys to classes. Additionally there
 is a patch to make the SFQ queues visisble as classes to verify that
 the hash is indeed doing something useful and a patch to consifiy
 struct tcf_ext_map, which I had queued in the same tree.

Excellent! I'm glad this is applied. I'm having trouble figuring out how
it works, though. As a test, I'm trying to set up SFQ equivalent to
ESFQ's hash dst. Here's what I do, and this is what I get:


# ./tc qdisc add dev eth0 root handle 1: sfq
# ./tc filter add dev eth0 parent 1: flow hash keys dst
RTNETLINK answers: Invalid argument
We have an error talking to the kernel


I've tried a few different keys with the same results. I don't know what
I'm doing wrong, or even where to start figuring it out. Can you point
me in the right direction?

Here are some details that may be pertinent:

- current net-2.6 git (I double-checked)
- CONFIG_NET_CLS_FLOW=y
- current iproute2 git
- running on amd64 user-mode Linux


Thanks,
Corey
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SFQ: backport some features from ESFQ (try 5)

2007-11-15 Thread Corey Hickey
Corey Hickey wrote:
 Patchset try 2 addresses the review by Michael Buesch.
 Patchset try 3 addresses the review by Patrick McHardy.
 Patchset try 4 has a few cosmetic improvements.
 Patchset try 5 addresses further review by Patrick McHardy.
 
 This set of patches is substantially the same as my previous try, with
 changes made according to Patrick's recommendations.

Any comments? Patrick? I know you're busy, but you've been sending
reviews for my patches so far and yet another one would be much appreciated.

Thanks,
Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/8] Preparatory refactoring part 2.

2007-10-29 Thread Corey Hickey
Factor code out of sfq_init() so that the new function can be used
by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

Setting default parameters is moved into a separate function for
clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   88 +++---
 1 files changed, 47 insertions(+), 41 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 10e2f3d..8ea816a 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -413,43 +413,41 @@ static void sfq_perturbation(unsigned long arg)
mod_timer(q-perturb_timer, jiffies + q-perturb_period);
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static void
+sfq_default_parameters(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
-
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
-   return -EINVAL;
-
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH - 1);
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen  q-limit)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
-
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   mod_timer(q-perturb_timer, jiffies + q-perturb_period);
-   get_random_bytes(q-perturbation, 4);
-   }
-   sch_tree_unlock(sch);
-   return 0;
+   q-quantum= psched_mtu(sch-dev);
+   q-perturbation   = 0;
+   q-perturb_period = 0;
+   q-limit  = SFQ_DEPTH - 1;
 }
 
-static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
+static int
+sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
int i;
 
-   init_timer(q-perturb_timer);
-   q-perturb_timer.data = (unsigned long)sch;
-   q-perturb_timer.function = sfq_perturbation;
+   /* At this point, parameters are set to either defaults (sfq_init) or
+* the previous values (sfq_change). So, overwrite the parameters as
+* specified. */
+   if (opt) {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+
+   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   return -EINVAL;
+
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   if (ctl-perturb_period)
+   q-perturb_period = ctl-perturb_period * HZ;
+   if (ctl-limit)
+   q-limit = ctl-limit;
+   }
+   q-limit = min_t(u32, q-limit, SFQ_DEPTH - 1);
+   q-tail = SFQ_DEPTH;
+   q-max_depth = 0;
 
for (i=0; iSFQ_HASH_DIVISOR; i++)
q-ht[i] = SFQ_DEPTH;
@@ -458,23 +456,31 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   q-limit = SFQ_DEPTH - 1;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   get_random_bytes(q-perturbation, 4);
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
-   }
for (i=0; iSFQ_DEPTH; i++)
sfq_link(q, i);
return 0;
 }
 
+static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   int err;
+
+   sfq_default_parameters(sch);
+   if ((err = sfq_q_init(q, opt)))
+   return err;
+
+   init_timer(q-perturb_timer);
+   q-perturb_timer.data = (unsigned long)sch;
+   q-perturb_timer.function = sfq_perturbation;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   }
+
+   return 0;
+}
+
 static void sfq_destroy(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


SFQ: backport some features from ESFQ (try 5)

2007-10-29 Thread Corey Hickey

Patchset try 2 addresses the review by Michael Buesch.
Patchset try 3 addresses the review by Patrick McHardy.
Patchset try 4 has a few cosmetic improvements.
Patchset try 5 addresses further review by Patrick McHardy.

This set of patches is substantially the same as my previous try, with
changes made according to Patrick's recommendations.

Iproute2 patches will follow shortly.



The following is the original patch text.

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

Thanks for your consideration,
Corey


 include/linux/pkt_sched.h |   23 ++-
 net/sched/sch_sfq.c   |  434 +++--
 2 files changed, 319 insertions(+), 138 deletions(-)


[PATCH 1/8] Preparatory refactoring part 1.
[PATCH 2/8] Preparatory refactoring part 2.
[PATCH 3/8] Make depth (number of queues) user-configurable
[PATCH 4/8] Add divisor.
[PATCH 5/8] Make qdisc changeable.
[PATCH 6/8] Remove comments about hardcoded values.
[PATCH 7/8] Rework perturb_period.
[PATCH 8/8] Use nested compat attributes to pass parameters.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/8] Remove comments about hardcoded values.

2007-10-29 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   13 +
 2 files changed, 1 insertions(+), 20 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 919af93..d754a3d 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index ca8716f..7b11086 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -62,18 +62,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT10
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/8] Add divisor.

2007-10-29 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   27 +--
 1 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 1c1bf08..c74d5ce 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -76,7 +76,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT10
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -86,6 +86,10 @@
 typedef unsigned int sfq_index;
 #define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
+/* In practice, the actual divisor size is limited by kcalloc, but we still
+ * don't want to left shift by more than 31. */
+#define SFQ_MAX_DIVISOR 31
+
 struct sfq_head
 {
sfq_index   next;
@@ -99,6 +103,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -106,7 +111,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -116,7 +121,9 @@ struct sfq_sched_data
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
 {
-   return jhash_2words(h, h1, q-perturbation)  (SFQ_HASH_DIVISOR - 1);
+   unsigned mask = (1q-hash_divisor) - 1;
+
+   return jhash_2words(h, h1, q-perturbation)  mask;
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -418,6 +425,7 @@ static void sfq_perturbation(unsigned long arg)
 
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
+   kfree(q-ht);
kfree(q-dep);
kfree(q-next);
kfree(q-allot);
@@ -441,6 +449,7 @@ sfq_default_parameters(struct Qdisc *sch)
q-quantum= psched_mtu(sch-dev);
q-perturbation   = 0;
q-perturb_period = 0;
+   q-hash_divisor   = SFQ_DIVISOR_DEFAULT;
q-depth  = SFQ_DEPTH_DEFAULT;
q-limit  = SFQ_DEPTH_DEFAULT - 1;
 }
@@ -463,18 +472,24 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
q-quantum = ctl-quantum;
if (ctl-perturb_period)
q-perturb_period = ctl-perturb_period * HZ;
+   if (ctl-divisor)
+   q-hash_divisor = ctl-divisor;
if (ctl-flows)
q-depth = ctl-flows;
if (ctl-limit)
q-limit = ctl-limit;
 
-   if (q-depth  SFQ_MAX_DEPTH)
+   if (q-depth SFQ_MAX_DEPTH ||
+   q-hash_divisor  SFQ_MAX_DIVISOR)
return -EINVAL;
}
q-limit = min_t(u32, q-limit, q-depth - 1);
q-tail = q-depth;
q-max_depth = 0;
 
+   q-ht = kcalloc(1q-hash_divisor, sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kcalloc(1 + q-depth*2, sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -491,7 +506,7 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; i  1q-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; i  q-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -537,7 +552,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/8] Make qdisc changeable.

2007-10-29 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   67 ++-
 1 files changed, 66 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index c74d5ce..ca8716f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -454,6 +454,17 @@ sfq_default_parameters(struct Qdisc *sch)
q-limit  = SFQ_DEPTH_DEFAULT - 1;
 }
 
+static void
+sfq_copy_parameters(struct sfq_sched_data *dst, struct sfq_sched_data *src)
+{
+   dst-quantum= src-quantum;
+   dst-perturbation   = src-perturbation;
+   dst-perturb_period = src-perturb_period;
+   dst-hash_divisor   = src-hash_divisor;
+   dst-limit  = src-limit;
+   dst-depth  = src-depth;
+}
+
 static int
 sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
@@ -542,6 +553,60 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   unsigned int qlen;
+   int err;
+
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   sfq_copy_parameters(tmp, q);
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* handle perturbation */
+   /* This code avoids resetting the perturb_timer unless perturb_period
+* is changed. Note that the rest of this function leaves
+* q-perturb_timer alone, whereas all other members of q get
+* overwritten from tmp. */
+   if (!tmp.perturb_period) {
+   tmp.perturbation = 0;
+   del_timer(q-perturb_timer);
+   } else if (tmp.perturb_period != q-perturb_period) {
+   mod_timer(q-perturb_timer, jiffies + tmp.perturb_period);
+   }
+
+   /* move packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   qlen = sch-q.qlen;
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   sfq_copy_parameters(q, tmp);
+   q-tail  = tmp.tail;
+   q-max_depth = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -576,7 +641,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/8] Use nested compat attributes to pass parameters.

2007-10-29 Thread Corey Hickey
This fixes the ambiguity between, for example:
tc qdisc change ... perturb 0
tc qdisc change ...

Without this patch, there is no way for SFQ to differentiate between
a parameter specified to be 0 and a parameter that was omitted.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |   13 
 net/sched/sch_sfq.c   |   69 ++--
 2 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 14a08ad..b1a1a52 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,6 +148,19 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
+enum
+{
+   TCA_SFQ_UNSPEC,
+   TCA_SFQ_COMPAT,
+   TCA_SFQ_QUANTUM,
+   TCA_SFQ_PERTURB,
+   TCA_SFQ_LIMIT,
+   TCA_SFQ_DIVISOR,
+   TCA_SFQ_FLOWS,
+   __TCA_SFQ_MAX,
+};
+
+#define TCA_SFQ_MAX (__TCA_SFQ_MAX - 1)
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 2764a54..5c25b05 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -456,6 +456,29 @@ sfq_copy_parameters(struct sfq_sched_data *dst, struct 
sfq_sched_data *src)
dst-depth  = src-depth;
 }
 
+/* SFQ parameters exist as individual rtattr attributes, with a nested
+ * struct tc_sfq_qopt for compatibility with older userspace tools. If an
+ * individual attribute is set, we want to use it; otherwise, fall back to the
+ * nested struct.
+ * There is one caveat: if a member of the nested struct is 0, we cannot
+ * determine if that parameter is supposed to be 0 or if it is merely unset.
+ * So, only set a parameter if the corresponding struct member (u32 compat) is
+ * nonzero. When setting a parameter to 0, it is necessary to use the
+ * individual attribute. */
+static inline int
+sfq_get_parameter(u32 *dst, struct rtattr *tb[TCA_SFQ_MAX], int attr,
+ u32 compat)
+{
+   struct rtattr *rta = tb[(attr - 1)];
+   if (rta)
+   *dst = RTA_GET_U32(rta);
+   else if (compat)
+   *dst = compat;
+
+   rtattr_failure:
+   return -EINVAL;
+}
+   
 static int
 sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
@@ -465,21 +488,24 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 * the previous values (sfq_change). So, overwrite the parameters as
 * specified. */
if (opt) {
-   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
-   return -EINVAL;
-
-   if (ctl-quantum)
-   q-quantum = ctl-quantum;
-   if (ctl-perturb_period)
-   q-perturb_period = ctl-perturb_period;
-   if (ctl-divisor)
-   q-hash_divisor = ctl-divisor;
-   if (ctl-flows)
-   q-depth = ctl-flows;
-   if (ctl-limit)
-   q-limit = ctl-limit;
+   struct tc_sfq_qopt *ctl;
+   struct rtattr *tb[TCA_SFQ_MAX];
+
+   if (rtattr_parse_nested_compat(tb, TCA_SFQ_MAX, opt, ctl,
+   sizeof(*ctl)))
+   goto rtattr_failure;
+
+   if (sfq_get_parameter((q-quantum),tb, TCA_SFQ_QUANTUM,
+   ctl-quantum)||
+   sfq_get_parameter((q-perturb_period), tb, TCA_SFQ_PERTURB,
+   ctl-perturb_period) ||
+   sfq_get_parameter((q-hash_divisor),   tb, TCA_SFQ_DIVISOR,
+   ctl-divisor)||
+   sfq_get_parameter((q-depth),  tb, TCA_SFQ_FLOWS,
+   ctl-flows)  ||
+   sfq_get_parameter((q-limit),  tb, TCA_SFQ_LIMIT,
+   ctl-limit))
+   goto rtattr_failure;
 
if (q-depth SFQ_MAX_DEPTH ||
q-hash_divisor  SFQ_MAX_DIVISOR)
@@ -519,6 +545,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
for (i=0; i  q-depth; i++)
sfq_link(q, i);
return 0;
+rtattr_failure:
+   return -EINVAL;
 err_case:
sfq_q_destroy(q);
return -ENOBUFS;
@@ -602,17 +630,26 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff 
*skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
unsigned char *b = skb_tail_pointer(skb);
+   struct rtattr *nest;
struct tc_sfq_qopt opt;
 
opt.quantum = q-quantum;
opt.perturb_period = q-perturb_period;
-
opt.limit = q-limit;
opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
+   nest = RTA_NEST_COMPAT(skb, TCA_OPTIONS, sizeof(opt), opt);
+
+   RTA_PUT_U32(skb, TCA_SFQ_QUANTUM, q-quantum);
+   RTA_PUT_U32(skb

[PATCH 7/8] Rework perturb_period.

2007-10-29 Thread Corey Hickey
perturb_period is the only parameter that doesn't match 1:1 with the
value from userspace. Multiplying perturb_period by HZ when used rather
than when assigned makes it easy and clean to use a small function for
setting parameters (in a subsequent patch).

perturb_period is currently a signed integer, but I can't see any good
reason why this is so--a negative perturbation period will add a timer
that expires in the past, causing constant perturbation, which makes
hashing useless.

Strictly speaking, this will break binary compatibility with older
versions of tc, but that ought not to be a problem because (a) there's
no valid use for a negative perturb_period, and (b) negative values
will be seen as high values ( INT_MAX), which don't work anyway.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |2 +-
 net/sched/sch_sfq.c   |   14 --
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d754a3d..14a08ad 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -142,7 +142,7 @@ enum
 struct tc_sfq_qopt
 {
unsignedquantum;/* Bytes per round allocated to flow */
-   int perturb_period; /* Period of hash perturbation */
+   unsignedperturb_period; /* Period of hash perturbation */
__u32   limit;  /* Maximal packets in queue */
unsigneddivisor;/* Hash divisor  */
unsignedflows;  /* Maximal number of flows  */
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 7b11086..2764a54 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -70,6 +70,8 @@
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
+#define SFQ_PERTURB(period) (jiffies + (unsigned long)period * HZ)
+
 /* This type must contain greater than depth*2 values, so depth is constrained 
  * accordingly. */
 typedef unsigned int sfq_index;
@@ -88,7 +90,7 @@ struct sfq_head
 struct sfq_sched_data
 {
 /* Parameters */
-   int perturb_period;
+   unsignedperturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
@@ -409,7 +411,7 @@ static void sfq_perturbation(unsigned long arg)
get_random_bytes(q-perturbation, 4);
 
if (q-perturb_period)
-   mod_timer(q-perturb_timer, jiffies + q-perturb_period);
+   mod_timer(q-perturb_timer, SFQ_PERTURB(q-perturb_period));
 }
 
 static void sfq_q_destroy(struct sfq_sched_data *q)
@@ -471,7 +473,7 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (ctl-quantum)
q-quantum = ctl-quantum;
if (ctl-perturb_period)
-   q-perturb_period = ctl-perturb_period * HZ;
+   q-perturb_period = ctl-perturb_period;
if (ctl-divisor)
q-hash_divisor = ctl-divisor;
if (ctl-flows)
@@ -535,7 +537,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = SFQ_PERTURB(q-perturb_period);
add_timer(q-perturb_timer);
}
 
@@ -565,7 +567,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
tmp.perturbation = 0;
del_timer(q-perturb_timer);
} else if (tmp.perturb_period != q-perturb_period) {
-   mod_timer(q-perturb_timer, jiffies + tmp.perturb_period);
+   mod_timer(q-perturb_timer, SFQ_PERTURB(tmp.perturb_period));
}
 
/* move packets from the old queue to the tmp queue */
@@ -603,7 +605,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
struct tc_sfq_qopt opt;
 
opt.quantum = q-quantum;
-   opt.perturb_period = q-perturb_period/HZ;
+   opt.perturb_period = q-perturb_period;
 
opt.limit = q-limit;
opt.divisor = q-hash_divisor;
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/8] Make depth (number of queues) user-configurable

2007-10-29 Thread Corey Hickey
* replace #define with a parameter
* use old hardcoded value as a default
* kcalloc() arrays in sfq_q_init()
* free() arrays in new function sfq_q_destroy()
* move sfq_destroy() to near sfq_q_destroy(), for clarity

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |  104 +++
 1 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 8ea816a..1c1bf08 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,14 +75,16 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type must contain greater than depth*2 values, so depth is constrained 
+ * accordingly. */
+typedef unsigned int sfq_index;
+#define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
 struct sfq_head
 {
@@ -96,6 +98,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -104,11 +107,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -160,7 +163,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -211,7 +214,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d + q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -234,7 +237,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -250,8 +253,8 @@ sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data 
*q, int end)
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -287,7 +290,7 @@ sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data 
*q, int end)
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -351,7 +354,7 @@ sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -362,10 +365,10 @@ sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail] = a;
@@ -413,6 +416,23 @@ static

[PATCH 1/8] Preparatory refactoring part 1.

2007-10-29 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |  119 ++-
 1 files changed, 70 insertions(+), 49 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index b542c87..10e2f3d 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -78,6 +78,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -241,9 +244,8 @@ static unsigned int sfq_drop(struct Qdisc *sch)
 }
 
 static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -252,15 +254,37 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   /* If selected queue has length q-limit, this means that
-* all another queues are empty and that we do simple tail drop,
-* i.e. drop _this_ packet.
-*/
-   if (q-qs[x].qlen = q-limit)
-   return qdisc_drop(skb, sch);
 
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+   if (end == SFQ_TAIL) {
+   /* If selected queue has length q-limit, this means that
+* all other queues are empty and that we do simple tail drop,
+* i.e. drop _this_ packet.
+*/
+   if (q-qs[x].qlen = q-limit) {
+   unsigned int drop_len = skb-len;
+
+   kfree_skb(skb);
+   return drop_len;
+   }
+   __skb_queue_tail(q-qs[x], skb);
+   } else { /* end == SFQ_HEAD */
+   __skb_queue_head(q-qs[x], skb);
+   /* If selected queue has length q-limit+1, this means that
+* all other queues are empty and we do simple tail drop.
+* This packet is still requeued at head of queue, tail packet
+* is dropped.
+*/
+   if (q-qs[x].qlen  q-limit) {
+   unsigned int drop_len;
+
+   skb = q-qs[x].prev;
+   drop_len = skb-len;
+   __skb_unlink(skb, q-qs[x]);
+   kfree_skb(skb);
+   return drop_len;
+   }
+   }
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -273,6 +297,21 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+
+   return 0;
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+
+   if (sfq_q_enqueue(skb, q, SFQ_TAIL)) {
+   sch-qstats.drops++;
+   return NET_XMIT_DROP;
+   }
+
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen = q-limit) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
@@ -287,58 +326,27 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
+   unsigned int drop_len;
 
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   /* If selected queue has length q-limit+1, this means that
-* all another queues are empty and we do simple tail drop.
-* This packet is still requeued at head of queue, tail packet
-* is dropped.
-*/
-   if (q-qs[x].qlen  q-limit) {
-   skb = q-qs[x].prev;
-   __skb_unlink(skb, q-qs[x]);
+   if ((drop_len = sfq_q_enqueue(skb, q, SFQ_HEAD))) {
+   sch-qstats.backlog -= drop_len;
sch-qstats.drops++;
-   sch-qstats.backlog -= skb-len;
-   kfree_skb(skb);
return NET_XMIT_CN;
}
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x

[iproute2] SFQ: backport some features from ESFQ (try 5)

2007-10-29 Thread Corey Hickey

These patches follow the ESFQ--SFQ kernel patches. See the kernel
patch summary for general information.

Thanks,
Corey


 include/linux/pkt_sched.h |   23 ++-
 tc/q_sfq.c|   42 +-
 2 files changed, 51 insertions(+), 14 deletions(-)


[PATCH 1/3] SFQ: Support changing depth and divisor.
[PATCH 2/3] Change perturb_period to unsigned.
[PATCH 3/3] Use nested compat attributes for passing parameters to the kernel.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] Change perturb_period to unsigned.

2007-10-29 Thread Corey Hickey
This corresponds to the kernel patch doing the same.

Here, too, this will technically break binary compatibility with older
kernels, but that shouldn't be a problem because negative perturb_period
values aren't usable anyway.
---
 include/linux/pkt_sched.h |2 +-
 tc/q_sfq.c|4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 9d41f63..fb04a89 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -142,7 +142,7 @@ enum
 struct tc_sfq_qopt
 {
unsignedquantum;/* Bytes per round allocated to flow */
-   int perturb_period; /* Period of hash perturbation */
+   unsignedperturb_period; /* Period of hash perturbation */
__u32   limit;  /* Maximal packets in queue */
unsigneddivisor;/* Hash divisor  */
unsignedflows;  /* Maximal number of flows  */
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 19e76ba..83c8a54 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -47,7 +47,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
ok++;
} else if (strcmp(*argv, perturb) == 0) {
NEXT_ARG();
-   if (get_integer(opt.perturb_period, *argv, 0)) {
+   if (get_u32(opt.perturb_period, *argv, 0)) {
fprintf(stderr, Illegal \perturb\\n);
return -1;
}
@@ -114,7 +114,7 @@ static int sfq_print_opt(struct qdisc_util *qu, FILE *f, 
struct rtattr *opt)
fprintf(f, flows %u/%u , qopt-flows, qopt-divisor);
}
if (qopt-perturb_period)
-   fprintf(f, perturb %dsec , qopt-perturb_period);
+   fprintf(f, perturb %usec , qopt-perturb_period);
return 0;
 }
 
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] SFQ: Support changing depth and divisor.

2007-10-29 Thread Corey Hickey
This can safely be applied either before or after the kernel
patches because the tc_sfq_qopt struct is unchanged:

- old kernels will ignore the new parameters from new iproute2
- new kernels will use the same defaults for the new parameters
---
 include/linux/pkt_sched.h |9 -
 tc/q_sfq.c|   20 +++-
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..9d41f63 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,15 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
-
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 05385cf..19e76ba 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -25,7 +25,7 @@
 
 static void explain(void)
 {
-   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ perturb SECS ] [ 
quantum BYTES ]\n);
+   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ depth FLOWS ] [ 
divisor HASHBITS ] [ perturb SECS ] [ quantum BYTES ]\n);
 }
 
 #define usage() return(-1)
@@ -63,6 +63,24 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
ok++;
+   } else if (strcmp(*argv, depth) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.flows, *argv, 0)) {
+   fprintf(stderr, Illegal \depth\\n);
+   return -1;
+   }
+   ok++;
+   } else if (strcmp(*argv, divisor) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.divisor, *argv, 0)) {
+   fprintf(stderr, Illegal \divisor\\n);
+   return -1;
+   }
+   if (opt.divisor = 15) {
+   fprintf(stderr, Illegal \divisor\, must be  
15\n);
+   return -1;
+   }
+   ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
return -1;
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] Use nested compat attributes for passing parameters to the kernel.

2007-10-29 Thread Corey Hickey
Note that I have left sfq_print_opt() alone. At this point, there
can be no difference between the data in the nested rtattrs and the
data in the compat rtattr, and I didn't want to add clutter that
isn't useful. Let me know if I should do differently.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |   14 ++
 tc/q_sfq.c|   18 --
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index fb04a89..aad04eb 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,6 +148,20 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
+enum
+{
+   TCA_SFQ_UNSPEC,
+   TCA_SFQ_COMPAT,
+   TCA_SFQ_QUANTUM,
+   TCA_SFQ_PERTURB,
+   TCA_SFQ_LIMIT,
+   TCA_SFQ_DIVISOR,
+   TCA_SFQ_FLOWS,
+   __TCA_SFQ_MAX,
+};
+
+#define TCA_SFQ_MAX (__TCA_SFQ_MAX - 1)
+
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 83c8a54..69f17c8 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -34,9 +34,13 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
 {
int ok=0;
struct tc_sfq_qopt opt;
+   struct rtattr *nest;
 
memset(opt, 0, sizeof(opt));
 
+   /* put blank data in rtattr so there is a hole to fill later */
+   nest = addattr_nest_compat(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+
while (argc  0) {
if (strcmp(*argv, quantum) == 0) {
NEXT_ARG();
@@ -44,6 +48,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \limit\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_QUANTUM, opt.quantum);
ok++;
} else if (strcmp(*argv, perturb) == 0) {
NEXT_ARG();
@@ -51,6 +56,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \perturb\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_PERTURB, opt.perturb_period);
ok++;
} else if (strcmp(*argv, limit) == 0) {
NEXT_ARG();
@@ -62,6 +68,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \limit\, must be  
1\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_LIMIT, opt.limit);
ok++;
} else if (strcmp(*argv, depth) == 0) {
NEXT_ARG();
@@ -69,6 +76,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \depth\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_FLOWS, opt.flows);
ok++;
} else if (strcmp(*argv, divisor) == 0) {
NEXT_ARG();
@@ -80,6 +88,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \divisor\, must be  
15\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_DIVISOR, opt.divisor);
ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
@@ -92,8 +101,13 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
argc--; argv++;
}
 
-   if (ok)
-   addattr_l(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+   if (ok) {
+   /* fill the hole we left earlier with real compat data */
+   memcpy(RTA_DATA(nest), opt, sizeof(opt));
+   addattr_nest_compat_end(n, nest);
+   }
+   else
+   nest-rta_len = 0;
return 0;
 }
 
-- 
1.5.3.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] Use nested compat attributes to pass parameters.

2007-10-29 Thread Corey Hickey
Corey Hickey wrote:
 +/* SFQ parameters exist as individual rtattr attributes, with a nested
 + * struct tc_sfq_qopt for compatibility with older userspace tools. If an
 + * individual attribute is set, we want to use it; otherwise, fall back to 
 the
 + * nested struct.
 + * There is one caveat: if a member of the nested struct is 0, we cannot
 + * determine if that parameter is supposed to be 0 or if it is merely unset.
 + * So, only set a parameter if the corresponding struct member (u32 compat) 
 is
 + * nonzero. When setting a parameter to 0, it is necessary to use the
 + * individual attribute. */
 +static inline int
 +sfq_get_parameter(u32 *dst, struct rtattr *tb[TCA_SFQ_MAX], int attr,
 +   u32 compat)
 +{
 + struct rtattr *rta = tb[(attr - 1)];
 + if (rta)
 + *dst = RTA_GET_U32(rta);
 + else if (compat)
 + *dst = compat;
 +
 + rtattr_failure:
 + return -EINVAL;
 +}
 + 
  static int
  sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
  {
 @@ -465,21 +488,24 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
* the previous values (sfq_change). So, overwrite the parameters as
* specified. */
   if (opt) {
 - struct tc_sfq_qopt *ctl = RTA_DATA(opt);
 -
 - if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
 - return -EINVAL;
 -
 - if (ctl-quantum)
 - q-quantum = ctl-quantum;
 - if (ctl-perturb_period)
 - q-perturb_period = ctl-perturb_period;
 - if (ctl-divisor)
 - q-hash_divisor = ctl-divisor;
 - if (ctl-flows)
 - q-depth = ctl-flows;
 - if (ctl-limit)
 - q-limit = ctl-limit;
 + struct tc_sfq_qopt *ctl;
 + struct rtattr *tb[TCA_SFQ_MAX];
 +
 + if (rtattr_parse_nested_compat(tb, TCA_SFQ_MAX, opt, ctl,
 + sizeof(*ctl)))
 + goto rtattr_failure;
 +
 + if (sfq_get_parameter((q-quantum),tb, TCA_SFQ_QUANTUM,
 + ctl-quantum)||
 + sfq_get_parameter((q-perturb_period), tb, TCA_SFQ_PERTURB,
 + ctl-perturb_period) ||
 + sfq_get_parameter((q-hash_divisor),   tb, TCA_SFQ_DIVISOR,
 + ctl-divisor)||
 + sfq_get_parameter((q-depth),  tb, TCA_SFQ_FLOWS,
 + ctl-flows)  ||
 + sfq_get_parameter((q-limit),  tb, TCA_SFQ_LIMIT,
 + ctl-limit))
 + goto rtattr_failure;
  

You may note that this part ended up being rather ugly, and I wouldn't
blame anyone for wanting it to be improved. I don't see any good
solutions; the alternatives I can provide are:

1. Use a macro, as I had originally written for this patch:
http://marc.info/?l=linux-netdevm=119102007907626w=2
(search for GET_PARAM)
The macro itself doesn't look pretty, but the usage is fairly clean.

2. Use neither macro nor function.
There would be five repetitions of very similar code, with five lines
per repetition: not very nice, but the formatting would still look better.

3. Only use a separate rtattr for perturb, since the other parameters
work fine already. From the start, I had wanted to keep parameter
parsing consistent, but it may not be worth it. This would definitely be
the cleanest approach for readability, and the most simple.


Sorry to bother you with such a superficial problem, but I really don't
know what would be preferable.

Thanks,
Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/10] Preparatory refactoring part 1.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   72 +++
 1 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 3a23e30..57485ef 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c




The sfq_q_enqueue part looks fine.

 
-	sch-qstats.drops++;



A line in the changelog explaining that this was increased twice
would have been nice.


Certainly; I think I didn't realize, when you originally pointed out the 
duplicate incrementing, that it was a bug in the original version and 
not in my patch. Otherwise, I would have sent it as a separate patch.


If a note in this patch will suffice, though, I'll definitely do so.


sfq_drop(sch);
return NET_XMIT_CN;
 }
 
-

-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct
+sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)



What is this function needed for?


It gets used in sfq_change for moving packets from the old queue into 
the new one. In this case, we don't want to modify sch-q.qlen or 
sch-qstats.backlog, since those don't actually change.


 while ((skb = sfq_q_dequeue(q)) != NULL)
 sfq_q_enqueue(skb, tmp, SFQ_TAIL);


I'll improve the description of this patch to make that more clear.

-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/10] Move two functions.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.



This patch looks pointless, just put them where you need them
in the patch introducing them.


As you wish. I thought having a separate patch would ease reviewing.

-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/10] Add divisor.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

Make hash divisor user-configurable.




@@ -120,7 +121,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);



This assumes that hash_divisor is a power of two, but this is
not enforced anywhere.


Ok. I'll move that part from userspace to the kernel. That should be 
better anyway.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/10] Make qdisc changeable.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.




+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   ...
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;



Seems counter-productive to explicitly set it to zero since
it was still used during tranfering the packets with the
old value. So I'd suggest to remove this or alternatively
set it to the final value *before* transfering the packets.


I suppose so; you're right. I'll adapt that part to fit before 
transferring the packets.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/10] Change perturb_period to unsigned.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

perturb_period is currently a signed integer, but I can't see any good
reason why this is so--a negative perturbation period will add a timer
that expires in the past, causing constant perturbation, which makes
hashing useless.

if (q-perturb_period) {
q-perturb_timer.expires = jiffies + q-perturb_period;
add_timer(q-perturb_timer);
}

Strictly speaking, this will break binary compatibility with older
versions of tc, but that ought not to be a problem because (a) there's
no valid use for a negative perturb_period, and (b) negative values
will be seen as high values ( INT_MAX), which don't work anyway.

If perturb_period is too large, (perturb_period * HZ) will overflow the
size of an unsigned int and wrap around. So, check for thet and reject
values that are too high.



Sounds reasonable.


--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,6 +74,9 @@
 typedef unsigned int sfq_index;
 #define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
+/* We don't want perturb_period * HZ to overflow an unsigned int. */

+#define SFQ_MAX_PERTURB (UINT_MAX / HZ)



jiffies are unsigned long.


Hmm. You're right. It looks like my previous patch obviated the need for 
this part. I'll remove it.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] Use nested compat attributes to pass parameters.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

+
+#define GET_PARAM(dst, nest, compat) do { \
+   struct rtattr *rta = tb[(nest) - 1]; \
+   if (rta) \
+   (dst) = RTA_GET_U32(rta); \
+   else if ((compat)) \
+   (dst) = (compat); \
+} while (0)



An inline function and a comment why this is done would increase
readability.


Well, I had a reason for making a macro, but it probably wasn't a good 
reason. Looking now, I don't see why not to make a function. I'll see 
what I can do.



+   nest = RTA_NEST_COMPAT(skb, TCA_OPTIONS, sizeof(opt), opt);
+
+   RTA_PUT_U32(skb, TCA_SFQ_QUANTUM, q-quantum);
+   RTA_PUT_U32(skb, TCA_SFQ_PERTURB, q-perturb_period);
+   RTA_PUT_U32(skb, TCA_SFQ_LIMIT,   q-limit);
+   RTA_PUT_U32(skb, TCA_SFQ_DIVISOR, q-hash_divisor);
+   RTA_PUT_U32(skb, TCA_SFQ_FLOWS,   q-depth);
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);



This is wrong, RTA_NEST_COMPAT already dumps the structure.


You mean that last line (RTA_PUT) is superfluous, right? I can't see a 
reason for it to be there, so I must have just forgotten to delete it 
from the original code.


If I'm wrong, I might need a little hand-holding here. My understanding 
of all the RTA stuff is a bit shaky.



Much thanks for the review. I'll make a new set of patches soon.

-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/10] Preparatory refactoring part 2.

2007-10-01 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}



It does look pointless, after applying all patches sfq_destroy still
remains a simply wrapper around sfq_q_destroy.


It does remain a wrapper, but both functions are used. It doesn't have 
to be this way, but I wanted to avoid duplicating code and I didn't see 
a better layout.


sfq_q_destroy is used in sfq_q_init if a kcalloc fails. sfq_q_init knows 
nothing about struct Qdisc *sch, so it can't call sfq_destroy.


sfq_destroy is still marked as the destroy function in sfq_qdisc_ops.

-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SFQ: backport some features from ESFQ (try 4)

2007-09-29 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

Patchset try 2 addresses the review by Michael Buesch.
Patchset try 3 addresses the review by Patrick McHardy.
Patchset try 4 has a few cosmetic improvements.

Nobody reviewed my last set of patches, and I wasn't pushy about asking.
Since it's been a while, I ported the kernel and userspace patchsets to
current git of net-2.6 and iproute2, respectively.


Thanks for posting these patches, I'll have a closer look tommorrow.


Many thanks for reviewing.

-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


SFQ: backport some features from ESFQ (try 4)

2007-09-28 Thread Corey Hickey

Patchset try 2 addresses the review by Michael Buesch.
Patchset try 3 addresses the review by Patrick McHardy.
Patchset try 4 has a few cosmetic improvements.

Nobody reviewed my last set of patches, and I wasn't pushy about asking.
Since it's been a while, I ported the kernel and userspace patchsets to
current git of net-2.6 and iproute2, respectively.

The first 7 patches in this series resemble the corresponding 7 patches
in patchset try 2. There aren't any major changes--just modifications
to address errors noticed in review and slight reorganizations to make
the next patches easier.

Patches 8-10 implement parameter passing via nested compat attributes.
This is necessary for using 'tc qdisc change' to disable perturbation.
The rest of the parameters were added for consistency.

Iproute2 patches will follow shortly.



The following is the original patch text.

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

Thanks for your consideration,
Corey



 include/linux/pkt_sched.h |   23 ++--
 net/sched/sch_sfq.c   |  353 ++---
 2 files changed, 254 insertions(+), 122 deletions(-)


[PATCH 01/10] Preparatory refactoring part 1.
[PATCH 02/10] Preparatory refactoring part 2.
[PATCH 03/10] Move two functions.
[PATCH 04/10] Make depth (number of queues) user-configurable:
[PATCH 05/10] Add divisor.
[PATCH 06/10] Make qdisc changeable.
[PATCH 07/10] Remove comments about hardcoded values.
[PATCH 08/10] Multiply perturb_period by HZ when used rather than when assigned.
[PATCH 09/10] Change perturb_period to unsigned.
[PATCH 10/10] Use nested compat attributes to pass parameters.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/10] Move two functions.

2007-09-28 Thread Corey Hickey
Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   22 +++---
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 1ba3d1a..ca22cb7 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -382,6 +382,17 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static void
 sfq_default_parameters(struct Qdisc *sch)
 {
@@ -451,17 +462,6 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
-static void sfq_q_destroy(struct sfq_sched_data *q)
-{
-   del_timer(q-perturb_timer);
-}
-
-static void sfq_destroy(struct Qdisc *sch)
-{
-   struct sfq_sched_data *q = qdisc_priv(sch);
-   sfq_q_destroy(q);
-}
-
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/10] Preparatory refactoring part 1.

2007-09-28 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   72 +++
 1 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 3a23e30..57485ef 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -77,6 +77,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -244,10 +247,9 @@ static unsigned int sfq_drop(struct Qdisc *sch)
return 0;
 }
 
-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void
+sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, unsigned int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +258,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -270,6 +276,15 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+
+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen = q-limit) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
@@ -284,45 +299,21 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
 
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
+   sfq_q_enqueue(skb, q, SFQ_HEAD);
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x;
-   q-allot[x] = q-quantum;
-   } else {
-   q-next[x] = q-next[q-tail];
-   q-next[q-tail] = x;
-   q-tail = x;
-   }
-   }
if (++sch-q.qlen = q-limit) {
sch-qstats.requeues++;
return 0;
}
 
-   sch-qstats.drops++;
sfq_drop(sch);
return NET_XMIT_CN;
 }
 
-
-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct
+sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct sk_buff *skb;
sfq_index a, old_a;
 
@@ -335,8 +326,6 @@ sfq_dequeue(struct Qdisc* sch)
/* Grab packet */
skb = __skb_dequeue(q-qs[a]);
sfq_dec(q, a);
-   sch-q.qlen--;
-   sch-qstats.backlog -= skb-len;
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
@@ -353,6 +342,21 @@ sfq_dequeue(struct Qdisc* sch)
a = q-next[a];
q-allot[a] += q-quantum;
}
+
+   return skb;
+}
+
+static struct sk_buff
+*sfq_dequeue(struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sk_buff *skb;
+
+   skb = sfq_q_dequeue(q);
+   if (skb == NULL)
+   return NULL;
+   sch-q.qlen--;
+   sch-qstats.backlog -= skb-len;
return skb;
 }
 
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/10] Preparatory refactoring part 2.

2007-09-28 Thread Corey Hickey
Factor code out of sfq_init() and sfq_destroy(), again so that the
new functions can be used by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

Setting default parameters is moved into a separate function for
clarity.

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   95 +--
 1 files changed, 54 insertions(+), 41 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 57485ef..1ba3d1a 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -382,43 +382,41 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static void
+sfq_default_parameters(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
-
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
-   return -EINVAL;
-
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH - 2);
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen  q-limit)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
-
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
-   add_timer(q-perturb_timer);
-   }
-   sch_tree_unlock(sch);
-   return 0;
+   q-quantum= psched_mtu(sch-dev);
+   q-perturbation   = 0;
+   q-perturb_period = 0;
+   q-limit  = SFQ_DEPTH - 2;
 }
 
-static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
+static int
+sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
int i;
 
-   init_timer(q-perturb_timer);
-   q-perturb_timer.data = (unsigned long)sch;
-   q-perturb_timer.function = sfq_perturbation;
+   /* At this point, parameters are set to either defaults (sfq_init) or
+* the previous values (sfq_change). So, overwrite the parameters as
+* specified. */
+   if (opt) {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+
+   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   return -EINVAL;
+
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   if (ctl-perturb_period)
+   q-perturb_period = ctl-perturb_period * HZ;
+   if (ctl-limit)
+   q-limit = ctl-limit;
+   }
+   q-limit = min_t(u32, q-limit, SFQ_DEPTH - 2);
+   q-tail = SFQ_DEPTH;
+   q-max_depth = 0;
 
for (i=0; iSFQ_HASH_DIVISOR; i++)
q-ht[i] = SFQ_DEPTH;
@@ -427,28 +425,43 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   q-limit = SFQ_DEPTH - 2;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
-   }
+
for (i=0; iSFQ_DEPTH; i++)
sfq_link(q, i);
return 0;
 }
 
-static void sfq_destroy(struct Qdisc *sch)
+static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
+   int err;
+
+   sfq_default_parameters(sch);
+   if ((err = sfq_q_init(q, opt)))
+   return err;
+
+   init_timer(q-perturb_timer);
+   q-perturb_timer.data = (unsigned long)sch;
+   q-perturb_timer.function = sfq_perturbation;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   }
+
+   return 0;
+}
+
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
del_timer(q-perturb_timer);
 }
 
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info

[PATCH 04/10] Make depth (number of queues) user-configurable:

2007-09-28 Thread Corey Hickey
* replace #define with a parameter
* use old hardcoded value as a default
* kcalloc() arrays in sfq_q_init()
* free() arrays in sfq_q_destroy()

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   85 +++---
 1 files changed, 59 insertions(+), 26 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index ca22cb7..34a299d 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,14 +74,16 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type must contain greater than depth*2 values, so depth is constrained 
+ * accordingly. */
+typedef unsigned int sfq_index;
+#define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
 struct sfq_head
 {
@@ -95,6 +97,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,11 +106,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -164,7 +167,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -215,7 +218,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d + q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -238,7 +241,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -254,8 +257,8 @@ sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data 
*q, unsigned int end)
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -266,7 +269,7 @@ sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data 
*q, unsigned int end)
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -318,7 +321,7 @@ sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -329,10 +332,10 @@ sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail] = a;
@@ -385,6 +388,11 @@ static void sfq_perturbation(unsigned long arg)
 static void

[PATCH 06/10] Make qdisc changeable.

2007-09-28 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   61 ++-
 1 files changed, 60 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index d72ea7c..b8e8fa5 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -416,6 +416,17 @@ sfq_default_parameters(struct Qdisc *sch)
q-limit  = SFQ_DEPTH_DEFAULT - 2;
 }
 
+static void
+sfq_copy_parameters(struct sfq_sched_data *dst, struct sfq_sched_data *src)
+{
+   dst-quantum= src-quantum;
+   dst-perturbation   = src-perturbation;
+   dst-perturb_period = src-perturb_period;
+   dst-hash_divisor   = src-hash_divisor;
+   dst-limit  = src-limit;
+   dst-depth  = src-depth;
+}
+
 static int
 sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
@@ -503,6 +514,54 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   unsigned int qlen;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   sfq_copy_parameters(tmp, q);
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* copy packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   qlen = sch-q.qlen;
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+   
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   sfq_copy_parameters(q, tmp);
+   q-tail  = tmp.tail;
+   q-max_depth = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;
+   }
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -537,7 +596,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/10] Multiply perturb_period by HZ when used rather than when assigned.

2007-09-28 Thread Corey Hickey
perturb_period is the only parameter that doesn't match 1:1 with the
value from userspace. This change makes it easy and clean to use a
small macro for setting parameters (in a subsequent patch).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 341a9a1..2d3cc38 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -370,7 +370,7 @@ static void sfq_perturbation(unsigned long arg)
q-perturbation = net_random()0x1F;
 
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = jiffies + q-perturb_period * HZ;
add_timer(q-perturb_timer);
}
 }
@@ -433,7 +433,7 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (ctl-quantum)
q-quantum = ctl-quantum;
if (ctl-perturb_period)
-   q-perturb_period = ctl-perturb_period * HZ;
+   q-perturb_period = ctl-perturb_period;
if (ctl-divisor)
q-hash_divisor = ctl-divisor;
if (ctl-flows)
@@ -496,7 +496,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = jiffies + q-perturb_period * HZ;
add_timer(q-perturb_timer);
}
 
@@ -542,7 +542,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
 
/* finish up */
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = jiffies + q-perturb_period * HZ;
add_timer(q-perturb_timer);
} else {
q-perturbation = 0;
@@ -558,7 +558,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
struct tc_sfq_qopt opt;
 
opt.quantum = q-quantum;
-   opt.perturb_period = q-perturb_period/HZ;
+   opt.perturb_period = q-perturb_period;
 
opt.limit = q-limit;
opt.divisor = q-hash_divisor;
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/10] Remove comments about hardcoded values.

2007-09-28 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   17 +++--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..58a0ea6 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index b8e8fa5..341a9a1 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -61,18 +61,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT1024
@@ -521,7 +510,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
struct sk_buff *skb;
unsigned int qlen;
int err;
-   
+
/* set up tmp queue */
memset(tmp, 0, sizeof(struct sfq_sched_data));
sfq_copy_parameters(tmp, q);
@@ -536,7 +525,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
while ((skb = sfq_q_dequeue(q)) != NULL)
sfq_q_enqueue(skb, tmp, SFQ_TAIL);
-   
+
/* clean up the old queue */
sfq_q_destroy(q);
 
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/10] Change perturb_period to unsigned.

2007-09-28 Thread Corey Hickey
perturb_period is currently a signed integer, but I can't see any good
reason why this is so--a negative perturbation period will add a timer
that expires in the past, causing constant perturbation, which makes
hashing useless.

if (q-perturb_period) {
q-perturb_timer.expires = jiffies + q-perturb_period;
add_timer(q-perturb_timer);
}

Strictly speaking, this will break binary compatibility with older
versions of tc, but that ought not to be a problem because (a) there's
no valid use for a negative perturb_period, and (b) negative values
will be seen as high values ( INT_MAX), which don't work anyway.

If perturb_period is too large, (perturb_period * HZ) will overflow the
size of an unsigned int and wrap around. So, check for thet and reject
values that are too high.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |2 +-
 net/sched/sch_sfq.c   |8 ++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 58a0ea6..8559974 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -142,7 +142,7 @@ enum
 struct tc_sfq_qopt
 {
unsignedquantum;/* Bytes per round allocated to flow */
-   int perturb_period; /* Period of hash perturbation */
+   unsignedperturb_period; /* Period of hash perturbation */
__u32   limit;  /* Maximal packets in queue */
unsigneddivisor;/* Hash divisor  */
unsignedflows;  /* Maximal number of flows  */
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 2d3cc38..170fd37 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,6 +74,9 @@
 typedef unsigned int sfq_index;
 #define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
+/* We don't want perturb_period * HZ to overflow an unsigned int. */
+#define SFQ_MAX_PERTURB (UINT_MAX / HZ)
+
 struct sfq_head
 {
sfq_index   next;
@@ -83,7 +86,7 @@ struct sfq_head
 struct sfq_sched_data
 {
 /* Parameters */
-   int perturb_period;
+   unsignedperturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
@@ -441,7 +444,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (ctl-limit)
q-limit = ctl-limit;
 
-   if (q-depth  SFQ_MAX_DEPTH)
+   if (q-perturb_period  SFQ_MAX_PERTURB ||
+   q-depth  SFQ_MAX_DEPTH)
return -EINVAL;
}
q-limit = min_t(u32, q-limit, q-depth - 2);
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/10] Add divisor.

2007-09-28 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 34a299d..d72ea7c 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,7 +75,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -98,6 +98,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -105,7 +106,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -120,7 +121,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -388,6 +389,7 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   kfree(q-ht);
kfree(q-dep);
kfree(q-next);
kfree(q-allot);
@@ -409,6 +411,7 @@ sfq_default_parameters(struct Qdisc *sch)
q-quantum= psched_mtu(sch-dev);
q-perturbation   = 0;
q-perturb_period = 0;
+   q-hash_divisor   = SFQ_DIVISOR_DEFAULT;
q-depth  = SFQ_DEPTH_DEFAULT;
q-limit  = SFQ_DEPTH_DEFAULT - 2;
 }
@@ -431,6 +434,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
q-quantum = ctl-quantum;
if (ctl-perturb_period)
q-perturb_period = ctl-perturb_period * HZ;
+   if (ctl-divisor)
+   q-hash_divisor = ctl-divisor;
if (ctl-flows)
q-depth = ctl-flows;
if (ctl-limit)
@@ -443,6 +448,9 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
q-tail = q-depth;
q-max_depth = 0;
 
+   q-ht = kcalloc(q-hash_divisor, sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kcalloc(1 + q-depth*2, sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -459,7 +467,7 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; iq-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; i  q-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -505,7 +513,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/10] Use nested compat attributes to pass parameters.

2007-09-28 Thread Corey Hickey
This fixes the ambiguity between, for example:
tc qdisc change ... perturb 0
tc qdisc change ...

Without this patch, there is no way for SFQ to differentiate between
a parameter specified to be 0 and a parameter that was omitted.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |   13 +++
 net/sched/sch_sfq.c   |   53 +---
 2 files changed, 48 insertions(+), 18 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 8559974..aad04eb 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,6 +148,19 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
+enum
+{
+   TCA_SFQ_UNSPEC,
+   TCA_SFQ_COMPAT,
+   TCA_SFQ_QUANTUM,
+   TCA_SFQ_PERTURB,
+   TCA_SFQ_LIMIT,
+   TCA_SFQ_DIVISOR,
+   TCA_SFQ_FLOWS,
+   __TCA_SFQ_MAX,
+};
+
+#define TCA_SFQ_MAX (__TCA_SFQ_MAX - 1)
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 170fd37..36197f6 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -428,25 +428,31 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 * the previous values (sfq_change). So, overwrite the parameters as
 * specified. */
if (opt) {
-   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
-   return -EINVAL;
-
-   if (ctl-quantum)
-   q-quantum = ctl-quantum;
-   if (ctl-perturb_period)
-   q-perturb_period = ctl-perturb_period;
-   if (ctl-divisor)
-   q-hash_divisor = ctl-divisor;
-   if (ctl-flows)
-   q-depth = ctl-flows;
-   if (ctl-limit)
-   q-limit = ctl-limit;
-
+   struct tc_sfq_qopt *ctl;
+   struct rtattr *tb[TCA_SFQ_MAX];
+
+   if (rtattr_parse_nested_compat(tb, TCA_SFQ_MAX, opt, ctl,
+  sizeof(*ctl)))
+   goto rtattr_failure;
+
+#define GET_PARAM(dst, nest, compat) do { \
+   struct rtattr *rta = tb[(nest) - 1]; \
+   if (rta) \
+   (dst) = RTA_GET_U32(rta); \
+   else if ((compat)) \
+   (dst) = (compat); \
+} while (0)
+
+   GET_PARAM(q-quantum,TCA_SFQ_QUANTUM, ctl-quantum);
+   GET_PARAM(q-perturb_period, TCA_SFQ_PERTURB,
+   ctl-perturb_period);
+   GET_PARAM(q-hash_divisor,   TCA_SFQ_DIVISOR, ctl-divisor);
+   GET_PARAM(q-depth,  TCA_SFQ_FLOWS,   ctl-flows);
+   GET_PARAM(q-limit,  TCA_SFQ_LIMIT,   ctl-limit);
+   
if (q-perturb_period  SFQ_MAX_PERTURB ||
q-depth  SFQ_MAX_DEPTH)
-   return -EINVAL;
+   goto rtattr_failure;
}
q-limit = min_t(u32, q-limit, q-depth - 2);
q-tail = q-depth;
@@ -482,6 +488,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
for (i=0; i  q-depth; i++)
sfq_link(q, i);
return 0;
+rtattr_failure:
+   return -EINVAL;
 err_case:
sfq_q_destroy(q);
return -ENOBUFS;
@@ -559,17 +567,26 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff 
*skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
unsigned char *b = skb_tail_pointer(skb);
+   struct rtattr *nest;
struct tc_sfq_qopt opt;
 
opt.quantum = q-quantum;
opt.perturb_period = q-perturb_period;
-
opt.limit = q-limit;
opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
+   nest = RTA_NEST_COMPAT(skb, TCA_OPTIONS, sizeof(opt), opt);
+
+   RTA_PUT_U32(skb, TCA_SFQ_QUANTUM, q-quantum);
+   RTA_PUT_U32(skb, TCA_SFQ_PERTURB, q-perturb_period);
+   RTA_PUT_U32(skb, TCA_SFQ_LIMIT,   q-limit);
+   RTA_PUT_U32(skb, TCA_SFQ_DIVISOR, q-hash_divisor);
+   RTA_PUT_U32(skb, TCA_SFQ_FLOWS,   q-depth);
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
 
+   RTA_NEST_COMPAT_END(skb, nest);
+
return skb-len;
 
 rtattr_failure:
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


sfq

2007-09-28 Thread Corey Hickey

These patches follow the ESFQ--SFQ kernel patches. See the kernel
patch summary for general information.

Thanks,
Corey



 include/linux/pkt_sched.h |   23 ++-
 tc/q_sfq.c|   43 ++-
 2 files changed, 52 insertions(+), 14 deletions(-)


[PATCH 1/3] SFQ: Support changing depth and divisor.
[PATCH 2/3] Change perturb_period to unsigned.
[PATCH 3/3] Use nested compat attributes for passing parameters to the kernel.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] Change perturb_period to unsigned.

2007-09-28 Thread Corey Hickey
This corresponds to the kernel patch doing the same.

Here, too, this will technically break binary compatibility with older
kernels, but that shouldn't be a problem because negative perturb_period
values aren't usable anyway.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |2 +-
 tc/q_sfq.c|4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 9d41f63..fb04a89 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -142,7 +142,7 @@ enum
 struct tc_sfq_qopt
 {
unsignedquantum;/* Bytes per round allocated to flow */
-   int perturb_period; /* Period of hash perturbation */
+   unsignedperturb_period; /* Period of hash perturbation */
__u32   limit;  /* Maximal packets in queue */
unsigneddivisor;/* Hash divisor  */
unsignedflows;  /* Maximal number of flows  */
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 7754db7..c9fcc53 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -47,7 +47,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
ok++;
} else if (strcmp(*argv, perturb) == 0) {
NEXT_ARG();
-   if (get_integer(opt.perturb_period, *argv, 0)) {
+   if (get_u32(opt.perturb_period, *argv, 0)) {
fprintf(stderr, Illegal \perturb\\n);
return -1;
}
@@ -115,7 +115,7 @@ static int sfq_print_opt(struct qdisc_util *qu, FILE *f, 
struct rtattr *opt)
fprintf(f, flows %u/%u , qopt-flows, qopt-divisor);
}
if (qopt-perturb_period)
-   fprintf(f, perturb %dsec , qopt-perturb_period);
+   fprintf(f, perturb %usec , qopt-perturb_period);
return 0;
 }
 
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] Use nested compat attributes for passing parameters to the kernel.

2007-09-28 Thread Corey Hickey
Note that I have left sfq_print_opt() alone. At this point, there
can be no difference between the data in the nested rtattrs and the
data in the compat rtattr, and I didn't want to add clutter that
isn't useful. Let me know if I should do differently.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |   14 ++
 tc/q_sfq.c|   18 --
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index fb04a89..aad04eb 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,6 +148,20 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
+enum
+{
+   TCA_SFQ_UNSPEC,
+   TCA_SFQ_COMPAT,
+   TCA_SFQ_QUANTUM,
+   TCA_SFQ_PERTURB,
+   TCA_SFQ_LIMIT,
+   TCA_SFQ_DIVISOR,
+   TCA_SFQ_FLOWS,
+   __TCA_SFQ_MAX,
+};
+
+#define TCA_SFQ_MAX (__TCA_SFQ_MAX - 1)
+
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index c9fcc53..5bb3eb7 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -34,9 +34,13 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
 {
int ok=0;
struct tc_sfq_qopt opt;
+   struct rtattr *nest;
 
memset(opt, 0, sizeof(opt));
 
+   /* put blank data in rtattr so there is a hole to fill later */
+   nest = addattr_nest_compat(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+
while (argc  0) {
if (strcmp(*argv, quantum) == 0) {
NEXT_ARG();
@@ -44,6 +48,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \limit\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_QUANTUM, opt.quantum);
ok++;
} else if (strcmp(*argv, perturb) == 0) {
NEXT_ARG();
@@ -51,6 +56,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \perturb\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_PERTURB, opt.perturb_period);
ok++;
} else if (strcmp(*argv, limit) == 0) {
NEXT_ARG();
@@ -62,6 +68,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \limit\, must be  
1\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_LIMIT, opt.limit);
ok++;
} else if (strcmp(*argv, depth) == 0) {
NEXT_ARG();
@@ -69,6 +76,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \depth\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_FLOWS, opt.flows);
ok++;
} else if (strcmp(*argv, divisor) == 0) {
NEXT_ARG();
@@ -81,6 +89,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
opt.divisor = 1opt.divisor;
+   addattr32(n, 1024, TCA_SFQ_DIVISOR, opt.divisor);
ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
@@ -93,8 +102,13 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
argc--; argv++;
}
 
-   if (ok)
-   addattr_l(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+   if (ok) {
+   /* fill the hole we left earlier with real compat data */
+   memcpy(RTA_DATA(nest), opt, sizeof(opt));
+   addattr_nest_compat_end(n, nest);
+   }
+   else
+   nest-rta_len = 0;
return 0;
 }
 
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] SFQ: Support changing depth and divisor.

2007-09-28 Thread Corey Hickey
This can safely be applied either before or after the kernel
patches because the tc_sfq_qopt struct is unchanged:

- old kernels will ignore the parameters from new iproute2
- new kernels will use the same default parameters

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |9 -
 tc/q_sfq.c|   21 -
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..9d41f63 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,15 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
-
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 05385cf..7754db7 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -25,7 +25,7 @@
 
 static void explain(void)
 {
-   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ perturb SECS ] [ 
quantum BYTES ]\n);
+   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ depth FLOWS ] [ 
divisor HASHBITS ] [ perturb SECS ] [ quantum BYTES ]\n);
 }
 
 #define usage() return(-1)
@@ -63,6 +63,25 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
ok++;
+   } else if (strcmp(*argv, depth) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.flows, *argv, 0)) {
+   fprintf(stderr, Illegal \depth\\n);
+   return -1;
+   }
+   ok++;
+   } else if (strcmp(*argv, divisor) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.divisor, *argv, 0)) {
+   fprintf(stderr, Illegal \divisor\\n);
+   return -1;
+   }
+   if (opt.divisor = 15) {
+   fprintf(stderr, Illegal \divisor\, must be  
15\n);
+   return -1;
+   }
+   opt.divisor = 1opt.divisor;
+   ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
return -1;
-- 
1.5.3

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: sfq (iproute2 patches)

2007-09-28 Thread Corey Hickey

Corey Hickey wrote:

These patches follow the ESFQ--SFQ kernel patches. See the kernel
patch summary for general information.


Dang, I forgot to set the subject; these are the iproute2 patches.

-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/10] SFQ: backport some features from ESFQ (try 3)

2007-08-25 Thread Corey Hickey
Patchset try 2 addresses the review by Michael Buesch.
Patchset try 3 addresses the review by Patrick McHardy.

The first 7 patches in this series resemble the corresponding 7 patches
I sent previously. There aren't any major changes--just modifications
to address errors noticed in review and slight reorganizations to make
the next patches easier.

Patches 8-10 implement parameter passing via nested compat attributes.
This is necessary for using 'tc qdisc change' to disable perturbation.
The rest of the parameters were added for consistency.

Iproute2 patches will follow shortly.



The following is the original patch text.

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

Thanks for your consideration,
Corey



 include/linux/pkt_sched.h |   23 ++-
 net/sched/sch_sfq.c   |  356 ++--
 2 files changed, 257 insertions(+), 122 deletions(-)


[PATCH 01/10] Preparatory refactoring part 1.
[PATCH 02/10] Preparatory refactoring part 2.
[PATCH 03/10] Move two functions.
[PATCH 04/10] Make depth (number of queues) user-configurable:
[PATCH 05/10] Add divisor.
[PATCH 06/10] Make qdisc changeable.
[PATCH 07/10] Remove comments about hardcoded values.
[PATCH 08/10] Multiply perturb_period by HZ when used rather than when assigned.
[PATCH 09/10] Change perturb_period to unsigned.
[PATCH 10/10] Use nested compat attributes to pass parameters.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/10] Preparatory refactoring part 1.

2007-08-25 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   72 +++
 1 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 9579573..346e966 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -77,6 +77,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -244,10 +247,9 @@ static unsigned int sfq_drop(struct Qdisc *sch)
return 0;
 }
 
-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void
+sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, unsigned int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +258,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -270,6 +276,15 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+
+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen  q-limit-1) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
@@ -284,45 +299,21 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
 
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
+   sfq_q_enqueue(skb, q, SFQ_HEAD);
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x;
-   q-allot[x] = q-quantum;
-   } else {
-   q-next[x] = q-next[q-tail];
-   q-next[q-tail] = x;
-   q-tail = x;
-   }
-   }
if (++sch-q.qlen  q-limit - 1) {
sch-qstats.requeues++;
return 0;
}
 
-   sch-qstats.drops++;
sfq_drop(sch);
return NET_XMIT_CN;
 }
 
-
-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct
+sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct sk_buff *skb;
sfq_index a, old_a;
 
@@ -335,8 +326,6 @@ sfq_dequeue(struct Qdisc* sch)
/* Grab packet */
skb = __skb_dequeue(q-qs[a]);
sfq_dec(q, a);
-   sch-q.qlen--;
-   sch-qstats.backlog -= skb-len;
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
@@ -353,6 +342,21 @@ sfq_dequeue(struct Qdisc* sch)
a = q-next[a];
q-allot[a] += q-quantum;
}
+
+   return skb;
+}
+
+static struct sk_buff
+*sfq_dequeue(struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sk_buff *skb;
+
+   skb = sfq_q_dequeue(q);
+   if (skb == NULL)
+   return NULL;
+   sch-q.qlen--;
+   sch-qstats.backlog -= skb-len;
return skb;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/10] Preparatory refactoring part 2.

2007-08-25 Thread Corey Hickey
Factor code out of sfq_init() and sfq_destroy(), again so that the
new functions can be used by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

Setting default parameters is moved into a separate function for
clarity.

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   96 +--
 1 files changed, 55 insertions(+), 41 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 346e966..f95a0dc 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -382,43 +382,42 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static void
+sfq_default_parameters(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
-
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
-   return -EINVAL;
-
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen = q-limit-1)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
-
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
-   add_timer(q-perturb_timer);
-   }
-   sch_tree_unlock(sch);
-   return 0;
+   q-quantum = psched_mtu(sch-dev);
+   q-perturbation = 0;
+   q-perturb_period = 0;
+   q-limit = SFQ_DEPTH;
 }
 
-static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
+
+static int
+sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
int i;
 
-   init_timer(q-perturb_timer);
-   q-perturb_timer.data = (unsigned long)sch;
-   q-perturb_timer.function = sfq_perturbation;
+   /* At this point, parameters are set to either defaults (sfq_init) or
+* the previous values (sfq_change). So, overwrite the parameters as
+* specified. */
+   if (opt) {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+
+   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   return -EINVAL;
+
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   if (ctl-perturb_period)
+   q-perturb_period = ctl-perturb_period * HZ;
+   if (ctl-limit)
+   q-limit = ctl-limit;
+   }
+   q-limit = min_t(u32, q-limit, SFQ_DEPTH);
+   q-tail = SFQ_DEPTH;
+   q-max_depth = 0;
 
for (i=0; iSFQ_HASH_DIVISOR; i++)
q-ht[i] = SFQ_DEPTH;
@@ -427,28 +426,43 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   q-limit = SFQ_DEPTH;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
-   }
+
for (i=0; iSFQ_DEPTH; i++)
sfq_link(q, i);
return 0;
 }
 
-static void sfq_destroy(struct Qdisc *sch)
+static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
+   int err;
+
+   sfq_default_parameters(sch);
+   if ((err = sfq_q_init(q, opt)))
+   return err;
+
+   init_timer(q-perturb_timer);
+   q-perturb_timer.data = (unsigned long)sch;
+   q-perturb_timer.function = sfq_perturbation;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   }
+
+   return 0;
+}
+
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
del_timer(q-perturb_timer);
 }
 
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org

[PATCH 03/10] Move two functions.

2007-08-25 Thread Corey Hickey
Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   22 +++---
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index f95a0dc..676195d 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -382,6 +382,17 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static void
 sfq_default_parameters(struct Qdisc *sch)
 {
@@ -452,17 +463,6 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
-static void sfq_q_destroy(struct sfq_sched_data *q)
-{
-   del_timer(q-perturb_timer);
-}
-
-static void sfq_destroy(struct Qdisc *sch)
-{
-   struct sfq_sched_data *q = qdisc_priv(sch);
-   sfq_q_destroy(q);
-}
-
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/10] Make depth (number of queues) user-configurable:

2007-08-25 Thread Corey Hickey
* replace #define with a parameter
* use old hardcoded value as a default
* kcalloc() arrays in sfq_q_init()
* free() arrays in sfq_q_destroy()

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   84 +++
 1 files changed, 58 insertions(+), 26 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 676195d..2e6d607 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,14 +74,16 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type must contain greater than depth*2 values, so depth is constrained 
+ * accordingly. */
+typedef unsigned int sfq_index;
+#define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
 struct sfq_head
 {
@@ -95,6 +97,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,11 +106,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -164,7 +167,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -215,7 +218,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d + q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -238,7 +241,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -254,8 +257,8 @@ sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data 
*q, unsigned int end)
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -266,7 +269,7 @@ sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data 
*q, unsigned int end)
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -318,7 +321,7 @@ sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -329,10 +332,10 @@ sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail] = a;
@@ -385,6 +388,11 @@ static void sfq_perturbation(unsigned long arg)
 static void

[PATCH 08/10] Multiply perturb_period by HZ when used rather than when assigned.

2007-08-25 Thread Corey Hickey
perturb_period is the only parameter that doesn't match 1:1 with the
value from userspace. This change makes it easy and clean to use a
small macro for setting parameters (in a subsequent patch).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 77ffce3..f9f4460 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -370,7 +370,7 @@ static void sfq_perturbation(unsigned long arg)
q-perturbation = net_random()0x1F;
 
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = jiffies + q-perturb_period * HZ;
add_timer(q-perturb_timer);
}
 }
@@ -432,7 +432,7 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (ctl-quantum)
q-quantum = ctl-quantum;
if (ctl-perturb_period)
-   q-perturb_period = ctl-perturb_period * HZ;
+   q-perturb_period = ctl-perturb_period;
if (ctl-divisor)
q-hash_divisor = ctl-divisor;
if (ctl-flows)
@@ -495,7 +495,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = jiffies + q-perturb_period * HZ;
add_timer(q-perturb_timer);
}
 
@@ -545,7 +545,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
 
/* finish up */
if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
+   q-perturb_timer.expires = jiffies + q-perturb_period * HZ;
add_timer(q-perturb_timer);
} else {
q-perturbation = 0;
@@ -561,7 +561,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
struct tc_sfq_qopt opt;
 
opt.quantum = q-quantum;
-   opt.perturb_period = q-perturb_period/HZ;
+   opt.perturb_period = q-perturb_period;
 
opt.limit = q-limit;
opt.divisor = q-hash_divisor;
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/10] Make qdisc changeable.

2007-08-25 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   64 ++-
 1 files changed, 63 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 827b885..08e6862 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -415,6 +415,16 @@ sfq_default_parameters(struct Qdisc *sch)
q-limit = q-depth = SFQ_DEPTH_DEFAULT;
 }
 
+static void
+sfq_copy_parameters(struct sfq_sched_data *dst, struct sfq_sched_data *src)
+{
+   dst-quantum= src-quantum;
+   dst-perturbation   = src-perturbation;
+   dst-perturb_period = src-perturb_period;
+   dst-hash_divisor   = src-hash_divisor;
+   dst-limit  = src-limit;
+   dst-depth  = src-depth;
+}
 
 static int
 sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
@@ -503,6 +513,58 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   unsigned int qlen;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   sfq_copy_parameters(tmp, q);
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* copy packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   qlen = sch-q.qlen;
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+   
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   q-perturb_period = tmp.perturb_period;
+   q-quantum= tmp.quantum;
+   q-limit  = tmp.limit;
+   q-depth  = tmp.depth;
+   q-hash_divisor   = tmp.hash_divisor;
+   q-tail   = tmp.tail;
+   q-max_depth  = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;
+   }
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -537,7 +599,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/10] Remove comments about hardcoded values.

2007-08-25 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   17 +++--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..58a0ea6 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 08e6862..77ffce3 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -61,18 +61,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT1024
@@ -520,7 +509,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
struct sk_buff *skb;
unsigned int qlen;
int err;
-   
+
/* set up tmp queue */
memset(tmp, 0, sizeof(struct sfq_sched_data));
sfq_copy_parameters(tmp, q);
@@ -535,7 +524,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
while ((skb = sfq_q_dequeue(q)) != NULL)
sfq_q_enqueue(skb, tmp, SFQ_TAIL);
-   
+
/* clean up the old queue */
sfq_q_destroy(q);
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/10] Add divisor.

2007-08-25 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 2e6d607..827b885 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,7 +75,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -98,6 +98,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -105,7 +106,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -120,7 +121,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -388,6 +389,7 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   kfree(q-ht);
kfree(q-dep);
kfree(q-next);
kfree(q-allot);
@@ -409,6 +411,7 @@ sfq_default_parameters(struct Qdisc *sch)
q-quantum = psched_mtu(sch-dev);
q-perturbation = 0;
q-perturb_period = 0;
+   q-hash_divisor = SFQ_DIVISOR_DEFAULT;
q-limit = q-depth = SFQ_DEPTH_DEFAULT;
 }
 
@@ -431,6 +434,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
q-quantum = ctl-quantum;
if (ctl-perturb_period)
q-perturb_period = ctl-perturb_period * HZ;
+   if (ctl-divisor)
+   q-hash_divisor = ctl-divisor;
if (ctl-flows)
q-depth = ctl-flows;
if (ctl-limit)
@@ -443,6 +448,9 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
q-tail = q-depth;
q-max_depth = 0;
 
+   q-ht = kcalloc(q-hash_divisor, sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kcalloc(1 + q-depth*2, sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -459,7 +467,7 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; iq-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; i  q-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -505,7 +513,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/10] Change perturb_period to unsigned.

2007-08-25 Thread Corey Hickey
perturb_period is currently a signed integer, but I can't see any good
reason why this is so--a negative perturbation period will add a timer
that expires in the past, causing constant perturbation, which makes
hashing useless.

if (q-perturb_period) {
q-perturb_timer.expires = jiffies + q-perturb_period;
add_timer(q-perturb_timer);
}

Strictly speaking, this will break binary compatibility with older
versions of tc, but that ought not to be a problem because (a) there's
no valid use for a negative perturb_period, and (b) negative values
will be seen as high values ( INT_MAX), which don't work anyway.

If perturb_period is too large, (perturb_period * HZ) will overflow the
size of an unsigned int and wrap around. So, check for thet and reject
values that are too high.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |2 +-
 net/sched/sch_sfq.c   |8 ++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 58a0ea6..8559974 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -142,7 +142,7 @@ enum
 struct tc_sfq_qopt
 {
unsignedquantum;/* Bytes per round allocated to flow */
-   int perturb_period; /* Period of hash perturbation */
+   unsignedperturb_period; /* Period of hash perturbation */
__u32   limit;  /* Maximal packets in queue */
unsigneddivisor;/* Hash divisor  */
unsignedflows;  /* Maximal number of flows  */
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index f9f4460..157adc8 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,6 +74,9 @@
 typedef unsigned int sfq_index;
 #define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
+/* We don't want perturb_period * HZ to overflow an unsigned int. */
+#define SFQ_MAX_PERTURB (UINT_MAX / HZ)
+
 struct sfq_head
 {
sfq_index   next;
@@ -83,7 +86,7 @@ struct sfq_head
 struct sfq_sched_data
 {
 /* Parameters */
-   int perturb_period;
+   unsignedperturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
@@ -440,7 +443,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
if (ctl-limit)
q-limit = ctl-limit;
 
-   if (q-depth  SFQ_MAX_DEPTH)
+   if (q-perturb_period  SFQ_MAX_PERTURB ||
+   q-depth  SFQ_MAX_DEPTH)
return -EINVAL;
}
q-limit = min_t(u32, q-limit, q-depth);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/10] Use nested compat attributes to pass parameters.

2007-08-25 Thread Corey Hickey
This fixes the ambiguity between, for example:
tc qdisc change ... perturb 0
tc qdisc change ...

Without this patch, there is no way for SFQ to differentiate between
a parameter specified to be 0 and a parameter that was omitted.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |   13 +++
 net/sched/sch_sfq.c   |   53 +---
 2 files changed, 48 insertions(+), 18 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 8559974..aad04eb 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,6 +148,19 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
+enum
+{
+   TCA_SFQ_UNSPEC,
+   TCA_SFQ_COMPAT,
+   TCA_SFQ_QUANTUM,
+   TCA_SFQ_PERTURB,
+   TCA_SFQ_LIMIT,
+   TCA_SFQ_DIVISOR,
+   TCA_SFQ_FLOWS,
+   __TCA_SFQ_MAX,
+};
+
+#define TCA_SFQ_MAX (__TCA_SFQ_MAX - 1)
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 157adc8..62232c9 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -427,25 +427,31 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 * the previous values (sfq_change). So, overwrite the parameters as
 * specified. */
if (opt) {
-   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
-   return -EINVAL;
-
-   if (ctl-quantum)
-   q-quantum = ctl-quantum;
-   if (ctl-perturb_period)
-   q-perturb_period = ctl-perturb_period;
-   if (ctl-divisor)
-   q-hash_divisor = ctl-divisor;
-   if (ctl-flows)
-   q-depth = ctl-flows;
-   if (ctl-limit)
-   q-limit = ctl-limit;
-
+   struct tc_sfq_qopt *ctl;
+   struct rtattr *tb[TCA_SFQ_MAX];
+
+   if (rtattr_parse_nested_compat(tb, TCA_SFQ_MAX, opt, ctl,
+  sizeof(*ctl)))
+   goto rtattr_failure;
+
+#define GET_PARAM(dst, nest, compat) do { \
+   struct rtattr *rta = tb[(nest) - 1]; \
+   if (rta) \
+   (dst) = RTA_GET_U32(rta); \
+   else if ((compat)) \
+   (dst) = (compat); \
+} while (0)
+
+   GET_PARAM(q-quantum,TCA_SFQ_QUANTUM, ctl-quantum);
+   GET_PARAM(q-perturb_period, TCA_SFQ_PERTURB,
+   ctl-perturb_period);
+   GET_PARAM(q-hash_divisor,   TCA_SFQ_DIVISOR, ctl-divisor);
+   GET_PARAM(q-depth,  TCA_SFQ_FLOWS,   ctl-flows);
+   GET_PARAM(q-limit,  TCA_SFQ_LIMIT,   ctl-limit);
+   
if (q-perturb_period  SFQ_MAX_PERTURB ||
q-depth  SFQ_MAX_DEPTH)
-   return -EINVAL;
+   goto rtattr_failure;
}
q-limit = min_t(u32, q-limit, q-depth);
q-tail = q-depth;
@@ -481,6 +487,8 @@ sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
for (i=0; i  q-depth; i++)
sfq_link(q, i);
return 0;
+rtattr_failure:
+   return -EINVAL;
 err_case:
sfq_q_destroy(q);
return -ENOBUFS;
@@ -562,17 +570,26 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff 
*skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
unsigned char *b = skb_tail_pointer(skb);
+   struct rtattr *nest;
struct tc_sfq_qopt opt;
 
opt.quantum = q-quantum;
opt.perturb_period = q-perturb_period;
-
opt.limit = q-limit;
opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
+   nest = RTA_NEST_COMPAT(skb, TCA_OPTIONS, sizeof(opt), opt);
+
+   RTA_PUT_U32(skb, TCA_SFQ_QUANTUM, q-quantum);
+   RTA_PUT_U32(skb, TCA_SFQ_PERTURB, q-perturb_period);
+   RTA_PUT_U32(skb, TCA_SFQ_LIMIT,   q-limit);
+   RTA_PUT_U32(skb, TCA_SFQ_DIVISOR, q-hash_divisor);
+   RTA_PUT_U32(skb, TCA_SFQ_FLOWS,   q-depth);
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
 
+   RTA_NEST_COMPAT_END(skb, nest);
+
return skb-len;
 
 rtattr_failure:
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3][iproute2] SFQ: backport some features from ESFQ

2007-08-25 Thread Corey Hickey

These patches follow the ESFQ--SFQ kernel patches. See the kernel
patch summary for general information.

Thanks,
Corey



 include/linux/pkt_sched.h |   23 ++-
 tc/q_sfq.c|   43 ++-
 2 files changed, 52 insertions(+), 14 deletions(-)


[PATCH 1/3] SFQ: Support changing depth and divisor.
[PATCH 2/3] Change perturb_period to unsigned.
[PATCH 3/3] Use nested compat attributes for passing parameters to the kernel.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] SFQ: Support changing depth and divisor.

2007-08-25 Thread Corey Hickey
This can safely be applied either before or after the kernel
patches because the tc_sfq_qopt struct is unchanged:

- old kernels will ignore the parameters from new iproute2
- new kernels will use the same default parameters
---
 include/linux/pkt_sched.h |9 -
 tc/q_sfq.c|   21 -
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..37946d4 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -139,15 +139,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
-
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 05385cf..7754db7 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -25,7 +25,7 @@
 
 static void explain(void)
 {
-   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ perturb SECS ] [ 
quantum BYTES ]\n);
+   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ depth FLOWS ] [ 
divisor HASHBITS ] [ perturb SECS ] [ quantum BYTES ]\n);
 }
 
 #define usage() return(-1)
@@ -63,6 +63,25 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
ok++;
+   } else if (strcmp(*argv, depth) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.flows, *argv, 0)) {
+   fprintf(stderr, Illegal \depth\\n);
+   return -1;
+   }
+   ok++;
+   } else if (strcmp(*argv, divisor) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.divisor, *argv, 0)) {
+   fprintf(stderr, Illegal \divisor\\n);
+   return -1;
+   }
+   if (opt.divisor = 15) {
+   fprintf(stderr, Illegal \divisor\, must be  
15\n);
+   return -1;
+   }
+   opt.divisor = 1opt.divisor;
+   ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
return -1;
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] Change perturb_period to unsigned.

2007-08-25 Thread Corey Hickey
This corresponds to the kernel patch doing the same.

Here, too, this will technically break binary compatibility with older
kernels, but that shouldn't be a problem because negative perturb_period
values aren't usable anyway.
---
 include/linux/pkt_sched.h |2 +-
 tc/q_sfq.c|4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 37946d4..0252bb7 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -133,7 +133,7 @@ enum
 struct tc_sfq_qopt
 {
unsignedquantum;/* Bytes per round allocated to flow */
-   int perturb_period; /* Period of hash perturbation */
+   unsignedperturb_period; /* Period of hash perturbation */
__u32   limit;  /* Maximal packets in queue */
unsigneddivisor;/* Hash divisor  */
unsignedflows;  /* Maximal number of flows  */
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 7754db7..c9fcc53 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -47,7 +47,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
ok++;
} else if (strcmp(*argv, perturb) == 0) {
NEXT_ARG();
-   if (get_integer(opt.perturb_period, *argv, 0)) {
+   if (get_u32(opt.perturb_period, *argv, 0)) {
fprintf(stderr, Illegal \perturb\\n);
return -1;
}
@@ -115,7 +115,7 @@ static int sfq_print_opt(struct qdisc_util *qu, FILE *f, 
struct rtattr *opt)
fprintf(f, flows %u/%u , qopt-flows, qopt-divisor);
}
if (qopt-perturb_period)
-   fprintf(f, perturb %dsec , qopt-perturb_period);
+   fprintf(f, perturb %usec , qopt-perturb_period);
return 0;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] Use nested compat attributes for passing parameters to the kernel.

2007-08-25 Thread Corey Hickey
Note that I have left sfq_print_opt() alone. At this point, there
can be no difference between the data in the nested rtattrs and the
data in the compat rtattr, and I didn't want to add clutter that
isn't useful. Let me know if I should do differently.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |   14 ++
 tc/q_sfq.c|   18 --
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 0252bb7..0388094 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -139,6 +139,20 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
+enum
+{
+   TCA_SFQ_UNSPEC,
+   TCA_SFQ_COMPAT,
+   TCA_SFQ_QUANTUM,
+   TCA_SFQ_PERTURB,
+   TCA_SFQ_LIMIT,
+   TCA_SFQ_DIVISOR,
+   TCA_SFQ_FLOWS,
+   __TCA_SFQ_MAX,
+};
+
+#define TCA_SFQ_MAX (__TCA_SFQ_MAX - 1)
+
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index c9fcc53..5bb3eb7 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -34,9 +34,13 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
 {
int ok=0;
struct tc_sfq_qopt opt;
+   struct rtattr *nest;
 
memset(opt, 0, sizeof(opt));
 
+   /* put blank data in rtattr so there is a hole to fill later */
+   nest = addattr_nest_compat(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+
while (argc  0) {
if (strcmp(*argv, quantum) == 0) {
NEXT_ARG();
@@ -44,6 +48,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \limit\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_QUANTUM, opt.quantum);
ok++;
} else if (strcmp(*argv, perturb) == 0) {
NEXT_ARG();
@@ -51,6 +56,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \perturb\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_PERTURB, opt.perturb_period);
ok++;
} else if (strcmp(*argv, limit) == 0) {
NEXT_ARG();
@@ -62,6 +68,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \limit\, must be  
1\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_LIMIT, opt.limit);
ok++;
} else if (strcmp(*argv, depth) == 0) {
NEXT_ARG();
@@ -69,6 +76,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
fprintf(stderr, Illegal \depth\\n);
return -1;
}
+   addattr32(n, 1024, TCA_SFQ_FLOWS, opt.flows);
ok++;
} else if (strcmp(*argv, divisor) == 0) {
NEXT_ARG();
@@ -81,6 +89,7 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
opt.divisor = 1opt.divisor;
+   addattr32(n, 1024, TCA_SFQ_DIVISOR, opt.divisor);
ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
@@ -93,8 +102,13 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
argc--; argv++;
}
 
-   if (ok)
-   addattr_l(n, 1024, TCA_OPTIONS, opt, sizeof(opt));
+   if (ok) {
+   /* fill the hole we left earlier with real compat data */
+   memcpy(RTA_DATA(nest), opt, sizeof(opt));
+   addattr_nest_compat_end(n, nest);
+   }
+   else
+   nest-rta_len = 0;
return 0;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] Make qdisc changeable.

2007-08-05 Thread Corey Hickey

Patrick McHardy wrote:

+   if ((err = sfq_q_init(tmp, opt)))
+   return err;



This will also use defaults for all unspecified values. It would
be more consistent with other qdiscs to only change those values
that are actually specified, so something like tc qdisc change ...
perturb 10 will *only* change the perturbation parameter.


I'm fixed this for all the parameters except one--your example above.
Since 0 is a valid value for perturb, I can't see any clever way to
differentiate between the user specifying perturb 0 or leaving perturb
unspecified. Either way, opt-perturb_period is 0.

The only way I can see would be to add another member, say,
opt-perturb_specified, and use that accordingly. Unfortunately, this
would break usage of sfq with older versions of tc, so I'm hoping
there's a better approach.

Do you have any suggestions? I've looked at the other qdisc files and I 
don't see any other instances like this.


Thanks,
Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] Preparatory refactoring part 2.

2007-07-31 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 8ae077f..0c46938 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,71 +380,71 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)

+static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
+   int i;
 
-	if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))

+   if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))



opt is dereferenced above (RTA_DATA), so if it is NULL we've already
crashed.


I think that test made ESFQ not crash when I did a qdisc change 
without giving any parameters, but that was a while ago and I might be 
mistaken. I'll need to rewrite much of this function anyway, and I'll 
pay attention to what happens when I get there.



return -EINVAL;
 
-	sch_tree_lock(sch);

-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-perturbation = 0;
+   q-max_depth = 0;
+   q-tail = q-limit = SFQ_DEPTH;
+   if (opt == NULL) {
+   q-perturb_period = 0;
+   } else {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   q-perturb_period = ctl-perturb_period*HZ;
 
-	qlen = sch-q.qlen;

-   while (sch-q.qlen = q-limit-1)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);



I hope that patch that makes changing possible brings this back ..
checking .. it doesn't. Please either keep this or fix up 6/7
to bring it back.


It got lost in translation; I will add it to 6/7.

Thanks,
Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] Make qdisc changeable.

2007-07-31 Thread Corey Hickey

Patrick McHardy wrote:

Corey Hickey wrote:

Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   51 ++-
 1 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index e6a6a21..e042cd0 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -485,6 +485,55 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)

+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   tmp.quantum = psched_mtu(sch-dev); /* default */



If no value is given it should use the old value instead of
reinitializing to the default.


+   if ((err = sfq_q_init(tmp, opt)))
+   return err;



This will also use defaults for all unspecified values. It would
be more consistent with other qdiscs to only change those values
that are actually specified, so something like tc qdisc change ...
perturb 10 will *only* change the perturbation parameter.


I somehow had it in my head that a qdisc change was supposed to work 
that way. I'll change my patch.



I'm not sure reusing the initialization function and copying the
parameters is the best way to do this.


I'm not sure either, but I do like that it's conceptually simple and it 
keeps the parameter handling in one place. I've thought and stared for a 
while, and I don't see a better way, but that could of course be due to 
my limited understanding and general newbiehood in that regard.


Thanks again for the review. I'll try to get a new batch of patches sent 
off tomorrow.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/7] Preparatory refactoring part 1.

2007-07-30 Thread Corey Hickey

Patrick McHardy wrote:

-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, 
unsigned int end)



Please make sure to break at 80 chars and to keep the style
in this file consistent (newline before function name).


Ok. For what it's worth, though, most of the original functions in the 
file don't have a newline before the function name. Omitting the newline 
 would thus make the new/changed functions more consistent with the 
rest of the file. I don't have a preference either way, so unless you 
change your mind I'll put the newline back in..



 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +257,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)

q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;


Why not keep this instead of having both callers do it?


My idea was to have all the sfq_q_* functions operate on struct 
sfq_sched_data and have no knowledge of the struct Qdisc. I did this 
in order to be able to use the new functions in sfq_change() when the 
temporary sfq_sched_data doesn't have a parent Qdisc.


There's probably a better way, and I am of course open to suggestions, 
but what I did made sense to me.



-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {/* The flow is new */
if (q-tail == SFQ_DEPTH) {  /* It is the first flow */
@@ -270,12 +275,21 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);


newline please.


Ok.


+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen  q-limit-1) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
return 0;
}
 
+	sch-qstats.drops++;


sfq_drop already increments this.


You're right, of course. When I look at the original sfq_requeue(), 
though, I see that same line right before sfq_drop(). That's probably 
why it ended up in my patch. Is that a bug? The original sfq_enqueue() 
doesn't have that line.


http://git.kernel.org/?p=linux/kernel/git/jgarzik/netdev-2.6.git;a=blob;f=net/sched/sch_sfq.c
line 314


sfq_drop(sch);
return NET_XMIT_CN;
 }
@@ -284,28 +298,8 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);


newline please


Ok.


-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)



Keep style consistent please.


Ok.

Thank you for the review. I'll address the rest of your comments when I 
have time later.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/7] SFQ: backport some features from ESFQ

2007-07-29 Thread Corey Hickey
Hello,

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

A patch for iproute2 is included after the end of the kernel patch series.

Thanks for your consideration,
Corey


 include/linux/pkt_sched.h |8 --
 net/sched/sch_sfq.c   |  301 +
 2 files changed, 192 insertions(+), 117 deletions(-)

[PATCH 1/7] Preparatory refactoring part 1.
[PATCH 2/7] Preparatory refactoring part 2.
[PATCH 3/7] Move two functions.
[PATCH 4/7] Add depth.
[PATCH 5/7] Add divisor.
[PATCH 6/7] Make qdisc changeable.
[PATCH 7/7] Remove comments about hardcoded values.
[PATCH] [iproute2] SFQ: Support changing depth and divisor.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] Preparatory refactoring part 1.

2007-07-29 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   70 ++
 1 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 9579573..8ae077f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -77,6 +77,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -244,10 +247,8 @@ static unsigned int sfq_drop(struct Qdisc *sch)
return 0;
 }
 
-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, 
unsigned int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +257,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -270,12 +275,21 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen  q-limit-1) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
return 0;
}
 
+   sch-qstats.drops++;
sfq_drop(sch);
return NET_XMIT_CN;
 }
@@ -284,28 +298,8 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
-
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
+   sfq_q_enqueue(skb, q, SFQ_HEAD);
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x;
-   q-allot[x] = q-quantum;
-   } else {
-   q-next[x] = q-next[q-tail];
-   q-next[q-tail] = x;
-   q-tail = x;
-   }
-   }
if (++sch-q.qlen  q-limit - 1) {
sch-qstats.requeues++;
return 0;
@@ -316,13 +310,8 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
return NET_XMIT_CN;
 }
 
-
-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct sk_buff *skb;
sfq_index a, old_a;
 
@@ -335,8 +324,6 @@ sfq_dequeue(struct Qdisc* sch)
/* Grab packet */
skb = __skb_dequeue(q-qs[a]);
sfq_dec(q, a);
-   sch-q.qlen--;
-   sch-qstats.backlog -= skb-len;
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
@@ -353,6 +340,21 @@ sfq_dequeue(struct Qdisc* sch)
a = q-next[a];
q-allot[a] += q-quantum;
}
+
+   return skb;
+}
+
+static struct sk_buff
+*sfq_dequeue(struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sk_buff *skb;
+
+   skb = sfq_q_dequeue(q);
+   if (skb == NULL)
+   return NULL;
+   sch-q.qlen--;
+   sch-qstats.backlog -= skb-len;
return skb;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] Preparatory refactoring part 2.

2007-07-29 Thread Corey Hickey
Factor code out of sfq_init() and sfq_destroy(), again so that the
new functions can be used by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   80 +-
 1 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 8ae077f..0c46938 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,71 +380,71 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
+   int i;
 
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
return -EINVAL;
 
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-perturbation = 0;
+   q-max_depth = 0;
+   q-tail = q-limit = SFQ_DEPTH;
+   if (opt == NULL) {
+   q-perturb_period = 0;
+   } else {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   q-perturb_period = ctl-perturb_period*HZ;
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen = q-limit-1)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   if (ctl-limit)
+   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   }
 
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
-   add_timer(q-perturb_timer);
+   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   q-ht[i] = SFQ_DEPTH;
+   for (i=0; iSFQ_DEPTH; i++) {
+   skb_queue_head_init(q-qs[i]);
+   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
+   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   sch_tree_unlock(sch);
+
+   for (i=0; iSFQ_DEPTH; i++)
+   sfq_link(q, i);
return 0;
 }
 
 static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   int i;
+   int err;
+   
+   q-quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(q, opt)))
+   return err;
 
init_timer(q-perturb_timer);
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
-
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
-   q-ht[i] = SFQ_DEPTH;
-   for (i=0; iSFQ_DEPTH; i++) {
-   skb_queue_head_init(q-qs[i]);
-   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
-   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
-   }
-   q-limit = SFQ_DEPTH;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
}
-   for (i=0; iSFQ_DEPTH; i++)
-   sfq_link(q, i);
+
return 0;
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
 static void sfq_destroy(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   del_timer(q-perturb_timer);
+   sfq_q_destroy(q);
 }
 
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] Move two functions.

2007-07-29 Thread Corey Hickey
Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 0c46938..583f925 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,6 +380,17 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
@@ -420,7 +431,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
int err;
-   
+
q-quantum = psched_mtu(sch-dev); /* default */
if ((err = sfq_q_init(q, opt)))
return err;
@@ -436,17 +447,6 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
-static void sfq_q_destroy(struct sfq_sched_data *q)
-{
-   del_timer(q-perturb_timer);
-}
-
-static void sfq_destroy(struct Qdisc *sch)
-{
-   struct sfq_sched_data *q = qdisc_priv(sch);
-   sfq_q_destroy(q);
-}
-
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] Add depth.

2007-07-29 Thread Corey Hickey
Make depth (number of queues) user-configurable:

* replace #define with a parameter
* use old hardcoded value as a default
* kmalloc() arrays in sfq_q_init()
* free() arrays in sfq_q_destroy()

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   85 ---
 1 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 583f925..2ff6a27 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,14 +74,14 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type should contain at least depth*2 values */
+typedef unsigned int sfq_index;
 
 struct sfq_head
 {
@@ -95,6 +95,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,11 +104,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -164,7 +165,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -215,7 +216,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d+q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -238,7 +239,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -253,8 +254,8 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -265,7 +266,7 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -316,7 +317,7 @@ static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data 
*q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -327,10 +328,10 @@ static struct sk_buff *sfq_q_dequeue(struct 
sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail] = a;
@@ -383,6 +384,16 @@ static void sfq_perturbation(unsigned long arg)
 static void

[PATCH 5/7] Add divisor.

2007-07-29 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 2ff6a27..3e67a68 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,7 +75,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -96,6 +96,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,7 +104,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -118,7 +119,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -384,6 +385,8 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   if(q-ht)
+   kfree(q-ht);
if(q-dep)
kfree(q-dep);
if(q-next)
@@ -415,12 +418,14 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-max_depth = 0;
if (opt == NULL) {
q-perturb_period = 0;
+   q-hash_divisor = SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
} else {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
if (ctl-quantum)
q-quantum = ctl-quantum;
q-perturb_period = ctl-perturb_period*HZ;
+   q-hash_divisor = ctl-divisor ? : SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = ctl-flows ? : 
SFQ_DEPTH_DEFAULT;
 
if (q-depth  p - 1)
@@ -430,6 +435,9 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-limit = min_t(u32, ctl-limit, q-depth);
}
 
+   q-ht = kmalloc(q-hash_divisor*sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kmalloc((1+q-depth*2)*sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -446,7 +454,7 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; iq-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; iq-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -492,7 +500,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/7] Make qdisc changeable.

2007-07-29 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   51 ++-
 1 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 3e67a68..4cd523f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -490,6 +490,55 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   tmp.quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* copy packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+   
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   q-perturb_period = tmp.perturb_period;
+   q-quantum= tmp.quantum;
+   q-limit  = tmp.limit;
+   q-depth  = tmp.depth;
+   q-hash_divisor   = tmp.hash_divisor;
+   q-tail   = tmp.tail;
+   q-max_depth  = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;
+   }
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -524,7 +573,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [iproute2] SFQ: Support changing depth and divisor.

2007-07-29 Thread Corey Hickey
This can safely be applied either before or after the kernel
patches because the tc_sfq_qopt struct is unchanged:

- old kernels will ignore the parameters from new iproute2
- new kernels will use the same default parameters
---
 include/linux/pkt_sched.h |9 -
 tc/q_sfq.c|   21 -
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..37946d4 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -139,15 +139,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
-
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 05385cf..7754db7 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -25,7 +25,7 @@
 
 static void explain(void)
 {
-   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ perturb SECS ] [ 
quantum BYTES ]\n);
+   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ depth FLOWS ] [ 
divisor HASHBITS ] [ perturb SECS ] [ quantum BYTES ]\n);
 }
 
 #define usage() return(-1)
@@ -63,6 +63,25 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
ok++;
+   } else if (strcmp(*argv, depth) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.flows, *argv, 0)) {
+   fprintf(stderr, Illegal \depth\\n);
+   return -1;
+   }
+   ok++;
+   } else if (strcmp(*argv, divisor) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.divisor, *argv, 0)) {
+   fprintf(stderr, Illegal \divisor\\n);
+   return -1;
+   }
+   if (opt.divisor = 15) {
+   fprintf(stderr, Illegal \divisor\, must be  
15\n);
+   return -1;
+   }
+   opt.divisor = 1opt.divisor;
+   ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
return -1;
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] Remove comments about hardcoded values.

2007-07-29 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   17 +++--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..58a0ea6 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 4cd523f..8e84881 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -61,18 +61,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT1024
@@ -496,7 +485,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
struct sfq_sched_data tmp;
struct sk_buff *skb;
int err;
-   
+
/* set up tmp queue */
memset(tmp, 0, sizeof(struct sfq_sched_data));
tmp.quantum = psched_mtu(sch-dev); /* default */
@@ -509,7 +498,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
sfq_drop(sch);
while ((skb = sfq_q_dequeue(q)) != NULL)
sfq_q_enqueue(skb, tmp, SFQ_TAIL);
-   
+
/* clean up the old queue */
sfq_q_destroy(q);
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] SFQ: backport some features from ESFQ

2007-07-29 Thread Corey Hickey

Corey Hickey wrote:

Hello,

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.


Crud; I wasn't expecting git-send-email to thread the messages that way. 
I guess I should have used --no-chain-reply-to. If I ought re-send this 
series with that option, somebody please say the word and I'll do so.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] Add depth.

2007-07-29 Thread Corey Hickey

Michael Buesch wrote:

On Sunday 29 July 2007 09:08:51 Corey Hickey wrote:

p = d;
n = q-dep[d].next;
@@ -215,7 +216,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
 	if (d  1) {

-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d+q-depth].next;


Please q-dep[d + q-depth]
Makes it _much_ more readable. And doesn't confuse my brain with a
minus and a BiggerThan sign ;)


Ok.


@@ -383,6 +384,16 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   if(q-dep)
+   kfree(q-dep);
+   if(q-next)
+   kfree(q-next);
+   if(q-allot)
+   kfree(q-allot);
+   if(q-hash)
+   kfree(q-hash);
+   if(q-qs)
+   kfree(q-qs);


No need to check for !=NULL. kfree handles NULL.


Ok. Thanks.


 }
 
 static void sfq_destroy(struct Qdisc *sch)

@@ -394,6 +405,7 @@ static void sfq_destroy(struct Qdisc *sch)
 static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   sfq_index p = ~0U/2;
int i;
 
 	if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))

@@ -401,30 +413,53 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
 
 	q-perturbation = 0;

q-max_depth = 0;
-   q-tail = q-limit = SFQ_DEPTH;
if (opt == NULL) {
q-perturb_period = 0;
+   q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
} else {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
if (ctl-quantum)
q-quantum = ctl-quantum;
q-perturb_period = ctl-perturb_period*HZ;
+   q-tail = q-limit = q-depth = ctl-flows ? : 
SFQ_DEPTH_DEFAULT;
+
+   if (q-depth  p - 1)
+   return -EINVAL;


Compare depth against (~0U/2)-1? What's that doing? Should probably add a 
comment.


~0U/2 - 1 is the maximum value depth can be, based on how it is used in 
indexing q-dep. I agree, though, that deserves a comment. Actually, 
I'll also change it to '#define SFQ_DEPTH_MAX (~0U/2 - 1)' and put it 
near the top of the file next to the 'typedef unsigned int sfq_index;'.


I could also include limits.h and use UINT_MAX instead of ~0U; would 
that be preferable?


 
 		if (ctl-limit)

-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-limit = min_t(u32, ctl-limit, q-depth);
}
 
+	q-dep = kmalloc((1+q-depth*2)*sizeof(struct sfq_head), GFP_KERNEL);

+   if (!q-dep)
+   goto err_case;
+   q-next = kmalloc(q-depth*sizeof(sfq_index), GFP_KERNEL);
+   if (!q-next)
+   goto err_case;
+   q-allot = kmalloc(q-depth*sizeof(short), GFP_KERNEL);
+   if (!q-allot)
+   goto err_case;
+   q-hash = kmalloc(q-depth*sizeof(unsigned short), GFP_KERNEL);
+   if (!q-hash)
+   goto err_case;
+   q-qs = kmalloc(q-depth*sizeof(struct sk_buff_head), GFP_KERNEL);
+   if (!q-qs)
+   goto err_case;


You may chose to use kcalloc for array allocations.


The arrays in the original code don't get zeroed either, so that 
shouldn't be necessary (and I haven't heard of any problems so far). Do 
you suggest I use kcalloc() anyway, just as a good practice?



for (i=0; iSFQ_HASH_DIVISOR; i++)
-   q-ht[i] = SFQ_DEPTH;
-   for (i=0; iSFQ_DEPTH; i++) {
+   q-ht[i] = q-depth;
+   for (i=0; iq-depth; i++) {
skb_queue_head_init(q-qs[i]);
-   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
-   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
+   q-dep[i+q-depth].next = i+q-depth;
+   q-dep[i+q-depth].prev = i+q-depth;
}
 
-	for (i=0; iSFQ_DEPTH; i++)

+   for (i=0; iq-depth; i++)
sfq_link(q, i);
return 0;
+err_case:


This leaks a few kmallocs.


Are you saying that the 'err_case:' leaks kmallocs? It calls 
sfq_q_destroy(q), which kfrees each of the arrays: dep, next, allot, 
hash, and qs. Is that sufficient, or am I missing something or 
misunderstanding you?



+   sfq_q_destroy(q);
+   return -ENOBUFS;
 }


Thank you for your review. Could you please clarify the questions I 
have? I'll make, test, and submit a revision of this patch after that.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] Move two functions.

2007-07-29 Thread Corey Hickey
Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 0c46938..583f925 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,6 +380,17 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
@@ -420,7 +431,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
int err;
-   
+
q-quantum = psched_mtu(sch-dev); /* default */
if ((err = sfq_q_init(q, opt)))
return err;
@@ -436,17 +447,6 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
-static void sfq_q_destroy(struct sfq_sched_data *q)
-{
-   del_timer(q-perturb_timer);
-}
-
-static void sfq_destroy(struct Qdisc *sch)
-{
-   struct sfq_sched_data *q = qdisc_priv(sch);
-   sfq_q_destroy(q);
-}
-
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] Preparatory refactoring part 2.

2007-07-29 Thread Corey Hickey
Factor code out of sfq_init() and sfq_destroy(), again so that the
new functions can be used by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   80 +-
 1 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 8ae077f..0c46938 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,71 +380,71 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
+   int i;
 
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
return -EINVAL;
 
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-perturbation = 0;
+   q-max_depth = 0;
+   q-tail = q-limit = SFQ_DEPTH;
+   if (opt == NULL) {
+   q-perturb_period = 0;
+   } else {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   q-perturb_period = ctl-perturb_period*HZ;
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen = q-limit-1)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   if (ctl-limit)
+   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   }
 
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
-   add_timer(q-perturb_timer);
+   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   q-ht[i] = SFQ_DEPTH;
+   for (i=0; iSFQ_DEPTH; i++) {
+   skb_queue_head_init(q-qs[i]);
+   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
+   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   sch_tree_unlock(sch);
+
+   for (i=0; iSFQ_DEPTH; i++)
+   sfq_link(q, i);
return 0;
 }
 
 static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   int i;
+   int err;
+   
+   q-quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(q, opt)))
+   return err;
 
init_timer(q-perturb_timer);
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
-
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
-   q-ht[i] = SFQ_DEPTH;
-   for (i=0; iSFQ_DEPTH; i++) {
-   skb_queue_head_init(q-qs[i]);
-   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
-   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
-   }
-   q-limit = SFQ_DEPTH;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
}
-   for (i=0; iSFQ_DEPTH; i++)
-   sfq_link(q, i);
+
return 0;
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
 static void sfq_destroy(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   del_timer(q-perturb_timer);
+   sfq_q_destroy(q);
 }
 
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


SFQ: backport some features from ESFQ (try 2)

2007-07-29 Thread Corey Hickey
Hello,

Patchset try 2 addresses the review by Michael Buesch.

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

A patch for iproute2 is included after the end of the kernel patch series.

Thanks for your consideration,
Corey


 include/linux/pkt_sched.h |8 --
 net/sched/sch_sfq.c   |  296 -
 2 files changed, 187 insertions(+), 117 deletions(-)

[PATCH 1/7] Preparatory refactoring part 1.
[PATCH 2/7] Preparatory refactoring part 2.
[PATCH 3/7] Move two functions.
[PATCH 4/7] Add depth.
[PATCH 5/7] Add divisor.
[PATCH 6/7] Make qdisc changeable.
[PATCH 7/7] Remove comments about hardcoded values.
[PATCH] [iproute2] SFQ: Support changing depth and divisor.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] Add divisor.

2007-07-29 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 70124ac..e6a6a21 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,7 +75,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -98,6 +98,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -105,7 +106,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -120,7 +121,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -386,6 +387,7 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   kfree(q-ht);
kfree(q-dep);
kfree(q-next);
kfree(q-allot);
@@ -411,12 +413,14 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-max_depth = 0;
if (opt == NULL) {
q-perturb_period = 0;
+   q-hash_divisor = SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
} else {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
if (ctl-quantum)
q-quantum = ctl-quantum;
q-perturb_period = ctl-perturb_period*HZ;
+   q-hash_divisor = ctl-divisor ? : SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = ctl-flows ? : 
SFQ_DEPTH_DEFAULT;
 
if (q-depth  SFQ_MAX_DEPTH)
@@ -426,6 +430,9 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-limit = min_t(u32, ctl-limit, q-depth);
}
 
+   q-ht = kcalloc(q-hash_divisor, sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kcalloc(1 + q-depth*2, sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -442,7 +449,7 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; iq-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; i  q-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -488,7 +495,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] Preparatory refactoring part 1.

2007-07-29 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   70 ++
 1 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 9579573..8ae077f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -77,6 +77,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -244,10 +247,8 @@ static unsigned int sfq_drop(struct Qdisc *sch)
return 0;
 }
 
-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, 
unsigned int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +257,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -270,12 +275,21 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen  q-limit-1) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
return 0;
}
 
+   sch-qstats.drops++;
sfq_drop(sch);
return NET_XMIT_CN;
 }
@@ -284,28 +298,8 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
-
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
+   sfq_q_enqueue(skb, q, SFQ_HEAD);
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x;
-   q-allot[x] = q-quantum;
-   } else {
-   q-next[x] = q-next[q-tail];
-   q-next[q-tail] = x;
-   q-tail = x;
-   }
-   }
if (++sch-q.qlen  q-limit - 1) {
sch-qstats.requeues++;
return 0;
@@ -316,13 +310,8 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
return NET_XMIT_CN;
 }
 
-
-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct sk_buff *skb;
sfq_index a, old_a;
 
@@ -335,8 +324,6 @@ sfq_dequeue(struct Qdisc* sch)
/* Grab packet */
skb = __skb_dequeue(q-qs[a]);
sfq_dec(q, a);
-   sch-q.qlen--;
-   sch-qstats.backlog -= skb-len;
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
@@ -353,6 +340,21 @@ sfq_dequeue(struct Qdisc* sch)
a = q-next[a];
q-allot[a] += q-quantum;
}
+
+   return skb;
+}
+
+static struct sk_buff
+*sfq_dequeue(struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sk_buff *skb;
+
+   skb = sfq_q_dequeue(q);
+   if (skb == NULL)
+   return NULL;
+   sch-q.qlen--;
+   sch-qstats.backlog -= skb-len;
return skb;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] Add depth.

2007-07-29 Thread Corey Hickey
Make depth (number of queues) user-configurable:

* replace #define with a parameter
* use old hardcoded value as a default
* kcalloc() arrays in sfq_q_init()
* free() arrays in sfq_q_destroy()

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   81 +++
 1 files changed, 56 insertions(+), 25 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 583f925..70124ac 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,14 +74,16 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type must contain greater than depth*2 values, so depth is constrained 
+ * accordingly. */
+typedef unsigned int sfq_index;
+#define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
 struct sfq_head
 {
@@ -95,6 +97,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,11 +106,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -164,7 +167,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -215,7 +218,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d + q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -238,7 +241,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -253,8 +256,8 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -265,7 +268,7 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -316,7 +319,7 @@ static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data 
*q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -327,10 +330,10 @@ static struct sk_buff *sfq_q_dequeue(struct 
sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail

[PATCH 6/7] Make qdisc changeable.

2007-07-29 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   51 ++-
 1 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index e6a6a21..e042cd0 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -485,6 +485,55 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   tmp.quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* copy packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+   
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   q-perturb_period = tmp.perturb_period;
+   q-quantum= tmp.quantum;
+   q-limit  = tmp.limit;
+   q-depth  = tmp.depth;
+   q-hash_divisor   = tmp.hash_divisor;
+   q-tail   = tmp.tail;
+   q-max_depth  = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;
+   }
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -519,7 +568,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] Remove comments about hardcoded values.

2007-07-29 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   17 +++--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..58a0ea6 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index e042cd0..3890452 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -61,18 +61,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT1024
@@ -491,7 +480,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
struct sfq_sched_data tmp;
struct sk_buff *skb;
int err;
-   
+
/* set up tmp queue */
memset(tmp, 0, sizeof(struct sfq_sched_data));
tmp.quantum = psched_mtu(sch-dev); /* default */
@@ -504,7 +493,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
sfq_drop(sch);
while ((skb = sfq_q_dequeue(q)) != NULL)
sfq_q_enqueue(skb, tmp, SFQ_TAIL);
-   
+
/* clean up the old queue */
sfq_q_destroy(q);
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [LARTC] ESFQ: request for user input

2007-06-24 Thread Corey Hickey
Patrick McHardy wrote:
 Corey Hickey wrote:
 Patrick McHardy wrote:
 Should ESFQ be merged into SFQ or remain as a separate qdisc?
   
 I've CCed netdev. I think merging parts of ESFQ (dynamic depth and
 flow number) would make a lot of sense, but I'm intending to submit
 an alternative to the ESFQ hashing scheme for 2.6.23:

 http://www.mail-archive.com/netdev@vger.kernel.org/msg39156.html
 
 Nice. I wasn't aware of that. Your patch looks like it supersedes ESFQ's
 hashing, so, if it gets applied, that already removes a large chunk of
 the differences between SFQ and ESFQ.

 If I don't hear any opposition, then I'll keep an eye out for when your
 patch gets accepted (assuming it does) and then submit patch(es) porting
 the rest of ESFQ's features to SFQ.
   
 
 I think it would be best if you would start posting patches
 to add the missing features (without the hash changes) to SFQ,
 if you're quick this may already go in during the 2.6.23 merge
 window. My changes are mostly independant of yours, if there
 are any clashes the one who goes last will just have to rediff
 their patches :)
 
 Since you need to pass additional parameters to SFQ for your
 changes, have a look at my rtnetlink compat attribute patch:
 
 http://article.gmane.org/gmane.linux.network/64851

Ok, I'll work on it later. Thanks.

-Corey

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html