[RESEND][PATCH 1/3] PPPoE: improved hashing routine

2007-07-29 Thread Florian Zumbiehl
Hi,

I'm not sure whether this is really worth it, but it looked so
extremely inefficient that I couldn't resist - so let's hope providers
will keep PPPoE around for a while, at least until terabit dsl ;-)

The new code produces the same results as the old version and is
~ 3 to 6 times faster for 4-bit hashes on the CPUs I tested.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9e51fcc..954328c 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -108,19 +108,24 @@ static inline int cmp_addr(struct pppoe_addr *a, unsigned 
long sid, char *addr)
(memcmp(a-remote,addr,ETH_ALEN) == 0));
 }
 
-static int hash_item(unsigned long sid, unsigned char *addr)
+#if 8%PPPOE_HASH_BITS
+#error 8 must be a multiple of PPPOE_HASH_BITS
+#endif
+
+static int hash_item(unsigned int sid, unsigned char *addr)
 {
-   char hash = 0;
-   int i, j;
+   unsigned char hash = 0;
+   unsigned int i;
 
-   for (i = 0; i  ETH_ALEN ; ++i) {
-   for (j = 0; j  8/PPPOE_HASH_BITS ; ++j) {
-   hash ^= addr[i]  ( j * PPPOE_HASH_BITS );
-   }
+   for (i = 0 ; i  ETH_ALEN ; i++) {
+   hash ^= addr[i];
+   }
+   for (i = 0 ; i  sizeof(sid_t)*8 ; i += 8 ){
+   hash ^= sidi;
+   }
+   for (i = 8 ; (i=1) = PPPOE_HASH_BITS ; ) {
+   hash ^= hashi;
}
-
-   for (i = 0; i  (sizeof(unsigned long)*8) / PPPOE_HASH_BITS ; ++i)
-   hash ^= sid  (i*PPPOE_HASH_BITS);
 
return hash  ( PPPOE_HASH_SIZE - 1 );
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RESEND][PATCH 2/3] PPPoX/E: return ENOTTY on unknown ioctl requests

2007-07-29 Thread Florian Zumbiehl
Hi,

here another patch for the PPPoX/E code that makes sure that ENOTTY is
returned for unknown ioctl requests rather than 0 (and removes another
unneeded initializer which I didn't bother creating a separate patch for).

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 954328c..9554924 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -669,8 +669,8 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
 {
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int val = 0;
-   int err = 0;
+   int val;
+   int err;
 
switch (cmd) {
case PPPIOCGMRU:
@@ -759,8 +759,9 @@ static int pppoe_ioctl(struct socket *sock, unsigned int 
cmd,
err = 0;
break;
 
-   default:;
-   };
+   default:
+   err = -ENOTTY;
+   }
 
return err;
 }
diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 3f8115d..51de561 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -72,7 +72,7 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
 {
struct sock *sk = sock-sk;
struct pppox_sock *po = pppox_sk(sk);
-   int rc = 0;
+   int rc;
 
lock_sock(sk);
 
@@ -93,12 +93,9 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
break;
}
default:
-   if (pppox_protos[sk-sk_protocol]-ioctl)
-   rc = pppox_protos[sk-sk_protocol]-ioctl(sock, cmd,
- arg);
-
-   break;
-   };
+   rc = pppox_protos[sk-sk_protocol]-ioctl ?
+   pppox_protos[sk-sk_protocol]-ioctl(sock, cmd, arg) : 
-ENOTTY;
+   }
 
release_sock(sk);
return rc;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RESEND][PATCH 3/3] PPPoE: move lock_sock() in pppoe_sendmsg() to the right location

2007-07-29 Thread Florian Zumbiehl
Hi,

and the last one for now: Acquire the sock lock in pppoe_sendmsg()
before accessing the sock - and in particular avoid releasing the lock
even though it hasn't been acquired.

Florian

---
Signed-off-by: Florian Zumbiehl [EMAIL PROTECTED]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 9554924..eef8a5b 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -779,6 +779,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
struct net_device *dev;
char *start;
 
+   lock_sock(sk);
if (sock_flag(sk, SOCK_DEAD) || !(sk-sk_state  PPPOX_CONNECTED)) {
error = -ENOTCONN;
goto end;
@@ -789,8 +790,6 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
hdr.code = 0;
hdr.sid = po-num;
 
-   lock_sock(sk);
-
dev = po-pppoe_dev;
 
error = -EMSGSIZE;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/7] SFQ: backport some features from ESFQ

2007-07-29 Thread Corey Hickey
Hello,

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

A patch for iproute2 is included after the end of the kernel patch series.

Thanks for your consideration,
Corey


 include/linux/pkt_sched.h |8 --
 net/sched/sch_sfq.c   |  301 +
 2 files changed, 192 insertions(+), 117 deletions(-)

[PATCH 1/7] Preparatory refactoring part 1.
[PATCH 2/7] Preparatory refactoring part 2.
[PATCH 3/7] Move two functions.
[PATCH 4/7] Add depth.
[PATCH 5/7] Add divisor.
[PATCH 6/7] Make qdisc changeable.
[PATCH 7/7] Remove comments about hardcoded values.
[PATCH] [iproute2] SFQ: Support changing depth and divisor.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] Preparatory refactoring part 1.

2007-07-29 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   70 ++
 1 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 9579573..8ae077f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -77,6 +77,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -244,10 +247,8 @@ static unsigned int sfq_drop(struct Qdisc *sch)
return 0;
 }
 
-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, 
unsigned int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +257,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -270,12 +275,21 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen  q-limit-1) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
return 0;
}
 
+   sch-qstats.drops++;
sfq_drop(sch);
return NET_XMIT_CN;
 }
@@ -284,28 +298,8 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
-
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
+   sfq_q_enqueue(skb, q, SFQ_HEAD);
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x;
-   q-allot[x] = q-quantum;
-   } else {
-   q-next[x] = q-next[q-tail];
-   q-next[q-tail] = x;
-   q-tail = x;
-   }
-   }
if (++sch-q.qlen  q-limit - 1) {
sch-qstats.requeues++;
return 0;
@@ -316,13 +310,8 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
return NET_XMIT_CN;
 }
 
-
-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct sk_buff *skb;
sfq_index a, old_a;
 
@@ -335,8 +324,6 @@ sfq_dequeue(struct Qdisc* sch)
/* Grab packet */
skb = __skb_dequeue(q-qs[a]);
sfq_dec(q, a);
-   sch-q.qlen--;
-   sch-qstats.backlog -= skb-len;
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
@@ -353,6 +340,21 @@ sfq_dequeue(struct Qdisc* sch)
a = q-next[a];
q-allot[a] += q-quantum;
}
+
+   return skb;
+}
+
+static struct sk_buff
+*sfq_dequeue(struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sk_buff *skb;
+
+   skb = sfq_q_dequeue(q);
+   if (skb == NULL)
+   return NULL;
+   sch-q.qlen--;
+   sch-qstats.backlog -= skb-len;
return skb;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] Preparatory refactoring part 2.

2007-07-29 Thread Corey Hickey
Factor code out of sfq_init() and sfq_destroy(), again so that the
new functions can be used by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   80 +-
 1 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 8ae077f..0c46938 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,71 +380,71 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
+   int i;
 
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
return -EINVAL;
 
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-perturbation = 0;
+   q-max_depth = 0;
+   q-tail = q-limit = SFQ_DEPTH;
+   if (opt == NULL) {
+   q-perturb_period = 0;
+   } else {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   q-perturb_period = ctl-perturb_period*HZ;
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen = q-limit-1)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   if (ctl-limit)
+   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   }
 
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
-   add_timer(q-perturb_timer);
+   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   q-ht[i] = SFQ_DEPTH;
+   for (i=0; iSFQ_DEPTH; i++) {
+   skb_queue_head_init(q-qs[i]);
+   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
+   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   sch_tree_unlock(sch);
+
+   for (i=0; iSFQ_DEPTH; i++)
+   sfq_link(q, i);
return 0;
 }
 
 static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   int i;
+   int err;
+   
+   q-quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(q, opt)))
+   return err;
 
init_timer(q-perturb_timer);
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
-
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
-   q-ht[i] = SFQ_DEPTH;
-   for (i=0; iSFQ_DEPTH; i++) {
-   skb_queue_head_init(q-qs[i]);
-   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
-   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
-   }
-   q-limit = SFQ_DEPTH;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
}
-   for (i=0; iSFQ_DEPTH; i++)
-   sfq_link(q, i);
+
return 0;
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
 static void sfq_destroy(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   del_timer(q-perturb_timer);
+   sfq_q_destroy(q);
 }
 
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Willy Tarreau
Hi,

[CCing netdev as it's where people competent on the subject are]

On Sun, Jul 29, 2007 at 06:59:26AM +0100, Darryl L. Miles wrote:
 CLIENT = Linux 2.6.20.1-smp [Customer build]
 SERVER = Linux 2.6.9-55.ELsmp [Red Hat Enterprise Linux AS release 4 
 (Nahant Update 5)]
 
 The problems start around time index 09:21:39.860302 when the CLIENT issues 
 a TCP packet with SACK option set (seemingly for a data segment which has 
 already been seen) from that point on the connection hangs.

Where was the capture taken ? on CLIENT or on SERVER (I suspect client from
the timers) ? A possible, but very unlikely reason would be an MTU limitation
somewhere, because the segment which never gets correctly ACKed is also the
largest one in this trace. It would be strange as the SACKs are send
immediately after, but it should be something to consider anyway.

Also, if everything is right with the packets on the server side, then
it would means that it's the RHEL kernel which is buggy (which does not
mean that the same bug does not exist in mainline).

Regards,
Willy


 Full dump available via email.
 
 
 09:12:13.444999 IP CLIENT.50727  SERVER.ssh: S 2919512080:2919512080(0) 
 win 5840 mss 1460,sackOK,timestamp 799860259 0,nop,wscale 7
 09:12:13.535643 IP SERVER.ssh  CLIENT.50727: S 492909547:492909547(0) ack 
 2919512081 win 5792 mss 1460,sackOK,timestamp 7126976 799860259,nop,wscale 
 2
 09:12:13.535717 IP CLIENT.50727  SERVER.ssh: . ack 1 win 46 
 nop,nop,timestamp 799860282 7126976
 09:12:13.665481 IP SERVER.ssh  CLIENT.50727: P 1:24(23) ack 1 win 1448 
 nop,nop,timestamp 7127074 799860282
 09:12:13.665895 IP CLIENT.50727  SERVER.ssh: . ack 24 win 46 
 nop,nop,timestamp 799860314 7127074
 09:12:13.666044 IP CLIENT.50727  SERVER.ssh: P 1:23(22) ack 24 win 46 
 nop,nop,timestamp 799860314 7127074
 09:12:13.776318 IP SERVER.ssh  CLIENT.50727: . ack 23 win 1448 
 nop,nop,timestamp 7127216 799860314
 
 ...SNIPPED SUCCESSFUL TRAFFIC...
 
 09:21:39.490740 IP SERVER.ssh  CLIENT.50727: P 18200:18464(264) ack 2991 
 win 2728 nop,nop,timestamp 7692910 81727
 09:21:39.490775 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 81755 7692910
 09:21:39.860245 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7693293 81749
 09:21:39.860302 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 81847 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:21:40.453440 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7693887 81749
 09:21:40.453495 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 81996 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:21:41.641821 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7695075 81749
 09:21:41.641938 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 82293 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:21:44.017552 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7697451 81749
 09:21:44.017622 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 82887 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:21:48.770051 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7702203 81749
 09:21:48.770099 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 84075 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:21:58.274061 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7711707 81749
 09:21:58.274180 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 86450 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:22:17.282035 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7730715 81749
 09:22:17.282153 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 800011202 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:22:55.298955 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7768731 81749
 09:22:55.299023 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 800020705 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:24:11.329853 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7844763 81749
 09:24:11.329923 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 800039711 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:26:11.331578 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
 win 2728 nop,nop,timestamp 7964763 81749
 09:26:11.331699 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 800069708 7692910,nop,nop,sack sack 1 {12408:13856} 
 09:26:13.011885 IP CLIENT.50727  SERVER.ssh: P 2991:3039(48) ack 18464 win 
 378 nop,nop,timestamp 800070128 7692910
 09:26:13.309032 IP CLIENT.50727  SERVER.ssh: P 2991:3039(48) ack 18464 win 
 378 

[PATCH 3/7] Move two functions.

2007-07-29 Thread Corey Hickey
Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 0c46938..583f925 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,6 +380,17 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
@@ -420,7 +431,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
int err;
-   
+
q-quantum = psched_mtu(sch-dev); /* default */
if ((err = sfq_q_init(q, opt)))
return err;
@@ -436,17 +447,6 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
-static void sfq_q_destroy(struct sfq_sched_data *q)
-{
-   del_timer(q-perturb_timer);
-}
-
-static void sfq_destroy(struct Qdisc *sch)
-{
-   struct sfq_sched_data *q = qdisc_priv(sch);
-   sfq_q_destroy(q);
-}
-
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] Add depth.

2007-07-29 Thread Corey Hickey
Make depth (number of queues) user-configurable:

* replace #define with a parameter
* use old hardcoded value as a default
* kmalloc() arrays in sfq_q_init()
* free() arrays in sfq_q_destroy()

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   85 ---
 1 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 583f925..2ff6a27 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,14 +74,14 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type should contain at least depth*2 values */
+typedef unsigned int sfq_index;
 
 struct sfq_head
 {
@@ -95,6 +95,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,11 +104,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -164,7 +165,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -215,7 +216,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d+q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -238,7 +239,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -253,8 +254,8 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -265,7 +266,7 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -316,7 +317,7 @@ static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data 
*q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -327,10 +328,10 @@ static struct sk_buff *sfq_q_dequeue(struct 
sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail] = a;
@@ -383,6 +384,16 @@ static void sfq_perturbation(unsigned long arg)
 static void 

[PATCH 5/7] Add divisor.

2007-07-29 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 2ff6a27..3e67a68 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,7 +75,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -96,6 +96,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,7 +104,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -118,7 +119,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -384,6 +385,8 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   if(q-ht)
+   kfree(q-ht);
if(q-dep)
kfree(q-dep);
if(q-next)
@@ -415,12 +418,14 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-max_depth = 0;
if (opt == NULL) {
q-perturb_period = 0;
+   q-hash_divisor = SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
} else {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
if (ctl-quantum)
q-quantum = ctl-quantum;
q-perturb_period = ctl-perturb_period*HZ;
+   q-hash_divisor = ctl-divisor ? : SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = ctl-flows ? : 
SFQ_DEPTH_DEFAULT;
 
if (q-depth  p - 1)
@@ -430,6 +435,9 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-limit = min_t(u32, ctl-limit, q-depth);
}
 
+   q-ht = kmalloc(q-hash_divisor*sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kmalloc((1+q-depth*2)*sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -446,7 +454,7 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; iq-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; iq-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -492,7 +500,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/7] Make qdisc changeable.

2007-07-29 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   51 ++-
 1 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 3e67a68..4cd523f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -490,6 +490,55 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   tmp.quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* copy packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+   
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   q-perturb_period = tmp.perturb_period;
+   q-quantum= tmp.quantum;
+   q-limit  = tmp.limit;
+   q-depth  = tmp.depth;
+   q-hash_divisor   = tmp.hash_divisor;
+   q-tail   = tmp.tail;
+   q-max_depth  = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;
+   }
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -524,7 +573,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [iproute2] SFQ: Support changing depth and divisor.

2007-07-29 Thread Corey Hickey
This can safely be applied either before or after the kernel
patches because the tc_sfq_qopt struct is unchanged:

- old kernels will ignore the parameters from new iproute2
- new kernels will use the same default parameters
---
 include/linux/pkt_sched.h |9 -
 tc/q_sfq.c|   21 -
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index d10f353..37946d4 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -139,15 +139,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
-
 /* RED section */
 
 enum
diff --git a/tc/q_sfq.c b/tc/q_sfq.c
index 05385cf..7754db7 100644
--- a/tc/q_sfq.c
+++ b/tc/q_sfq.c
@@ -25,7 +25,7 @@
 
 static void explain(void)
 {
-   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ perturb SECS ] [ 
quantum BYTES ]\n);
+   fprintf(stderr, Usage: ... sfq [ limit NUMBER ] [ depth FLOWS ] [ 
divisor HASHBITS ] [ perturb SECS ] [ quantum BYTES ]\n);
 }
 
 #define usage() return(-1)
@@ -63,6 +63,25 @@ static int sfq_parse_opt(struct qdisc_util *qu, int argc, 
char **argv, struct nl
return -1;
}
ok++;
+   } else if (strcmp(*argv, depth) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.flows, *argv, 0)) {
+   fprintf(stderr, Illegal \depth\\n);
+   return -1;
+   }
+   ok++;
+   } else if (strcmp(*argv, divisor) == 0) {
+   NEXT_ARG();
+   if (get_unsigned(opt.divisor, *argv, 0)) {
+   fprintf(stderr, Illegal \divisor\\n);
+   return -1;
+   }
+   if (opt.divisor = 15) {
+   fprintf(stderr, Illegal \divisor\, must be  
15\n);
+   return -1;
+   }
+   opt.divisor = 1opt.divisor;
+   ok++;
} else if (strcmp(*argv, help) == 0) {
explain();
return -1;
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] Remove comments about hardcoded values.

2007-07-29 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   17 +++--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..58a0ea6 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 4cd523f..8e84881 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -61,18 +61,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT1024
@@ -496,7 +485,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
struct sfq_sched_data tmp;
struct sk_buff *skb;
int err;
-   
+
/* set up tmp queue */
memset(tmp, 0, sizeof(struct sfq_sched_data));
tmp.quantum = psched_mtu(sch-dev); /* default */
@@ -509,7 +498,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
sfq_drop(sch);
while ((skb = sfq_q_dequeue(q)) != NULL)
sfq_q_enqueue(skb, tmp, SFQ_TAIL);
-   
+
/* clean up the old queue */
sfq_q_destroy(q);
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] fix endianness bug in l2cap_sock_listen()

2007-07-29 Thread David Miller
From: Marcel Holtmann [EMAIL PROTECTED]
Date: Fri, 27 Jul 2007 16:41:18 +0200

  We loop through psm values, calling __l2cap_get_sock_by_addr(psm, ...)
  until we get NULL; then we set -psm of our socket to htobs(psm).
  IOW, we find unused psm value and put it into our socket.  So far, so
  good, but...  __l2cap_get_sock_by_addr() compares its argument with
  -psm of sockets.  IOW, the entire thing works correctly only on
  little-endian.  On big-endian we'll get no socket with such psm
  on the first iteration, since we won't find a socket with -psm == 0x1001.
  We will happily conclude that 0x1001 is unused and slap htobs(0x1001)
  (i.e. 0x110) into -psm of our socket.  Of course, the next time around
  the same thing will repeat and we'll just get a fsckload of sockets
  with the same -psm assigned.
  
  Fix: pass htobs(psm) to __l2cap_get_sock_by_addr() there.  All other
  callers are already passing little-endian values and all places that
  store something in -psm are storing little-endian.
  
  Signed-off-by: Al Viro [EMAIL PROTECTED]
 
 Signed-off-by: Marcel Holtmann [EMAIL PROTECTED]

Applied.

 Dave, all four patches are good and should go in sooner than later. If
 you want me to put them into my tree first, then I can do that or you
 can apply them directly. You choice.

I'll take care of these, thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] l2cap: endianness annotations

2007-07-29 Thread David Miller
From: Marcel Holtmann [EMAIL PROTECTED]
Date: Fri, 27 Jul 2007 16:41:21 +0200

  no code changes, just documenting existing types
  
  Signed-off-by: Al Viro [EMAIL PROTECTED]
 
 Signed-off-by: Marcel Holtmann [EMAIL PROTECTED]

Applied.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] SFQ: backport some features from ESFQ

2007-07-29 Thread Corey Hickey

Corey Hickey wrote:

Hello,

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.


Crud; I wasn't expecting git-send-email to thread the messages that way. 
I guess I should have used --no-chain-reply-to. If I ought re-send this 
series with that option, somebody please say the word and I'll do so.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] pass (host-endian) cmd length as explicit argument to l2cap_conf_req()

2007-07-29 Thread David Miller
From: Marcel Holtmann [EMAIL PROTECTED]
Date: Fri, 27 Jul 2007 16:41:22 +0200

  Signed-off-by: Al Viro [EMAIL PROTECTED]
 
 Signed-off-by: Marcel Holtmann [EMAIL PROTECTED]

Applied.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] l2cap: don't mangle cmd.len

2007-07-29 Thread David Miller
From: Marcel Holtmann [EMAIL PROTECTED]
Date: Fri, 27 Jul 2007 16:41:26 +0200

  Since nobody uses it after we convert it to host-endian,
  no need to do that at all.  At that point l2cap is endian-clean.
  
  Signed-off-by: Al Viro [EMAIL PROTECTED]
 
 Signed-off-by: Marcel Holtmann [EMAIL PROTECTED]

Also applied, thanks a lot.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] SFQ: backport some features from ESFQ

2007-07-29 Thread David Miller
From: Corey Hickey [EMAIL PROTECTED]
Date: Sun, 29 Jul 2007 00:17:15 -0700

 Corey Hickey wrote:
  Hello,
  
  This set of patches adds some of ESFQ's modifications to the original 
  SFQ. Thus far, I have received support for this approach rather than for 
  trying to get ESFQ included as a separate qdisc.
 
 Crud; I wasn't expecting git-send-email to thread the messages that way. 
 I guess I should have used --no-chain-reply-to. If I ought re-send this 
 series with that option, somebody please say the word and I'll do so.

It's not necessary, let some feedback come in first.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Ilpo Järvinen
On Sun, 29 Jul 2007, Willy Tarreau wrote:

 On Sun, Jul 29, 2007 at 06:59:26AM +0100, Darryl L. Miles wrote:
  CLIENT = Linux 2.6.20.1-smp [Customer build]
  SERVER = Linux 2.6.9-55.ELsmp [Red Hat Enterprise Linux AS release 4 
  (Nahant Update 5)]
  
  The problems start around time index 09:21:39.860302 when the CLIENT issues 
  a TCP packet with SACK option set (seemingly for a data segment which has 
  already been seen) from that point on the connection hangs.

...That's DSACK and it's being correctly sent. To me, it seems unlikely to 
be the cause for this breakage...

 Where was the capture taken ? on CLIENT or on SERVER (I suspect client from
 the timers) ? 

...I would guess the same based on SYN timestamps (and from the DSACK 
timestamps)...

 A possible, but very unlikely reason would be an MTU limitation
 somewhere, because the segment which never gets correctly ACKed is also the
 largest one in this trace.

Limitation for 48 byte segments? You have to be kidding... :-) But yes,
it seems that one of the directions is dropping packets for some reason 
though I would not assume MTU limitation... Or did you mean some other 
segment?

 Also, if everything is right with the packets on the server side, then
 it would means that it's the RHEL kernel which is buggy (which does not
 mean that the same bug does not exist in mainline).

...There are two independed signs that server side is not getting the 
packets (ack field is not advanced nor do the retransmissions of an old 
data cease) so IMHO that would the first thing to rule out.

  Full dump available via email.
  
  
  09:12:13.444999 IP CLIENT.50727  SERVER.ssh: S 2919512080:2919512080(0) 
  win 5840 mss 1460,sackOK,timestamp 799860259 0,nop,wscale 7
  09:12:13.535643 IP SERVER.ssh  CLIENT.50727: S 492909547:492909547(0) ack 
  2919512081 win 5792 mss 1460,sackOK,timestamp 7126976 799860259,nop,wscale 
  2
  09:12:13.535717 IP CLIENT.50727  SERVER.ssh: . ack 1 win 46 
  nop,nop,timestamp 799860282 7126976
  09:12:13.665481 IP SERVER.ssh  CLIENT.50727: P 1:24(23) ack 1 win 1448 
  nop,nop,timestamp 7127074 799860282
  09:12:13.665895 IP CLIENT.50727  SERVER.ssh: . ack 24 win 46 
  nop,nop,timestamp 799860314 7127074
  09:12:13.666044 IP CLIENT.50727  SERVER.ssh: P 1:23(22) ack 24 win 46 
  nop,nop,timestamp 799860314 7127074
  09:12:13.776318 IP SERVER.ssh  CLIENT.50727: . ack 23 win 1448 
  nop,nop,timestamp 7127216 799860314
  
  ...SNIPPED SUCCESSFUL TRAFFIC...
  
  09:21:39.490740 IP SERVER.ssh  CLIENT.50727: P 18200:18464(264) ack 2991 
  win 2728 nop,nop,timestamp 7692910 81727
  09:21:39.490775 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 81755 7692910
  09:21:39.860245 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7693293 81749
  09:21:39.860302 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 81847 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:21:40.453440 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7693887 81749
  09:21:40.453495 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 81996 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:21:41.641821 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7695075 81749
  09:21:41.641938 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 82293 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:21:44.017552 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7697451 81749
  09:21:44.017622 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 82887 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:21:48.770051 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7702203 81749
  09:21:48.770099 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 84075 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:21:58.274061 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7711707 81749
  09:21:58.274180 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 86450 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:22:17.282035 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7730715 81749
  09:22:17.282153 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 800011202 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:22:55.298955 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7768731 81749
  09:22:55.299023 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
  nop,nop,timestamp 800020705 7692910,nop,nop,sack sack 1 {12408:13856} 
  09:24:11.329853 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 2991 
  win 2728 nop,nop,timestamp 7844763 81749
  09:24:11.329923 IP CLIENT.50727  SERVER.ssh: . ack 

Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Willy Tarreau
On Sun, Jul 29, 2007 at 11:26:00AM +0300, Ilpo Järvinen wrote:
 On Sun, 29 Jul 2007, Willy Tarreau wrote:
 
  On Sun, Jul 29, 2007 at 06:59:26AM +0100, Darryl L. Miles wrote:
   CLIENT = Linux 2.6.20.1-smp [Customer build]
   SERVER = Linux 2.6.9-55.ELsmp [Red Hat Enterprise Linux AS release 4 
   (Nahant Update 5)]
   
   The problems start around time index 09:21:39.860302 when the CLIENT 
   issues 
   a TCP packet with SACK option set (seemingly for a data segment which has 
   already been seen) from that point on the connection hangs.
 
 ...That's DSACK and it's being correctly sent. To me, it seems unlikely to 
 be the cause for this breakage...
 
  Where was the capture taken ? on CLIENT or on SERVER (I suspect client from
  the timers) ? 
 
 ...I would guess the same based on SYN timestamps (and from the DSACK 
 timestamps)...
 
  A possible, but very unlikely reason would be an MTU limitation
  somewhere, because the segment which never gets correctly ACKed is also the
  largest one in this trace.
 
 Limitation for 48 byte segments? You have to be kidding... :-) But yes,
 it seems that one of the directions is dropping packets for some reason 
 though I would not assume MTU limitation... Or did you mean some other 
 segment?

No, I was talking about the 1448 bytes segments. But in fact I don't
believe it much because the SACKs are always retransmitted just afterwards.

BTW, some information are missing. It would have been better if the trace
had been read with tcpdump -Svv. We would have got seq numbers and ttl.
Also, we do not know if there's a firewall between both sides. Sometimes,
some IDS identify attacks in crypted traffic and kill connections. It
might have been the case here, with the connection closed one way on an
intermediate firewall.

Regards,
Willy

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread David Miller
From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Sun, 29 Jul 2007 11:26:00 +0300 (EEST)

 Is this reproducable? Can you somehow verify that the packets CLIENT is 
 sending are indeed received by the SERVER...?

One possibility is drops due to checksum errors on the receiver, this
tends to pop up from time to time, let's see some SNMP statistics.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Jan Engelhardt

On Jul 29 2007 08:45, Willy Tarreau wrote:
On Sun, Jul 29, 2007 at 06:59:26AM +0100, Darryl L. Miles wrote:
 CLIENT = Linux 2.6.20.1-smp [Customer build]
 SERVER = Linux 2.6.9-55.ELsmp [Red Hat Enterprise Linux AS release 4 
 (Nahant Update 5)]
 
 The problems start around time index 09:21:39.860302 when the CLIENT issues 
 a TCP packet with SACK option set (seemingly for a data segment which has 
 already been seen) from that point on the connection hangs.

Where was the capture taken ? on CLIENT or on SERVER (I suspect client from
the timers) ? A possible, but very unlikely reason would be an MTU limitation
somewhere, because the segment which never gets correctly ACKed is also the
largest one in this trace.[...]

I had something similar (and most strangely enough, it only seemed to happen
when the source and/or destination address was outside a given subnet). It
boiled down to that some switch acted really strange and God knows why. A
tcpdump on *both* sides (server+client) each showed that the *other* peer sent
a FIN/RST. Well in the end I settled with setting sys.net.ipv4.tcp_sack=0,
no idea if the broken switch got replaced in the meantime.
Well, just letting you know that it is not limited to computers.


Jan
-- 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-usb-devel] [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-29 Thread Oliver Neukum
Am Sonntag 29 Juli 2007 schrieb Jesper Juhl:
 On 29/07/07, Satyam Sharma [EMAIL PROTECTED] wrote:
  Hi,
 
  On 7/29/07, Jesper Juhl [EMAIL PROTECTED] wrote:
   Hello,
  
   This patch makes sure we don't dereference a NULL pointer in
   drivers/net/usb/pegasus.c::write_bulk_callback() in the initial
   struct net_device *net = pegasus-net; assignment.
   The existing code checks if 'pegasus' is NULL and bails out if
   it is, so we better not touch that pointer until after that check.
   [...]
   diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
   index a05fd97..04cba6b 100644
   --- a/drivers/net/usb/pegasus.c
   +++ b/drivers/net/usb/pegasus.c
   @@ -768,11 +768,13 @@ done:
static void write_bulk_callback(struct urb *urb)
{
   pegasus_t *pegasus = urb-context;
   -   struct net_device *net = pegasus-net;
   +   struct net_device *net;
  
   if (!pegasus)
   return;
  
   +   net = pegasus-net;
   +
   if (!netif_device_present(net) || !netif_running(net))
   return;
 
  Is it really possible that we're called into this function with
  urb-context == NULL? If not, I'd suggest let's just get rid of
  the check itself, instead.
 
 I'm not sure. I am not very familiar with this code. I just figured
 that moving the assignment is potentially a little safer and it is
 certainly no worse than the current code, so that's a safe and
 potentially benneficial change. Removing the check may be safe but I'm
 not certain enough to make that call...

pegasus == NULL there would be a kernel bug. Silently ignoring
it, like the code now wants to do is bad. As the oops has never been
reported, I figure turning it into an explicit debugging test is overkill,
so removal seems to be the best option.

Regards
Oliver
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Ilpo Järvinen
On Sun, 29 Jul 2007, Willy Tarreau wrote:

 On Sun, Jul 29, 2007 at 11:26:00AM +0300, Ilpo Järvinen wrote:
  On Sun, 29 Jul 2007, Willy Tarreau wrote:
  
   On Sun, Jul 29, 2007 at 06:59:26AM +0100, Darryl L. Miles wrote:
CLIENT = Linux 2.6.20.1-smp [Customer build]
SERVER = Linux 2.6.9-55.ELsmp [Red Hat Enterprise Linux AS release 4 
(Nahant Update 5)]

The problems start around time index 09:21:39.860302 when the CLIENT 
issues 
a TCP packet with SACK option set (seemingly for a data segment which 
has 
already been seen) from that point on the connection hangs.
  
  ...That's DSACK and it's being correctly sent. To me, it seems unlikely to 
  be the cause for this breakage...
  
   Where was the capture taken ? on CLIENT or on SERVER (I suspect client 
   from
   the timers) ? 
  
  ...I would guess the same based on SYN timestamps (and from the DSACK 
  timestamps)...
  
   A possible, but very unlikely reason would be an MTU limitation
   somewhere, because the segment which never gets correctly ACKed is also 
   the
   largest one in this trace.
  
  Limitation for 48 byte segments? You have to be kidding... :-) But yes,
  it seems that one of the directions is dropping packets for some reason 
  though I would not assume MTU limitation... Or did you mean some other 
  segment?
 
 No, I was talking about the 1448 bytes segments. But in fact I don't
 believe it much because the SACKs are always retransmitted just afterwards.

Ah, but it's ACKed correctly right below it...:

[...snip...]
09:21:39.490740 IP SERVER.ssh  CLIENT.50727: P 18200:18464(264) ack 
2991 
win 2728 nop,nop,timestamp 7692910 81727
09:21:39.490775 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
nop,nop,timestamp 81755 7692910
09:21:39.860245 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 
2991 
win 2728 nop,nop,timestamp 7693293 81749

...segment below snd_una arrived = snd_una remains 18464, receiver 
generates a duplicate ACK:
 
09:21:39.860302 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
nop,nop,timestamp 81847 7692910,nop,nop,sack sack 1 {12408:13856} 

The cumulative ACK field of it covers _everything_ below 18464 (i.e., it 
ACKs them), including the 1448 bytes in 12408:13856... In addition, the 
SACK block is DSACK information [RFC2883] telling explicitly the address 
of the received duplicate block. However, if this ACK doesn't reach the 
SERVER TCP, RTO is triggered and the first not yet cumulatively ACKed 
segment is retransmitted (I guess cumulative ACKs up to 12408 arrived 
without problems to the SERVER):

09:21:40.453440 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 
2991 
win 2728 nop,nop,timestamp 7693887 81749
[...snip...]

 BTW, some information are missing. It would have been better if the trace
 had been read with tcpdump -Svv. We would have got seq numbers and ttl.
 Also, we do not know if there's a firewall between both sides. Sometimes,
 some IDS identify attacks in crypted traffic and kill connections. It
 might have been the case here, with the connection closed one way on an
 intermediate firewall.

Yeah, firewall or some other issue, I'd say it's quite unlikely a bug in 
TCP because behavior to both directions indicate client - sender 
blackhole independently of each other...


-- 
 i.

Re: source interface ping bug ?

2007-07-29 Thread nano bug
Any news about this ?

On 7/27/07, nano bug [EMAIL PROTECTED] wrote:
 Hello there,

 I'm facing the following issue : when I try to ping using source
 interface instead of a source ip address the ping utility starts to
 send arp requests instead of icmp requests though the ip address I'm
 pinging it's not in the subnets directly connected to my linux box.
 I've noticed this situation since I upgraded from kernel 2.6.20 to
 2.6.21. On 2.6.20 and lower I haven't had this problem. Now I upgraded
 to 2.6.22 but it's the same. I'm using latest iproute and iputils.
 Here is an output of tcpdump when I try to ping an outside ip address,
 like for example www.yahoo.com, using source interface :

 [EMAIL PROTECTED]:~# uname -a
 Linux darkstar 2.6.22 #1 Thu Jul 26 21:22:11 EEST 2007 i686 Pentium II
 (Deschutes) GenuineIntel GNU/Linux

 [EMAIL PROTECTED]:~# ip -V
 ip utility, iproute2-ss070710
 [EMAIL PROTECTED]:~#


 [EMAIL PROTECTED]:~# ip address show dev eth2
 3: eth2: BROADCAST,MULTICAST,NOTRAILERS,UP,LOWER_UP mtu 1500 qdisc
 pfifo_fast qlen 1000
link/ether 00:90:27:0f:79:f3 brd ff:ff:ff:ff:ff:ff
inet 86.106.19.75/23 brd 86.106.19.255 scope global eth2
 [EMAIL PROTECTED]:~# ip route get 87.248.113.14 from 86.106.19.75 oif eth2
 87.248.113.14 from 86.106.19.75 via 86.106.18.1 dev eth2
cache  mtu 1500 advmss 1460 hoplimit 64
 [EMAIL PROTECTED]:~#


 [EMAIL PROTECTED]:~/iputils# ./ping -V
 ping utility, iputils-sss20070202
 [EMAIL PROTECTED]:~/iputils# ./ping -I 86.106.19.75 87.248.113.14 -c 2
 PING 87.248.113.14 (87.248.113.14) from 86.106.19.75 : 56(84) bytes of data.
 64 bytes from 87.248.113.14: icmp_seq=1 ttl=51 time=60.5 ms
 64 bytes from 87.248.113.14: icmp_seq=2 ttl=51 time=63.2 ms

 --- 87.248.113.14 ping statistics ---
 2 packets transmitted, 2 received, 0% packet loss, time 999ms
 rtt min/avg/max/mdev = 60.574/61.924/63.274/1.350 ms
 [EMAIL PROTECTED]:~/iputils#


 [EMAIL PROTECTED]:~# tcpdump -i eth2 -vvv -n host 87.248.113.14 and host
 86.106.19.75
 tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size 96 bytes
 01:18:09.572603 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF],
 proto: ICMP (1), length: 84) 86.106.19.75  87.248.113.14: ICMP echo
 request, id 27166, seq 1, length 64
 01:18:09.632861 IP (tos 0x0, ttl  51, id 6100, offset 0, flags [none],
 proto: ICMP (1), length: 84) 87.248.113.14  86.106.19.75: ICMP echo
 reply, id 27166, seq 1, length 64
 01:18:10.572746 IP (tos 0x0, ttl  64, id 0, offset 0, flags [DF],
 proto: ICMP (1), length: 84) 86.106.19.75  87.248.113.14: ICMP echo
 request, id 27166, seq 2, length 64
 01:18:10.634951 IP (tos 0x0, ttl  51, id 8790, offset 0, flags [none],
 proto: ICMP (1), length: 84) 87.248.113.14  86.106.19.75: ICMP echo
 reply, id 27166, seq 2, length 64



 using source interface :

 [EMAIL PROTECTED]:~/iputils# ./ping -I eth2 87.248.113.14
 PING 87.248.113.14 (87.248.113.14) from 86.106.19.75 eth2: 56(84) bytes of 
 data.
 From 86.106.19.75 icmp_seq=1 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=2 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=3 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=5 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=6 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=7 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=9 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=10 Destination Host Unreachable
 From 86.106.19.75 icmp_seq=11 Destination Host Unreachable

 --- 87.248.113.14 ping statistics ---
 13 packets transmitted, 0 received, +9 errors, 100% packet loss, time 12006ms
 , pipe 3
 [EMAIL PROTECTED]:~/iputils#


 [EMAIL PROTECTED]:~# tcpdump -i eth2 -vvv -n host 87.248.113.14 and host
 86.106.19.75
 tcpdump: listening on eth2, link-type EN10MB (Ethernet), capture size 96 bytes
 01:19:24.292911 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:25.292897 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:26.292901 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:27.302906 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:28.302911 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:29.302912 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:31.302917 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:32.302921 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:33.302923 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:35.302932 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:36.302932 arp who-has 87.248.113.14 tell 86.106.19.75
 01:19:37.302939 arp who-has 87.248.113.14 tell 86.106.19.75

 12 packets captured
 12 packets received by filter
 0 packets dropped by kernel
 [EMAIL PROTECTED]:~#


 There is one exception though, it works when using eth0. I'm
 administrating multiple linux boxes with 2 or 3 ethernet cards and if
 I try pinging with eth0 it does send icmp like it should but when
 specifying eth1 or eth2 is sends arp requests. The distro I'm using is
 Slackware 12.0. I already reported this to the iputils maintainer and
 he 

[2.6 patch] drivers/net/cxgb3/xgmac.c: remove dead code

2007-07-29 Thread Adrian Bunk
This patch removes dead code (tx_xcnt can never be != 0 at this place) 
spotted by the Coverity checker.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 drivers/net/cxgb3/xgmac.c |5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

--- linux-2.6.22-rc6-mm1/drivers/net/cxgb3/xgmac.c.old  2007-07-24 
13:55:33.0 +0200
+++ linux-2.6.22-rc6-mm1/drivers/net/cxgb3/xgmac.c  2007-07-24 
13:57:06.0 +0200
@@ -510,38 +510,35 @@ int t3b2_mac_watchdog_task(struct cmac *
if (tx_mcnt == mac-tx_mcnt) {
tx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap,
A_XGM_TX_SPI4_SOP_EOP_CNT +
mac-offset)));
if (tx_xcnt == 0) {
t3_write_reg(adap, A_TP_PIO_ADDR,
 A_TP_TX_DROP_CNT_CH0 + macidx(mac));
tx_tcnt = (G_TXDROPCNTCH0RCVD(t3_read_reg(adap,
  A_TP_PIO_DATA)));
} else {
goto rxcheck;
}
} else {
mac-toggle_cnt = 0;
goto rxcheck;
}
 
-   if (((tx_tcnt != mac-tx_tcnt) 
-(tx_xcnt == 0)  (mac-tx_xcnt == 0)) ||
-   ((mac-tx_mcnt == tx_mcnt) 
-(tx_xcnt != 0)  (mac-tx_xcnt != 0))) {
+   if ((tx_tcnt != mac-tx_tcnt)  (mac-tx_xcnt == 0))  {
if (mac-toggle_cnt  4) {
status = 2;
goto out;
} else {
status = 1;
goto out;
}
} else {
mac-toggle_cnt = 0;
goto rxcheck;
}
 
 rxcheck:
if (rx_mcnt != mac-rx_mcnt)
rx_xcnt = (G_TXSPI4SOPCNT(t3_read_reg(adap,
A_XGM_RX_SPI4_SOP_EOP_CNT +
mac-offset)));

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-mm patch] e1000: #if 0 two functions

2007-07-29 Thread Adrian Bunk
e1000_{read,write}_pci_cfg() are no longer used.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

This patch has been sent on:
- 1 Jul 2007
- 5 Jun 2007

 drivers/net/e1000/e1000_hw.h   |2 --
 drivers/net/e1000/e1000_main.c |4 
 2 files changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_hw.h.old   2007-06-04 
22:03:05.0 +0200
+++ linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_hw.h   2007-06-04 
22:03:14.0 +0200
@@ -421,8 +421,6 @@ void e1000_tbi_adjust_stats(struct e1000
 void e1000_get_bus_info(struct e1000_hw *hw);
 void e1000_pci_set_mwi(struct e1000_hw *hw);
 void e1000_pci_clear_mwi(struct e1000_hw *hw);
-void e1000_read_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t * value);
-void e1000_write_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t * value);
 int32_t e1000_read_pcie_cap_reg(struct e1000_hw *hw, uint32_t reg, uint16_t 
*value);
 void e1000_pcix_set_mmrbc(struct e1000_hw *hw, int mmrbc);
 int e1000_pcix_get_mmrbc(struct e1000_hw *hw);
--- linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_main.c.old 2007-06-04 
22:03:24.0 +0200
+++ linux-2.6.22-rc3-mm1/drivers/net/e1000/e1000_main.c 2007-06-04 
22:03:40.0 +0200
@@ -4888,6 +4888,8 @@ e1000_pci_clear_mwi(struct e1000_hw *hw)
pci_clear_mwi(adapter-pdev);
 }
 
+#if 0
+
 void
 e1000_read_pci_cfg(struct e1000_hw *hw, uint32_t reg, uint16_t *value)
 {
@@ -4904,6 +4906,8 @@ e1000_write_pci_cfg(struct e1000_hw *hw,
pci_write_config_word(adapter-pdev, reg, *value);
 }
 
+#endif  /*  0  */
+
 int
 e1000_pcix_get_mmrbc(struct e1000_hw *hw)
 {


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[2.6 patch] make pktgen.c:get_ipsec_sa() static and non-inline

2007-07-29 Thread Adrian Bunk
Non-static inline code usually doesn't makes sense.

In this case making is static and non-inline is the correct solution.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---
--- linux-2.6.23-rc1-mm1/net/core/pktgen.c.old  2007-07-26 20:12:01.0 
+0200
+++ linux-2.6.23-rc1-mm1/net/core/pktgen.c  2007-07-26 20:12:13.0 
+0200
@@ -2156,8 +2156,7 @@
 /* If there was already an IPSEC SA, we keep it as is, else
  * we go look for it ...
 */
-inline
-void get_ipsec_sa(struct pktgen_dev *pkt_dev, int flow)
+static void get_ipsec_sa(struct pktgen_dev *pkt_dev, int flow)
 {
struct xfrm_state *x = pkt_dev-flows[flow].x;
if (!x) {

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[2.6 patch] make nf_ct_ipv6_skip_exthdr() static

2007-07-29 Thread Adrian Bunk
nf_ct_ipv6_skip_exthdr() can now become static.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 include/net/netfilter/ipv6/nf_conntrack_ipv6.h |3 ---
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

--- linux-2.6.23-rc1-mm1/include/net/netfilter/ipv6/nf_conntrack_ipv6.h.old 
2007-07-26 20:17:47.0 +0200
+++ linux-2.6.23-rc1-mm1/include/net/netfilter/ipv6/nf_conntrack_ipv6.h 
2007-07-26 20:17:53.0 +0200
@@ -7,9 +7,6 @@
 extern struct nf_conntrack_l4proto nf_conntrack_l4proto_udp6;
 extern struct nf_conntrack_l4proto nf_conntrack_l4proto_icmpv6;
 
-extern int nf_ct_ipv6_skip_exthdr(const struct sk_buff *skb, int start,
- u8 *nexthdrp, int len);
-
 extern int nf_ct_frag6_init(void);
 extern void nf_ct_frag6_cleanup(void);
 extern struct sk_buff *nf_ct_frag6_gather(struct sk_buff *skb);
--- linux-2.6.23-rc1-mm1/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c.old 
2007-07-26 20:18:03.0 +0200
+++ linux-2.6.23-rc1-mm1/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c 
2007-07-26 20:18:27.0 +0200
@@ -86,8 +86,8 @@
  *- Note also special handling of AUTH header. Thanks to IPsec wizards.
  */
 
-int nf_ct_ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp,
-  int len)
+static int nf_ct_ipv6_skip_exthdr(const struct sk_buff *skb, int start,
+ u8 *nexthdrp, int len)
 {
u8 nexthdr = *nexthdrp;
 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[2.6 patch] net/unix/af_unix.c: make code static

2007-07-29 Thread Adrian Bunk
The following code can now become static:
- struct unix_socket_table
- unix_table_lock

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 include/net/af_unix.h |   29 -
 net/unix/af_unix.c|   30 --
 2 files changed, 28 insertions(+), 31 deletions(-)

--- linux-2.6.23-rc1-mm1/include/net/af_unix.h.old  2007-07-26 
21:17:33.0 +0200
+++ linux-2.6.23-rc1-mm1/include/net/af_unix.h  2007-07-26 21:20:27.0 
+0200
@@ -12,37 +12,8 @@
 
 #define UNIX_HASH_SIZE 256
 
-extern struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
-extern spinlock_t unix_table_lock;
-
 extern atomic_t unix_tot_inflight;
 
-static inline struct sock *first_unix_socket(int *i)
-{
-   for (*i = 0; *i = UNIX_HASH_SIZE; (*i)++) {
-   if (!hlist_empty(unix_socket_table[*i]))
-   return __sk_head(unix_socket_table[*i]);
-   }
-   return NULL;
-}
-
-static inline struct sock *next_unix_socket(int *i, struct sock *s)
-{
-   struct sock *next = sk_next(s);
-   /* More in this chain? */
-   if (next)
-   return next;
-   /* Look for next non-empty chain. */
-   for ((*i)++; *i = UNIX_HASH_SIZE; (*i)++) {
-   if (!hlist_empty(unix_socket_table[*i]))
-   return __sk_head(unix_socket_table[*i]);
-   }
-   return NULL;
-}
-
-#define forall_unix_sockets(i, s) \
-   for (s = first_unix_socket((i)); s; s = next_unix_socket((i),(s)))
-
 struct unix_address {
atomic_trefcnt;
int len;
--- linux-2.6.23-rc1-mm1/net/unix/af_unix.c.old 2007-07-26 21:17:48.0 
+0200
+++ linux-2.6.23-rc1-mm1/net/unix/af_unix.c 2007-07-26 21:20:24.0 
+0200
@@ -118,14 +118,40 @@
 
 int sysctl_unix_max_dgram_qlen __read_mostly = 10;
 
-struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
-DEFINE_SPINLOCK(unix_table_lock);
+static struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
+static DEFINE_SPINLOCK(unix_table_lock);
 static atomic_t unix_nr_socks = ATOMIC_INIT(0);
 
 #define unix_sockets_unbound   (unix_socket_table[UNIX_HASH_SIZE])
 
 #define UNIX_ABSTRACT(sk)  (unix_sk(sk)-addr-hash != UNIX_HASH_SIZE)
 
+static struct sock *first_unix_socket(int *i)
+{
+   for (*i = 0; *i = UNIX_HASH_SIZE; (*i)++) {
+   if (!hlist_empty(unix_socket_table[*i]))
+   return __sk_head(unix_socket_table[*i]);
+   }
+   return NULL;
+}
+
+static struct sock *next_unix_socket(int *i, struct sock *s)
+{
+   struct sock *next = sk_next(s);
+   /* More in this chain? */
+   if (next)
+   return next;
+   /* Look for next non-empty chain. */
+   for ((*i)++; *i = UNIX_HASH_SIZE; (*i)++) {
+   if (!hlist_empty(unix_socket_table[*i]))
+   return __sk_head(unix_socket_table[*i]);
+   }
+   return NULL;
+}
+
+#define forall_unix_sockets(i, s) \
+   for (s = first_unix_socket((i)); s; s = next_unix_socket((i),(s)))
+
 #ifdef CONFIG_SECURITY_NETWORK
 static void unix_get_secdata(struct scm_cookie *scm, struct sk_buff *skb)
 {

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Willy Tarreau
On Sun, Jul 29, 2007 at 12:28:04PM +0300, Ilpo Järvinen wrote:
(...)
   Limitation for 48 byte segments? You have to be kidding... :-) But yes,
   it seems that one of the directions is dropping packets for some reason 
   though I would not assume MTU limitation... Or did you mean some other 
   segment?
  
  No, I was talking about the 1448 bytes segments. But in fact I don't
  believe it much because the SACKs are always retransmitted just afterwards.
 
 Ah, but it's ACKed correctly right below it...:
 
 [...snip...]
 09:21:39.490740 IP SERVER.ssh  CLIENT.50727: P 18200:18464(264) ack 
 2991 
 win 2728 nop,nop,timestamp 7692910 81727
 09:21:39.490775 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 81755 7692910
 09:21:39.860245 IP SERVER.ssh  CLIENT.50727: . 12408:13856(1448) ack 
 2991 
 win 2728 nop,nop,timestamp 7693293 81749
 
 ...segment below snd_una arrived = snd_una remains 18464, receiver 
 generates a duplicate ACK:
  
 09:21:39.860302 IP CLIENT.50727  SERVER.ssh: . ack 18464 win 378 
 nop,nop,timestamp 81847 7692910,nop,nop,sack sack 1 
 {12408:13856} 
 
 The cumulative ACK field of it covers _everything_ below 18464 (i.e., it 
 ACKs them), including the 1448 bytes in 12408:13856... In addition, the 
 SACK block is DSACK information [RFC2883] telling explicitly the address 
 of the received duplicate block. However, if this ACK doesn't reach the 
 SERVER TCP, RTO is triggered and the first not yet cumulatively ACKed 
 segment is retransmitted (I guess cumulative ACKs up to 12408 arrived 
 without problems to the SERVER):

Oh yes, you're damn right. I did not notice that the ACK was higher than
the SACK, I'm more used to read traces with absolute rather than relative
seq/acks.

So I agree, it is this ACK which is lost between client and server,
reinforcing the supposition about the location of the capture (client side).

 [...snip...]
 
  BTW, some information are missing. It would have been better if the trace
  had been read with tcpdump -Svv. We would have got seq numbers and ttl.
  Also, we do not know if there's a firewall between both sides. Sometimes,
  some IDS identify attacks in crypted traffic and kill connections. It
  might have been the case here, with the connection closed one way on an
  intermediate firewall.
 
 Yeah, firewall or some other issue, I'd say it's quite unlikely a bug in 
 TCP because behavior to both directions indicate client - sender 
 blackhole independently of each other...

It would also be possible that something stupid between both ends simply
drops packets with the SACK option set. Also something which sometimes
happen is that some firewalls automatically translate sequence numbers
but not necessarily SACK values, which could pretty well lead to this
packet being received but ignored on the server side.

I'm pretty sure that the same trace taken on the server side will reveal
the reason for the problem.

Willy

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: TCP SACK issue, hung connection, tcpdump included

2007-07-29 Thread Ilpo Järvinen
On Sun, 29 Jul 2007, Willy Tarreau wrote:

 On Sun, Jul 29, 2007 at 12:28:04PM +0300, Ilpo Järvinen wrote:
 
  [...snip...]
  
   BTW, some information are missing. It would have been better if the trace
   had been read with tcpdump -Svv. We would have got seq numbers and ttl.
   Also, we do not know if there's a firewall between both sides. Sometimes,
   some IDS identify attacks in crypted traffic and kill connections. It
   might have been the case here, with the connection closed one way on an
   intermediate firewall.
  
  Yeah, firewall or some other issue, I'd say it's quite unlikely a bug in 
  TCP because behavior to both directions indicate client - sender 
  blackhole independently of each other...
 
 It would also be possible that something stupid between both ends simply
 drops packets with the SACK option set. Also something which sometimes
 happen is that some firewalls automatically translate sequence numbers
 but not necessarily SACK values, which could pretty well lead to this
 packet being received but ignored on the server side.

...One can toggle those off with /proc/sys/net/ipv4/tcp_dsack but I 
suspect DSACKs are not the cause because these retransmissions neither
are making it through (there are many of them also earlier in the log, 
just quoted the easiest ones :-) ):

09:36:44.335591 IP CLIENT.50727  SERVER.ssh: P 2991:3039(48) ack 18464 
win 
378 nop,nop,timestamp 800227942 7692910
09:38:44.351950 IP CLIENT.50727  SERVER.ssh: P 2991:3039(48) ack 18464 
win 
378 nop,nop,timestamp 800257942 7692910
09:40:44.368172 IP CLIENT.50727  SERVER.ssh: P 2991:3039(48) ack 18464 
win 
378 nop,nop,timestamp 800287943 7692910

...there are no SACKs involved in them, yet no cumulative ACK ever 
arrives from SERVER...


-- 
 i.

Re: [linux-usb-devel] [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-29 Thread Satyam Sharma


On Sun, 29 Jul 2007, Oliver Neukum wrote:

 Am Sonntag 29 Juli 2007 schrieb Jesper Juhl:
  On 29/07/07, Satyam Sharma [EMAIL PROTECTED] wrote:
   Hi,
  
   On 7/29/07, Jesper Juhl [EMAIL PROTECTED] wrote:
Hello,
   
This patch makes sure we don't dereference a NULL pointer in
drivers/net/usb/pegasus.c::write_bulk_callback() in the initial
struct net_device *net = pegasus-net; assignment.
The existing code checks if 'pegasus' is NULL and bails out if
it is, so we better not touch that pointer until after that check.
[...]
  
   Is it really possible that we're called into this function with
   urb-context == NULL? If not, I'd suggest let's just get rid of
   the check itself, instead.
  
  I'm not sure. I am not very familiar with this code. I just figured
  that moving the assignment is potentially a little safer and it is
  certainly no worse than the current code, so that's a safe and
  potentially benneficial change. Removing the check may be safe but I'm
  not certain enough to make that call...
 
 pegasus == NULL there would be a kernel bug. Silently ignoring
 it, like the code now wants to do is bad. As the oops has never been
 reported, I figure turning it into an explicit debugging test is overkill,
 so removal seems to be the best option.

Ok, thanks. Updated patch below.

[PATCH] pegasus: Remove bogus checks in urb-complete() callbacks

urb-complete() callbacks registered in drivers/net/usb/pegasus.c needlessly
check for urb-context != NULL, but that is not possible (the only way that
can be possible would be a kernel bug elsewhere, and these checks would
simply end up hiding that). So let's remove the bogus checks.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]

---

 drivers/net/usb/pegasus.c |9 +
 1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index a05fd97..439ef9f 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -770,9 +770,6 @@ static void write_bulk_callback(struct urb *urb)
pegasus_t *pegasus = urb-context;
struct net_device *net = pegasus-net;
 
-   if (!pegasus)
-   return;
-
if (!netif_device_present(net) || !netif_running(net))
return;
 
@@ -805,13 +802,9 @@ static void write_bulk_callback(struct urb *urb)
 static void intr_callback(struct urb *urb)
 {
pegasus_t *pegasus = urb-context;
-   struct net_device *net;
+   struct net_device *net = pegasus-net;
int status;
 
-   if (!pegasus)
-   return;
-   net = pegasus-net;
-
switch (urb-status) {
case 0:
break;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-29 Thread Satyam Sharma


 On Sun, 29 Jul 2007, Domen Puncer wrote:
 
  On 29/07/07 00:02 +0200, Jesper Juhl wrote:
   Hi,
   
   Here's a small patch, prompted by a find by the Coverity checker, 
   that removes a potential NULL pointer dereference from 
   drivers/net/sb1000.c::sb1000_dev_ioctl().
   The checker spotted that we do a NULL test of 'dev', yet we 
   dereference the pointer prior to that check.
   This patch simply moves the dereference after the NULL test.
  
  But... it can't be called without a valid 'dev', no?
  A quick 'grep do_ioctl net/' confirms that all calls are in
  the form of 'dev-do_ioctl(dev, ...'.
 
 Yup, I think so too ...
 
 
   @@ -991,11 +991,13 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
   struct ifreq *ifr, int cmd)
 short PID[4];
 int ioaddr[2], status, frequency;
 unsigned int stats[5];
   - struct sb1000_private *lp = netdev_priv(dev);
   + struct sb1000_private *lp;

 if (!(dev  dev-flags  IFF_UP))
 return -ENODEV;
 
 I think we could get rid of the !dev check itself. Actually, the IFF_UP
 check /also/ looks suspect to me for two reasons: (1) I remember Stephen
 Hemminger once telling me dev-flags is legacy and unsafe, and one of
 the netif_xxx() functions be used instead, and, (2) I wonder if we really
 require the interface to be up and *running* when we do this ioctl.

Updated patch below.

[PATCH] sb1000: Remove bogus checks

In net_device-do_ioctl() of the sb1000 driver (sb1000_dev_ioctl):

(1) !dev condition is always false -- this function cannot be called with
NULL net_device.
(2) !(dev-flags  IFF_UP) is bogus because the functions of this ioctl
can (and should) be allowed even when the interface is not up and running.

So let's remove these checks.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Cc: Jesper Juhl [EMAIL PROTECTED]
Cc: Domen Puncer [EMAIL PROTECTED]

---

 drivers/net/sb1000.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
index 1de3eec..f60fe98 100644
--- a/drivers/net/sb1000.c
+++ b/drivers/net/sb1000.c
@@ -993,9 +993,6 @@ static int sb1000_dev_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
unsigned int stats[5];
struct sb1000_private *lp = netdev_priv(dev);
 
-   if (!(dev  dev-flags  IFF_UP))
-   return -ENODEV;
-
ioaddr[0] = dev-base_addr;
/* mem_start holds the second I/O address */
ioaddr[1] = dev-mem_start;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] Add depth.

2007-07-29 Thread Michael Buesch
On Sunday 29 July 2007 09:08:51 Corey Hickey wrote:
   p = d;
   n = q-dep[d].next;
 @@ -215,7 +216,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
  drop a packet from it */
  
   if (d  1) {
 - sfq_index x = q-dep[d+SFQ_DEPTH].next;
 + sfq_index x = q-dep[d+q-depth].next;

Please q-dep[d + q-depth]
Makes it _much_ more readable. And doesn't confuse my brain with a
minus and a BiggerThan sign ;)

 @@ -383,6 +384,16 @@ static void sfq_perturbation(unsigned long arg)
  static void sfq_q_destroy(struct sfq_sched_data *q)
  {
   del_timer(q-perturb_timer);
 + if(q-dep)
 + kfree(q-dep);
 + if(q-next)
 + kfree(q-next);
 + if(q-allot)
 + kfree(q-allot);
 + if(q-hash)
 + kfree(q-hash);
 + if(q-qs)
 + kfree(q-qs);

No need to check for !=NULL. kfree handles NULL.

  }
  
  static void sfq_destroy(struct Qdisc *sch)
 @@ -394,6 +405,7 @@ static void sfq_destroy(struct Qdisc *sch)
  static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
  {
   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
 + sfq_index p = ~0U/2;
   int i;
  
   if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
 @@ -401,30 +413,53 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
 rtattr *opt)
  
   q-perturbation = 0;
   q-max_depth = 0;
 - q-tail = q-limit = SFQ_DEPTH;
   if (opt == NULL) {
   q-perturb_period = 0;
 + q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
   } else {
   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
   if (ctl-quantum)
   q-quantum = ctl-quantum;
   q-perturb_period = ctl-perturb_period*HZ;
 + q-tail = q-limit = q-depth = ctl-flows ? : 
 SFQ_DEPTH_DEFAULT;
 +
 + if (q-depth  p - 1)
 + return -EINVAL;

Compare depth against (~0U/2)-1? What's that doing? Should probably add a 
comment.

  
   if (ctl-limit)
 - q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
 + q-limit = min_t(u32, ctl-limit, q-depth);
   }
  
 + q-dep = kmalloc((1+q-depth*2)*sizeof(struct sfq_head), GFP_KERNEL);
 + if (!q-dep)
 + goto err_case;
 + q-next = kmalloc(q-depth*sizeof(sfq_index), GFP_KERNEL);
 + if (!q-next)
 + goto err_case;
 + q-allot = kmalloc(q-depth*sizeof(short), GFP_KERNEL);
 + if (!q-allot)
 + goto err_case;
 + q-hash = kmalloc(q-depth*sizeof(unsigned short), GFP_KERNEL);
 + if (!q-hash)
 + goto err_case;
 + q-qs = kmalloc(q-depth*sizeof(struct sk_buff_head), GFP_KERNEL);
 + if (!q-qs)
 + goto err_case;

You may chose to use kcalloc for array allocations.

   for (i=0; iSFQ_HASH_DIVISOR; i++)
 - q-ht[i] = SFQ_DEPTH;
 - for (i=0; iSFQ_DEPTH; i++) {
 + q-ht[i] = q-depth;
 + for (i=0; iq-depth; i++) {
   skb_queue_head_init(q-qs[i]);
 - q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
 - q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
 + q-dep[i+q-depth].next = i+q-depth;
 + q-dep[i+q-depth].prev = i+q-depth;
   }
  
 - for (i=0; iSFQ_DEPTH; i++)
 + for (i=0; iq-depth; i++)
   sfq_link(q, i);
   return 0;
 +err_case:

This leaks a few kmallocs.

 + sfq_q_destroy(q);
 + return -ENOBUFS;
  }
  
  static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 @@ -458,7 +493,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff 
 *skb)
  
   opt.limit = q-limit;
   opt.divisor = SFQ_HASH_DIVISOR;
 - opt.flows = q-limit;
 + opt.flows = q-depth;
  
   RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
  



-- 
Greetings Michael.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-29 Thread Michael Buesch
On Sunday 29 July 2007 20:34:46 Satyam Sharma wrote:
 (2) !(dev-flags  IFF_UP) is bogus because the functions of this ioctl
 can (and should) be allowed even when the interface is not up and running.

Are you _sure_? This function does poke with the device hardware.
It might return crap or even machinecheck when not initialized.
Hardware is probably powered down, if not IFF_UP. (I don't know if that's
the case here, though).

  drivers/net/sb1000.c |3 ---
  1 files changed, 0 insertions(+), 3 deletions(-)
 
 diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
 index 1de3eec..f60fe98 100644
 --- a/drivers/net/sb1000.c
 +++ b/drivers/net/sb1000.c
 @@ -993,9 +993,6 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
 struct ifreq *ifr, int cmd)
   unsigned int stats[5];
   struct sb1000_private *lp = netdev_priv(dev);
  
 - if (!(dev  dev-flags  IFF_UP))
 - return -ENODEV;
 -


-- 
Greetings Michael.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-29 Thread Satyam Sharma
Hi Michael,


On Sun, 29 Jul 2007, Michael Buesch wrote:

 On Sunday 29 July 2007 20:34:46 Satyam Sharma wrote:
  (2) !(dev-flags  IFF_UP) is bogus because the functions of this ioctl
  can (and should) be allowed even when the interface is not up and running.
 
 Are you _sure_? This function does poke with the device hardware.
 It might return crap or even machinecheck when not initialized.
 Hardware is probably powered down, if not IFF_UP. (I don't know if that's
 the case here, though).

IFF_UP checks if the _interface_ is up -- the hardware / card could still
be powered up, but the interface down (ifconfing eth0 down or ip link set
eth0 down).

Probably what we want here is netif_device_present()? -- I think that
should return true only when the *device* itself is up (as in powered)
but the interface itself could be down ...

Let's wait for comments from the netdev people Cc:'ed here, in that case.


   drivers/net/sb1000.c |3 ---
   1 files changed, 0 insertions(+), 3 deletions(-)
  
  diff --git a/drivers/net/sb1000.c b/drivers/net/sb1000.c
  index 1de3eec..f60fe98 100644
  --- a/drivers/net/sb1000.c
  +++ b/drivers/net/sb1000.c
  @@ -993,9 +993,6 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
  struct ifreq *ifr, int cmd)
  unsigned int stats[5];
  struct sb1000_private *lp = netdev_priv(dev);
   
  -   if (!(dev  dev-flags  IFF_UP))
  -   return -ENODEV;
  -


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] ethtool_perm_addr only has one implementation

2007-07-29 Thread Matthew Wilcox

All drivers implement ethtool get_perm_addr the same way -- by calling
the generic function.  So we can inline the generic function into the
caller and avoid going through the drivers.

Signed-off-by: Matthew Wilcox [EMAIL PROTECTED]

diff -u b/net/core/ethtool.c b/net/core/ethtool.c
--- b/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -95,18 +95,6 @@
return 0;
 }
 
-int ethtool_op_get_perm_addr(struct net_device *dev, struct ethtool_perm_addr 
*addr, u8 *data)
-{
-   unsigned char len = dev-addr_len;
-   if ( addr-size  len )
-   return -ETOOSMALL;
-
-   addr-size = len;
-   memcpy(data, dev-perm_addr, len);
-   return 0;
-}
-
-
 u32 ethtool_op_get_ufo(struct net_device *dev)
 {
return (dev-features  NETIF_F_UFO) != 0;
@@ -779,34 +767,20 @@
 static int ethtool_get_perm_addr(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_perm_addr epaddr;
-   u8 *data;
-   int ret;
 
-   if (!dev-ethtool_ops-get_perm_addr)
-   return -EOPNOTSUPP;
-
-   if (copy_from_user(epaddr,useraddr,sizeof(epaddr)))
+   if (copy_from_user(epaddr, useraddr, sizeof(epaddr)))
return -EFAULT;
 
-   data = kmalloc(epaddr.size, GFP_USER);
-   if (!data)
-   return -ENOMEM;
-
-   ret = dev-ethtool_ops-get_perm_addr(dev,epaddr,data);
-   if (ret)
-   return ret;
+   if (epaddr.size  dev-addr_len)
+   return -ETOOSMALL;
+   epaddr.size = dev-addr_len;
 
-   ret = -EFAULT;
if (copy_to_user(useraddr, epaddr, sizeof(epaddr)))
-   goto out;
+   return -EFAULT;
useraddr += sizeof(epaddr);
-   if (copy_to_user(useraddr, data, epaddr.size))
-   goto out;
-   ret = 0;
-
- out:
-   kfree(data);
-   return ret;
+   if (copy_to_user(useraddr, dev-perm_addr, epaddr.size))
+   return -EFAULT;
+   return 0;
 }
 
 /* The main entry point in this file.  Called from net/core/dev.c */
@@ -976,7 +950,6 @@
 
 EXPORT_SYMBOL(dev_ethtool);
 EXPORT_SYMBOL(ethtool_op_get_link);
-EXPORT_SYMBOL_GPL(ethtool_op_get_perm_addr);
 EXPORT_SYMBOL(ethtool_op_get_sg);
 EXPORT_SYMBOL(ethtool_op_get_tso);
 EXPORT_SYMBOL(ethtool_op_get_tx_csum);
only in patch2:
unchanged:
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -2886,7 +2886,6 @@ static const struct ethtool_ops vortex_ethtool_ops = {
.set_settings   = vortex_set_settings,
.get_link   = ethtool_op_get_link,
.nway_reset = vortex_nway_reset,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 #ifdef CONFIG_PCI
only in patch2:
unchanged:
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -1578,7 +1578,6 @@ static const struct ethtool_ops cp_ethtool_ops = {
.set_wol= cp_set_wol,
.get_strings= cp_get_strings,
.get_ethtool_stats  = cp_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
.get_eeprom_len = cp_get_eeprom_len,
.get_eeprom = cp_get_eeprom,
.set_eeprom = cp_set_eeprom,
only in patch2:
unchanged:
--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -2452,7 +2452,6 @@ static const struct ethtool_ops rtl8139_ethtool_ops = {
.get_strings= rtl8139_get_strings,
.get_stats_count= rtl8139_get_stats_count,
.get_ethtool_stats  = rtl8139_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 static int netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
only in patch2:
unchanged:
--- a/drivers/net/ax88796.c
+++ b/drivers/net/ax88796.c
@@ -580,7 +580,6 @@ static const struct ethtool_ops ax_ethtool_ops = {
.set_settings   = ax_set_settings,
.nway_reset = ax_nway_reset,
.get_link   = ax_get_link,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 /* setup code */
only in patch2:
unchanged:
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -2033,7 +2033,6 @@ static const struct ethtool_ops b44_ethtool_ops = {
.get_strings= b44_get_strings,
.get_stats_count= b44_get_stats_count,
.get_ethtool_stats  = b44_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 static int b44_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
only in patch2:
unchanged:
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -6269,7 +6269,6 @@ static const struct ethtool_ops bnx2_ethtool_ops = {
.phys_id= bnx2_phys_id,
.get_stats_count= bnx2_get_stats_count,
.get_ethtool_stats  = bnx2_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 /* Called with rtnl_lock */
only in patch2:
unchanged:
--- a/drivers/net/cxgb3/cxgb3_main.c
+++ 

[PATCH 2.6.23 1/2] Make the iw_cxgb3 module parameters writable.

2007-07-29 Thread Steve Wise

Make the iw_cxgb3 module parameters writable.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/iwch_cm.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c 
b/drivers/infiniband/hw/cxgb3/iwch_cm.c
index 9574088..fa95dce 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -63,37 +63,37 @@ static char *states[] = {
 };
 
 static int ep_timeout_secs = 10;
-module_param(ep_timeout_secs, int, 0444);
+module_param(ep_timeout_secs, int, 0644);
 MODULE_PARM_DESC(ep_timeout_secs, CM Endpoint operation timeout 
   in seconds (default=10));
 
 static int mpa_rev = 1;
-module_param(mpa_rev, int, 0444);
+module_param(mpa_rev, int, 0644);
 MODULE_PARM_DESC(mpa_rev, MPA Revision, 0 supports amso1100, 
 1 is spec compliant. (default=1));
 
 static int markers_enabled = 0;
-module_param(markers_enabled, int, 0444);
+module_param(markers_enabled, int, 0644);
 MODULE_PARM_DESC(markers_enabled, Enable MPA MARKERS (default(0)=disabled));
 
 static int crc_enabled = 1;
-module_param(crc_enabled, int, 0444);
+module_param(crc_enabled, int, 0644);
 MODULE_PARM_DESC(crc_enabled, Enable MPA CRC (default(1)=enabled));
 
 static int rcv_win = 256 * 1024;
-module_param(rcv_win, int, 0444);
+module_param(rcv_win, int, 0644);
 MODULE_PARM_DESC(rcv_win, TCP receive window in bytes (default=256));
 
 static int snd_win = 32 * 1024;
-module_param(snd_win, int, 0444);
+module_param(snd_win, int, 0644);
 MODULE_PARM_DESC(snd_win, TCP send window in bytes (default=32KB));
 
 static unsigned int nocong = 0;
-module_param(nocong, uint, 0444);
+module_param(nocong, uint, 0644);
 MODULE_PARM_DESC(nocong, Turn off congestion control (default=0));
 
 static unsigned int cong_flavor = 1;
-module_param(cong_flavor, uint, 0444);
+module_param(cong_flavor, uint, 0644);
 MODULE_PARM_DESC(cong_flavor, TCP Congestion control flavor (default=1));
 
 static void process_work(struct work_struct *work);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2.6.23 2/2] iw_cxgb3: Always call low level send function via cxgb3_ofld_send().

2007-07-29 Thread Steve Wise

iw_cxgb3: Always call low level send function via cxgb3_ofld_send().

Avoids deadlocks.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/iwch_cm.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c 
b/drivers/infiniband/hw/cxgb3/iwch_cm.c
index fa95dce..20ba372 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -139,7 +139,7 @@ static void release_tid(struct t3cdev *t
req-wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD));
OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_TID_RELEASE, hwtid));
skb-priority = CPL_PRIORITY_SETUP;
-   tdev-send(tdev, skb);
+   cxgb3_ofld_send(tdev, skb);
return;
 }
 
@@ -161,7 +161,7 @@ int iwch_quiesce_tid(struct iwch_ep *ep)
req-val = cpu_to_be64(1  S_TCB_RX_QUIESCE);
 
skb-priority = CPL_PRIORITY_DATA;
-   ep-com.tdev-send(ep-com.tdev, skb);
+   cxgb3_ofld_send(ep-com.tdev, skb);
return 0;
 }
 
@@ -183,7 +183,7 @@ int iwch_resume_tid(struct iwch_ep *ep)
req-val = 0;
 
skb-priority = CPL_PRIORITY_DATA;
-   ep-com.tdev-send(ep-com.tdev, skb);
+   cxgb3_ofld_send(ep-com.tdev, skb);
return 0;
 }
 
@@ -784,7 +784,7 @@ static int update_rx_credits(struct iwch
OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_RX_DATA_ACK, ep-hwtid));
req-credit_dack = htonl(V_RX_CREDITS(credits) | V_RX_FORCE_ACK(1));
skb-priority = CPL_PRIORITY_ACK;
-   ep-com.tdev-send(ep-com.tdev, skb);
+   cxgb3_ofld_send(ep-com.tdev, skb);
return credits;
 }
 
@@ -1152,7 +1152,7 @@ static int listen_start(struct iwch_list
req-opt1 = htonl(V_CONN_POLICY(CPL_CONN_POLICY_ASK));
 
skb-priority = 1;
-   ep-com.tdev-send(ep-com.tdev, skb);
+   cxgb3_ofld_send(ep-com.tdev, skb);
return 0;
 }
 
@@ -1186,7 +1186,7 @@ static int listen_stop(struct iwch_liste
req-cpu_idx = 0;
OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_CLOSE_LISTSRV_REQ, ep-stid));
skb-priority = 1;
-   ep-com.tdev-send(ep-com.tdev, skb);
+   cxgb3_ofld_send(ep-com.tdev, skb);
return 0;
 }
 
@@ -1264,7 +1264,7 @@ static void reject_cr(struct t3cdev *tde
rpl-opt0l_status = htonl(CPL_PASS_OPEN_REJECT);
rpl-opt2 = 0;
rpl-rsvd = rpl-opt2;
-   tdev-send(tdev, skb);
+   cxgb3_ofld_send(tdev, skb);
}
 }
 
@@ -1557,7 +1557,7 @@ static int peer_abort(struct t3cdev *tde
rpl-wr.wr_lo = htonl(V_WR_TID(ep-hwtid));
OPCODE_TID(rpl) = htonl(MK_OPCODE_TID(CPL_ABORT_RPL, ep-hwtid));
rpl-cmd = CPL_ABORT_NO_RST;
-   ep-com.tdev-send(ep-com.tdev, rpl_skb);
+   cxgb3_ofld_send(ep-com.tdev, rpl_skb);
if (state != ABORTING) {
state_set(ep-com, DEAD);
release_ep_resources(ep);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] Add depth.

2007-07-29 Thread Corey Hickey

Michael Buesch wrote:

On Sunday 29 July 2007 09:08:51 Corey Hickey wrote:

p = d;
n = q-dep[d].next;
@@ -215,7 +216,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
 	if (d  1) {

-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d+q-depth].next;


Please q-dep[d + q-depth]
Makes it _much_ more readable. And doesn't confuse my brain with a
minus and a BiggerThan sign ;)


Ok.


@@ -383,6 +384,16 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   if(q-dep)
+   kfree(q-dep);
+   if(q-next)
+   kfree(q-next);
+   if(q-allot)
+   kfree(q-allot);
+   if(q-hash)
+   kfree(q-hash);
+   if(q-qs)
+   kfree(q-qs);


No need to check for !=NULL. kfree handles NULL.


Ok. Thanks.


 }
 
 static void sfq_destroy(struct Qdisc *sch)

@@ -394,6 +405,7 @@ static void sfq_destroy(struct Qdisc *sch)
 static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   sfq_index p = ~0U/2;
int i;
 
 	if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))

@@ -401,30 +413,53 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
 
 	q-perturbation = 0;

q-max_depth = 0;
-   q-tail = q-limit = SFQ_DEPTH;
if (opt == NULL) {
q-perturb_period = 0;
+   q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
} else {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
if (ctl-quantum)
q-quantum = ctl-quantum;
q-perturb_period = ctl-perturb_period*HZ;
+   q-tail = q-limit = q-depth = ctl-flows ? : 
SFQ_DEPTH_DEFAULT;
+
+   if (q-depth  p - 1)
+   return -EINVAL;


Compare depth against (~0U/2)-1? What's that doing? Should probably add a 
comment.


~0U/2 - 1 is the maximum value depth can be, based on how it is used in 
indexing q-dep. I agree, though, that deserves a comment. Actually, 
I'll also change it to '#define SFQ_DEPTH_MAX (~0U/2 - 1)' and put it 
near the top of the file next to the 'typedef unsigned int sfq_index;'.


I could also include limits.h and use UINT_MAX instead of ~0U; would 
that be preferable?


 
 		if (ctl-limit)

-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-limit = min_t(u32, ctl-limit, q-depth);
}
 
+	q-dep = kmalloc((1+q-depth*2)*sizeof(struct sfq_head), GFP_KERNEL);

+   if (!q-dep)
+   goto err_case;
+   q-next = kmalloc(q-depth*sizeof(sfq_index), GFP_KERNEL);
+   if (!q-next)
+   goto err_case;
+   q-allot = kmalloc(q-depth*sizeof(short), GFP_KERNEL);
+   if (!q-allot)
+   goto err_case;
+   q-hash = kmalloc(q-depth*sizeof(unsigned short), GFP_KERNEL);
+   if (!q-hash)
+   goto err_case;
+   q-qs = kmalloc(q-depth*sizeof(struct sk_buff_head), GFP_KERNEL);
+   if (!q-qs)
+   goto err_case;


You may chose to use kcalloc for array allocations.


The arrays in the original code don't get zeroed either, so that 
shouldn't be necessary (and I haven't heard of any problems so far). Do 
you suggest I use kcalloc() anyway, just as a good practice?



for (i=0; iSFQ_HASH_DIVISOR; i++)
-   q-ht[i] = SFQ_DEPTH;
-   for (i=0; iSFQ_DEPTH; i++) {
+   q-ht[i] = q-depth;
+   for (i=0; iq-depth; i++) {
skb_queue_head_init(q-qs[i]);
-   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
-   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
+   q-dep[i+q-depth].next = i+q-depth;
+   q-dep[i+q-depth].prev = i+q-depth;
}
 
-	for (i=0; iSFQ_DEPTH; i++)

+   for (i=0; iq-depth; i++)
sfq_link(q, i);
return 0;
+err_case:


This leaks a few kmallocs.


Are you saying that the 'err_case:' leaks kmallocs? It calls 
sfq_q_destroy(q), which kfrees each of the arrays: dep, next, allot, 
hash, and qs. Is that sufficient, or am I missing something or 
misunderstanding you?



+   sfq_q_destroy(q);
+   return -ENOBUFS;
 }


Thank you for your review. Could you please clarify the questions I 
have? I'll make, test, and submit a revision of this patch after that.


-Corey
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ethtool_perm_addr only has one implementation

2007-07-29 Thread Kok, Auke

Matthew Wilcox wrote:

All drivers implement ethtool get_perm_addr the same way -- by calling
the generic function.  So we can inline the generic function into the
caller and avoid going through the drivers.

Signed-off-by: Matthew Wilcox [EMAIL PROTECTED]


For the e100, e1000, ixgb parts:

Acked-by: Auke Kok [EMAIL PROTECTED]




diff -u b/net/core/ethtool.c b/net/core/ethtool.c
--- b/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -95,18 +95,6 @@
return 0;
 }
 
-int ethtool_op_get_perm_addr(struct net_device *dev, struct ethtool_perm_addr *addr, u8 *data)

-{
-   unsigned char len = dev-addr_len;
-   if ( addr-size  len )
-   return -ETOOSMALL;
-
-   addr-size = len;
-   memcpy(data, dev-perm_addr, len);
-   return 0;
-}
-
-
 u32 ethtool_op_get_ufo(struct net_device *dev)
 {
return (dev-features  NETIF_F_UFO) != 0;
@@ -779,34 +767,20 @@
 static int ethtool_get_perm_addr(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_perm_addr epaddr;
-   u8 *data;
-   int ret;
 
-	if (!dev-ethtool_ops-get_perm_addr)

-   return -EOPNOTSUPP;
-
-   if (copy_from_user(epaddr,useraddr,sizeof(epaddr)))
+   if (copy_from_user(epaddr, useraddr, sizeof(epaddr)))
return -EFAULT;
 
-	data = kmalloc(epaddr.size, GFP_USER);

-   if (!data)
-   return -ENOMEM;
-
-   ret = dev-ethtool_ops-get_perm_addr(dev,epaddr,data);
-   if (ret)
-   return ret;
+   if (epaddr.size  dev-addr_len)
+   return -ETOOSMALL;
+   epaddr.size = dev-addr_len;
 
-	ret = -EFAULT;

if (copy_to_user(useraddr, epaddr, sizeof(epaddr)))
-   goto out;
+   return -EFAULT;
useraddr += sizeof(epaddr);
-   if (copy_to_user(useraddr, data, epaddr.size))
-   goto out;
-   ret = 0;
-
- out:
-   kfree(data);
-   return ret;
+   if (copy_to_user(useraddr, dev-perm_addr, epaddr.size))
+   return -EFAULT;
+   return 0;
 }
 
 /* The main entry point in this file.  Called from net/core/dev.c */

@@ -976,7 +950,6 @@
 
 EXPORT_SYMBOL(dev_ethtool);

 EXPORT_SYMBOL(ethtool_op_get_link);
-EXPORT_SYMBOL_GPL(ethtool_op_get_perm_addr);
 EXPORT_SYMBOL(ethtool_op_get_sg);
 EXPORT_SYMBOL(ethtool_op_get_tso);
 EXPORT_SYMBOL(ethtool_op_get_tx_csum);
only in patch2:
unchanged:
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -2886,7 +2886,6 @@ static const struct ethtool_ops vortex_ethtool_ops = {
.set_settings   = vortex_set_settings,
.get_link   = ethtool_op_get_link,
.nway_reset = vortex_nway_reset,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 #ifdef CONFIG_PCI

only in patch2:
unchanged:
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -1578,7 +1578,6 @@ static const struct ethtool_ops cp_ethtool_ops = {
.set_wol= cp_set_wol,
.get_strings= cp_get_strings,
.get_ethtool_stats  = cp_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
.get_eeprom_len = cp_get_eeprom_len,
.get_eeprom = cp_get_eeprom,
.set_eeprom = cp_set_eeprom,
only in patch2:
unchanged:
--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -2452,7 +2452,6 @@ static const struct ethtool_ops rtl8139_ethtool_ops = {
.get_strings= rtl8139_get_strings,
.get_stats_count= rtl8139_get_stats_count,
.get_ethtool_stats  = rtl8139_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 static int netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)

only in patch2:
unchanged:
--- a/drivers/net/ax88796.c
+++ b/drivers/net/ax88796.c
@@ -580,7 +580,6 @@ static const struct ethtool_ops ax_ethtool_ops = {
.set_settings   = ax_set_settings,
.nway_reset = ax_nway_reset,
.get_link   = ax_get_link,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 /* setup code */

only in patch2:
unchanged:
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -2033,7 +2033,6 @@ static const struct ethtool_ops b44_ethtool_ops = {
.get_strings= b44_get_strings,
.get_stats_count= b44_get_stats_count,
.get_ethtool_stats  = b44_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 static int b44_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)

only in patch2:
unchanged:
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -6269,7 +6269,6 @@ static const struct ethtool_ops bnx2_ethtool_ops = {
.phys_id= bnx2_phys_id,
.get_stats_count= bnx2_get_stats_count,
.get_ethtool_stats  = bnx2_get_ethtool_stats,
-   .get_perm_addr  = ethtool_op_get_perm_addr,
 };
 
 /* Called 

Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-29 Thread Michael Buesch
On Sunday 29 July 2007 21:09, Satyam Sharma wrote:
 Hi Michael,
 
 
 On Sun, 29 Jul 2007, Michael Buesch wrote:
 
  On Sunday 29 July 2007 20:34:46 Satyam Sharma wrote:
   (2) !(dev-flags  IFF_UP) is bogus because the functions of this ioctl
   can (and should) be allowed even when the interface is not up and running.
  
  Are you _sure_? This function does poke with the device hardware.
  It might return crap or even machinecheck when not initialized.
  Hardware is probably powered down, if not IFF_UP. (I don't know if that's
  the case here, though).
 
 IFF_UP checks if the _interface_ is up -- the hardware / card could still
 be powered up, but the interface down (ifconfing eth0 down or ip link set
 eth0 down).

Well, that is device/driver dependent and I don't know what's
the case for this driver. It's encouraged to shutdown hardware
completely (except the WOL parts) when the interface is down.
Dunno if this driver does it. But _if_ it does it, it could cause
problems to poke with the hardware while down.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] SMSC LAN911x and LAN921x vendor driver

2007-07-29 Thread Peter Korsgaard
 Steve == Steve Glendinning [EMAIL PROTECTED] writes:

Hi,

 Steve Attached is a driver for SMSC's LAN911x and LAN921x families
 Steve of embedded ethernet controllers.

 Steve There is an existing smc911x driver in the tree; this is intended to
 Steve replace it.  Dustin McIntire (the author of the smc911x driver) has
 Steve expressed his support for switching to this driver.

What's the problem with Dustin's driver? It seems to work fine here
with a lan9117. Why not just add lan921x support to the existing
driver?

-- 
Bye, Peter Korsgaard

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] Add depth.

2007-07-29 Thread Michael Buesch
On Sunday 29 July 2007 22:21, Corey Hickey wrote:
  Compare depth against (~0U/2)-1? What's that doing? Should probably add a 
  comment.
 
 ~0U/2 - 1 is the maximum value depth can be, based on how it is used in 
 indexing q-dep. I agree, though, that deserves a comment. Actually, 
 I'll also change it to '#define SFQ_DEPTH_MAX (~0U/2 - 1)' and put it 
 near the top of the file next to the 'typedef unsigned int sfq_index;'.
 
 I could also include limits.h and use UINT_MAX instead of ~0U; would 
 that be preferable?

Seems like a good idea.

   
 if (ctl-limit)
  -  q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
  +  q-limit = min_t(u32, ctl-limit, q-depth);
 }
   
  +  q-dep = kmalloc((1+q-depth*2)*sizeof(struct sfq_head), GFP_KERNEL);
  +  if (!q-dep)
  +  goto err_case;
  +  q-next = kmalloc(q-depth*sizeof(sfq_index), GFP_KERNEL);
  +  if (!q-next)
  +  goto err_case;
  +  q-allot = kmalloc(q-depth*sizeof(short), GFP_KERNEL);
  +  if (!q-allot)
  +  goto err_case;
  +  q-hash = kmalloc(q-depth*sizeof(unsigned short), GFP_KERNEL);
  +  if (!q-hash)
  +  goto err_case;
  +  q-qs = kmalloc(q-depth*sizeof(struct sk_buff_head), GFP_KERNEL);
  +  if (!q-qs)
  +  goto err_case;
  
  You may chose to use kcalloc for array allocations.
 
 The arrays in the original code don't get zeroed either, so that 
 shouldn't be necessary (and I haven't heard of any problems so far). Do 
 you suggest I use kcalloc() anyway, just as a good practice?

Well, I think we don't have strict rules on that, so it depends
on the developer's taste. The advantage of kcalloc is, that it might
catch errors in the args better than this opencoded multiplication.
(There's some BUG_ON logic in kcalloc)

 for (i=0; iSFQ_HASH_DIVISOR; i++)
  -  q-ht[i] = SFQ_DEPTH;
  -  for (i=0; iSFQ_DEPTH; i++) {
  +  q-ht[i] = q-depth;
  +  for (i=0; iq-depth; i++) {
 skb_queue_head_init(q-qs[i]);
  -  q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
  -  q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
  +  q-dep[i+q-depth].next = i+q-depth;
  +  q-dep[i+q-depth].prev = i+q-depth;
 }
   
  -  for (i=0; iSFQ_DEPTH; i++)
  +  for (i=0; iq-depth; i++)
 sfq_link(q, i);
 return 0;
  +err_case:
  
  This leaks a few kmallocs.
 
 Are you saying that the 'err_case:' leaks kmallocs? It calls 
 sfq_q_destroy(q), which kfrees each of the arrays: dep, next, allot, 
 hash, and qs. Is that sufficient, or am I missing something or 
 misunderstanding you?

Ok, I didn't see that. So this should be ok.

  +  sfq_q_destroy(q);
  +  return -ENOBUFS;
   }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: netdevice queueing / sendmsg issue?

2007-07-29 Thread Krzysztof Halasa
David Miller [EMAIL PROTECTED] writes:

 Software interrupts might be getting lost, dev_kfree_skb_irq() has to
 queue the kfree_skb() to soft IRQ.

 Therefore, dev_kfree_skb_irq() will only work properly from hardware
 interrupt context, where we will return and thus run the scheduled
 software interrupt.

 So some things to check out are whether the driver is invoking
 dev_kfree_skb_irq() in the right context, whether ARM might have some
 software interrupt processing preculiarity, etc.

I see. I call dev_kfree_skb_irq() from hardware IRQ handler, so
the main suspect is soft IRQ processing. Should be easy now.

Thanks.
-- 
Krzysztof Halasa
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


IPUtils and uClibc

2007-07-29 Thread Rafał Bilski

Hi,

Today I was trying to update my router based on Gentoo and uClibc. 
Unfortunatly build fails because of b* functions. Linker can't 
find them later. At first gcc is complaining that b* functions 
are impilicity declared. Acording to man pages these functions 
are deprecated anyway.

Patch is for Gentoo's iputils-20070202.

---
clockdiff.c   |4 ++--
ping.c|   13 +++--
rdisc.c   |6 +++---
traceroute6.c |6 +++---
4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/clockdiff.c b/clockdiff.c
index e17e0b8..ee7bb76 100644
--- a/clockdiff.c
+++ b/clockdiff.c
@@ -594,7 +594,7 @@ main(int argc, char *argv[])

memset(server, 0, sizeof(server));
server.sin_family = hp-h_addrtype;
-	bcopy(hp-h_addr, (server.sin_addr.s_addr), 4); 
+	memcpy((server.sin_addr.s_addr), hp-h_addr, 4); 


if (connect(sock_raw, (struct sockaddr*)server, sizeof(server)) == -1) 
{
perror(connect);
@@ -605,7 +605,7 @@ main(int argc, char *argv[])
socklen_t addrlen = sizeof(myaddr);
unsigned char rspace[ip_opt_len];

-   bzero(rspace, sizeof(rspace));
+   memset(rspace, 0, sizeof(rspace));
rspace[0] = IPOPT_TIMESTAMP;
rspace[1] = ip_opt_len;
rspace[2] = 5;
diff --git a/ping.c b/ping.c
index 561e71a..6148fd9 100644
--- a/ping.c
+++ b/ping.c
@@ -58,6 +58,7 @@ char copyright[] =
 *  This program has to run SUID to ROOT to access the ICMP socket.
 */

+#include string.h
#include ping_common.h

#include netinet/ip.h
@@ -235,7 +236,7 @@ main(int argc, char **argv)
while (argc  0) {
target = *argv;

-   bzero((char *)whereto, sizeof(whereto));
+   memset((char *)whereto, 0, sizeof(whereto));
whereto.sin_family = AF_INET;
if (inet_aton(target, whereto.sin_addr) == 1) {
hostname = target;
@@ -393,7 +394,7 @@ main(int argc, char **argv)

/* record route option */
if (options  F_RROUTE) {
-   bzero(rspace, sizeof(rspace));
+   memset(rspace, 0, sizeof(rspace));
rspace[0] = IPOPT_NOP;
rspace[1+IPOPT_OPTVAL] = IPOPT_RR;
rspace[1+IPOPT_OLEN] = sizeof(rspace)-1;
@@ -405,7 +406,7 @@ main(int argc, char **argv)
}
}
if (options  F_TIMESTAMP) {
-   bzero(rspace, sizeof(rspace));
+   memset(rspace, 0, sizeof(rspace));
rspace[0] = IPOPT_TIMESTAMP;
rspace[1] = (ts_type==IPOPT_TS_TSONLY ? 40 : 36);
rspace[2] = 5;
@@ -427,7 +428,7 @@ main(int argc, char **argv)
}
if (options  F_SOURCEROUTE) {
int i;
-   bzero(rspace, sizeof(rspace));
+   memset(rspace, 0, sizeof(rspace));
rspace[0] = IPOPT_NOOP;
rspace[1+IPOPT_OPTVAL] = (options  F_SO_DONTROUTE) ? IPOPT_SSRR
: IPOPT_LSRR;
@@ -1009,7 +1010,7 @@ void pr_options(unsigned char * cp, int hlen)
if (i = 0)
continue;
if (i == old_rrlen
-!bcmp((char *)cp, old_rr, i)
+!strncmp((char *)cp, old_rr, i)
 !(options  F_FLOOD)) {
printf(\t(same route));
i = ((i + 3) / 4) * 4;
@@ -1017,7 +1018,7 @@ void pr_options(unsigned char * cp, int hlen)
break;
}
old_rrlen = i;
-   bcopy((char *)cp, old_rr, i);
+   memcpy(old_rr, (char *)cp, i);
printf(\nRR: );
cp++;
for (;;) {
diff --git a/rdisc.c b/rdisc.c
index bb223bc..b1eab40 100644
--- a/rdisc.c
+++ b/rdisc.c
@@ -409,11 +409,11 @@ next:
forever = 1;
}

-   bzero( (char *)whereto, sizeof(struct sockaddr_in) );
+   memset( (char *)whereto, 0, sizeof(struct sockaddr_in) );
to-sin_family = AF_INET;
to-sin_addr.s_addr = inet_addr(sendaddress);

-   bzero( (char *)joinaddr, sizeof(struct sockaddr_in) );
+   memset( (char *)joinaddr, 0, sizeof(struct sockaddr_in) );
joinaddr.sin_family = AF_INET;
joinaddr.sin_addr.s_addr = inet_addr(recvaddress);

@@ -1468,7 +1468,7 @@ rtioctl(struct in_addr addr, int op)
struct rtentry rt;
struct sockaddr_in *sin;

-   bzero((char *)rt, sizeof(struct rtentry));
+   memset((char *)rt, 0, sizeof(struct rtentry));
rt.rt_dst.sa_family = AF_INET;
rt.rt_gateway.sa_family = AF_INET;
rt.rt_genmask.sa_family = AF_INET;
diff --git a/traceroute6.c b/traceroute6.c
index 114cb0a..2dd5c42 100644
--- a/traceroute6.c
+++ b/traceroute6.c
@@ -276,7 +276,7 

[PATCH 3/7] Move two functions.

2007-07-29 Thread Corey Hickey
Move sfq_q_destroy() to above sfq_q_init() so that it can be used
by an error case in a later patch.

Move sfq_destroy() as well, for clarity.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 0c46938..583f925 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,6 +380,17 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
+static void sfq_destroy(struct Qdisc *sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_destroy(q);
+}
+
 static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
@@ -420,7 +431,7 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
int err;
-   
+
q-quantum = psched_mtu(sch-dev); /* default */
if ((err = sfq_q_init(q, opt)))
return err;
@@ -436,17 +447,6 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
-static void sfq_q_destroy(struct sfq_sched_data *q)
-{
-   del_timer(q-perturb_timer);
-}
-
-static void sfq_destroy(struct Qdisc *sch)
-{
-   struct sfq_sched_data *q = qdisc_priv(sch);
-   sfq_q_destroy(q);
-}
-
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] Preparatory refactoring part 2.

2007-07-29 Thread Corey Hickey
Factor code out of sfq_init() and sfq_destroy(), again so that the
new functions can be used by sfq_change() later.

Actually, as the diff itself shows, most of the sfq_q_init() code
comes from the original sfq_change(), but sfq_change() is only
called by sfq_init() right now. Thus, it is safe to remove
sfq_change(); tc qdisc change doesn't yet work for sfq anyway.

The sfq_destroy() -- sfq_q_destroy() change looks pointless here,
but it's cleaner to split now and add code to sfq_q_destroy() in a
later patch.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   80 +-
 1 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 8ae077f..0c46938 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -380,71 +380,71 @@ static void sfq_perturbation(unsigned long arg)
}
 }
 
-static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+static int sfq_q_init(struct sfq_sched_data *q, struct rtattr *opt)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
-   unsigned int qlen;
+   int i;
 
-   if (opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
+   if (opt  opt-rta_len  RTA_LENGTH(sizeof(*ctl)))
return -EINVAL;
 
-   sch_tree_lock(sch);
-   q-quantum = ctl-quantum ? : psched_mtu(sch-dev);
-   q-perturb_period = ctl-perturb_period*HZ;
-   if (ctl-limit)
-   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   q-perturbation = 0;
+   q-max_depth = 0;
+   q-tail = q-limit = SFQ_DEPTH;
+   if (opt == NULL) {
+   q-perturb_period = 0;
+   } else {
+   struct tc_sfq_qopt *ctl = RTA_DATA(opt);
+   if (ctl-quantum)
+   q-quantum = ctl-quantum;
+   q-perturb_period = ctl-perturb_period*HZ;
 
-   qlen = sch-q.qlen;
-   while (sch-q.qlen = q-limit-1)
-   sfq_drop(sch);
-   qdisc_tree_decrease_qlen(sch, qlen - sch-q.qlen);
+   if (ctl-limit)
+   q-limit = min_t(u32, ctl-limit, SFQ_DEPTH);
+   }
 
-   del_timer(q-perturb_timer);
-   if (q-perturb_period) {
-   q-perturb_timer.expires = jiffies + q-perturb_period;
-   add_timer(q-perturb_timer);
+   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   q-ht[i] = SFQ_DEPTH;
+   for (i=0; iSFQ_DEPTH; i++) {
+   skb_queue_head_init(q-qs[i]);
+   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
+   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
}
-   sch_tree_unlock(sch);
+
+   for (i=0; iSFQ_DEPTH; i++)
+   sfq_link(q, i);
return 0;
 }
 
 static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   int i;
+   int err;
+   
+   q-quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(q, opt)))
+   return err;
 
init_timer(q-perturb_timer);
q-perturb_timer.data = (unsigned long)sch;
q-perturb_timer.function = sfq_perturbation;
-
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
-   q-ht[i] = SFQ_DEPTH;
-   for (i=0; iSFQ_DEPTH; i++) {
-   skb_queue_head_init(q-qs[i]);
-   q-dep[i+SFQ_DEPTH].next = i+SFQ_DEPTH;
-   q-dep[i+SFQ_DEPTH].prev = i+SFQ_DEPTH;
-   }
-   q-limit = SFQ_DEPTH;
-   q-max_depth = 0;
-   q-tail = SFQ_DEPTH;
-   if (opt == NULL) {
-   q-quantum = psched_mtu(sch-dev);
-   q-perturb_period = 0;
-   } else {
-   int err = sfq_change(sch, opt);
-   if (err)
-   return err;
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
}
-   for (i=0; iSFQ_DEPTH; i++)
-   sfq_link(q, i);
+
return 0;
 }
 
+static void sfq_q_destroy(struct sfq_sched_data *q)
+{
+   del_timer(q-perturb_timer);
+}
+
 static void sfq_destroy(struct Qdisc *sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   del_timer(q-perturb_timer);
+   sfq_q_destroy(q);
 }
 
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


SFQ: backport some features from ESFQ (try 2)

2007-07-29 Thread Corey Hickey
Hello,

Patchset try 2 addresses the review by Michael Buesch.

This set of patches adds some of ESFQ's modifications to the original 
SFQ. Thus far, I have received support for this approach rather than for 
trying to get ESFQ included as a separate qdisc.

http://mailman.ds9a.nl/pipermail/lartc/2007q2/021056.html

My patches here implement tc qdisc change, user-configurable depth 
(number of flows), and user-configurable divisor (for setting hash table 
size). I've left out the remaining ESFQ features (usage of jhash and 
different hashing methods) because Patrick McHardy intends to submit a 
patch that will supersede that functionality; see the URL above.

Default values remain the same, and SFQ's default behavior remains the 
same, so there should be no user disruption.

A patch for iproute2 is included after the end of the kernel patch series.

Thanks for your consideration,
Corey


 include/linux/pkt_sched.h |8 --
 net/sched/sch_sfq.c   |  296 -
 2 files changed, 187 insertions(+), 117 deletions(-)

[PATCH 1/7] Preparatory refactoring part 1.
[PATCH 2/7] Preparatory refactoring part 2.
[PATCH 3/7] Move two functions.
[PATCH 4/7] Add depth.
[PATCH 5/7] Add divisor.
[PATCH 6/7] Make qdisc changeable.
[PATCH 7/7] Remove comments about hardcoded values.
[PATCH] [iproute2] SFQ: Support changing depth and divisor.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] Add divisor.

2007-07-29 Thread Corey Hickey
Make hash divisor user-configurable.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 70124ac..e6a6a21 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -75,7 +75,7 @@
It is easy to increase these values, but not in flight.  */
 
 #define SFQ_DEPTH_DEFAULT  128
-#define SFQ_HASH_DIVISOR   1024
+#define SFQ_DIVISOR_DEFAULT1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
@@ -98,6 +98,7 @@ struct sfq_sched_data
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
unsigneddepth;
+   unsignedhash_divisor;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -105,7 +106,7 @@ struct sfq_sched_data
sfq_index   tail;   /* Index of current slot in round */
sfq_index   max_depth;  /* Maximal depth */
 
-   sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
+   sfq_index   *ht;/* Hash table */
sfq_index   *next;  /* Active slots link */
short   *allot; /* Current allotment per slot */
unsigned short  *hash;  /* Hash value indexed by slots 
*/
@@ -120,7 +121,7 @@ static __inline__ unsigned sfq_fold_hash(struct 
sfq_sched_data *q, u32 h, u32 h1
/* Have we any rotation primitives? If not, WHY? */
h ^= (h1pert) ^ (h1(0x1F - pert));
h ^= h10;
-   return h  0x3FF;
+   return h  (q-hash_divisor-1);
 }
 
 static unsigned sfq_hash(struct sfq_sched_data *q, struct sk_buff *skb)
@@ -386,6 +387,7 @@ static void sfq_perturbation(unsigned long arg)
 static void sfq_q_destroy(struct sfq_sched_data *q)
 {
del_timer(q-perturb_timer);
+   kfree(q-ht);
kfree(q-dep);
kfree(q-next);
kfree(q-allot);
@@ -411,12 +413,14 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-max_depth = 0;
if (opt == NULL) {
q-perturb_period = 0;
+   q-hash_divisor = SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = SFQ_DEPTH_DEFAULT;
} else {
struct tc_sfq_qopt *ctl = RTA_DATA(opt);
if (ctl-quantum)
q-quantum = ctl-quantum;
q-perturb_period = ctl-perturb_period*HZ;
+   q-hash_divisor = ctl-divisor ? : SFQ_DIVISOR_DEFAULT;
q-tail = q-limit = q-depth = ctl-flows ? : 
SFQ_DEPTH_DEFAULT;
 
if (q-depth  SFQ_MAX_DEPTH)
@@ -426,6 +430,9 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
q-limit = min_t(u32, ctl-limit, q-depth);
}
 
+   q-ht = kcalloc(q-hash_divisor, sizeof(sfq_index), GFP_KERNEL);
+   if (!q-ht)
+   goto err_case;
q-dep = kcalloc(1 + q-depth*2, sizeof(struct sfq_head), GFP_KERNEL);
if (!q-dep)
goto err_case;
@@ -442,7 +449,7 @@ static int sfq_q_init(struct sfq_sched_data *q, struct 
rtattr *opt)
if (!q-qs)
goto err_case;
 
-   for (i=0; iSFQ_HASH_DIVISOR; i++)
+   for (i=0; iq-hash_divisor; i++)
q-ht[i] = q-depth;
for (i=0; i  q-depth; i++) {
skb_queue_head_init(q-qs[i]);
@@ -488,7 +495,7 @@ static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
opt.perturb_period = q-perturb_period/HZ;
 
opt.limit = q-limit;
-   opt.divisor = SFQ_HASH_DIVISOR;
+   opt.divisor = q-hash_divisor;
opt.flows = q-depth;
 
RTA_PUT(skb, TCA_OPTIONS, sizeof(opt), opt);
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] Preparatory refactoring part 1.

2007-07-29 Thread Corey Hickey
Make a new function sfq_q_enqueue() that operates directly on the
queue data. This will be useful for implementing sfq_change() in
a later patch. A pleasant side-effect is reducing most of the
duplicate code in sfq_enqueue() and sfq_requeue().

Similarly, make a new function sfq_q_dequeue().

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   70 ++
 1 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 9579573..8ae077f 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -77,6 +77,9 @@
 #define SFQ_DEPTH  128
 #define SFQ_HASH_DIVISOR   1024
 
+#define SFQ_HEAD 0
+#define SFQ_TAIL 1
+
 /* This type should contain at least SFQ_DEPTH*2 values */
 typedef unsigned char sfq_index;
 
@@ -244,10 +247,8 @@ static unsigned int sfq_drop(struct Qdisc *sch)
return 0;
 }
 
-static int
-sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+static void sfq_q_enqueue(struct sk_buff *skb, struct sfq_sched_data *q, 
unsigned int end)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
unsigned hash = sfq_hash(q, skb);
sfq_index x;
 
@@ -256,8 +257,12 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
q-hash[x] = hash;
}
-   sch-qstats.backlog += skb-len;
-   __skb_queue_tail(q-qs[x], skb);
+
+   if (end == SFQ_TAIL)
+   __skb_queue_tail(q-qs[x], skb);
+   else
+   __skb_queue_head(q-qs[x], skb);
+
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
if (q-tail == SFQ_DEPTH) { /* It is the first flow */
@@ -270,12 +275,21 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
q-tail = x;
}
}
+}
+
+static int
+sfq_enqueue(struct sk_buff *skb, struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   sfq_q_enqueue(skb, q, SFQ_TAIL);
+   sch-qstats.backlog += skb-len;
if (++sch-q.qlen  q-limit-1) {
sch-bstats.bytes += skb-len;
sch-bstats.packets++;
return 0;
}
 
+   sch-qstats.drops++;
sfq_drop(sch);
return NET_XMIT_CN;
 }
@@ -284,28 +298,8 @@ static int
 sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
-   unsigned hash = sfq_hash(q, skb);
-   sfq_index x;
-
-   x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
-   q-hash[x] = hash;
-   }
+   sfq_q_enqueue(skb, q, SFQ_HEAD);
sch-qstats.backlog += skb-len;
-   __skb_queue_head(q-qs[x], skb);
-   sfq_inc(q, x);
-   if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
-   q-tail = x;
-   q-next[x] = x;
-   q-allot[x] = q-quantum;
-   } else {
-   q-next[x] = q-next[q-tail];
-   q-next[q-tail] = x;
-   q-tail = x;
-   }
-   }
if (++sch-q.qlen  q-limit - 1) {
sch-qstats.requeues++;
return 0;
@@ -316,13 +310,8 @@ sfq_requeue(struct sk_buff *skb, struct Qdisc* sch)
return NET_XMIT_CN;
 }
 
-
-
-
-static struct sk_buff *
-sfq_dequeue(struct Qdisc* sch)
+static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data *q)
 {
-   struct sfq_sched_data *q = qdisc_priv(sch);
struct sk_buff *skb;
sfq_index a, old_a;
 
@@ -335,8 +324,6 @@ sfq_dequeue(struct Qdisc* sch)
/* Grab packet */
skb = __skb_dequeue(q-qs[a]);
sfq_dec(q, a);
-   sch-q.qlen--;
-   sch-qstats.backlog -= skb-len;
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
@@ -353,6 +340,21 @@ sfq_dequeue(struct Qdisc* sch)
a = q-next[a];
q-allot[a] += q-quantum;
}
+
+   return skb;
+}
+
+static struct sk_buff
+*sfq_dequeue(struct Qdisc* sch)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sk_buff *skb;
+
+   skb = sfq_q_dequeue(q);
+   if (skb == NULL)
+   return NULL;
+   sch-q.qlen--;
+   sch-qstats.backlog -= skb-len;
return skb;
 }
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] Add depth.

2007-07-29 Thread Corey Hickey
Make depth (number of queues) user-configurable:

* replace #define with a parameter
* use old hardcoded value as a default
* kcalloc() arrays in sfq_q_init()
* free() arrays in sfq_q_destroy()

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   81 +++
 1 files changed, 56 insertions(+), 25 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 583f925..70124ac 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -74,14 +74,16 @@
 
It is easy to increase these values, but not in flight.  */
 
-#define SFQ_DEPTH  128
+#define SFQ_DEPTH_DEFAULT  128
 #define SFQ_HASH_DIVISOR   1024
 
 #define SFQ_HEAD 0
 #define SFQ_TAIL 1
 
-/* This type should contain at least SFQ_DEPTH*2 values */
-typedef unsigned char sfq_index;
+/* This type must contain greater than depth*2 values, so depth is constrained 
+ * accordingly. */
+typedef unsigned int sfq_index;
+#define SFQ_MAX_DEPTH (UINT_MAX / 2 - 1)
 
 struct sfq_head
 {
@@ -95,6 +97,7 @@ struct sfq_sched_data
int perturb_period;
unsignedquantum;/* Allotment per round: MUST BE = MTU 
*/
int limit;
+   unsigneddepth;
 
 /* Variables */
struct timer_list perturb_timer;
@@ -103,11 +106,11 @@ struct sfq_sched_data
sfq_index   max_depth;  /* Maximal depth */
 
sfq_index   ht[SFQ_HASH_DIVISOR];   /* Hash table */
-   sfq_index   next[SFQ_DEPTH];/* Active slots link */
-   short   allot[SFQ_DEPTH];   /* Current allotment per slot */
-   unsigned short  hash[SFQ_DEPTH];/* Hash value indexed by slots 
*/
-   struct sk_buff_head qs[SFQ_DEPTH];  /* Slot queue */
-   struct sfq_head dep[SFQ_DEPTH*2];   /* Linked list of slots, 
indexed by depth */
+   sfq_index   *next;  /* Active slots link */
+   short   *allot; /* Current allotment per slot */
+   unsigned short  *hash;  /* Hash value indexed by slots 
*/
+   struct sk_buff_head *qs;/* Slot queue */
+   struct sfq_head *dep;   /* Linked list of slots, 
indexed by depth */
 };
 
 static __inline__ unsigned sfq_fold_hash(struct sfq_sched_data *q, u32 h, u32 
h1)
@@ -164,7 +167,7 @@ static unsigned sfq_hash(struct sfq_sched_data *q, struct 
sk_buff *skb)
 static inline void sfq_link(struct sfq_sched_data *q, sfq_index x)
 {
sfq_index p, n;
-   int d = q-qs[x].qlen + SFQ_DEPTH;
+   int d = q-qs[x].qlen + q-depth;
 
p = d;
n = q-dep[d].next;
@@ -215,7 +218,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
   drop a packet from it */
 
if (d  1) {
-   sfq_index x = q-dep[d+SFQ_DEPTH].next;
+   sfq_index x = q-dep[d + q-depth].next;
skb = q-qs[x].prev;
len = skb-len;
__skb_unlink(skb, q-qs[x]);
@@ -238,7 +241,7 @@ static unsigned int sfq_drop(struct Qdisc *sch)
kfree_skb(skb);
sfq_dec(q, d);
sch-q.qlen--;
-   q-ht[q-hash[d]] = SFQ_DEPTH;
+   q-ht[q-hash[d]] = q-depth;
sch-qstats.drops++;
sch-qstats.backlog -= len;
return len;
@@ -253,8 +256,8 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
sfq_index x;
 
x = q-ht[hash];
-   if (x == SFQ_DEPTH) {
-   q-ht[hash] = x = q-dep[SFQ_DEPTH].next;
+   if (x == q-depth) {
+   q-ht[hash] = x = q-dep[q-depth].next;
q-hash[x] = hash;
}
 
@@ -265,7 +268,7 @@ static void sfq_q_enqueue(struct sk_buff *skb, struct 
sfq_sched_data *q, unsigne
 
sfq_inc(q, x);
if (q-qs[x].qlen == 1) {   /* The flow is new */
-   if (q-tail == SFQ_DEPTH) { /* It is the first flow */
+   if (q-tail == q-depth) {  /* It is the first flow */
q-tail = x;
q-next[x] = x;
q-allot[x] = q-quantum;
@@ -316,7 +319,7 @@ static struct sk_buff *sfq_q_dequeue(struct sfq_sched_data 
*q)
sfq_index a, old_a;
 
/* No active slots */
-   if (q-tail == SFQ_DEPTH)
+   if (q-tail == q-depth)
return NULL;
 
a = old_a = q-next[q-tail];
@@ -327,10 +330,10 @@ static struct sk_buff *sfq_q_dequeue(struct 
sfq_sched_data *q)
 
/* Is the slot empty? */
if (q-qs[a].qlen == 0) {
-   q-ht[q-hash[a]] = SFQ_DEPTH;
+   q-ht[q-hash[a]] = q-depth;
a = q-next[a];
if (a == old_a) {
-   q-tail = SFQ_DEPTH;
+   q-tail = q-depth;
return skb;
}
q-next[q-tail] = a;

[PATCH 6/7] Make qdisc changeable.

2007-07-29 Thread Corey Hickey
Re-implement sfq_change() and enable Qdisc_opts.change so tc qdisc
change will work.

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 net/sched/sch_sfq.c |   51 ++-
 1 files changed, 50 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index e6a6a21..e042cd0 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -485,6 +485,55 @@ static int sfq_init(struct Qdisc *sch, struct rtattr *opt)
return 0;
 }
 
+static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
+{
+   struct sfq_sched_data *q = qdisc_priv(sch);
+   struct sfq_sched_data tmp;
+   struct sk_buff *skb;
+   int err;
+   
+   /* set up tmp queue */
+   memset(tmp, 0, sizeof(struct sfq_sched_data));
+   tmp.quantum = psched_mtu(sch-dev); /* default */
+   if ((err = sfq_q_init(tmp, opt)))
+   return err;
+
+   /* copy packets from the old queue to the tmp queue */
+   sch_tree_lock(sch);
+   while (sch-q.qlen = tmp.limit - 1)
+   sfq_drop(sch);
+   while ((skb = sfq_q_dequeue(q)) != NULL)
+   sfq_q_enqueue(skb, tmp, SFQ_TAIL);
+   
+   /* clean up the old queue */
+   sfq_q_destroy(q);
+
+   /* copy elements of the tmp queue into the old queue */
+   q-perturb_period = tmp.perturb_period;
+   q-quantum= tmp.quantum;
+   q-limit  = tmp.limit;
+   q-depth  = tmp.depth;
+   q-hash_divisor   = tmp.hash_divisor;
+   q-tail   = tmp.tail;
+   q-max_depth  = tmp.max_depth;
+   q-ht= tmp.ht;
+   q-dep   = tmp.dep;
+   q-next  = tmp.next;
+   q-allot = tmp.allot;
+   q-hash  = tmp.hash;
+   q-qs= tmp.qs;
+
+   /* finish up */
+   if (q-perturb_period) {
+   q-perturb_timer.expires = jiffies + q-perturb_period;
+   add_timer(q-perturb_timer);
+   } else {
+   q-perturbation = 0;
+   }
+   sch_tree_unlock(sch);
+   return 0;
+}
+
 static int sfq_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
struct sfq_sched_data *q = qdisc_priv(sch);
@@ -519,7 +568,7 @@ static struct Qdisc_ops sfq_qdisc_ops = {
.init   =   sfq_init,
.reset  =   sfq_reset,
.destroy=   sfq_destroy,
-   .change =   NULL,
+   .change =   sfq_change,
.dump   =   sfq_dump,
.owner  =   THIS_MODULE,
 };
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] Remove comments about hardcoded values.

2007-07-29 Thread Corey Hickey
None of these are true anymore (hooray!).

Signed-off-by: Corey Hickey [EMAIL PROTECTED]
---
 include/linux/pkt_sched.h |8 
 net/sched/sch_sfq.c   |   17 +++--
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..58a0ea6 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -148,14 +148,6 @@ struct tc_sfq_qopt
unsignedflows;  /* Maximal number of flows  */
 };
 
-/*
- *  NOTE: limit, divisor and flows are hardwired to code at the moment.
- *
- * limit=flows=128, divisor=1024;
- *
- * The only reason for this is efficiency, it is possible
- * to change these parameters in compile time.
- */
 
 /* RED section */
 
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index e042cd0..3890452 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -61,18 +61,7 @@
 
We still need true WFQ for top level CSZ, but using WFQ
for the best effort traffic is absolutely pointless:
-   SFQ is superior for this purpose.
-
-   IMPLEMENTATION:
-   This implementation limits maximal queue length to 128;
-   maximal mtu to 2^15-1; number of hash buckets to 1024.
-   The only goal of this restrictions was that all data
-   fit into one 4K page :-). Struct sfq_sched_data is
-   organized in anti-cache manner: all the data for a bucket
-   are scattered over different locations. This is not good,
-   but it allowed me to put it into 4K.
-
-   It is easy to increase these values, but not in flight.  */
+   SFQ is superior for this purpose. */
 
 #define SFQ_DEPTH_DEFAULT  128
 #define SFQ_DIVISOR_DEFAULT1024
@@ -491,7 +480,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
struct sfq_sched_data tmp;
struct sk_buff *skb;
int err;
-   
+
/* set up tmp queue */
memset(tmp, 0, sizeof(struct sfq_sched_data));
tmp.quantum = psched_mtu(sch-dev); /* default */
@@ -504,7 +493,7 @@ static int sfq_change(struct Qdisc *sch, struct rtattr *opt)
sfq_drop(sch);
while ((skb = sfq_q_dequeue(q)) != NULL)
sfq_q_enqueue(skb, tmp, SFQ_TAIL);
-   
+
/* clean up the old queue */
sfq_q_destroy(q);
 
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 0/9] netconsole: Multiple targets and dynamic reconfigurability

2007-07-29 Thread Satyam Sharma
[0/9] netconsole: Multiple targets and dynamic reconfigurability

This is v3 of the patchset, the previous versions are available at:
http://lkml.org/lkml/2007/7/4/107
http://lkml.org/lkml/2007/7/10/78

Diffed against 2.6.23-rc1-git6 (6c8dca5d as of writing), but applies
successfully to 2.6.23-rc1-mm1 as well.

Patches 1-5 are okay-to-apply to current mainline -git, I think.

Patches 6-9 introduce the new functionality and are requested for
inclusion in -mm.

[1/9] netconsole: Cleanups, codingstyle, prettyfication
[2/9] netconsole: Remove bogus check
[3/9] netconsole: Simplify boot/module option setup logic
[4/9] netconsole: Use netif_running() in write_msg()
[5/9] netconsole: Add some useful tips to documentation
[6/9] netconsole: Introduce netconsole_target
[7/9] netconsole: Introduce netconsole_netdev_notifier
[8/9] netconsole: Support multiple logging targets
[9/9] netconsole: Support dynamic reconfiguration using configfs

Changes since v2:
=

* enabled must be defined outside #ifdef NETCONSOLE_DYNAMIC

* Some simple enhancements to documentation

* Drop the use of unlikely from a condition where I'd got the
  common case wrong

About this patchset:


What?

Support multiple remote logging targets in netconsole. Also, ability to
dynamically add or remove targets or modify parameters (IP, port, remote
MAC address) of targets at runtime, from userspace, using configfs.

Why?

From Keiichi Kii's original post:

[...] current netconsole is not flexible. For example, if you want to change
ip address for logging agent, in the case of built-in netconsole, you can't
change config except for changing boot parameter and rebooting your system,
or in the case of module netconsole, you need to remove it and reload
with different parameters. [...] and we have been losing serial console
port with PCs and Servers.

(... and especially laptops, I would add.)


Satyam
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 1/9] netconsole: Cleanups, codingstyle, prettyfication

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[1/9] netconsole: Cleanups, codingstyle, prettyfication

(1) Remove unwanted headers.
(2) Mark __init and __exit as appropriate.
(3) Various trivial codingstyle and prettification stuff.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   55 ++---
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 69233f6..f1c2a2d 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -35,35 +35,32 @@
  /
 
 #include linux/mm.h
-#include linux/tty.h
 #include linux/init.h
 #include linux/module.h
 #include linux/console.h
-#include linux/tty_driver.h
 #include linux/moduleparam.h
 #include linux/string.h
-#include linux/sysrq.h
-#include linux/smp.h
 #include linux/netpoll.h
 
 MODULE_AUTHOR(Maintainer: Matt Mackall [EMAIL PROTECTED]);
 MODULE_DESCRIPTION(Console driver for network interfaces);
 MODULE_LICENSE(GPL);
 
-static char config[256];
-module_param_string(netconsole, config, 256, 0);
+#define MAX_PARAM_LENGTH   256
+#define MAX_PRINT_CHUNK1000
+
+static char config[MAX_PARAM_LENGTH];
+module_param_string(netconsole, config, MAX_PARAM_LENGTH, 0);
 MODULE_PARM_DESC(netconsole,  [EMAIL 
PROTECTED]/[dev],[tgt-port]@tgt-ip/[tgt-macaddr]\n);
 
 static struct netpoll np = {
-   .name = netconsole,
-   .dev_name = eth0,
-   .local_port = 6665,
-   .remote_port = ,
-   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+   .name   = netconsole,
+   .dev_name   = eth0,
+   .local_port = 6665,
+   .remote_port= ,
+   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
 };
-static int configured = 0;
-
-#define MAX_PRINT_CHUNK 1000
+static int configured;
 
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
@@ -75,7 +72,7 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
 
local_irq_save(flags);
 
-   for(left = len; left; ) {
+   for (left = len; left;) {
frag = min(left, MAX_PRINT_CHUNK);
netpoll_send_udp(np, msg, frag);
msg += frag;
@@ -86,12 +83,12 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
 }
 
 static struct console netconsole = {
-   .name = netcon,
-   .flags = CON_ENABLED | CON_PRINTBUFFER,
-   .write = write_msg
+   .name   = netcon,
+   .flags  = CON_ENABLED | CON_PRINTBUFFER,
+   .write  = write_msg,
 };
 
-static int option_setup(char *opt)
+static int __init option_setup(char *opt)
 {
configured = !netpoll_parse_options(np, opt);
return 1;
@@ -99,28 +96,30 @@ static int option_setup(char *opt)
 
 __setup(netconsole=, option_setup);
 
-static int init_netconsole(void)
+static int __init init_netconsole(void)
 {
-   int err;
+   int err = 0;
 
-   if(strlen(config))
+   if (strnlen(config, MAX_PARAM_LENGTH))
option_setup(config);
 
-   if(!configured) {
-   printk(netconsole: not configured, aborting\n);
-   return 0;
+   if (!configured) {
+   printk(KERN_INFO netconsole: not configured, aborting\n);
+   goto out;
}
 
err = netpoll_setup(np);
if (err)
-   return err;
+   goto out;
 
register_console(netconsole);
printk(KERN_INFO netconsole: network logging started\n);
-   return 0;
+
+out:
+   return err;
 }
 
-static void cleanup_netconsole(void)
+static void __exit cleanup_netconsole(void)
 {
unregister_console(netconsole);
netpoll_cleanup(np);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 2/9] netconsole: Remove bogus check

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[2/9] netconsole: Remove bogus check

The (!np.dev) check in write_msg() is bogus (always false), because:
np.dev is set by netpoll_setup(), which is called by init_netconsole()
before register_console(), so write_msg() cannot be triggered unless
netpoll_setup() successfully set np.dev. Also np.dev cannot go away
from under us, because netpoll_setup() grabs us reference on it. So
let's remove the bogus check.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index f1c2a2d..2c2aef1 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -67,9 +67,6 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
int frag, left;
unsigned long flags;
 
-   if (!np.dev)
-   return;
-
local_irq_save(flags);
 
for (left = len; left;) {
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 3/9] netconsole: Simplify boot/module option setup logic

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[3/9] netconsole: Simplify boot/module option setup logic

Presently, boot/module parameters are set up quite differently for the
case of built-in netconsole (__setup() - obsolete_checksetup() -
netpoll_parse_options() - strlen(config) == 0 in init_netconsole())
vs modular netconsole (module_param_string() - string copied to the
config variable - strlen(config) != 0 init_netconsole() -
netpoll_parse_options()).

This patch makes both of them similar by doing exactly the equivalent
of a module_param_string() in option_setup() also -- just copying the
param string passed from the kernel command line into config variable.
So, strlen(config) != 0 in both cases, and netpoll_parse_options() is
always called from init_netconsole(), thus making the setup logic for
both cases similar.

Now, option_setup() is only ever called / used for the built-in case,
so we put it inside a #ifndef MODULE, otherwise gcc will complain about
option_setup() being defined but not used. Also, the configured
variable is redundant with this patch and hence removed.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   27 ++-
 1 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 2c2aef1..e56aa6c 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -53,6 +53,15 @@ static char config[MAX_PARAM_LENGTH];
 module_param_string(netconsole, config, MAX_PARAM_LENGTH, 0);
 MODULE_PARM_DESC(netconsole,  [EMAIL 
PROTECTED]/[dev],[tgt-port]@tgt-ip/[tgt-macaddr]\n);
 
+#ifndefMODULE
+static int __init option_setup(char *opt)
+{
+   strlcpy(config, opt, MAX_PARAM_LENGTH);
+   return 1;
+}
+__setup(netconsole=, option_setup);
+#endif /* MODULE */
+
 static struct netpoll np = {
.name   = netconsole,
.dev_name   = eth0,
@@ -60,7 +69,6 @@ static struct netpoll np = {
.remote_port= ,
.remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
 };
-static int configured;
 
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
@@ -85,26 +93,19 @@ static struct console netconsole = {
.write  = write_msg,
 };
 
-static int __init option_setup(char *opt)
-{
-   configured = !netpoll_parse_options(np, opt);
-   return 1;
-}
-
-__setup(netconsole=, option_setup);
-
 static int __init init_netconsole(void)
 {
int err = 0;
 
-   if (strnlen(config, MAX_PARAM_LENGTH))
-   option_setup(config);
-
-   if (!configured) {
+   if (!strnlen(config, MAX_PARAM_LENGTH)) {
printk(KERN_INFO netconsole: not configured, aborting\n);
goto out;
}
 
+   err = netpoll_parse_options(np, config);
+   if (err)
+   goto out;
+
err = netpoll_setup(np);
if (err)
goto out;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 4/9] netconsole: Use netif_running() in write_msg()

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[4/9] netconsole: Use netif_running() in write_msg()

Avoid unnecessarily disabling interrupts and calling netpoll_send_udp()
if the corresponding local interface is not up.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]
Cc: Matt Mackall [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index e56aa6c..75cb761 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -75,16 +75,16 @@ static void write_msg(struct console *con, const char *msg, 
unsigned int len)
int frag, left;
unsigned long flags;
 
-   local_irq_save(flags);
-
-   for (left = len; left;) {
-   frag = min(left, MAX_PRINT_CHUNK);
-   netpoll_send_udp(np, msg, frag);
-   msg += frag;
-   left -= frag;
+   if (netif_running(np.dev)) {
+   local_irq_save(flags);
+   for (left = len; left;) {
+   frag = min(left, MAX_PRINT_CHUNK);
+   netpoll_send_udp(np, msg, frag);
+   msg += frag;
+   left -= frag;
+   }
+   local_irq_restore(flags);
}
-
-   local_irq_restore(flags);
 }
 
 static struct console netconsole = {
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 5/9] netconsole: Add some useful tips to documentation

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[5/9] netconsole: Add some useful tips to documentation

Add some useful general-purpose tips. Also suggest solution for the frequent
problem of console loglevel set too low numerically (i.e. for high priority
messages only) on the sender.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]
Acked-by: Matt Mackall [EMAIL PROTECTED]

---

 Documentation/networking/netconsole.txt |   25 +
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index 1caa6c7..5962f45 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -44,11 +44,36 @@ WARNING: the default target ethernet setting uses the 
broadcast
 ethernet address to send packets, which can cause increased load on
 other systems on the same ethernet segment.
 
+TIP: some LAN switches may be configured to suppress ethernet broadcasts
+so it is advised to explicitly specify the remote agents' MAC addresses
+from the config parameters passed to netconsole.
+
+TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
+
+ ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
+
+TIP: in case the remote logging agent is on a separate LAN subnet than
+the sender, it is suggested to try specifying the MAC address of the
+default gateway (you may use /sbin/route -n to find it out) as the
+remote MAC address instead.
+
 NOTE: the network device (eth1 in the above case) can run any kind
 of other network traffic, netconsole is not intrusive. Netconsole
 might cause slight delays in other traffic if the volume of kernel
 messages is high, but should have no other impact.
 
+NOTE: if you find that the remote logging agent is not receiving or
+printing all messages from the sender, it is likely that you have set
+the console_loglevel parameter (on the sender) to only send high
+priority messages to the console. You can change this at runtime using:
+
+ dmesg -n 8
+
+or by specifying debug on the kernel command line at boot, to send
+all kernel messages to the console. A specific value for this parameter
+can also be set using the loglevel kernel boot option. See the
+dmesg(8) man page and Documentation/kernel-parameters.txt for details.
+
 Netconsole was designed to be as instantaneous as possible, to
 enable the logging of even the most critical kernel bugs. It works
 from IRQ contexts as well, and does not enable interrupts while
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 6/9] netconsole: Introduce netconsole_target

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[6/9] netconsole: Introduce netconsole_target

Introduce a wrapper structure over netpoll to represent logging targets
configured in netconsole. This will get extended with other members in
further patches.

This is done independent of the (to-be-introduced) NETCONSOLE_DYNAMIC
config option so that we're able to drastically cut down on the #ifdef
complexity of final netconsole.c. Also, struct netconsole_target would be
required for multiple targets support also, and not just dynamic
reconfigurability.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   36 +---
 1 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 75cb761..be15ca6 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -62,24 +62,35 @@ static int __init option_setup(char *opt)
 __setup(netconsole=, option_setup);
 #endif /* MODULE */
 
-static struct netpoll np = {
-   .name   = netconsole,
-   .dev_name   = eth0,
-   .local_port = 6665,
-   .remote_port= ,
-   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+/**
+ * struct netconsole_target - Represents a configured netconsole target.
+ * @np:The netpoll structure for this target.
+ */
+struct netconsole_target {
+   struct netpoll  np;
+};
+
+static struct netconsole_target default_target = {
+   .np = {
+   .name   = netconsole,
+   .dev_name   = eth0,
+   .local_port = 6665,
+   .remote_port= ,
+   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
+   },
 };
 
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
int frag, left;
unsigned long flags;
+   struct netconsole_target *nt = default_target;
 
-   if (netif_running(np.dev)) {
+   if (netif_running(nt-np.dev)) {
local_irq_save(flags);
for (left = len; left;) {
frag = min(left, MAX_PRINT_CHUNK);
-   netpoll_send_udp(np, msg, frag);
+   netpoll_send_udp(nt-np, msg, frag);
msg += frag;
left -= frag;
}
@@ -96,17 +107,18 @@ static struct console netconsole = {
 static int __init init_netconsole(void)
 {
int err = 0;
+   struct netconsole_target *nt = default_target;
 
if (!strnlen(config, MAX_PARAM_LENGTH)) {
printk(KERN_INFO netconsole: not configured, aborting\n);
goto out;
}
 
-   err = netpoll_parse_options(np, config);
+   err = netpoll_parse_options(nt-np, config);
if (err)
goto out;
 
-   err = netpoll_setup(np);
+   err = netpoll_setup(nt-np);
if (err)
goto out;
 
@@ -119,8 +131,10 @@ out:
 
 static void __exit cleanup_netconsole(void)
 {
+   struct netconsole_target *nt = default_target;
+
unregister_console(netconsole);
-   netpoll_cleanup(np);
+   netpoll_cleanup(nt-np);
 }
 
 module_init(init_netconsole);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 7/9] netconsole: Introduce netconsole_netdev_notifier

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[7/9] netconsole: Introduce netconsole_netdev_notifier

To update fields of underlying netpoll structure at runtime on
corresponding NETDEV_CHANGEADDR or NETDEV_CHANGENAME notifications.

ioctl(SIOCSIFHWADDR or SIOCSIFNAME) could be used to change the
hardware/MAC address or name of the local interface that our netpoll
is attached to. Whenever this happens, netdev notifier chain is called
out with the NETDEV_CHANGEADDR or NETDEV_CHANGENAME event message. We
respond to that and update the local_mac or dev_name field of the struct
netpoll. This makes sense anyway, but is especially required for dynamic
netconsole because the netpoll structure's internal members become user
visible files when either sysfs or configfs are used. So this helps us
to keep up with the MAC address/name changes and keep values in struct
netpoll uptodate.

[ Note that ioctl(SIOCSIFADDR) to change IP address of interface at
  runtime is not handled (to update local_ip of netpoll) on purpose --
  some setups may set the local_ip to a private address, not necessary
  the actual IP address of the sender host, as presently allowed. ]

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

---

 drivers/net/netconsole.c |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index be15ca6..5557098 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -80,6 +80,33 @@ static struct netconsole_target default_target = {
},
 };
 
+/* Handle network interface device notifications */
+static int netconsole_netdev_event(struct notifier_block *this,
+  unsigned long event,
+  void *ptr)
+{
+   struct net_device *dev = ptr;
+   struct netconsole_target *nt = default_target;
+
+   if (nt-np.dev == dev) {
+   switch (event) {
+   case NETDEV_CHANGEADDR:
+   memcpy(nt-np.local_mac, dev-dev_addr, ETH_ALEN);
+   break;
+
+   case NETDEV_CHANGENAME:
+   strlcpy(nt-np.dev_name, dev-name, IFNAMSIZ);
+   break;
+   }
+   }
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block netconsole_netdev_notifier = {
+   .notifier_call  = netconsole_netdev_event,
+};
+
 static void write_msg(struct console *con, const char *msg, unsigned int len)
 {
int frag, left;
@@ -122,6 +149,10 @@ static int __init init_netconsole(void)
if (err)
goto out;
 
+   err = register_netdevice_notifier(netconsole_netdev_notifier);
+   if (err)
+   goto out;
+
register_console(netconsole);
printk(KERN_INFO netconsole: network logging started\n);
 
@@ -134,6 +165,7 @@ static void __exit cleanup_netconsole(void)
struct netconsole_target *nt = default_target;
 
unregister_console(netconsole);
+   unregister_netdevice_notifier(netconsole_netdev_notifier);
netpoll_cleanup(nt-np);
 }
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 -mm 8/9] netconsole: Support multiple logging targets

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[8/9] netconsole: Support multiple logging targets

This patch introduces support for multiple targets, independent of
CONFIG_NETCONSOLE_DYNAMIC -- this is useful even in the default case and
(including the infrastructure introduced in previous patches) doesn't
really add too many bytes to module text. All the complexity (and size)
comes with the dynamic reconfigurability / userspace interface patch,
and so it's plausible users may want to keep this enabled but that
disabled (say to avoid a dependency on CONFIG_CONFIGFS_FS too).

Also update documentation to mention the use of ; separator to specify
multiple logging targets in the boot/module option string.

Brief overview:

We maintain a target_list (and corresponding lock). Get rid of the static
default_target and introduce allocation and release functions for our
netconsole_target objects (but keeping sure to preserve previous behaviour
such as default values). During init_netconsole(), ; is used as the
separator to identify multiple target specifications in the boot/module
option string. The target specifications are parsed and netpolls setup.
During exit, the target_list is torn down and all items released.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Signed-off-by: Keiichi Kii [EMAIL PROTECTED]

---

 Documentation/networking/netconsole.txt |6 +
 drivers/net/netconsole.c|  171 +++---
 2 files changed, 137 insertions(+), 40 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index 5962f45..1aaa738 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -34,6 +34,12 @@ Examples:
 
  insmod netconsole netconsole=@/,@10.0.0.2/
 
+It also supports logging to multiple remote agents by specifying
+parameters for the multiple agents separated by semicolons and the
+complete string enclosed in quotes, thusly:
+
+ modprobe netconsole netconsole=@/,@10.0.0.2/;@/eth1,[EMAIL PROTECTED]/
+
 Built-in netconsole starts immediately after the TCP stack is
 initialized and attempts to bring up the supplied dev at the supplied
 address.
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index 5557098..458c4d6 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -62,44 +62,93 @@ static int __init option_setup(char *opt)
 __setup(netconsole=, option_setup);
 #endif /* MODULE */
 
+/* Linked list of all configured targets */
+static LIST_HEAD(target_list);
+
+/* This needs to be a spinlock because write_msg() cannot sleep */
+static DEFINE_SPINLOCK(target_list_lock);
+
 /**
  * struct netconsole_target - Represents a configured netconsole target.
+ * @list:  Links this target into the target_list.
  * @np:The netpoll structure for this target.
  */
 struct netconsole_target {
+   struct list_headlist;
struct netpoll  np;
 };
 
-static struct netconsole_target default_target = {
-   .np = {
-   .name   = netconsole,
-   .dev_name   = eth0,
-   .local_port = 6665,
-   .remote_port= ,
-   .remote_mac = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff},
-   },
-};
+/* Allocate new target and setup netpoll for it */
+static struct netconsole_target *alloc_target(char *target_config)
+{
+   int err = -ENOMEM;
+   struct netconsole_target *nt;
+
+   /* Allocate and initialize with defaults */
+   nt = kzalloc(sizeof(*nt), GFP_KERNEL);
+   if (!nt) {
+   printk(KERN_ERR netconsole: failed to allocate memory\n);
+   goto fail;
+   }
+
+   nt-np.name = netconsole;
+   strlcpy(nt-np.dev_name, eth0, IFNAMSIZ);
+   nt-np.local_port = 6665;
+   nt-np.remote_port = ;
+   memset(nt-np.remote_mac, 0xff, ETH_ALEN);
+
+   /* Parse parameters and setup netpoll */
+   err = netpoll_parse_options(nt-np, target_config);
+   if (err)
+   goto fail;
+
+   err = netpoll_setup(nt-np);
+   if (err)
+   goto fail;
+
+   return nt;
+
+fail:
+   kfree(nt);
+   return ERR_PTR(err);
+}
+
+/* Cleanup netpoll for given target and free it */
+static void free_target(struct netconsole_target *nt)
+{
+   netpoll_cleanup(nt-np);
+   kfree(nt);
+}
 
 /* Handle network interface device notifications */
 static int netconsole_netdev_event(struct notifier_block *this,
   unsigned long event,
   void *ptr)
 {
+   unsigned long flags;
+   struct netconsole_target *nt;
struct net_device *dev = ptr;
-   struct netconsole_target *nt = default_target;
 
-   if (nt-np.dev == dev) {
-   switch (event) {
-   case NETDEV_CHANGEADDR:
-   memcpy(nt-np.local_mac, dev-dev_addr, ETH_ALEN);
- 

[PATCH v3 -mm 9/9] netconsole: Support dynamic reconfiguration using configfs

2007-07-29 Thread Satyam Sharma
From: Satyam Sharma [EMAIL PROTECTED]

[9/9] netconsole: Support dynamic reconfiguration using configfs

This patch introduces support for dynamic reconfiguration (adding, removing
and/or modifying parameters of netconsole targets at runtime) using a
userspace interface exported via configfs. Documentation is also updated
accordingly.

Issues and brief design overview:

(1) Kernel-initiated creation / destruction of kernel objects is not
possible with configfs -- the lifetimes of the config items is managed
exclusively from userspace. But netconsole must support boot/module params
too, and these are parsed in kernel and hence netpolls must be setup from
the kernel. Joel Becker suggested to separately manage the lifetimes of
the two kinds of netconsole_target objects -- those created via configfs
mkdir(2) from userspace and those specified from the boot/module option
string. This adds complexity and some redundancy here and also means that
boot/module param-created targets are not exposed through the configfs
namespace (and hence cannot be updated / destroyed dynamically). However,
this saves us from locking / refcounting complexities that would need to
be introduced in configfs to support kernel-initiated item creation /
destroy there.

(2) In configfs, item creation takes place in the call chain of the mkdir(2)
syscall in the driver subsystem. If we used an ioctl(2) to create / destroy
objects from userspace, the special userspace program is able to fill out
the structure to be passed into the ioctl and hence specify attributes such
as local interface that are required at the time we set up the netpoll.
For configfs, this information is not available at the time of mkdir(2).
So, we keep all newly-created targets (via configfs) disabled by default.
The user is expected to set various attributes appropriately (including the
local network interface if required) and then write(2) 1 to the enabled
attribute. Thus, netpoll_setup() is then called on the set parameters in the
context of _this_ write(2) on the enabled attribute itself. This design
enables the user to reconfigure existing netconsole targets at runtime to
be attached to newly-come-up interfaces that may not have existed when
netconsole was loaded or when the targets were actually created. All this
effectively enables us to get rid of custom ioctls.

(3) Ultra-paranoid configfs attribute show() and store() operations, with
sanity and input range checking, using only safe string primitives, and
compliant with the recommendations in Documentation/filesystems/sysfs.txt.

(4) A new function netpoll_print_options() is created in the netpoll API,
that just prints out the configured parameters for a netpoll structure.
netpoll_parse_options() is modified to use that and it is also exported to
be used from netconsole.

Signed-off-by: Satyam Sharma [EMAIL PROTECTED]
Acked-by: Keiichi Kii [EMAIL PROTECTED]

---

 Documentation/networking/netconsole.txt |   68 
 drivers/net/Kconfig |   10 +
 drivers/net/netconsole.c|  605 +--
 include/linux/netpoll.h |1 +
 net/core/netpoll.c  |   44 ++-
 5 files changed, 683 insertions(+), 45 deletions(-)

diff --git a/Documentation/networking/netconsole.txt 
b/Documentation/networking/netconsole.txt
index 1aaa738..3c2f2b3 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -3,6 +3,10 @@ started by Ingo Molnar [EMAIL PROTECTED], 2001.09.17
 2.6 port and netpoll api by Matt Mackall [EMAIL PROTECTED], Sep 9 2003
 
 Please send bug reports to Matt Mackall [EMAIL PROTECTED]
+and Satyam Sharma [EMAIL PROTECTED]
+
+Introduction:
+=
 
 This module logs kernel printk messages over UDP allowing debugging of
 problem where disk logging fails and serial consoles are impractical.
@@ -13,6 +17,9 @@ the specified interface as soon as possible. While this 
doesn't allow
 capture of early kernel panics, it does capture most of the boot
 process.
 
+Sender and receiver configuration:
+==
+
 It takes a string configuration parameter netconsole in the
 following format:
 
@@ -46,6 +53,67 @@ address.
 
 The remote host can run either 'netcat -u -l -p port' or syslogd.
 
+Dynamic reconfiguration:
+
+
+Dynamic reconfigurability is a useful addition to netconsole that enables
+remote logging targets to be dynamically added, removed, or have their
+parameters reconfigured at runtime from a configfs-based userspace interface.
+[ Note that the parameters of netconsole targets that were specified/created
+from the boot/module option are not exposed via this interface, and hence
+cannot be modified dynamically. ]
+
+To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
+netconsole module (or kernel, if netconsole is built-in).
+
+Some examples follow (where configfs is mounted at the /sys/kernel/config