Fw: AIM7 fails with 2.6.18-rc5-mm1

2006-09-05 Thread Andrew Morton

We think this is a net bug.


Begin forwarded message:

Date: Mon, 4 Sep 2006 17:02:22 -0700 (PDT)
From: Christoph Lameter [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: linux-kernel@vger.kernel.org
Subject: AIM7 fails with 2.6.18-rc5-mm1


On an 8p Altix with 6 GB RAM:

AIM Multiuser Benchmark - Suite VII Run Beginning

Tasks   jobs/min  jti  jobs/min/task  real    cpu
    1    2435.06  100      2435.0649  2.46   0.02   Mon Sep  4 10:17:44 2006
  100  178784.27   94      1787.8427  3.36   7.08   Mon Sep  4 10:17:58 2006
  200  280636.11   95      1403.1805  4.28  14.46   Mon Sep  4 10:18:15 2006
  300  340973.67   91      1136.5789  5.28  22.35   Mon Sep  4 10:18:37 2006
  400  382897.26   82       957.2431  6.27  30.44   Mon Sep  4 10:19:03 2006
  500  413793.10   86       827.5862  7.25  38.14   Mon Sep  4 10:19:33 2006
  600  434940.20   89       724.9003  8.28  46.43   Mon Sep  4 10:20:07 2006
  700
Fatal error 98 at line 284 of file pipe_test.c: bind on write -- Address already in use

Child #489: : Address already in use

Failed to execute
udp_test 100

Fatal error 98 at line 264 of file pipe_test.c: bind on write -- Address already in use

Child #286: : Address already in use

Failed to execute
udp_test 100

etc etc

Is this a known issue?
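
Error 98 in the output above is EADDRINUSE on Linux. As a rough illustration (this is not the AIM7 udp_test code), the sketch below shows one ordinary way to get that error: binding a second UDP socket to a port that another socket already holds. second_bind_errno() is a hypothetical helper made up for this example, and whether the benchmark failures come from a collision like this or from a kernel regression is exactly the open question.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Bind two UDP sockets to the same port.  Returns the errno from the
 * second bind(), 0 if it unexpectedly succeeded, or -1 if setup failed.
 */
int second_bind_errno(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int a, b, err = -1;

	a = socket(AF_INET, SOCK_DGRAM, 0);
	b = socket(AF_INET, SOCK_DGRAM, 0);
	if (a < 0 || b < 0)
		goto out;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	if (bind(a, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (getsockname(a, (struct sockaddr *)&addr, &len) < 0)
		goto out;

	/* Same address and port, no SO_REUSEADDR: expected to fail. */
	if (bind(b, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		err = errno;
	else
		err = 0;
out:
	if (a >= 0)
		close(a);
	if (b >= 0)
		close(b);
	return err;
}
```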

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [take15 1/4] kevent: Core files.

2006-09-05 Thread Arnd Bergmann
On Monday 04 September 2006 12:14, Evgeniy Polyakov wrote:

> +asmlinkage long sys_kevent_get_events(int ctl_fd, unsigned int min_nr,
> +		unsigned int max_nr, __u64 timeout, void __user *buf,
> +		unsigned flags)
> +asmlinkage long sys_kevent_ctl(int fd, unsigned int cmd, unsigned int num,
> +		void __user *arg)

The 'void __user *' argument in both of these (buf and arg) always points
to a struct ukevent, according to your documentation. Shouldn't it be a
'struct ukevent __user *' then?

Arnd 


Re: [RFC] network namespaces

2006-09-05 Thread Daniel Lezcano

Hi all,


> This complete separation of namespaces is very useful for at least two
> purposes:
>  - allowing users to create and manage on their own various tunnels and
>    VPNs, and
>  - enabling easier and more straightforward live migration of groups of
>    processes with their environment.

> I conceptually prefer this approach, but I seem to recall there were
> actual problems in using this for checkpoint/restart of lightweight
> (application) containers.  Performance aside, are there any reasons why
> this approach would be problematic for c/r?

I agree with this approach too; separate namespaces are the best way to
identify the network resources for a specific container.

> I'm afraid Daniel may be on vacation, and don't know who else other than
> Eric might have thoughts on this.

Yes, I was on vacation, but I am back :)

> 2. People expressed concerns that complete separation of namespaces
>    may introduce an undesired overhead in certain usage scenarios.
>    The overhead comes from packets traversing input path, then output path,
>    then input path again in the destination namespace if root namespace
>    acts as a router.


Yes, performance is probably one issue.

My concern was about layer 2 versus layer 3 virtualization. I agree that
layer 2 isolation/virtualization is the best fit for a system container.
But there is another family of containers, called application containers,
where it is not a whole system that runs inside the container but only the
application. If you want to run an Oracle database inside a container,
you can run it inside an application container without launching init
and all the services.

This family of containers is also used for HPC (high performance
computing) and for distributed checkpoint/restart. The cluster runs
hundreds of jobs, spawning them on different hosts inside an application
container. Usually the jobs communicate with broadcast and multicast.
Application containers do not care about having different MAC addresses
and rely on a layer 3 approach.

Are application containers comfortable with layer 2 virtualization? I
don't think so, because several jobs running inside the same host
communicate via broadcast/multicast among themselves and with other jobs
running on different hosts. The IP consumption is a problem too: 1
container == 2 IPs (one for the root namespace, one for the container),
multiplied by the number of jobs. Furthermore, lots of jobs == lots
of virtual devices.

However, after a discussion with Kirill at OLS, it appears we can
merge the layer 2 and layer 3 approaches if the level of network
virtualization is tunable and we can choose layer 2 or layer 3 when
doing the unshare. The determination of the namespace for the incoming
traffic can be done with a specific iptables module as a first step.
While looking at the network namespace patches, it appears that the
TCP/UDP part is **very** similar to what is needed for a layer 3 approach.

Any thoughts?

Daniel


Re: [RFC] network namespaces

2006-09-05 Thread Eric W. Biederman
Daniel Lezcano [EMAIL PROTECTED] writes:

>> 2. People expressed concerns that complete separation of namespaces
>>    may introduce an undesired overhead in certain usage scenarios.
>>    The overhead comes from packets traversing input path, then output path,
>>    then input path again in the destination namespace if root namespace
>>    acts as a router.
>
> Yes, performance is probably one issue.
>
> My concern was about layer 2 versus layer 3 virtualization. I agree that
> layer 2 isolation/virtualization is the best fit for a system container.
> But there is another family of containers, called application containers,
> where it is not a whole system that runs inside the container but only the
> application. If you want to run an Oracle database inside a container,
> you can run it inside an application container without launching init
> and all the services.
>
> This family of containers is also used for HPC (high performance
> computing) and for distributed checkpoint/restart. The cluster runs
> hundreds of jobs, spawning them on different hosts inside an application
> container. Usually the jobs communicate with broadcast and multicast.
> Application containers do not care about having different MAC addresses
> and rely on a layer 3 approach.
>
> Are application containers comfortable with layer 2 virtualization? I
> don't think so, because several jobs running inside the same host
> communicate via broadcast/multicast among themselves and with other jobs
> running on different hosts. The IP consumption is a problem too: 1
> container == 2 IPs (one for the root namespace, one for the container),
> multiplied by the number of jobs. Furthermore, lots of jobs == lots
> of virtual devices.
>
> However, after a discussion with Kirill at OLS, it appears we can
> merge the layer 2 and layer 3 approaches if the level of network
> virtualization is tunable and we can choose layer 2 or layer 3 when
> doing the unshare. The determination of the namespace for the incoming
> traffic can be done with a specific iptables module as a first step.
> While looking at the network namespace patches, it appears that the
> TCP/UDP part is **very** similar to what is needed for a layer 3 approach.
>
> Any thoughts?

For HPC, if you are interested in migration you need a separate IP per
container.  If you can take your IP address with you, migration of
networking state is simple.  If you can't take your IP address with
you, a network container is nearly pointless from a migration
perspective.

Beyond that, from everything I have seen, layer 2 is just much cleaner
than any layer 3 approach short of Serge's bind filtering.

Beyond that, I have yet to see clean semantics for anything
resembling your layer 2 / layer 3 hybrid approach.  If we can't have
clear semantics it is by definition impossible to implement correctly,
because no one understands what it is supposed to do.

Note: a true layer 3 approach has no impact on TCP/UDP filtering
because it filters at bind time, not at packet reception time.  Once
you start inspecting packets I don't see what the gain is from not
going all of the way to layer 2.

Eric
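
To make the "filters at bind time" point concrete, here is a minimal sketch of what such a check could look like. It is not taken from Serge's bind filtering patches or from any posted namespace code: struct l3_container and l3_bind_allowed() are hypothetical names, and addresses are plain host-order IPv4 values for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-container state: the IPv4 addresses it owns. */
struct l3_container {
	const uint32_t *addrs;
	size_t naddrs;
};

/*
 * Bind-time check: return 1 if the container may bind to addr, 0 otherwise.
 * Nothing is inspected at packet reception time.  A wildcard bind (0.0.0.0)
 * needs a policy decision of its own; here it is treated like any other
 * address that is not on the list.
 */
int l3_bind_allowed(const struct l3_container *c, uint32_t addr)
{
	size_t i;

	for (i = 0; i < c->naddrs; i++)
		if (c->addrs[i] == addr)
			return 1;
	return 0;
}
```

Once a socket has passed a check like this, ordinary socket lookup delivers packets to it, which is why this style of isolation does not touch the receive path.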


Re: sky2: hw checksum failures

2006-09-05 Thread Stephen Hemminger

Benjamin Herrenschmidt wrote:
> On Mon, 2006-09-04 at 20:56 -0700, Stephen Hemminger wrote:
>> On Tue, 05 Sep 2006 13:42:38 +1000
>> Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
>>
>>> On Mon, 2006-09-04 at 20:34 -0700, Stephen Hemminger wrote:
>>>> Unneeded byte swap was occurring.
>>>>
>>>> --- linux-2.6.orig/drivers/net/sky2.c
>>>> +++ linux-2.6/drivers/net/sky2.c
>>>> @@ -2001,7 +2001,7 @@ static int sky2_status_intr(struct sky2_
>>>>  		case OP_RXCHKS:
>>>>  			skb = sky2->rx_ring[sky2->rx_next].skb;
>>>>  			skb->ip_summed = CHECKSUM_HW;
>>>> -			skb->csum = le16_to_cpu(status);
>>>> +			skb->csum = status;
>>>>  			break;
>>>>
>>>>  		case OP_TXINDEXLE:
>>>
>>> I've removed it in my patches (have you seen the other patches I sent for
>>> this driver?), though I'm pre-swapping status and length now before the
>>> switch/case so there might still be an issue there. I'll have a look.
>>
>> The other tack would be to leave the reverse-in-hw flag on and take out
>> all the existing swap calls, but then you have to add an ifdef to re-order
>> all the structures for tx_le, rx_le, status_le.
>> That is what the vendor (GPL) version of sk98lin does.
>
> I prefer keeping the HW swap out of the way for now... that way, I know
> the card will react exactly like in an x86, and I avoid those ugly
> ifdef's. At least on powerpc, there is no cost in doing swap in software
> (well, pretty much no cost).
>
> Which means that if it worked on x86 with le16_to_cpu, it should work on
> powerpc... The main difference here however is that you called
> le16_to_cpu (which is basically a nop) on a 32-bit field, while I
> called le32_to_cpu() on it. But both should lead to the same ... (x86
> will do a swapped 16-bit load of the first 2 bytes, while ppc will do a
> load of 4 bytes and swap that, thus ending up with the first 2 bytes
> swapped in the low order of the result). I'll dump the values and have a
> look to be sure. Another possibility would be a problem with the bits
> telling the chip where to calculate the checksum.

Hardware only computes 16 bit checksum.
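
Ben's argument about the two conversions can be checked in plain C. The sketch below is not driver code: load_le16(), load_le32() and csum_reads_agree() are hypothetical helpers, and the layout (a little-endian 32-bit status word whose first two bytes carry the checksum) is an assumption taken from the discussion above.

```c
#include <stdint.h>

/* Portable little-endian loads from a byte buffer, independent of host order. */
uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns 1 when reading the first two bytes as a 16-bit little-endian
 * value gives the same result as reading all four bytes as a 32-bit
 * little-endian value and keeping the low 16 bits.
 */
int csum_reads_agree(const uint8_t status[4])
{
	uint16_t via16 = load_le16(status);
	uint16_t via32 = (uint16_t)load_le32(status);

	return via16 == via32;
}
```

If the two reads agree for every input, as they do here, a remaining checksum failure would have to come from somewhere other than the choice between the 16-bit and 32-bit conversion.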


[REVISED] [PATCH] ethtool v4: add autoneg advertise feature

2006-09-05 Thread Jeff Kirsher

Adds the ability to change the advertised speed and duplex for a network
interface.  Previously, a network interface could only advertise either all
supported speeds and duplexes or one individual speed and duplex.  The
feature allows the user to choose which supported speeds and duplexes to
advertise by passing a hex value.

Signed-off-by: Jeff Kirsher [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]

---

 ethtool.8 |   24 
 ethtool.c |   12 +++-
 2 files changed, 35 insertions(+), 1 deletions(-)

diff --git a/ethtool.8 b/ethtool.8
index 888a7d8..679f6bc 100644
--- a/ethtool.8
+++ b/ethtool.8
@@ -176,6 +176,8 @@ ethtool \- Display or change ethernet ca
 .B2 duplex half full
 .B4 port tp aui bnc mii fibre
 .B2 autoneg on off
+.RB [ advertise
+.IR N ]
 .RB [ phyad
 .IR N ]
 .B2 xcvr internal external
@@ -327,6 +329,28 @@ Select device port.
 Specify if autonegotiation is enabled. In the usual case it is, but might
 cause some problems with some network devices, so you can turn it off.
 .TP
+.BI advertise \ N
+Set the speed and duplex advertised by autonegotiation.  The argument is
+a hexidecimal value using one or a combination of the following values:
+.RS
+.PD 0
+.TP 3
+.BR 0x01 10 Half
+.TP 3
+.BR 0x02 10 Full
+.TP 3
+.BR 0x04 100 Half
+.TP 3
+.BR 0x08 100 Full
+.TP 3
+.BR 0x10 1000 Half (not supported by IEEE standards)
+.TP 3
+.BR 0x20 1000 Full
+.TP 3
+.BR 0x3F Auto
+.PD
+.RE
+.TP
 .BI phyad \ N
 PHY address.
 .TP
diff --git a/ethtool.c b/ethtool.c
index 87e22ab..b7f189a 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -99,6 +99,7 @@ static struct option {
 	"  [ duplex half|full ]\n"
 	"  [ port tp|aui|bnc|mii|fibre ]\n"
 	"  [ autoneg on|off ]\n"
+	"  [ advertise %%x ]\n"
 	"  [ phyad %%d ]\n"
 	"  [ xcvr internal|external ]\n"
 	"  [ wol p|u|m|b|a|g|s|d... ]\n"
@@ -549,6 +550,15 @@ static void parse_cmdline(int argc, char
 				show_usage(1);
 			}
 			break;
+		} else if (!strcmp(argp[i], "advertise")) {
+			gset_changed = 1;
+			i += 1;
+			if (i >= argc)
+				show_usage(1);
+			advertising_wanted = strtol(argp[i], NULL, 16);
+			if (advertising_wanted < 0)
+				show_usage(1);
+			break;
 		} else if (!strcmp(argp[i], "phyad")) {
 			gset_changed = 1;
 			i += 1;
@@ -601,7 +611,7 @@ static void parse_cmdline(int argc, char
 		}
 	}
 
-	if (autoneg_wanted == AUTONEG_ENABLE){
+	if ((autoneg_wanted == AUTONEG_ENABLE) && (advertising_wanted < 0)) {
 		if (speed_wanted == SPEED_10 && duplex_wanted == DUPLEX_HALF)
 			advertising_wanted = ADVERTISED_10baseT_Half;
 		else if (speed_wanted == SPEED_10 &&
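
To make the mask arithmetic concrete, here is a small standalone sketch; it is not part of the patch. The ADV_* names are made up for the example and simply mirror the values in the man page table above, and parse_advertise() is a hypothetical helper that is deliberately stricter than the patch, which only rejects negative values.

```c
#include <stdlib.h>

/* Values copied from the man page table in the patch; names are illustrative. */
#define ADV_10_HALF	0x01
#define ADV_10_FULL	0x02
#define ADV_100_HALF	0x04
#define ADV_100_FULL	0x08
#define ADV_1000_HALF	0x10
#define ADV_1000_FULL	0x20
#define ADV_ALL		0x3F

/*
 * Parse a hex mask with strtol(..., 16), as the patch does.  Returns the
 * mask, or -1 for empty or malformed input, a zero or negative value, or
 * bits outside the table.
 */
long parse_advertise(const char *arg)
{
	char *end;
	long mask = strtol(arg, &end, 16);

	if (end == arg || *end != '\0')
		return -1;
	if (mask <= 0 || (mask & ~ADV_ALL))
		return -1;
	return mask;
}
```

For example, advertising only 100 Full and 1000 Full is 0x08 | 0x20 = 0x28, so with the syntax this patch proposes the user would pass "advertise 0x28".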


Re: [take15 4/4] kevent: Timer notifications.

2006-09-05 Thread Arnd Bergmann
On Monday 04 September 2006 12:14, Evgeniy Polyakov wrote:
> Timer notifications can be used for fine grained per-process time
> management, since interval timers are very inconvenient to use,
> and they are limited.

I guess this must have been discussed before, but why is this
not using high-resolution timers?

Are you planning to change this?

Maybe at least mention it in the description.

Arnd 


Re: [2.6.19 PATCH 1/7] ehea: interface to network stack

2006-09-05 Thread Thomas Klein

Hi Francois,

Thanks for your review and your comments. Our answers are below.

Regards
Thomas



Francois Romieu wrote:

>> +	cb2 = kzalloc(H_CB_ALIGNMENT, GFP_KERNEL);
>> +	if (!cb2) {
>> +		ehea_error("no mem for cb2");
>> +		goto kzalloc_failed;
>
> It's better when the label tells what it does than where it comes from.
> If it's numbered too, one can check them without going back and forth.
>
>> +	stats->tx_packets = cb2->txucp + cb2->txmcp + cb2->txbcp;
>> +	stats->multicast = cb2->rxmcp;
>> +	stats->rx_errors = cb2->rxuerr;
>> +	stats->rx_bytes = cb2->rxo;
>> +	stats->tx_bytes = cb2->txo;
>> +	stats->rx_packets = rx_packets;
>> +
>> +hcall_failed:
>> +	kfree(cb2);
>
> Tab was turned into spaces.

Fixed.

>> +static inline int ehea_refill_rq1(struct ehea_port_res *pr, int index,
>
> Avoid inline ?

Inline declaration was removed from this one and several other functions.

>> +	for (i = 0; i < nr_of_wqes; i++) {
>> +		if (!skb_arr_rq1[index]) {
>> +			skb_arr_rq1[index] = dev_alloc_skb(EHEA_LL_PKT_SIZE);
>
> netdev_alloc_skb ?

Agreed & done.

>> +
>> +			if (!skb_arr_rq1[index]) {
>> +				ehea_error("no mem for skb/%d wqes filled", i);
>> +				ret = -ENOMEM;
>
> The caller does not check the returned value.

Agreed. The function returns void now.

>> +		if (!skb_arr_rq1[i]) {
>> +			ehea_error("no mem for skb/%d skbs filled.", i);
>> +			ret = -ENOMEM;
>> +			goto exit0;
>
> s/exit0/out/

Goto target naming was reworked throughout the whole driver and basically
uses the style used by Dave M. and Jeff G. in the Tigon3 driver.

>> +static inline int ehea_check_cqe(struct ehea_cqe *cqe, int *rq_num)
>> +{
>> +	*rq_num = (cqe->type & EHEA_CQE_TYPE_RQ) >> 5;
>> +	if ((cqe->status & EHEA_CQE_STAT_ERR_MASK) == 0)
>> +		return 0;
>> +	if (((cqe->status & EHEA_CQE_STAT_ERR_TCP) != 0)
>> +	    && (cqe->header_length == 0))
>
> && on the previous line please.

Changed at all occurrences.

>> +static inline struct sk_buff *get_skb_by_index(struct sk_buff **skb_array,
>> +					       int arr_len,
>> +					       struct ehea_cqe *cqe)
>> +{
>> +	int skb_index = EHEA_BMASK_GET(EHEA_WR_ID_INDEX, cqe->wr_id);
>> +	struct sk_buff *skb;
>> +	void *pref;
>> +	int x;
>> +
>> +	x = skb_index + 1;
>> +	x &= (arr_len - 1);
>> +
>> +	pref = (void*)skb_array[x];
>
> Useless cast.

Agreed - removed.
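
The quoted get_skb_by_index() picks the next slot by adding one and masking with arr_len - 1. As a small aside, the sketch below shows why that wraps correctly only when arr_len is a power of two; the quoted hunk does not state that assumption, and ring_next() and ring_next_any() are hypothetical names, not driver functions.

```c
/* Slot after index in a ring of arr_len entries; arr_len must be a power of two. */
int ring_next(int index, int arr_len)
{
	return (index + 1) & (arr_len - 1);
}

/* Same wrap-around for any positive arr_len, at the cost of a division. */
int ring_next_any(int index, int arr_len)
{
	return (index + 1) % arr_len;
}
```

With arr_len = 8 both helpers wrap slot 7 back to slot 0. With arr_len = 6 the masked version returns 4 for index 5 instead of 0, so the mask form is only safe if the array sizes are guaranteed to be powers of two.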

>> +		if (unlikely(!skb)) {
>> +			if (netif_msg_rx_err(port))
>> +				ehea_error("LL rq1: skb=NULL");
>> +			skb = dev_alloc_skb(EHEA_LL_PKT_SIZE);
>
> Tab/space

Fixed.

>> +irqreturn_t ehea_qp_aff_irq_handler(int irq, void *param, struct pt_regs * regs)
>
> static ?

Agreed.

>> +int ehea_sense_port_attr(struct ehea_port *port)
>
> static ?

No - used in ehea_ethtool.c

>> +	} else {
>> +		if (hret == H_AUTHORITY)
>> +		{
>
> Misplaced curly brace.

Fixed.

>> +			ehea_info("Hypervisor denied setting port speed. Either"
>> +				  " this partition is not authorized to set "
>> +				  "port speed or another partition has modified"
>> +				  " port speed first.");
>> +			ret = -EPERM;
>> +		} else
>> +		{
>
> Misplaced curly brace.

Fixed.

>> +			ret = -EIO;
>> +			ehea_error("Failed setting port speed");
>> +		}
>> +	}
>> +	netif_carrier_on(port->netdev);
>> +exit0:
>> +	kfree(cb4);
>
> cb4 is NULL. Not wrong per se but I'd rather move the label one line down.

Agreed.

>> +void ehea_neq_tasklet(unsigned long data)
>
> static ?

Agreed.

>> +irqreturn_t ehea_interrupt_neq(int irq, void *param, struct pt_regs *regs)
>
> static ?

Agreed.

>> +{
>> +	struct ehea_adapter *adapter = (struct ehea_adapter*)param;
>
> Useless cast.

Fixed.

>> +static int ehea_fill_port_res(struct ehea_port_res *pr)
>> +{
>> +	int ret;
>> +	struct ehea_qp_init_attr *init_attr = &pr->qp->init_attr;
>> +
>> +	/* RQ 1 */
>> +	ret = ehea_init_fill_rq1(pr, init_attr->act_nr_rwqes_rq1
>> +				 - init_attr->act_nr_rwqes_rq2
>> +				 - init_attr->act_nr_rwqes_rq3 - 1);
>> +	/* RQ 2 */
>
> Useless comment.

Removed.

>> +		for (k = 0; k < i; k++) {
>> +			u32 ist = port->port_res[k].recv_eq->attr.ist1;
>> +			ibmebus_free_irq(NULL, ist, &port->port_res[k]);
>> +		}
>> +		goto failure;
>
> Poor label (and bloaty release practice too: remove k, reuse i below
> and more importantly release the things in allocation-reversed order).

Somehow I don't get your point concerning the usage of 'k'. We need another
iterator as the for loops using 'k' use 'i' as their terminating condition.
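
For what it's worth, one reading of Francois's remark is that the failing loop's own counter can drive the unwind, walking backwards over whatever was set up successfully. The sketch below illustrates that pattern in plain userspace C; it is not ehea code. setup_all(), release_all() and the fail_at knob are hypothetical, and malloc()/free() stand in for the IRQ registration and release calls.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Set up n resources.  fail_at is a test knob for this sketch: the slot
 * with that index is forced to fail (pass -1 for no forced failure).
 */
int setup_all(void **res, int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		res[i] = (i == fail_at) ? NULL : malloc(16);
		if (!res[i])
			goto out_release;
	}
	return 0;

out_release:
	/* i is the slot that failed; release the earlier ones, newest first. */
	while (i-- > 0) {
		free(res[i]);
		res[i] = NULL;
	}
	return -ENOMEM;
}

/* Normal teardown uses the same reverse order. */
void release_all(void **res, int n)
{
	while (n-- > 0) {
		free(res[n]);
		res[n] = NULL;
	}
}
```

The label says what it does (out_release), no second iterator is needed, and resources go away in the reverse of the order in which they were acquired.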


>> +	}
>> +	if

[PATCH] FRV: Fix {dis,en}able_irq_lockdep_irqrestore compile error

2006-09-05 Thread David Howells

Fix the lack of certain non-LOCKDEP stub functions in linux/interrupt.h and
also provide FRV with LOCKDEP variants.

This is to be applied to the -mm kernel, since not all of the functions added
exist in the main kernel.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 frv-irq-lockdep-2618rc5mm1.diff
 include/asm-frv/irq.h |   43 +++
 include/linux/interrupt.h |2 ++
 2 files changed, 45 insertions(+)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/include/asm-frv/irq.h linux-2.6.18-rc5-mm1-frv/include/asm-frv/irq.h
--- ../kernels/linux-2.6.18-rc5-mm1/include/asm-frv/irq.h   2006-09-04 18:02:48.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/include/asm-frv/irq.h  2006-09-05 15:59:08.0 +0100
@@ -39,5 +39,48 @@ extern void disable_irq_nosync(unsigned 
 extern void disable_irq(unsigned int irq);
 extern void enable_irq(unsigned int irq);
 
+#ifdef CONFIG_LOCKDEP
+/*
+ * Special lockdep variants of irq disabling/enabling.
+ * These should be used for locking constructs that
+ * know that a particular irq context which is disabled,
+ * and which is the only irq-context user of a lock,
+ * that it's safe to take the lock in the irq-disabled
+ * section without disabling hardirqs.
+ *
+ * On !CONFIG_LOCKDEP they are equivalent to the normal
+ * irq disable/enable methods.
+ */
+static inline void disable_irq_nosync_lockdep(unsigned int irq)
+{
+   disable_irq_nosync(irq);
+   local_irq_disable();
+}
+
+static inline void disable_irq_nosync_lockdep_irqsave(unsigned int irq, unsigned long *flags)
+{
+   disable_irq_nosync(irq);
+   local_irq_save(*flags);
+}
+
+static inline void disable_irq_lockdep(unsigned int irq)
+{
+   disable_irq(irq);
+   local_irq_disable();
+}
+
+static inline void enable_irq_lockdep(unsigned int irq)
+{
+   local_irq_enable();
+   enable_irq(irq);
+}
+
+static inline void enable_irq_lockdep_irqrestore(unsigned int irq, unsigned long *flags)
+{
+   local_irq_restore(*flags);
+   enable_irq(irq);
+}
+#endif /* CONFIG_LOCKDEP */
+
 
 #endif /* _ASM_IRQ_H_ */
diff -urp ../kernels/linux-2.6.18-rc5-mm1/include/linux/interrupt.h linux-2.6.18-rc5-mm1-frv/include/linux/interrupt.h
--- ../kernels/linux-2.6.18-rc5-mm1/include/linux/interrupt.h   2006-09-04 18:03:31.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/include/linux/interrupt.h  2006-09-05 15:58:53.0 +0100
@@ -178,6 +178,8 @@ static inline int disable_irq_wake(unsig
 #  define disable_irq_nosync_lockdep(irq)  disable_irq_nosync(irq)
 #  define disable_irq_lockdep(irq) disable_irq(irq)
 #  define enable_irq_lockdep(irq)  enable_irq(irq)
+#  define disable_irq_nosync_lockdep_irqsave(irq, flags) disable_irq_nosync(irq)
+#  define enable_irq_lockdep_irqrestore(irq, flags) enable_irq(irq)
 # endif
 
 #endif /* CONFIG_GENERIC_HARDIRQS */


[PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-05 Thread David Howells

Stop do_gettimeofday() on FRV from using tickadj, and model it after ARM
instead.

This patch also provides a placeholder macro for getting hardware timer data to
be filled in when such is available.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 frv-tickadj-2618rc5mm1.diff
 arch/frv/kernel/time.c |   20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/arch/frv/kernel/time.c linux-2.6.18-rc5-mm1-frv/arch/frv/kernel/time.c
--- ../kernels/linux-2.6.18-rc5-mm1/arch/frv/kernel/time.c  2006-09-04 18:03:14.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/arch/frv/kernel/time.c 2006-09-05 15:44:42.0 +0100
@@ -31,6 +31,9 @@
 
 #define TICK_SIZE (tick_nsec / 1000)
 
+/* H/W clock data if we can get it (in microseconds) */
+#define FRV_HW_CLOCK_DATA (0)
+
 unsigned long __nongprelbss __clkin_clock_speed_HZ;
 unsigned long __nongprelbss __ext_bus_clock_speed_HZ;
 unsigned long __nongprelbss __res_bus_clock_speed_HZ;
@@ -148,23 +151,10 @@ void do_gettimeofday(struct timeval *tv)
 {
unsigned long seq;
unsigned long usec, sec;
-   unsigned long max_ntp_tick;
 
do {
seq = read_seqbegin(&xtime_lock);
-
-   usec = 0;
-
-   /*
-* If time_adjust is negative then NTP is slowing the clock
-* so make sure not to go into next possible interval.
-* Better to lose some accuracy than have time go backwards..
-*/
-   if (unlikely(time_adjust < 0)) {
-   max_ntp_tick = (USEC_PER_SEC / HZ) - tickadj;
-   usec = min(usec, max_ntp_tick);
-   }
-
+   usec = FRV_HW_CLOCK_DATA;
sec = xtime.tv_sec;
usec += (xtime.tv_nsec / 1000);
} while (read_seqretry(&xtime_lock, seq));
@@ -195,7 +185,7 @@ int do_settimeofday(struct timespec *tv)
 * wall time.  Discover what correction gettimeofday() would have
 * made, and then undo it!
 */
-   nsec -= 0 * NSEC_PER_USEC;
+   nsec -= FRV_HW_CLOCK_DATA * NSEC_PER_USEC;
 
wtm_sec  = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec);
wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec);


[PATCH] NOMMU: Provide page_mkclean() for NOMMU

2006-09-05 Thread David Howells

Provide a page_mkclean() implementation for NOMMU.  This doesn't do anything
except return successfully as there are no PTEs for it to play with.

This is only relevant to the -mm kernels.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 nommu-page_mkclean-2618rc5mm1.diff
 include/linux/rmap.h |6 ++
 1 file changed, 6 insertions(+)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/include/linux/rmap.h linux-2.6.18-rc5-mm1-frv/include/linux/rmap.h
--- ../kernels/linux-2.6.18-rc5-mm1/include/linux/rmap.h    2006-09-04 18:03:32.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/include/linux/rmap.h   2006-09-05 15:34:35.0 +0100
@@ -120,6 +120,12 @@ int page_mkclean(struct page *);
 #define page_referenced(page,l) TestClearPageReferenced(page)
 #define try_to_unmap(page, refs) SWAP_FAIL
 
+static inline int page_mkclean(struct page *page)
+{
+   return 0;
+}
+
+
 #endif /* CONFIG_MMU */
 
 /*
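
The pattern in this patch, a do-nothing inline stub for configurations that lack the feature, can be sketched outside the kernel as follows. This is only an illustration: struct page here is a stand-in type, writeback_prepare() is a hypothetical caller, and CONFIG_MMU is assumed to be undefined when the snippet is compiled.

```c
/* Stand-in for the kernel's struct page; only here to make the sketch compile. */
struct page {
	unsigned long flags;
};

#ifdef CONFIG_MMU
/* With an MMU the real implementation lives elsewhere and walks the PTEs. */
int page_mkclean(struct page *page);
#else
/* Without an MMU there are no PTEs to clean, so report success. */
static inline int page_mkclean(struct page *page)
{
	(void)page;
	return 0;
}
#endif

/* Hypothetical caller: it can use page_mkclean() without any #ifdef of its own. */
int writeback_prepare(struct page *page)
{
	return page_mkclean(page);
}
```

Keeping the conditional in the header means callers stay free of configuration checks, and the compiler can drop the stub call entirely.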


[PATCH] NOMMU: Make lib/ioremap.c conditional

2006-09-05 Thread David Howells

Make lib/ioremap.c conditional on CONFIG_MMU.  It plays with PTEs, which
don't exist under NOMMU conditions.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 nommu-ioremap-2618rc5mm1.diff
 lib/Makefile |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/lib/Makefile linux-2.6.18-rc5-mm1-frv/lib/Makefile
--- ../kernels/linux-2.6.18-rc5-mm1/lib/Makefile    2006-09-04 18:03:32.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/lib/Makefile   2006-09-05 16:01:38.0 +0100
@@ -5,8 +5,9 @@
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 bust_spinlocks.o rbtree.o radix-tree.o dump_stack.o \
 idr.o div64.o int_sqrt.o bitmap.o extable.o prio_tree.o \
-sha1.o ioremap.o
+sha1.o
 
+lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
 
 lib-y  += kobject.o kref.o kobject_uevent.o klist.o


[PATCH] NOMMU: Move the fallback arch_vma_name() to a sensible place

2006-09-05 Thread David Howells

Move the fallback arch_vma_name() to a sensible place (kernel/signal.c).

Currently it's in fs/proc/task_mmu.c, a file that is only built when both
CONFIG_PROC_FS and CONFIG_MMU are enabled, but it is called unconditionally
from kernel/signal.c.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---
warthog>diffstat -p1 nommu-arch_vma_name-2618rc5mm1.diff
 fs/proc/task_mmu.c |5 -
 kernel/signal.c|5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff -urp ../kernels/linux-2.6.18-rc5-mm1/fs/proc/task_mmu.c linux-2.6.18-rc5-mm1-frv/fs/proc/task_mmu.c
--- ../kernels/linux-2.6.18-rc5-mm1/fs/proc/task_mmu.c  2006-09-04 18:02:43.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/fs/proc/task_mmu.c 2006-09-05 15:49:18.0 +0100
@@ -122,11 +122,6 @@ struct mem_size_stats
unsigned long private_dirty;
 };
 
-__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma)
-{
-   return NULL;
-}
-
 static int show_map_internal(struct seq_file *m, void *v, struct mem_size_stats *mss)
 {
struct proc_maps_private *priv = m->private;
diff -urp ../kernels/linux-2.6.18-rc5-mm1/kernel/signal.c linux-2.6.18-rc5-mm1-frv/kernel/signal.c
--- ../kernels/linux-2.6.18-rc5-mm1/kernel/signal.c 2006-09-04 18:03:32.0 +0100
+++ linux-2.6.18-rc5-mm1-frv/kernel/signal.c    2006-09-05 15:49:19.0 +0100
@@ -773,6 +773,11 @@ static void pad_len_spaces(int len)
printk("%*c", len, ' ');
 }
 
+__attribute__((weak)) const char *arch_vma_name(struct vm_area_struct *vma)
+{
+   return NULL;
+}
+
 static int print_vma(struct vm_area_struct *vma)
 {
struct mm_struct *mm = vma->vm_mm;
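
As background for readers who have not met weak symbols: a weak definition acts as a fallback that the linker replaces if any other object file supplies a normal (strong) definition of the same function. The sketch below shows the mechanism in userspace with GCC or Clang; struct vm_area is a stand-in type and vma_label() is a hypothetical caller, neither taken from the kernel.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct vm_area_struct. */
struct vm_area {
	unsigned long start;
};

/*
 * Weak fallback: used only when no other object file provides a strong
 * arch_vma_name().  An architecture that wants to name special mappings
 * would define its own non-weak version.
 */
__attribute__((weak)) const char *arch_vma_name(struct vm_area *vma)
{
	(void)vma;
	return NULL;
}

/* Hypothetical caller that can rely on the symbol always being present. */
const char *vma_label(struct vm_area *vma)
{
	const char *name = arch_vma_name(vma);

	return name ? name : "";
}
```

This is why the fallback has to live in a file that is always built: if it sits in a conditionally compiled file, configurations without that file have no definition at all and the unconditional caller fails to link.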


Re: [RFC] network namespaces

2006-09-05 Thread Daniel Lezcano

> For HPC, if you are interested in migration you need a separate IP per
> container.  If you can take your IP address with you, migration of
> networking state is simple.  If you can't take your IP address with
> you, a network container is nearly pointless from a migration
> perspective.

Eric, please, I know... I showed you a migration demo at OLS ;)

> Beyond that, from everything I have seen, layer 2 is just much cleaner
> than any layer 3 approach short of Serge's bind filtering.
>
> Beyond that, I have yet to see clean semantics for anything
> resembling your layer 2 / layer 3 hybrid approach.  If we can't have
> clear semantics it is by definition impossible to implement correctly,
> because no one understands what it is supposed to do.
>
> Note: a true layer 3 approach has no impact on TCP/UDP filtering
> because it filters at bind time, not at packet reception time.  Once
> you start inspecting packets I don't see what the gain is from not
> going all of the way to layer 2.


The bsdjail was just for information ...


- Daniel



Re: [RFC] network namespaces

2006-09-05 Thread Kirill Korotaev

> Yes, performance is probably one issue.
>
> My concern was about layer 2 versus layer 3 virtualization. I agree that
> layer 2 isolation/virtualization is the best fit for a system container.
> But there is another family of containers, called application containers,
> where it is not a whole system that runs inside the container but only the
> application. If you want to run an Oracle database inside a container,
> you can run it inside an application container without launching init
> and all the services.
>
> This family of containers is also used for HPC (high performance
> computing) and for distributed checkpoint/restart. The cluster runs
> hundreds of jobs, spawning them on different hosts inside an application
> container. Usually the jobs communicate with broadcast and multicast.
> Application containers do not care about having different MAC addresses
> and rely on a layer 3 approach.
>
> Are application containers comfortable with layer 2 virtualization? I
> don't think so, because several jobs running inside the same host
> communicate via broadcast/multicast among themselves and with other jobs
> running on different hosts. The IP consumption is a problem too: 1
> container == 2 IPs (one for the root namespace, one for the container),
> multiplied by the number of jobs. Furthermore, lots of jobs == lots
> of virtual devices.
>
> However, after a discussion with Kirill at OLS, it appears we can
> merge the layer 2 and layer 3 approaches if the level of network
> virtualization is tunable and we can choose layer 2 or layer 3 when
> doing the unshare. The determination of the namespace for the incoming
> traffic can be done with a specific iptables module as a first step.
> While looking at the network namespace patches, it appears that the
> TCP/UDP part is **very** similar to what is needed for a layer 3 approach.
>
> Any thoughts?

My humble opinion is that your approach doesn't intersect with this one,
so we can freely go with both *if needed*, and hear comments from the
networking gurus on what to improve and how.

So I suggest you at least send the patches, so we can discuss them.

Thanks,
Kirill


Re: e1000 Detected Tx Unit Hang

2006-09-05 Thread Jesse Brandeburg

On 9/3/06, Paul Aviles [EMAIL PROTECTED] wrote:
> Hey Jesse, thanks for your reply. Here is the stuff on /procs. The weird

No problem.

> part is that I have several other identical systems and only one is
> affected. Today I moved the hard drive to another similar system and I am
> not seeing the problem so I am wondering if is something maybe wrong with
> the card eeprom? Is there a way to check that?

I doubt it is an EEPROM problem.  You can dump the EEPROMs with
"ethtool -e eth0" on both machines and compare them.  It is odd that only
one system is having the problem.  Could it be that the hardware on
that box is having issues?  Are you sure the machines are running the
same BIOS version with the same settings?  Any overclocking?

> cat /proc/interrupts
>            CPU0       CPU1
>  16:      70540          0   IO-APIC-level  uhci_hcd:usb4, eth0

This shared interrupt could contribute to your problem. Were you able to
test without NAPI?

Jesse


Re: sky2 problem on powerpc

2006-09-05 Thread Stephen Hemminger
On Tue, 05 Sep 2006 13:47:52 +1000
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 
> > It may not need any swapping, it is hard to tell what the hardware
> > will do without experimentation.
>
> Yes... did you have a chance to test the vlan stuff on LE machines
> (x86) ? did it work with the BE swapping you were doing ? I've
> purposely removed in my patches the hardware side swapping of the
> descriptors, as I explained, thus making the hardware react the same on
> ppc and x86. Thus we need the exact same swapping macros on both
> platforms).


Last time I checked it worked.  Private cable simulating VLAN
from other Linux card.

> I know pretty much nothing about vlan so I'm not too much about trying
> to check that right now :)
>
> Also, there is still the hw checksum issue  I need to verify what's
> up there, it might be a swapping problem as well... or not. Can you send
> me your latest patch set so I can work from there ?
>
> Cheers,
> Ben
>


Re: [RFC] network namespaces

2006-09-05 Thread Herbert Poetzl
On Tue, Sep 05, 2006 at 08:45:39AM -0600, Eric W. Biederman wrote:
> Daniel Lezcano [EMAIL PROTECTED] writes:
>
> > > 2. People expressed concerns that complete separation of namespaces
> > >    may introduce an undesired overhead in certain usage scenarios.
> > >    The overhead comes from packets traversing input path, then output path,
> > >    then input path again in the destination namespace if root namespace
> > >    acts as a router.
> >
> > Yes, performance is probably one issue.
> >
> > My concern was for layer 2 / layer 3 virtualization. I agree
> > a layer 2 isolation/virtualization is the best for the system
> > container. But there is another family of containers called
> > application containers: it is not a system which is run inside a
> > container but only the application. If you want to run an Oracle
> > database inside a container, you can run it inside an application
> > container without launching init and all the services.
> >
> > This family of containers is also used for HPC (high performance
> > computing) and for distributed checkpoint/restart. The cluster
> > runs hundreds of jobs, spawning them on different hosts inside an
> > application container. Usually the jobs communicate with broadcast
> > and multicast. Application containers do not care about having
> > different MAC addresses and rely on a layer 3 approach.
> >
> > Are application containers comfortable with a layer 2 virtualization?
> > I don't think so, because several jobs running inside the same
> > host communicate via broadcast/multicast between them and with
> > other jobs running on different hosts. The IP consumption is a
> > problem too: 1 container == 2 IPs (one for the root namespace,
> > one for the container), multiplied by the number of jobs.
> > Furthermore, lots of jobs == lots of virtual devices.
> >
> > However, after a discussion with Kirill at the OLS, it appears we
> > can merge the layer 2 and 3 approaches if the level of network
> > virtualization is tunable and we can choose layer 2 or layer 3 when
> > doing the unshare. The determination of the namespace for the
> > incoming traffic can be done with a specific iptables module as
> > a first step. While looking at the network namespace patches, it
> > appears that the TCP/UDP part is **very** similar to what is needed
> > for a layer 3 approach.
> >
> > Any thoughts?
>
> For HPC if you are interested in migration you need a separate IP
> per container. If you can take your IP address with you, migration of
> networking state is simple. If you can't take your IP address with you
> a network container is nearly pointless from a migration perspective.
>
> Beyond that from everything I have seen layer 2 is just much cleaner
> than any layer 3 approach short of Serge's bind filtering.

well, the 'ip subset' approach Linux-VServer and
other Jail solutions use is very clean, it just does
not match your expectations of a virtual interface
(as there is none) and it does not cope well with
all kinds of per context 'requirements', which IMHO
do not really exist on the application layer (only
on the whole system layer)

> Beyond that I have yet to see a clean semantics for anything
> resembling your layer 2 layer 3 hybrid approach. If we can't have
> clear semantics it is by definition impossible to implement correctly
> because no one understands what it is supposed to do.

IMHO that would be quite simple, have a 'namespace'
for limiting port binds to a subset of the available
ips and another one which does complete network 
virtualization with all the whistles and bells, IMHO
most of them are orthogonal and can easily be combined

 - full network virtualization
 - lightweight ip subset 
 - both

> Note. A true layer 3 approach has no impact on TCP/UDP filtering
> because it filters at bind time not at packet reception time. Once you
> start inspecting packets I don't see what the gain is from not going
> all of the way to layer 2.

IMHO this requirement only arises from the full system
virtualization approach, just look at the other jail
solutions (solaris, bsd, ...) some of them do not even 
allow for more than a single ip but they work quite
well when used properly ...

best,
Herbert

> Eric


[PATCH] bridge: random extra bytes on STP TCN packet

2006-09-05 Thread Stephen Hemminger
We seem to send 3 extra bytes in a TCN, which will be whatever happens
to be on the stack. Thanks to [EMAIL PROTECTED] for seeing.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

diff -Naur a/net/bridge/br_stp_bpdu.c b/net/bridge/br_stp_bpdu.c
--- a/net/bridge/br_stp_bpdu.c  2006-09-03 23:40:08.0 +0530
+++ b/net/bridge/br_stp_bpdu.c  2006-09-03 23:40:33.0 +0530
@@ -121,7 +121,7 @@
 	buf[1] = 0;
 	buf[2] = 0;
 	buf[3] = BPDU_TYPE_TCN;
-   br_send_bpdu(p, buf, 7);
+   br_send_bpdu(p, buf, 4);
 }
 
 /*
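
For readers following along, a minimal userspace sketch of why 4 is the right length: a TCN BPDU consists only of the 2-byte protocol identifier, the 1-byte protocol version and the 1-byte BPDU type, so anything sent beyond byte 3 is whatever the buffer happened to hold. The helper name build_tcn_bpdu() and the *_SKETCH constants are made up for this illustration and are not the bridge code itself; 0x80 is the 802.1D TCN type value.

```c
#include <assert.h>
#include <stddef.h>

#define BPDU_TYPE_TCN_SKETCH 0x80	/* 802.1D topology change notification */
#define TCN_BPDU_LEN_SKETCH  4		/* proto id (2) + version (1) + type (1) */

/*
 * Fill buf with a TCN BPDU and return the number of valid bytes.
 * Returns 0 if the buffer is too small. Bytes past the returned
 * length are left untouched and must not be transmitted.
 */
static size_t build_tcn_bpdu(unsigned char *buf, size_t buflen)
{
	if (buflen < TCN_BPDU_LEN_SKETCH)
		return 0;
	buf[0] = 0;			/* protocol identifier, high byte */
	buf[1] = 0;			/* protocol identifier, low byte */
	buf[2] = 0;			/* protocol version */
	buf[3] = BPDU_TYPE_TCN_SKETCH;	/* BPDU type */
	return TCN_BPDU_LEN_SKETCH;
}
```

Passing the returned length to the send routine, rather than a hand-written constant, would avoid this class of mistake.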



Re: [PATCH 2.6.18] WE-21 support (core API)

2006-09-05 Thread Jean Tourrilhes
On Mon, Sep 04, 2006 at 10:35:09AM +0200, Johannes Berg wrote:
> Uh, please don't strip me from the CC list :)
>
> > WE-netlink is optional. And WE-ioctl could be made optional
> > (still on the todo list). You can also disable WE-event and WE-iwspy
> > for further footprint reduction.
>
> The real question is: Why does removing WE-event reduce footprint? I
> guess the answer is that there's a lot of non-generic code needed to
> pack/unpack all the data. Which is not really something you want.

Wrong answer.

> wireless.c has about 2.3k lines of code. But, for example airo.c
> contains another 15 lines of code just for the trivial *parameter
> checking* in airo_set_essid. This is duplicated all over. Did it never
> occur to you that things like
>	/* Check the size of the string */
>	if(dwrq->length > IW_ESSID_MAX_SIZE+1) {
>		return -E2BIG ;
>	}
> can be checked generically? Maybe you're actually checking this
> generically. But if I did it your way, I'd copy and paste this all
> over...

It is actually checked generically, that's the whole point of
the code in wireless.c. But, driver authors don't trust generic
checks.
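
To make the point concrete, here is a minimal sketch of a generic check of this kind: the core validates the payload length against a per-request descriptor once, so the driver handler does not need to repeat it. All names here (req_desc_sketch, dispatch_checked, set_essid_sketch) and the limit of 32 are assumptions for illustration, not the actual wireless.c structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ESSID_MAX_SIZE_SKETCH 32	/* assumed limit for the example */

/* Hypothetical per-request descriptor kept by the core. */
struct req_desc_sketch {
	size_t max_len;		/* largest payload the core will accept */
};

typedef int (*handler_fn)(const char *data, size_t len);

/* Core dispatcher: validate once, then call the driver handler. */
static int dispatch_checked(const struct req_desc_sketch *desc,
			    handler_fn handler,
			    const char *data, size_t len)
{
	if (len > desc->max_len)
		return -E2BIG;	/* rejected before the driver sees it */
	return handler(data, len);
}

/* Driver handler: relies on the core, no duplicate length check. */
static int set_essid_sketch(const char *data, size_t len)
{
	(void)data;
	return (int)len;	/* pretend we stored len bytes */
}
```

With a descriptor whose max_len is ESSID_MAX_SIZE_SKETCH + 1, a 3-byte request reaches the handler while a 64-byte one is turned away with -E2BIG by the dispatcher.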

> > It was designed this way on purpose, because you get low
> > footprint and very good scalability.
>
> Wtf does scalability have to do with it?

Footprint scalability.

> johannes

Jean


Re: [RFC] network namespaces

2006-09-05 Thread Eric W. Biederman

> This family of containers is also used for HPC (high performance computing)
> and for distributed checkpoint/restart. The cluster runs hundreds of jobs,
> spawning them on different hosts inside an application container. Usually
> the jobs communicate with broadcast and multicast.
> Application containers do not care about having different MAC addresses and
> rely on a layer 3 approach.

Ok I think to understand this we need some precise definitions.
In the normal case it is an error for a job to communicate with a different
job.

The basic advantage with a different MAC is that you can find out who the
intended recipient is sooner in the networking stack and you have truly
separate network devices, allowing for a cleaner implementation.

Changing the MAC after migration is likely to be fine.

> Are application containers comfortable with a layer 2 virtualization? I don't
> think so, because several jobs running inside the same host communicate via
> broadcast/multicast between them and with other jobs running on different
> hosts. The IP consumption is a problem too: 1 container == 2 IPs (one for the
> root namespace, one for the container), multiplied by the number of
> jobs. Furthermore, lots of jobs == lots of virtual devices.

First, if you hook your network namespaces up with ethernet bridging
you don't need any extra IPs.

Second, I don't see the conflict you perceive between application containers
and layer 2 containment.

The bottom line is that you need at least one loopback interface per non-trivial
network namespace.  Once you get that, having a virtual device is no big deal.  In
addition network devices don't consume more memory than a process.  So lots
of network devices should not be a problem.

Eric


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-05 Thread Rick Jones

Alexey Kuznetsov wrote:
> Hello!
>
> > Some people reported that this program runs in 9.997 sec when run on
> > FreeBSD.
>
> Try enclosed patch. I have no idea why 9.997 sec is so magic, but I
> get exactly this number on my notebook. :-)
>
> Alexey
>
> =
>
> This patch enables sending an ACK for every 2nd received segment.
> It does not affect either mss-sized connections (obviously) or connections
> controlled by Nagle (because there is only one small segment in flight).
>
> The idea is to record the fact that a small segment arrives
> on a connection, where one small segment has already been received
> and is still not ACKed. In this case an ACK is forced after tcp_recvmsg()
> drains the receive buffer.
>
> In other words, it is a soft each-2nd-segment ACK, which is enough
> to preserve the ACK clock even when ABC is enabled.
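
The idea described in the patch notes can be sketched as a tiny state machine. This is only one reading of the prose above, not the patch itself; the struct and function names are invented for the illustration, and "small" is assumed to mean shorter than the MSS.

```c
#include <assert.h>

/* Hypothetical per-connection state for the "soft every-2nd-segment ACK". */
struct ack_state_sketch {
	int small_unacked;	/* a small segment was received and not yet ACKed */
	int force_ack;		/* send an ACK once the reader drains the buffer */
};

/* Called for each arriving data segment. */
static void on_segment(struct ack_state_sketch *s, unsigned len, unsigned mss)
{
	if (len >= mss)
		return;			/* full-sized segments follow the usual rules */
	if (s->small_unacked)
		s->force_ack = 1;	/* second small segment: remember to ACK */
	else
		s->small_unacked = 1;	/* first small segment: just note it */
}

/* Called after the receive buffer is drained; returns 1 if an ACK goes out. */
static int on_recv_drained(struct ack_state_sketch *s)
{
	if (!s->force_ack)
		return 0;
	s->force_ack = 0;
	s->small_unacked = 0;
	return 1;
}
```

A single small segment in flight, as under Nagle, never sets force_ack, which matches the claim that Nagle-controlled connections are unaffected.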


Is this really necessary?  I thought that the problems with ABC were in
trying to apply byte-based heuristics from the RFC(s) to a
packet-oriented cwnd in the stack?


rick jones


Re: [RFC] network namespaces

2006-09-05 Thread Eric W. Biederman
Herbert Poetzl [EMAIL PROTECTED] writes:

> On Tue, Sep 05, 2006 at 08:45:39AM -0600, Eric W. Biederman wrote:
> > Daniel Lezcano [EMAIL PROTECTED] writes:
> >
> > For HPC if you are interested in migration you need a separate IP
> > per container. If you can take your IP address with you, migration of
> > networking state is simple. If you can't take your IP address with you
> > a network container is nearly pointless from a migration perspective.
> >
> > Beyond that from everything I have seen layer 2 is just much cleaner
> > than any layer 3 approach short of Serge's bind filtering.
>
> well, the 'ip subset' approach Linux-VServer and
> other Jail solutions use is very clean, it just does
> not match your expectations of a virtual interface
> (as there is none) and it does not cope well with
> all kinds of per context 'requirements', which IMHO
> do not really exist on the application layer (only
> on the whole system layer)

I probably expressed that wrong.  There are currently three
basic approaches under discussion.

Layer 3 (Basically bind filtering) nothing at the packet level.
   The approach taken by Serge's version of bsdjails and Vserver.

Layer 2.5 What Daniel proposed.

Layer 2.  (Trivially mapping each packet to a different interface)
   And then treating everything as multiple instances of the
   network stack.
   Roughly what OpenVZ and I have implemented.

You can get into some weird complications at layer 3 but because
it doesn't touch each packet the proof it is fast is trivial.
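
As a rough illustration of why bind-time filtering has no per-packet cost, here is a minimal sketch: the check runs once when a socket is bound, against the set of addresses the context is allowed to use, and nothing is added to the receive path. The structure and function names are hypothetical, and wildcard binds (INADDR_ANY) are deliberately left out of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-context list of IPv4 addresses that may be bound. */
struct ip_subset_sketch {
	const uint32_t *addrs;
	size_t count;
};

/*
 * Called once at bind() time. Returns 0 if the context may bind addr,
 * -EADDRNOTAVAIL otherwise. Packet reception is not involved at all.
 */
static int bind_check(const struct ip_subset_sketch *ctx, uint32_t addr)
{
	size_t i;

	for (i = 0; i < ctx->count; i++)
		if (ctx->addrs[i] == addr)
			return 0;
	return -EADDRNOTAVAIL;
}
```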

> > Beyond that I have yet to see a clean semantics for anything
> > resembling your layer 2 layer 3 hybrid approach. If we can't have
> > clear semantics it is by definition impossible to implement correctly
> > because no one understands what it is supposed to do.
>
> IMHO that would be quite simple, have a 'namespace'
> for limiting port binds to a subset of the available
> ips and another one which does complete network
> virtualization with all the whistles and bells, IMHO
> most of them are orthogonal and can easily be combined
>
>  - full network virtualization
>  - lightweight ip subset
>  - both

Quite possibly.  The LSM will stay for a while so we do have
a clean way to restrict port binds.

> > Note. A true layer 3 approach has no impact on TCP/UDP filtering
> > because it filters at bind time not at packet reception time. Once you
> > start inspecting packets I don't see what the gain is from not going
> > all of the way to layer 2.
>
> IMHO this requirement only arises from the full system
> virtualization approach, just look at the other jail
> solutions (solaris, bsd, ...) some of them do not even
> allow for more than a single ip but they work quite
> well when used properly ...


Yes they do.  Currently I am strongly opposed to Daniel's Layer 2.5 approach
as I see no redeeming value in it.  A good clean layer 3 approach I
avoid only because I think we can do better.

Eric


[patch] d80211: fix multiple device ap support

2006-09-05 Thread David Kimdon
Another fix to the interpretation of dev_alloc_name() return value.
dev_alloc_name() returns the number of the unit assigned or a negative
errno code.

Signed-off-by: David Kimdon [EMAIL PROTECTED]

Index: linux-2.6.16/net/d80211/ieee80211_iface.c
===
--- linux-2.6.16.orig/net/d80211/ieee80211_iface.c
+++ linux-2.6.16/net/d80211/ieee80211_iface.c
@@ -122,7 +122,7 @@ int ieee80211_if_add_mgmt(struct net_dev
 	if (!ndev)
 		return -ENOMEM;
 	ret = dev_alloc_name(ndev, "wmgmt%d");
-	if (ret)
+	if (ret < 0)
 		goto fail;
 
 	ndev->ieee80211_ptr = local;
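
The return convention described above (unit number on success, negative errno on failure) is why the old test breaks as soon as unit 0 is taken. A minimal userspace sketch, with fake_alloc_name() standing in for dev_alloc_name() purely for illustration:

```c
#include <assert.h>
#include <errno.h>

/*
 * Stand-in for dev_alloc_name(): returns the unit number (>= 0) that
 * was assigned, or a negative errno when no unit is free.
 * Hypothetical helper, not the kernel function.
 */
static int fake_alloc_name(int next_free_unit, int max_units)
{
	if (next_free_unit >= max_units)
		return -ENFILE;
	return next_free_unit;
}

/* Old check: treats any non-zero return as failure. */
static int failed_old(int ret)
{
	return ret != 0;
}

/* Fixed check: only negative values are errors. */
static int failed_new(int ret)
{
	return ret < 0;
}
```

With the old check the first device (unit 0) registers fine, but a second device that is assigned unit 1 is wrongly treated as an error, which would explain why only multiple-device setups were affected.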

--


Re: [2.6.19 PATCH 1/7] ehea: interface to network stack

2006-09-05 Thread Francois Romieu
Thomas Klein [EMAIL PROTECTED] :
[...]
> Somehow I don't get your point concerning the usage of 'k'. We need another
> iterator as the for loops using 'k' use 'i' as their terminating condition.

Something like the code below perhaps (with more local variables maybe):

static int ehea_reg_interrupts(struct net_device *dev)
{
	struct ehea_port *port = netdev_priv(dev);
	struct ehea_port_res *pr;
	int i, ret;

	for (i = 0; i < port->num_def_qps; i++) {
		pr = &port->port_res[i];
		snprintf(pr->int_recv_name, EHEA_IRQ_NAME_SIZE - 1
			 , "%s-recv%d", dev->name, i);
		ret = ibmebus_request_irq(NULL, pr->recv_eq->attr.ist1,
					  ehea_recv_irq_handler, SA_INTERRUPT,
					  pr->int_recv_name, pr);
		if (ret) {
			ehea_error("failed registering irq for ehea_recv_int:"
				   "port_res_nr:%d, ist=%X", i,
				   pr->recv_eq->attr.ist1);
			goto err_free_irq_recv_eq_0;
		}
		if (netif_msg_ifup(port))
			ehea_info("irq_handle 0x%X for funct ehea_recv_int %d "
				  "registered", pr->recv_eq->attr.ist1, i);
	}

	snprintf(port->int_aff_name, EHEA_IRQ_NAME_SIZE - 1,
		 "%s-aff", dev->name);
	ret = ibmebus_request_irq(NULL, port->qp_eq->attr.ist1,
				  ehea_qp_aff_irq_handler,
				  SA_INTERRUPT, port->int_aff_name, port);
	if (ret) {
		ehea_error("failed registering irq for qp_aff_irq_handler:"
			   "ist=%X", port->qp_eq->attr.ist1);
		goto err_free_irq_recv_eq_0;
	}
	if (netif_msg_ifup(port))
		ehea_info("irq_handle 0x%X for function qp_aff_irq_handler "
			  "registered", port->qp_eq->attr.ist1);

	for (i = 0; i < port->num_def_qps + port->num_add_tx_qps; i++) {
		pr = &port->port_res[i];
		snprintf(pr->int_send_name, EHEA_IRQ_NAME_SIZE - 1,
			 "%s-send%d", dev->name, i);
		ret = ibmebus_request_irq(NULL, pr->send_eq->attr.ist1,
					  ehea_send_irq_handler, SA_INTERRUPT,
					  pr->int_send_name, pr);
		if (ret) {
			ehea_error("failed registering irq for ehea_send"
				   "port_res_nr:%d, ist=%X", i,
				   pr->send_eq->attr.ist1);
			goto err_free_irq_send_eq_1;
		}
		if (netif_msg_ifup(port))
			ehea_info("irq_handle 0x%X for function ehea_send_int "
				  "%d registered", pr->send_eq->attr.ist1, i);
	}
out:
	return ret;

err_free_irq_send_eq_1:
	// Post-dec works with unsigned int too.
	while (i-- > 0) {
		u32 ist = port->port_res[i].send_eq->attr.ist1;
		ibmebus_free_irq(NULL, ist, &port->port_res[i]);
	}
	ibmebus_free_irq(NULL, port->qp_eq->attr.ist1, port);
	i = port->num_def_qps;
err_free_irq_recv_eq_0:
	while (i-- > 0) {
		u32 ist = port->port_res[i].recv_eq->attr.ist1;
		ibmebus_free_irq(NULL, ist, &port->port_res[i]);
	}
	goto out;
}
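
The unwinding idiom used above can be tried in isolation. The following is a small userspace sketch under made-up names (acquire, release, acquire_all); it only demonstrates that `while (i-- > 0)` releases exactly the entries acquired before the failing index, so no second iterator is needed.

```c
#include <assert.h>

#define NRES 4

static int acquired[NRES];	/* 1 while resource i is held */

/* Pretend to acquire resource i; fails when i == fail_at. */
static int acquire(int i, int fail_at)
{
	if (i == fail_at)
		return -1;
	acquired[i] = 1;
	return 0;
}

static void release(int i)
{
	acquired[i] = 0;
}

/* Acquire NRES resources; on failure release exactly those already held. */
static int acquire_all(int fail_at)
{
	int i, ret = 0;

	for (i = 0; i < NRES; i++) {
		ret = acquire(i, fail_at);
		if (ret)
			goto err;
	}
	return 0;
err:
	while (i-- > 0)		/* visits i-1, ..., 0 and skips the failed index */
		release(i);
	return ret;
}

/* Number of resources currently held. */
static int held_count(void)
{
	int i, n = 0;

	for (i = 0; i < NRES; i++)
		n += acquired[i];
	return n;
}
```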


[PATCH 1/2] tcp-lp: bug fix for oops in 2.6.18-rc6

2006-09-05 Thread Wong Edison

Sorry, the patch submitted yesterday still contained a small bug.
This version has already been tested for hours with BT connections. The
oops is now difficult to reproduce.

Signed-off-by: Wong Hoi Sing Edison [EMAIL PROTECTED]

---

diff -urpN linux-2.6.18-rc6/net/ipv4/tcp_lp.c linux/net/ipv4/tcp_lp.c
--- linux-2.6.18-rc6/net/ipv4/tcp_lp.c  2006-09-06 04:12:00.0 +0800
+++ linux/net/ipv4/tcp_lp.c 2006-09-06 04:24:07.0 +0800
@@ -3,13 +3,8 @@
 *
 * TCP Low Priority is a distributed algorithm whose goal is to utilize only
 *   the excess network bandwidth as compared to the ``fair share`` of
- *   bandwidth as targeted by TCP. Available from:
- * http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
+ *   bandwidth as targeted by TCP.
 *
- * Original Author:
- *   Aleksandar Kuzmanovic [EMAIL PROTECTED]
- *
- * See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
 * As of 2.6.13, Linux supports pluggable congestion control algorithms.
 * Due to the limitation of the API, we take the following changes from
 * the original TCP-LP implementation:
@@ -24,11 +19,20 @@
 *   o OWD is handled in relative format, where local time stamp will in
 * tcp_time_stamp format.
 *
- * Port from 2.4.19 to 2.6.16 as module by:
- *   Wong Hoi Sing Edison [EMAIL PROTECTED]
- *   Hung Hing Lun [EMAIL PROTECTED]
+ * Original Author:
+ *   Aleksandar Kuzmanovic [EMAIL PROTECTED]
+ * Available from:
+ *   http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
+ * Original implementation for 2.4.19:
+ *   http://www-ece.rice.edu/networks/TCP-LP/
+ *
+ * 2.6.x module Authors:
+ *   Wong Hoi Sing, Edison [EMAIL PROTECTED]
+ *   Hung Hing Lun, Mike [EMAIL PROTECTED]
+ * SourceForge project page:
+ *   http://tcp-lp-mod.sourceforge.net/
 *
- * Version: $Id: tcp_lp.c,v 1.22 2006-05-02 18:18:19 hswong3i Exp $
+ * Version: $Id: tcp_lp.c,v 1.24 2006/09/05 20:22:53 hswong3i Exp $
 */

 #include <linux/config.h>
@@ -153,16 +157,19 @@ static u32 tcp_lp_remote_hz_estimator(st
 	if (m < 0)
 		m = -m;
 
-	if (rhz != 0) {
+	if (rhz > 0) {
 		m -= rhz >> 6;	/* m is now error in remote HZ est */
 		rhz += m;	/* 63/64 old + 1/64 new */
 	} else
 		rhz = m << 6;
 
+ out:
 	/* record time for successful remote HZ calc */
-	lp->flag |= LP_VALID_RHZ;
+	if (rhz > 0)
+		lp->flag |= LP_VALID_RHZ;
+	else
+		lp->flag &= ~LP_VALID_RHZ;
 
- out:
 	/* record reference time stamp */
 	lp->remote_ref_time = tp->rx_opt.rcv_tsval;
 	lp->local_ref_time = tp->rx_opt.rcv_tsecr;
@@ -333,6 +340,6 @@ static void __exit tcp_lp_unregister(voi
 module_init(tcp_lp_register);
 module_exit(tcp_lp_unregister);
 
-MODULE_AUTHOR("Wong Hoi Sing Edison, Hung Hing Lun");
+MODULE_AUTHOR("Wong Hoi Sing Edison, Hung Hing Lun Mike");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("TCP Low Priority");
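
The "63/64 old + 1/64 new" comment in the hunk above refers to a fixed-point moving average in which the stored value is the average scaled by 64. A minimal standalone sketch of that arithmetic follows; the function name is made up, 64-bit signed integers are used for simplicity, and the sample is assumed to be non-negative (the patch takes the absolute value of m first).

```c
#include <assert.h>
#include <stdint.h>

/*
 * avg_scaled holds 64 times the running average. Each update moves the
 * average 1/64 of the way towards the new sample. A non-positive
 * avg_scaled means "no estimate yet" and is seeded from the sample.
 */
static int64_t ewma64_update(int64_t avg_scaled, int64_t sample)
{
	if (avg_scaled > 0) {
		sample -= avg_scaled >> 6;	/* error against current average */
		avg_scaled += sample;		/* 63/64 old + 1/64 new */
	} else {
		avg_scaled = sample << 6;	/* first sample seeds the average */
	}
	return avg_scaled;
}
```

Reading the estimate back is `avg_scaled >> 6`. An estimate that is not positive after the update is the case the patch stops flagging as valid.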


[PATCH 2/2] tcp-lp: update information to MAINTAINERS

2006-09-05 Thread Wong Edison

Signed-off-by: Wong Hoi Sing Edison [EMAIL PROTECTED]

---

diff -urpN linux-2.6.18-rc6/MAINTAINERS linux/MAINTAINERS
--- linux-2.6.18-rc6/MAINTAINERS        2006-09-06 04:12:11.0 +0800
+++ linux/MAINTAINERS   2006-09-06 04:19:08.0 +0800
@@ -2818,6 +2818,14 @@ M:   [EMAIL PROTECTED]
L:  netdev@vger.kernel.org
S:  Maintained

+TCP LOW PRIORITY MODULE
+P: Wong Hoi Sing, Edison
+M: [EMAIL PROTECTED]
+P: Hung Hing Lun, Mike
+M: [EMAIL PROTECTED]
+W: http://tcp-lp-mod.sourceforge.net/
+S: Maintained
+
TI OMAP RANDOM NUMBER GENERATOR SUPPORT
P:  Deepak Saxena
M:  [EMAIL PROTECTED]




[PATCH 1/2] tcp-lp: bug fix for oops in 2.6.18-rc6

2006-09-05 Thread Wong Edison

  Folks: we do watch over you and your postings, if you get rejects
  do send UNABRIDGED messages to  [EMAIL PROTECTED], however
  we do look into the FREEZER (as the reject message does refer to)
  several times a day to find possible mis-rejects.
Yours,   [EMAIL PROTECTED]


Signed-off-by: Wong Hoi Sing Edison [EMAIL PROTECTED]

---

diff -urpN linux-2.6.18-rc6/MAINTAINERS linux/MAINTAINERS
--- linux-2.6.18-rc6/MAINTAINERS2006-09-06 04:12:11.0 +0800
+++ linux/MAINTAINERS   2006-09-06 04:19:08.0 +0800
@@ -2818,6 +2818,14 @@ M:   [EMAIL PROTECTED]
L: netdev@vger.kernel.org
S: Maintained

+TCP LOW PRIORITY MODULE
+P: Wong Hoi Sing, Edison
+M: [EMAIL PROTECTED]
+P: Hung Hing Lun, Mike
+M: [EMAIL PROTECTED]
+W: http://tcp-lp-mod.sourceforge.net/
+S: Maintained
+
TI OMAP RANDOM NUMBER GENERATOR SUPPORT
P: Deepak Saxena
M: [EMAIL PROTECTED]


Re: sky2 problem on powerpc

2006-09-05 Thread Benjamin Herrenschmidt
On Mon, 2006-09-04 at 21:15 -0700, Stephen Hemminger wrote:
> On Tue, 05 Sep 2006 13:47:52 +1000
> Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
>
> > > It may not need any swapping, it is hard to tell what the hardware
> > > will do without experimentation.
> >
> > Yes... did you have a chance to test the vlan stuff on LE machines
> > (x86) ? did it work with the BE swapping you were doing ? I've
> > purposely removed in my patches the hardware side swapping of the
> > descriptors, as I explained, thus making the hardware react the same on
> > ppc and x86. Thus we need the exact same swapping macros on both
> > platforms).
>
> Last time I checked it worked.  Private cable simulating VLAN
> from other Linux card.

Ok, so we should probably switch back the vlan bits to BE swapping
macros...

However, we then have an inconsistency with that bit:

#ifdef SKY2_VLAN_TAG_USED
		case OP_RXVLAN:
			sky2->rx_tag = length;
			break;

		case OP_RXCHKSVLAN:
			sky2->rx_tag = length;
			/* fall through */
#endif

in sky2_status_intr()

Where we read the length field directly without swapping (on the non
patched driver; on the patched driver, length will have gone through an
LE swap). That is, if you take the standpoint of a LE machine, you will
read that value as a little endian value while elsewhere, we manipulate
sky2->rx_tag as a BE value... (this is even without my patch)

Ben.




Re: sky2: hw checksum failures

2006-09-05 Thread Benjamin Herrenschmidt

> > Which means that if it worked on x86 with le16_to_cpu, it should work on
> > powerpc... The main difference here however is that you called
> > le16_to_cpu (which is basically a nop) on a 32 bits field, while I
> > called le32_to_cpu() on it. But both should lead to the same ... (x86
> > will do a swapped 16 bits load of the 2 first bytes, while ppc will do a
> > load of 4 bytes and swap that, thus ending up with the first 2 bytes
> > swapped in the low order of the result). I'll dump the values and have a
> > look to be sure. Another possibility would be a problem with the bits
> > telling the chip where to calculate the checksum.
>
> Hardware only computes 16 bit checksum.

Oh I know that, but calling 16 bits swapping macros on a 32 bits field
is a bit dodgy... might work in this case, I'll verify, but you may end
up with the wrong half of the 32 bits word being used :)
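
The "which half do you get" question can be checked with a small portable sketch that decodes little-endian values from raw bytes, independent of host endianness. The helper names get_le16() and get_le32() are made up for this illustration and are not the kernel macros.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 16-bit little-endian value from two raw descriptor bytes. */
static uint16_t get_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode a 32-bit little-endian value from four raw descriptor bytes. */
static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

For the bytes 34 12 78 56, a 16-bit read of the first two bytes and the low half of a 32-bit read both give 0x1234, while the other 16-bit value (0x5678) ends up in the high half. Truncating after a 32-bit swap therefore only works when the wanted value sits in the first two bytes.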

Ben.




Re: sky2: hw checksum failures

2006-09-05 Thread Stephen Hemminger
On Wed, 06 Sep 2006 07:12:43 +1000
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 
> > > Which means that if it worked on x86 with le16_to_cpu, it should work on
> > > powerpc... The main difference here however is that you called
> > > le16_to_cpu (which is basically a nop) on a 32 bits field, while I
> > > called le32_to_cpu() on it. But both should lead to the same ... (x86
> > > will do a swapped 16 bits load of the 2 first bytes, while ppc will do a
> > > load of 4 bytes and swap that, thus ending up with the first 2 bytes
> > > swapped in the low order of the result). I'll dump the values and have a
> > > look to be sure. Another possibility would be a problem with the bits
> > > telling the chip where to calculate the checksum.
> >
> > Hardware only computes 16 bit checksum.
>
> Oh I know that, but calling 16 bits swapping macros on a 32 bits field
> is a bit dodgy... might work in this case, I'll verify, but you may end
> up with the wrong half of the 32 bits word being used :)
>
> Ben.
 
 

Agreed. Actually the checksum value is same hi/lo because there are
two checksum units and we ask for the same offset on both.

-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: sky2 problem on powerpc

2006-09-05 Thread Stephen Hemminger
This is the reduced version of your patch, plus I got rid of the union
in tx_le, it is a nuisance.
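
As a rough illustration of what replacing the union with a single 32-bit field means for the checksum descriptor in the patch below: the two 16-bit values (where summing starts, where the result is written) are packed into one host-order word, which is then byte-swapped once. The function name pack_tcpsum() is invented for this sketch, and the 16-bit mask on the low half is an addition of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the checksum start offset (high 16 bits) and the checksum write
 * position (low 16 bits) into one 32-bit value in host byte order.
 * csum_off is the distance from the start offset to the checksum field.
 */
static uint32_t pack_tcpsum(unsigned start, unsigned csum_off)
{
	uint32_t v = (uint32_t)start << 16;	/* sum start */

	v |= (start + csum_off) & 0xffff;	/* sum write */
	return v;
}
```

For a transport header at offset 34 with the checksum field 16 bytes into it, the packed value has 34 in the high half and 50 in the low half. Comparing one packed word against the cached value also replaces the two separate comparisons the old code needed.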


--- sky2.orig/drivers/net/sky2.c2006-09-05 13:39:34.0 -0700
+++ sky2/drivers/net/sky2.c 2006-09-05 13:57:44.0 -0700
@@ -809,7 +809,7 @@
struct sky2_rx_le *le;
 
le = sky2_next_rx(sky2);
-   le->addr = (ETH_HLEN << 16) | ETH_HLEN;
+   le->addr = cpu_to_le32((ETH_HLEN << 16) | ETH_HLEN);
le->ctrl = 0;
le->opcode = OP_TCPSTART | HW_OWNER;
 
@@ -1227,7 +1227,7 @@
/* Send high bits if changed or crosses boundary */
if (addr64 != sky2->tx_addr64 || high32(mapping + len) != sky2->tx_addr64) {
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32(addr64);
+   le->addr = cpu_to_le32(addr64);
le->ctrl = 0;
le->opcode = OP_ADDR64 | HW_OWNER;
sky2->tx_addr64 = high32(mapping + len);
@@ -1242,8 +1242,7 @@
 
if (mss != sky2->tx_last_mss) {
le = get_tx_le(sky2);
-   le->tx.tso.size = cpu_to_le16(mss);
-   le->tx.tso.rsvd = 0;
+   le->addr = cpu_to_le32(mss);
le->opcode = OP_LRGLEN | HW_OWNER;
le->ctrl = 0;
sky2->tx_last_mss = mss;
@@ -1256,7 +1255,7 @@
if (sky2->vlgrp && vlan_tx_tag_present(skb)) {
if (!le) {
le = get_tx_le(sky2);
-   le->tx.addr = 0;
+   le->addr = 0;
le->opcode = OP_VLAN|HW_OWNER;
le->ctrl = 0;
} else
@@ -1268,20 +1267,21 @@
 
/* Handle TCP checksum offload */
if (skb->ip_summed == CHECKSUM_HW) {
-   u16 hdr = skb->h.raw - skb->data;
-   u16 offset = hdr + skb->csum;
+   unsigned offset = skb->h.raw - skb->data;
+   u32 tcpsum;
+
+   tcpsum = offset << 16;  /* sum start */
+   tcpsum |= offset + skb->csum;   /* sum write */
 
ctrl = CALSUM | WR_SUM | INIT_SUM | LOCK_SUM;
if (skb->nh.iph->protocol == IPPROTO_UDP)
ctrl |= UDPTCP;
 
-   if (hdr != sky2->tx_csum_start || offset != sky2->tx_csum_offset) {
-   sky2->tx_csum_start = hdr;
-   sky2->tx_csum_offset = offset;
+   if (tcpsum != sky2->tx_tcpsum) {
+   sky2->tx_tcpsum = tcpsum;
 
le = get_tx_le(sky2);
-   le->tx.csum.start = cpu_to_le16(hdr);
-   le->tx.csum.offset = cpu_to_le16(offset);
+   le->addr = cpu_to_le32(tcpsum);
le->length = 0; /* initial checksum value */
le->ctrl = 1;   /* one packet */
le->opcode = OP_TCPLISW | HW_OWNER;
@@ -1289,7 +1289,7 @@
}
 
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32((u32) mapping);
+   le->addr = cpu_to_le32((u32) mapping);
le->length = cpu_to_le16(len);
le->ctrl = ctrl;
le->opcode = mss ? (OP_LARGESEND | HW_OWNER) : (OP_PACKET | HW_OWNER);
@@ -1307,14 +1307,14 @@
addr64 = high32(mapping);
if (addr64 != sky2->tx_addr64) {
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32(addr64);
+   le->addr = cpu_to_le32(addr64);
le->ctrl = 0;
le->opcode = OP_ADDR64 | HW_OWNER;
sky2->tx_addr64 = addr64;
}
 
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32((u32) mapping);
+   le->addr = cpu_to_le32((u32) mapping);
le->length = cpu_to_le16(frag->size);
le->ctrl = ctrl;
le->opcode = OP_BUFFER | HW_OWNER;
@@ -1919,8 +1919,8 @@
dev = hw->dev[le->link];
 
sky2 = netdev_priv(dev);
-   length = le->length;
-   status = le->status;
+   length = le16_to_cpu(le->length);
+   status = le32_to_cpu(le->status);
 
switch (le->opcode & ~HW_OWNER) {
case OP_RXSTAT:
@@ -1964,7 +1964,7 @@
case OP_RXCHKS:
skb = sky2->rx_ring[sky2->rx_next].skb;
skb->ip_summed = CHECKSUM_HW;
-   skb->csum = le16_to_cpu(status);
+   skb->csum = status & 0xffff;
break;
 
case OP_TXINDEXLE:
@@ -3266,12 +3266,13 @@
hw->pm_cap = pm_cap;
 
 #ifdef __BIG_ENDIAN
-   /* byte swap descriptors in hardware */
+   /* The sk98lin vendor driver uses hardware byte swapping but
+* this driver uses software swapping.
+*/
{
u32 reg;
-
reg = 

Re: sky2: hw checksum failures

2006-09-05 Thread Benjamin Herrenschmidt

 Agreed. Actually the checksum value is same hi/lo because there are
 two checksum units and we ask for the same offset on both.

Ok, that explains the (HLEN << 16) | HLEN thing when configuring it...
At this point, best is I dig into the actual values and see what's up.
I'll let you know (I don't have the HW at hand right now)

Ben.




Re: sky2 problem on powerpc

2006-09-05 Thread Benjamin Herrenschmidt
On Tue, 2006-09-05 at 14:36 -0700, Stephen Hemminger wrote:
 This is the reduced version of your patch, plus I got rid of the union
 in tx_le, it is a nuisance.

Thanks. I'll give it a go later today. The remaining nit is the
inconsistent swapping of the vlan tag, which is manipulated as BE at times
and LE at others (the latter happens in status_intr).

Ben.


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-05 Thread Alexey Kuznetsov
Hello!

 Is this really necessary?

No, of course. We lived for ages without this, would live for another age.



   I thought that the problems with ABC were in 
 trying to apply byte-based heuristics from the RFC(s) to a 
 packet-oriented cwnd in the stack?

It was just the last drop.

Even with disabled ABC, that test shows some gaps in latency summed
up to ~300 msec. Almost invisible, but not good.

Too aggressive delack has many other issues. Even without ABC
we have quadratically suppressed cwnd on TCP_NODELAY connections
compared to BSD: at the sender side we suppress it by counting
cwnd in packets, at the receiver side by ACKing by byte counter.

Each time another victim sees artificial latencies introduced
by aggressive delayed acks, even though he requested TCP_NODELAY,
our best argument is "Stupid, you do it all wrong, how could you get
decent performance?" :-)

Probably we are standing up for a feature which really is not worth
standing up for and causes nothing but permanent pain in the ass.

Alexey


[RFT 5/5] sky2: fix fiber support

2006-09-05 Thread shemminger
Fix support for fiber based devices.  Needed to keep track of PMD type to
add workaround in setup. Add support for gigabit half duplex fiber.
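
The sky2.h half of this patch is cut off in this archive, so the helper that replaces hw->copper is not shown. The sketch below is a guess at its shape, derived from the removed line (copper unless the PMD type is 'L' or 'S') and from the new code testing pmd_type == 'P' inside the non-copper branch. Treating 'P' as fiber, and the names hw_sketch and is_copper_sketch, are assumptions for illustration only.

```c
#include <stdint.h>

/* Minimal stand-in for the part of struct sky2_hw that matters here. */
struct hw_sketch {
	uint8_t pmd_type;   /* raw PMD type byte read from the board */
};

/* Assumed shape of the copper test: anything that is not one of the
 * fiber PMD types counts as copper. 'L' and 'S' come from the removed
 * line in the patch; 'P' is inferred from the SFP branch. */
static int is_copper_sketch(const struct hw_sketch *hw)
{
	return !(hw->pmd_type == 'L' || hw->pmd_type == 'S' ||
		 hw->pmd_type == 'P');
}
```

Keeping the raw type byte and deriving "copper" on demand is what lets the PHY setup distinguish the 'P' case for the SIGDET polarity workaround, which a single boolean could not do.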

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 drivers/net/sky2.c |   81 -
 drivers/net/sky2.h |   15 +
 2 files changed, 63 insertions(+), 33 deletions(-)

--- sky2.orig/drivers/net/sky2.c2006-09-05 13:57:44.0 -0700
+++ sky2/drivers/net/sky2.c 2006-09-05 14:00:04.0 -0700
@@ -308,7 +308,7 @@
}
 
ctrl = gm_phy_read(hw, port, PHY_MARV_PHY_CTRL);
-   if (hw->copper) {
+   if (sky2_is_copper(hw)) {
if (hw->chip_id == CHIP_ID_YUKON_FE) {
/* enable automatic crossover */
ctrl |= PHY_M_PC_MDI_XMODE(PHY_M_PC_ENA_AUTO) >> 1;
@@ -325,25 +325,37 @@
ctrl |= PHY_M_PC_DSC(2) | PHY_M_PC_DOWN_S_ENA;
}
}
-   gm_phy_write(hw, port, PHY_MARV_PHY_CTRL, ctrl);
} else {
/* workaround for deviation #4.88 (CRC errors) */
/* disable Automatic Crossover */
 
ctrl &= ~PHY_M_PC_MDIX_MSK;
-   gm_phy_write(hw, port, PHY_MARV_PHY_CTRL, ctrl);
+   }
 
-   if (hw->chip_id == CHIP_ID_YUKON_XL) {
-   /* Fiber: select 1000BASE-X only mode MAC Specific Ctrl Reg. */
-   gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 2);
-   ctrl = gm_phy_read(hw, port, PHY_MARV_PHY_CTRL);
-   ctrl &= ~PHY_M_MAC_MD_MSK;
-   ctrl |= PHY_M_MAC_MODE_SEL(PHY_M_MAC_MD_1000BX);
-   gm_phy_write(hw, port, PHY_MARV_PHY_CTRL, ctrl);
+   gm_phy_write(hw, port, PHY_MARV_PHY_CTRL, ctrl);
+
+   /* special setup for PHY 88E1112 Fiber */
+   if (hw->chip_id == CHIP_ID_YUKON_XL && !sky2_is_copper(hw)) {
+   pg = gm_phy_read(hw, port, PHY_MARV_EXT_ADR);
 
+   /* Fiber: select 1000BASE-X only mode MAC Specific Ctrl Reg. */
+   gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 2);
+   ctrl = gm_phy_read(hw, port, PHY_MARV_PHY_CTRL);
+   ctrl &= ~PHY_M_MAC_MD_MSK;
+   ctrl |= PHY_M_MAC_MODE_SEL(PHY_M_MAC_MD_1000BX);
+   gm_phy_write(hw, port, PHY_MARV_PHY_CTRL, ctrl);
+
+   if (hw->pmd_type  == 'P') {
/* select page 1 to access Fiber registers */
gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 1);
+
+   /* for SFP-module set SIGDET polarity to low */
+   ctrl = gm_phy_read(hw, port, PHY_MARV_PHY_CTRL);
+   ctrl |= PHY_M_FIB_SIGD_POL;
+   gm_phy_write(hw, port, PHY_MARV_CTRL, ctrl);
}
+
+   gm_phy_write(hw, port, PHY_MARV_EXT_ADR, pg);
}
 
ctrl = gm_phy_read(hw, port, PHY_MARV_CTRL);
@@ -361,7 +373,7 @@
reg = 0;
 
if (sky2->autoneg == AUTONEG_ENABLE) {
-   if (hw->copper) {
+   if (sky2_is_copper(hw)) {
if (sky2->advertising & ADVERTISED_1000baseT_Full)
ct1000 |= PHY_M_1000C_AFD;
if (sky2->advertising & ADVERTISED_1000baseT_Half)
@@ -374,8 +386,12 @@
adv |= PHY_M_AN_10_FD;
if (sky2->advertising & ADVERTISED_10baseT_Half)
adv |= PHY_M_AN_10_HD;
-   } else  /* special defines for FIBER (88E1011S only) */
-   adv |= PHY_M_AN_1000X_AHD | PHY_M_AN_1000X_AFD;
+   } else {/* special defines for FIBER (88E1040S only) */
+   if (sky2->advertising & ADVERTISED_1000baseT_Full)
+   adv |= PHY_M_AN_1000X_AFD;
+   if (sky2->advertising & ADVERTISED_1000baseT_Half)
+   adv |= PHY_M_AN_1000X_AHD;
+   }
 
/* Set Flow-control capabilities */
if (sky2->tx_pause && sky2->rx_pause)
@@ -1494,7 +1510,7 @@
 
 static u16 sky2_phy_speed(const struct sky2_hw *hw, u16 aux)
 {
-   if (!hw->copper)
+   if (!sky2_is_copper(hw))
return SPEED_1000;
 
if (hw->chip_id == CHIP_ID_YUKON_FE)
@@ -2266,7 +2282,7 @@
 static int sky2_reset(struct sky2_hw *hw)
 {
u16 status;
-   u8 t8, pmd_type;
+   u8 t8;
int i;
 
sky2_write8(hw, B0_CTST, CS_RST_CLR);
@@ -2312,9 +2328,7 @@
sky2_pci_write32(hw, PEX_UNC_ERR_STAT, 0xffffffffUL);
 
 
-   pmd_type = sky2_read8(hw, B2_PMD_TYP);
-   hw->copper = !(pmd_type == 'L' || pmd_type == 'S');
-
+   hw->pmd_type = sky2_read8(hw, B2_PMD_TYP);
hw->ports = 1;
t8 = sky2_read8(hw, B2_Y2_HW_RES);
if ((t8 & CFG_DUAL_MAC_MSK) == CFG_DUAL_MAC_MSK) {
@@ 

[RFT 3/5] sky2: handle forced settings

2006-09-05 Thread shemminger
Handle cases where pause parameters are forced. Need to program
the GMAC before starting the PHY, not after.
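
The logic the patch moves into sky2_phy_init() can be read as "build the whole general-purpose control value from the forced settings, then write it once before the PHY is reset". The sketch below shows only that value-building step. The X_* bit positions are placeholders chosen for this example, not the real GM_GPCR_* values from sky2.h, and forced_gpcr() is a hypothetical name.

```c
#include <stdint.h>

/* Placeholder bit positions, NOT the real GM_GPCR_* register layout. */
enum {
	X_AU_ALL_DIS = 1 << 0,  /* disable auto update of duplex/fc/speed */
	X_SPEED_1000 = 1 << 1,
	X_SPEED_100  = 1 << 2,
	X_DUP_FULL   = 1 << 3,
	X_FC_RX_DIS  = 1 << 4,
	X_FC_TX_DIS  = 1 << 5
};

/* Compose the control value for forced (non-autonegotiated) settings.
 * speed is 10, 100 or 1000; the other arguments are booleans. */
static uint16_t forced_gpcr(int speed, int full_duplex,
			    int rx_pause, int tx_pause)
{
	uint16_t reg = X_AU_ALL_DIS;

	switch (speed) {
	case 1000:
		reg |= X_SPEED_1000;
		break;
	case 100:
		reg |= X_SPEED_100;
		break;
	}
	if (full_duplex)
		reg |= X_DUP_FULL;
	if (!rx_pause)
		reg |= X_FC_RX_DIS;
	if (!tx_pause)
		reg |= X_FC_TX_DIS;
	return reg;
}
```

Starting from zero and or-ing bits in avoids the read-modify-write sequence of the removed code, where stale speed bits had to be cleared explicitly for each case.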

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2.orig/drivers/net/sky2.c2006-09-05 12:10:18.0 -0700
+++ sky2/drivers/net/sky2.c 2006-09-05 13:32:59.0 -0700
@@ -289,7 +289,7 @@
 static void sky2_phy_init(struct sky2_hw *hw, unsigned port)
 {
struct sky2_port *sky2 = netdev_priv(hw->dev[port]);
-   u16 ctrl, ct1000, adv, pg, ledctrl, ledover;
+   u16 ctrl, ct1000, adv, pg, ledctrl, ledover, reg;
 
if (sky2->autoneg == AUTONEG_ENABLE &&
!(hw->chip_id == CHIP_ID_YUKON_XL || hw->chip_id == CHIP_ID_YUKON_EC_U)) {
@@ -358,6 +358,7 @@
ctrl = 0;
ct1000 = 0;
adv = PHY_AN_CSMA;
+   reg = 0;
 
if (sky2->autoneg == AUTONEG_ENABLE) {
if (hw->copper) {
@@ -390,21 +391,44 @@
/* forced speed/duplex settings */
ct1000 = PHY_M_1000C_MSE;
 
-   if (sky2->duplex == DUPLEX_FULL)
-   ctrl |= PHY_CT_DUP_MD;
+   /* Disable auto update for duplex flow control and speed */
+   reg |= GM_GPCR_AU_ALL_DIS;
 
switch (sky2->speed) {
case SPEED_1000:
ctrl |= PHY_CT_SP1000;
+   reg |= GM_GPCR_SPEED_1000;
break;
case SPEED_100:
ctrl |= PHY_CT_SP100;
+   reg |= GM_GPCR_SPEED_100;
break;
}
 
+   if (sky2->duplex == DUPLEX_FULL) {
+   reg |= GM_GPCR_DUP_FULL;
+   ctrl |= PHY_CT_DUP_MD;
+   }
+
+   if (!sky2->rx_pause)
+   reg |= GM_GPCR_FC_RX_DIS;
+
+   if (!sky2->tx_pause)
+   reg |= GM_GPCR_FC_TX_DIS;
+
+   /* Forward pause packets to GMAC? */
+   if (!sky2->tx_pause ||
+   (hw->chip_id != CHIP_ID_YUKON_EC_U &&
+sky2->duplex == DUPLEX_HALF && sky2->speed != SPEED_1000))
+   sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
+   else
+   sky2_write8(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_ON);
+
ctrl |= PHY_CT_RESET;
}
 
+   gma_write16(hw, port, GM_GP_CTRL, reg);
+
if (hw->chip_id != CHIP_ID_YUKON_FE)
gm_phy_write(hw, port, PHY_MARV_1000T_CTRL, ct1000);
 
@@ -508,6 +532,7 @@
gm_phy_write(hw, port, PHY_MARV_LED_OVER, ledover);
 
}
+
/* Enable phy interrupt on auto-negotiation complete (or link up) */
if (sky2->autoneg == AUTONEG_ENABLE)
gm_phy_write(hw, port, PHY_MARV_INT_MASK, PHY_M_IS_AN_COMPL);
@@ -570,49 +595,11 @@
 gm_phy_read(hw, 1, PHY_MARV_INT_MASK) != 0);
}
 
-   if (sky2->autoneg == AUTONEG_DISABLE) {
-   reg = gma_read16(hw, port, GM_GP_CTRL);
-   reg |= GM_GPCR_AU_ALL_DIS;
-   gma_write16(hw, port, GM_GP_CTRL, reg);
-   gma_read16(hw, port, GM_GP_CTRL);
-
-   switch (sky2->speed) {
-   case SPEED_1000:
-   reg &= ~GM_GPCR_SPEED_100;
-   reg |= GM_GPCR_SPEED_1000;
-   break;
-   case SPEED_100:
-   reg &= ~GM_GPCR_SPEED_1000;
-   reg |= GM_GPCR_SPEED_100;
-   break;
-   case SPEED_10:
-   reg &= ~(GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100);
-   break;
-   }
-
-   if (sky2->duplex == DUPLEX_FULL)
-   reg |= GM_GPCR_DUP_FULL;
-
-   /* turn off pause in 10/100mbps half duplex */
-   else if (sky2->speed != SPEED_1000 &&
-hw->chip_id != CHIP_ID_YUKON_EC_U)
-   sky2->tx_pause = sky2->rx_pause = 0;
-   } else
-   reg = GM_GPCR_SPEED_1000 | GM_GPCR_SPEED_100 | GM_GPCR_DUP_FULL;
-
-   if (!sky2->tx_pause && !sky2->rx_pause) {
-   sky2_write32(hw, SK_REG(port, GMAC_CTRL), GMC_PAUSE_OFF);
-   reg |=
-   GM_GPCR_FC_TX_DIS | GM_GPCR_FC_RX_DIS | GM_GPCR_AU_FCT_DIS;
-   } else if (sky2->tx_pause && !sky2->rx_pause) {
-   /* disable Rx flow-control */
-   reg |= GM_GPCR_FC_RX_DIS | GM_GPCR_AU_FCT_DIS;
-   }
-
-   gma_write16(hw, port, GM_GP_CTRL, reg);
-
sky2_read16(hw, SK_REG(port, GMAC_IRQ_SRC));
 
+   /* Enable Transmit FIFO Underrun */
+   sky2_write8(hw, SK_REG(port, GMAC_IRQ_MSK), GMAC_DEF_MSK);
+
spin_lock_bh(&sky2->phy_lock);
sky2_phy_init(hw, port);
spin_unlock_bh(&sky2->phy_lock);
@@ -1529,40 +1516,10 @@
unsigned port = sky2->port;
u16 reg;
 
-   /* Enable Transmit 

[RFT 2/5] sky2: accept flow control

2006-09-05 Thread shemminger
Don't program the GMAC to reject flow control packets.
This may be the cause of some of the receive hangs.
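
The effect of the one-line change is easier to see with a toy version of the receive status test. In the sketch below the X_* bits are placeholders, not the real GMR_FS_* values, and rx_is_error() is a hypothetical helper. It only illustrates that a frame flagged as a good flow-control packet is counted as an error while that bit is part of the "any error" mask, and is accepted once the bit is removed.

```c
#include <stdint.h>

/* Placeholder status bits, NOT the real GMR_FS_* layout. */
enum {
	X_CRC_ERR = 1 << 0,
	X_BAD_FC  = 1 << 1,  /* malformed flow-control packet */
	X_GOOD_FC = 1 << 2   /* valid flow-control (pause) packet */
};

/* Error masks before and after the change, in terms of the placeholders. */
enum {
	X_ANY_ERR_OLD = X_CRC_ERR | X_BAD_FC | X_GOOD_FC,
	X_ANY_ERR_NEW = X_CRC_ERR | X_BAD_FC
};

/* A frame is treated as an error if any masked status bit is set. */
static int rx_is_error(uint32_t status, uint32_t any_err_mask)
{
	return (status & any_err_mask) != 0;
}
```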

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2.orig/drivers/net/sky2.h2006-09-05 15:17:38.0 -0700
+++ sky2/drivers/net/sky2.h 2006-09-05 15:18:00.0 -0700
@@ -1566,7 +1566,7 @@
 
GMR_FS_ANY_ERR  = GMR_FS_RX_FF_OV | GMR_FS_CRC_ERR |
  GMR_FS_FRAGMENT | GMR_FS_LONG_ERR |
- GMR_FS_MII_ERR | GMR_FS_BAD_FC | GMR_FS_GOOD_FC |
+ GMR_FS_MII_ERR | GMR_FS_BAD_FC |
  GMR_FS_UN_SIZE | GMR_FS_JABBER,
 };
 

--
Stephen Hemminger [EMAIL PROTECTED]



[RFT 0/5] sky2 experimental patches

2006-09-05 Thread shemminger
These patches (against 2.6.18-rc6) may solve some of the
mystery hangs and other open problems. Still seeking
confirmation.

--
Stephen Hemminger [EMAIL PROTECTED]



[RFT 1/5] sky2: more device ids (resend)

2006-09-05 Thread shemminger
Some more Marvell device id's, these are from the latest SysKonnect
driver version.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


---
 drivers/net/sky2.c |3 +++
 1 file changed, 3 insertions(+)

--- sky2.orig/drivers/net/sky2.c2006-09-01 14:49:49.0 -0700
+++ sky2/drivers/net/sky2.c 2006-09-01 14:49:56.0 -0700
@@ -106,6 +106,7 @@
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9000) },
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, 0x9E00) },
{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b00) },/* DGE-560T */
+   { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4001) },/* DGE-550SX */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4340) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4341) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4342) },
@@ -117,6 +118,7 @@
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4350) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4351) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4352) },
+   { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4353) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4360) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4361) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4362) },
@@ -126,6 +128,7 @@
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4366) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4367) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4368) },
+   { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4369) },
{ 0 }
 };
 

--
Stephen Hemminger [EMAIL PROTECTED]



[RFT 4/5] sky2: big endian fix

2006-09-05 Thread shemminger
Revised version of Ben's patch to fix big endian support.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- sky2.orig/drivers/net/sky2.c2006-09-05 13:39:34.0 -0700
+++ sky2/drivers/net/sky2.c 2006-09-05 13:57:44.0 -0700
@@ -809,7 +809,7 @@
struct sky2_rx_le *le;
 
le = sky2_next_rx(sky2);
-   le->addr = (ETH_HLEN << 16) | ETH_HLEN;
+   le->addr = cpu_to_le32((ETH_HLEN << 16) | ETH_HLEN);
le->ctrl = 0;
le->opcode = OP_TCPSTART | HW_OWNER;
 
@@ -1227,7 +1227,7 @@
/* Send high bits if changed or crosses boundary */
if (addr64 != sky2->tx_addr64 || high32(mapping + len) != sky2->tx_addr64) {
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32(addr64);
+   le->addr = cpu_to_le32(addr64);
le->ctrl = 0;
le->opcode = OP_ADDR64 | HW_OWNER;
sky2->tx_addr64 = high32(mapping + len);
@@ -1242,8 +1242,7 @@
 
if (mss != sky2->tx_last_mss) {
le = get_tx_le(sky2);
-   le->tx.tso.size = cpu_to_le16(mss);
-   le->tx.tso.rsvd = 0;
+   le->addr = cpu_to_le32(mss);
le->opcode = OP_LRGLEN | HW_OWNER;
le->ctrl = 0;
sky2->tx_last_mss = mss;
@@ -1256,7 +1255,7 @@
if (sky2->vlgrp && vlan_tx_tag_present(skb)) {
if (!le) {
le = get_tx_le(sky2);
-   le->tx.addr = 0;
+   le->addr = 0;
le->opcode = OP_VLAN|HW_OWNER;
le->ctrl = 0;
} else
@@ -1268,20 +1267,21 @@
 
/* Handle TCP checksum offload */
if (skb->ip_summed == CHECKSUM_HW) {
-   u16 hdr = skb->h.raw - skb->data;
-   u16 offset = hdr + skb->csum;
+   unsigned offset = skb->h.raw - skb->data;
+   u32 tcpsum;
+
+   tcpsum = offset << 16;  /* sum start */
+   tcpsum |= offset + skb->csum;   /* sum write */
 
ctrl = CALSUM | WR_SUM | INIT_SUM | LOCK_SUM;
if (skb->nh.iph->protocol == IPPROTO_UDP)
ctrl |= UDPTCP;
 
-   if (hdr != sky2->tx_csum_start || offset != sky2->tx_csum_offset) {
-   sky2->tx_csum_start = hdr;
-   sky2->tx_csum_offset = offset;
+   if (tcpsum != sky2->tx_tcpsum) {
+   sky2->tx_tcpsum = tcpsum;
 
le = get_tx_le(sky2);
-   le->tx.csum.start = cpu_to_le16(hdr);
-   le->tx.csum.offset = cpu_to_le16(offset);
+   le->addr = cpu_to_le32(tcpsum);
le->length = 0; /* initial checksum value */
le->ctrl = 1;   /* one packet */
le->opcode = OP_TCPLISW | HW_OWNER;
@@ -1289,7 +1289,7 @@
}
 
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32((u32) mapping);
+   le->addr = cpu_to_le32((u32) mapping);
le->length = cpu_to_le16(len);
le->ctrl = ctrl;
le->opcode = mss ? (OP_LARGESEND | HW_OWNER) : (OP_PACKET | HW_OWNER);
@@ -1307,14 +1307,14 @@
addr64 = high32(mapping);
if (addr64 != sky2->tx_addr64) {
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32(addr64);
+   le->addr = cpu_to_le32(addr64);
le->ctrl = 0;
le->opcode = OP_ADDR64 | HW_OWNER;
sky2->tx_addr64 = addr64;
}
 
le = get_tx_le(sky2);
-   le->tx.addr = cpu_to_le32((u32) mapping);
+   le->addr = cpu_to_le32((u32) mapping);
le->length = cpu_to_le16(frag->size);
le->ctrl = ctrl;
le->opcode = OP_BUFFER | HW_OWNER;
@@ -1919,8 +1919,8 @@
dev = hw->dev[le->link];
 
sky2 = netdev_priv(dev);
-   length = le->length;
-   status = le->status;
+   length = le16_to_cpu(le->length);
+   status = le32_to_cpu(le->status);
 
switch (le->opcode & ~HW_OWNER) {
case OP_RXSTAT:
@@ -1964,7 +1964,7 @@
case OP_RXCHKS:
skb = sky2->rx_ring[sky2->rx_next].skb;
skb->ip_summed = CHECKSUM_HW;
-   skb->csum = le16_to_cpu(status);
+   skb->csum = status & 0xffff;
break;
 
case OP_TXINDEXLE:
@@ -3266,12 +3266,13 @@
hw->pm_cap = pm_cap;
 
 #ifdef __BIG_ENDIAN
-   /* byte swap descriptors in hardware */
+   /* The sk98lin vendor driver uses hardware byte swapping but
+* this driver uses software swapping.
+*/
{
u32 reg;
-

[2.6 patch] net/sctp/: cleanups

2006-09-05 Thread Adrian Bunk
This patch contains the following cleanups:
- make the following needlessly global function static:
  - socket.c: sctp_apply_peer_addr_params()
- add proper prototypes for the several global functions in
  include/net/sctp/sctp.h

Note that this fixes wrong prototypes for the following functions:
- sctp_snmp_proc_exit()
- sctp_eps_proc_exit()
- sctp_assocs_proc_exit()

The latter was spotted by the GNU C compiler and reported
by David Woodhouse.
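
For readers wondering why moving the prototypes into a shared header matters, here is a small single-file sketch with made-up demo_* names. The declarations at the top play the role of the header. Because the definitions see them, a return type that disagrees with the declaration is rejected by the compiler, which a private extern line in a different .c file would not guarantee.

```c
/* These declarations play the role of a shared header such as
 * include/net/sctp/sctp.h. All demo_* names are hypothetical. */
int demo_proc_init(void);
void demo_proc_exit(void);
int demo_is_registered(void);

static int demo_registered;

/* The definitions below are checked against the declarations above.
 * Changing "void demo_proc_exit(void)" to return int here would be a
 * "conflicting types" compile error instead of a silent mismatch. */
int demo_proc_init(void)
{
	demo_registered = 1;
	return 0;
}

void demo_proc_exit(void)
{
	demo_registered = 0;
}

int demo_is_registered(void)
{
	return demo_registered;
}
```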

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 include/net/sctp/sctp.h |   13 +
 net/sctp/ipv6.c |1 -
 net/sctp/protocol.c |7 ---
 net/sctp/socket.c   |   14 +++---
 4 files changed, 20 insertions(+), 15 deletions(-)

--- linux-2.6.18-rc5-mm1/include/net/sctp/sctp.h.old2006-09-05 
16:50:33.0 +0200
+++ linux-2.6.18-rc5-mm1/include/net/sctp/sctp.h2006-09-05 
16:54:18.0 +0200
@@ -128,6 +128,8 @@
 int flags);
 extern struct sctp_pf *sctp_get_pf_specific(sa_family_t family);
 extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
+int sctp_inetaddr_event(struct notifier_block *this, unsigned long ev,
+void *ptr);
 
 /*
  * sctp/socket.c
@@ -178,6 +180,17 @@
  struct sock *oldsk, struct sock *newsk);
 
 /*
+ * sctp/proc.c
+ */
+int sctp_snmp_proc_init(void);
+void sctp_snmp_proc_exit(void);
+int sctp_eps_proc_init(void);
+void sctp_eps_proc_exit(void);
+int sctp_assocs_proc_init(void);
+void sctp_assocs_proc_exit(void);
+
+
+/*
  *  Section:  Macros, externs, and inlines
  */
 
--- linux-2.6.18-rc5-mm1/net/sctp/socket.c.old  2006-09-05 16:49:15.0 
+0200
+++ linux-2.6.18-rc5-mm1/net/sctp/socket.c  2006-09-05 16:49:27.0 
+0200
@@ -2081,13 +2081,13 @@
  * SPP_SACKDELAY_ENABLE, setting both will have undefined
  * results.
  */
-int sctp_apply_peer_addr_params(struct sctp_paddrparams *params,
-   struct sctp_transport   *trans,
-   struct sctp_association *asoc,
-   struct sctp_sock*sp,
-   int  hb_change,
-   int  pmtud_change,
-   int  sackdelay_change)
+static int sctp_apply_peer_addr_params(struct sctp_paddrparams *params,
+  struct sctp_transport   *trans,
+  struct sctp_association *asoc,
+  struct sctp_sock*sp,
+  int  hb_change,
+  int  pmtud_change,
+  int  
sackdelay_change)
 {
int error;
 
--- linux-2.6.18-rc5-mm1/net/sctp/ipv6.c.old2006-09-05 16:50:51.0 
+0200
+++ linux-2.6.18-rc5-mm1/net/sctp/ipv6.c2006-09-05 16:50:58.0 
+0200
@@ -78,7 +78,6 @@
 
 #include <asm/uaccess.h>
 
-extern int sctp_inetaddr_event(struct notifier_block *, unsigned long, void *);
 static struct notifier_block sctp_inet6addr_notifier = {
.notifier_call = sctp_inetaddr_event,
 };
--- linux-2.6.18-rc5-mm1/net/sctp/protocol.c.old2006-09-05 
16:53:10.0 +0200
+++ linux-2.6.18-rc5-mm1/net/sctp/protocol.c2006-09-05 16:53:20.0 
+0200
@@ -82,13 +82,6 @@
 kmem_cache_t *sctp_chunk_cachep __read_mostly;
 kmem_cache_t *sctp_bucket_cachep __read_mostly;
 
-extern int sctp_snmp_proc_init(void);
-extern int sctp_snmp_proc_exit(void);
-extern int sctp_eps_proc_init(void);
-extern int sctp_eps_proc_exit(void);
-extern int sctp_assocs_proc_init(void);
-extern int sctp_assocs_proc_exit(void);
-
 /* Return the address of the control sock. */
 struct sock *sctp_get_ctl_sock(void)
 {



[RFC] A change in periodic work scheduling in bcm43xx

2006-09-05 Thread Larry Finger

Michael,

Based on user reports and my own experiences, the current problems with NETDEV WATCHDOG tx timeouts, 
and the device just falling over do not happen when periodic work is not preemptible. These problems 
seem to affect BCM4306 rev 2 & 3 chips. Since I changed BADNESS_LIMIT to 20 to disable preemption 
during periodic work, my device has stayed up continuously for more than 18 hours. Previously, the 
longest time between failures was less than 6 hours, and sometimes as short as 10 minutes.


As you know, the present scheme for periodic work scheduling for bcm43xx in both wireless-2.6 and 
wireless-dev runs all 4 periodic tasks on certain ticks of the 15-second clock. Using your values of 
badness of 1, 1, 5, and 10 for the 15, 30, 60, and 120 second periodic tasks, respectively, the 
badness repeat cycle is ..., 1, 2, 1, 7, 1, 2, 1, 17, ...


I propose that we reduce the size of the spike in badness by shifting the 120 second task from a 
clock value of 8n to 8n+7, and the 60 second task from 4n to 4n+1. This way no more than 2 of the 
periodic tasks will be run in any clock period, and the badness repeat cycle becomes ..., 6, 2, 1, 
2, 6, 2, 11, 2, ... The tasks are run with the same periodicity as before, just a little more 
asynchronously. I recall that they were completely asynchronous in early versions of this driver.
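
The two badness cycles quoted above can be checked with a few lines of stand-alone C. This is a sketch of the arithmetic only, not the driver code. tick_badness() and peak_badness() are made-up names, and the per-task weights 1, 1, 5 and 10 are the ones given in this message.

```c
/* Badness of one tick of the 15-second clock. off120 and off60 are the
 * residues (state % 8 and state % 4) on which the 120 s and 60 s tasks
 * run: 0 and 0 in the current scheme, 7 and 1 in the proposed one. */
static int tick_badness(unsigned state, unsigned off120, unsigned off60)
{
	int badness = 1;                /* 15 s task runs on every tick */

	if (state % 2 == 0)             /* 30 s task */
		badness += 1;
	if (state % 4 == off60)         /* 60 s task */
		badness += 5;
	if (state % 8 == off120)        /* 120 s task */
		badness += 10;
	return badness;
}

/* Largest per-tick badness over one full 8-tick cycle. */
static int peak_badness(unsigned off120, unsigned off60)
{
	int peak = 0;
	unsigned s;

	for (s = 0; s < 8; s++) {
		int b = tick_badness(s, off120, off60);

		if (b > peak)
			peak = b;
	}
	return peak;
}
```

With offsets 0 and 0 the peak is 17, reached when all four tasks coincide. With offsets 7 and 1 the peak drops to 11, and states 1 through 8 give 6, 2, 1, 2, 6, 2, 11, 2, matching the cycle listed above.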


Until we can locate and fix the problem that occurs during preemption, should we consider setting 
BADNESS_LIMIT to 20 in the wireless-2.6 kernels? For those of us whose cards have the problem, it 
certainly makes the device a lot more usable.


Larry

The patches to implement the scheduling change are as follows:

Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===
--- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -3195,9 +3195,9 @@ static void do_periodic_work(struct bcm4
unsigned int state;

state = bcm->periodic_state;
-   if (state % 8 == 0)
+   if (state % 8 == 7)
bcm43xx_periodic_every120sec(bcm);
-   if (state % 4 == 0)
+   if (state % 4 == 1)
bcm43xx_periodic_every60sec(bcm);
if (state % 2 == 0)
bcm43xx_periodic_every30sec(bcm);
@@ -3216,8 +3216,8 @@ static int estimate_periodic_work_badnes
 {
int badness = 0;

-   if (state % 8 == 0) /* every 120 sec */
+   if (state % 8 == 7) /* every 120 sec */
badness += 10;
-   if (state % 4 == 0) /* every 60 sec */
+   if (state % 4 == 1) /* every 60 sec */
badness += 5;
if (state % 2 == 0) /* every 30 sec */





Re: [RFC] A change in periodic work scheduling in bcm43xx

2006-09-05 Thread Michael Buesch
On Tuesday 05 September 2006 19:58, Larry Finger wrote:
 Michael,
 
 Based on user reports and my own experiences, the current problems with 
 NETDEV WATCHDOG tx timeouts, 
 and the device just falling over do not happen when periodic work is not 
 preemptible. These problems 
 seem to affect BCM4306 rev 2 & 3 chips. Since I changed BADNESS_LIMIT to 20 
 to disable preemption 
 during periodic work, my device has stayed up continuously for more than 18 
 hours. Previously, the 
 longest time between failures was less than 6 hours, and sometimes as short 
 as 10 minutes.
 
 As you know, the present scheme for periodic work scheduling for bcm43xx in 
 both wireless-2.6 and 
 wireless-dev runs all 4 periodic tasks on certain ticks of the 15-second 
 clock. Using your values of 
 badness of 1, 1, 5, and 10 for the 15, 30, 60, and 120 second periodic 
 tasks, respectively, the 
 badness repeat cycle is ..., 1, 2, 1, 7, 1, 2, 1, 17, ...
 
 I propose that we reduce the size of the spike in badness by shifting the 120 
 second task from a 
 clock value of 8n to 8n+7, and the 60 second task from 4n to 4n+1. This way 
 no more than 2 of the 
 periodic tasks will be run in any clock period, and the badness repeat cycle 
 becomes ..., 6, 2, 1, 
 2, 6, 2, 11, 2, ... The tasks are run with the same periodicity as before, 
 just a little more 
 asynchronously. I recall that they were completely asynchronous in early 
 versions of this driver.
 
 Until we can locate and fix the problem that occurs during preemption, should 
 we consider setting 
 BADNESS_LIMIT to 20 in the wireless-2.6 kernels? For those of us whose cards 
 have the problem, it 
 certainly makes the device a lot more usable.

Oh well...
And if we do this, it will take two weeks for the latency-people to
show up and request a revert of this again.

Well, I _really_ don't want to have a patch like this, because
it just papers over a real bug.
There are only two choices: Either we want preemption or we don't.
It's worthless to tune the badness limit to a point where it is least
likely for the bug to trigger. Sooner or later it _will_ trigger.

What we really want is:
1st: A reliable way to reproduce the bug in a short time.
 Waiting 20 hours isn't really a good way of debugging.
2nd: If we can reproduce it in reasonable time, we can track
 down what is actually causing the bug.

My thoughts on the bug:

When a preemptible work happens, we completely shutdown IRQ
handling and we suspend the MAC. We do this, because we must
not take the IRQ spinlock if we want to be preemptible.
By not taking the IRQ spinlock, we race against the DMA engine
(and other parts). So we must shutdown any data flow during
the periodic work to ensure the IRQ handler does not trigger.
The sad thing is: We don't know much about how the card and
the firmware works (yet). So the big question is:
How to suspend the card in an easy and _inexpensive_ way?
We currently mask all IRQs and suspend the MAC. I guess MAC
suspending is part of the problem. I _guess_ the card is
confused by suspending the MAC in the middle of possible
transmissions. It's all just a guess. That's why I want to
have a good way to reproduce the bug to do experiments.
We could suspend the DMA TX channel before we suspend the MAC,
for example. We could try other things as well. For example,
don't suspend the MAC at all. Just mask IRQs.
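
The experiments suggested above differ only in which quiesce steps run
around the periodic work and in what order. A minimal sketch of that
ordering follows. Every function and struct name in it is a hypothetical
stand-in (nothing here is the driver's real API), and the stubs only
record that they ran, so the sequence can be inspected without hardware.

```c
#include <string.h>

/*
 * All names below are hypothetical stand-ins, not the driver's real API.
 * Each stub only records that it ran, so the ordering can be inspected.
 */
static char trace[256];

static void record(const char *step)
{
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

static void hw_mask_irqs(void)      { record("mask_irqs"); }
static void hw_unmask_irqs(void)    { record("unmask_irqs"); }
static void hw_tx_dma_suspend(void) { record("tx_dma_suspend"); }
static void hw_tx_dma_resume(void)  { record("tx_dma_resume"); }
static void hw_mac_suspend(void)    { record("mac_suspend"); }
static void hw_mac_enable(void)     { record("mac_enable"); }
static void periodic_work(void)     { record("work"); }

/* Which optional quiesce steps to try in a given experiment. */
struct quiesce_plan {
	int suspend_tx_dma;	/* stop the DMA TX channel before the MAC */
	int suspend_mac;	/* suspend the MAC at all, or only mask IRQs */
};

static void run_periodic_work(const struct quiesce_plan *plan)
{
	trace[0] = '\0';

	/* IRQs are always masked; the other steps are the experiment. */
	hw_mask_irqs();
	if (plan->suspend_tx_dma)
		hw_tx_dma_suspend();
	if (plan->suspend_mac)
		hw_mac_suspend();

	periodic_work();

	/* Undo the steps in reverse order. */
	if (plan->suspend_mac)
		hw_mac_enable();
	if (plan->suspend_tx_dma)
		hw_tx_dma_resume();
	hw_unmask_irqs();
}
```

With both flags set, TX DMA is stopped before the MAC is suspended. With
both cleared, only the IRQs are masked. Whether either variant actually
avoids the bug is exactly what the experiments would have to show.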

We must be _careful_ here. The preemptible periodic work
is a damn fragile part of the whole driver and it is easily
possible to break it even more with a patch that looks
correct.

Short:
We don't need a patch to paper over the bug, but we need
_ideas_ of what is actually going on.

-- 
Greetings Michael.


Re: [IPSEC]: searching SAD without assumming L3 details

2006-09-05 Thread Herbert Xu
On Sat, Sep 02, 2006 at 09:43:02AM -0400, jamal wrote:

 Allow for searching the SAD from external data path points without
 assuming L3 details. The only customer of this exposure currently
 is pktgen.

Any reason why xfrm_state_find can't be used? It doesn't look right
to add generic code that's only used by pktgen.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


roaming support for d80211 stack

2006-09-05 Thread Mohamed Abbas

Hi
Is anyone working on roaming support for the d80211 stack?
Thanks
Mohamed


Re: e1000 Detected Tx Unit Hang

2006-09-05 Thread Paul Aviles
I haven't done the NAPI test yet. These are identical systems altogether; at 
most the CPU is a different stepping, but that is all.
The "16:  70540  0   IO-APIC-level  uhci_hcd:usb4, eth0" line is the same on 
every GS12 I have. No overclocking, and the same BIOS. Tyan released ver 1.8 
about a month ago; I did the upgrade and saw the same effect. Then I tried 
upgrading to 2.6.17.11 just to see if the driver would behave any differently, 
and nothing, same deal. The only way I was able to control it was by using a 
dumb 10/100 unmanaged switch. Then we had no issues.


I will try without NAPI tomorrow (9-6-06) and will report back. My 
understanding of NAPI was that it drops packets by design under overload. 
Why would that cause a system lockup?


Are there any other kernel options you would like me to enable to track this 
better? If you need remote access to the system I can accommodate that too; 
just let me know what time zone you are in so we can schedule it.


Regards,

Paul Aviles

- Original Message - 
From: Jesse Brandeburg [EMAIL PROTECTED]

To: Paul Aviles [EMAIL PROTECTED]
Cc: netdev@vger.kernel.org
Sent: Tuesday, September 05, 2006 12:09 PM
Subject: Re: e1000 Detected Tx Unit Hang



On 9/3/06, Paul Aviles [EMAIL PROTECTED] wrote:

Hey Jesse, thanks for your reply. Here is the stuff on /procs. The weird

no problem,


part is that I have several other identical systems and only one is
affected. Today I moved the hard drive to another similar system and I am
not seeing the problem so I am wondering if is something maybe wrong with
the card eeprom? Is there a way to check that?


I doubt it is an eeprom problem.  you can dump the eeproms with
ethtool -e eth0 from both machines and compare them .  Odd that only
one system is having the problem.  Could it be that the hardware on
that box is having issues?  Are you sure the machines are running the
same bios version with the same settings?  Any overclocking?


 cat /proc/interrupts
   CPU0   CPU1
 16:  70540  0   IO-APIC-level  uhci_hcd:usb4, eth0


this could contribute to your problem, were you able to test without NAPI?

Jesse







Re: [2.6 patch] net/sctp/: cleanups

2006-09-05 Thread Sridhar Samudrala
On Tue, 2006-09-05 at 23:57 +0200, Adrian Bunk wrote:
 This patch contains the following cleanups:
 - make the following needlessly global function static:
   - socket.c: sctp_apply_peer_addr_params()
 - add proper prototypes for the several global functions in
   include/net/sctp/sctp.h
 
 Note that this fixes wrong prototypes for the following functions:
 - sctp_snmp_proc_exit()
 - sctp_eps_proc_exit()
 - sctp_assocs_proc_exit()
 
 The latter was spotted by the GNU C compiler and reported
 by David Woodhouse.
 
 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

Acked-by: Sridhar Samudrala [EMAIL PROTECTED]

 
 ---
 
  include/net/sctp/sctp.h |   13 +
  net/sctp/ipv6.c |1 -
  net/sctp/protocol.c |7 ---
  net/sctp/socket.c   |   14 +++---
  4 files changed, 20 insertions(+), 15 deletions(-)
 
 --- linux-2.6.18-rc5-mm1/include/net/sctp/sctp.h.old  2006-09-05 16:50:33.0 +0200
 +++ linux-2.6.18-rc5-mm1/include/net/sctp/sctp.h  2006-09-05 16:54:18.0 +0200
 @@ -128,6 +128,8 @@
int flags);
  extern struct sctp_pf *sctp_get_pf_specific(sa_family_t family);
  extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
 +int sctp_inetaddr_event(struct notifier_block *this, unsigned long ev,
 +void *ptr);
  
  /*
   * sctp/socket.c
 @@ -178,6 +180,17 @@
 struct sock *oldsk, struct sock *newsk);
  
  /*
 + * sctp/proc.c
 + */
 +int sctp_snmp_proc_init(void);
 +void sctp_snmp_proc_exit(void);
 +int sctp_eps_proc_init(void);
 +void sctp_eps_proc_exit(void);
 +int sctp_assocs_proc_init(void);
 +void sctp_assocs_proc_exit(void);
 +
 +
 +/*
   *  Section:  Macros, externs, and inlines
   */
  
 --- linux-2.6.18-rc5-mm1/net/sctp/socket.c.old  2006-09-05 16:49:15.0 +0200
 +++ linux-2.6.18-rc5-mm1/net/sctp/socket.c  2006-09-05 16:49:27.0 +0200
 @@ -2081,13 +2081,13 @@
   * SPP_SACKDELAY_ENABLE, setting both will have undefined
   * results.
   */
 -int sctp_apply_peer_addr_params(struct sctp_paddrparams *params,
 - struct sctp_transport   *trans,
 - struct sctp_association *asoc,
 - struct sctp_sock*sp,
 - int  hb_change,
 - int  pmtud_change,
 - int  sackdelay_change)
 +static int sctp_apply_peer_addr_params(struct sctp_paddrparams *params,
 +struct sctp_transport   *trans,
 +struct sctp_association *asoc,
 +struct sctp_sock*sp,
 +int  hb_change,
 +int  pmtud_change,
 +int  sackdelay_change)
  {
   int error;
  
 --- linux-2.6.18-rc5-mm1/net/sctp/ipv6.c.old  2006-09-05 16:50:51.0 +0200
 +++ linux-2.6.18-rc5-mm1/net/sctp/ipv6.c  2006-09-05 16:50:58.0 +0200
 @@ -78,7 +78,6 @@
  
  #include <asm/uaccess.h>
  
 -extern int sctp_inetaddr_event(struct notifier_block *, unsigned long, void *);
  static struct notifier_block sctp_inet6addr_notifier = {
   .notifier_call = sctp_inetaddr_event,
  };
 --- linux-2.6.18-rc5-mm1/net/sctp/protocol.c.old  2006-09-05 16:53:10.0 +0200
 +++ linux-2.6.18-rc5-mm1/net/sctp/protocol.c  2006-09-05 16:53:20.0 +0200
 @@ -82,13 +82,6 @@
  kmem_cache_t *sctp_chunk_cachep __read_mostly;
  kmem_cache_t *sctp_bucket_cachep __read_mostly;
  
 -extern int sctp_snmp_proc_init(void);
 -extern int sctp_snmp_proc_exit(void);
 -extern int sctp_eps_proc_init(void);
 -extern int sctp_eps_proc_exit(void);
 -extern int sctp_assocs_proc_init(void);
 -extern int sctp_assocs_proc_exit(void);
 -
  /* Return the address of the control sock. */
  struct sock *sctp_get_ctl_sock(void)
  {



[RFT] sky2 vs iptables

2006-09-05 Thread Daniel Drake

Hi,

There's a strange sky2 bug on the Gentoo bugzilla:
http://bugs.gentoo.org/show_bug.cgi?id=136508

sky2 seems to work OK, but breaks as soon as the iptables ruleset is 
loaded. Nothing can be pinged, etc.


Can someone try and reproduce this? The iptables rule script has been 
uploaded here:

http://bugs.gentoo.org/attachment.cgi?id=95694&action=view

The very last command in that file is the one which produces an error 
and stops everything working:


iptables: Unknown error 18446744073709551615

Apparently a sky2 null deref has also been seen at this point, although 
I don't have further details on that.


Thanks!
Daniel


[PATCH] myri10ge: update the firmware download URL in Kconfig

2006-09-05 Thread Brice Goglin
Jeff,

Could you please push the following patch to Linus before 2.6.18?
It updates the firmware download URL in Kconfig to match the header
in drivers/net/myri10ge/myri10ge.c.

Thanks!
Brice Goglin



From: Brice Goglin [EMAIL PROTECTED]

[PATCH] myri10ge: update the firmware download URL in Kconfig

Update the firmware download URL in Kconfig to match the header
in drivers/net/myri10ge/myri10ge.c.

Signed-off-by: Brice Goglin [EMAIL PROTECTED]
---
 drivers/net/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-rc/drivers/net/Kconfig
===
--- linux-rc.orig/drivers/net/Kconfig   2006-09-05 11:19:37.0 -0400
+++ linux-rc/drivers/net/Kconfig2006-09-05 11:19:49.0 -0400
@@ -2393,7 +2393,7 @@
  you will need a newer firmware image.
  You may get this image or more information, at:
 
- http://www.myri.com/Myri-10G/
+ http://www.myri.com/scs/download-Myri10GE.html
 
  To compile this driver as a module, choose M here and read
  file:Documentation/networking/net-modules.txt.  The module




Re: ProxyARP and IPSec

2006-09-05 Thread Stephen J. Bevan
Alexey Kuznetsov writes:
  Probably, you are not aware that standard IPsec tunnel device,
  if it is created:
  
  1. Probably, will not accept fragmented frames, because IPsec cannot
 handle them

IPsec can handle them, though not particularly smoothly if the IPsec
tunnel is only supposed to carry a particular port/protocol
combination.


  2. Probably, will have undefined MTU (65536), because of 1

An MTU that is more likely to make most things work (at least over
Ethernet) is ETH_DATA_LEN - MAX_SA_LEN where MAX_SA_LEN is however
much is required for IPsec (something like IP + UDP if NAT-T + ESP
header + IV + padding + ESP trailer).  The simplest thing is to just
statically configure it.  However, some implementations dynamically
calculate the IPsec device MTU based on the maximum size required by
any of the IPsec SAs that will go over the interface, using either a
pessimistic (255) or optimistic (2) padding estimate.  This can cause
problems for OSPF adjacency if each side arrives at a different MTU
but that can be handled by either manually configuring the device MTU
or explicitly configuring the MTU that OSPF will advertise (I think
Quagga supports that).
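
As a rough illustration of the ETH_DATA_LEN - MAX_SA_LEN calculation
described above, here is a minimal sketch. The struct, its field names and
the helper functions are hypothetical and do not come from any existing
implementation; only the formula and the 2/255 padding estimates are taken
from the text.

```c
#include <stddef.h>

#define ETH_DATA_LEN 1500	/* Ethernet payload size in bytes */

/*
 * Per-SA encapsulation overhead in bytes. Field names and the example
 * values used with them are assumptions for illustration only.
 */
struct sa_overhead {
	unsigned int outer_ip;	/* outer IP header */
	unsigned int natt_udp;	/* UDP header when NAT-T is used, else 0 */
	unsigned int esp_hdr;	/* SPI + sequence number */
	unsigned int iv;	/* cipher IV */
	unsigned int padding;	/* estimate: 2 optimistic, 255 pessimistic */
	unsigned int trailer;	/* pad length + next header + ICV */
};

static unsigned int sa_len(const struct sa_overhead *o)
{
	return o->outer_ip + o->natt_udp + o->esp_hdr + o->iv +
	       o->padding + o->trailer;
}

/* Device MTU = ETH_DATA_LEN minus the largest overhead of any SA. */
static unsigned int ipsec_dev_mtu(const struct sa_overhead *sas, size_t n)
{
	unsigned int max_sa_len = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (sa_len(&sas[i]) > max_sa_len)
			max_sa_len = sa_len(&sas[i]);

	return ETH_DATA_LEN - max_sa_len;
}
```

Assuming an IPv4 SA with NAT-T, a 16-byte IV and a 12-byte ICV (so
{ 20, 8, 8, 16, padding, 14 }), the optimistic estimate gives an overhead
of 68 bytes and an MTU of 1432, while the pessimistic one gives 321 bytes
and an MTU of 1179. Two peers that pick different estimates end up with
different MTUs, which is the OSPF adjacency problem mentioned above.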


  Actually, this is the reason why it is not implemented.
  It is dirty business. :-) And the person, who implements this,
  has to be really... unscrupulous. :-)

Exactly the same issue occurs if one implements IPsec (or any other
encryption method) in user-level using a tun/tap device.  Consequently
while I agree that fragmentation causes an additional level of
problems if one wants to have port/protocol based selectors in IPsec,
I believe that most (but not all) VPN users are quite satisfied with
policies containing all traffic, all ports and so will not encounter
any IPsec specific problems.


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-05 Thread john stultz
On Tue, 2006-09-05 at 16:35 +0100, David Howells wrote:
 Stop do_gettimeofday() on FRV from using tickadj, and model it after ARM
 instead.
 
 This patch also provides a placeholder macro for getting hardware timer data
 to be filled in when such is available.

From this patch it looks like the FRV arch could be trivially converted
to GENERIC_TIME.

Would you consider the following, totally untested patch?

Signed-off-by: John Stultz [EMAIL PROTECTED]

 Kconfig   |4 ++
 kernel/time.c |   81 --
 2 files changed, 4 insertions(+), 81 deletions(-)

diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig
index 95a3892..a601a17 100644
--- a/arch/frv/Kconfig
+++ b/arch/frv/Kconfig
@@ -29,6 +29,10 @@ config GENERIC_HARDIRQS
bool
default n
 
+config GENERIC_TIME
+   bool
+   default y
+
 config TIME_LOW_RES
bool
default y
diff --git a/arch/frv/kernel/time.c b/arch/frv/kernel/time.c
index d5b64e1..68a77fe 100644
--- a/arch/frv/kernel/time.c
+++ b/arch/frv/kernel/time.c
@@ -32,8 +32,6 @@
 
 #define TICK_SIZE (tick_nsec / 1000)
 
-extern unsigned long wall_jiffies;
-
 unsigned long __nongprelbss __clkin_clock_speed_HZ;
 unsigned long __nongprelbss __ext_bus_clock_speed_HZ;
 unsigned long __nongprelbss __res_bus_clock_speed_HZ;
@@ -145,85 +143,6 @@ void time_init(void)
 }
 
 /*
- * This version of gettimeofday has near microsecond resolution.
- */
-void do_gettimeofday(struct timeval *tv)
-{
-   unsigned long seq;
-   unsigned long usec, sec;
-   unsigned long max_ntp_tick;
-
-   do {
-   unsigned long lost;
-
-   seq = read_seqbegin(&xtime_lock);
-
-   usec = 0;
-   lost = jiffies - wall_jiffies;
-
-   /*
-* If time_adjust is negative then NTP is slowing the clock
-* so make sure not to go into next possible interval.
-* Better to lose some accuracy than have time go backwards..
-*/
-   if (unlikely(time_adjust < 0)) {
-   max_ntp_tick = (USEC_PER_SEC / HZ) - tickadj;
-   usec = min(usec, max_ntp_tick);
-
-   if (lost)
-   usec += lost * max_ntp_tick;
-   }
-   else if (unlikely(lost))
-   usec += lost * (USEC_PER_SEC / HZ);
-
-   sec = xtime.tv_sec;
-   usec += (xtime.tv_nsec / 1000);
-   } while (read_seqretry(&xtime_lock, seq));
-
-   while (usec >= 1000000) {
-   usec -= 1000000;
-   sec++;
-   }
-
-   tv->tv_sec = sec;
-   tv->tv_usec = usec;
-}
-
-EXPORT_SYMBOL(do_gettimeofday);
-
-int do_settimeofday(struct timespec *tv)
-{
-   time_t wtm_sec, sec = tv->tv_sec;
-   long wtm_nsec, nsec = tv->tv_nsec;
-
-   if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
-   return -EINVAL;
-
-   write_seqlock_irq(&xtime_lock);
-   /*
-* This is revolting. We need to set xtime correctly. However, the
-* value in this location is the value at the most recent update of
-* wall time.  Discover what correction gettimeofday() would have
-* made, and then undo it!
-*/
-   nsec -= 0 * NSEC_PER_USEC;
-   nsec -= (jiffies - wall_jiffies) * TICK_NSEC;
-
-   wtm_sec  = wall_to_monotonic.tv_sec + (xtime.tv_sec - sec);
-   wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - nsec);
-
-   set_normalized_timespec(&xtime, sec, nsec);
-   set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec);
-
-   ntp_clear();
-   write_sequnlock_irq(&xtime_lock);
-   clock_was_set();
-   return 0;
-}
-
-EXPORT_SYMBOL(do_settimeofday);
-
-/*
  * Scheduler clock - returns current time in nanosec units.
  */
 unsigned long long sched_clock(void)

