Re: [PATCH] rtnl: Simplify ASSERT_RTNL

2007-10-03 Thread Herbert Xu
On Tue, Oct 02, 2007 at 05:29:11PM +0200, Patrick McHardy wrote:

 I think this doesn't completely fix it, when dev_unicast_add is
 interrupted by dev_mc_add before the unicast changes are performed,
 they will get committed in the dev_mc_add context, so we might still
 call change_flags with BH disabled. Taking the TX lock around the
 dev->uc_count and dev->uc_promisc checks and changes in __dev_set_rx_mode
 should fix this.

Good catch.  Digging back in history it seems that you added
the change_rx_flags function so that the driver didn't have to
do it under TX lock, right?

The problem with this is that the stack can now call
change_rx_flags and set_multicast_list simultaneously
which presents a potential headache for the driver
author (if they were to use change_rx_flags).

It seems to me what we could do is in fact separate out the
part that adds the address and the part that syncs it with
hardware.

That way we can call the hardware from a process context later
and use the RTNL to guarantee that we only enter the driver
once.

So dev_mc_add would look like:

1) Hold some form of lock L.
2) Modify mc list A (a copy of the current mc list).
3) Drop lock.
4) Schedule an update to the hardware.

The update to the hardware would look like:

1) Hold RTNL.
2) Hold lock L.
3) Copy list A to list B (B would be our current list).
4) Drop lock L.
5) Call the hardware.
6) Drop RTNL.

For compatibility, set_multicast_list would still be invoked
under the TX lock while set_rx_mode would do exactly the same
thing but would only hold the RTNL.
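
The two sequences above could be sketched in userspace C roughly as follows. This is only an illustration of the proposed locking order, not kernel code: pthread mutexes stand in for the RTNL and for "lock L", and every name (`mc_add`, `mc_sync_hw`, `hw_set_rx_mode`, `pending`, `active`) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MC_MAX 8

/* Hypothetical stand-ins: rtnl models the RTNL, list_lock models "lock L". */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

struct mc_list {
    int count;
    unsigned char addr[MC_MAX][6];
};

static struct mc_list pending;  /* list A: what callers have requested */
static struct mc_list active;   /* list B: what the hardware was last given */
static int sync_scheduled;      /* stands in for a queued work item */
static int hw_calls;            /* counts entries into the "driver" */

/* Stand-in for the driver hook; in this model it runs with only rtnl held. */
static void hw_set_rx_mode(const struct mc_list *list)
{
    (void)list;
    hw_calls++;
}

/* Add path: take lock L, modify list A, drop the lock, schedule an update. */
static int mc_add(const unsigned char *addr)
{
    int ret = -1;

    assert(addr != NULL);
    pthread_mutex_lock(&list_lock);
    if (pending.count < MC_MAX) {
        memcpy(pending.addr[pending.count++], addr, 6);
        sync_scheduled = 1;
        ret = 0;
    }
    pthread_mutex_unlock(&list_lock);
    return ret;
}

/* Update path: RTNL, then lock L only long enough to copy A to B. */
static void mc_sync_hw(void)
{
    pthread_mutex_lock(&rtnl);

    pthread_mutex_lock(&list_lock);
    active = pending;
    sync_scheduled = 0;
    pthread_mutex_unlock(&list_lock);

    hw_set_rx_mode(&active);    /* lock L is no longer held here */

    pthread_mutex_unlock(&rtnl);
}
```

In this model the driver hook is only ever entered with the outer mutex held, which is the "enter the driver once" property described above, while `mc_add` never touches that mutex.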

What do you think about this approach?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


kernel 2.4 vs 2.6 Traffic Controller performance

2007-10-03 Thread Sonny
Hello,
This is a repost; there seems to have been a misunderstanding before.

I hope this is the right place to ask this. Does anyone know if there is a
substantial difference in the performance of the traffic controller
between kernel 2.4 and 2.6? We tested it using 1 iperf server and
250 and 500 clients, altering the burst.

This is the set-up:
iperf client - router (w/ traffic controller) - iperf server

We use the top command inside the router to check the idle time of our
router to see this. The results we got from the 2.4 kernel show
around 65-70% idle time while the 2.6 kernel shows
60-65% idle time. We tried to use MRTG and we're not getting any
results either. We want to know if we could improve the bandwidth by
upgrading the kernel; otherwise we would have to get a new bandwidth
manager. Has anyone performed a similar test, or can anyone suggest a
better way to do this? Thanks in advance.


Re: kernel 2.4 vs 2.6 Traffic Controller performance

2007-10-03 Thread Eric Dumazet

Sonny wrote:

 Hello,
 This is a repost; there seems to have been a misunderstanding before.

 I hope this is the right place to ask this. Does anyone know if there is a
 substantial difference in the performance of the traffic controller
 between kernel 2.4 and 2.6? We tested it using 1 iperf server and
 250 and 500 clients, altering the burst.

 This is the set-up:
 iperf client - router (w/ traffic controller) - iperf server

 We use the top command inside the router to check the idle time of our
 router to see this. The results we got from the 2.4 kernel show
 around 65-70% idle time while the 2.6 kernel shows
 60-65% idle time. We tried to use MRTG and we're not getting any
 results either. We want to know if we could improve the bandwidth by
 upgrading the kernel; otherwise we would have to get a new bandwidth
 manager. Has anyone performed a similar test, or can anyone suggest a
 better way to do this? Thanks in advance.

Hi Sonny

I am not sure what you are asking here. 65-70% idle time (or 60-65%) is fine.

"2.6" is also not very meaningful; there are a lot of changes between 2.6.0 and
2.6.23 :)


Why should you upgrade the kernel?
What bandwidth do you handle?
What kind of platform is it? (A new kernel won't help much if it's a really old
machine, or has old NICs.)


You seem to have some bandwidth problem but focus on CPU affairs...



Re: [PATCH 5/7] CAN: Add virtual CAN netdevice driver

2007-10-03 Thread Oliver Hartkopp

David Miller wrote:

 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
 Date: Tue, 2 Oct 2007 18:43:25 -0300

  I think that helping ctags to find the definition for the debug variable
  to see, for instance, if it is a bitmask or a boolean without having to
  choose from tons of 'debug' variables is a good thing.

 I completely agree.

OK. No problem if it's helpful. We'll change debug to vcan_debug.

Thanks.

Oliver


Re: tcp bw in 2.6

2007-10-03 Thread Bill Fink
Tangential aside:

On Tue, 02 Oct 2007, Rick Jones wrote:

 *) depending on the quantity of CPU around, and the type of test one is
 running, results can be better/worse depending on the CPU to which you
 bind the application.  Latency tends to be best when running on the same
 core as takes interrupts from the NIC, bulk transfer can be better when
 running on a different core, although generally better when a different
 core on the same chip.  These days the throughput stuff is more easily
 seen on 10G, but the netperf service demand changes are still visible
 on 1G.

Interesting.  I was going to say that I've generally had the opposite
experience when it comes to bulk data transfers, which is what I would
expect due to CPU caching effects, but that perhaps it's motherboard/NIC/
driver dependent.  But in testing I just did I discovered it's even
MTU dependent (most of my normal testing is always with 9000-byte
jumbo frames).

With Myricom 10-GigE NICs, NIC interrupts on CPU 0 and nuttcp app
running on CPU 1 (both transmit and receive sides), and using 9000-byte
jumbo frames:

[EMAIL PROTECTED] ~]# nuttcp -w10m 192.168.88.16
10078.5000 MB /  10.02 sec = 8437.5396 Mbps 100 %TX 99 %RX

With Myricom 10-GigE NICs, and both NIC interrupts and nuttcp app
on CPU 0 (both transmit and receive sides), again using 9000-byte
jumbo frames:

[EMAIL PROTECTED] ~]# nuttcp -w10m 192.168.88.16
11817.8750 MB /  10.00 sec = 9909.7537 Mbps 100 %TX 74 %RX

Same tests repeated with standard 1500-byte Ethernet MTU:

With Myricom 10-GigE NICs, NIC interrupts on CPU 0 and nuttcp app
running on CPU 1 (both transmit and receive sides), and using
standard 1500-byte Ethernet MTU:

[EMAIL PROTECTED] ~]# nuttcp -M1460 -w10m 192.168.88.16
 5685.9375 MB /  10.00 sec = 4768.0951 Mbps 99 %TX 98 %RX

With Myricom 10-GigE NICs, and both NIC interrupts and nuttcp app
on CPU 0 (both transmit and receive sides), again using standard
1500-byte Ethernet MTU:

[EMAIL PROTECTED] ~]# nuttcp -M1460 -w10m 192.168.88.16
 4974.0625 MB /  10.03 sec = 4161.6015 Mbps 100 %TX 100 %RX
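
For readers who want to compare the four runs, the reported rates can be approximately re-derived from the byte counts and elapsed times. The sketch below assumes that the "MB" column counts units of 2^20 bytes and that "Mbps" means 10^6 bits per second; that reading is an assumption that happens to match the figures above to within the rounding of the elapsed time. The function names are made up for this illustration.

```c
#include <assert.h>

/* Convert `mbytes` (assumed to be units of 2^20 bytes) transferred in
 * `seconds` into megabits per second (10^6 bits per second). */
static double mbps_from_transfer(double mbytes, double seconds)
{
    assert(seconds > 0.0);
    return mbytes * 1048576.0 * 8.0 / seconds / 1e6;
}

/* Percentage change when going from rate `base` to rate `other`. */
static double percent_change(double base, double other)
{
    assert(base > 0.0);
    return (other - base) / base * 100.0;
}
```

Fed with the numbers quoted above, `percent_change(8437.5396, 9909.7537)` comes out at roughly +17% (same CPU versus different CPU with 9000-byte frames), while `percent_change(4768.0951, 4161.6015)` is roughly -13% (same comparison with a 1500-byte MTU), which is the MTU dependence being pointed out.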

Now back to your regularly scheduled programming.  :-)

-Bill


Re: [PATCH] sky2: jumbo frame regression fix

2007-10-03 Thread Ian Kumlien
On Tue, 2007-10-02 at 21:59 -0700, Stephen Hemminger wrote:
 On Wed, 03 Oct 2007 03:34:34 +0200
 Ian Kumlien [EMAIL PROTECTED] wrote:
 
  On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
   Remove unneeded check that caused problems with jumbo frame sizes.
   The check was recently added and is wrong.
   When using jumbo frames the sky2 driver does fragmentation, so
   rx_data_size is less than mtu.
  
  Confirmed working.
  
  Now running with 9k mtu with no errors, =)
  
  It also seems that the FIFO bug was the one that affected me before,
  damn odd race that one.
 
 Does the workaround (forced reset) work? Ian, you are the first person to
 report triggering it.  I haven't found a way to make it happen.
 What combination of flow control and speeds are you using?

Yes, it works; it's the problem I had all along =)

As to how to make it happen, that's a bit harder...
To me it seems like it's a combination of several connections and
somewhat high bandwidth, but you have to send data for it to happen...

To me it usually happens when seeding files via BitTorrent, but it seems
like it takes somewhat special circumstances to actually trigger it.

I use jumbo frames; my LAN is gigabit up to my firewall. From the firewall
on it's ordinary 1500 MTU at 100 Mbit, and I doubt that this has anything
to do with it (unless it's a 'number of frames that can be stored' problem
and thus the MTU limits it to a really small value, making it easier to
trigger).

Well, those are my thoughts at least, but then I just got up after having
slept 5 hours, so =)

-- 
Ian Kumlien pomac () vapor ! com -- http://pomac.netswarm.net



Re: tcp bw in 2.6

2007-10-03 Thread David Miller
From: [EMAIL PROTECTED] (Larry McVoy)
Date: Tue, 2 Oct 2007 15:36:44 -0700

 On Tue, Oct 02, 2007 at 03:32:16PM -0700, David Miller wrote:
  I'm starting to have a theory about what the bad case might
  be.
  
  A strong sender going to an even stronger receiver which can
  pull out packets into the process as fast as they arrive.
  This might be part of what keeps the receive window from
  growing.
 
 I can back you up on that.  When I straced the receiving side that goes
 slowly, all the reads were short, like 1-2K.  The way that works the 
 reads were a lot larger as I recall.

My issue turns out to be hardware specific too.

The two Broadcom 5714 onboard NICs on my Niagara t1000 give bad packet
receive performance for some reason, the other two which are Broadcom
5704's are perfectly fine.  I'll figure out what the problem is,
probably some misprogrammed register in either the chip or the bridge
it's behind.

The UDP stream test of netperf is great for isolating TCP/TSO vs.
hardware issues.  If you can't saturate the pipe or the cpu with
the UDP stream test, it's likely a hardware issue.

The cpu utilization and service demand numbers provided, on both
send and receive, are really useful for diagnosing problems like
this.

Rick deserves several beers for his work on this cool toy. :)


Re: kernel 2.4 vs 2.6 Traffic Controller performance

2007-10-03 Thread Sonny
On 10/3/07, Eric Dumazet [EMAIL PROTECTED] wrote:
 Sonny wrote:
  Hello,
  This is a repost; there seems to have been a misunderstanding before.

  I hope this is the right place to ask this. Does anyone know if there is a
  substantial difference in the performance of the traffic controller
  between kernel 2.4 and 2.6? We tested it using 1 iperf server and
  250 and 500 clients, altering the burst.

  This is the set-up:
  iperf client - router (w/ traffic controller) - iperf server

  We use the top command inside the router to check the idle time of our
  router to see this. The results we got from the 2.4 kernel show
  around 65-70% idle time while the 2.6 kernel shows
  60-65% idle time. We tried to use MRTG and we're not getting any
  results either. We want to know if we could improve the bandwidth by
  upgrading the kernel; otherwise we would have to get a new bandwidth
  manager. Has anyone performed a similar test, or can anyone suggest a
  better way to do this? Thanks in advance.

 Hi Sonny

 I am not sure what you are asking here. 65-70% idle time (or 60-65%) is fine.

 "2.6" is also not very meaningful; there are a lot of changes between 2.6.0 and
 2.6.23 :)

We're using 2.6.22.

 Why should you upgrade the kernel?
We would like to test the difference between the two kernels' performance.

 What bandwidth do you handle?
10 Mbps

 What kind of platform is it? (A new kernel won't help much if it's a really old
 machine, or has old NICs.)
It's a P IV 2.8 GHz HT with 512 MB.

 You seem to have some bandwidth problem but focus on CPU affairs...
Bandwidth is not a problem; we can get 10 Mbps without a hitch. But we
would like to know how CPU usage scales with the number of
clients. So far, for both kernels, we're getting 50% CPU utilization
using 500 clients with a 384 kbps burst each.


net-2.6.24 plans

2007-10-03 Thread David Miller

I'm a bit behind after investigating the TCP performance issues that
turned out to be HW specific problems.  It's a bit of a
disappointment, I thought maybe there was a cool bug to fix in TCP :-)

Anyways, that means there are patches backlogged in my inbox and it is
also about time to do the hopefully last rebase of the net-2.6.24
tree.

I merged in Jeff Garzik's and John Linville's latest and I'm running
the current tree on my workstation most of today with good results so
far.

Linus should release the final 2.6.23 very soon, let's kind of assume
it will happen over the next 3 or 4 days.

That means we need to bear down for the merge.  I plan to commit my
Neptune driver in its current state, and that's the last new feature
going in.

You can help make the merge go swimmingly by picking some nagging
issue you noticed and tracking it down.  If you can figure out why
something happens but can't or don't have time to come up with
a fix, report what you've discovered.

If you can provide the fix too, all the better.

That's how I get backlogged, I'm working on A and notice some problem
with B, then I refuse to go back to A until I bring closure to B. :)


[PATCH 1/3] [TCP]: Fix two off-by-one errors in fackets_out adjusting logic

2007-10-03 Thread Ilpo Järvinen
1) Passing wrong skb to tcp_adjust_fackets_out could corrupt
fastpath_cnt_hint as tcp_skb_pcount(next_skb) is not included
to it if hint points exactly to the next_skb (it's lagging
behind, see sacktag).

2) When fastpath_skb_hint is put backwards to avoid dangling
skb reference, the skb's pcount must also be removed from count
(not included like above).

Reported by Cedric Le Goater [EMAIL PROTECTED]

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 net/ipv4/tcp_output.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6199abe..5329675 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1755,14 +1755,16 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int m
 		if (tcp_is_reno(tp) && tp->sacked_out)
 			tcp_dec_pcount_approx(&tp->sacked_out, next_skb);
 
-		tcp_adjust_fackets_out(tp, skb, tcp_skb_pcount(next_skb));
+		tcp_adjust_fackets_out(tp, next_skb, tcp_skb_pcount(next_skb));
 		tp->packets_out -= tcp_skb_pcount(next_skb);
 
 		/* changed transmit queue under us so clear hints */
 		tcp_clear_retrans_hints_partial(tp);
 		/* manually tune sacktag skb hint */
-		if (tp->fastpath_skb_hint == next_skb)
+		if (tp->fastpath_skb_hint == next_skb) {
 			tp->fastpath_skb_hint = skb;
+			tp->fastpath_cnt_hint -= tcp_skb_pcount(skb);
+		}
 
 		sk_stream_free_skb(sk, next_skb);
 	}
-- 
1.5.0.6
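
The second off-by-one is easier to see in a toy model. The sketch below is plain userspace C, not kernel code, and rests on two assumptions taken from the description above: the counter holds the sum of the pcounts strictly before the hinted entry (the "lagging behind" behaviour), and a merged entry keeps its own pcount. The names `toy_queue`, `count_before` and `toy_collapse` are invented for the illustration.

```c
#include <assert.h>

#define TOY_QLEN_MAX 16

/* Toy write queue: each entry is one skb's packet count (pcount). */
struct toy_queue {
    int pcount[TOY_QLEN_MAX];
    int len;
    int hint;      /* index of the hinted entry */
    int cnt_hint;  /* sum of pcounts strictly before the hinted entry */
};

/* Recompute the counter from scratch; used to check the invariant. */
static int count_before(const struct toy_queue *q, int idx)
{
    int sum = 0;

    for (int i = 0; i < idx; i++)
        sum += q->pcount[i];
    return sum;
}

/* Merge entry idx+1 into entry idx and drop idx+1 from the queue. */
static void toy_collapse(struct toy_queue *q, int idx)
{
    int next = idx + 1;

    assert(idx >= 0 && next < q->len);

    if (q->hint == next) {
        /* The hint moves back onto idx, so idx's own pcount must no
         * longer be counted: this mirrors the added "-=" line. */
        q->hint = idx;
        q->cnt_hint -= q->pcount[idx];
    } else if (q->hint > next) {
        /* An entry in front of the hint disappears. */
        q->hint--;
        q->cnt_hint -= q->pcount[next];
    }

    for (int i = next; i < q->len - 1; i++)
        q->pcount[i] = q->pcount[i + 1];
    q->len--;
}
```

Leaving out the subtraction in the first branch would make `cnt_hint` exceed `count_before(q, q->hint)` by exactly one entry's pcount after the collapse, which is the kind of drift the patch description warns about.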



[PATCH 2/3] [TCP]: Comment fastpath_cnt_hint off-by-one trap

2007-10-03 Thread Ilpo Järvinen
Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 include/linux/tcp.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f8cf090..9ff456e 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -343,7 +343,8 @@ struct tcp_sock {
 	struct sk_buff *forward_skb_hint;
 	struct sk_buff *fastpath_skb_hint;
 
-	int fastpath_cnt_hint;
+	int fastpath_cnt_hint;	/* Lags behind by current skb's pcount
+				 * compared to respective fackets_out */
 	int lost_cnt_hint;
 	int retransmit_cnt_hint;
 
-- 
1.5.0.6



[PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Ilpo Järvinen
Hi Dave,

Sacktag fastpath_cnt_hint seems to be very tricky to get right...
I suppose this one fixes Cedric's case. I cannot say for sure
until there is a more definite indication of the
tcp_retrans_try_collapse origin than what the simple late WARN_ON
gave us. ...Especially since it's non-trivial to have the skb
hint correctly positioned in the write_queue while still ending
up calling that function. However, considering how difficult it
seems to be for Cedric to reproduce, it might well be this one.

In addition, I noticed another reset which wasn't previously
converted to WARN_ON, so I'm doing that now. Boot + simple xfer
tested. Please apply to net-2.6.24.

--
 i.




[PATCH 3/3] [TCP]: Annotate another fackets_out state reset

2007-10-03 Thread Ilpo Järvinen
This should no longer be necessary because fackets_out is
accurate. It indicates bugs elsewhere, thus report it.

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 net/ipv4/tcp_input.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e22ffe7..87c9ef5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1160,7 +1160,8 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	int first_sack_index;
 
 	if (!tp->sacked_out) {
-		tp->fackets_out = 0;
+		if (WARN_ON(tp->fackets_out))
+			tp->fackets_out = 0;
 		tp->highest_sack = tp->snd_una;
 	}
 	prior_fackets = tp->fackets_out;
-- 
1.5.0.6
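
The "warn, then repair" pattern used in this patch relies on WARN_ON evaluating to the truth value of its condition. A minimal userspace imitation might look like the following; `TOY_WARN_ON`, `toy_warn_on`, `toy_tp` and `toy_sacktag_entry` are hypothetical names, and the real kernel macro does considerably more (it dumps a stack trace, for instance).

```c
#include <assert.h>
#include <stdio.h>

static int warn_count;  /* lets a caller observe how often a warning fired */

/* Report a true condition and hand its truth value back to the caller. */
static int toy_warn_on(int cond, const char *expr, const char *file, int line)
{
    if (cond) {
        warn_count++;
        fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
    }
    return cond;
}

#define TOY_WARN_ON(cond) toy_warn_on(!!(cond), #cond, __FILE__, __LINE__)

/* Only the two fields the patch touches. */
struct toy_tp {
    int sacked_out;
    int fackets_out;
};

/* Mirrors the patched hunk: with nothing SACKed, a non-zero fackets_out
 * is reported as a bug and then reset. */
static void toy_sacktag_entry(struct toy_tp *tp)
{
    assert(tp != NULL);
    if (!tp->sacked_out) {
        if (TOY_WARN_ON(tp->fackets_out))
            tp->fackets_out = 0;
    }
}
```

A consistent state passes through silently; only the inconsistent one (`sacked_out == 0` with `fackets_out != 0`) produces a message and is repaired.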



Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Cedric Le Goater
Hello Ilpo !

Ilpo Järvinen wrote:
 Hi Dave,
 
 Sacktag fastpath_cnt_hint seems to be very tricky to get right...
 I suppose this one fixes Cedric's case. I cannot say for sure
 until there is something more definite indication of
 tcp_retrans_try_collapse origin than what the simple late WARN_ON
 gave for us. ...Especially since it's non-trivial to have skb
 hint correctly positioned in the write_queue while still ending
 up calling that function. However, considering how difficult it
 seems to be for Cedric to reproduce, it might well be this one.
 
 In addition, I noticed another reset which wasn't previously   
 converted to WARN_ON, so doing that now. Boot + simple xfer
 tested. Please apply to net-2.6.24.

I'm dropping the previous patches you sent me and switching to this patchset,
right?

Thanks,

C.


Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Ilpo Järvinen
On Wed, 3 Oct 2007, Cedric Le Goater wrote:

 Ilpo Järvinen wrote:
  Sacktag fastpath_cnt_hint seems to be very tricky to get right...
  I suppose this one fixes Cedric's case. I cannot say for sure
  until there is something more definite indication of
  tcp_retrans_try_collapse origin than what the simple late WARN_ON
  gave for us. ...Especially since it's non-trivial to have skb
  hint correctly positioned in the write_queue while still ending
  up calling that function. However, considering how difficult it
  seems to be for Cedric to reproduce, it might well be this one.
  
  In addition, I noticed another reset which wasn't previously   
  converted to WARN_ON, so doing that now. Boot + simple xfer
  tested. Please apply to net-2.6.24.
 
 I'm dropping the previous patches you sent me and switching to this patchset. 
 right ?

Yes you can do that... However, there are two ways forward:

1) Drop and test with this patchset long enough to verify it's gone...
2) No dropping and get the more exact trace by reproducing, which can 
   point out to tcp_retrans_try_collapse confirming the source of the
   bug or revealing yet another bug...

The first one has one drawback, it cannot prove the fix very well since 
the bug could just not occur by chance... Path 2 would clearly show the 
place from where the problem originates because we will know that it got 
triggered! I personally would prefer path 2 but whether you want to go for 
that depends on the time you want to invest in it...

...I rediffed the tcp_verify_fackets patch too (below) just in case it 
would be something else in your case and you choose path 1 (put it on top 
of this patchset, applies with some offsets). In case the problem is gone, 
it shouldn't trigger and if it does, we'll have another bug caught.

Anyway, thanks for CCing the right people and netdev right from the 
beginning.


-- 
 i.

 include/net/tcp.h |3 +
 net/ipv4/tcp_input.c  |   25 +---
 net/ipv4/tcp_ipv4.c   |  103 +
 net/ipv4/tcp_output.c |6 ++-
 4 files changed, 130 insertions(+), 7 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 991ccdc..54a0d91 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -43,6 +43,9 @@
 
 #include <linux/seq_file.h>
 
+extern void tcp_verify_fackets(struct sock *sk);
+extern void tcp_print_queue(struct sock *sk);
+
 extern struct inet_hashinfo tcp_hashinfo;
 
 extern atomic_t tcp_orphan_count;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 87c9ef5..93bdc20 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1140,7 +1140,7 @@ static int tcp_check_dsack(struct tcp_sock *tp, struct sk_buff *ack_skb,
 	return dup_sack;
 }
 
-static int
+int
 tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_una)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
@@ -1160,8 +1160,10 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 	int first_sack_index;
 
 	if (!tp->sacked_out) {
-		if (WARN_ON(tp->fackets_out))
+		if (WARN_ON(tp->fackets_out)) {
 			tp->fackets_out = 0;
+			tcp_print_queue(sk);
+		}
 		tp->highest_sack = tp->snd_una;
 	}
 	prior_fackets = tp->fackets_out;
@@ -1421,6 +1423,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
 			}
 		}
 	}
+	tcp_verify_fackets(sk);
 
 	/* Check for lost retransmit. This superb idea is
 	 * borrowed from ratehalving. Event C.
@@ -1633,13 +1636,14 @@ void tcp_enter_frto(struct sock *sk)
 	tcp_set_ca_state(sk, TCP_CA_Disorder);
 	tp->high_seq = tp->snd_nxt;
 	tp->frto_counter = 1;
+	tcp_verify_fackets(sk);
 }
 
 /* Enter Loss state after F-RTO was applied. Dupack arrived after RTO,
  * which indicates that we should follow the traditional RTO recovery,
  * i.e. mark everything lost and do go-back-N retransmission.
  */
-static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
+void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
@@ -1676,6 +1680,7 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->snd_cwnd = tcp_packets_in_flight(tp) + allowed_segments;
 	tp->snd_cwnd_cnt = 0;
@@ -1754,6 +1759,7 @@ void tcp_enter_loss(struct sock *sk, int how)
 		}
 	}
 	tcp_verify_left_out(tp);
+	tcp_verify_fackets(sk);
 
 	tp->reordering = min_t(unsigned int, tp->reordering,
 					     sysctl_tcp_reordering);
@@ -2309,7 +2315,7 @@ static void tcp_mtup_probe_success(struct sock 

Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Cedric Le Goater
Ilpo Järvinen wrote:
 On Wed, 3 Oct 2007, Cedric Le Goater wrote:
 
 Ilpo Järvinen wrote:
 Sacktag fastpath_cnt_hint seems to be very tricky to get right...
 I suppose this one fixes Cedric's case. I cannot say for sure
 until there is something more definite indication of
 tcp_retrans_try_collapse origin than what the simple late WARN_ON
 gave for us. ...Especially since it's non-trivial to have skb
 hint correctly positioned in the write_queue while still ending
 up calling that function. However, considering how difficult it
 seems to be for Cedric to reproduce, it might well be this one.

 In addition, I noticed another reset which wasn't previously   
 converted to WARN_ON, so doing that now. Boot + simple xfer
 tested. Please apply to net-2.6.24.
 I'm dropping the previous patches you sent me and switching to this 
 patchset. 
 right ?
 
 Yes you can do that... However, there are two ways forward:
 
 1) Drop and test with this patchset long enough to verify it's gone...
 2) No dropping and get the more exact trace by reproducing, which can 
point out to tcp_retrans_try_collapse confirming the source of the
bug or revealing yet another bug...
 
 The first one has one drawback, it cannot prove the fix very well since 
 the bug could just not occur by chance... Path 2 would clearly show the 
 place from where the problem originates because we will know that it got 
 triggered! I personally would prefer path 2 but whether you want to go for 
 that depends on the time you want to invest in it...
 
 ...I rediffed the tcp_verify_fackets patch too (below) just in case it 
 would be something else in you case and you choose path 1 (put it on top 
 of this patchset, applies with some offsets). In case the problem is gone, 
 it shouldn't trigger and if it does, we'll have another bug caught.

I have a spare node so I'm starting 2) with the 3 patches you sent and that
last one which applied fine. all of them on a fresh git pull of net-2.6.24

 Anyway, thanks for ccing right persons and netdev right from the 
 beginning.

thanks to git ! :) 

C.
 



Re: [PATCH][E1000E] some cleanups

2007-10-03 Thread jamal
On Tue, 2007-02-10 at 10:43 -0700, Kok, Auke wrote:

 the description of this patch is rather misleading, and the title certainly
 too.

That was fast - you said weeks, not days ;->

 Can you resend this with a bit more elaborate explanation as to why the cb
 code is relevant to use here? Not only do I need to understand this, but
 others might want to as well later on ;)

I am probably repeating something you've seen/know already.
The cleanup is to break up the code so it is functionally more readable
from the perspective of the 4 distinct parts in ->hard_start_xmit():

a) packet formatting (example: vlan, mss, descriptor counting, etc.)
b) chip-specific formatting
c) enqueueing the packet on a DMA ring
d) IO operations to complete packet transmit, tell DMA engine to chew
on, tx completion interrupts, set last tx time, etc.

Each of those steps, sitting in a different function, accumulates state
that is used in the next steps. cb stores this state because it is a
scratchpad the driver owns. You could create some other structure and
pass it around the iteration, but why waste more bytes?
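
As a rough illustration of the idea (not the actual e1000e code), the four steps can be modelled as separate functions that pass their state through a per-packet scratch area. Everything here is hypothetical: `toy_skb` only imitates the `cb` array of an sk_buff, the 2048-byte descriptor size is an arbitrary choice, and `memcpy` is used in place of the pointer cast a driver would normally use so that the sketch stays strictly portable C.

```c
#include <assert.h>
#include <string.h>

/* Toy packet; cb[] is a driver-owned scratch area modelled on sk_buff. */
struct toy_skb {
    unsigned int len;
    char cb[48];
};

/* Hypothetical per-packet transmit state carried between the steps. */
struct toy_tx_cb {
    unsigned int nr_desc;    /* filled in by step (a) */
    unsigned int tx_flags;   /* filled in by step (b) */
    unsigned int ring_slot;  /* filled in by step (c) */
};

_Static_assert(sizeof(struct toy_tx_cb) <= sizeof(((struct toy_skb *)0)->cb),
               "transmit state must fit in cb");

static void cb_get(const struct toy_skb *skb, struct toy_tx_cb *out)
{
    memcpy(out, skb->cb, sizeof(*out));
}

static void cb_put(struct toy_skb *skb, const struct toy_tx_cb *in)
{
    memcpy(skb->cb, in, sizeof(*in));
}

/* (a) packet formatting: count descriptors, assuming 2048 bytes each */
static void tx_prep(struct toy_skb *skb)
{
    struct toy_tx_cb cb = { 0 };

    assert(skb->len > 0);
    cb.nr_desc = (skb->len + 2047) / 2048;
    cb_put(skb, &cb);
}

/* (b) chip-specific formatting: derive a made-up flag from step (a) */
static void tx_chip_format(struct toy_skb *skb)
{
    struct toy_tx_cb cb;

    cb_get(skb, &cb);
    cb.tx_flags = cb.nr_desc > 1 ? 0x1u : 0x0u;
    cb_put(skb, &cb);
}

/* (c) enqueue on the ring: remember the first slot and advance the tail */
static void tx_enqueue(struct toy_skb *skb, unsigned int *ring_tail)
{
    struct toy_tx_cb cb;

    cb_get(skb, &cb);
    cb.ring_slot = *ring_tail;
    *ring_tail += cb.nr_desc;
    cb_put(skb, &cb);
}

/* (d) IO: return the value that would be written to the tail register */
static unsigned int tx_kick(const struct toy_skb *skb)
{
    struct toy_tx_cb cb;

    cb_get(skb, &cb);
    return cb.ring_slot + cb.nr_desc;
}
```

Because each step reads what the previous one left in `cb`, no extra structure has to be allocated or threaded through the call chain, which is the byte-saving argument made above.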

I could stop there with the explanation, but let me go on .. ;->

From a secondary angle, remember I am pulling these patches out of my
batching work. That's how we started this discussion ;-> I would like,
once the driver is converted to remove LLTX, to do #a without holding the
tx lock. This stands on its own even without batching. Then of course,
once all this is in such good shape it makes it easier to add the
batching code because I could reuse the now functionalized steps.
I hope that provides a reasonable and good explanation ;->

cheers,
jamal



Re: [PATCH][TG3]Some cleanups

2007-10-03 Thread jamal
On Tue, 2007-02-10 at 16:33 -0700, Michael Chan wrote:

 Seems ok to me.  I think we should make it more clear that we're
 skipping over the VLAN tag:
 
 (struct tg3_tx_cbdata *)&((__skb)->cb[sizeof(struct vlan_skb_tx_cookie)])
 

Will do - thanks Michael.
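
The suggestion amounts to placing the driver's state behind the VLAN cookie inside the same scratch area. A small portable sketch of that layout follows; the `toy_*` structures are stand-ins rather than the real kernel definitions, and `memcpy` accessors replace the pointer cast from the quoted macro.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the VLAN cookie kept at the start of cb. */
struct toy_vlan_cookie {
    unsigned int magic;
    unsigned int vlan_tag;
};

/* Hypothetical driver state stored right after the cookie. */
struct toy_drv_cbdata {
    unsigned int flags;
    unsigned int mss;
};

struct toy_skb {
    char cb[48];
};

/* Making the offset explicit documents that the cookie is skipped. */
#define TOY_DRV_CB_OFFSET sizeof(struct toy_vlan_cookie)

_Static_assert(TOY_DRV_CB_OFFSET + sizeof(struct toy_drv_cbdata) <=
               sizeof(((struct toy_skb *)0)->cb),
               "driver state must fit after the VLAN cookie");

static void vlan_cookie_put(struct toy_skb *skb, const struct toy_vlan_cookie *in)
{
    assert(skb != NULL);
    memcpy(&skb->cb[0], in, sizeof(*in));
}

static void vlan_cookie_get(const struct toy_skb *skb, struct toy_vlan_cookie *out)
{
    memcpy(out, &skb->cb[0], sizeof(*out));
}

static void drv_cb_put(struct toy_skb *skb, const struct toy_drv_cbdata *in)
{
    assert(skb != NULL);
    memcpy(&skb->cb[TOY_DRV_CB_OFFSET], in, sizeof(*in));
}

static void drv_cb_get(const struct toy_skb *skb, struct toy_drv_cbdata *out)
{
    memcpy(out, &skb->cb[TOY_DRV_CB_OFFSET], sizeof(*out));
}
```

Writing the driver data at the named offset leaves the cookie bytes untouched, so both users of `cb` can coexist.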

cheers,
jamal



Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Ilpo Järvinen
On Wed, 3 Oct 2007, Cedric Le Goater wrote:

 Ilpo Järvinen wrote:
  On Wed, 3 Oct 2007, Cedric Le Goater wrote:
  
  I'm dropping the previous patches you sent me and switching to this 
  patchset. 
  right ?
  
  Yes you can do that... However, there are two ways forward:
  
  1) Drop and test with this patchset long enough to verify it's gone...
  2) No dropping and get the more exact trace by reproducing, which can 
 point out to tcp_retrans_try_collapse confirming the source of the
 bug or revealing yet another bug...
  
  The first one has one drawback, it cannot prove the fix very well since 
  the bug could just not occur by chance... Path 2 would clearly show the 
  place from where the problem originates because we will know that it got 
  triggered! I personally would prefer path 2 but whether you want to go for 
  that depends on the time you want to invest in it...
  
  ...I rediffed the tcp_verify_fackets patch too (below) just in case it 
  would be something else in you case and you choose path 1 (put it on top 
  of this patchset, applies with some offsets). In case the problem is gone, 
  it shouldn't trigger and if it does, we'll have another bug caught.
 
 I have a spare node so I'm starting 2) with the 3 patches you sent and that
 last one which applied fine.

Ah, that's path 1) then... Since you seem to have enough time, I would say 
that path 1 is good as well, and bugs unrelated to the fix will show up 
there too...

I should have stated explicitly that with path 2 those 3 patches should 
not be applied, because the aim is not a fix but reproduction. Path 2 was 
intentionally left without the potential fix, as then a nice backtrace 
informs us when we can stop trying (which would hopefully have occurred 
pretty soon) :-). But let's discard path 2...

 all of them on a fresh git pull of net-2.6.24

That's fine, they're pretty well in sync (mm and net-2.6.24, and 
soon 2.6.24-rcs too).

-- 
 i.

Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-03 Thread jamal
On Wed, 2007-03-10 at 01:29 -0400, Bill Fink wrote:

 It does sound sensible.  My own decidedly non-expert speculation
 was that the big 30 % performance hit right at 4 KB may be related
 to memory allocation issues or having to split the skb across
 multiple 4 KB pages.  

Plausible. But I also worry it could be 10 other things; for example, could
it be the driver used? I noted in my UDP test an oddity that turned out
to be related to the tx coalescing parameters.
In any case, I will attempt to run those tests later.

 And perhaps it only affected the single
 process case because with multiple processes lock contention may
 be a bigger issue and the xmit batching changes would presumably
 help with that.  I am admittedly a novice when it comes to the
 detailed internals of TCP/skb processing, although I have been
 slowly slogging my way through parts of the TCP kernel code to
 try and get a better understanding, so I don't know if these
 thoughts have any merit.

You do bring up issues that need to be looked into and i will run those
tests.
Note, the effectiveness of batching becomes evident as the number of
flows grows. Actually, scratch that: It becomes evident if you can keep
the tx path busy, to which multiple running users contribute. If i
can have a user per CPU with lots of traffic to send, i can create that
condition. It's a little boring in the scenario where the bottleneck is
the wire but it needs to be checked.

 BTW does anyone know of a good book they would recommend that has
 substantial coverage of the Linux kernel TCP code, that's fairly
 up-to-date and gives both an overall view of the code and packet
 flow as well as details on individual functions and algorithms,
 and hopefully covers basic issues like locking and synchronization,
 concurrency of different parts of the stack, and memory allocation.
 I have several books already on Linux kernel and networking internals,
 but they seem to only cover the IP (and perhaps UDP) portions of the
 network stack, and none have more than a cursory reference to TCP.  
 The most useful documentation on the Linux TCP stack that I have
 found thus far is some of Dave Miller's excellent web pages and
 a few other web references, but overall it seems fairly skimpy
 for such an important part of the Linux network code.

Reading books or magazines may end up busying you out with some small
gains of knowledge at the end. They tend to be outdated fast. My advice
is if you start with a focus on one thing, watch the patches that fly
around on that area and learn that way. Read the code to further
understand things then ask questions when it's not clear. Other folks may
have different views. The other way to do it is pick yourself some task
to either add or improve something and get your hands dirty that way. 

 It would be good to see some empirical evidence that there aren't
 any unforeseen gotchas for larger packet sizes, that at least the
 same level of performance can be obtained with no greater CPU
 utilization.

Reasonable - I will try with 9K after i move over to the new tree from
Dave and make sure nothing else broke in the previous tests.
And when all looks good, i will move to TCP.


  [1] On average i spend 10x more time performance testing and analysing
  results than writing code.
 
 As you have written previously, and I heartily agree with, this is a
 very good practice for developing performance enhancement patches.

To give you a perspective, the results i posted were each run 10
iterations per packet size per kernel. Each run is 60 seconds long. I
think i am past that stage for resolving or fixing anything for UDP or
pktgen, but i need to keep checking for any new regressions when Dave
updates his tree. Now multiply that by 5 packet sizes (I am going to add
2 more) and multiply that by 3-4 kernels. Then add the time it takes to
sift through the data and collect it then analyze it and go back to the
drawing table when something doesn't look right.  Essentially, it needs a
weekend ;->

cheers,
jamal

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Fallback to ipv4 if we try to add join IPv4 multicast group via ipv4-mapped address.

2007-10-03 Thread Dmitry Baryshkov

Hello,

David Stevens wrote:

Dmitry,
Good catch; a couple comments:


Thank you for the response.




struct ipv6_pinfo *np = inet6_sk(sk);
int err;
+   int addr_type = ipv6_addr_type(addr);
+
+   if (addr_type == IPV6_ADDR_MAPPED) {
+  __be32 v4addr = addr->s6_addr32[3];
+  struct ip_mreqn mreq;
+  mreq.imr_multiaddr.s_addr = v4addr;
+  mreq.imr_address.s_addr = INADDR_ANY;
+  mreq.imr_ifindex = ifindex;
+
+  return ip_mc_join_group(sk, &mreq);
+   }


ipv6_addr_type() returns a bitmask, so you should use:

if (addr_type & IPV6_ADDR_MAPPED) {


I just c'n'pasted the code that checks for mapped addresses. In most 
cases it's just ==, not bitmask operation.




Also, you should have a blank line after the mreq declaration.


ok.



Ditto for both in ipv6_mc_sock_drop().




I don't expect the multicast source filtering interface will
behave well for mapped addresses, either. The mapped multicast
address won't appear to be a multicast address (and return
error there), and all the source filters would have to be
v4mapped addresses and modify the v4 source filters for this
to do as you expect. So, there's more to it (and it may be a
bit messy) to support mapped multicast addresses fully. I'll
think about that part some more.



I didn't have time to test it thoroughly. I've only checked that the call 
succeeds and that all necessary IGMP messages are sent. I hope I'll have 
more time to check this weekend.


--
With best wishes
Dmitry Baryshkov



Re: [patch 3/3] git-net: sctp build fix (not for applying)

2007-10-03 Thread Vlad Yasevich
[EMAIL PROTECTED] wrote:
 From: Andrew Morton [EMAIL PROTECTED]
 
 net/sctp/sm_statetable.c:551: error: 'sctp_sf_tabort_8_4_8' undeclared here 
 (not in a function)
 

Andrew, is this a result of the merge of net-2.6.24 with net-2.6?  

That's the only way I see this happening.

 
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 ---
 
  net/sctp/sm_statetable.c |2 --
  1 file changed, 2 deletions(-)
 
 diff -puN net/sctp/sm_statetable.c~git-net-sctp-hack net/sctp/sm_statetable.c
 --- a/net/sctp/sm_statetable.c~git-net-sctp-hack
 +++ a/net/sctp/sm_statetable.c
 @@ -527,8 +527,6 @@ static const sctp_sm_table_entry_t prsct
   /* SCTP_STATE_EMPTY */ \
   TYPE_SCTP_FUNC(sctp_sf_ootb), \
   /* SCTP_STATE_CLOSED */ \
 - TYPE_SCTP_FUNC(sctp_sf_tabort_8_4_8), \
   
That should be changed to sctp_sf_ootb and then it'll compile.  As is, the patch
is wrong.

Thanks
-vlad

 - /* SCTP_STATE_COOKIE_WAIT */ \
   TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
   /* SCTP_STATE_COOKIE_ECHOED */ \
   TYPE_SCTP_FUNC(sctp_sf_eat_auth), \
 _


Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Cedric Le Goater
Ilpo Järvinen wrote:
 On Wed, 3 Oct 2007, Cedric Le Goater wrote:
 
 Ilpo Järvinen wrote:
 On Wed, 3 Oct 2007, Cedric Le Goater wrote:

 I'm dropping the previous patches you sent me and switching to this 
 patchset. 
 right ?
 Yes you can do that... However, there are two ways forward:

 1) Drop and test with this patchset long enough to verify it's gone...
 2) No dropping and get the more exact trace by reproducing, which can 
point out to tcp_retrans_try_collapse confirming the source of the
bug or revealing yet another bug...

 The first one has one drawback, it cannot prove the fix very well since 
 the bug could just not occur by chance... Path 2 would clearly show the 
 place from where the problem originates because we will know that it got 
 triggered! I personally would prefer path 2 but whether you want to go for 
 that depends on the time you want to invest in it...

 ...I rediffed the tcp_verify_fackets patch too (below) just in case it 
 would be something else in you case and you choose path 1 (put it on top 
 of this patchset, applies with some offsets). In case the problem is gone, 
 it shouldn't trigger and if it does, we'll have another bug caught.
 I have a spare node so I'm starting 2) with the 3 patches you sent and that
 last one which applied fine.
 
 Ah, that's path 1) then... Since you seem to have enough time, I would say 
 that the path 1 is good as well and bugs unrelated to the fix will show up 
 there too...

arg. yes. sorry for the confusion.

 I should have stated explicitly that with path 2 those 3 patches should 
 not be applied, because the aim is not a fix but reproduction. Path 2 was 
 intentionally left without the potential fix, as then a nice backtrace 
 tells us when we can stop trying (which would hopefully occur 
 pretty soon) :-).  But let's discard path 2...

I have 2 spare nodes so i'll run both. 1) is on already without any issues
i'm just compiling 2)

I usually work on -mm, so what would be interesting for me is to have what you 
need in net-2.6.24 which is getting pulled in -mm by andrew. then, if you need 
an extra patch for verbosity, that's fine, i'll include it in my usual patchset.

Cheers,

C.
   
 all of them on a fresh git pull of net-2.6.24
 
 That's fine, they're pretty well in sync (mm and net-2.6.24, and 
 soon 2.6.24-rcs too).
 


Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Ilpo Järvinen
On Wed, 3 Oct 2007, Cedric Le Goater wrote:

 Ilpo Järvinen wrote:
  
  Ah, that's path 1) then... Since you seem to have enough time, I would say 
  that the path 1 is good as well and bugs unrelated to the fix will show up 
  there too...
 
 arg. yes. sorry for the confusion.
 
  I should have stated explicitly that with path 2 those 3 patches should 
  not be applied, because the aim is not a fix but reproduction. Path 2 was 
  intentionally left without the potential fix, as then a nice backtrace 
  tells us when we can stop trying (which would hopefully occur 
  pretty soon) :-).  But let's discard path 2...
 
 I have 2 spare nodes so i'll run both. 1) is on already without any issues
 i'm just compiling 2)

Thanks a lot. :-)

 I usually work on -mm, so what would be interesting for me is to have what 
 you 
 need in net-2.6.24 which is getting pulled in -mm by andrew. then, if 
 you need an extra patch for verbosity, that's fine, i'll include it in 
 my usual patchset.

Ah, I'm sorry about the subject and the extra work it caused, it was 
meant for DaveM only, didn't realize at that time it would be 
meaningful to you as well, thus couldn't warn you back then... Testing on 
top of mm would be (/ have been) fine as well... From my point of view 
both mm and net-2.6.24 are pretty much the same (I even verified that 
those patches apply fine on top of rc8-mm2 since I thought that you might 
want to use that one).

-- 
 i.

Re: Please pull 'upstream-davem' branch of wireless-2.6

2007-10-03 Thread John W. Linville
On Tue, Oct 02, 2007 at 07:01:56PM -0700, David Miller wrote:
 From: John W. Linville [EMAIL PROTECTED]
 Date: Tue, 2 Oct 2007 21:25:52 -0400
 
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git 
  upstream-davem
 
 This doesn't pull cleanly.
 
 Probably you used a recently cloned Linus tree, pulled
 net-2.6.24 into that (and resolved the conflicts), and
 then put your patches in.

No, in fact I'm quite conscious of that.  I follow a procedure
identical to what you outlined.  I even leave my 'master-davem' branch
available as a reference, and create the initial 'upstream-davem'
branch as a checkout from it. :-)

As an experiment, I cloned your current tree (which has the patches
applied already, thanks!) and created a branch which backed-out the
patches from me you had already applied by hand.  I then did a pull
from my tree, and the results were quite clean.

[linville]: git checkout -b jwltest fc26d79bb258b5fdb3dee940bea12d6ef7c217c5
Switched to a new branch jwltest

[linville]: git pull git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream-davem
remote: Generating pack...
remote: Done counting 257 objects.
remote: Result has 199 objects.
remote: Deltifying 199 objects...
remote:  100% (199/199) done
Indexing 199 objects...
remote: Total 199 (delta 150), reused 143 (delta 115)
 100% (199/199) done
Resolving 150 deltas...
 100% (150/150) done
32 objects were added to complete this thin pack.
Removed drivers/net/wireless/zd1211rw/zd_util.c
Removed drivers/net/wireless/zd1211rw/zd_util.h
Merge made by recursive.
 Documentation/networking/mac80211-injection.txt |   32 ++-
 drivers/net/wireless/adm8211.c  |8 +-
 drivers/net/wireless/b43/Kconfig|   12 +
 drivers/net/wireless/b43/Makefile   |5 +-
 drivers/net/wireless/b43/b43.h  |   11 +-
 drivers/net/wireless/b43/leds.c |  399 ++-
 drivers/net/wireless/b43/leds.h |   63 ++--
 drivers/net/wireless/b43/main.c |  205 
 drivers/net/wireless/b43/phy.c  |   13 +-
 drivers/net/wireless/b43/phy.h  |2 +-
 drivers/net/wireless/b43/rfkill.c   |  184 +++
 drivers/net/wireless/b43/rfkill.h   |   58 
 drivers/net/wireless/hostap/hostap.h|2 +-
 drivers/net/wireless/hostap/hostap_hw.c |2 +-
 drivers/net/wireless/hostap/hostap_main.c   |   19 +-
 drivers/net/wireless/iwlwifi/iwl3945-base.c |4 -
 drivers/net/wireless/iwlwifi/iwl4965-base.c |4 -
 drivers/net/wireless/p54common.c|4 +-
 drivers/net/wireless/p54pci.c   |4 +-
 drivers/net/wireless/rt2x00/rt2x00.h|2 +-
 drivers/net/wireless/zd1211rw/Makefile  |2 +-
 drivers/net/wireless/zd1211rw/zd_chip.c |1 -
 drivers/net/wireless/zd1211rw/zd_mac.c  |4 +-
 drivers/net/wireless/zd1211rw/zd_usb.c  |1 -
 drivers/net/wireless/zd1211rw/zd_util.c |   82 -
 drivers/net/wireless/zd1211rw/zd_util.h |   29 --
 include/linux/rfkill.h  |   24 ++
 include/net/mac80211.h  |   46 +++-
 net/mac80211/cfg.c  |   75 -
 net/mac80211/ieee80211.c|  189 +---
 net/mac80211/ieee80211_i.h  |   17 +-
 net/mac80211/ieee80211_iface.c  |   68 +
 net/mac80211/ieee80211_ioctl.c  |   31 +-
 net/mac80211/ieee80211_led.c|   67 +++-
 net/mac80211/ieee80211_led.h|6 +
 net/mac80211/ieee80211_rate.c   |3 +-
 net/mac80211/ieee80211_rate.h   |2 -
 net/mac80211/ieee80211_sta.c|7 +-
 net/mac80211/key.c  |1 -
 net/mac80211/rx.c   |  122 +++-
 net/mac80211/sta_info.c |   13 +-
 net/mac80211/tx.c   |  211 ++--
 net/mac80211/wme.c  |   10 +-
 net/rfkill/Kconfig  |7 +
 net/rfkill/rfkill.c |   49 +++-
 45 files changed, 1022 insertions(+), 1078 deletions(-)
 create mode 100644 drivers/net/wireless/b43/rfkill.c
 create mode 100644 

[IPv6] Fix ICMPv6 redirect handling with target multicast address, try 3

2007-10-03 Thread Brian Haley
When the ICMPv6 Target address is multicast, Linux processes the 
redirect instead of dropping it.  The problem is in this code in 
ndisc_redirect_rcv():


	if (ipv6_addr_equal(dest, target)) {
		on_link = 1;
	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
		ND_PRINTK2(KERN_WARNING
			   "ICMPv6 Redirect: target address is not link-local.\n");
		return;
	}

This second check will succeed if the Target address is, for example, 
FF02::1 because it has link-local scope.  Instead, it should be checking 
if it's a unicast link-local address, as stated in RFC 2461/4861 Section 
8.1:


  - The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's 
implied.


This bug is preventing Linux kernels from achieving IPv6 Logo Phase II 
certification because of a recent error that was found in the TAHI test 
suite - Neighbor Discovery suite test 206 (v6LC.2.3.6_G) had the 
multicast address in the Destination field instead of Target field, so 
we were passing the test.  This won't be the case anymore.


The patch below fixes this problem, and also fixes ndisc_send_redirect() 
to not send an invalid redirect with a multicast address in the Target 
field.  I re-ran the TAHI Neighbor Discovery section to make sure Linux 
passes all 245 tests now.


-Brian


Signed-off-by: Brian Haley [EMAIL PROTECTED]
Acked-by: David L Stevens [EMAIL PROTECTED]
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 74c4d8d..b761dbe 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1267,9 +1267,10 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
 
 	if (ipv6_addr_equal(dest, target)) {
 		on_link = 1;
-	} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	} else if (ipv6_addr_type(target) !=
+		   (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
-			   "ICMPv6 Redirect: target address is not link-local.\n");
+			   "ICMPv6 Redirect: target address is not link-local unicast.\n");
 		return;
 	}
 
@@ -1343,9 +1344,9 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh,
 	}
 
 	if (!ipv6_addr_equal(&ipv6_hdr(skb)->daddr, target) &&
-	!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
+	ipv6_addr_type(target) != (IPV6_ADDR_UNICAST|IPV6_ADDR_LINKLOCAL)) {
 		ND_PRINTK2(KERN_WARNING
-			"ICMPv6 Redirect: target address is not link-local.\n");
+			"ICMPv6 Redirect: target address is not link-local unicast.\n");
 		return;
 	}
 


Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Cedric Le Goater
Ilpo Järvinen wrote:
 On Wed, 3 Oct 2007, Cedric Le Goater wrote:
 
 Ilpo Järvinen wrote:
 Ah, that's path 1) then... Since you seem to have enough time, I would say 
 that the path 1 is good as well and bugs unrelated to the fix will show up 
 there too...
 arg. yes. sorry for the confusion.

 I should have stated explicitly that with path 2 those 3 patches should 
 not be applied, because the aim is not a fix but reproduction. Path 2 was 
 intentionally left without the potential fix, as then a nice backtrace 
 tells us when we can stop trying (which would hopefully occur 
 pretty soon) :-).  But let's discard path 2...
 I have 2 spare nodes so i'll run both. 1) is on already without any issues
 i'm just compiling 2)

Below are the messages I got on 2) right after running ketchup (which does 
a wget www.kernel.org) 

not a warning on 1) with your extra verbose patch.

 I usually work on -mm, so what would be interesting for me is to have what 
 you 
 need in net-2.6.24 which is getting pulled in -mm by andrew. then, if 
 you need an extra patch for verbosity, that's fine, i'll include it in 
 my usual patchset.
 
 Ah, I'm sorry about the subject and the extra work it caused, 

no problem, that was a comment for the future patchset.
 
 it was meant for DaveM only, didn't realize at that time it would be 
 meaningful to you as well, thus couldn't warn you back then... Testing on 
 top of mm would be (/ have been) fine as well... From my point of view 
 both mm and net-2.6.24 are pretty much the same (I even verified that 
 those patches apply fine on top of rc8-mm2 since I thought that you might 
 want to use that one).

He, you might have solved it with 1). If not, I'm keeping the hardware for
you.

Cheers,

C.

WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 
tcp_verify_fackets()

Call Trace:
 IRQ  [8041aa86] tcp_verify_fackets+0x119/0x237
 [80416e57] tcp_fragment+0x468/0x4b8
 [804184a5] tcp_retransmit_skb+0xcf/0x2f4
 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e
 [8041220a] tcp_fastretrans_alert+0xb36/0xb43
 [80412f0f] tcp_ack+0x5d3/0x71b
 [80415229] tcp_rcv_established+0x61f/0x6df
 [8025419a] __lock_acquire+0x8a1/0xf1b
 [8041c7ff] tcp_v4_do_rcv+0x3e/0x394
 [8041d171] tcp_v4_rcv+0x61c/0x9a9
 [804017e3] ip_local_deliver+0x1da/0x2a4
 [8040214e] ip_rcv+0x583/0x5c9
 [8046fe43] packet_rcv_spkt+0x19a/0x1a8
 [803e2e1c] netif_receive_skb+0x2cf/0x2f5
 [88042505] :tg3:tg3_poll+0x65d/0x8a4
 [803e2fe8] net_rx_action+0xb8/0x191
 [8023a9b7] __do_softirq+0x5f/0xe0
 [8020c98c] call_softirq+0x1c/0x28
 [8020e9c3] do_softirq+0x3b/0xb8
 [8023aaae] irq_exit+0x4e/0x50
 [8020e7df] do_IRQ+0xbd/0xd7
 [80209cb9] mwait_idle+0x0/0x4d
 [8020bce6] ret_from_intr+0x0/0xf
 EOI  [80209cfc] mwait_idle+0x43/0x4d
 [802099fb] enter_idle+0x22/0x24
 [80209c4f] cpu_idle+0x9d/0xc0
 [80479591] rest_init+0x55/0x57
 [8063681f] start_kernel+0x2e0/0x2ec
 [80636134] _sinittext+0x134/0x13b

WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 
tcp_verify_fackets()

Call Trace:
 IRQ  [80323bf6] vgacon_set_cursor_size+0x39/0xd5
 [8041aad0] tcp_verify_fackets+0x163/0x237
 [80416e57] tcp_fragment+0x468/0x4b8
 [804184a5] tcp_retransmit_skb+0xcf/0x2f4
 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e
 [8041220a] tcp_fastretrans_alert+0xb36/0xb43
 [80412f0f] tcp_ack+0x5d3/0x71b
 [80415229] tcp_rcv_established+0x61f/0x6df
 [8025419a] __lock_acquire+0x8a1/0xf1b
 [8041c7ff] tcp_v4_do_rcv+0x3e/0x394
 [8041d171] tcp_v4_rcv+0x61c/0x9a9
 [804017e3] ip_local_deliver+0x1da/0x2a4
 [8040214e] ip_rcv+0x583/0x5c9
 [8046fe43] packet_rcv_spkt+0x19a/0x1a8
 [803e2e1c] netif_receive_skb+0x2cf/0x2f5
 [88042505] :tg3:tg3_poll+0x65d/0x8a4
 [803e2fe8] net_rx_action+0xb8/0x191
 [8023a9b7] __do_softirq+0x5f/0xe0
 [8020c98c] call_softirq+0x1c/0x28
 [8020e9c3] do_softirq+0x3b/0xb8
 [8023aaae] irq_exit+0x4e/0x50
 [8020e7df] do_IRQ+0xbd/0xd7
 [80209cb9] mwait_idle+0x0/0x4d
 [8020bce6] ret_from_intr+0x0/0xf
 EOI  [80209cfc] mwait_idle+0x43/0x4d
 [802099fb] enter_idle+0x22/0x24
 [80209c4f] cpu_idle+0x9d/0xc0
 [80479591] rest_init+0x55/0x57
 [8063681f] start_kernel+0x2e0/0x2ec
 [80636134] _sinittext+0x134/0x13b

TCP wq(s) -S--SSS
TCP wq(i)   hf   
s4 f9 (47) p9 seq: su3460595874 hs3460607374 sn3460659962 (3460608822)
WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 
tcp_verify_fackets()

Call Trace:
 IRQ  [80323bf6] vgacon_set_cursor_size+0x39/0xd5
 

Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Cedric Le Goater
Cedric Le Goater wrote:
 Ilpo Järvinen wrote:
 On Wed, 3 Oct 2007, Cedric Le Goater wrote:

 Ilpo Järvinen wrote:
 Ah, that's path 1) then... Since you seem to have enough time, I would say 
 that the path 1 is good as well and bugs unrelated to the fix will show up 
 there too...
 arg. yes. sorry for the confusion.

 I should have stated explicitly that with path 2 those 3 patches should 
 not be applied, because the aim is not a fix but reproduction. Path 2 was 
 intentionally left without the potential fix, as then a nice backtrace 
 tells us when we can stop trying (which would hopefully occur 
 pretty soon) :-).  But let's discard path 2...
 I have 2 spare nodes so i'll run both. 1) is on already without any issues
 i'm just compiling 2)
 
 Below are the messages I got on 2) right after running ketchup (which does 
 a wget www.kernel.org) 

and a second run of ketchup gave the following.

cheers,

C.

WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 
tcp_verify_fackets()

Call Trace:
 IRQ  [8041aa86] tcp_verify_fackets+0x119/0x237
 [80416e57] tcp_fragment+0x468/0x4b8
 [804184a5] tcp_retransmit_skb+0xcf/0x2f4
 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e
 [8041220a] tcp_fastretrans_alert+0xb36/0xb43
 [80412f0f] tcp_ack+0x5d3/0x71b
 [80415229] tcp_rcv_established+0x61f/0x6df
 [8025419a] __lock_acquire+0x8a1/0xf1b
 [8041c7ff] tcp_v4_do_rcv+0x3e/0x394
 [8041d171] tcp_v4_rcv+0x61c/0x9a9
 [804017e3] ip_local_deliver+0x1da/0x2a4
 [8040214e] ip_rcv+0x583/0x5c9
 [8046fe43] packet_rcv_spkt+0x19a/0x1a8
 [803e2e1c] netif_receive_skb+0x2cf/0x2f5
 [88042505] :tg3:tg3_poll+0x65d/0x8a4
 [803e2fe8] net_rx_action+0xb8/0x191
 [8023a9b7] __do_softirq+0x5f/0xe0
 [8020c98c] call_softirq+0x1c/0x28
 [8020e9c3] do_softirq+0x3b/0xb8
 [8023aaae] irq_exit+0x4e/0x50
 [8020e7df] do_IRQ+0xbd/0xd7
 [80209cb9] mwait_idle+0x0/0x4d
 [8020bce6] ret_from_intr+0x0/0xf
 EOI  [80209cfc] mwait_idle+0x43/0x4d
 [802099fb] enter_idle+0x22/0x24
 [80209c4f] cpu_idle+0x9d/0xc0
 [80479591] rest_init+0x55/0x57
 [8063681f] start_kernel+0x2e0/0x2ec
 [80636134] _sinittext+0x134/0x13b

TCP wq(s) --S--
TCP wq(i)   h  
s1 f5 (14) p6 seq: su110259658 hs110265450 sn110278722 (0)
WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:193 
tcp_verify_fackets()

Call Trace:
 IRQ  [803250aa] vgacon_scroll+0x188/0x1dd
 [8041aa86] tcp_verify_fackets+0x119/0x237
 [80416e57] tcp_fragment+0x468/0x4b8
 [804184a5] tcp_retransmit_skb+0xcf/0x2f4
 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e
 [8041220a] tcp_fastretrans_alert+0xb36/0xb43
 [80412f0f] tcp_ack+0x5d3/0x71b
 [80415229] tcp_rcv_established+0x61f/0x6df
 [8025419a] __lock_acquire+0x8a1/0xf1b
 [8041c7ff] tcp_v4_do_rcv+0x3e/0x394
 [8041d171] tcp_v4_rcv+0x61c/0x9a9
 [804017e3] ip_local_deliver+0x1da/0x2a4
 [8040214e] ip_rcv+0x583/0x5c9
 [8046fe43] packet_rcv_spkt+0x19a/0x1a8
 [803e2e1c] netif_receive_skb+0x2cf/0x2f5
 [88042505] :tg3:tg3_poll+0x65d/0x8a4
 [803e2fe8] net_rx_action+0xb8/0x191
 [8023a9b7] __do_softirq+0x5f/0xe0
 [8020c98c] call_softirq+0x1c/0x28
 [8020e9c3] do_softirq+0x3b/0xb8
 [8023aaae] irq_exit+0x4e/0x50
 [8020e7df] do_IRQ+0xbd/0xd7
 [80209cb9] mwait_idle+0x0/0x4d
 [8020bce6] ret_from_intr+0x0/0xf
 EOI  [80209cfc] mwait_idle+0x43/0x4d
 [802099fb] enter_idle+0x22/0x24
 [80209c4f] cpu_idle+0x9d/0xc0
 [80479591] rest_init+0x55/0x57
 [8063681f] start_kernel+0x2e0/0x2ec
 [80636134] _sinittext+0x134/0x13b

WARNING: at 
/home/legoater/linux/2.6.23-rc8-mm2-tcp_fastretrans/net/ipv4/tcp_ipv4.c:198 
tcp_verify_fackets()

Call Trace:
 IRQ  [803250aa] vgacon_scroll+0x188/0x1dd
 [8041aad0] tcp_verify_fackets+0x163/0x237
 [80416e57] tcp_fragment+0x468/0x4b8
 [804184a5] tcp_retransmit_skb+0xcf/0x2f4
 [8041878d] tcp_xmit_retransmit_queue+0xc3/0x31e
 [8041220a] tcp_fastretrans_alert+0xb36/0xb43
 [80412f0f] tcp_ack+0x5d3/0x71b
 [80415229] tcp_rcv_established+0x61f/0x6df
 [8025419a] __lock_acquire+0x8a1/0xf1b
 [8041c7ff] tcp_v4_do_rcv+0x3e/0x394
 [8041d171] tcp_v4_rcv+0x61c/0x9a9
 [804017e3] ip_local_deliver+0x1da/0x2a4
 [8040214e] ip_rcv+0x583/0x5c9
 [8046fe43] packet_rcv_spkt+0x19a/0x1a8
 [803e2e1c] netif_receive_skb+0x2cf/0x2f5
 [88042505] :tg3:tg3_poll+0x65d/0x8a4
 [803e2fe8] net_rx_action+0xb8/0x191
 [8023a9b7] __do_softirq+0x5f/0xe0
 [8020c98c] call_softirq+0x1c/0x28
 [8020e9c3] do_softirq+0x3b/0xb8
 

Re: [PATCH net-2.6.24 0/3]: More TCP fixes

2007-10-03 Thread Cedric Le Goater
Cedric Le Goater wrote:
 Ilpo Järvinen wrote:
 On Wed, 3 Oct 2007, Cedric Le Goater wrote:

 Ilpo Järvinen wrote:
 Ah, that's path 1) then... Since you seem to have enough time, I would say 
 that the path 1 is good as well and bugs unrelated to the fix will show up 
 there too...
 arg. yes. sorry for the confusion.

 I should have stated explicitly that with path 2 those 3 patches should 
 not be applied, because the aim is not a fix but reproduction. Path 2 was 
 intentionally left without the potential fix, as then a nice backtrace 
 tells us when we can stop trying (which would hopefully occur 
 pretty soon) :-).  But let's discard path 2...
 I have 2 spare nodes so i'll run both. 1) is on already without any issues
 i'm just compiling 2)
 
 Below are the messages I got on 2) right after running ketchup (which does 
 a wget www.kernel.org) 
 
 not a warning on 1) with your extra verbose patch.

bummer, I got this one on 1) :(

C.

WARNING: at /home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 
tcp_fastretrans_alert()

Call Trace:
 IRQ  [8022ddb6] __wake_up+0x1f/0x4c
 [803fd9d3] tcp_ack+0xcee/0x18ac
 [80400764] tcp_rcv_established+0x61f/0x6df
 [8024e8d8] __lock_acquire+0x8a1/0xf1b
 [8040795b] tcp_v4_do_rcv+0x3e/0x394
 [804082d5] tcp_v4_rcv+0x624/0x9b1
 [803ecfa3] ip_local_deliver+0x1da/0x2a4
 [803ed900] ip_rcv+0x57c/0x5c4
 [8045ae53] packet_rcv_spkt+0x19a/0x1a8
 [803ce78e] netif_receive_skb+0x2ba/0x2de
 [88044505] :tg3:tg3_poll+0x65d/0x8a4
 [803ce958] net_rx_action+0xb8/0x191
 [802385cb] __do_softirq+0x5f/0xe0
 [8020c97c] call_softirq+0x1c/0x28
 [8020e672] do_softirq+0x3b/0xb9
 [802386c2] irq_exit+0x4e/0x50
 [8020e48e] do_IRQ+0xbe/0xd8
 [80209cb9] mwait_idle+0x0/0x4d
 [8020bcc6] ret_from_intr+0x0/0xf
 EOI  [80464e10] __sched_text_start+0x5f0/0x62b
 [80464e10] __sched_text_start+0x5f0/0x62b
 [80209cfc] mwait_idle+0x43/0x4d
 [802099fb] enter_idle+0x22/0x24
 [80209c4f] cpu_idle+0x9d/0xc0
 [80464513] rest_init+0x57/0x59
 [8060c82a] start_kernel+0x2d1/0x2dd
 [8060c14e] _sinittext+0x14e/0x155

WARNING: at /home/legoater/linux/net-2.6.24.git/net/ipv4/tcp_input.c:2325 
tcp_fastretrans_alert()

Call Trace:
 IRQ  [8022ddb6] __wake_up+0x1f/0x4c
 [803fd9d3] tcp_ack+0xcee/0x18ac
 [80400764] tcp_rcv_established+0x61f/0x6df
 [8024e8d8] __lock_acquire+0x8a1/0xf1b
 [8040795b] tcp_v4_do_rcv+0x3e/0x394
 [804082d5] tcp_v4_rcv+0x624/0x9b1
 [803ecfa3] ip_local_deliver+0x1da/0x2a4
 [803ed900] ip_rcv+0x57c/0x5c4
 [8045ae53] packet_rcv_spkt+0x19a/0x1a8
 [803ce78e] netif_receive_skb+0x2ba/0x2de
 [88044505] :tg3:tg3_poll+0x65d/0x8a4
 [803ce958] net_rx_action+0xb8/0x191
 [802385cb] __do_softirq+0x5f/0xe0
 [8020c97c] call_softirq+0x1c/0x28
 [8020e672] do_softirq+0x3b/0xb9
 [802386c2] irq_exit+0x4e/0x50
 [8020e48e] do_IRQ+0xbe/0xd8
 [80209cb9] mwait_idle+0x0/0x4d
 [8020bcc6] ret_from_intr+0x0/0xf
 EOI  [80464e10] __sched_text_start+0x5f0/0x62b
 [80464e10] __sched_text_start+0x5f0/0x62b
 [80209cfc] mwait_idle+0x43/0x4d
 [802099fb] enter_idle+0x22/0x24
 [80209c4f] cpu_idle+0x9d/0xc0
 [80464513] rest_init+0x57/0x59
 [8060c82a] start_kernel+0x2d1/0x2dd




[PATCH] net: fix race in process_backlog

2007-10-03 Thread Peter Zijlstra
Subject: net: fix race in process_backlog

The recent NAPI rework (4fa57c9ea9f36f9ca852f3a88ca5d2f1aebbc960)
introduced a race between netif_rx() and process_backlog() which
resulted in softirq processing to drop dead.

netif_rx()              process_backlog()

                        irq_disable();
                        skb = __skb_dequeue();
                        irq_enable();

irq_disable();
__skb_queue_tail();
napi_schedule();
irq_enable();

                        if (!skb)
                          napi_complete();  <-- oops!

we cleared the napi bit, even though there is data to process.

Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
---
 net/core/dev.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6/net/core/dev.c
===
--- linux-2.6.orig/net/core/dev.c
+++ linux-2.6/net/core/dev.c
@@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
 
 		local_irq_disable();
 		skb = __skb_dequeue(&queue->input_pkt_queue);
-		local_irq_enable();
 		if (!skb) {
-			napi_complete(napi);
+			__napi_complete(napi);
 			break;
 		}
+		local_irq_enable();
 
 		dev = skb->dev;
 




Re: [PATCH] net: fix race in process_backlog

2007-10-03 Thread Stephen Hemminger
On Wed, 03 Oct 2007 17:44:53 +0200
Peter Zijlstra [EMAIL PROTECTED] wrote:

 Subject: net: fix race in process_backlog
 
 The recent NAPI rework (4fa57c9ea9f36f9ca852f3a88ca5d2f1aebbc960)
 introduced a race between netif_rx() and process_backlog() which
 resulted in softirq processing to drop dead.
 
 netif_rx()              process_backlog()
 
                         irq_disable();
                         skb = __skb_dequeue();
                         irq_enable();
 
 irq_disable();
 __skb_queue_tail();
 napi_schedule();
 irq_enable();
 
                         if (!skb)
                           napi_complete();  <-- oops!
 
 we cleared the napi bit, even though there is data to process.
 
 Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]

Acked-by: Stephen Hemminger [EMAIL PROTECTED]


-- 
Stephen Hemminger [EMAIL PROTECTED]


[PATCH v4] qe: miscellaneous code improvements and fixes to the QE library

2007-10-03 Thread Timur Tabi
This patch makes numerous miscellaneous code improvements to the QE library.

1. Remove struct ucc_common and merge ucc_init_guemr() into ucc_set_type()
   (every caller of ucc_init_guemr() also calls ucc_set_type()).  Modify all
   callers of ucc_set_type() accordingly.

2. Remove the unused enum ucc_pram_initial_offset.

3. Refactor qe_setbrg(), also implement work-around for errata QE_General4.

4. Several printk() calls were missing the terminating \n.

5. Add __iomem where needed, and change u16 to __be16 and u32 to __be32 where
   appropriate.

6. In ucc_slow_init() the RBASE and TBASE registers in the PRAM were programmed
   with the wrong value.

7. Add the protocol type to struct us_info and update ucc_slow_init() to
   use it, instead of always programming QE_CR_PROTOCOL_UNSPECIFIED.

8. Rename ucc_slow_restart_x() to ucc_slow_restart_tx().

9. Add several macros in qe.h (mostly for slow UCC support, but also to
   standardize some naming convention) and remove several unused macros.

10. Update ucc_geth.c to use the new macros.

11. Add ucc_slow_info.protocol to specify which QE_CR_PROTOCOL_xxx protocol
    to use when initializing the UCC in ucc_slow_init().

12. Rename ucc_slow_pram.rfcr to rbmr and ucc_slow_pram.tfcr to tbmr, since
these are the real names of the registers.

13. Use the setbits, clrbits, and clrsetbits where appropriate.

14. Refactor ucc_set_qe_mux_rxtx().

15. Remove all instances of 'volatile'.

16. Simplify get_cmxucr_reg().

17. Replace qe_mux.cmxucrX with qe_mux.cmxucr[].

18. Update struct ucc_geth because struct ucc_fast is not padded any more.

Signed-off-by: Timur Tabi [EMAIL PROTECTED]
---

Add fix 18.

 arch/powerpc/sysdev/qe_lib/qe.c   |   36 +++--
 arch/powerpc/sysdev/qe_lib/qe_ic.c|2 -
 arch/powerpc/sysdev/qe_lib/qe_io.c|   35 ++---
 arch/powerpc/sysdev/qe_lib/ucc.c  |  270 ++---
 arch/powerpc/sysdev/qe_lib/ucc_fast.c |  127 
 arch/powerpc/sysdev/qe_lib/ucc_slow.c |   48 +++---
 drivers/net/ucc_geth.c|2 +-
 drivers/net/ucc_geth.h|1 +
 include/asm-powerpc/immap_qe.h|   30 ++---
 include/asm-powerpc/qe.h  |  243 -
 include/asm-powerpc/ucc.h |   40 ++
 include/asm-powerpc/ucc_slow.h|9 +-
 12 files changed, 431 insertions(+), 412 deletions(-)

diff --git a/arch/powerpc/sysdev/qe_lib/qe.c b/arch/powerpc/sysdev/qe_lib/qe.c
index 90f8740..3d57d38 100644
--- a/arch/powerpc/sysdev/qe_lib/qe.c
+++ b/arch/powerpc/sysdev/qe_lib/qe.c
@@ -141,7 +141,7 @@ EXPORT_SYMBOL(qe_issue_cmd);
  * 16 BRGs, which can be connected to the QE channels or output
  * as clocks. The BRGs are in two different block of internal
  * memory mapped space.
- * The baud rate clock is the system clock divided by something.
+ * The BRG clock is the QE clock divided by 2.
  * It was set up long ago during the initial boot phase and is
  * is given to us.
  * Baud rate clocks are zero-based in the driver code (as that maps
@@ -165,28 +165,38 @@ unsigned int get_brg_clk(void)
return brg_clk;
 }
 
-/* This function is used by UARTS, or anything else that uses a 16x
- * oversampled clock.
+/* Program the BRG to the given sampling rate and multiplier
+ *
+ * @brg: the BRG, 1-16
+ * @rate: the desired sampling rate
+ * @multiplier: corresponds to the value programmed in GUMR_L[RDCR] or
+ * GUMR_L[TDCR].  E.g., if this BRG is the RX clock, and GUMR_L[RDCR]=01,
+ * then 'multiplier' should be 8.
+ *
+ * Also note that the value programmed into the BRGC register must be even.
  */
-void qe_setbrg(u32 brg, u32 rate)
+void qe_setbrg(unsigned int brg, unsigned int rate, unsigned int multiplier)
 {
-	volatile u32 *bp;
 	u32 divisor, tempval;
-	int div16 = 0;
+	u32 div16 = 0;
 
-	bp = &qe_immr->brg.brgc[brg];
+	divisor = get_brg_clk() / (rate * multiplier);
 
-	divisor = (get_brg_clk() / rate);
 	if (divisor > QE_BRGC_DIVISOR_MAX + 1) {
-		div16 = 1;
+		div16 = QE_BRGC_DIV16;
 		divisor /= 16;
 	}
 
-	tempval = ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) | QE_BRGC_ENABLE;
-	if (div16)
-		tempval |= QE_BRGC_DIV16;
+	/* Errata QE_General4, which affects some MPC832x and MPC836x SOCs, says
+	   that the BRG divisor must be even if you're not using divide-by-16
+	   mode. */
+	if (!div16 && (divisor & 1))
+		divisor++;
+
+	tempval = ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) |
+		QE_BRGC_ENABLE | div16;
 
-	out_be32(bp, tempval);
+	out_be32(&qe_immr->brg.brgc[brg - 1], tempval);
 }
 
 /* Initialize SNUMs (thread serial numbers) according to
diff --git a/arch/powerpc/sysdev/qe_lib/qe_ic.c b/arch/powerpc/sysdev/qe_lib/qe_ic.c
index 55e6f39..9a2d1ed 100644
--- a/arch/powerpc/sysdev/qe_lib/qe_ic.c
+++ b/arch/powerpc/sysdev/qe_lib/qe_ic.c
@@ -405,8 +405,6 @@ void 
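
The divisor arithmetic in the reworked qe_setbrg() can be checked in
isolation. The following is a user-space sketch, not the patch itself:
brgc_value() is a hypothetical helper that returns the value the driver
would write instead of touching hardware, and the four QE_BRGC_* values
are assumptions made for illustration (the authoritative definitions are
in include/asm-powerpc/qe.h).

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define QE_BRGC_ENABLE        0x00010000u
#define QE_BRGC_DIVISOR_SHIFT 1
#define QE_BRGC_DIVISOR_MAX   0xFFFu
#define QE_BRGC_DIV16         1u

/* Hypothetical helper: compute the BRGC register value for a given BRG
 * input clock, sampling rate and multiplier, following the same steps as
 * the patched qe_setbrg(). The caller must ensure rate * multiplier is
 * non-zero and not larger than brg_clk. */
static uint32_t brgc_value(uint32_t brg_clk, uint32_t rate, uint32_t multiplier)
{
	uint32_t divisor = brg_clk / (rate * multiplier);
	uint32_t div16 = 0;

	/* Too large for the divisor field: switch to divide-by-16 mode. */
	if (divisor > QE_BRGC_DIVISOR_MAX + 1) {
		div16 = QE_BRGC_DIV16;
		divisor /= 16;
	}

	/* Errata QE_General4: outside divide-by-16 mode the divisor must
	 * be even, so round odd values up. */
	if (!div16 && (divisor & 1))
		divisor++;

	return ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) | QE_BRGC_ENABLE | div16;
}
```

With an assumed 100 MHz BRG clock and a 16x multiplier, 115200 baud yields
a divisor of 54, while 9600 baud yields 651, which the errata workaround
rounds up to 652.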

Re: [PATCH v4] qe: miscellaneous code improvements and fixes to the QE library

2007-10-03 Thread Stephen Hemminger
On Wed,  3 Oct 2007 11:34:59 -0500
Timur Tabi [EMAIL PROTECTED] wrote:

> This patch makes numerous miscellaneous code improvements to the QE library.
>
> 1. Remove struct ucc_common and merge ucc_init_guemr() into ucc_set_type()
>    (every caller of ucc_init_guemr() also calls ucc_set_type()).  Modify all
>    callers of ucc_set_type() accordingly.
>
> 2. Remove the unused enum ucc_pram_initial_offset.
>
> 3. Refactor qe_setbrg(), also implement work-around for errata QE_General4.
>
> 4. Several printk() calls were missing the terminating \n.
>
> 5. Add __iomem where needed, and change u16 to __be16 and u32 to __be32 where
>    appropriate.
>
> 6. In ucc_slow_init() the RBASE and TBASE registers in the PRAM were programmed
>    with the wrong value.
>
> 7. Add the protocol type to struct us_info and update ucc_slow_init() to
>    use it, instead of always programming QE_CR_PROTOCOL_UNSPECIFIED.
>
> 8. Rename ucc_slow_restart_x() to ucc_slow_restart_tx().
>
> 9. Add several macros in qe.h (mostly for slow UCC support, but also to
>    standardize some naming convention) and remove several unused macros.
>
> 10. Update ucc_geth.c to use the new macros.
>
> 11. Add ucc_slow_info.protocol to specify which QE_CR_PROTOCOL_xxx protocol
>     to use when initializing the UCC in ucc_slow_init().
>
> 12. Rename ucc_slow_pram.rfcr to rbmr and ucc_slow_pram.tfcr to tbmr, since
>     these are the real names of the registers.
>
> 13. Use the setbits, clrbits, and clrsetbits where appropriate.
>
> 14. Refactor ucc_set_qe_mux_rxtx().
>
> 15. Remove all instances of 'volatile'.
>
> 16. Simplify get_cmxucr_reg().
>
> 17. Replace qe_mux.cmxucrX with qe_mux.cmxucr[].
>
> 18. Update struct ucc_geth because struct ucc_fast is not padded any more.
>
> Signed-off-by: Timur Tabi [EMAIL PROTECTED]
> ---
 

Separate the changes into individual patches to allow for better comment/review
and bisection in case of regression.


lockdep report from bonding.

2007-10-03 Thread Dave Jones
Reported by a Fedora user this morning.

Ethernet Channel Bonding Driver: v3.1.3 (June 13, 2007)
bonding: MII link monitoring set to 100 ms
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth0.
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
bonding: bond0: making interface eth0 the new active one.
bonding: bond0: enslaving eth0 as an active interface with an up link.

=================================
[ INFO: inconsistent lock state ]
2.6.23-0.214.rc8.git2.fc8 #1
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
events/1/10 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (&(bond_info->tx_hashtbl_lock)){-+..}, at: [<f8ad154c>] tlb_clear_slave+0x1d/0x9a [bonding]
{softirq-on-W} state was registered at:
  [<c0449fb0>] __lock_acquire+0x4ff/0xc67
  [<c044ab92>] lock_acquire+0x7b/0x9e
  [<c0633050>] _spin_lock+0x2e/0x58
  [<f8ad293a>] bond_alb_initialize+0x64/0x18e [bonding]
  [<f8acf25f>] bond_open+0x33/0x178 [bonding]
  [<c05ceb36>] dev_open+0x31/0x6c
  [<c05ccc8d>] dev_change_flags+0xa3/0x156
  [<c060d579>] devinet_ioctl+0x207/0x50e
  [<c060dc27>] inet_ioctl+0x86/0xa4
  [<c05c2e62>] sock_ioctl+0x1ac/0x1c9
  [<c04942a2>] do_ioctl+0x22/0x68
  [<c0494531>] vfs_ioctl+0x249/0x25c
  [<c049458d>] sys_ioctl+0x49/0x64
  [<c040522e>] syscall_call+0x7/0xb
  [] 0x
irq event stamp: 40878
hardirqs last  enabled at (40878): [<c0633474>] _spin_unlock_irq+0x22/0x2f
hardirqs last disabled at (40877): [<c063339d>] _spin_lock_irq+0x19/0x67
softirqs last  enabled at (40872): [<c05e6fcf>] rt_run_flush+0x6e/0x97
softirqs last disabled at (40873): [<c04075d4>] do_softirq+0x74/0xf7

other info that might help us debug this:
3 locks held by events/1/10:
 #0:  (rtnl_mutex){--..}, at: [<c0631c31>] mutex_lock+0x21/0x24
 #1:  (&bond->lock){-.-+}, at: [<f8ad25ed>] bond_alb_monitor+0x16/0x26e [bonding]
 #2:  (&bond->curr_slave_lock){..-+}, at: [<f8ad2680>] bond_alb_monitor+0xa9/0x26e [bonding]

stack backtrace:
 [<c0406463>] show_trace_log_lvl+0x1a/0x2f
 [<c0406e4d>] show_trace+0x12/0x14
 [<c0406e65>] dump_stack+0x16/0x18
 [<c0448856>] print_usage_bug+0x141/0x14b
 [<c04490dc>] mark_lock+0x12f/0x472
 [<c0449f38>] __lock_acquire+0x487/0xc67
 [<c044ab92>] lock_acquire+0x7b/0x9e
 [<c0633050>] _spin_lock+0x2e/0x58
 [<f8ad154c>] tlb_clear_slave+0x1d/0x9a [bonding]
 [<f8ad269a>] bond_alb_monitor+0xc3/0x26e [bonding]
 [<c043541b>] run_timer_softirq+0x127/0x18f
 [<c0432a21>] __do_softirq+0x78/0xff
 [<c04075d4>] do_softirq+0x74/0xf7
 =======================
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
bonding: bond0: Adding slave eth1.

-- 
http://www.codemonkey.org.uk


Re: Please pull 'fixes-jgarzik' branch of wireless-2.6

2007-10-03 Thread Jeff Garzik

John W. Linville wrote:
> The following changes since commit 3146b39c185f8a436d430132457e84fa1d8f8208:
>   Linus Torvalds (1):
>         Linux 2.6.23-rc9
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git fixes-jgarzik
>
> Joe Perches (1):
>       bcm43xx: Correct printk with PFX before KERN_
>
> Richard Knutsson (1):
>       softmac: Fix compiler-warning

pulled




Re: [PATCH] sky2: jumbo frame regression fix

2007-10-03 Thread Jeff Garzik

Stephen Hemminger wrote:
> Remove unneeded check that caused problems with jumbo frame sizes.
> The check was recently added and is wrong.
> When using jumbo frames the sky2 driver does fragmentation, so
> rx_data_size is less than mtu.
>
> Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


applied




Re: [PATCH] [9/11] pasemi_mac: clear out old errors on interface open

2007-10-03 Thread Jeff Garzik

Olof Johansson wrote:
> pasemi_mac: clear out old errors on interface open
>
> Clear out any pending errors when an interface is brought up. Since the bits
> are sticky, they might be from interface shutdown time after firmware has
> used it, etc.
>
> Signed-off-by: Olof Johansson [EMAIL PROTECTED]

In general, interface-open should completely reset and initialize the
hardware.  Does pasemi_mac not do that?


Jeff





Re: [PATCH] [11/11] pasemi_mac: enable iommu support

2007-10-03 Thread Jeff Garzik

Olof Johansson wrote:
> pasemi_mac: use buffer index pointer in clean_rx()
>
> Use the new features in B0 for buffer ring index on the receive side. This
> means we no longer have to search in the ring for where the buffer
> came from.
>
> Also cleanup the RX cleaning side a little, while I was at it.
>
> Note: Pre-B0 hardware is no longer supported, and needs a pile of other
> workarounds that are not being submitted for mainline inclusion. So the
> fact that this breaks old hardware is not a problem at this time.
>
> Signed-off-by: Olof Johansson [EMAIL PROTECTED]

You sent patch #10 again as patch #11 :)




Re: [PATCH 5/7] CAN: Add virtual CAN netdevice driver

2007-10-03 Thread Oliver Hartkopp

David Miller wrote:
> From: Stephen Hemminger [EMAIL PROTECTED]
> Date: Tue, 2 Oct 2007 14:52:36 -0700
>
>> Please consider using netif_msg_xxx() and module parameter to set
>> default message level, like other real network drivers already do.
>
> I keep seeing this recommendation, but the two supposedly most mature
> and actively used drivers in the tree, tg3 and e1000 and e1000e, all
> do not use this scheme.
>
> In fact there are tons of drivers that even hook up the ethtool
> msg_level setting function and never even use the value.
>
> If people aren't using netif_msg_xxx() and the ethtool msg_level
> facilities properly, it's because there is a severe dearth of good
> example drivers to learn about it from.

The currently available CAN netdevice drivers have neither a common debug
concept nor any runtime control mechanism for debugging. So
netif_msg_xxx() is definitely worth looking at, instead of creating
something new in this direction, before posting any 'real' CAN network
driver here.
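
Since the thread notes a lack of small examples, here is a user-space
model of the message-level scheme being discussed. It is a sketch under
stated assumptions, not driver code: msg_init() follows the rule of the
kernel's netif_msg_init() as I understand it (out-of-range value selects
the defaults, 0 means silent, N enables the lowest N bits), and the MSG_*
names and msg_enabled() are hypothetical stand-ins for the NETIF_MSG_*
bits and the netif_msg_xxx() tests.

```c
#include <stdint.h>

/* Hypothetical message classes; the values are assumed to mirror the
 * first few NETIF_MSG_* bits. */
enum {
	MSG_DRV   = 0x0001,
	MSG_PROBE = 0x0002,
	MSG_LINK  = 0x0004,
};

/* Turn a "debug" module parameter into a message-enable bitmask. */
static uint32_t msg_init(int debug_value, uint32_t default_bits)
{
	/* Negative or too large: fall back to the driver's defaults. */
	if (debug_value < 0 || debug_value >= (int)(sizeof(uint32_t) * 8))
		return default_bits;
	if (debug_value == 0)   /* explicitly silent */
		return 0;
	return (1u << debug_value) - 1;   /* enable the lowest N classes */
}

/* Equivalent of a netif_msg_xxx() test: is this class enabled? */
static int msg_enabled(uint32_t msg_enable, uint32_t kind)
{
	return (msg_enable & kind) != 0;
}
```

A driver following this pattern would store the mask once at probe time
and guard each log statement with the matching class test, so the level
can later be changed at runtime without touching the call sites.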


Thanks very much for that hint!

Oliver


Re: [PATCH] [1/11] pasemi_mac: basic error checking

2007-10-03 Thread Jeff Garzik

Olof Johansson wrote:
> pasemi_mac: basic error checking
>
> Add some rudimentary error checking to pasemi_mac.
>
> Signed-off-by: Olof Johansson [EMAIL PROTECTED]


applied 1-10




Re: [patch 2/3] ipg.c doesn't compile with with CONFIG_HIGHMEM64G

2007-10-03 Thread Jeff Garzik

applied




Re: [PATCH] Fix typo in new EMAC driver.

2007-10-03 Thread Jeff Garzik

Valentine Barshak (by way of Josh Boyer [EMAIL PROTECTED]) wrote:
> Fix an obvious typo in emac_xmit_finish.
>
> Signed-off-by: Valentine Barshak [EMAIL PROTECTED]
> ---
>  drivers/net/ibm_newemac/core.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)


applied




Re: [PATCH] [9/11] pasemi_mac: clear out old errors on interface open

2007-10-03 Thread Olof Johansson
On Wed, Oct 03, 2007 at 01:46:16PM -0400, Jeff Garzik wrote:
> Olof Johansson wrote:
>> pasemi_mac: clear out old errors on interface open
>> Clear out any pending errors when an interface is brought up. Since the bits
>> are sticky, they might be from interface shutdown time after firmware has
>> used it, etc.
>> Signed-off-by: Olof Johansson [EMAIL PROTECTED]
>
> In general, interface-open should completely reset and initialize the
> hardware.  does pasemi_mac not do that?

There's no explicit way to reset just one interface besides disabling it
(which we do at close, and re-enable at open). It seems that some of
the error bits are sticky across disable/enable, which is why this was
needed. Also, they're RW1C, so writing 0 doesn't remove them (need to
write 1 to clear).
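
To illustrate the RW1C point, here is a tiny user-space model of a
write-1-to-clear register. It is a sketch only: struct rw1c_reg,
rw1c_read(), rw1c_write() and clear_pending_errors() are hypothetical
names and do not correspond to anything in pasemi_mac.

```c
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear (RW1C) status register. */
struct rw1c_reg {
	uint32_t bits;
};

static uint32_t rw1c_read(const struct rw1c_reg *r)
{
	return r->bits;
}

/* Each 1 bit written clears the matching status bit; 0 bits leave the
 * register state untouched. */
static void rw1c_write(struct rw1c_reg *r, uint32_t val)
{
	r->bits &= ~val;
}

/* Clear whatever is pending by writing the current value back. */
static void clear_pending_errors(struct rw1c_reg *r)
{
	rw1c_write(r, rw1c_read(r));
}
```

Writing 0 to such a register changes nothing, which is why stale error
bits have to be read and written back to get rid of them.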

The only other dependency from firmware at this time is the setting of mac
addresses, something that will be taken care of once we allow override of
them via ethtool, since we'd need to program them from the driver then
no matter what. Right now we assume that firmware has programmed it.


-Olof



Re: [PATCH] sky2: jumbo frame regression fix

2007-10-03 Thread Bill Davidsen

Ian Kumlien wrote:
> On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
>> Remove unneeded check that caused problems with jumbo frame sizes.
>> The check was recently added and is wrong.
>> When using jumbo frames the sky2 driver does fragmentation, so
>> rx_data_size is less than mtu.
>
> Confirmed working.
>
> Now running with 9k mtu with no errors, =)


Have you verified that you are actually getting jumbo packets out of the 
NIC? I had one machine which did standard packets silently using sky2 
and jumbo using sk98lin. I was looking for something else with tcpdump 
and got one of those WTF moments when I saw all the tiny packets.


--
Bill Davidsen [EMAIL PROTECTED]
  We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot


[PATCH RESEND] [11/11] pasemi_mac: enable iommu support

2007-10-03 Thread Olof Johansson
pasemi_mac: enable iommu support

Enable IOMMU support for pasemi_mac, but avoid using it on non-partitioned
systems for performance reasons.

The user can override this by selecting the PPC_PASEMI_IOMMU_DMA_FORCE
configuration option.

Signed-off-by: Olof Johansson [EMAIL PROTECTED]

---

On Wed, Oct 03, 2007 at 01:47:17PM -0400, Jeff Garzik wrote:

> You sent patch #10 again as patch #11 :)

Oops! Here's the real copy.


-Olof


Index: k.org/arch/powerpc/platforms/pasemi/iommu.c
===
--- k.org.orig/arch/powerpc/platforms/pasemi/iommu.c
+++ k.org/arch/powerpc/platforms/pasemi/iommu.c
@@ -25,6 +25,7 @@
 #include <asm/iommu.h>
 #include <asm/machdep.h>
 #include <asm/abs_addr.h>
+#include <asm/firmware.h>
 
 
 #define IOBMAP_PAGE_SHIFT	12
@@ -175,13 +176,17 @@ static void pci_dma_dev_setup_pasemi(str
 {
 	pr_debug("pci_dma_dev_setup, dev %p (%s)\n", dev, pci_name(dev));
 
-	/* DMA device is untranslated, but all other PCI-e goes through
-	 * the IOMMU
+#if !defined(CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE)
+	/* For non-LPAR environment, don't translate anything for the DMA
+	 * engine. The exception to this is if the user has enabled
+	 * CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE at build time.
 	 */
-	if (dev->vendor == 0x1959 && dev->device == 0xa007)
+	if (dev->vendor == 0x1959 && dev->device == 0xa007 &&
+	    !firmware_has_feature(FW_FEATURE_LPAR))
 		dev->dev.archdata.dma_ops = &dma_direct_ops;
-	else
-		dev->dev.archdata.dma_data = &iommu_table_iobmap;
+#endif
+
+	dev->dev.archdata.dma_data = &iommu_table_iobmap;
 }
 
 static void pci_dma_bus_setup_null(struct pci_bus *b) { }
Index: k.org/drivers/net/pasemi_mac.c
===
--- k.org.orig/drivers/net/pasemi_mac.c
+++ k.org/drivers/net/pasemi_mac.c
@@ -34,6 +34,7 @@
 #include <net/checksum.h>
 
 #include <asm/irq.h>
+#include <asm/firmware.h>
 
 #include "pasemi_mac.h"
 
@@ -89,6 +90,15 @@ MODULE_PARM_DESC(debug, "PA Semi MAC bit
 
 static struct pasdma_status *dma_status;
 
+static int translation_enabled(void)
+{
+#if defined(CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE)
+	return 1;
+#else
+	return firmware_has_feature(FW_FEATURE_LPAR);
+#endif
+}
+
 static void write_iob_reg(struct pasemi_mac *mac, unsigned int reg,
 			  unsigned int val)
 {
@@ -193,6 +203,7 @@ static int pasemi_mac_setup_rx_resources
 	struct pasemi_mac_rxring *ring;
 	struct pasemi_mac *mac = netdev_priv(dev);
 	int chan_id = mac->dma_rxch;
+	unsigned int cfg;
 
 	ring = kzalloc(sizeof(*ring), GFP_KERNEL);
 
@@ -232,20 +243,28 @@ static int pasemi_mac_setup_rx_resources
 			   PAS_DMA_RXCHAN_BASEU_BRBH(ring->dma >> 32) |
 			   PAS_DMA_RXCHAN_BASEU_SIZ(RX_RING_SIZE >> 3));
 
-	write_dma_reg(mac, PAS_DMA_RXCHAN_CFG(chan_id),
-			   PAS_DMA_RXCHAN_CFG_HBU(2));
+	cfg = PAS_DMA_RXCHAN_CFG_HBU(2);
+
+	if (translation_enabled())
+		cfg |= PAS_DMA_RXCHAN_CFG_CTR;
+
+	write_dma_reg(mac, PAS_DMA_RXCHAN_CFG(chan_id), cfg);
 
 	write_dma_reg(mac, PAS_DMA_RXINT_BASEL(mac->dma_if),
-			   PAS_DMA_RXINT_BASEL_BRBL(__pa(ring->buffers)));
+			   PAS_DMA_RXINT_BASEL_BRBL(ring->buf_dma));
 
 	write_dma_reg(mac, PAS_DMA_RXINT_BASEU(mac->dma_if),
-			   PAS_DMA_RXINT_BASEU_BRBH(__pa(ring->buffers) >> 32) |
+			   PAS_DMA_RXINT_BASEU_BRBH(ring->buf_dma >> 32) |
 			   PAS_DMA_RXINT_BASEU_SIZ(RX_RING_SIZE >> 3));
 
-	write_dma_reg(mac, PAS_DMA_RXINT_CFG(mac->dma_if),
-		      PAS_DMA_RXINT_CFG_DHL(3) | PAS_DMA_RXINT_CFG_L2 |
-		      PAS_DMA_RXINT_CFG_LW | PAS_DMA_RXINT_CFG_RBP |
-		      PAS_DMA_RXINT_CFG_HEN);
+	cfg = PAS_DMA_RXINT_CFG_DHL(3) | PAS_DMA_RXINT_CFG_L2 |
+	      PAS_DMA_RXINT_CFG_LW | PAS_DMA_RXINT_CFG_RBP |
+	      PAS_DMA_RXINT_CFG_HEN;
+
+	if (translation_enabled())
+		cfg |= PAS_DMA_RXINT_CFG_ITRR | PAS_DMA_RXINT_CFG_ITR;
+
+	write_dma_reg(mac, PAS_DMA_RXINT_CFG(mac->dma_if), cfg);
 
 	ring->next_to_fill = 0;
 	ring->next_to_clean = 0;
@@ -275,6 +294,7 @@ static int pasemi_mac_setup_tx_resources
 	u32 val;
 	int chan_id = mac->dma_txch;
 	struct pasemi_mac_txring *ring;
+	unsigned int cfg;
 
 	ring = kzalloc(sizeof(*ring), GFP_KERNEL);
 	if (!ring)
@@ -304,11 +324,15 @@ static int pasemi_mac_setup_tx_resources
 
 	write_dma_reg(mac, PAS_DMA_TXCHAN_BASEU(chan_id), val);
 
-	write_dma_reg(mac, PAS_DMA_TXCHAN_CFG(chan_id),
-			   PAS_DMA_TXCHAN_CFG_TY_IFACE |
-			   PAS_DMA_TXCHAN_CFG_TATTR(mac->dma_if) |
-			   PAS_DMA_TXCHAN_CFG_UP |
-  

Re: [PATCH v4] qe: miscellaneous code improvements and fixes to the QE library

2007-10-03 Thread Timur Tabi

Stephen Hemminger wrote:

> Separate the changes into individual patches to allow for better comment/review
> and bisection in case of regression.

That would be too difficult.  Some of the changes are single lines, and this
patch has already been approved -- I just cross-posted to netdev because I
made a few ucc_geth changes that can't be decoupled from the powerpc changes.
A series of 18 patches would just be convoluted.


--
Timur Tabi
Linux Kernel Developer @ Freescale


Re: [PATCH] [9/11] pasemi_mac: clear out old errors on interface open

2007-10-03 Thread Jeff Garzik

Olof Johansson wrote:
> On Wed, Oct 03, 2007 at 01:46:16PM -0400, Jeff Garzik wrote:
>> Olof Johansson wrote:
>>> pasemi_mac: clear out old errors on interface open
>>> Clear out any pending errors when an interface is brought up. Since the bits
>>> are sticky, they might be from interface shutdown time after firmware has
>>> used it, etc.
>>> Signed-off-by: Olof Johansson [EMAIL PROTECTED]
>> In general, interface-open should completely reset and initialize the
>> hardware.  does pasemi_mac not do that?
>
> There's no explicit way to reset just one interface besides disabling it
> (which we do at close, and re-enable at open). It seems that some of
> the error bits are sticky across disable/enable, which is why this was
> needed. Also, they're RW1C, so writing 0 doesn't remove them (need to
> write 1 to clear).

OK just making sure, thanks.

> The only other dependency from firmware at this time is the setting of mac
> addresses, something that will be taken care of once we allow override of
> them via ethtool, since we'd need to program them from the driver then
> no matter what. Right now we assume that firmware has programmed it.

Standard procedure for this is

* upon module-load, obtain the MAC address from whatever canonical source
* upon interface-up, program dev->dev_addr[] into chip's RX filter (aka
  MAC address) registers

That permits the admin to override the MAC address via ifconfig.
(ethtool doesn't support that, but you basically had the right idea)
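
The two-step procedure above can be sketched in plain C. Everything here
is hypothetical and for illustration only: struct fake_netdev stands in
for struct net_device, struct fake_chip and its addr_hi/addr_lo layout
are invented rather than taken from any real hardware, and fake_probe()
and fake_open() merely play the roles of the module-load and
interface-up paths.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-ins for struct net_device and the chip registers. */
struct fake_netdev {
	uint8_t dev_addr[ETH_ALEN];
};

struct fake_chip {
	uint32_t addr_hi;   /* invented layout: bytes 0-1 */
	uint32_t addr_lo;   /* invented layout: bytes 2-5 */
};

/* Module load: read the address once from the canonical source
 * (firmware, EEPROM, device tree, ...) into dev_addr. */
static void fake_probe(struct fake_netdev *dev, const uint8_t src[ETH_ALEN])
{
	memcpy(dev->dev_addr, src, ETH_ALEN);
}

/* Interface up: always program the chip from dev_addr, so that an
 * address changed by the admin in the meantime takes effect. */
static void fake_open(const struct fake_netdev *dev, struct fake_chip *chip)
{
	const uint8_t *a = dev->dev_addr;

	chip->addr_hi = ((uint32_t)a[0] << 8) | a[1];
	chip->addr_lo = ((uint32_t)a[2] << 24) | ((uint32_t)a[3] << 16) |
			((uint32_t)a[4] << 8) | a[5];
}
```

Because the open path never consults the original source again, whatever
is in dev_addr at interface-up time is what the hardware filters on.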


Jeff




Re: [PATCH RESEND] [11/11] pasemi_mac: enable iommu support

2007-10-03 Thread Jeff Garzik

Olof Johansson wrote:
> pasemi_mac: enable iommu support
>
> Enable IOMMU support for pasemi_mac, but avoid using it on non-partitioned
> systems for performance reasons.
>
> The user can override this by selecting the PPC_PASEMI_IOMMU_DMA_FORCE
> configuration option.
>
> Signed-off-by: Olof Johansson [EMAIL PROTECTED]


applied



[git patches] net driver updates

2007-10-03 Thread Jeff Garzik

Normally I wait a day or two between pushes, to queue up patches and
also to avoid annoying my upstream :)  But this includes a couple fixes
I felt should be upstreamed sooner rather than later.


Please pull from 'upstream' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream

to receive the following updates:

Jeff Garzik (1):
  drivers/net/qla3xxx: trim trailing whitespace

Olof Johansson (11):
  pasemi_mac: basic error checking
  pasemi_mac: fix bug in receive buffer dma mapping
  pasemi_mac: rework ring management
  pasemi_mac: implement sg support
  pasemi_mac: workaround for erratum 5971
  pasemi_mac: add local skb alignment
  pasemi_mac: further performance tweaks
  pasemi_mac: update todo list
  pasemi_mac: clear out old errors on interface open
  pasemi_mac: use buffer index pointer in clean_rx()
  pasemi_mac: enable iommu support

trem (1):
  ipg.c doesn't compile with with CONFIG_HIGHMEM64G

[EMAIL PROTECTED] (1):
  Fix typo in new EMAC driver.

 arch/powerpc/platforms/pasemi/Kconfig |   10 
 arch/powerpc/platforms/pasemi/iommu.c |   15 
 drivers/net/ibm_newemac/core.c|4 
 drivers/net/ipg.c |   10 
 drivers/net/pasemi_mac.c  |  595 ++
 drivers/net/pasemi_mac.h  |   67 ++-
 drivers/net/qla3xxx.c |  128 +++
 drivers/net/qla3xxx.h |6 
 8 files changed, 527 insertions(+), 308 deletions(-)

diff --git a/arch/powerpc/platforms/pasemi/Kconfig b/arch/powerpc/platforms/pasemi/Kconfig
index 95cd90f..e95261e 100644
--- a/arch/powerpc/platforms/pasemi/Kconfig
+++ b/arch/powerpc/platforms/pasemi/Kconfig
@@ -18,6 +18,16 @@ config PPC_PASEMI_IOMMU
 	help
 	  IOMMU support for PA6T-1682M
 
+config PPC_PASEMI_IOMMU_DMA_FORCE
+	bool "Force DMA engine to use IOMMU"
+	depends on PPC_PASEMI_IOMMU
+	help
+	  This option forces the use of the IOMMU also for the
+	  DMA engine. Otherwise the kernel will use it only when
+	  running under a hypervisor.
+
+	  If in doubt, say N.
+
 config PPC_PASEMI_MDIO
 	depends on PHYLIB
 	tristate "MDIO support via GPIO"
diff --git a/arch/powerpc/platforms/pasemi/iommu.c b/arch/powerpc/platforms/pasemi/iommu.c
index 9014d55..ab5 100644
--- a/arch/powerpc/platforms/pasemi/iommu.c
+++ b/arch/powerpc/platforms/pasemi/iommu.c
@@ -25,6 +25,7 @@
 #include <asm/iommu.h>
 #include <asm/machdep.h>
 #include <asm/abs_addr.h>
+#include <asm/firmware.h>
 
 
 #define IOBMAP_PAGE_SHIFT	12
@@ -175,13 +176,17 @@ static void pci_dma_dev_setup_pasemi(struct pci_dev *dev)
 {
 	pr_debug("pci_dma_dev_setup, dev %p (%s)\n", dev, pci_name(dev));
 
-	/* DMA device is untranslated, but all other PCI-e goes through
-	 * the IOMMU
+#if !defined(CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE)
+	/* For non-LPAR environment, don't translate anything for the DMA
+	 * engine. The exception to this is if the user has enabled
+	 * CONFIG_PPC_PASEMI_IOMMU_DMA_FORCE at build time.
 	 */
-	if (dev->vendor == 0x1959 && dev->device == 0xa007)
+	if (dev->vendor == 0x1959 && dev->device == 0xa007 &&
+	    !firmware_has_feature(FW_FEATURE_LPAR))
 		dev->dev.archdata.dma_ops = &dma_direct_ops;
-	else
-		dev->dev.archdata.dma_data = &iommu_table_iobmap;
+#endif
+
+	dev->dev.archdata.dma_data = &iommu_table_iobmap;
 }
 
 static void pci_dma_bus_setup_null(struct pci_bus *b) { }
diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c
index 653bfdc..ce127b9 100644
--- a/drivers/net/ibm_newemac/core.c
+++ b/drivers/net/ibm_newemac/core.c
@@ -1232,9 +1232,9 @@ static inline int emac_xmit_finish(struct emac_instance *dev, int len)
 	 * instead
 	 */
 	if (emac_has_feature(dev, EMAC_FTR_EMAC4))
-		out_be32(&p->tmr0, EMAC_TMR0_XMIT);
-	else
 		out_be32(&p->tmr0, EMAC4_TMR0_XMIT);
+	else
+		out_be32(&p->tmr0, EMAC_TMR0_XMIT);
 
 	if (unlikely(++dev->tx_cnt == NUM_TX_BUFF)) {
 		netif_stop_queue(ndev);
diff --git a/drivers/net/ipg.c b/drivers/net/ipg.c
index dfdc96f..59898ce 100644
--- a/drivers/net/ipg.c
+++ b/drivers/net/ipg.c
@@ -25,6 +25,8 @@
 #include <linux/mii.h>
 #include <linux/mutex.h>
 
+#include <asm/div64.h>
+
 #define IPG_RX_RING_BYTES	(sizeof(struct ipg_rx) * IPG_RFDLIST_LENGTH)
 #define IPG_TX_RING_BYTES	(sizeof(struct ipg_tx) * IPG_TFDLIST_LENGTH)
 #define IPG_RESET_MASK \
@@ -836,10 +838,14 @@ static void ipg_nic_txfree(struct net_device *dev)
 {
 	struct ipg_nic_private *sp = netdev_priv(dev);
 	void __iomem *ioaddr = sp->ioaddr;
-	const unsigned int curr = ipg_r32(TFD_LIST_PTR_0) -
-		(sp->txd_map / sizeof(struct ipg_tx)) - 1;
+	unsigned int curr;
+	u64 txd_map;
 	unsigned int released, pending;
 
+   txd_map = 

[git patches] net driver fixes

2007-10-03 Thread Jeff Garzik

sky2 is really the only important fix, the others are trivial.


Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream-linus

to receive the following updates:

 drivers/net/sky2.c  |3 ---
 drivers/net/wireless/bcm43xx/bcm43xx_wx.c   |2 +-
 net/ieee80211/softmac/ieee80211softmac_wx.c |2 +-
 3 files changed, 2 insertions(+), 5 deletions(-)

Joe Perches (1):
  bcm43xx: Correct printk with PFX before KERN_

Richard Knutsson (1):
  softmac: Fix compiler-warning

Stephen Hemminger (1):
  sky2: jumbo frame regression fix

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 162489b..ea117fc 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -2163,9 +2163,6 @@ static struct sk_buff *sky2_receive(struct net_device *dev,
 	sky2->rx_next = (sky2->rx_next + 1) % sky2->rx_pending;
 	prefetch(sky2->rx_ring + sky2->rx_next);
 
-	if (length < ETH_ZLEN || length > sky2->rx_data_size)
-		goto len_error;
-
 	/* This chip has hardware problems that generates bogus status.
 	 * So do only marginal checking and expect higher level protocols
 	 * to handle crap frames.
diff --git a/drivers/net/wireless/bcm43xx/bcm43xx_wx.c b/drivers/net/wireless/bcm43xx/bcm43xx_wx.c
index d6d9413..6acfdc4 100644
--- a/drivers/net/wireless/bcm43xx/bcm43xx_wx.c
+++ b/drivers/net/wireless/bcm43xx/bcm43xx_wx.c
@@ -444,7 +444,7 @@ static int bcm43xx_wx_set_xmitpower(struct net_device *net_dev,
 	u16 maxpower;
 
 	if ((data->txpower.flags & IW_TXPOW_TYPE) != IW_TXPOW_DBM) {
-		printk(PFX KERN_ERR "TX power not in dBm.\n");
+		printk(KERN_ERR PFX "TX power not in dBm.\n");
 		return -EOPNOTSUPP;
 	}
 
diff --git a/net/ieee80211/softmac/ieee80211softmac_wx.c b/net/ieee80211/softmac/ieee80211softmac_wx.c
index 442b987..5742dc8 100644
--- a/net/ieee80211/softmac/ieee80211softmac_wx.c
+++ b/net/ieee80211/softmac/ieee80211softmac_wx.c
@@ -114,7 +114,7 @@ check_assoc_again:
 	sm->associnfo.associating = 1;
 	/* queue lower level code to do work (if necessary) */
 	schedule_delayed_work(&sm->associnfo.work, 0);
-out:
+
 	mutex_unlock(&sm->associnfo.mutex);
 
 	return 0;


[PATCH] Fix rose.ko oops on unload

2007-10-03 Thread Alexey Dobriyan
Quick'n'dirty fix to 100% oops on rmmod rose. Do you want me to
properly unwind everything before .24?
---
Commit a3d384029aa304f8f3f5355d35f0ae274454f7cd aka
"[AX.25]: Fix unchecked rose_add_loopback_neigh uses"
transformed the rose_loopback_neigh variable into a statically allocated one.
However, on unload it is still kfree'd, which can't work.
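
The underlying rule is that only memory obtained from the allocator may
be handed back to it. As a user-space sketch of the pointer-based layout
this patch returns to (struct neigh, loopback_neigh, loopback_init() and
loopback_exit() are hypothetical names, and calloc()/free() stand in for
the kernel allocator):

```c
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for struct rose_neigh. */
struct neigh {
	int number;
};

/* A pointer rather than a static object, so that what gets freed on
 * unload really came from the allocator. */
static struct neigh *loopback_neigh;

/* Module init: allocate the loopback neighbour dynamically. */
static int loopback_init(void)
{
	loopback_neigh = calloc(1, sizeof(*loopback_neigh));
	return loopback_neigh ? 0 : -1;
}

/* Module exit: freeing is valid here because of the calloc() above.
 * Passing the address of a static object instead would be undefined. */
static void loopback_exit(void)
{
	free(loopback_neigh);
	loopback_neigh = NULL;
}
```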

Steps to reproduce:

modprobe rose
rmmod rose

BUG: unable to handle kernel NULL pointer dereference at virtual address 
0008
 printing eip:
c014c664
*pde = 
Oops:  [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in: rose ax25 fan ufs loop usbhid rtc snd_intel8x0 
snd_ac97_codec ehci_hcd ac97_bus uhci_hcd thermal usbcore button processor 
evdev sr_mod cdrom
CPU:0
EIP:0060:[c014c664]Not tainted VLI
EFLAGS: 00210086   (2.6.23-rc9 #3)
EIP is at kfree+0x48/0xa1
eax: 0556   ebx: c1734aa0   ecx: f6a5e000   edx: f7082000
esi:    edi: f9a55d20   ebp: 00200287   esp: f6a5ef28
ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Process rmmod (pid: 1823, ti=f6a5e000 task=f7082000 task.ti=f6a5e000)
Stack: f9a55d20 f9a5200c    f6a5e000 f9a5200c f9a55a00 
    bf818cf0 f9a51f3f f9a55a00  c0132c60 65736f72  
   f69f9630 f69f9528 c014244a f6a4e900 00200246 f7082000 c01025e6  
Call Trace:
 [f9a5200c] rose_rt_free+0x1d/0x49 [rose]
 [f9a5200c] rose_rt_free+0x1d/0x49 [rose]
 [f9a51f3f] rose_exit+0x4c/0xd5 [rose]
 [c0132c60] sys_delete_module+0x15e/0x186
 [c014244a] remove_vma+0x40/0x45
 [c01025e6] sysenter_past_esp+0x8f/0x99
 [c012bacf] trace_hardirqs_on+0x118/0x13b
 [c01025b6] sysenter_past_esp+0x5f/0x99
 ===
Code: 05 03 1d 80 db 5b c0 8b 03 25 00 40 02 00 3d 00 40 02 00 75 03 8b 5b 0c 
8b 73 10 8b 44 24 18 89 44 24 04 9c 5d fa e8 77 df fd ff 8b 56 08 89 f8 e8 84 
f4 fd ff e8 bd 32 06 00 3b 5c 86 60 75 0f 
EIP: [c014c664] kfree+0x48/0xa1 SS:ESP 0068:f6a5ef28

Signed-off-by: Alexey Dobriyan [EMAIL PROTECTED]
---

 include/net/rose.h       |    2 +-
 net/rose/rose_loopback.c |    4 ++--
 net/rose/rose_route.c    |   15 ++++++++++-----
 3 files changed, 13 insertions(+), 8 deletions(-)

--- a/include/net/rose.h
+++ b/include/net/rose.h
@@ -188,7 +188,7 @@ extern void rose_kick(struct sock *);
 extern void rose_enquiry_response(struct sock *);
 
 /* rose_route.c */
-extern struct rose_neigh rose_loopback_neigh;
+extern struct rose_neigh *rose_loopback_neigh;
 extern const struct file_operations rose_neigh_fops;
 extern const struct file_operations rose_nodes_fops;
 extern const struct file_operations rose_routes_fops;
--- a/net/rose/rose_loopback.c
+++ b/net/rose/rose_loopback.c
@@ -79,7 +79,7 @@ static void rose_loopback_timer(unsigned long param)
 
 		skb_reset_transport_header(skb);
 
-		sk = rose_find_socket(lci_o, &rose_loopback_neigh);
+		sk = rose_find_socket(lci_o, rose_loopback_neigh);
 		if (sk) {
 			if (rose_process_rx_frame(sk, skb) == 0)
 				kfree_skb(skb);
@@ -88,7 +88,7 @@ static void rose_loopback_timer(unsigned long param)
 
 		if (frametype == ROSE_CALL_REQUEST) {
 			if ((dev = rose_dev_get(dest)) != NULL) {
-				if (rose_rx_call_request(skb, dev, &rose_loopback_neigh, lci_o) == 0)
+				if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0)
 					kfree_skb(skb);
 			} else {
 				kfree_skb(skb);
--- a/net/rose/rose_route.c
+++ b/net/rose/rose_route.c
@@ -45,7 +45,7 @@ static DEFINE_SPINLOCK(rose_neigh_list_lock);
 static struct rose_route *rose_route_list;
 static DEFINE_SPINLOCK(rose_route_list_lock);
 
-struct rose_neigh rose_loopback_neigh;
+struct rose_neigh *rose_loopback_neigh;
 
 /*
  * Add a new route to a node, and in the process add the node and the
@@ -362,7 +362,12 @@ out:
  */
 void rose_add_loopback_neigh(void)
 {
-	struct rose_neigh *sn = &rose_loopback_neigh;
+	struct rose_neigh *sn;
+
+	rose_loopback_neigh = kmalloc(sizeof(struct rose_neigh), GFP_KERNEL);
+	if (!rose_loopback_neigh)
+		return;
+	sn = rose_loopback_neigh;
 
 	sn->callsign  = null_ax25_address;
 	sn->digipeat  = NULL;
@@ -417,13 +422,13 @@ int rose_add_loopback_node(rose_address *address)
 	rose_node->mask         = 10;
 	rose_node->count        = 1;
 	rose_node->loopback     = 1;
-	rose_node->neighbour[0] = &rose_loopback_neigh;
+	rose_node->neighbour[0] = rose_loopback_neigh;
 
 	/* Insert at the head of list. Address is always mask=10 */
 	rose_node->next = rose_node_list;
 	rose_node_list  = rose_node;
 
-	rose_loopback_neigh.count++;
+	rose_loopback_neigh->count++;
 
 out:
 	spin_unlock_bh(&rose_node_list_lock);
@@ -454,7 +459,7 @@ void 

Re: InfiniBand/RDMA merge plans for 2.6.24

2007-10-03 Thread Shirley Ma
Roland Dreier [EMAIL PROTECTED] wrote on 09/17/2007 02:47:42 PM:

IPoIB CM handles this properly by gathering together single pages in
skbs' fragment lists.
 
   Then can we reuse IPoIB CM code here?
 
 Yes, if possible, refactoring things so that the rx skb allocation
 code becomes common between CM and non-CM would definitely make sense.

The IPoIB-CM rx skb allocation is not generic enough to be used by UD: it
allocates more buffers than needed if the MTU is not 64K, and it doesn't
query the real max_num_sg from the device. I am thinking of adding a
generic skb allocation to IPoIB based on a matrix of (ipoib-mtu-size,
page-size, max_num_sg, head-size).
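
To make the four parameters concrete, here is a minimal userspace sketch of
the arithmetic such an allocator might start from. It is not taken from the
IPoIB driver: rx_frags_needed() is a hypothetical name, and the assumption
that the linear head consumes one scatter/gather entry is mine.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not IPoIB code: number of page fragments an rx
 * skb would need beyond its linear head, or -1 if the device cannot
 * scatter that many entries.
 */
int rx_frags_needed(size_t mtu, size_t head_size,
                    size_t page_size, unsigned int max_num_sg)
{
    size_t frags;

    assert(page_size > 0);

    if (mtu <= head_size)
        return 0;               /* everything fits in the head */

    /* Round up: a partly filled page still costs one fragment. */
    frags = (mtu - head_size + page_size - 1) / page_size;

    /* Assumption: one scatter/gather entry is used by the head. */
    if (max_num_sg == 0 || frags > (size_t)max_num_sg - 1)
        return -1;

    return (int)frags;
}
```

With 4K pages and a 256-byte head, an MTU near 64K needs 16 fragments while
a 2044-byte MTU needs one, which is the over-allocation described above when
the 64K layout is used unconditionally.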

Thanks
Shirley 


Re: [PATCH] sky2: jumbo frame regression fix

2007-10-03 Thread Ian Kumlien
On Wed, 2007-10-03 at 14:04 -0400, Bill Davidsen wrote:
 Ian Kumlien wrote:
  On Tue, 2007-10-02 at 18:02 -0700, Stephen Hemminger wrote:
  Remove unneeded check that caused problems with jumbo frame sizes.
  The check was recently added and is wrong.
  When using jumbo frames the sky2 driver does fragmentation, so
  rx_data_size is less than mtu.
  
  Confirmed working.
  
  Now running with 9k mtu with no errors, =)
 
 Have you verified that you are actually getting jumbo packets out of the 
 NIC? I had one machine which did standard packets silently using sky2 
 and jumbo using sk98lin. I was looking for something else with tcpdump 
 and got one of those WTF moments when I saw all the tiny packets.

20:27:06.542461 IP pi.local > blue.local: ICMP echo request, id 27173, seq 42, length 8008
20:27:06.543136 IP blue.local > pi.local: ICMP echo reply, id 27173, seq 42, length 8008

That should solve it for us, right? =)

-- 
Ian Kumlien pomac () vapor ! com -- http://pomac.netswarm.net




Re: [PATCH] Fix rose.ko oops on unload

2007-10-03 Thread Jeff Garzik

Alexey Dobriyan wrote:

Quick'n'dirty fix to 100% oops on rmmod rose. Do you want me to
properly unwind everything before .24?
---
Commit a3d384029aa304f8f3f5355d35f0ae274454f7cd aka
[AX.25]: Fix unchecked rose_add_loopback_neigh uses
transformed rose_loopback_neigh var into statically allocated one.
However, on unload it will be kfree's which can't work.


I'm definitely missing something...  assuming your patch is applied, 
where is the kfree() for rose_loopback_neigh ?


Jeff





Re: [PATCH] Fix rose.ko oops on unload

2007-10-03 Thread Alexey Dobriyan
On Wed, Oct 03, 2007 at 03:04:20PM -0400, Jeff Garzik wrote:
 Alexey Dobriyan wrote:
 Quick'n'dirty fix to 100% oops on rmmod rose. Do you want me to
 properly unwind everything before .24?
 ---
 Commit a3d384029aa304f8f3f5355d35f0ae274454f7cd aka
 [AX.25]: Fix unchecked rose_add_loopback_neigh uses
 transformed rose_loopback_neigh var into statically allocated one.
 However, on unload it will be kfree's which can't work.
 
 I'm definitely missing something...  assuming your patch is applied, 
 where is the kfree() for rose_loopback_neigh ?

AFAICS, it will be glued to rose_neigh_list in rose_add_loopback_neigh().
On unload it will be found there, and rose_remove_neigh() will free it.
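
A stripped-down userspace model of that ownership rule, for illustration
only: the struct and function names below are stand-ins rather than the
rose code, and unlike the patch the add function returns an error code.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for struct rose_neigh; only the list linkage is kept. */
struct neigh {
    struct neigh *next;
    int number;
};

struct neigh *neigh_list;       /* models rose_neigh_list */
struct neigh *loopback_neigh;   /* a pointer, as in the patch */

/* Models rose_add_loopback_neigh(): allocate, then link into the list. */
int add_loopback_neigh(void)
{
    struct neigh *sn = malloc(sizeof(*sn));

    if (!sn)
        return -1;
    sn->number = 0;
    sn->next = neigh_list;
    neigh_list = sn;
    loopback_neigh = sn;
    return 0;
}

/* Models the unload path: every list entry is freed the same way. */
int free_all_neigh(void)
{
    int freed = 0;

    while (neigh_list) {
        struct neigh *s = neigh_list;

        neigh_list = s->next;
        if (s == loopback_neigh)
            loopback_neigh = NULL;
        free(s);        /* valid only because s came from malloc() */
        freed++;
    }
    return freed;
}
```

Had loopback_neigh pointed at a statically allocated object, the free() in
the teardown loop would be handed memory that never came from the allocator,
which is the userspace analogue of the kfree() oops.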


Re: lockdep report from bonding.

2007-10-03 Thread Andy Gospodarek
On Wed, Oct 03, 2007 at 01:05:14PM -0400, Dave Jones wrote:
 Reported by a Fedora user this morning.
 
 Ethernet Channel Bonding Driver: v3.1.3 (June 13, 2007)
 bonding: MII link monitoring set to 100 ms
 ADDRCONF(NETDEV_UP): bond0: link is not ready
 bonding: bond0: Adding slave eth0.
 e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
 bonding: bond0: making interface eth0 the new active one.
 bonding: bond0: enslaving eth0 as an active interface with an up link.
 
 =
 [ INFO: inconsistent lock state ]
 2.6.23-0.214.rc8.git2.fc8 #1
 -
 inconsistent {softirq-on-W} -> {in-softirq-W} usage.
 events/1/10 [HC0[0]:SC1[1]:HE1:SE0] takes:
  (&(bond_info->tx_hashtbl_lock)){-+..}, at: [f8ad154c] tlb_clear_slave+0x1d/0x9a [bonding]
 {softirq-on-W} state was registered at:
   [c0449fb0] __lock_acquire+0x4ff/0xc67
   [c044ab92] lock_acquire+0x7b/0x9e
   [c0633050] _spin_lock+0x2e/0x58
   [f8ad293a] bond_alb_initialize+0x64/0x18e [bonding]
   [f8acf25f] bond_open+0x33/0x178 [bonding]
   [c05ceb36] dev_open+0x31/0x6c
   [c05ccc8d] dev_change_flags+0xa3/0x156
   [c060d579] devinet_ioctl+0x207/0x50e
   [c060dc27] inet_ioctl+0x86/0xa4
   [c05c2e62] sock_ioctl+0x1ac/0x1c9
   [c04942a2] do_ioctl+0x22/0x68
   [c0494531] vfs_ioctl+0x249/0x25c
   [c049458d] sys_ioctl+0x49/0x64
   [c040522e] syscall_call+0x7/0xb
   [] 0x
 irq event stamp: 40878
 hardirqs last  enabled at (40878): [c0633474] _spin_unlock_irq+0x22/0x2f
 hardirqs last disabled at (40877): [c063339d] _spin_lock_irq+0x19/0x67
 softirqs last  enabled at (40872): [c05e6fcf] rt_run_flush+0x6e/0x97
 softirqs last disabled at (40873): [c04075d4] do_softirq+0x74/0xf7
 
 other info that might help us debug this:
 3 locks held by events/1/10:
  #0:  (rtnl_mutex){--..}, at: [c0631c31] mutex_lock+0x21/0x24
  #1:  (&bond->lock){-.-+}, at: [f8ad25ed] bond_alb_monitor+0x16/0x26e [bonding]
  #2:  (&bond->curr_slave_lock){..-+}, at: [f8ad2680] bond_alb_monitor+0xa9/0x26e [bonding]
 
 stack backtrace:
  [c0406463] show_trace_log_lvl+0x1a/0x2f
  [c0406e4d] show_trace+0x12/0x14
  [c0406e65] dump_stack+0x16/0x18
  [c0448856] print_usage_bug+0x141/0x14b
  [c04490dc] mark_lock+0x12f/0x472
  [c0449f38] __lock_acquire+0x487/0xc67
  [c044ab92] lock_acquire+0x7b/0x9e
  [c0633050] _spin_lock+0x2e/0x58
  [f8ad154c] tlb_clear_slave+0x1d/0x9a [bonding]
  [f8ad269a] bond_alb_monitor+0xc3/0x26e [bonding]
  [c043541b] run_timer_softirq+0x127/0x18f
  [c0432a21] __do_softirq+0x78/0xff
  [c04075d4] do_softirq+0x74/0xf7
  ===
 ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
 bonding: bond0: Adding slave eth1.
 


This isn't surprising; it rears its head every once in a while.  It
probably becomes more apparent on faster, multiprocessor systems.  The
big bonding-workqueue conversion patch that Jay and I have been testing
for a while should resolve this one too (since it moves the monitoring
out of softirq context *and* it switches the hashtbl locks to _bh ones
along with a bunch of other changes).
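
For anyone not used to reading these reports, the rule being violated can be
modelled in a few lines of userspace C. This is only a toy sketch, not
lockdep's real data structures; struct lock_usage and record_acquire() are
made-up names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of two of the usage bits lockdep keeps per lock class. */
struct lock_usage {
    bool used_in_softirq;   /* taken while running in softirq context */
    bool taken_softirq_on;  /* taken in process context, softirqs enabled */
};

/*
 * Record one acquisition. Returns false when the combined history is
 * inconsistent, i.e. a softirq could interrupt a holder on the same
 * CPU and spin on the lock forever.
 */
bool record_acquire(struct lock_usage *u, bool in_softirq,
                    bool softirqs_enabled)
{
    if (in_softirq)
        u->used_in_softirq = true;
    else if (softirqs_enabled)
        u->taken_softirq_on = true;

    return !(u->used_in_softirq && u->taken_softirq_on);
}
```

In this model the plain spin_lock() in bond_alb_initialize() sets the second
bit and the timer-driven tlb_clear_slave() sets the first, hence the report.
With the _bh variants the process-context acquisition runs with softirqs
disabled, so the second bit is never set.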





Re: tcp bw in 2.6

2007-10-03 Thread Larry McVoy
 A few notes to the discussion. I've seen one e1000 bug that ended up being
 a crappy AMD pre-opteron SMP chipset with a totally useless PCI bus
 implementation, which limited performance quite a bit-totally depending on
 what you plugged in and in which slot. 10e milk-and-bread-store 
 32/33 gige nics actually were better than server-class e1000's 
 in those, but weren't that great either.

That could well be my problem, this is a dual processor (not core) athlon
(not opteron) tyan motherboard if I recall correctly.

 Check your interrupt rates for the interface. You shouldn't be getting
 anywhere near 1 interrupt/packet. If you are, something is badly wrong :).

The acks (because I'm sending) are about 1.5 packets/interrupt.
When this box is receiving it's moving about 3x as much data
and has a _lower_ (absolute, not per packet) interrupt load.
-- 
---
Larry McVoy            lm at bitmover.com           http://www.bitkeeper.com


Re: tcp bw in 2.6

2007-10-03 Thread Pekka Pietikainen
On Tue, Oct 02, 2007 at 02:21:32PM -0700, Larry McVoy wrote:
 More data, sky2 works fine (really really fine, like 79MB/sec) between
 Linux dylan.bitmover.com 2.6.18.1 #5 SMP Mon Oct 23 17:36:00 PDT 2006 i686
 Linux steele 2.6.20-16-generic #2 SMP Sun Sep 23 18:31:23 UTC 2007 x86_64
 
 So this is looking like a e1000 bug.  I'll try to upgrade the kernel on 
 the ia64 box and see what happens.
A few notes to the discussion. I've seen one e1000 bug that ended up being
a crappy AMD pre-opteron SMP chipset with a totally useless PCI bus
implementation, which limited performance quite a bit-totally depending on
what you plugged in and in which slot. 10e milk-and-bread-store 
32/33 gige nics actually were better than server-class e1000's 
in those, but weren't that great either.

One thing worth trying is recv(..., MSG_TRUNC) on the receiver; that tests
the theoretical sender maximum performance much better (but memory
bandwidth vs. GigE is much higher these days than it was in 2001, so maybe
it's not that useful anymore).

Check your interrupt rates for the interface. You shouldn't be getting
anywhere near 1 interrupt/packet. If you are, something is badly wrong :).

Running getsockopt(...TCP_INFO) every few secs on the socket and printing
that out can be useful too. That gives you both sides' idea on what the
tcp windows etc. are.
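
A minimal sketch of that TCP_INFO query on Linux follows. The helper names
fetch_tcp_info() and print_tcp_info() are made up for illustration; the
getsockopt() call and the struct tcp_info fields are the ones glibc exposes
in <netinet/tcp.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Fetch the kernel's view of a TCP connection (Linux-specific).
 * Returns 0 on success, -1 on failure with errno set by getsockopt().
 */
int fetch_tcp_info(int fd, struct tcp_info *info)
{
    socklen_t len = sizeof(*info);

    memset(info, 0, sizeof(*info));
    return getsockopt(fd, IPPROTO_TCP, TCP_INFO, info, &len);
}

/* Print a few fields that matter when chasing a throughput problem. */
void print_tcp_info(const struct tcp_info *info)
{
    printf("rtt=%uus rttvar=%uus snd_cwnd=%u snd_ssthresh=%u "
           "rcv_space=%u retrans=%u\n",
           info->tcpi_rtt, info->tcpi_rttvar, info->tcpi_snd_cwnd,
           info->tcpi_snd_ssthresh, info->tcpi_rcv_space,
           info->tcpi_total_retrans);
}
```

Calling fetch_tcp_info() followed by print_tcp_info() from the benchmark's
send loop every few seconds, on both hosts, gives the two views to compare.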

My favourite tool is a home-made thing called yantt, btw
(http://www.ee.oulu.fi/~pp/yantt.tgz). It needs lots of cleanup love:
it mucks with the window sizes by default, since in the 2.4 days you really
had to do that to get any kind of performance, and the help text is wrong.
But it's pretty easy to hack to try out new ideas, use
sendfile/MSG_TRUNC/TCP_INFO etc.

Netperf is the kitchen sink of network benchmark tools. But trying out a few
tiny things with it is not fun at all, I tried and quickly decided to 
write my own tool for my master's thesis work ;-)

Oh. Don't measure CPU usage with top. Use a cyclesoaker (google for
cyclesoak, I included akpm's with yantt) :-)

And yes. TCP stacks do have bugs, especially when things get outside the
equipment most people have. Having a dedicated transatlantic 2.5Gbps
connection found a really fun one a long time ago ;)

-- 
Pekka Pietikainen


Re: tcp bw in 2.6

2007-10-03 Thread Pekka Pietikainen
On Wed, Oct 03, 2007 at 02:23:58PM -0700, Larry McVoy wrote:
  A few notes to the discussion. I've seen one e1000 bug that ended up being
  a crappy AMD pre-opteron SMP chipset with a totally useless PCI bus
  implementation, which limited performance quite a bit-totally depending on
  what you plugged in and in which slot. 10e milk-and-bread-store 
  32/33 gige nics actually were better than server-class e1000's 
  in those, but weren't that great either.
 
 That could well be my problem, this is a dual processor (not core) athlon
 (not opteron) tyan motherboard if I recall correctly.
If it's AMD760/768MPX, here's some relevant discussion:

http://lkml.org/lkml/2002/7/18/292  
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/1109.html  
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/1154.html  
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.1/1212.html 
http://forums.2cpu.com/showthread.php?s=threadid=31211

 
  Check your interrupt rates for the interface. You shouldn't be getting
  anywhere near 1 interrupt/packet. If you are, something is badly wrong :).
 
 The acks (because I'm sending) are about 1.5 packets/interrupt.
 When this box is receiving it's moving about 3x as much data
 and has a _lower_ (absolute, not per packet) interrupt load.
Probably not a problem then, since those acks probably cover many
sent packets. Current interrupt mitigation schemes are pretty
dynamic, balancing between latency and bulk performance, so the acks
might be fine (thousands vs. tens of thousands/sec).

-- 
Pekka Pietikainen


Re: [PATCH] net: fix race in process_backlog

2007-10-03 Thread Stephen Hemminger
On Wed, 03 Oct 2007 14:58:07 -0700 (PDT)
David Miller [EMAIL PROTECTED] wrote:

 From: Peter Zijlstra [EMAIL PROTECTED]
 Date: Wed, 03 Oct 2007 17:44:53 +0200
 
  Index: linux-2.6/net/core/dev.c
  ===
  --- linux-2.6.orig/net/core/dev.c
  +++ linux-2.6/net/core/dev.c
  @@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
   
  local_irq_disable();
  skb = __skb_dequeue(&queue->input_pkt_queue);
  -   local_irq_enable();
  if (!skb) {
  -   napi_complete(napi);
  +   __napi_complete(napi);
  break;
  }
  +   local_irq_enable();
 
 What re-enables interrupts in the !skb path?

This looks like a better fix. The irq_enable is needed in both cases.

--- a/net/core/dev.c2007-09-27 07:19:10.0 -0700
+++ b/net/core/dev.c2007-10-03 15:03:54.0 -0700
@@ -2077,12 +2077,14 @@ static int process_backlog(struct napi_s
 
 		local_irq_disable();
 		skb = __skb_dequeue(&queue->input_pkt_queue);
-		local_irq_enable();
 		if (!skb) {
-			napi_complete(napi);
+			__napi_complete(napi);
+			local_irq_enable();
 			break;
 		}
 
+		local_irq_enable();
+
 		dev = skb->dev;
 
 		netif_receive_skb(skb);
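
To spell out the invariant, here is a small userspace model of the patched
loop. It is a sketch only: the model_* names and the three globals are
stand-ins for the interrupt flag, the NAPI scheduled state and
input_pkt_queue, and the loop omits the jiffies check.

```c
#include <assert.h>
#include <stdbool.h>

bool irqs_enabled = true;       /* stand-in for the CPU interrupt flag */
bool poll_scheduled = true;     /* stand-in for the NAPI scheduled state */
int backlog = 0;                /* stand-in for input_pkt_queue length */

void model_irq_disable(void) { irqs_enabled = false; }
void model_irq_enable(void)  { irqs_enabled = true; }

/* Models __napi_complete(): the caller must already have interrupts off. */
void model_napi_complete(void)
{
    assert(!irqs_enabled);
    poll_scheduled = false;
}

/* Models the patched loop; returns the number of packets processed. */
int model_process_backlog(int quota)
{
    int work = 0;

    while (work < quota) {
        model_irq_disable();
        if (backlog == 0) {
            /* Empty check and completion share one irq-off section. */
            model_napi_complete();
            model_irq_enable();
            break;
        }
        backlog--;              /* models __skb_dequeue() */
        model_irq_enable();
        work++;                 /* models netif_receive_skb() */
    }
    return work;
}
```

Whichever way the loop exits, interrupts are enabled again, and the queue
cannot gain a packet between the empty check and the completion.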


-- 
Stephen Hemminger [EMAIL PROTECTED]


Re: [PATCH] net: fix race in process_backlog

2007-10-03 Thread David Miller
From: Peter Zijlstra [EMAIL PROTECTED]
Date: Wed, 03 Oct 2007 17:44:53 +0200

 Index: linux-2.6/net/core/dev.c
 ===
 --- linux-2.6.orig/net/core/dev.c
 +++ linux-2.6/net/core/dev.c
 @@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s
  
   local_irq_disable();
   skb = __skb_dequeue(&queue->input_pkt_queue);
 - local_irq_enable();
   if (!skb) {
 - napi_complete(napi);
 + __napi_complete(napi);
   break;
   }
 + local_irq_enable();

What re-enables interrupts in the !skb path?


Re: Please pull 'upstream-davem' branch of wireless-2.6

2007-10-03 Thread David Miller
From: John W. Linville [EMAIL PROTECTED]
Date: Wed, 3 Oct 2007 10:10:51 -0400

 So I'm not sure what happened for you.  But I think it must have been
 some other anomaly.

Ok, I'll take some detailed notes next time it happens so we can
figure out why :-)


Re: [git patches] net driver updates

2007-10-03 Thread David Miller
From: Jeff Garzik [EMAIL PROTECTED]
Date: Wed, 3 Oct 2007 14:39:16 -0400

 
 Normally I wait a day or two between pushes, to queue up patches and
 also to avoid annoying my upstream :)  But this includes a couple fixes
 I felt should be upstreamed sooner rather than later.
 
 Please pull from 'upstream' branch of
 master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream

Pulled, thanks Jeff!


Re: [PATCH v4] qe: miscellaneous code improvements and fixes to the QE library

2007-10-03 Thread Kumar Gala


On Oct 3, 2007, at 1:00 PM, Timur Tabi wrote:


Stephen Hemminger wrote:

Separate the changes into individual patches to allow for better
comment/review and bisection in case of regression.


That would be too difficult.  Some of the changes are single lines,
and this patch has already been approved -- I just cross-posted to
netdev because I made a few ucc_geth changes that can't be
decoupled from the powerpc changes.  A series of 18 patches would
just be convoluted.


Normally I would agree, but at this point I'm not going to gripe too  
much about it.


- k


[PATCH] r8169: revert part of 6dccd16b7c2703e8bbf8bca62b5cf248332afbe2

2007-10-03 Thread Francois Romieu
The 8169/8110SC currently announces itself as:
[...]
eth0: RTL8169sc/8110sc at 0x, ..:..:..:..:..:.., XID 1800 IRQ ..
 
It uses RTL_GIGA_MAC_VER_05 and this part of the changeset can cut
its performance by a factor of 2~2.5 as reported by Timo.

(the driver includes code just before the hunk to write the ChipCmd
register when mac_version == RTL_GIGA_MAC_VER_0[1-4])

Signed-off-by: Francois Romieu [EMAIL PROTECTED]
Cc: Timo Jantunen [EMAIL PROTECTED]
---
 drivers/net/r8169.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index c921ec3..c76dd29 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1918,7 +1918,11 @@ static void rtl_hw_start_8169(struct net_device *dev)
 
 	rtl_set_rx_max_size(ioaddr);
 
-	rtl_set_rx_tx_config_registers(tp);
+	if ((tp->mac_version == RTL_GIGA_MAC_VER_01) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_02) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_03) ||
+	    (tp->mac_version == RTL_GIGA_MAC_VER_04))
+		rtl_set_rx_tx_config_registers(tp);
 
 	tp->cp_cmd |= rtl_rw_cpluscmd(ioaddr) | PCIMulRW;
 
@@ -1941,6 +1945,14 @@ static void rtl_hw_start_8169(struct net_device *dev)
 
 	rtl_set_rx_tx_desc_registers(tp, ioaddr);
 
+	if ((tp->mac_version != RTL_GIGA_MAC_VER_01) &&
+	    (tp->mac_version != RTL_GIGA_MAC_VER_02) &&
+	    (tp->mac_version != RTL_GIGA_MAC_VER_03) &&
+	    (tp->mac_version != RTL_GIGA_MAC_VER_04)) {
+		RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
+		rtl_set_rx_tx_config_registers(tp);
+	}
+
 	RTL_W8(Cfg9346, Cfg9346_Lock);
 
 	/* Initially a 10 us delay. Turned it into a PCI commit. - FR */
@@ -1955,8 +1967,6 @@ static void rtl_hw_start_8169(struct net_device *dev)
 
 	/* Enable all known interrupts by setting the interrupt mask. */
 	RTL_W16(IntrMask, tp->intr_event);
-
-	RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
 }
 
 static void rtl_hw_start_8168(struct net_device *dev)
-- 
1.5.3.2



Re: [PATCH] net: fix race in process_backlog

2007-10-03 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 3 Oct 2007 15:05:19 -0700

 On Wed, 03 Oct 2007 14:58:07 -0700 (PDT)
 David Miller [EMAIL PROTECTED] wrote:
 
  From: Peter Zijlstra [EMAIL PROTECTED]
  Date: Wed, 03 Oct 2007 17:44:53 +0200
  
   Index: linux-2.6/net/core/dev.c
   ===
   --- linux-2.6.orig/net/core/dev.c
   +++ linux-2.6/net/core/dev.c
   @@ -2095,11 +2095,11 @@ static int process_backlog(struct napi_s

 local_irq_disable();
 skb = __skb_dequeue(&queue->input_pkt_queue);
   - local_irq_enable();
 if (!skb) {
   - napi_complete(napi);
   + __napi_complete(napi);
 break;
 }
   + local_irq_enable();
  
  What re-enables interrupts in the !skb path?
 
 This looks like a better fix. the irq_enable is needed in both cases.

Yep, applied, thanks Peter and Stephen.


Re: [patch 3/3] git-net: sctp build fix (not for applying)

2007-10-03 Thread David Miller
From: Vlad Yasevich [EMAIL PROTECTED]
Date: Wed, 03 Oct 2007 09:50:55 -0400

 [EMAIL PROTECTED] wrote:
  From: Andrew Morton [EMAIL PROTECTED]
  
  net/sctp/sm_statetable.c:551: error: 'sctp_sf_tabort_8_4_8' undeclared here (not in a function)
  
 
 Andrew, is this a result of the merge of net-2.6.24 with net-2.6?

Actually, it is a result of merging with Linus's tree since your SCTP
bits were there already, that's why Andrew hit this.

 That's the only way I see this happening.

Right.

I'll resolve this cleanly as I rebase net-2.6.24 today.


net-2.6.24 rebased

2007-10-03 Thread David Miller

Available as usual at:

kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.24.git

I resolved the SCTP and network driver conflicts that come standard
with a rebase to Linus's current tree.

We're up to 700 changesets and an 8.7 MB patch, w00t!

I've been using it for an hour or so on my workstation so something
works.  It also passes an allmodconfig build on sparc64.

Either later tonight or some time tomorrow I'll start hitting the
patch backlog in my inbox.

Thanks.