Re: [RFC] network namespaces

2006-09-08 Thread Herbert Poetzl
On Thu, Sep 07, 2006 at 12:29:21PM -0600, Eric W. Biederman wrote:
 Daniel Lezcano [EMAIL PROTECTED] writes:
 
  IMHO, I think there is one reason. The unsharing mechanism is
  not only for containers; it also aims at other kinds of isolation, like
  a bsdjail for example. The unshare syscall is flexible; should
  network unsharing be a one-block solution? For example, we want to
  launch an application using TCP/IP and we want to have
  an IP address only used by the application, nothing more.
  With a layer 2, we must after unsharing:
   1) create a virtual device into the application namespace
   2) assign an IP address
   3) create a virtual device pass-through in the root namespace
   4) set the virtual device IP
 
  All this stuff needs a lot of administration (checking for MAC address
  conflicts, interface name collisions in the root namespace, ...)
  for a simple network isolation.
 
 Yes, and even more it is hard to show that it will perform as well.
 Although by dropping CAP_NET_ADMIN the actual runtime administration
 is about the same.
 
  With a layer 3:
   1) assign an IP address
 
  On the other hand, a layer 3 isolation is not sufficient to reach
  the level of isolation/virtualization needed for the system
  containers.
 
 Agreed.
 
  Very soon, I will commit more info at:
 
  http://wiki.openvz.org/Containers/Networking
 
  So the consensus is based on the fact that there is a lot of common
  code for the layer 2 and layer 3 isolation/virtualization and we can
  find a way to merge the 2 implementations in order to have a flexible
  network virtualization/isolation.
 
 NACK. In a real layer 3 implementation there is very little common
 code with a layer 2 implementation. You don't need to muck with the
 socket handling code as you are not allowed to dup addresses between
 containers. Look at what Serge did; that is layer 3.

 A layer 3 isolation implementation should either be a new security
 module or a new form of iptables. The problem with using the lsm is
 that it seems to be an all or nothing mechanism so is a very coarse
 grained tool for this job.

IMHO LSM was never an option for that, because it is
a) very complicated to use it for that purpose
b) missing many hooks you definitely need to make this work
c) is not really efficient and/or performant

with something 'like' iptables, this could be done, but
I'm not sure that is the best approach either ...

best,
Herbert

 A layer 2 implementation (where you have network devices isolated and
 not sockets) should be a namespace.
 
 Eric
 ___
 Containers mailing list
 [EMAIL PROTECTED]
 https://lists.osdl.org/mailman/listinfo/containers
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ProxyARP and IPSec

2006-09-08 Thread Thomas Graf
* H. Peter Anvin [EMAIL PROTECTED] 2006-09-07 15:28
 Thomas Graf wrote:
 What about adding a blackhole device to be used for such routes?
 I believe it would be good architecture to always use devices
 to state directions packets are being received from and sent to.
 
 The dummy device can be used for that.

I was thinking a bit beyond that, a device similar to ifb but
without a hard header to allow classification. Packets get
enqueued before skb->dst gets overwritten with dst->child and
dequeue handles the second half of xfrm4_output_finish2().


Re: RFC/T: Possible fix for bcm43xx periodic work bug

2006-09-08 Thread Erik Mouw
On Thu, Sep 07, 2006 at 01:17:05PM -0500, Larry Finger wrote:
 I think I have a fix for the bcm43xx bug that leads to NETDEV WATCHDOG tx
 timeouts and would like it to get as much testing as possible as this bug
 affects V2.6.18-rcX. If the problem is truly fixed, I hope to get the fix
 into mainline before release of the bug into the stable series.

FWIW, I finally tracked down the bug that hangs my laptop to the
bcm43xx driver. At first I got the impression it was the cpufreq code
(which has been flaky in early 2.6.18-rc kernels), but after disabling
that my laptop still crashed. After that I disabled preempt because I got
a couple of lockdep warnings when I had it enabled. That didn't make a
difference, laptop still hangs after some time (runs a couple of hours
at most). Right now I'm on wired network and my laptop finally doesn't
hang anymore (up for two days).

The hang is very hard to trigger (i.e.: it happens at random, I see no
pattern) and locks up the machine completely. I've tried to capture
kernel messages through serial console, but that doesn't work (lock up
before any messages are printed).

This is with any 2.6.18-rc kernel without additional patches or
proprietary modules, 2.6.17 works ok.

Output from lspci:

0000:06:00.0 Network controller: Broadcom Corporation BCM4318 [AirForce One 54g] 802.11g Wireless LAN Controller (rev 02)
	Subsystem: Linksys WPC54G-EU version 3 [Wireless-G Notebook Adapter]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 64
	Interrupt: pin A routed to IRQ 10
	Region 0: Memory at 36000000 (32-bit, non-prefetchable) [size=8K]
00: e4 14 18 43 06 00 00 00 02 00 80 02 00 40 00 00
10: 00 00 00 36 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 37 17 48 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00

Is the patch you just posted supposed to fix the kind of problems I
have, or do I have to look elsewhere?


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

 No, you do a chain handler. Look at how I do it in
 arch/powerpc/platform/pseries/setup.c for example. It's actually
 trivial. You install a special flow handler (which means that there is
 very little overhead, almost none, from the toplevel irq to the chained
 irq). You can _also_ if you want just install an IRQ handler for the
 cascaded controller and call generic_handle_irq (rather than __do_IRQ)
 from it, but that has more overhead. A chained handler completely
 replaces the flow handler for the cascade, and thus, if you don't need
 all of the nits and bits of the other flow handlers for your cascade,
 you can speed things up by hooking at that level.

Please update Documentation/DocBook/genericirq.tmpl.  That doesn't mention it.

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread Benjamin Herrenschmidt
On Fri, 2006-09-08 at 11:25 +0100, David Howells wrote:
 Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:
 
  No, you do a chain handler. Look at how I do it in
  arch/powerpc/platform/pseries/setup.c for example. It's actually
  trivial. You install a special flow handler (which means that there is
  very little overhead, almost none, from the toplevel irq to the chained
  irq). You can _also_ if you want just install an IRQ handler for the
  cascaded controller and call generic_handle_irq (rather than __do_IRQ)
  from it, but that has more overhead. A chained handler completely
  replaces the flow handler for the cascade, and thus, if you don't need
  all of the nits and bits of the other flow handlers for your cascade,
  you can speed things up by hooking at that level.
 
 Please update Documentation/DocBook/genericirq.tmpl.  That doesn't mention it.

I must admit I haven't read the documentation :) I looked at the
code/patches when genirq was posted and did my powerpc implementation
based on my understanding of the code and discussions with Thomas and
Ingo. I'll have a look at the doc next week and see if I can improve it.

Cheers,
Ben.




Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

  Please update Documentation/DocBook/genericirq.tmpl.  That doesn't mention it.
 
 I must admit I haven't read the documentation :) I looked at the
 code/patches when genirq was posted and did my powerpc implementation
 based on my understanding of the code and discussions with Thomas and
 Ingo. I'll have a look at the doc next week and see if I can improve it.

While you're at it, you should also encomment pseries_8259_cascade() which is
what I suspect you're referring to in the powerpc sources.

David


Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread David Howells
Benjamin Herrenschmidt [EMAIL PROTECTED] wrote:

   Now, if you have funky cascades, then you can always group them into a
   virtual irq cascade line and have a special chained flow handler that
   does all the figuring out of those... it's up to you. 
  
  You make it sound so easy, but it's not obvious how to do this, apart from
  installing interrupt handlers for the auxiliary PIC interrupts on the CPU and
  having those call back into __do_IRQ().  Chaining isn't mentioned in
  genericirq.tmpl.
 
 No, you do a chain handler. Look at how I do it in
 arch/powerpc/platform/pseries/setup.c for example. It's actually
 trivial. You install a special flow handler (which means that there is
 very little overhead, almost none, from the toplevel irq to the chained
 irq). You can _also_ if you want just install an IRQ handler for the
 cascaded controller and call generic_handle_irq (rather than __do_IRQ)
 from it, but that has more overhead. A chained handler completely
 replaces the flow handler for the cascade, and thus, if you don't need
 all of the nits and bits of the other flow handlers for your cascade,
 you can speed things up by hooking at that level.

But funky cascading using chained flow handlers doesn't work if the cascade
must share an IRQ with some other device, right?
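
For readers following the chaining discussion, here is a rough user-space
sketch of the idea, not kernel code. Every demo_* name, the irq numbers and
the pending bitmask are made up for illustration; only the notion of a
per-line flow handler comes from the thread. The cascade line gets its own
flow handler, which demultiplexes the child controller directly and never
consults the line's device action.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_IRQS     16
#define DEMO_CASCADE_IRQ 2   /* line the child PIC is wired to (assumed) */
#define DEMO_CHILD_BASE  8   /* first irq number served by the child PIC (assumed) */

struct demo_irq_desc;
typedef void (*demo_flow_handler_t)(unsigned int irq, struct demo_irq_desc *desc);

/* Hypothetical, heavily simplified stand-in for an irq descriptor. */
struct demo_irq_desc {
	demo_flow_handler_t handle_irq;      /* flow handler for this line */
	void (*action)(unsigned int irq);    /* device handler, if any */
	unsigned int count;                  /* how often the line fired */
};

static struct demo_irq_desc demo_irqs[DEMO_NR_IRQS];
static unsigned int demo_child_pending;  /* bitmask latched by the fake child PIC */
static unsigned int demo_nic_hits;

/* Ordinary flow handler: account the interrupt and run the device action. */
static void demo_handle_simple(unsigned int irq, struct demo_irq_desc *desc)
{
	desc->count++;
	if (desc->action)
		desc->action(irq);
}

/* Chained flow handler: replaces the ordinary one on the cascade line.
 * It ignores desc->action and forwards to the child lines' flow handlers. */
static void demo_handle_cascade(unsigned int irq, struct demo_irq_desc *desc)
{
	unsigned int child;

	(void)irq;
	desc->count++;
	for (child = DEMO_CHILD_BASE; child < DEMO_NR_IRQS; child++) {
		if (demo_child_pending & (1u << child)) {
			demo_child_pending &= ~(1u << child);
			demo_irqs[child].handle_irq(child, &demo_irqs[child]);
		}
	}
}

/* Top-level entry: one indirect call into the line's flow handler. */
static void demo_dispatch(unsigned int irq)
{
	assert(irq < DEMO_NR_IRQS);
	demo_irqs[irq].handle_irq(irq, &demo_irqs[irq]);
}

static void demo_nic_action(unsigned int irq)
{
	(void)irq;
	demo_nic_hits++;
}

/* Install the simple handler everywhere, then chain the cascade line. */
static void demo_setup(void)
{
	unsigned int i;

	for (i = 0; i < DEMO_NR_IRQS; i++) {
		demo_irqs[i].handle_irq = demo_handle_simple;
		demo_irqs[i].action = NULL;
		demo_irqs[i].count = 0;
	}
	demo_irqs[DEMO_CASCADE_IRQ].handle_irq = demo_handle_cascade;
	demo_irqs[9].action = demo_nic_action;
	demo_child_pending = 0;
	demo_nic_hits = 0;
}
```

In this toy model an action registered on the cascade line itself would
never run, because demo_handle_cascade() does not look at it. That is the
sharing concern raised in the question; whether the real genirq code behaves
the same way is for the thread to settle.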

David


[PATCH 1/1] cleanup unnecessary forcedeth printk

2006-09-08 Thread Andy Gospodarek

This removes unnecessary messages that show up every time I put my
ethernet card in promiscuous mode.  I'm already getting notification
from the networking layer, I don't need notification from the driver as
well.

There are probably other drivers that do this as well -- I'll look
around and see what I can find.

Signed-off-by: Andy Gospodarek [EMAIL PROTECTED]
---

 forcedeth.c |1 -
 1 files changed, 1 deletion(-)

diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2032,7 +2032,6 @@ static void nv_set_multicast(struct net_
 	memset(mask, 0, sizeof(mask));
 
 	if (dev->flags & IFF_PROMISC) {
-		printk(KERN_NOTICE "%s: Promiscuous mode enabled.\n", dev->name);
 		pff |= NVREG_PFF_PROMISC;
 	} else {
 		pff |= NVREG_PFF_MYADDR;


Re: [Devel] Re: [RFC] network namespaces

2006-09-08 Thread Dmitry Mishin
On Thursday 07 September 2006 21:27, Herbert Poetzl wrote:
 well, who said that you need to have things like RAW sockets
 or other protocols except IP, not to speak of iptable and
 routing entries ...

 folks who _want_ full network virtualization can use the
 more complete virtual setup and be happy ...
Let's think about how to implement this.
As I understand VServer's design, your proposal is to split CAP_NET_ADMIN into
multiple capabilities and use them as required. So, for your light-weight
container it is enough to implement context isolation for the code protected
by a CAP_NET_IP capability (for example) and to put 'if (!capable(CAP_NET_*))'
checks in all other places. But this could easily be implemented on top of the
OpenVZ code by splitting CAP_VE_NET_ADMIN.

So, the question is:
Could you point out the places in Andrey's implementation of network
namespaces that would prevent you from adding CAP_NET_ADMIN separation later?
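
A minimal user-space sketch of what such a capability split could look like,
purely as an illustration. The DEMO_CAP_NET_* bits and the demo_* functions
are hypothetical; of the names in this thread only CAP_NET_ADMIN is an
existing kernel capability, and the real capable() check works on the
current task, not on an explicit context argument as assumed here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical fine-grained pieces of what CAP_NET_ADMIN covers today. */
#define DEMO_CAP_NET_IP     (1u << 0)  /* configure own IP addresses */
#define DEMO_CAP_NET_ROUTE  (1u << 1)  /* edit routing entries */
#define DEMO_CAP_NET_FILTER (1u << 2)  /* edit packet filter rules */
#define DEMO_CAP_NET_RAW    (1u << 3)  /* open raw sockets */
#define DEMO_CAP_NET_ALL    (DEMO_CAP_NET_IP | DEMO_CAP_NET_ROUTE | \
			     DEMO_CAP_NET_FILTER | DEMO_CAP_NET_RAW)

/* Hypothetical per-container context carrying its network capability mask. */
struct demo_ctx {
	uint32_t net_caps;
};

/* Returns non-zero only if every requested bit is present in the context. */
static int demo_capable(const struct demo_ctx *ctx, uint32_t cap)
{
	assert(ctx != 0);
	return (ctx->net_caps & cap) == cap;
}

/* Each operation checks only the narrow capability it needs. */
static int demo_set_ip(const struct demo_ctx *ctx)
{
	if (!demo_capable(ctx, DEMO_CAP_NET_IP))
		return -EPERM;
	return 0;
}

static int demo_add_route(const struct demo_ctx *ctx)
{
	if (!demo_capable(ctx, DEMO_CAP_NET_ROUTE))
		return -EPERM;
	return 0;
}

static int demo_open_raw(const struct demo_ctx *ctx)
{
	if (!demo_capable(ctx, DEMO_CAP_NET_RAW))
		return -EPERM;
	return 0;
}
```

A light-weight container would then carry only DEMO_CAP_NET_IP, while a
fully virtualized one would carry DEMO_CAP_NET_ALL, and the same checks
would serve both.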

-- 
Thanks,
Dmitry.


Re: [RFC] Alternate WE-21 support (core API)

2006-09-08 Thread John W. Linville
On Wed, Sep 06, 2006 at 02:30:53PM -0700, Jean Tourrilhes wrote:
 On Wed, Sep 06, 2006 at 04:55:44PM -0400, John W. Linville wrote:

  + * V20 to V21
  + * --
  + * - Remove (struct net_device *)->get_wireless_stats()
  + * - Change length in ESSID and NICK to strlen() instead of strlen()+1
  + * - Add IW_RETRY_SHORT/IW_RETRY_LONG retry modifiers
  + * - Add explicit flag to tell stats are in 802.11k RCPI : IW_QUAL_RCPI
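
To make the ESSID/NICK length item in the changelog concrete, here is a tiny
sketch of the two conventions. The function names and DEMO_ESSID_MAX are
illustrative assumptions and are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

#define DEMO_ESSID_MAX 32  /* assumed maximum ESSID size, for illustration */

/* Old convention (up to V20): reported length counts the trailing NUL. */
static size_t demo_essid_len_v20(const char *essid)
{
	return strlen(essid) + 1;
}

/* New convention (V21): reported length is the string length only. */
static size_t demo_essid_len_v21(const char *essid)
{
	size_t len = strlen(essid);

	assert(len <= DEMO_ESSID_MAX);
	return len;
}
```

For the ESSID "home" the old convention reports 5 and the new one reports 4,
so code on either side of the interface has to agree on which one is in use.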
 
   Personally, I would also add this :
 
 + *   - Power/Retry relative values no longer * 10

   Three reasons :
   1) It's a cleanup and does not add any new feature
   2) It does not change the rest of the patches
   3) Userspace part has already gone in distro, not
 including this bit would mean breaking userspace.

Is there any code that corresponds to that?  Or does the comment
simply indicate policy?

   The other bits can be included at a later time ;-)

Well, maybe.  But, I think we should now consider WE to be in
maintenance mode.  I think nl80211 is the future, as long as Johannes
delivers.

Thanks,

John
-- 
John W. Linville
[EMAIL PROTECTED]


Re: e100 fails, eepro100 works

2006-09-08 Thread Auke Kok

Jan Kiszka wrote:

Hi,

we have a couple of industrial PCs here with Intel PRO/100 controllers
on board. Most of them work fine with the e100, but today I stumbled
over one box that doesn't: Reception works (RX counter increases, ARP
cache gets filled up), but transmission fails (TX counter is also zero).


Please include the output of `ifconfig ethX` and `ethtool -S ethX` after
attempting to transmit using the device.



In contrast, the eepro100 is fine, also Etherboot's driver.

I'm currently on 2.6.17.8, but I don't see any changes up to latest git
that may have positive influence. This is what lspci -v tells about this
piece of hardware:

00:12.0 Ethernet controller: Intel Corporation 8255xER/82551IT Fast
Ethernet Controller (rev 08)
Subsystem: Intel Corporation: Unknown device 1229
Flags: bus master, medium devsel, latency 66, IRQ 10
Memory at fc02 (32-bit, non-prefetchable) [size=4K]
I/O ports at 1080 [size=64]
Memory at fc00 (32-bit, non-prefetchable) [size=128K]
Capabilities: [dc] Power Management version 2

And here is the kernel log of e100 with highest debug level when sending
out a few pings while other packets arrive on the network:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
PCI: Found IRQ 10 for device :00:12.0
e100: eth0: e100_probe: addr 0xfc02, irq 10, MAC addr 00:30:59:01:07:A7
e100: eth0: e100_watchdog: right now = 35470
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 35970
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 36470
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 36970
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 37470
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 37970
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_watchdog: right now = 38470
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 38970
e100: eth0: e100_intr: stat_ack = 0x04

I may find the time one day to debug this at lower levels, but you could
accelerate this process with any pointer where to dig deeper precisely.


Can you include a full `dmesg` and `lspci -vv -s 00:12.0`?

Also you're using 3.5.10-k2, can you try the current git tree version instead?
I can send you the e100.c if wanted.


Auke


Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Michael Chan
Copying netdev.

Benjamin Herrenschmidt wrote:
 Hi !
 
 I've been chasing with Segher a data corruption problem lately.
 Basically transferring huge amount of data (several Gb) and I get
 corrupted data at the rx side. I cannot tell for sure whether what I've
 been observing here is the same problem that Segher's been seeing on his
 blades; he will confirm or not. He also seemed to imply that reverting
 to an older kernel on the -receiver- side fixed it, which makes me
 wonder, since it really looks like a sending side problem (see
 explanation below), if some change in, for example, window scaling,
 might hide or trigger it. 

Please send me lspci and tg3 probing output so that I know what
tg3 hardware you're using.  I also want to look at the tcpdump or
ethereal on the mirrored port that shows the packet being corrupted.

 
 Now, first, I've been playing with ssh from /dev/zero on one machine
 to /dev/zero on the other. That allowed me to run enough tests all over
 the place to have some idea of where the problem comes from since ssh
 will choke at decryption when hitting the corruption.
 
 The base setup where it happens often is 2 Quad G5's connected to a
 gigabit switch. Both were running some versions of 2.6.18-rc4 and -rc5
 (some random git actually, but see below as I've reproduced the problem
 with today's git snapshot which includes the TG3 tx race fix among
 others).
 
 I have reproduced with various machines as the receiver. A sungem in a
 Dual G5 and a virtual ethernet in a Power5 partition (so the packets go
 to an e1000 then routed through an AIX IO server to a virtual
 ethernet :) are good examples of variety :) I haven't tested with
 non-PowerPC machines so far. I've also never been able to reproduce with
 TSO disabled on the emitting TG3's.
 
 Then, I've hacked tridge's socklib test program (a simple TCP server
 that pushes a known buffer and a simple TCP receiver that connects to it
 and reads the data). I've added comparison of the data with what they
 are supposed to be on the receiving end. The interesting thing is that
 it is much faster than ssh or whatever else I tried. ssh or rsync between
 those 2 Quad G5s give me about 35Mb/sec while I get to 107Mb/sec average
 with the small test program.
 
 The fun thing is, I've not been able to reproduce at all that way. When
 the link is pretty much saturated, the problem doesn't occur !
 
 As soon as I introduce a small delay (some crap waiting loop) in the
 sender to slow down the throughput to about 80Mb/sec, then the problem
 starts occurring every now and then (I don't have precise frequency data
 but I get a corruption every couple of gigabytes I'd say).
 
 As for my previous tests, disabling TSO on the sending side fixes it.
 
 Below is a dump of what the corruption looks like. I've trimmed the
 beginning and end of the dumped packet (the receiver does 8k reads). The
 0x5a are the expected data, the rest is corruption. They look like
 kernel pointers, but that isn't always the case (often though but that
 might not be relevant). The size and position within the buffer of the
 corrupted data is variable (doesn't seem to be specifically a page or
 anything nice and round like that).
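
The receiver-side comparison described here can be sketched roughly as
follows. This is not the modified socklib code, which is not shown in the
thread; the demo_* names are made up, and the only facts taken from the
report are the 0x5a fill byte and the 8k read size used in the usage note.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PATTERN 0x5a  /* fill byte the sender is reported to use */

/* Location of the first stretch of unexpected bytes in a receive buffer. */
struct demo_corrupt_run {
	size_t offset;  /* index of the first byte that is not DEMO_PATTERN */
	size_t len;     /* number of consecutive non-pattern bytes from there */
};

/* Sender side: fill a buffer with the known pattern. */
static void demo_fill_pattern(unsigned char *buf, size_t n)
{
	memset(buf, DEMO_PATTERN, n);
}

/* Receiver side: return 1 and fill *run if the buffer deviates from the
 * pattern, 0 if every byte matches. Only the first run is reported. */
static int demo_find_corruption(const unsigned char *buf, size_t n,
				struct demo_corrupt_run *run)
{
	size_t start = 0, end;

	assert(buf != NULL && run != NULL);
	while (start < n && buf[start] == DEMO_PATTERN)
		start++;
	if (start == n)
		return 0;

	end = start;
	while (end < n && buf[end] != DEMO_PATTERN)
		end++;

	run->offset = start;
	run->len = end - start;
	return 1;
}
```

Calling demo_find_corruption() on each 8k read gives the offset and size of
the damaged region. Note that corrupt data which happens to contain a 0x5a
byte would end the reported run early, so the length is a lower bound.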
 
 I've configured the switch to send all the traffic between the two
 machines to a 3rd box and then recorded it with tcpdump (the spy uses
 an e1000) and I can see the corrupted data in the recorded
 traces (the exact same pattern as detected by the receiver). So it seems
 very likely at this point that the corruption happens on the sending
 side. The TCP checksums are correct I assume. I don't see any error
 count on the receiving tg3 nor suspicious message in dmesg indicating
 they aren't.
 
 That's all the data I have at this point. I can't guarantee 100% that
 it's a TSO bug (it might be a bug that TSO renders visible due to timing
 effects) but it looks like it since I've not reproduced yet with TSO
 disabled. I'll do an overnight test to confirm that though... sometimes
 the bug can take its time to show up ... I've seen it wait 20Gb before
 it kicked in. Also the fact that fully loading the machine never
 produced it is strange; it smells like a race.
 
 Cheers,
 Ben.
 
 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 5a 5a 5a 5a 5a 5a 5a 5a 00 00 00 00 00 00 00 00
 2f 63 70 75 73 00 7f 7e c0 00 00 00 01 cb 70 82
 00 00 00 04 bf 1d db 4d c0 00 00 00 01 cb 92 00
 c0 00 00 01 7b fe 6d 98 c0 00 00 00 01 cb 70 91
 00 00 00 04 df 5d fe fd c0 00 00 00 01 cb 92 10
 c0 00 00 01 7b fe 6d b8 c0 00 00 00 01 cb 71 0e
 00 00 00 04 fe e2 fb cf c0 00 00 00 01 cb 92 20
 c0 00 00 01 7b fe 6d d8 c0 00 00 00 01 cb 71 1f
 00 00 00 04 73 69 ed ff c0 00 00 00 01 cb 92 30
 c0 00 00 01 7b fe 6d f8 c0 00 00 00 01 

Re: e1000_xmit_frame and e1000_down racing with next_to_use?

2006-09-08 Thread Auke Kok

Shaw Vrana wrote:

On Wed, 6 Sep 2006 10:58:15 -0700 (PDT)
[EMAIL PROTECTED] wrote:


Hello All,

I have a question about the use of the tx_ring->next_to_use variable in
the e1000.  Specifically, I'm wondering about a race between the use of
next_to_use in e1000_xmit_frame and the clearing of next_to_use in
e1000_down via e1000_clean_tx_ring.

Thread 1 (_xmit) -  first = adapter->tx_ring.next_to_use;
                    e1000_tx_map();
Thread 2 (_down) -  e1000_clean_tx_ring();
                    tx_ring->next_to_use = 0;
Thread 1 (_xmit) -  e1000_tx_queue();

It seems that tx_ring.next_to_use could change between the time the skbuff
is mapped in e1000_tx_map and the time it is reported to the hardware in
e1000_tx_queue.  While I don't see any memory leaks or possible oops, it
does seem possible that an skbuff could be lost in the ring as it
will not be queued in the subsequent e1000_tx_queue.

If the race is possible, perhaps this could be the culprit behind the tx
timeouts we've seen reported in this list?  The watchdog will eventually
find the lost skbuff and mistakenly think that the hardware transmit has
hung and stop the queue.

Could one of you please tell me how this race is avoided, if indeed it is?

Thanks,
Shaw


e1000_down calls netif_stop_queue() and that stops transmit requests.
It doesn't handle the case of a transmit in flight during the e1000_down.

Shouldn't clean_tx_ring acquire tx_ring->tx_lock to avoid that?
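
A small user-space model of that suggestion, for illustration only. The
demo_* names and the C11 atomic_flag spinlock are stand-ins, not the e1000
code, and the sketch assumes the transmit path holds the same lock across
its map and queue steps, as the question implies.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_RING_SIZE 8

/* Hypothetical, stripped-down tx ring: just the index and a lock. */
struct demo_ring {
	atomic_flag lock;          /* stands in for tx_lock */
	unsigned int next_to_use;  /* next free descriptor slot */
	unsigned int queued;       /* descriptors handed to the "hardware" */
	int slots[DEMO_RING_SIZE];
};

static void demo_lock(struct demo_ring *r)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
		;
}

static void demo_unlock(struct demo_ring *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

static void demo_ring_init(struct demo_ring *r)
{
	unsigned int i;

	atomic_flag_clear(&r->lock);
	r->next_to_use = 0;
	r->queued = 0;
	for (i = 0; i < DEMO_RING_SIZE; i++)
		r->slots[i] = 0;
}

/* Transmit: read the index, fill the slot and advance, all under the lock,
 * so a concurrent clean cannot reset next_to_use between the two steps. */
static void demo_xmit(struct demo_ring *r, int pkt)
{
	unsigned int first;

	assert(pkt != 0);
	demo_lock(r);
	first = r->next_to_use;                          /* "map" step */
	r->slots[first] = pkt;
	r->next_to_use = (first + 1) % DEMO_RING_SIZE;   /* "queue" step */
	r->queued++;
	demo_unlock(r);
}

/* Clean: takes the same lock before resetting the ring. */
static void demo_clean(struct demo_ring *r)
{
	unsigned int i;

	demo_lock(r);
	for (i = 0; i < DEMO_RING_SIZE; i++)
		r->slots[i] = 0;
	r->next_to_use = 0;
	r->queued = 0;
	demo_unlock(r);
}
```

With both paths serialized, a transmit either completes entirely before the
reset or starts from slot 0 after it. The sketch says nothing about
interrupt context or about which lock variant the driver should use.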


Hi Stephen,

Yes, holding the adapter->tx_lock is all that is needed. e1000_irq_disable
has been called prior to e1000_clean_tx_ring or the interrupt has never
been enabled (e1000_open), so a simple spin_lock should suffice. I've
included a patch against Garzik's netdev git tree.

Thanks,
Shaw



Shaw,

Thanks for the patch. We're looking into this and indeed it appears to be a 
race, but as the commit message says - it covers only a transmit in flight in 
case of e1000_down.


I doubt this will fix the majority of tx hang reports that we see or have seen, 
which happen under normal operation (i.e. when the device is not going down at 
all).


This is an extreme corner case and I doubt it would even show up in normal use.

The patch probably won't affect performance, which is the good part.

I'll give it a test.

Auke




Protect against the race to modify tx_ring->next_to_use in the case of a
transmit in flight during e1000_down.

Signed-off-by: Shaw Vrana [EMAIL PROTECTED]
---

 drivers/net/e1000/e1000_main.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 726f43d..b327976 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -1937,6 +1937,8 @@ e1000_clean_tx_ring(struct e1000_adapter
 	unsigned long size;
 	unsigned int i;
 
+	spin_lock(&tx_ring->tx_lock);
+
 	/* Free all the Tx ring sk_buffs */
 
 	for (i = 0; i < tx_ring->count; i++) {
@@ -1957,6 +1959,8 @@ e1000_clean_tx_ring(struct e1000_adapter
 
 	writel(0, adapter->hw.hw_addr + tx_ring->tdh);
 	writel(0, adapter->hw.hw_addr + tx_ring->tdt);
+
+	spin_unlock(&tx_ring->tx_lock);
 }
 
 /**



Re: [RFC] Alternate WE-21 support (core API)

2006-09-08 Thread Jean Tourrilhes
On Fri, Sep 08, 2006 at 10:29:23AM -0400, John W. Linville wrote:
 On Wed, Sep 06, 2006 at 02:30:53PM -0700, Jean Tourrilhes wrote:
  On Wed, Sep 06, 2006 at 04:55:44PM -0400, John W. Linville wrote:
 
   + * V20 to V21
   + * --
   + *   - Remove (struct net_device *)->get_wireless_stats()
   + *   - Change length in ESSID and NICK to strlen() instead of strlen()+1
   + *   - Add IW_RETRY_SHORT/IW_RETRY_LONG retry modifiers
   + *   - Add explicit flag to tell stats are in 802.11k RCPI : IW_QUAL_RCPI
  
  Personally, I would also add this :
  
  + * - Power/Retry relative values no longer * 10
 
  Three reasons :
  1) It's a cleanup and does not add any new feature
  2) It does not change the rest of the patches
  3) Userspace part has already gone in distro, not
  including this bit would mean breaking userspace.
 
 Is there any code that corresponds to that?  Or does the comment
 simply indicate policy?

There is no code in the core of the WE, so it only indicates
policy. But, I believe policy changes need to be documented.
On the other hand you will find code in the tiacx patch.

  The other bits can be included at a later time ;-)
 
 Well, maybe.  But, I think we should now consider WE to be in
 maintenance mode.  I think nl80211 is the future, as long as Johannes
 delivers.

We'll see. WE has been supposed to be replaced any time soon
for the last 3 years. And I don't believe nl80211 will address legacy
drivers and non-802.11 hardware.

 Thanks,
 
 John

Thanks, and have fun...

Jean


IPSec broken in 2.6.18-rc4-mm3

2006-09-08 Thread Gnome42 Gnome42

Hi Folks,

(please CC me ...)

IPSec got broken in 2.6.18-rc4-mm3+, 2.6.18-rc4-mm2 works and
2.6.18-rc5 also works.

The tunnel looks like it's established correctly in the racoon logs and
the traffic is encrypted on the wire. However, the other side does not
decrypt the traffic; it just seems to disappear.

I have confirmed this problem exists between two linux boxen and a
Netopia router as well.

The git-net.patch increased in size by about 50% between
2.6.18-rc4-mm2 and 2.6.18-rc4-mm3 (likely suspect?), but I was unable
to simply patch -R it cleanly.

Suggestions?

Shane


[PATCH 7/7] secid reconciliation-v02: Enforcement for SELinux

2006-09-08 Thread Venkat Yekkirala

This defines SELinux enforcement of the 2 new LSM hooks.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
---
security/selinux/hooks.c|  125 --
security/selinux/include/xfrm.h |5 +
security/selinux/ss/mls.c   |2 
security/selinux/ss/services.c  |2 
security/selinux/xfrm.c |   28 ++

5 files changed, 136 insertions(+), 26 deletions(-)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 5a66c4c..044e452 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3449,8 +3449,12 @@ static int selinux_sock_rcv_skb_compat(s
 
 		err = avc_has_perm(sock_sid, port_sid,
 				   sock_class, recv_perm, ad);
+		if (err)
+			goto out;
 	}
 
+	err = selinux_xfrm_sock_rcv_skb(sock_sid, skb, ad);
+
 out:
 	return err;
 }
@@ -3489,10 +3493,6 @@ static int selinux_socket_sock_rcv_skb(s
 		goto out;
 
 	err = selinux_netlbl_sock_rcv_skb(sksec, skb, &ad);
-	if (err)
-		goto out;
-
-	err = selinux_xfrm_sock_rcv_skb(sksec->sid, skb, &ad);
 out:
 	return err;
 }
@@ -3626,13 +3626,16 @@ static int selinux_inet_conn_request(str
 		return 0;
 	}
 
-	err = selinux_xfrm_decode_session(skb, &peersid, 0);
-	BUG_ON(err);
+	if (selinux_compat_net) {
+		err = selinux_xfrm_decode_session(skb, &peersid, 0);
+		BUG_ON(err);
 
-	if (peersid == SECSID_NULL) {
-		req->secid = sksec->sid;
-		return 0;
-	}
+		if (peersid == SECSID_NULL) {
+			req->secid = sksec->sid;
+			return 0;
+		}
+	} else
+		peersid = skb->secmark;
 
 	err = security_sid_mls_copy(sksec->sid, peersid, &newsid);
 	if (err)
@@ -3662,6 +3665,78 @@ static void selinux_req_classify_flow(co
 	fl->secid = req->secid;
 }
 
+static int selinux_skb_policy_check(struct sk_buff *skb, unsigned short family)
+{
+	u32 xfrm_sid, trans_sid;
+	int err;
+
+	if (selinux_compat_net)
+		return 1;
+
+	err = selinux_xfrm_decode_session(skb, &xfrm_sid, 0);
+	BUG_ON(err);
+
+	err = avc_has_perm(xfrm_sid, skb->secmark, SECCLASS_PACKET,
+					PACKET__FLOW_IN, NULL);
+	if (err)
+		goto out;
+
+	if (xfrm_sid) {
+		err = security_transition_sid(xfrm_sid, skb->secmark,
+						SECCLASS_PACKET, &trans_sid);
+		if (err)
+			goto out;
+
+		skb->secmark = trans_sid;
+	}
+
+	/* See if CIPSO can flow in thru the current secmark here */
+
+out:
+	return err ? 0 : 1;
+};
+
+static int selinux_skb_netfilter_check(struct sk_buff *skb, u32 nf_secid)
+{
+	u32 xfrm_sid;
+	u32 trans_sid;
+	int err;
+
+	if (selinux_compat_net)
+		return 1;
+
+	if (!skb->secmark && skb->sk) {
+		struct sk_security_struct *sksec = skb->sk->sk_security;
+		skb->secmark = sksec->sid;
+	}
+
+	selinux_skb_xfrm_sid(skb, &xfrm_sid);
+
+	err = avc_has_perm(skb->secmark, xfrm_sid, SECCLASS_PACKET,
+				PACKET__FLOW_OUT, NULL);
+
+	if (err)
+		goto out;
+
+	if (xfrm_sid) {
+		err = security_transition_sid(xfrm_sid, skb->secmark,
+						SECCLASS_PACKET, &trans_sid);
+		if (err)
+			goto out;
+
+		skb->secmark = trans_sid;
+	}
+
+	err = avc_has_perm(skb->secmark, nf_secid, SECCLASS_PACKET,
+				PACKET__FLOW_OUT, NULL);
+
+out:
+	/* Signal postroute_last that we are done with this skb */
+	skb->secmark = SECSID_WILD;
+
+	return err ? 0 : 1;
+}
+
 static int selinux_nlmsg_perm(struct sock *sk, struct sk_buff *skb)
 {
 	int err = 0;
@@ -3700,7 +3775,8 @@ out:

#ifdef CONFIG_NETFILTER

-static int selinux_ip_postroute_last_compat(struct sock *sk, struct net_device *dev,
+static int selinux_ip_postroute_last_compat(struct sock *sk, struct sk_buff *skb,
+   struct net_device *dev,
struct avc_audit_data *ad,
u16 family, char *addrp, int len)
{
@@ -3710,6 +3786,9 @@ static int selinux_ip_postroute_last_com
struct inode *inode;
struct inode_security_struct *isec;

+   if (!sk)
+   goto out;
+
sock = sk-sk_socket;
if (!sock)
goto out;
@@ -3768,7 +3847,11 @@ static int selinux_ip_postroute_last_com

err = avc_has_perm(isec-sid, port_sid, isec-sclass,
   send_perm, ad);
+   if (err)
+  

[PATCH 4/7] secid reconciliation-v02: Invoke LSM hook for outbound traffic

2006-09-08 Thread Venkat Yekkirala

Invoke the skb_netfilter_check LSM hook for outbound (OUTPUT/FORWARD)
traffic for secid reconciliation and flow control.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
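
The description above amounts to "set the secmark on inbound hooks, filter on it at POST_ROUTING". A minimal userspace sketch of that dispatch follows. It is not the kernel code: the enum names, secmark_target_model() and the same_label_only() policy are hypothetical stand-ins, and the only convention taken from the patch is that the outbound check returns 1 to allow and 0 to deny.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the netfilter hook numbers and verdicts. */
enum hook_model { HOOK_LOCAL_IN, HOOK_FORWARD, HOOK_POST_ROUTING };
enum verdict_model { VERDICT_CONTINUE, VERDICT_DROP };

/* Outbound check callback: returns 1 to allow, 0 to deny (patch convention). */
typedef int (*out_check_fn)(uint32_t *skb_secmark, uint32_t rule_secmark);

static enum verdict_model secmark_target_model(uint32_t *skb_secmark,
					       uint32_t rule_secmark,
					       enum hook_model hook,
					       out_check_fn out_check)
{
	/* A zero rule label means there is nothing to set or to filter on. */
	if (!rule_secmark)
		return VERDICT_CONTINUE;

	if (hook == HOOK_POST_ROUTING) {
		/* Outbound: the rule label is checked, not applied. */
		if (!out_check(skb_secmark, rule_secmark))
			return VERDICT_DROP;
	} else if (*skb_secmark != rule_secmark) {
		/* Inbound: the rule label is applied to the packet. */
		*skb_secmark = rule_secmark;
	}
	return VERDICT_CONTINUE;
}

/* Toy policy (made up): a packet may leave only if its label equals the rule's. */
static int same_label_only(uint32_t *skb_secmark, uint32_t rule_secmark)
{
	return *skb_secmark == rule_secmark;
}
```

In this model a rule attached to an inbound hook only relabels, while the same rule at POST_ROUTING can drop the packet, which is why the patch restricts the targets to specific hooks.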
---
net/netfilter/xt_CONNSECMARK.c |   44 ++-
net/netfilter/xt_SECMARK.c |   20 --
2 files changed, 50 insertions(+), 14 deletions(-)

diff --git a/net/netfilter/xt_CONNSECMARK.c b/net/netfilter/xt_CONNSECMARK.c
index 4673862..a79bd20 100644
--- a/net/netfilter/xt_CONNSECMARK.c
+++ b/net/netfilter/xt_CONNSECMARK.c
@@ -17,6 +17,8 @@
 */
#include <linux/module.h>
#include <linux/skbuff.h>
+#include <linux/security.h>
+#include <linux/netfilter_ipv6.h>
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter/xt_CONNSECMARK.h>
#include <net/netfilter/nf_conntrack_compat.h>
@@ -47,20 +49,32 @@ static void secmark_save(struct sk_buff 
}


/*
- * If packet has no security mark, and the connection does, restore the
- * security mark from the connection to the packet.
+ * On the inbound, restore the security mark from the connection to the packet.
+ * On the outbound, filter based on the current secmark.
 */
-static void secmark_restore(struct sk_buff *skb)
+static unsigned int secmark_restore(struct sk_buff *skb, unsigned int hooknum,
+				    const struct xt_target *target)
{
-	if (!skb->secmark) {
-		u32 *connsecmark;
-		enum ip_conntrack_info ctinfo;
+	u32 *psecmark;
+	u32 secmark = 0;
+	enum ip_conntrack_info ctinfo;

-		connsecmark = nf_ct_get_secmark(skb, &ctinfo);
-		if (connsecmark && *connsecmark)
-			if (skb->secmark != *connsecmark)
-				skb->secmark = *connsecmark;
-	}
+	psecmark = nf_ct_get_secmark(skb, &ctinfo);
+	if (psecmark)
+		secmark = *psecmark;
+
+	if (!secmark)
+		return XT_CONTINUE;
+
+	/* Set secmark on inbound and filter it on outbound */
+	if (hooknum == NF_IP_POST_ROUTING || hooknum == NF_IP6_POST_ROUTING) {
+		if (!security_skb_netfilter_check(skb, secmark))
+			return NF_DROP;
+	} else
+		if (skb->secmark != secmark)
+			skb->secmark = secmark;
+
+	return XT_CONTINUE;
}

static unsigned int target(struct sk_buff **pskb, const struct net_device *in,
@@ -77,7 +91,7 @@ static unsigned int target(struct sk_buf
break;

case CONNSECMARK_RESTORE:
-   secmark_restore(skb);
+   return secmark_restore(skb, hooknum, target);
break;

default:
@@ -114,6 +128,9 @@ static struct xt_target xt_connsecmark_t
.target = target,
.targetsize = sizeof(struct xt_connsecmark_target_info),
.table  = "mangle",
+   .hooks  = (1 << NF_IP_LOCAL_IN) |
+             (1 << NF_IP_FORWARD) |
+             (1 << NF_IP_POST_ROUTING),
.me = THIS_MODULE,
},
{
@@ -123,6 +140,9 @@ static struct xt_target xt_connsecmark_t
.target = target,
.targetsize = sizeof(struct xt_connsecmark_target_info),
.table  = "mangle",
+   .hooks  = (1 << NF_IP6_LOCAL_IN) |
+             (1 << NF_IP6_FORWARD) |
+             (1 << NF_IP6_POST_ROUTING),
.me = THIS_MODULE,
},
};
diff --git a/net/netfilter/xt_SECMARK.c b/net/netfilter/xt_SECMARK.c
index add7521..de1de45 100644
--- a/net/netfilter/xt_SECMARK.c
+++ b/net/netfilter/xt_SECMARK.c
@@ -15,8 +15,10 @@
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/selinux.h>
+#include <linux/security.h>
#include <linux/netfilter/x_tables.h>
#include <linux/netfilter/xt_SECMARK.h>
+#include <linux/netfilter_ipv6.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("James Morris [EMAIL PROTECTED]");
@@ -47,8 +49,16 @@ static unsigned int target(struct sk_buf
BUG();
}

-	if ((*pskb)->secmark != secmark)
-		(*pskb)->secmark = secmark;
+	if (!secmark)
+		return XT_CONTINUE;
+
+	/* Set secmark on inbound and filter it on outbound */
+	if (hooknum == NF_IP_POST_ROUTING || hooknum == NF_IP6_POST_ROUTING) {
+		if (!security_skb_netfilter_check(*pskb, secmark))
+			return NF_DROP;
+	} else
+		if ((*pskb)->secmark != secmark)
+			(*pskb)->secmark = secmark;

return XT_CONTINUE;
}
@@ -119,6 +129,9 @@ static struct xt_target xt_secmark_targe
.target = target,
.targetsize = sizeof(struct xt_secmark_target_info),
.table  = "mangle",
+   .hooks  = (1 << NF_IP_LOCAL_IN) |
+             (1 << NF_IP_FORWARD) |
+

[PATCH 2/7] secid reconciliation-v02: Add LSM hooks

2006-09-08 Thread Venkat Yekkirala

Add skb_policy_check and skb_netfilter_check hooks to LSM to enable
reconciliation of the various security identifiers as well as enforce
flow control on inbound (INPUT/FORWARD) and outbound (OUTPUT/FORWARD)
traffic.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
---
include/linux/security.h |   32 
security/dummy.c |   13 +
2 files changed, 45 insertions(+)

diff --git a/include/linux/security.h b/include/linux/security.h
index 9f56fb8..032cede 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -828,6 +828,12 @@ #ifdef CONFIG_SECURITY
 *  Sets the new child socket's sid to the openreq sid.
 * @req_classify_flow:
 *  Sets the flow's sid to the openreq sid.
+ * @skb_policy_check:
+ * Checks to see if security policy would allow skb into the system.
+ * Returns 1 if skb allowed into system, 0 otherwise.
+ * @skb_netfilter_check:
+ * Checks to see if security policy would allow skb to go out of system.
+ * Returns 1 if skb allowed out of system, 0 otherwise.
 *
 * Security hooks for XFRM operations.
 *
@@ -1372,6 +1378,8 @@ #ifdef CONFIG_SECURITY_NETWORK
struct request_sock *req);
void (*inet_csk_clone)(struct sock *newsk, const struct request_sock *req);
void (*req_classify_flow)(const struct request_sock *req, struct flowi *fl);
+   int (*skb_policy_check)(struct sk_buff *skb, unsigned short family);
+   int (*skb_netfilter_check)(struct sk_buff *skb, u32 nf_secid);
#endif  /* CONFIG_SECURITY_NETWORK */

#ifdef CONFIG_SECURITY_NETWORK_XFRM
@@ -2946,6 +2954,18 @@ static inline void security_req_classify
security_ops->req_classify_flow(req, fl);
}

+static inline int security_skb_policy_check(struct sk_buff *skb,
+					    unsigned short family)
+{
+	return security_ops->skb_policy_check(skb, family);
+}
+
+static inline int security_skb_netfilter_check(struct sk_buff *skb,
+					       u32 nf_secid)
+{
+	return security_ops->skb_netfilter_check(skb, nf_secid);
+}
+
static inline void security_sock_graft(struct sock* sk, struct socket *parent)
{
security_ops->sock_graft(sk, parent);
@@ -3097,6 +3117,18 @@ static inline void security_req_classify
{
}

+static inline int security_skb_policy_check(struct sk_buff *skb,
+   unsigned short family)
+{
+   return 1;
+}
+
+static inline int security_skb_netfilter_check(struct sk_buff *skb,
+   u32 nf_secid)
+{
+   return 1;
+}
+
static inline void security_sock_graft(struct sock* sk, struct socket *parent)
{
}
diff --git a/security/dummy.c b/security/dummy.c
index aeee705..077d3c9 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -832,6 +832,17 @@ static inline void dummy_req_classify_fl
struct flowi *fl)
{
}
+
+static inline int dummy_skb_policy_check(struct sk_buff *skb,
+   unsigned short family)
+{
+   return 1;
+}
+
+static inline int dummy_skb_netfilter_check(struct sk_buff *skb, u32 nf_secid)
+{
+   return 1;
+}
#endif  /* CONFIG_SECURITY_NETWORK */

#ifdef CONFIG_SECURITY_NETWORK_XFRM
@@ -1108,6 +1119,8 @@ #ifdef CONFIG_SECURITY_NETWORK
set_to_dummy_if_null(ops, inet_conn_request);
set_to_dummy_if_null(ops, inet_csk_clone);
set_to_dummy_if_null(ops, req_classify_flow);
+   set_to_dummy_if_null(ops, skb_policy_check);
+   set_to_dummy_if_null(ops, skb_netfilter_check);
 #endif /* CONFIG_SECURITY_NETWORK */
#ifdef  CONFIG_SECURITY_NETWORK_XFRM
set_to_dummy_if_null(ops, xfrm_policy_alloc_security);
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
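
As a rough illustration of the hook plumbing this patch adds, the userspace sketch below models an operations table whose unset entries fall back to permissive defaults, which is the role set_to_dummy_if_null() plays in security/dummy.c. The struct layouts, fixup_ops() and deny_odd_secmark() are hypothetical stand-ins written for this example; only the "1 means allowed, 0 means denied" convention comes from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only the field used here. */
struct sk_buff_model {
	uint32_t secmark;
};

/* Hypothetical, cut-down model of the two hooks added to the ops table. */
struct security_ops_model {
	int (*skb_policy_check)(struct sk_buff_model *skb, unsigned short family);
	int (*skb_netfilter_check)(struct sk_buff_model *skb, uint32_t nf_secid);
};

/* Permissive defaults: 1 means "allowed", matching the patch's convention. */
static int dummy_skb_policy_check(struct sk_buff_model *skb, unsigned short family)
{
	(void)skb;
	(void)family;
	return 1;
}

static int dummy_skb_netfilter_check(struct sk_buff_model *skb, uint32_t nf_secid)
{
	(void)skb;
	(void)nf_secid;
	return 1;
}

/* Same idea as set_to_dummy_if_null(): fill any hook a module left unset. */
static void fixup_ops(struct security_ops_model *ops)
{
	if (!ops->skb_policy_check)
		ops->skb_policy_check = dummy_skb_policy_check;
	if (!ops->skb_netfilter_check)
		ops->skb_netfilter_check = dummy_skb_netfilter_check;
}

/* Example module hook (made up): refuse packets whose secmark is odd. */
static int deny_odd_secmark(struct sk_buff_model *skb, unsigned short family)
{
	(void)family;
	return (skb->secmark & 1) ? 0 : 1;
}
```

A module that registers only skb_policy_check still gets a callable skb_netfilter_check after fixup_ops(), so callers never have to test the pointers for NULL.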


[PATCH 6/7] secid reconciliation-v02: Label locally generated IPv4 traffic

2006-09-08 Thread Venkat Yekkirala

This labels the skb(s) for locally generated IPv4 traffic. This will
be reconciled with xfrm secid as well as used in pertinent flow control
checks on the outbound later in the LSM hook.

This is not as pretty as it is for IPv6, but what to do?
Note that skb(s) that derive the secmark from the originating socket
do so in the outbound hook.

NOTE: Forwarded traffic is already labeled with the reconciled
secmark on the inbound.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
---
include/net/ip.h   |   32 
include/net/request_sock.h |   17 +
net/dccp/ipv4.c|5 +
net/ipv4/icmp.c|4 
net/ipv4/ip_output.c   |6 ++
net/ipv4/tcp_ipv4.c|1 +
6 files changed, 65 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 98f9084..4646c13 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -48,6 +48,9 @@ struct ipcm_cookie
u32 addr;
int oif;
struct ip_options   *opt;
+#ifdef CONFIG_SECURITY_NETWORK
+   __u32   secid;
+#endif /* CONFIG_SECURITY_NETWORK */
};

#define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb))
@@ -383,4 +386,33 @@ #endif

extern struct ctl_table ipv4_table[];

+#ifdef CONFIG_SECURITY_NETWORK
+
+static inline void security_skb_classify_ipcm(struct sk_buff *skb,
+					      struct ipcm_cookie *ipc)
+{
+	ipc->secid = 0;
+	ipc->secid = skb->secmark;
+}
+
+static inline void security_ipcm_classify_skb(struct ipcm_cookie *ipc,
+					      struct sk_buff *skb)
+{
+	skb->secmark = ipc->secid;
+}
+
+#else
+
+static inline void security_skb_classify_ipcm(struct sk_buff *skb,
+   struct ipcm_cookie *ipc)
+{
+}
+
+static inline void security_ipcm_classify_skb(struct ipcm_cookie *ipc,
+   struct sk_buff *skb)
+{
+}
+
+#endif /* CONFIG_SECURITY_NETWORK */
+
#endif  /* _IP_H */
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index 8e165ca..bba8dba 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -259,4 +259,21 @@ static inline void reqsk_queue_hash_req(
write_unlock(&queue->syn_wait_lock);
}

+#ifdef CONFIG_SECURITY_NETWORK
+
+static inline void security_req_classify_skb(struct request_sock *req,
+					     struct sk_buff *skb)
+{
+	skb->secmark = req->secid;
+}
+
+#else
+
+static inline void security_req_classify_skb(struct request_sock *req,
+   struct sk_buff *skb)
+{
+}
+
+#endif /* CONFIG_SECURITY_NETWORK */
+
#endif /* _REQUEST_SOCK_H */
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 9a1a76a..526835e 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -233,6 +233,8 @@ static void dccp_v4_reqsk_send_ack(struc
dccp_hdr_set_ack(dccp_hdr_ack_bits(skb),
 DCCP_SKB_CB(rxskb)-dccpd_seq);

+   security_req_classify_skb(req, skb);
+
bh_lock_sock(dccp_v4_ctl_socket-sk);
err = ip_build_and_send_pkt(skb, dccp_v4_ctl_socket-sk,
rxskb-nh.iph-daddr,
@@ -264,6 +266,7 @@ static int dccp_v4_send_response(struct 
		dh-dccph_checksum = dccp_v4_checksum(skb, ireq-loc_addr,

  ireq-rmt_addr);
memset((IPCB(skb)-opt), 0, sizeof(IPCB(skb)-opt));
+   security_req_classify_skb(req, skb);
err = ip_build_and_send_pkt(skb, sk, ireq-loc_addr,
ireq-rmt_addr,
ireq-opt);
@@ -746,6 +749,8 @@ static void dccp_v4_ctl_send_reset(struc
dh-dccph_checksum = dccp_v4_checksum(skb, rxskb-nh.iph-saddr,
  rxskb-nh.iph-daddr);

+   security_skb_classify_skb(rxskb, skb);
+
bh_lock_sock(dccp_v4_ctl_socket-sk);
err = ip_build_and_send_pkt(skb, dccp_v4_ctl_socket-sk,
rxskb-nh.iph-daddr,
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index c2ad07e..956791a 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -389,6 +389,8 @@ static void icmp_reply(struct icmp_bxm *
if (icmp_xmit_lock())
return;

+   security_skb_classify_ipcm(skb, ipc);
+
icmp_param-data.icmph.checksum = 0;
icmp_out_count(icmp_param-data.icmph.type);

@@ -507,6 +509,8 @@ void icmp_send(struct sk_buff *skb_in, i
if (icmp_xmit_lock())
return;

+   security_skb_classify_ipcm(skb_in, ipc);
+
/*
 *  Construct source address and options.
 */
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 97aee76..2e0775c 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -926,6 +926,8 @@ alloc_new_skb:
if 

[PATCH 1/7] secid reconciliation-v02

2006-09-08 Thread Venkat Yekkirala

Currently a packet accumulates multiple security identifiers, each of a
different class, as it enters/leaves the system. This patch set reconciles these
identifiers into a single identifier while also allowing LSM (SELinux is
addressed in this patch set) to impose flow control checks based on the
identifiers.

The reconciliation steps for SELinux are explained in the Labeled Networking
document at:
http://marc.theaimsgroup.com/?l=linux-netdev&m=115136637800361&w=2
with the change that SELinux transition rules are used when available
to arrive at the new secid.

The following are the identifiers handled here:

1. secmark on the skb
2. xfrm security identifier associated with the skb if it used any xfrms,
a zero secid otherwise.

This patch: Add new flask definitions to SELinux

Adds a new avperm flow_in to arbitrate among the identifiers on the
inbound (input/forward). Also adds a new avperm flow_out to enable flow
control checks on the outbound (output/forward), addressed in this patch
as well.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
---
security/selinux/include/av_perm_to_string.h |2 ++
security/selinux/include/av_permissions.h|2 ++
2 files changed, 4 insertions(+)

diff --git a/security/selinux/include/av_perm_to_string.h b/security/selinux/include/av_perm_to_string.h
index 09fc8a2..1e65d28 100644
--- a/security/selinux/include/av_perm_to_string.h
+++ b/security/selinux/include/av_perm_to_string.h
@@ -245,6 +245,8 @@
   S_(SECCLASS_PACKET, PACKET__SEND, "send")
   S_(SECCLASS_PACKET, PACKET__RECV, "recv")
   S_(SECCLASS_PACKET, PACKET__RELABELTO, "relabelto")
+   S_(SECCLASS_PACKET, PACKET__FLOW_IN, "flow_in")
+   S_(SECCLASS_PACKET, PACKET__FLOW_OUT, "flow_out")
   S_(SECCLASS_KEY, KEY__VIEW, "view")
   S_(SECCLASS_KEY, KEY__READ, "read")
   S_(SECCLASS_KEY, KEY__WRITE, "write")
diff --git a/security/selinux/include/av_permissions.h b/security/selinux/include/av_permissions.h
index 81f4f52..2faf3d8 100644
--- a/security/selinux/include/av_permissions.h
+++ b/security/selinux/include/av_permissions.h
@@ -962,6 +962,8 @@ #define APPLETALK_SOCKET__NAME_BIND 
#define PACKET__SEND  0x0001UL
#define PACKET__RECV  0x0002UL
#define PACKET__RELABELTO 0x0004UL
+#define PACKET__FLOW_IN   0x0008UL
+#define PACKET__FLOW_OUT  0x0010UL

#define KEY__VIEW 0x0001UL
#define KEY__READ 0x0002UL
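
The new flow_in and flow_out permissions are ordinary bits in the packet class's access vector. The sketch below shows, in heavily simplified form, how a requested permission can be tested against a vector of allowed bits. The bit values are the ones listed in this patch; perms_allowed() is a hypothetical helper for illustration, and the real AVC lookup (caching, auditing, policy queries) is considerably more involved.

```c
/* Bit values as listed for the packet class in this patch. */
#define PACKET__SEND      0x0001UL
#define PACKET__RECV      0x0002UL
#define PACKET__RELABELTO 0x0004UL
#define PACKET__FLOW_IN   0x0008UL
#define PACKET__FLOW_OUT  0x0010UL

/*
 * Hypothetical helper: returns 1 if every requested bit is present in
 * the allowed vector, 0 otherwise.
 */
static int perms_allowed(unsigned long allowed, unsigned long requested)
{
	return (allowed & requested) == requested;
}
```

Because each permission occupies its own bit, granting flow_in says nothing about flow_out, so policy can allow a label to enter the system without allowing it to leave.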


[PATCH 3/7] secid reconciliation-v02: Invoke LSM hook for inbound traffic

2006-09-08 Thread Venkat Yekkirala

Invoke the skb_policy_check LSM hook for inbound (INPUT/FORWARD)
traffic for secid reconciliation and flow control.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
---
include/net/xfrm.h |   50 +++
1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index bf8e2df..7b020bd 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -663,22 +663,20 @@ extern int __xfrm_policy_check(struct so

static inline int xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, unsigned short family)
{
-	if (sk && sk->sk_policy[XFRM_POLICY_IN])
-		return __xfrm_policy_check(sk, dir, skb, family);
-
-	return	(!xfrm_policy_count[dir] && !skb->sp) ||
-		(skb->dst->flags & DST_NOPOLICY) ||
-		__xfrm_policy_check(sk, dir, skb, family);
-}
-
-static inline int xfrm4_policy_check(struct sock *sk, int dir, struct sk_buff *skb)
-{
-	return xfrm_policy_check(sk, dir, skb, AF_INET);
-}
+	int ret;

-static inline int xfrm6_policy_check(struct sock *sk, int dir, struct sk_buff *skb)
-{
-	return xfrm_policy_check(sk, dir, skb, AF_INET6);
+	if (sk && sk->sk_policy[XFRM_POLICY_IN])
+		ret = __xfrm_policy_check(sk, dir, skb, family);
+	else
+		ret = (!xfrm_policy_count[dir] && !skb->sp) ||
+		      (skb->dst->flags & DST_NOPOLICY) ||
+		      __xfrm_policy_check(sk, dir, skb, family);
+
+#ifdef CONFIG_SECURITY_NETWORK
+	if (ret)
+		ret = security_skb_policy_check(skb, family);
+#endif /* CONFIG_SECURITY_NETWORK */
+	return ret;
}

extern int xfrm_decode_session(struct sk_buff *skb, struct flowi *fl, unsigned short family);
@@ -730,20 +728,26 @@ static inline void xfrm_sk_free_policy(s
static inline int xfrm_sk_clone_policy(struct sock *sk) { return 0; }
static inline int xfrm6_route_forward(struct sk_buff *skb) { return 1; }  
static inline int xfrm4_route_forward(struct sk_buff *skb) { return 1; } 
-static inline int xfrm6_policy_check(struct sock *sk, int dir, struct sk_buff *skb)
-{
-	return 1;
-}
-static inline int xfrm4_policy_check(struct sock *sk, int dir, struct sk_buff *skb)
-{
-	return 1;
-}
static inline int xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, unsigned short family)
{
+#ifdef CONFIG_SECURITY_NETWORK
+	return security_skb_policy_check(skb, family);
+#else
return 1;
+#endif /* CONFIG_SECURITY_NETWORK */
}
#endif

+static inline int xfrm4_policy_check(struct sock *sk, int dir, struct sk_buff *skb)
+{
+	return xfrm_policy_check(sk, dir, skb, AF_INET);
+}
+
+static inline int xfrm6_policy_check(struct sock *sk, int dir, struct sk_buff *skb)
+{
+	return xfrm_policy_check(sk, dir, skb, AF_INET6);
+}
+
static __inline__
xfrm_address_t *xfrm_flowi_daddr(struct flowi *fl, unsigned short family)
{
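
The effect of this patch on xfrm_policy_check() is a short-circuit chain: the LSM hook is consulted only when the xfrm policy check has already passed. A userspace sketch of that ordering follows. The callback type and the helper functions are hypothetical stand-ins for illustration; both stages use the patch's convention of 1 for allow and 0 for deny.

```c
/* Hypothetical stand-in for either check stage; 1 = allow, 0 = deny. */
typedef int (*check_fn)(unsigned int secmark);

/* Counts how often the LSM stage is actually reached. */
static int lsm_calls;

static int allow_all(unsigned int secmark)
{
	(void)secmark;
	return 1;
}

static int deny_all(unsigned int secmark)
{
	(void)secmark;
	return 0;
}

static int counting_lsm(unsigned int secmark)
{
	(void)secmark;
	lsm_calls++;
	return 1;
}

/* Mirrors the patched control flow: xfrm first, LSM only if xfrm allowed. */
static int chained_policy_check(check_fn xfrm_check, check_fn lsm_check,
				unsigned int secmark)
{
	int ret = xfrm_check(secmark);

	if (ret)
		ret = lsm_check(secmark);
	return ret;
}
```

One consequence of this ordering is that a packet rejected by xfrm policy never reaches the LSM hook, so the hook cannot observe or relabel it.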


[PATCH 5/7] secid reconciliation-v02: Label locally generated IPv6 traffic

2006-09-08 Thread Venkat Yekkirala

This labels the skb(s) for locally generated IPv6 traffic. This will
be reconciled with xfrm secid as well as used in pertinent flow control
checks on the outbound later in the LSM hook.

NOTE: Forwarded traffic is already labeled with the reconciled
secmark on the inbound.

Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED]
---
include/linux/skbuff.h   |   29 +
net/ipv6/ip6_output.c|5 +
net/ipv6/netfilter/ip6t_REJECT.c |2 ++
3 files changed, 36 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 85577a4..18967f2 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -29,6 +29,7 @@ #include <linux/net.h>
#include <linux/textsearch.h>
#include <net/checksum.h>
#include <linux/dmaengine.h>
+#include <net/flow.h>

#define HAVE_ALLOC_SKB  /* For the drivers to know */
#define HAVE_ALIGNABLE_SKB  /* Ditto 8)*/
@@ -1499,5 +1500,33 @@ static inline int skb_is_gso(const struc
return skb_shinfo(skb)->gso_size;
}

+#ifdef CONFIG_SECURITY_NETWORK
+
+static inline void security_skb_classify_skb(struct sk_buff *from,
+					     struct sk_buff *skb)
+{
+	skb->secmark = from->secmark;
+}
+
+static inline void security_flow_classify_skb(struct flowi *fl,
+					      struct sk_buff *skb)
+{
+	skb->secmark = fl->secid;
+}
+
+#else
+
+static inline void security_skb_classify_skb(struct sk_buff *from,
+   struct sk_buff *skb)
+{
+}
+
+static inline void security_flow_classify_skb(struct flowi *fl,
+   struct sk_buff *skb)
+{
+}
+
+#endif /* CONFIG_SECURITY_NETWORK */
+
#endif  /* __KERNEL__ */
#endif  /* _LINUX_SKBUFF_H */
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c14ea1e..753e3b4 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -170,6 +170,8 @@ int ip6_xmit(struct sock *sk, struct sk_
int hlimit, tclass;
u32 mtu;

+   security_flow_classify_skb(fl, skb);
+
if (opt) {
int head_room;

@@ -1088,6 +1090,9 @@ alloc_new_skb:
}
if (skb == NULL)
goto error;
+
+   security_flow_classify_skb(fl, skb);
+
/*
 *  Fill in the control structures
 */
diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c
index 311eae8..0508c30 100644
--- a/net/ipv6/netfilter/ip6t_REJECT.c
+++ b/net/ipv6/netfilter/ip6t_REJECT.c
@@ -128,6 +128,8 @@ static void send_reset(struct sk_buff *o
ipv6_addr_copy(&ip6h->saddr, &oip6h->daddr);
ipv6_addr_copy(&ip6h->daddr, &oip6h->saddr);

+   security_skb_classify_skb(oldskb, nskb);
+
tcph = (struct tcphdr *)skb_put(nskb, sizeof(struct tcphdr));
/* Truncate to length (no data) */
tcph->doff = sizeof(struct tcphdr)/4;


[PATCH 0/7] secid reconciliation-v02: Repost patchset with updates

2006-09-08 Thread Venkat Yekkirala

The following are the changes included in this patchset since the previous post:

- Perform flow_in check before (as opposed to after) computing transition
 secid on inbound; this seems more intuitive and correct.
- Implement reconciliation and flow control for outbound traffic
 (forward case being a sequence of inbound checks followed by outbound checks).
- Make selinux_xfrm_postroute_last checks conditional on compat_net.

This patchset is relative to David Miller's net-2.6.19.git (last updated on Sep 1st).

Please consider for inclusion in 2.6.19.


UPCOMING WORK:

The following per the discussion at:
 http://marc.theaimsgroup.com/?l=selinux&m=115755980516072&w=2

- Create IPSec SAs to be acquired with the creating sock's context as opposed
 to that of the matching SPD rule, resulting in a simpler SPD as well as policy.
- Set peer_sid on tcp sockets to the reconciled secmark so trusted applications
 can retrieve and service the data at the appropriate context.
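
The first change listed in this posting (flow_in is checked before the transition secid is computed) can be sketched in userspace as follows. This is a model under stated assumptions, not the SELinux code: perm_fn and transition_fn are hypothetical callbacks standing in for avc_has_perm() and security_transition_sid(), both returning 0 on success, and the toy policy functions are made up for the example.

```c
#include <stdint.h>

/* Hypothetical callbacks; both return 0 on success, non-zero on refusal. */
typedef int (*perm_fn)(uint32_t ssid, uint32_t tsid);
typedef int (*transition_fn)(uint32_t ssid, uint32_t tsid, uint32_t *out);

/* Returns 1 if the packet may enter, 0 otherwise; may rewrite *secmark. */
static int reconcile_inbound(uint32_t xfrm_sid, uint32_t *secmark,
			     perm_fn flow_in, transition_fn transition)
{
	uint32_t trans_sid;

	/* flow_in is checked first, against the not yet reconciled secmark. */
	if (flow_in(xfrm_sid, *secmark))
		return 0;

	/* Only traffic that used an xfrm carries a non-zero xfrm_sid. */
	if (xfrm_sid) {
		if (transition(xfrm_sid, *secmark, &trans_sid))
			return 0;
		*secmark = trans_sid;
	}
	return 1;
}

/* Toy policy (made up): label 99 is never allowed in. */
static int toy_flow_in(uint32_t ssid, uint32_t tsid)
{
	(void)ssid;
	return tsid == 99 ? -1 : 0;
}

/* Toy transition rule (made up): derive a new label from both inputs. */
static int toy_transition(uint32_t ssid, uint32_t tsid, uint32_t *out)
{
	*out = ssid * 100 + tsid;
	return 0;
}
```

With the check placed first, a refused packet keeps its original secmark, and the flow_in decision is made on the labels the packet arrived with rather than on a derived one.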


Re: [Devel] Re: [RFC] network namespaces

2006-09-08 Thread Herbert Poetzl
On Fri, Sep 08, 2006 at 05:10:08PM +0400, Dmitry Mishin wrote:
 On Thursday 07 September 2006 21:27, Herbert Poetzl wrote:
  well, who said that you need to have things like RAW sockets
  or other protocols except IP, not to speak of iptable and
  routing entries ...
 
  folks who _want_ full network virtualization can use the
  more complete virtual setup and be happy ...

 Let's think about how to implement this.

 As I understood VServer's design, your proposal is to split
 CAP_NET_ADMIN to multiple capabilities and use them if required. So,
 for your light-weight container it is enough to implement context
 isolation for protected by CAP_NET_IP capability (for example) code
 and put 'if (!capable(CAP_NET_*))' checks to all other places. 

actually the light-weight ip isolation runs perfectly
fine _without_ CAP_NET_ADMIN, as you do not want the
guest to be able to mess with the 'configured' ips at
all (not to speak of interfaces here)

best,
Herbert

 But this could be easily implemented over OpenVZ code by
 CAP_VE_NET_ADMIN split.
 
 So, the question is:
 Could you point out the places in Andrey's implementation of network
 namespaces, which prevents you to add CAP_NET_ADMIN separation later?
 
 -- 
 Thanks,
 Dmitry.


[PATCH 1/2] ethtool: allow const ethtool_ops

2006-09-08 Thread Stephen Hemminger
The ethtool_ops structure is immutable: it is expected to be set up
by the driver and is never changed afterwards. This patch allows drivers
to declare their ethtool_ops structure read-only.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]
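
To show what the change buys a driver, here is a small userspace sketch of the same pattern: a device holds a pointer to a const operations table, and the driver defines that table as a static const object. All names here (ops_model, dev_model, model_get_link, query_link) are hypothetical and much smaller than the real struct ethtool_ops.

```c
#include <stddef.h>

struct dev_model;

/* Hypothetical, cut-down ops table; the real one has many more hooks. */
struct ops_model {
	unsigned int (*get_link)(struct dev_model *dev);
	unsigned int (*get_tx_csum)(struct dev_model *dev);
};

struct dev_model {
	/* Pointer to const: core code can call hooks but not modify them. */
	const struct ops_model *ops;
};

static unsigned int model_get_link(struct dev_model *dev)
{
	(void)dev;
	return 1;
}

/* const lets the toolchain place the table in read-only data. */
static const struct ops_model my_driver_ops = {
	.get_link = model_get_link,
};

/* Core-side helper: treats a missing table or hook as "not supported". */
static int query_link(struct dev_model *dev, unsigned int *out)
{
	const struct ops_model *ops = dev->ops;

	if (!ops || !ops->get_link)
		return -1;
	*out = ops->get_link(dev);
	return 0;
}
```

Every local that holds the table pointer has to become pointer-to-const as well, which is why the patch touches each ethtool_* helper in net/core/ethtool.c.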


--- linux-2.6.orig/include/linux/netdevice.h
+++ linux-2.6/include/linux/netdevice.h
@@ -342,7 +342,7 @@ struct net_device
/* Instance data managed by the core of Wireless Extensions. */
struct iw_public_data * wireless_data;
 
-   struct ethtool_ops *ethtool_ops;
+   const struct ethtool_ops *ethtool_ops;
 
/*
 * This marks the end of the visible part of the structure. All
--- linux-2.6.orig/net/core/ethtool.c
+++ linux-2.6/net/core/ethtool.c
@@ -143,7 +143,7 @@ static int ethtool_set_settings(struct n
 static int ethtool_get_drvinfo(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_drvinfo info;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
 
if (!ops->get_drvinfo)
return -EOPNOTSUPP;
@@ -169,7 +169,7 @@ static int ethtool_get_drvinfo(struct ne
 static int ethtool_get_regs(struct net_device *dev, char __user *useraddr)
 {
struct ethtool_regs regs;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
void *regbuf;
int reglen, ret;
 
@@ -282,7 +282,7 @@ static int ethtool_get_link(struct net_d
 static int ethtool_get_eeprom(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_eeprom eeprom;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
u8 *data;
int ret;
 
@@ -327,7 +327,7 @@ static int ethtool_get_eeprom(struct net
 static int ethtool_set_eeprom(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_eeprom eeprom;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
u8 *data;
int ret;
 
@@ -640,7 +640,7 @@ static int ethtool_set_gso(struct net_de
 static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
 {
struct ethtool_test test;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
u64 *data;
int ret;
 
@@ -673,7 +673,7 @@ static int ethtool_self_test(struct net_
 static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_gstrings gstrings;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
u8 *data;
int ret;
 
@@ -733,7 +733,7 @@ static int ethtool_phys_id(struct net_de
 static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
 {
struct ethtool_stats stats;
-   struct ethtool_ops *ops = dev->ethtool_ops;
+   const struct ethtool_ops *ops = dev->ethtool_ops;
u64 *data;
int ret;
 


[PATCH 2/2] loopback: minor statistics optimization

2006-09-08 Thread Stephen Hemminger
Minor loopback enhancements for 2.6.19

The loopback device status structure is a singleton and doesn't
need to be allocated. Add ethtool_ops hooks to show checksum always on,
and make ethtool_ops const.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]


--- linux-2.6.orig/drivers/net/loopback.c
+++ linux-2.6/drivers/net/loopback.c
@@ -161,15 +161,13 @@ static int loopback_xmit(struct sk_buff 
return(0);
 }
 
+static struct net_device_stats loopback_stats;
+
 static struct net_device_stats *get_stats(struct net_device *dev)
 {
-   struct net_device_stats *stats = dev->priv;
+   struct net_device_stats *stats = &loopback_stats;
int i;
 
-   if (!stats) {
-   return NULL;
-   }
-
memset(stats, 0, sizeof(struct net_device_stats));
 
for_each_possible_cpu(i) {
@@ -185,19 +183,28 @@ static struct net_device_stats *get_stat
return stats;
 }
 
-static u32 loopback_get_link(struct net_device *dev)
+static u32 always_on(struct net_device *dev)
 {
return 1;
 }
 
-static struct ethtool_ops loopback_ethtool_ops = {
-   .get_link   = loopback_get_link,
+static const struct ethtool_ops loopback_ethtool_ops = {
+   .get_link   = always_on,
.get_tso= ethtool_op_get_tso,
.set_tso= ethtool_op_set_tso,
+   .get_tx_csum= always_on,
+   .get_sg = always_on,
+   .get_rx_csum= always_on,
 };
 
+/*
+ * The loopback device is special. There is only one instance and
+ * it is statically allocated. Don't do this for other devices.
+ */
 struct net_device loopback_dev = {
.name   = "lo",
+   .get_stats  = get_stats,
+   .priv   = &loopback_stats,
.mtu= (16 * 1024) + 20 + 20 + 12,
.hard_start_xmit= loopback_xmit,
.hard_header= eth_header,
@@ -221,16 +228,6 @@ struct net_device loopback_dev = {
 /* Setup and register the loopback device. */
 int __init loopback_init(void)
 {
-   struct net_device_stats *stats;
-
-   /* Can survive without statistics */
-   stats = kmalloc(sizeof(struct net_device_stats), GFP_KERNEL);
-   if (stats) {
-   memset(stats, 0, sizeof(struct net_device_stats));
-   loopback_dev.priv = stats;
-   loopback_dev.get_stats = get_stats;
-   }
-   
return register_netdev(&loopback_dev);
 };
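
The statistics change in this patch boils down to summing per-CPU counters into one statically allocated total, so the allocation and its NULL check disappear. A userspace sketch of that shape follows. The CPU count, the stats_model fields and the function name are assumptions made for the example; the per-CPU summation loop itself is elided from the hunk above.

```c
#include <string.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for the sketch */

/* Hypothetical counters; the real structure has many more fields. */
struct stats_model {
	unsigned long packets;
	unsigned long bytes;
};

/* Per-CPU counters, each updated only by its own CPU in the real driver. */
static struct stats_model percpu_stats[NR_CPUS_MODEL];

/* Single statically allocated total: no kmalloc, so no NULL check needed. */
static struct stats_model total_stats;

static struct stats_model *get_stats_model(void)
{
	struct stats_model *stats = &total_stats;
	int i;

	/* Recompute from scratch on every call so repeated calls agree. */
	memset(stats, 0, sizeof(*stats));
	for (i = 0; i < NR_CPUS_MODEL; i++) {
		stats->packets += percpu_stats[i].packets;
		stats->bytes += percpu_stats[i].bytes;
	}
	return stats;
}
```

This only works because there is exactly one loopback device; as the comment added by the patch says, other devices should not share a static structure like this.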
 


Re: IPSec broken in 2.6.18-rc4-mm3

2006-09-08 Thread Patrick McHardy
Gnome42 Gnome42 wrote:
 IPSec got broken in 2.6.18-rc4-mm3+, 2.6.18-rc4-mm2 works and
 2.6.18-rc5 also works.
 
 The tunnel looks like it's established correctly in the racoon logs and
 the traffic is encrypted on the wire. However, the other side does not
 decrypt the traffic it just seems to disappear.

Can you see the decrypted packets on the incoming interface on the
other side?

 I have confirmed this problem exists between two linux boxen and a
 Netopia router as well.
 
 The git-net.patch increased in size by about 50% between
 2.6.18-rc4-mm2 and 2.6.18-rc4-mm3 (likely suspect?), but i was unable
 to simply patch -R it cleanly.
 
 Suggestions?

Please post your policies and related SAs from both sides.
Are you using NAT, iptables or anything like that?



capturing packets with FCS error (tg3)

2006-09-08 Thread Sabit A. Sayeed

Is it possible to capture packets with FCS error using the tg3 driver?

- Sabit


Re: [PATCH 2/7] secid reconciliation-v02: Add LSM hooks

2006-09-08 Thread James Morris
On Fri, 8 Sep 2006, Venkat Yekkirala wrote:

 Add skb_policy_check and skb_netfilter_check hooks to LSM to enable
 reconciliation of the various security identifiers as well as enforce
 flow control on inbound (INPUT/FORWARD) and outbound (OUTPUT/FORWARD)
 traffic.

Is there any way you can send patches without format=flowed in the 
content-type?  On two mailers I've tried, the patches get mangled.



- James
-- 
James Morris
[EMAIL PROTECTED]


Re: [RFC] Alternate WE-21 support (core API)

2006-09-08 Thread John W. Linville
On Fri, Sep 08, 2006 at 09:13:45AM -0700, Jean Tourrilhes wrote:
 On Fri, Sep 08, 2006 at 10:29:23AM -0400, John W. Linville wrote:
  On Wed, Sep 06, 2006 at 02:30:53PM -0700, Jean Tourrilhes wrote:
   On Wed, Sep 06, 2006 at 04:55:44PM -0400, John W. Linville wrote:
  
+ * V20 to V21
+ * --
+ * - Remove (struct net_device *)->get_wireless_stats()
+ * - Change length in ESSID and NICK to strlen() instead of strlen()+1
+ * - Add IW_RETRY_SHORT/IW_RETRY_LONG retry modifiers
+ * - Add explicit flag to tell stats are in 802.11k RCPI : IW_QUAL_RCPI
   
 Personally, I would also add this :
   
   + *   - Power/Retry relative values no longer * 10
  
 Three reasons :
 1) It's a cleanup and does not add any new feature
 2) It does not change the rest of the patches
 3) Userspace part has already gone in distro, not
   including this bit would mean breaking userspace.
  
  Is there any code that corresponds to that?  Or does the comment
  simply indicate policy?
 
   There is no code in the core of the WE, so it only indicates
 policy. But, I believe policy changes need to be documented.
   On the other hand you will find code in the tiacx patch.

Fair enough...

Any objections?

---


This is version 21 of the Wireless Extensions. Changelog :
o finishes migrating the ESSID API (remove the +1)
o netdev->get_wireless_stats is no more
o long/short retry

This is a redacted version of a patch originally submitted by Jean
Tourrilhes.  I removed most of the additions, in order to minimize
future support requirements for nl80211 (or other WE successor).
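
As a rough userspace illustration of the ESSID length change listed in the
changelog: the helper name essid_wire_length and its we_version parameter
are hypothetical and not part of the Wireless Extensions API. The sketch only
assumes what the changelog says, namely that the length was strlen()+1 up to
V20 and is strlen() from V21 on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of wireless.h: returns the length that
 * would be reported for an ESSID under a given WE version, following the
 * changelog (strlen()+1 up to V20, strlen() from V21). */
static size_t essid_wire_length(const char *essid, int we_version)
{
    size_t len = strlen(essid);

    if (we_version < 21)
        len += 1;   /* older versions counted the trailing NUL */
    return len;
}
```

A caller that must talk to both old and new kernels would pick the length
based on the WE version it detects at runtime.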

CC: Jean Tourrilhes [EMAIL PROTECTED]
Signed-off-by: John W. Linville [EMAIL PROTECTED]
---
 include/linux/netdevice.h |1 -
 include/linux/wireless.h  |   23 +--
 net/core/net-sysfs.c  |5 +--
 net/core/wireless.c   |   67 -
 4 files changed, 61 insertions(+), 35 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 50a4719..91dc36c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -334,7 +334,6 @@ #define NETIF_F_ALL_CSUM(NETIF_F_IP_CSU
 
 
struct net_device_stats* (*get_stats)(struct net_device *dev);
-   struct iw_statistics*   (*get_wireless_stats)(struct net_device *dev);
 
/* List of functions to handle Wireless Extensions (instead of ioctl).
 * See net/iw_handler.h for details. Jean II */
diff --git a/include/linux/wireless.h b/include/linux/wireless.h
index 1358856..7a5860f 100644
--- a/include/linux/wireless.h
+++ b/include/linux/wireless.h
@@ -1,7 +1,7 @@
 /*
  * This file define a set of standard wireless extensions
  *
- * Version :   20  17.2.06
+ * Version :   21  14.3.06
  *
  * Authors :   Jean Tourrilhes - HPL - [EMAIL PROTECTED]
  * Copyright (c) 1997-2006 Jean Tourrilhes, All Rights Reserved.
@@ -69,9 +69,14 @@ #define _LINUX_WIRELESS_H
 
 /* INCLUDES */
 
+/* This header is used in user-space, therefore need to be sanitised
+ * for that purpose. Those includes are usually not compatible with glibc.
+ * To know which includes to use in user-space, check iwlib.h. */
+#ifdef __KERNEL__
 #include <linux/types.h>   /* for caddr_t et al  */
 #include <linux/socket.h>  /* for struct sockaddr et al  */
 #include <linux/if.h>  /* for IFNAMSIZ and co... */
+#endif /* __KERNEL__ */
 
 /* VERSION */
 /*
@@ -80,7 +85,7 @@ #include linux/if.h /* for IFNAMSIZ 
  * (there is some stuff that will be added in the future...)
  * I just plan to increment with each new version.
  */
-#define WIRELESS_EXT   20
+#define WIRELESS_EXT   21
 
 /*
  * Changes :
@@ -208,6 +213,14 @@ #define WIRELESS_EXT   20
  * V19 to V20
  * --
  * - RtNetlink requests support (SET/GET)
+ *
+ * V20 to V21
+ * --
+ * - Remove (struct net_device *)->get_wireless_stats()
+ * - Change length in ESSID and NICK to strlen() instead of strlen()+1
+ * - Add IW_RETRY_SHORT/IW_RETRY_LONG retry modifiers
+ * - Power/Retry relative values no longer * 10
+ * - Add explicit flag to tell stats are in 802.11k RCPI : IW_QUAL_RCPI
  */
 
 / CONSTANTS /
@@ -448,6 +460,7 @@ #define IW_QUAL_DBM 0x08/* Level + Noi
 #define IW_QUAL_QUAL_INVALID   0x10/* Driver doesn't provide value */
 #define IW_QUAL_LEVEL_INVALID  0x20
 #define IW_QUAL_NOISE_INVALID  0x40
+#define IW_QUAL_RCPI   0x80/* Level + Noise are 802.11k RCPI */
 #define IW_QUAL_ALL_INVALID0x70
 
 /* Frequency flags */
@@ -500,10 +513,12 @@ #define IW_RETRY_ON   0x  /* No detail
 #define IW_RETRY_TYPE  0xF000  /* Type of parameter */
 

Re: [PATCH,RFC] Re: r8169 driver problem with RTL8110SB chip (on iop3xx ARM board)

2006-09-08 Thread Francois Romieu
Lennert Buytenhek [EMAIL PROTECTED] :
[...]
 I suspect it's a chip bug.  I rechecked with I/O space, and that works
 okay, so this artifact (bug) only manifests itself when you do the upper
 write in MMIO space.

 Are there any plans to switch r8169 to the iomap API?  Would you take
 a patch if I'd write one?

Given the current state of the r8169 driver, I do not see a lot of
benefit from the iomap() API in itself.

It could make the switch to I/O read/write easier for strange bugs
like yours but I have an epidermic defiance against I/O ops (much too 
synchronizing for me: people forget that MMIO will post). I may
change my mind if bugs start popping up like mushrooms but we are
hopefully not there yet.

An ordered write with a big sign in front of it to comment the issue
is good enough for me.

Don't hesitate to protest if you think that I need a clue.

-- 
Ueimor


Re: IPSec broken in 2.6.18-rc4-mm3

2006-09-08 Thread Gnome42 Gnome42

On 9/8/06, Patrick McHardy [EMAIL PROTECTED] wrote:

Gnome42 Gnome42 wrote:



Can you see the decrypted packets on the incoming interface on the
other side?


No, not the decrypted ones only the encrypted ones. I never see the
decrypted packets. ( I should see them twice right? Once encrypted and
once decrypted?)


Please post your policies and related SAs from both sides.
Are you using NAT, iptables or anything like that?


(Beware, I am not knowledgeable about IPSec :)

I am testing this between my workstation and my linux/firewall/nat box
with adsl. So encrypted on my local lan only.

The firewall box is using iptables and is natting for me but the ipsec
traffic is just local and is not natted. I am testing roadwarrior
mode, with the firewall as the responder. No iptables/NAT on my
workstation. I have allowed protocols 50/51 and udp 500 and it works
well with other kernels including 2.6.18-rc5, so I think the iptables
stuff is OK.

On my workstation(34.34.36.1) I use:
spdadd 34.34.36.1 206.207.0.0/16 any -P out ipsec
  esp/tunnel/34.34.36.1-34.34.36.6/use;
spdadd 206.207.0.0/16 34.34.36.1 any -P in ipsec
  esp/tunnel/34.34.36.6-34.34.36.1/use;

and on the firewall:
remote anonymous {
   exchange_mode aggressive,main;
   passive on;
   my_identifier fqdn blah1;
   peers_identifier fqdn blah2;
   verify_identifier on;
   proposal {
   encryption_algorithm aes;
   hash_algorithm md5;
   authentication_method pre_shared_key;
   dh_group modp1024;
   }
   generate_policy on;
}
sainfo anonymous {
   pfs_group modp1024;
   encryption_algorithm aes;
   authentication_algorithm hmac_md5;
   compression_algorithm deflate;
}

... or did you mean dumps from setkey -D[P]?


Re: [PATCH,RFC] Re: r8169 driver problem with RTL8110SB chip (on iop3xx ARM board)

2006-09-08 Thread Francois Romieu
Lennert Buytenhek [EMAIL PROTECTED] :
[...]
 I tried your series from step (1) plus my TxDesc change (so I didn't
 include the hunk from (2)), and that seems to work fine.  I.e. I don't
 need to disable error interrupts anymore to keep it working.
 
 So really the only thing that would need addressing would be the TXDesc
 MMIO bug (either by using I/O accesses, or doing the top/bottom writes
 the other way round) and we can dump the vendor driver.

I'll turn the top/bottom write the other way around for it looks
like the less intrusive option so far.
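
A minimal userspace sketch of what "turning the write around" could look
like: a 64-bit descriptor base programmed as two 32-bit register writes, high
half first. The register names, the fake_regs structure and fake_writel() are
hypothetical stand-ins (a real driver would use writel() on its mapped
registers), and the high-then-low order is an assumption, since the thread
does not spell out the final order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the device register file. */
enum { REG_TXDESC_LO = 0, REG_TXDESC_HI = 1, REG_COUNT = 2 };

struct fake_regs {
    uint32_t val[REG_COUNT];
    int order[REG_COUNT];   /* order[n] = register written n-th */
    int writes;
};

/* Record the value and the position of this write in the sequence. */
static void fake_writel(struct fake_regs *r, int reg, uint32_t v)
{
    r->val[reg] = v;
    if (r->writes < REG_COUNT)
        r->order[r->writes] = reg;
    r->writes++;
}

/* Program a 64-bit descriptor base as two 32-bit writes, high half first.
 * The order is the assumption being illustrated. */
static void set_txdesc_base(struct fake_regs *r, uint64_t dma)
{
    fake_writel(r, REG_TXDESC_HI, (uint32_t)(dma >> 32));
    fake_writel(r, REG_TXDESC_LO, (uint32_t)(dma & 0xffffffffu));
}
```

Keeping both writes in one small helper also gives a single place for the
"big sign" comment that explains why the order matters.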

The current r8169 series stood long enough in -mm that I can ask Jeff
to pull it. The link management of the 8136 needs more work but it is
not a showstopper.

-- 
Ueimor


Re: capturing packets with FCS error (tg3)

2006-09-08 Thread Michael Chan
On Fri, 2006-09-08 at 16:04 -0400, Sabit A. Sayeed wrote:
 Is it possible to capture packets with FCS error using the tg3 driver?
 
Yes, you'll need to set the RX_MODE_NO_CRC_CHECK in the MAC_RX_MODE
register.
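
For illustration only, a userspace sketch of the read-modify-write this
implies. The helper name is hypothetical and the bit values below are
assumptions that should be checked against tg3.h; the sketch just shows the
flag being set or cleared without disturbing the other mode bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values assumed for illustration; check tg3.h for the real ones. */
#define RX_MODE_ENABLE        0x00000002u
#define RX_MODE_NO_CRC_CHECK  0x00000200u

/* Hypothetical helper: returns the new MAC_RX_MODE value with CRC checking
 * turned on or off, leaving every other mode bit untouched. */
static uint32_t rx_mode_set_crc_check(uint32_t rx_mode, int check_crc)
{
    if (check_crc)
        return rx_mode & ~RX_MODE_NO_CRC_CHECK;
    return rx_mode | RX_MODE_NO_CRC_CHECK;
}
```

In the driver the resulting value would still have to be written back to the
MAC_RX_MODE register; that part is not shown here.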



Re: [PATCH,RFC] Re: r8169 driver problem with RTL8110SB chip (on iop3xx ARM board)

2006-09-08 Thread Lennert Buytenhek
On Fri, Sep 08, 2006 at 10:23:36PM +0200, Francois Romieu wrote:

  I suspect it's a chip bug.  I rechecked with I/O space, and that
  works okay, so this artifact (bug) only manifests itself when you
  do the upper write in MMIO space.
 
  Are there any plans to switch r8169 to the iomap API?  Would you
  take a patch if I'd write one?
 
 Given the current state of the r8169 driver, I do not see a lot of
 benefit from the iomap() API in itself.
 
 It could make the switch to I/O read/write easier for strange bugs
 like yours but I have an epidermic defiance against I/O ops (much too 
 synchronizing for me: people forget that MMIO will post). I may
 change my mind if bugs start popping up like mushrooms but we are
 hopefully not there yet.
 
 An ordered write with a big sign in front of it to comment the issue
 is good enough for me.
 
 Don't hesitate to protest if you think that I need a clue.

What you say makes sense -- in my case it would have been useful to
have a knob to switch the driver to use I/O ops (since that is what
the vendor driver uses, and the vendor driver works), but bugs like
these are generally rare anyway, and so the added benefit isn't too
big.  OK.


cheers,
Lennert


[patch 1/2] hostap_cs: added support for Proxim Harmony PCI W-Lan card

2006-09-08 Thread Christian Steineck
hostap_cs driver
- added support for Proxim Harmony PCI W-Lan Card (uses pd6729 based
pcmcia2pci bridge)

Signed-off-by: Christian Steineck [EMAIL PROTECTED]

^---

This is my first patch, if the form of delivery is not as it should be
please forgive me :o)

best regards

Christian
--- drivers/net/wireless/hostap/hostap_cs.orig  2006-09-08 17:53:42.0 +0200
+++ drivers/net/wireless/hostap/hostap_cs.c 2006-09-08 22:37:14.0 +0200
@@ -848,6 +848,7 @@
PCMCIA_DEVICE_MANF_CARD(0xd601, 0x0002),
PCMCIA_DEVICE_MANF_CARD(0xd601, 0x0005),
PCMCIA_DEVICE_MANF_CARD(0xd601, 0x0010),
+   PCMCIA_DEVICE_MANF_CARD(0x0126, 0x0002),
PCMCIA_DEVICE_MANF_CARD_PROD_ID1(0x0156, 0x0002, INTERSIL,
 0x74c5e40d),
PCMCIA_DEVICE_MANF_CARD_PROD_ID1(0x0156, 0x0002, Intersil,


[patch 2/2] hostap_cs.mod: generating proper entry in /lib/kernelversion/modules.alias for Proxim Harmony PCI W-Lan card

2006-09-08 Thread Christian Steineck
hostap_cs driver
- added proper entry in /lib/kernelversion/modules.alias for Proxim
Harmony PCI W-Lan Card (uses pd6729 based pcmcia2pci bridge)

Signed-off-by: Christian Steineck [EMAIL PROTECTED]

^---

This is my first patch, if the form of delivery is not as it should be
please forgive me :o)


best regards

Christian
--- drivers/net/wireless/hostap/hostap_cs.mod.c.orig 2006-09-08 22:39:32.0 +0200
+++ drivers/net/wireless/hostap/hostap_cs.mod.c 2006-09-08 23:27:23.0 +0200
@@ -50,5 +50,6 @@
 MODULE_ALIAS("pcmcia:m*c*f*fn*pfn*pa74C5E40DpbDB472A18pc*pd*");
 MODULE_ALIAS("pcmcia:m*c*f*fn*pfn*pa0733CC81pb0C52F395pc*pd*");
 MODULE_ALIAS("pcmcia:m*c*f*fn*pfn*pa273FE3DBpb32A1EAEEpc*pd*");
+MODULE_ALIAS("pcmcia:m0126c0002f06fn00pfn00paC6536A5Epb9F494E26pcpd");
 
 MODULE_INFO(srcversion, "6D6E7A7655C37028DF40116");


Re: [patch 2/2] hostap_cs.mod: generating proper entry in /lib/kernelversion/modules.alias for Proxim Harmony PCI W-Lan card

2006-09-08 Thread Michael Wu
On Friday 08 September 2006 17:51, Christian Steineck wrote:
 This is my first patch, if the form of delivery is not as it should be
 please forgive me :o)

This file (hostap_cs.mod.c) is generated by the build system - you don't need 
to and cannot patch it.

Also, please inline your patch instead of attaching whenever possible to make 
it easy to comment on specific portions of your patch.

-Michael Wu




[PATCH 0/2] pcnet32: NAPI support

2006-09-08 Thread Don Fry
These patches to the pcnet32 driver implement NAPI and respond to some
other suggestions found during NAPI development and testing.

The comments from Francois Romieu regarding using spin_lock instead of
spin_lock_irqsave were investigated, but since interrupts have to be
disabled to prevent the interrupt handler from deadlocking, and since I
would probably forget sometime, it is safer to leave the locking as it
is.  The requested mmiowb calls were added.

Please apply to 2.6.19.

1/2 NAPI implementation.
2/2 Magic number cleanup.
-- 
Don Fry
[EMAIL PROTECTED]


[PATCH 1/2] pcnet32: NAPI implementation

2006-09-08 Thread Don Fry
Implement NAPI changes to pcnet32 driver.  Compile default is off.
Listed as experimental.

Len and Don both worked on a NAPI implementation and have both tested
these changes. 

An e1000 blasting short packets to the pcnet32 will lock up Don's system
until the receive storm stops.  Without NAPI Len's system watchdog would
expire causing the system to reboot.  With NAPI the system will stay
operational.
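
The reason polling helps during a receive storm can be sketched in userspace:
each pass handles at most a fixed budget of packets and then returns, so
other work gets a chance to run. This is not the pcnet32 code; fake_rx_ring
and fake_poll() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical receive ring: just a count of packets waiting. */
struct fake_rx_ring {
    int pending;
};

/* Process at most `budget` packets and report how many were handled.
 * Returning less than the budget means the ring is drained and, in a real
 * driver, receive interrupts could be re-enabled. */
static int fake_poll(struct fake_rx_ring *ring, int budget)
{
    int done = 0;

    while (done < budget && ring->pending > 0) {
        ring->pending--;    /* stand-in for handing one packet to the stack */
        done++;
    }
    return done;
}
```

With 100 packets pending and a budget of 64, the first pass handles 64 and
the second handles the remaining 36.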

Tested ia32 and ppc64.  Tested '970A, '971, '972, '973, '975, '976, and
'978.

The Kconfig changes came from Len.  Don is to blame for all the others.

Signed-off-by: Len Sorensen [EMAIL PROTECTED]
Signed-off-by: Don Fry [EMAIL PROTECTED]

--- linux-2.6.17-git13/drivers/net/orig.Kconfig Wed Jun 28 10:38:45 2006
+++ linux-2.6.17-git13/drivers/net/Kconfig  Wed Jun 28 15:36:25 2006
@@ -1300,6 +1300,23 @@ config PCNET32
  <file:Documentation/networking/net-modules.txt>. The module
  will be called pcnet32.
 
+config PCNET32_NAPI
+   bool "Use RX polling (NAPI) (EXPERIMENTAL)"
+   depends on PCNET32 && EXPERIMENTAL
+   help
+ NAPI is a new driver API designed to reduce CPU and interrupt load
+ when the driver is receiving lots of packets from the card. It is
+ still somewhat experimental and thus not yet enabled by default.
+
+ If your estimated Rx load is 10kpps or more, or if the card will be
+ deployed on potentially unfriendly networks (e.g. in a firewall),
+ then say Y here.
+
+ See <file:Documentation/networking/NAPI_HOWTO.txt> for more
+ information.
+
+ If in doubt, say N.
+
 config AMD8111_ETH
tristate "AMD 8111 (new PCI lance) support"
depends on NET_PCI && PCI
--- linux-2.6.18-rc6/drivers/net/pcnet32.c.orig Tue Aug 29 11:56:44 2006
+++ linux-2.6.18-rc6/drivers/net/pcnet32.c  Fri Sep  8 13:50:12 2006
@@ -21,9 +21,15 @@
  *
  */
 
+#include <linux/config.h>
+
 #define DRV_NAME   "pcnet32"
-#define DRV_VERSION "1.32"
-#define DRV_RELDATE "18.Mar.2006"
+#ifdef CONFIG_PCNET32_NAPI
+#define DRV_VERSION "1.33-NAPI"
+#else
+#define DRV_VERSION "1.33"
+#endif
+#define DRV_RELDATE "27.Jun.2006"
 #define PFX DRV_NAME ": "
 
 static const char *const version =
@@ -299,7 +305,6 @@ static int pcnet32_probe1(unsigned long,
 static int pcnet32_open(struct net_device *);
 static int pcnet32_init_ring(struct net_device *);
 static int pcnet32_start_xmit(struct sk_buff *, struct net_device *);
-static int pcnet32_rx(struct net_device *);
 static void pcnet32_tx_timeout(struct net_device *dev);
 static irqreturn_t pcnet32_interrupt(int, void *, struct pt_regs *);
 static int pcnet32_close(struct net_device *);
@@ -883,7 +888,11 @@ static int pcnet32_loopback_test(struct 
rc = 1; /* default to fail */
 
if (netif_running(dev))
+#ifdef CONFIG_PCNET32_NAPI
+   pcnet32_netif_stop(dev);
+#else
pcnet32_close(dev);
+#endif
 
spin_lock_irqsave(&lp->lock, flags);
lp->a.write_csr(ioaddr, CSR0, CSR0_STOP);   /* stop the chip */
@@ -1015,6 +1024,16 @@ static int pcnet32_loopback_test(struct 
x = a->read_bcr(ioaddr, 32);/* reset internal loopback */
a->write_bcr(ioaddr, 32, (x & ~0x0002));
 
+#ifdef CONFIG_PCNET32_NAPI
+   if (netif_running(dev)) {
+   pcnet32_netif_start(dev);
+   pcnet32_restart(dev, CSR0_NORMAL);
+   } else {
+   pcnet32_purge_rx_ring(dev);
+   lp->a.write_bcr(ioaddr, 20, 4); /* return to 16bit mode */
+   }
+   spin_unlock_irqrestore(&lp->lock, flags);
+#else
if (netif_running(dev)) {
spin_unlock_irqrestore(&lp->lock, flags);
pcnet32_open(dev);
@@ -1023,6 +1042,7 @@ static int pcnet32_loopback_test(struct 
lp->a.write_bcr(ioaddr, 20, 4); /* return to 16bit mode */
spin_unlock_irqrestore(&lp->lock, flags);
}
+#endif
 
return (rc);
 }  /* end pcnet32_loopback_test  */
@@ -1125,6 +1145,285 @@ static int pcnet32_suspend(struct net_de
return 1;
 }
 
+static int pcnet32_rx_entry(struct net_device *dev,
+   struct pcnet32_private *lp,
+   struct pcnet32_rx_head *rxp,
+   int entry)
+{
+   int status = (short)le16_to_cpu(rxp->status) >> 8;
+   int rx_in_place = 0;
+   struct sk_buff *skb;
+   short pkt_len;
+
+   if (status != 0x03) {   /* There was an error. */
+   /*
+* There is a tricky error noted by John Murphy,
+* [EMAIL PROTECTED] to Russ Nelson: Even with full-sized
+* buffers it's possible for a jabber packet to use two
+* buffers, with only the last correctly noting the error.
+*/
+   if (status & 0x01)  /* Only count a general error at the */
+   

[PATCH 2/2] pcnet32: Magic number cleanup

2006-09-08 Thread Don Fry
Initial magic number cleanup.  Delete one unnecessary read and write.
Tested ia32 and ppc64.
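
As a small illustration of the cleanup, the constant values below are
inferred from the replacements made in this patch (0x0001 becomes CSR0_INIT,
0x0004 becomes CSR0_STOP, 0x41 becomes CSR0_INTEN | CSR0_INIT, 0x0100 becomes
CSR0_IDON, 0x0042 becomes CSR0_NORMAL). CSR0_START is inferred on the
assumption that CSR0_NORMAL is CSR0_START | CSR0_INTEN, and init_done() is a
hypothetical helper; the authoritative definitions are the ones in pcnet32.c.

```c
#include <assert.h>

/* Values inferred from the literal-to-name replacements in the patch. */
enum {
    CSR0_INIT   = 0x0001,
    CSR0_START  = 0x0002,   /* inferred: CSR0_NORMAL without CSR0_INTEN */
    CSR0_STOP   = 0x0004,
    CSR0_INTEN  = 0x0040,
    CSR0_IDON   = 0x0100,
    CSR0_NORMAL = CSR0_START | CSR0_INTEN
};

/* Hypothetical helper: nonzero once the chip reports init done. */
static int init_done(unsigned int csr0)
{
    return (csr0 & CSR0_IDON) != 0;
}
```

Reading "csr0 & CSR0_IDON" says what is being tested, which "csr0 & 0x0100"
does not.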

Signed-off-by: Don Fry [EMAIL PROTECTED]

--- linux-2.6.18-rc6/drivers/net/pcnet32.c.napi Fri Sep  8 13:19:53 2006
+++ linux-2.6.18-rc6/drivers/net/pcnet32.c  Fri Sep  8 13:57:13 2006
@@ -213,7 +213,7 @@ static int homepna[MAX_UNITS];
 /* The PCNET32 Rx and Tx ring descriptors. */
 struct pcnet32_rx_head {
u32 base;
-   s16 buf_length;
+   s16 buf_length; /* two`s complement of length */
s16 status;
u32 msg_length;
u32 reserved;
@@ -221,7 +221,7 @@ struct pcnet32_rx_head {
 
 struct pcnet32_tx_head {
u32 base;
-   s16 length;
+   s16 length; /* two`s complement of length */
s16 status;
u32 misc;
u32 reserved;
@@ -901,7 +901,7 @@ static int pcnet32_loopback_test(struct 
 
/* Reset the PCNET32 */
lp->a.reset(ioaddr);
-   lp->a.write_csr(ioaddr, CSR4, 0x0915);
+   lp->a.write_csr(ioaddr, CSR4, 0x0915);  /* auto tx pad */
 
/* switch pcnet32 to 32bit mode */
lp->a.write_bcr(ioaddr, 20, 2);
@@ -1393,7 +1393,7 @@ static int pcnet32_poll(struct net_devic
if (pcnet32_tx(dev)) {
/* reset the chip to clear the error condition, then restart */
lp->a.reset(ioaddr);
-   lp->a.write_csr(ioaddr, CSR4, 0x0915);
+   lp->a.write_csr(ioaddr, CSR4, 0x0915);  /* auto tx pad */
pcnet32_restart(dev, CSR0_START);
netif_wake_queue(dev);
}
@@ -1901,7 +1901,7 @@ pcnet32_probe1(unsigned long ioaddr, int
 * boards will work.
 */
/* Trigger an initialization just for the interrupt. */
-   a->write_csr(ioaddr, 0, 0x41);
+   a->write_csr(ioaddr, CSR0, CSR0_INTEN | CSR0_INIT);
mdelay(1);
 
dev->irq = probe_irq_off(irq_mask);
@@ -2268,9 +2268,9 @@ static int pcnet32_open(struct net_devic
 
 #ifdef DO_DXSUFLO
if (lp->dxsuflo) {  /* Disable transmit stop on underflow */
-   val = lp->a.read_csr(ioaddr, 3);
+   val = lp->a.read_csr(ioaddr, CSR3);
val |= 0x40;
-   lp->a.write_csr(ioaddr, 3, val);
+   lp->a.write_csr(ioaddr, CSR3, val);
}
 #endif
 
@@ -2291,8 +2291,8 @@ static int pcnet32_open(struct net_devic
(lp->dma_addr +
 offsetof(struct pcnet32_private, init_block)) >> 16);
 
-   lp->a.write_csr(ioaddr, 4, 0x0915);
-   lp->a.write_csr(ioaddr, 0, 0x0001);
+   lp->a.write_csr(ioaddr, CSR4, 0x0915);  /* auto tx pad */
+   lp->a.write_csr(ioaddr, CSR0, CSR0_INIT);
 
netif_start_queue(dev);
 
@@ -2304,13 +2304,13 @@ static int pcnet32_open(struct net_devic
 
i = 0;
while (i++ < 100)
-   if (lp->a.read_csr(ioaddr, 0) & 0x0100)
+   if (lp->a.read_csr(ioaddr, CSR0) & CSR0_IDON)
break;
/*
 * We used to clear the InitDone bit, 0x0100, here but Mark Stockton
 * reports that doing so triggers a bug in the '974.
 */
-   lp->a.write_csr(ioaddr, 0, 0x0042);
+   lp->a.write_csr(ioaddr, CSR0, CSR0_NORMAL);
 
if (netif_msg_ifup(lp))
printk(KERN_DEBUG
@@ -2318,7 +2318,7 @@ static int pcnet32_open(struct net_devic
   dev->name, i,
   (u32) (lp->dma_addr +
  offsetof(struct pcnet32_private, init_block)),
-  lp->a.read_csr(ioaddr, 0));
+  lp->a.read_csr(ioaddr, CSR0));
 
spin_unlock_irqrestore(&lp->lock, flags);
 
@@ -2389,7 +2389,7 @@ static int pcnet32_init_ring(struct net_
(rx_skbuff = lp->rx_skbuff[i] =
 dev_alloc_skb(PKT_BUF_SZ))) {
/* there is not much, we can do at this point */
-   if (pcnet32_debug & NETIF_MSG_DRV)
+   if (netif_msg_drv(lp))
printk(KERN_ERR
   "%s: pcnet32_init_ring dev_alloc_skb failed.\n",
   dev->name);
@@ -2439,7 +2439,7 @@ static void pcnet32_restart(struct net_d
 
/* wait for stop */
for (i = 0; i < 100; i++)
-   if (lp->a.read_csr(ioaddr, 0) & 0x0004)
+   if (lp->a.read_csr(ioaddr, CSR0) & CSR0_STOP)
break;
 
if (i >= 100 && netif_msg_drv(lp))
@@ -2452,13 +2452,13 @@ static void pcnet32_restart(struct net_d
return;
 
/* ReInit Ring */
-   lp->a.write_csr(ioaddr, 0, 1);
+   lp->a.write_csr(ioaddr, CSR0, CSR0_INIT);
i = 0;
while (i++ < 1000)
-   if (lp->a.read_csr(ioaddr, 0) & 0x0100)
+   if 

[PATCH] ethtool: e1000: fix a typo

2006-09-08 Thread Auke Kok
From: Auke Kok [EMAIL PROTECTED]

Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 e1000.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/e1000.c b/e1000.c
index 6de27ca..6741323 100644
--- a/e1000.c
+++ b/e1000.c
@@ -372,7 +372,7 @@ e1000_dump_regs(struct ethtool_drvinfo *
  Descriptor minimum threshold size: %s\n
  Broadcast accept mode: %s\n
  VLAN filter:   %s\n
- Cononical form indicator:  %s\n
+ Canonical form indicator:  %s\n
  Discard pause frames:  %s\n
  Pass MAC control frames:   %s\n,
reg,



Re: RFC/T: Possible fix for bcm43xx periodic work bug

2006-09-08 Thread Michael Buesch
On Friday 08 September 2006 11:42, Erik Mouw wrote:
 On Thu, Sep 07, 2006 at 01:17:05PM -0500, Larry Finger wrote:
  I think I have a fix for the bcm43xx bug that leads to NETDEV WATCHDOG tx
  timeouts and would like it to get as much testing as possible as this bug
  affects V2.6.18-rcX. If the problem is truly fixed, I hope to get the fix
  into mainline before release of the bug into the stable series.
 
 FWIW, I finally tracked down the bug that hangs my laptop to the
 bcm43xx driver. At first I got the impression it was the cpufreq code
 (which has been flaky in early 2.6.18-rc kernels), but after disabling
 that my laptop still crashed. After that I disabled preempt cause I got
 a couple of lockdep warnings when I had it enabled. That didn't make a
 difference, laptop still hangs after some time (runs a couple of hours
 at most). Right now I'm on wired network and my laptop finally doesn't
 hang anymore (up for two days).
 
 The hang is very hard to trigger (i.e.: it happens at random, I see no
 pattern) and locks up the machine completely. I've tried to capture
 kernel messages through serial console, but that doesn't work (lock up
 before any messages are printed).
 
 This is with any 2.6.18-rc kernel without additional patches or
 proprietary modules, 2.6.17 works ok.

The crash is fixed in wireless-2.6.
The actual cause of the controller restart is not. So at least it
does not crash anymore.

-- 
Greetings Michael.


[PATCH] fix for system lockups in 2.6.18-rcX caused by bcm43xx

2006-09-08 Thread Larry Finger

John,

Please send this upstream for inclusion in 2.6.18, if possible. This patch
will not work for wireless-2.6. That patch will be sent to you soon.

Larry

=


This patch fixes a bug in the bcm43xx driver in 2.6.18-rcX that hangs the
machine due to improper locking. Between 2.6.17 and .18, longer portions of
certain periodic work were made preemptible to improve latency, which is how
this bug was introduced. It happens relatively infrequently - every 6-10
hours, but when it does, the power button is the only possible recovery.

Signed-off-by: Larry Finger [EMAIL PROTECTED]

==


Index: linux-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
===
--- linux-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -3182,19 +3182,21 @@ static void bcm43xx_periodic_work_handle
/* Periodic work will take a long time, so we want it to
 * be preemtible.
 */
-   bcm43xx_lock_irqonly(bcm, flags);
+   bcm43xx_lock_noirq(bcm);
netif_stop_queue(bcm->net_dev);
+   bcm43xx_lock_irqonly(bcm, flags);
+   bcm43xx_mac_suspend(bcm);
if (bcm43xx_using_pio(bcm))
bcm43xx_pio_freeze_txqueues(bcm);
savedirqs = bcm43xx_interrupt_disable(bcm, BCM43xx_IRQ_ALL);
bcm43xx_unlock_irqonly(bcm, flags);
-   bcm43xx_lock_noirq(bcm);
bcm43xx_synchronize_irq(bcm);
} else {
/* Periodic work should take short time, so we want low
 * locking overhead.
 */
-   bcm43xx_lock_irqsafe(bcm, flags);
+   bcm43xx_lock_noirq(bcm);
+   bcm43xx_lock_irqonly(bcm, flags);
}

do_periodic_work(bcm);
@@ -3206,6 +3208,7 @@ static void bcm43xx_periodic_work_handle
bcm43xx_interrupt_enable(bcm, savedirqs);
if (bcm43xx_using_pio(bcm))
bcm43xx_pio_thaw_txqueues(bcm);
+   bcm43xx_mac_enable(bcm);
}
netif_wake_queue(bcm->net_dev);
mmiowb();
@@ -3213,7 +3216,8 @@ static void bcm43xx_periodic_work_handle
bcm43xx_unlock_noirq(bcm);
} else {
mmiowb();
-   bcm43xx_unlock_irqsafe(bcm, flags);
+   bcm43xx_unlock_irqonly(bcm, flags);
+   bcm43xx_unlock_noirq(bcm);
}
 }




HELP - NETDEV WATCHDOG tx timeouts

2006-09-08 Thread Larry Finger
In the bcm43xx driver, the code snippet shown below has a problem. When the
synchronize_net statement is included, once every 200-300 passes through the
code, the system will report a NETDEV WATCHDOG tx timeout for bcm43xx, even
when the watchdog timeout is set to 30 sec. When the synchronize statement is
removed, there are no errors. Except for lo, this is the only active network
device on the system.

Is there something wrong with this structure? How can synchronize_net take
that long?

Thanks, Larry

==

   mutex_lock(...);
   netif_stop_queue(net_device);
   synchronize_net();    <=== problem ?
   spin_lock_irqsave(.);
.. do some stuff on the hardware
   disable interrupts on device
   spin_unlock_irqrestore(...);
   synchronize irq top/bottom halves
.. lengthy processing here
   spin_lock_irqsave(.);
   tasklet_enable(.);
   enable interrupts
.. more stuff with the hardware
   netif_wake_queue(net_device);
   spin_unlock_irqrestore(...);
   mutex_unlock(...);


e100 fails, eepro100 works

2006-09-08 Thread Jan Kiszka
Hi,

we have a couple of industrial PCs here with Intel PRO/100 controllers
on board. Most of them work fine with the e100, but today I stumbled
over one box that doesn't: Reception works (RX counter increases, ARP
cache gets filled up), but transmission fails (TX counter is also zero).
In contrast, the eepro100 is fine, also Etherboot's driver.

I'm currently on 2.6.17.8, but I don't see any changes up to latest git
that may have positive influence. This is what lspci -v tells about this
piece of hardware:

00:12.0 Ethernet controller: Intel Corporation 8255xER/82551IT Fast
Ethernet Controller (rev 08)
Subsystem: Intel Corporation: Unknown device 1229
Flags: bus master, medium devsel, latency 66, IRQ 10
Memory at fc02 (32-bit, non-prefetchable) [size=4K]
I/O ports at 1080 [size=64]
Memory at fc00 (32-bit, non-prefetchable) [size=128K]
Capabilities: [dc] Power Management version 2

And here is the kernel log of e100 with highest debug level when sending
out a few pings while other packets arrive on the network:

e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
PCI: Found IRQ 10 for device :00:12.0
e100: eth0: e100_probe: addr 0xfc02, irq 10, MAC addr 00:30:59:01:07:A7
e100: eth0: e100_watchdog: right now = 35470
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 35970
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 36470
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 36970
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 37470
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 37970
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_watchdog: right now = 38470
e100: eth0: e100_intr: stat_ack = 0x04
e100: eth0: e100_intr: stat_ack = 0x40
e100: eth0: e100_watchdog: right now = 38970
e100: eth0: e100_intr: stat_ack = 0x04

I may find the time one day to debug this at lower levels, but you could
accelerate this process with any pointer where to dig deeper precisely.

Jan





Re: [PATCH] FRV: do_gettimeofday() should no longer use tickadj

2006-09-08 Thread Ingo Molnar

* David Howells [EMAIL PROTECTED] wrote:

 Ingo Molnar [EMAIL PROTECTED] wrote:
 
  we'll get rid of that pt_regs thing centrally, from all drivers at once 
  - there's upstream buy-in for that already, and Thomas already generated 
  a test-patch for that a few months ago. But it's not a big issue right 
  now.
 
 Yay!  Can you give me a pointer to the patch?

i cannot find Thomas' recent 2.6 one (Thomas, do you have a link to 
it?), but i did one 5 years ago:

 http://people.redhat.com/mingo/irq-rewrite-patches/irq-cleanup-2.4.15-B1.bz2

in general it's a large but otherwise pretty dumb patch.

  this shouldn't be a big issue either - we use indirect jumps all around 
  the kernel.
 
 Yes, I know.  I'm sometimes concerned at just how fast indirect jumps 
 (and even direct calls) are proliferating.  Look at the read syscall 
 path for something like ext3 these days: it's like a pile of 
 spaghetti.  That seems particularly true of direct-IO where it seems 
 to weave in and out of core code and the filesystem as it goes down.  
 I'm also concerned about stack usage.

yeah - but unless you can suggest some low-maintenance-overhead 
solution, not much can be done i suspect: being a few cycles slower is a 
lot less of a problem than being less flexible in the design. In general 
CPUs do optimize this quite well, but it is true that not every CPU 
does.

  CPUs are either smart enough to predict it
 
 I was told a while back (2002?) not to use indirect pointers for some 
 stuff because CPUs _couldn't_ predict it.  Maybe this has changed in 
 modern CPUs.

indirect pointers are very common both in OSs and in applications, 
especially in C++ based ones, where lots of execution goes off dynamic 
objects which have function pointers associated to them. So _lots_ of 
effort goes into branch prediction on the hardware side - and yes, 
modern CPUs do quite well with indirect pointers too.

The worst-case scenario is when the indirect branch flip-flops between 
multiple destination addresses - but that shouldnt be an issue for 
genirq because most systems have _one_ preferred way of handling 
interrupts that the majority of interrupt traffic uses. (for example on 
i686 it's level-triggered PCI irqs)
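
To make the "one preferred destination" point concrete, here is a rough userspace sketch. It is not the genirq code: irq_desc_sketch, handle_level, handle_edge and dispatch_irq are made-up names for illustration only. Each descriptor carries a flow-handler pointer, and dispatch costs one indirect call. When nearly all traffic arrives on descriptors that share one handler, that call keeps hitting the same target, which is the easy case for a predictor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-interrupt descriptor: the flow handler is reached
 * through a function pointer, i.e. an indirect call. */
struct irq_desc_sketch {
	void (*handle)(struct irq_desc_sketch *desc);
	unsigned long level_count;
	unsigned long edge_count;
};

/* Two illustrative flow handlers; they only count invocations. */
static void handle_level(struct irq_desc_sketch *desc)
{
	desc->level_count++;
}

static void handle_edge(struct irq_desc_sketch *desc)
{
	desc->edge_count++;
}

/* One indirect call per interrupt; out-of-range numbers are ignored. */
static void dispatch_irq(struct irq_desc_sketch *descs, size_t nr, size_t irq)
{
	if (irq < nr && descs[irq].handle)
		descs[irq].handle(&descs[irq]);
}
```

With a table such as { handle_level, handle_edge } and most interrupts on entry 0, the branch in dispatch_irq() almost always resolves to handle_level.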

But even if there's multiple destinations from the indirect jump, newest 
CPUs (such as Core 2) can actually store _multiple_ branch history 
targets and can prefetch all of them at once (if there's idle capacity 
left).

(And i wouldnt be surprised if some modern CPUs already stored the 
indirection register's index in the BHT, and used that for the 
prediction. Most indirect calls happen off registers, and if the 
compiler loads the register early enough (which it typically does) then 
the branch target value is available to the CPU. Other context 
information can be included in a BHT too.)

Also, in general, if something is arguably a smart thing to do in an OS 
(and more design flexibility via function pointers is a smart thing for 
which there is no viable alternative), we can expect CPUs to get 
gradually better at handling them.

(4) No account is taken of interrupt priority.
  
  hm, i'm not sure what you mean - could you be more specific?
 
 The FRV CPU, like many others, supports interrupt prioritisation.  A 
 particular interrupt level is set in the PSR, and any interrupt of a 
 higher priority can interrupt.  do_IRQ() can then do the interrupt 
 processing in the interrupt level of the interrupt that invoked it, 
 thus permitting higher priority interrupts to still happen.
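
The priority scheme described above can be modelled in a few lines of userspace C. This is only a sketch under assumptions: cpu_sketch, irq_accepted and do_irq_sketch are hypothetical names, and the real FRV PSR handling is not shown. An interrupt is taken only if its priority is above the current level, and the handler runs at the level of the interrupt that invoked it, so strictly higher priorities can still get in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a CPU with a current interrupt level. */
struct cpu_sketch {
	unsigned int psr_level;
};

/* An interrupt is accepted only if it outranks the current level. */
static bool irq_accepted(const struct cpu_sketch *cpu, unsigned int prio)
{
	return prio > cpu->psr_level;
}

/* Run the handler at the level of the interrupt that invoked it, then
 * restore the previous level.  Returns false if the interrupt was masked. */
static bool do_irq_sketch(struct cpu_sketch *cpu, unsigned int prio,
			  void (*handler)(struct cpu_sketch *cpu))
{
	unsigned int saved;

	if (!irq_accepted(cpu, prio))
		return false;

	saved = cpu->psr_level;
	cpu->psr_level = prio;		/* equal or lower priorities now wait */
	if (handler)
		handler(cpu);
	cpu->psr_level = saved;
	return true;
}
```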

ah, ok. For PREEMPT_HARDIRQS we thought about possibly utilizing 
hw-level IRQ prioritization too - but it's quite inflexible in most IRQ 
controller designs: the prioritization is rarely integrated with the CPU 
and is often attached to the ACK/EOI-ing of the IRQ line (and an 
unACK-ed IRQ can have side-effects).

So the thing we chose for PREEMPT_HARDIRQS was to do the prioritization 
at the OS/scheduler level. And OS level handling of this is what we need 
anyway: IRQ handlers are just the first, often tiny portion in a 
critical workload that a system must perform. (we have softirqs, 
signals, tasks, etc.) Nevertheless the door is open to utilize hw 
capabilities of IRQ prioritization - we 'only' need standard driver and 
/sys APIs to make use of them.

Ingo
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [take17 1/4] kevent: Core files.

2006-09-08 Thread shawvrana
I stand corrected.

On Thursday 07 September 2006 23:38, Evgeniy Polyakov wrote:
 On Thu, Sep 07, 2006 at 09:05:16PM -0700, [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
   +static int __devinit kevent_user_init(void)
   +{
   + int err = 0;
   +
   + kevent_cache = kmem_cache_create("kevent_cache",
   + sizeof(struct kevent), 0, SLAB_PANIC, NULL, NULL);
   +
   + err = misc_register(&kevent_miscdev);
   + if (err) {
   + printk(KERN_ERR "Failed to register kevent miscdev: err=%d.\n", err);
   + goto err_out_exit;
   + }
   +
   + printk("KEVENT subsystem has been successfully registered.\n");
   +
   + return 0;
   +
   +err_out_exit:
   + kmem_cache_destroy(kevent_cache);
   + return err;
   +}
 
  It's probably best to treat kmem_cache_create like a black box and check
  for it returning null.

 It can not return NULL, it will panic instead since I use SLAB_PANIC
 flag.
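
A userspace analogue of that behaviour, as a sketch only: ALLOC_PANIC and cache_create_sketch are invented names modelled on SLAB_PANIC, not kernel API. The point is that when the allocator itself aborts on failure, a NULL check in the caller can never fire.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define ALLOC_PANIC 0x1u	/* hypothetical flag, modelled on SLAB_PANIC */

/* Allocate size bytes (size must be non-zero).  With ALLOC_PANIC set the
 * function aborts on failure, so such callers never see NULL. */
static void *cache_create_sketch(size_t size, unsigned int flags)
{
	void *p = malloc(size);

	if (!p && (flags & ALLOC_PANIC)) {
		fprintf(stderr, "cache_create_sketch: out of memory\n");
		abort();
	}
	return p;	/* may be NULL only when ALLOC_PANIC is not set */
}
```

A caller that passes ALLOC_PANIC can use the result directly; one that does not still has to check for NULL.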

  Thanks,
  Shaw


[patch 01/10] [TULIP] Change tulip maintainer

2006-09-08 Thread Valerie Henson
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

 MAINTAINERS|4 ++--
 drivers/net/tulip/21142.c  |2 +-
 drivers/net/tulip/eeprom.c |2 +-
 drivers/net/tulip/interrupt.c  |2 +-
 drivers/net/tulip/media.c  |2 +-
 drivers/net/tulip/pnic.c   |2 +-
 drivers/net/tulip/pnic2.c  |2 +-
 drivers/net/tulip/timer.c  |2 +-
 drivers/net/tulip/tulip_core.c |2 +-
 9 files changed, 10 insertions(+), 10 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/MAINTAINERS
+++ linux-2.6.18-rc4-mm1/MAINTAINERS
@@ -2956,8 +2956,8 @@ W:http://www.auk.cx/tms380tr/
 S: Maintained
 
 TULIP NETWORK DRIVER
-P: Jeff Garzik
-M: [EMAIL PROTECTED]
+P: Valerie Henson
+M: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://sourceforge.net/projects/tulip/
 S: Maintained
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/21142.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/21142.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/21142.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/eeprom.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/eeprom.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/eeprom.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/interrupt.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/interrupt.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/interrupt.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/media.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/media.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/media.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/pnic.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/pnic.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/pnic.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/pnic2.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/pnic2.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/pnic2.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 Modified to hep support PNIC_II by Kevin B. Hendricks
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/timer.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/timer.c
@@ -1,7 +1,7 @@
 /*
drivers/net/tulip/timer.c
 
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -1,7 +1,7 @@
 /* tulip_core.c: A DEC 21x4x-family ethernet driver for Linux. */
 
 /*
-   Maintained by Jeff Garzik [EMAIL PROTECTED]
+   Maintained by Valerie Henson [EMAIL PROTECTED]
Copyright 2000,2001  The Linux Kernel Team
Written/copyright 1994-2001 by Donald Becker.
 



[patch 02/10] [TULIP] Print physical address in tulip_init_one

2006-09-08 Thread Valerie Henson
From: Grant Grundler [EMAIL PROTECTED]

As the cookie returned by pci_iomap() is fairly useless...

[Compile warning on pci_resource_start() format fixed up by Valerie
Henson.]

Signed-off-by: Grant Grundler [EMAIL PROTECTED]
Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/tulip_core.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -1656,8 +1656,14 @@ static int __devinit tulip_init_one (str
if (register_netdev(dev))
goto err_out_free_ring;
 
-   printk(KERN_INFO "%s: %s rev %d at %p,",
-  dev->name, chip_name, chip_rev, ioaddr);
+   printk(KERN_INFO "%s: %s rev %d at "
+#ifdef CONFIG_TULIP_MMIO
+   "MMIO"
+#else
+   "Port"
+#endif
+   " %#llx,", dev->name, chip_name, chip_rev,
+   (unsigned long long) pci_resource_start(pdev, TULIP_BAR));
pci_set_drvdata(pdev, dev);
 
if (eeprom_missing)



[patch 04/10] [TULIP] Flush MMIO writes in reset sequence

2006-09-08 Thread Valerie Henson
From: Grant Grundler [EMAIL PROTECTED]

The obvious safe register to read is one from PCI config space.

Signed-off-by: Grant Grundler [EMAIL PROTECTED]
Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/tulip_core.c |2 ++
 1 files changed, 2 insertions(+)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -295,12 +295,14 @@ static void tulip_up(struct net_device *
 
/* Reset the chip, holding bit 0 set at least 50 PCI cycles. */
iowrite32(0x0001, ioaddr + CSR0);
+   pci_read_config_dword(tp->pdev, PCI_COMMAND, &i);  /* flush write */
udelay(100);
 
/* Deassert reset.
   Wait the specified 50 PCI cycles after a reset by initializing
   Tx and Rx queues and the address filter list. */
iowrite32(tp->csr0, ioaddr + CSR0);
+   pci_read_config_dword(tp->pdev, PCI_COMMAND, &i);  /* flush write */
udelay(100);
 
if (tulip_debug > 1)
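
Why the read-back matters can be shown with a toy model. This is a sketch under assumptions: bus_sketch, posted_write and flushing_read are hypothetical names, and real PCI bridges are more involved. A posted write may sit in a bridge buffer while the CPU carries on, so a udelay() started right after the write may not cover the time the device actually sees the reset bit. A read that goes through the same path forces the pending write out first.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a device register behind a posting bridge. */
struct bus_sketch {
	uint32_t device_reg;	/* what the device actually sees */
	uint32_t posted_val;	/* write still sitting in the bridge buffer */
	int posted;		/* non-zero while a write is pending */
};

/* The CPU-side write returns immediately; the device is not updated yet. */
static void posted_write(struct bus_sketch *bus, uint32_t val)
{
	bus->posted_val = val;
	bus->posted = 1;
}

/* A read through the same bridge pushes pending writes out first. */
static uint32_t flushing_read(struct bus_sketch *bus)
{
	if (bus->posted) {
		bus->device_reg = bus->posted_val;
		bus->posted = 0;
	}
	return bus->device_reg;
}
```

In this model, a delay is only meaningful once flushing_read() has been called after posted_write().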



[patch 07/10] [TULIP] Use tulip.h in winbond-840.c

2006-09-08 Thread Valerie Henson
From: Grant Grundler [EMAIL PROTECTED]

Include tulip.h in winbond-840.c and clean up lots of redundant
definitions.

Signed-off-by: Grant Grundler [EMAIL PROTECTED]
Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/winbond-840.c |   68 ++--
 1 files changed, 24 insertions(+), 44 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/winbond-840.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/winbond-840.c
@@ -90,10 +90,8 @@ static int full_duplex[MAX_UNITS] = {-1,
Making the Tx ring too large decreases the effectiveness of channel
bonding and packet priority.
There are no ill effects from too-large receive rings. */
-#define TX_RING_SIZE   16
 #define TX_QUEUE_LEN   10  /* Limit ring entries actually used.  */
 #define TX_QUEUE_LEN_RESTART   5
-#define RX_RING_SIZE   32
 
 #define TX_BUFLIMIT(1024-128)
 
@@ -137,6 +135,8 @@ static int full_duplex[MAX_UNITS] = {-1,
 #include <asm/io.h>
 #include <asm/irq.h>
 
+#include "tulip.h"
+
 /* These identify the driver base version and may not be removed. */
 static char version[] =
 KERN_INFO DRV_NAME .c:v DRV_VERSION  (2.4 port)  DRV_RELDATE   Donald 
Becker [EMAIL PROTECTED]\n
@@ -242,8 +242,8 @@ static const struct pci_id_info pci_id_t
 };
 
 /* This driver was written to use PCI memory space, however some x86 systems
-   work only with I/O space accesses.  Pass -DUSE_IO_OPS to use PCI I/O space
-   accesses instead of memory space. */
+   work only with I/O space accesses. See CONFIG_TULIP_MMIO in .config
+*/
 
 /* Offsets to the Command and Status Registers, CSRs.
While similar to the Tulip, these registers are longword aligned.
@@ -261,21 +261,11 @@ enum w840_offsets {
CurTxDescAddr=0x4C, CurTxBufAddr=0x50,
 };
 
-/* Bits in the interrupt status/enable registers. */
-/* The bits in the Intr Status/Enable registers, mostly interrupt sources. */
-enum intr_status_bits {
-   NormalIntr=0x1, AbnormalIntr=0x8000,
-   IntrPCIErr=0x2000, TimerInt=0x800,
-   IntrRxDied=0x100, RxNoBuf=0x80, IntrRxDone=0x40,
-   TxFIFOUnderflow=0x20, RxErrIntr=0x10,
-   TxIdle=0x04, IntrTxStopped=0x02, IntrTxDone=0x01,
-};
-
 /* Bits in the NetworkConfig register. */
 enum rx_mode_bits {
-   AcceptErr=0x80, AcceptRunt=0x40,
-   AcceptBroadcast=0x20, AcceptMulticast=0x10,
-   AcceptAllPhys=0x08, AcceptMyPhys=0x02,
+   AcceptErr=0x80,
+   RxAcceptBroadcast=0x20, AcceptMulticast=0x10,
+   RxAcceptAllPhys=0x08, AcceptMyPhys=0x02,
 };
 
 enum mii_reg_bits {
@@ -297,13 +287,6 @@ struct w840_tx_desc {
u32 buffer1, buffer2;
 };
 
-/* Bits in network_desc.status */
-enum desc_status_bits {
-   DescOwn=0x8000, DescEndRing=0x0200, DescUseLink=0x0100,
-   DescWholePkt=0x6000, DescStartPkt=0x2000, DescEndPkt=0x4000,
-   DescIntr=0x8000,
-};
-
 #define MII_CNT1 /* winbond only supports one MII */
 struct netdev_private {
struct w840_rx_desc *rx_ring;
@@ -371,7 +354,6 @@ static int __devinit w840_probe1 (struct
int irq;
int i, option = find_cnt < MAX_UNITS ? options[find_cnt] : 0;
void __iomem *ioaddr;
-   int bar = 1;
 
i = pci_enable_device(pdev);
if (i) return i;
@@ -393,10 +375,8 @@ static int __devinit w840_probe1 (struct
 
if (pci_request_regions(pdev, DRV_NAME))
goto err_out_netdev;
-#ifdef USE_IO_OPS
-   bar = 0;
-#endif
-   ioaddr = pci_iomap(pdev, bar, netdev_res_size);
+
+   ioaddr = pci_iomap(pdev, TULIP_BAR, netdev_res_size);
if (!ioaddr)
goto err_out_free_res;
 
@@ -838,7 +818,7 @@ static void init_rxtx_rings(struct net_d
np->rx_buf_sz,PCI_DMA_FROMDEVICE);
 
np->rx_ring[i].buffer1 = np->rx_addr[i];
-   np->rx_ring[i].status = DescOwn;
+   np->rx_ring[i].status = DescOwned;
}
 
np->cur_rx = 0;
@@ -923,7 +903,7 @@ static void init_registers(struct net_de
}
 #elif defined(__powerpc__) || defined(__i386__) || defined(__alpha__) || 
defined(__ia64__) || defined(__x86_64__)
i |= 0xE000;
-#elif defined(__sparc__)
+#elif defined(__sparc__) || defined (CONFIG_PARISC)
i |= 0x4800;
 #else
 #warning Processor architecture undefined
@@ -1043,11 +1023,11 @@ static int start_tx(struct sk_buff *skb,
 
/* Now acquire the irq spinlock.
 * The difficult race is the the ordering between
-* increasing np->cur_tx and setting DescOwn:
+* increasing np->cur_tx and setting DescOwned:
 * - if np->cur_tx is increased first the interrupt
 *   handler could consider the packet as transmitted
-*   since DescOwn is cleared.
-* - If DescOwn is set first the NIC could report the
+*   since DescOwned is cleared.
+* - If 

[patch 03/10] [TULIP] Make DS21143 printout match lspci output

2006-09-08 Thread Valerie Henson
From: Thibaut Varene [EMAIL PROTECTED]

Signed-off-by: Thibaut Varene [EMAIL PROTECTED]
Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/tulip_core.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -147,7 +147,7 @@ struct tulip_chip_table tulip_tbl[] = {
HAS_MII | HAS_MEDIA_TABLE | CSR12_IN_SROM | HAS_PCI_MWI, tulip_timer },
 
   /* DC21142, DC21143 */
-  { "Digital DS21143 Tulip", 128, 0x0801fbff,
+  { "Digital DS21142/43 Tulip", 128, 0x0801fbff,
HAS_MII | HAS_MEDIA_TABLE | ALWAYS_CHECK_MII | HAS_ACPI | HAS_NWAY
| HAS_INTR_MITIGATION | HAS_PCI_MWI, t21142_timer },
 



[patch 09/10] [TULIP] Update tulip version

2006-09-08 Thread Valerie Henson
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/tulip_core.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -17,11 +17,11 @@
 
 #define DRV_NAME   "tulip"
 #ifdef CONFIG_TULIP_NAPI
-#define DRV_VERSION    "1.1.14-NAPI" /* Keep at least for test */
+#define DRV_VERSION    "1.1.15-NAPI" /* Keep at least for test */
 #else
-#define DRV_VERSION    "1.1.14"
+#define DRV_VERSION    "1.1.15"
 #endif
-#define DRV_RELDATE    "May 6, 2006"
+#define DRV_RELDATE    "Aug 23, 2006"
 
 
 #include <linux/module.h>



[patch 08/10] [TULIP] Handle pci_enable_device() errors in resume

2006-09-08 Thread Valerie Henson
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Cc: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/de2104x.c |   16 ++--
 drivers/net/tulip/tulip_core.c  |5 -
 drivers/net/tulip/winbond-840.c |   12 
 3 files changed, 22 insertions(+), 11 deletions(-)

--- linux-2.6.18-rc4-mm1-tulip.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1-tulip/drivers/net/tulip/tulip_core.c
@@ -1780,7 +1780,10 @@ static int tulip_resume(struct pci_dev *
pci_set_power_state(pdev, PCI_D0);
pci_restore_state(pdev);
 
-   pci_enable_device(pdev);
+   if ((retval = pci_enable_device(pdev))) {
+   printk (KERN_ERR "tulip: pci_enable_device failed in resume\n");
+   return retval;
+   }
 
if ((retval = request_irq(dev->irq, tulip_interrupt, IRQF_SHARED, dev->name, dev))) {
printk (KERN_ERR "tulip: request_irq failed in resume\n");
--- linux-2.6.18-rc4-mm1-tulip.orig/drivers/net/tulip/winbond-840.c
+++ linux-2.6.18-rc4-mm1-tulip/drivers/net/tulip/winbond-840.c
@@ -1626,14 +1626,18 @@ static int w840_resume (struct pci_dev *
 {
struct net_device *dev = pci_get_drvdata (pdev);
struct netdev_private *np = netdev_priv(dev);
+   int retval = 0;
 
rtnl_lock();
if (netif_device_present(dev))
goto out; /* device not suspended */
if (netif_running(dev)) {
-   pci_enable_device(pdev);
-   /*  pci_power_on(pdev); */
-
+   if ((retval = pci_enable_device(pdev))) {
+   printk (KERN_ERR
+   "%s: pci_enable_device failed in resume\n",
+   dev->name);
+   goto out;
+   }
spin_lock_irq(&np->lock);
iowrite32(1, np->base_addr+PCIBusCfg);
ioread32(np->base_addr+PCIBusCfg);
@@ -1651,7 +1655,7 @@ static int w840_resume (struct pci_dev *
}
 out:
rtnl_unlock();
-   return 0;
+   return retval;
 }
 #endif
 
--- linux-2.6.18-rc4-mm1-tulip.orig/drivers/net/tulip/de2104x.c
+++ linux-2.6.18-rc4-mm1-tulip/drivers/net/tulip/de2104x.c
@@ -2138,17 +2138,21 @@ static int de_resume (struct pci_dev *pd
 {
struct net_device *dev = pci_get_drvdata (pdev);
struct de_private *de = dev->priv;
+   int retval = 0;
 
rtnl_lock();
if (netif_device_present(dev))
goto out;
-   if (netif_running(dev)) {
-   pci_enable_device(pdev);
-   de_init_hw(de);
-   netif_device_attach(dev);
-   } else {
-   netif_device_attach(dev);
+   if (!netif_running(dev))
+   goto out_attach;
+   if ((retval = pci_enable_device(pdev))) {
+   printk (KERN_ERR "%s: pci_enable_device failed in resume\n",
+   dev->name);
+   goto out;
}
+   de_init_hw(de);
+out_attach:
+   netif_device_attach(dev);
 out:
rtnl_unlock();
return 0;
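
The error-handling shape these resume hunks move towards can be sketched in plain C. The names dev_sketch, enable_device_sketch and resume_sketch are hypothetical and the "device" is a struct with flags. A failing enable step skips the re-attach but still goes through the common unlock, and the error code is returned from the single exit, as the w840_resume hunk does with retval.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a device being resumed. */
struct dev_sketch {
	int enable_result;	/* what the fake enable step will return */
	int locked;		/* models rtnl_lock()/rtnl_unlock() */
	int attached;		/* models netif_device_attach() */
};

static int enable_device_sketch(struct dev_sketch *d)
{
	return d->enable_result;
}

static int resume_sketch(struct dev_sketch *d)
{
	int retval;

	d->locked = 1;
	retval = enable_device_sketch(d);
	if (retval)
		goto out;	/* skip re-attach, but still unlock */
	d->attached = 1;
out:
	d->locked = 0;
	return retval;		/* propagate the error instead of a fixed 0 */
}
```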



[patch 05/10] [TULIP] Defer tulip_select_media() to process context

2006-09-08 Thread Valerie Henson
From: Francois Romieu [EMAIL PROTECTED]

Move tulip_select_media() processing to a workqueue, instead of
delaying in interrupt context, edited by Kyle McMartin to use kevent
thread, instead of creating its own workqueue.
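
The deferral pattern, reduced to a userspace sketch: work_sketch, schedule_work_sketch, run_pending_sketch and media_task_sketch are made-up names, and the kernel workqueue API is not used here. The timer path only marks the work as pending; the slow media selection runs later, when a worker in process context picks it up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical deferred-work item. */
struct work_sketch {
	void (*func)(void *data);
	void *data;
	int pending;
};

/* Called from the (simulated) timer path: cheap, never runs the task. */
static void schedule_work_sketch(struct work_sketch *w)
{
	w->pending = 1;		/* scheduling twice still runs it once */
}

/* Called later from (simulated) process context, where sleeping is fine.
 * Returns 1 if the task ran, 0 if nothing was pending. */
static int run_pending_sketch(struct work_sketch *w)
{
	if (!w->pending)
		return 0;
	w->pending = 0;
	w->func(w->data);
	return 1;
}

/* Stand-in for the media selection task; it only counts its runs. */
static void media_task_sketch(void *data)
{
	int *runs = data;

	(*runs)++;
}
```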

Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/21142.c  |4 +-
 drivers/net/tulip/timer.c  |   14 +++-
 drivers/net/tulip/tulip.h  |   19 ++--
 drivers/net/tulip/tulip_core.c |   64 +++--
 4 files changed, 60 insertions(+), 41 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/21142.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/21142.c
@@ -26,9 +26,9 @@ static u16 t21142_csr15[] = { 0x0008, 0x
 
 /* Handle the 21143 uniquely: do autoselect with NWay, not the EEPROM list
of available transceivers.  */
-void t21142_timer(unsigned long data)
+void t21142_media_task(void *data)
 {
-   struct net_device *dev = (struct net_device *)data;
+   struct net_device *dev = data;
struct tulip_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->base_addr;
int csr12 = ioread32(ioaddr + CSR12);
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/timer.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/timer.c
@@ -18,13 +18,14 @@
 #include "tulip.h"
 
 
-void tulip_timer(unsigned long data)
+void tulip_media_task(void *data)
 {
-   struct net_device *dev = (struct net_device *)data;
+   struct net_device *dev = data;
struct tulip_private *tp = netdev_priv(dev);
void __iomem *ioaddr = tp->base_addr;
u32 csr12 = ioread32(ioaddr + CSR12);
int next_tick = 2*HZ;
+   unsigned long flags;
 
if (tulip_debug > 2) {
printk(KERN_DEBUG "%s: Media selection tick, %s, status %8.8x mode"
@@ -126,6 +127,15 @@ void tulip_timer(unsigned long data)
}
break;
}
+
+
+   spin_lock_irqsave(&tp->lock, flags);
+   if (tp->timeout_recovery) {
+   tulip_tx_timeout_complete(tp, ioaddr);
+   tp->timeout_recovery = 0;
+   }
+   spin_unlock_irqrestore(&tp->lock, flags);
+
/* mod_timer synchronizes us with potential add_timer calls
 * from interrupts.
 */
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip.h
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip.h
@@ -44,7 +44,8 @@ struct tulip_chip_table {
int io_size;
int valid_intrs;/* CSR7 interrupt enable settings */
int flags;
-   void (*media_timer) (unsigned long data);
+   void (*media_timer) (unsigned long);
+   void (*media_task) (void *);
 };
 
 
@@ -366,6 +367,7 @@ struct tulip_private {
unsigned int medialock:1;   /* Don't sense media type. */
unsigned int mediasense:1;  /* Media sensing in progress. */
unsigned int nway:1, nwayset:1; /* 21143 internal NWay. */
+   unsigned int timeout_recovery:1;
unsigned int csr0;  /* CSR0 setting. */
unsigned int csr6;  /* Current CSR6 control settings. */
unsigned char eeprom[EEPROM_SIZE];  /* Serial EEPROM contents. */
@@ -384,6 +386,7 @@ struct tulip_private {
void __iomem *base_addr;
int csr12_shadow;
int pad0;   /* Used for 8-byte alignment */
+   struct work_struct media_work;
 };
 
 
@@ -398,7 +401,7 @@ struct eeprom_fixup {
 
 /* 21142.c */
 extern u16 t21142_csr14[];
-void t21142_timer(unsigned long data);
+void t21142_media_task(void *data);
 void t21142_start_nway(struct net_device *dev);
 void t21142_lnk_change(struct net_device *dev, int csr5);
 
@@ -436,7 +439,7 @@ void pnic_lnk_change(struct net_device *
 void pnic_timer(unsigned long data);
 
 /* timer.c */
-void tulip_timer(unsigned long data);
+void tulip_media_task(void *data);
 void mxic_timer(unsigned long data);
 void comet_timer(unsigned long data);
 
@@ -488,4 +491,14 @@ static inline void tulip_restart_rxtx(st
tulip_start_rxtx(tp);
 }
 
+static inline void tulip_tx_timeout_complete(struct tulip_private *tp, void __iomem *ioaddr)
+{
+   /* Stop and restart the chip's Tx processes. */
+   tulip_restart_rxtx(tp);
+   /* Trigger an immediate transmit demand. */
+   iowrite32(0, ioaddr + CSR1);
+
+   tp->stats.tx_errors++;
+}
+
 #endif /* __NET_TULIP_H__ */
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -130,7 +130,14 @@ int tulip_debug = TULIP_DEBUG;
 int tulip_debug = 1;
 #endif
 
+static void tulip_timer(unsigned long data)
+{
+   struct net_device *dev = (struct net_device *)data;
+   struct tulip_private *tp = netdev_priv(dev);
 
+   if (netif_running(dev))
+   schedule_work(&tp->media_work);
+}
 
 /*
  * This table use during operation for capabilities and media timer.
@@ -144,59 +151,60 @@ struct tulip_chip_table 

[patch 10/10] [TULIP] Update winbond840.c version

2006-09-08 Thread Valerie Henson
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/winbond-840.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/winbond-840.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/winbond-840.c
@@ -45,8 +45,8 @@
 */
 
 #define DRV_NAME   "winbond-840"
-#define DRV_VERSION    "1.01-d"
-#define DRV_RELDATE    "Nov-17-2001"
+#define DRV_VERSION    "1.01-e"
+#define DRV_RELDATE    "Aug-23-2006"
 
 
 /* Automatically extracted configuration info:



[patch 06/10] [TULIP] Clean up tulip.h

2006-09-08 Thread Valerie Henson
From: Grant Grundler [EMAIL PROTECTED]

Update/cleanup some definitions in tulip.h and tulip_core.c.

Signed-off-by: Grant Grundler [EMAIL PROTECTED]
Signed-off-by: Kyle McMartin [EMAIL PROTECTED]
Signed-off-by: Valerie Henson [EMAIL PROTECTED]
Signed-off-by: Jeff Garzik [EMAIL PROTECTED]

---
 drivers/net/tulip/tulip.h  |   17 +++--
 drivers/net/tulip/tulip_core.c |7 ++-
 2 files changed, 13 insertions(+), 11 deletions(-)

--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip.h
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip.h
@@ -30,11 +30,10 @@
 /* undefine, or define to various debugging levels (4 == obscene levels) */
 #define TULIP_DEBUG 1
 
-/* undefine USE_IO_OPS for MMIO, define for PIO */
 #ifdef CONFIG_TULIP_MMIO
-# undef USE_IO_OPS
+#define TULIP_BAR  1   /* CBMA */
 #else
-# define USE_IO_OPS 1
+#define TULIP_BAR  0   /* CBIO */
 #endif
 
 
@@ -143,6 +142,7 @@ enum status_bits {
RxNoBuf = 0x80,
RxIntr = 0x40,
TxFIFOUnderflow = 0x20,
+   RxErrIntr = 0x10,
TxJabber = 0x08,
TxNoBuf = 0x04,
TxDied = 0x02,
@@ -193,9 +193,14 @@ struct tulip_tx_desc {
 
 
 enum desc_status_bits {
-   DescOwned = 0x8000,
-   RxDescFatalErr = 0x8000,
-   RxWholePkt = 0x0300,
+   DescOwned= 0x8000,
+   DescWholePkt = 0x6000,
+   DescEndPkt   = 0x4000,
+   DescStartPkt = 0x2000,
+   DescEndRing  = 0x0200,
+   DescUseLink  = 0x0100,
+   RxDescFatalErr = 0x008000,
+   RxWholePkt   = 0x0300,
 };
 
 
--- linux-2.6.18-rc4-mm1.orig/drivers/net/tulip/tulip_core.c
+++ linux-2.6.18-rc4-mm1/drivers/net/tulip/tulip_core.c
@@ -1369,11 +1369,8 @@ static int __devinit tulip_init_one (str
if (pci_request_regions (pdev, "tulip"))
goto err_out_free_netdev;
 
-#ifndef USE_IO_OPS
-   ioaddr =  pci_iomap(pdev, 1, tulip_tbl[chip_idx].io_size);
-#else
-   ioaddr =  pci_iomap(pdev, 0, tulip_tbl[chip_idx].io_size);
-#endif
+   ioaddr =  pci_iomap(pdev, TULIP_BAR, tulip_tbl[chip_idx].io_size);
+
if (!ioaddr)
goto err_out_free_res;
 



Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Segher Boessenkool

I've been chasing with Segher a data corruption problem lately.
Basically transferring huge amount of data (several Gb) and I get
corrupted data at the rx side. I cannot tell for sure whether what
I've been observing here is the same problem that Segher's been seeing
on his blades, he will confirm or not. He also seemed to imply that
reverting to an older kernel on the -receiver- side fixed it, which
makes me wonder, since it looks really like a sending side problem
(see explanation below), if some change in, for example, window
scaling, might hide or trigger it.


Please send me lspci and tg3 probing output so that I know what
tg3 hardware you're using.


I use a 5780 rev. A3, but the problem is not limited to this chip.


I also want to look at the tcpdump or
ethereal on the mirrored port that shows the packet being corrupted.


I don't have such, sorry.


That's all the data I have at this point. I can't guarantee 100% that
it's a TSO bug (it might be a bug that TSO renders visible
due to timing
effects) but it looks like it since I've not reproduced yet with TSO
disabled.


It seems indeed to be only exposed by TSO, not actually a
bug of it /an sich/.

I've got a patch that seems to solve the problem, it needs more testing
though (maybe Ben can do this :-) ).  The problem is that there should
be quite a few wmb()'s in the code that are just not there; adding some
to tg3_set_txd() seems to fix the immediate problem but more is needed
(and I don't see why those should be needed, unless tg3_set_txd() is
updating a live ring entry in place or something like that).

More testing is needed, but the problem is definitely the lack of memory
ordering.


Segher



Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Michael Chan
On Fri, 2006-09-08 at 21:29 +0200, Segher Boessenkool wrote:

 I've got a patch that seems to solve the problem, it needs more testing
 though (maybe Ben can do this :-) ).  The problem is that there should
 be quite a few wmb()'s in the code that are just not there; adding some
 to tg3_set_txd() seems to fix the immediate problem but more is needed
 (and I don't see why those should be needed, unless tg3_set_txd() is
 updating a live ring entry in place or something like that).
 
 More testing is needed, but the problem is definitely the lack of memory
 ordering.
 
Oh, we know about this.  The powerpc writel() used to have memory
barriers in 2.4 kernels but not any more in 2.6 kernels.  Red Hat's
version of tg3 has extra wmb()'s to fix this problem.  David doesn't
think that the upstream version of tg3 should have these wmb()'s, and
the problem should instead be fixed in powerpc's writel().
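
For readers following along, the ordering at stake looks roughly like this. It is a userspace analogy only: txd_sketch, ring_sketch and queue_tx_sketch are invented names, and a C11 release fence stands in for wmb(). It orders ordinary memory between threads and says nothing about MMIO or DMA, which is exactly the part writel() is expected to cover. The descriptor contents must be visible before the producer index (the doorbell) tells the consumer to go and read them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TX descriptor and ring; not the tg3 layout. */
struct txd_sketch {
	uint64_t addr;
	uint32_t len_flags;
};

struct ring_sketch {
	struct txd_sketch desc[4];
	_Atomic uint32_t producer;	/* stands in for the mailbox/doorbell */
};

static void ring_init_sketch(struct ring_sketch *r)
{
	memset(r->desc, 0, sizeof(r->desc));
	atomic_init(&r->producer, 0);
}

static void queue_tx_sketch(struct ring_sketch *r, uint64_t addr,
			    uint32_t len_flags)
{
	uint32_t idx = atomic_load_explicit(&r->producer,
					    memory_order_relaxed);
	struct txd_sketch *d = &r->desc[idx % 4];

	/* Fill in the descriptor first ... */
	d->addr = addr;
	d->len_flags = len_flags;

	/* ... then make sure those stores are ordered before the index
	 * update.  This fence plays the role wmb() plays in the driver. */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&r->producer, idx + 1, memory_order_relaxed);
}
```

Without the fence, a weakly ordered CPU is free to make the index update visible before the descriptor stores, and the consumer then reads a stale descriptor.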



Re: RFC/T: Possible fix for bcm43xx periodic work bug

2006-09-08 Thread Michael Buesch
On Friday 08 September 2006 15:25, Larry Finger wrote:
 Michael Buesch wrote:
  On Thursday 07 September 2006 20:17, Larry Finger wrote:
  Hi all,
 
  I think I have a fix for the bcm43xx bug that leads to NETDEV WATCHDOG tx
  timeouts and would like it to get as much testing as possible as this bug
  affects V2.6.18-rcX. If the problem is truly fixed, I hope to get the fix
  into mainline before release of the bug into the stable series.
 
  I got the idea for the fix when I discovered that the timeout interval
  used for the watchdog is the default value of 5 seconds. Obviously, the
  few milliseconds used in the periodic work handler weren't causing us to
  just miss.
 
  To exacerbate the problem, I changed the repeat timer for periodic work
  from 15 to 1 sec. I also set BADNESS_LIMIT to 0. As a result, I was
  running the problem code once per second instead of once per minute. Now
  failures would occur in minutes instead of hours.
 
  Operating from the premise that the DMA needed some time to reach the
  idle state after the MAC was suspended, I tried various delays, but
  nothing worked.
 
  Then I decided to test the premise that the problem was associated with
  shutting down and restarting the network. That led to the current patch,
  which has run for what is effectively 100 times longer than previous
  versions.
 
  Larry
  ---
 
 
  Index: wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
  ===
  --- wireless-2.6.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
  +++ wireless-2.6/drivers/net/wireless/bcm43xx/bcm43xx_main.c
  @@ -3244,8 +3244,6 @@ static void bcm43xx_periodic_work_handle
  * be preemtible.
  */
 mutex_lock(&bcm->mutex);
  -  netif_stop_queue(bcm->net_dev);
  -  synchronize_net();
 spin_lock_irqsave(&bcm->irq_lock, flags);
 bcm43xx_mac_suspend(bcm);
 if (bcm43xx_using_pio(bcm))
  @@ -3270,7 +3268,6 @@ static void bcm43xx_periodic_work_handle
 if (bcm43xx_using_pio(bcm))
 bcm43xx_pio_thaw_txqueues(bcm);
 bcm43xx_mac_enable(bcm);
  -  netif_wake_queue(bcm->net_dev);
 }
 mmiowb();
 spin_unlock_irqrestore(&bcm->irq_lock, flags);
  
  The real question is: Why does this patch help?
  
  Let's explain it. We don't stop networking just for fun there.
  While executing long preemptible periodic work, we must ensure
  that the TX path into the driver is not entered. It's the same
  reason why we disable IRQs in the first place. We can't take the
  mutex in the TX path and the IRQ handler. (Those are the only places
  where we can't take the mutex.)
  In short: we must stop netif here.
  The question is: Why does stopping netif queue cause a watchdog
  trigger here? The maximum time it can take for the periodic
  work inside of the critical section is about 0.2sec. So the queue
  is stopped for about 0.2sec max. Why does the watchdog trigger?
  Any idea from some networking guru?
  Could synchronize_net() take over 5sec in some worst case? Why?
  Questions over questions :D
 
 This may be a stupid question, but does the synchronize_net call belong?
 
 The reason I ask is because the code for synchronize_net is

synchronize_net() ensures that all currently running TX handlers
complete before returning from synchronize_net(). That's what I have
been told.
So what we do is: Disable TX queue and wait for any running TX queue
to finish. We must do this to make sure no TX handler can run after
the sync. I previously explained the exact reasons.
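
A minimal userspace sketch of the "stop the queue, then wait for running TX handlers" pattern described above. This is only an analogy under stated assumptions: every `model_*` name is hypothetical, and C11 atomics stand in for what the kernel does in netif_stop_queue() and synchronize_net(), which are implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static atomic_bool queue_stopped;   /* models the stopped TX queue */
static atomic_int  tx_in_flight;    /* TX handlers currently running */

/* TX path entry: announce ourselves first, then check the stop flag. */
static bool model_tx_enter(void)
{
    atomic_fetch_add(&tx_in_flight, 1);
    if (atomic_load(&queue_stopped)) {
        atomic_fetch_sub(&tx_in_flight, 1);
        return false;               /* queue is stopped, back off */
    }
    return true;
}

/* TX path exit: must pair with a successful model_tx_enter(). */
static void model_tx_exit(void)
{
    int prev = atomic_fetch_sub(&tx_in_flight, 1);
    assert(prev > 0);
    (void)prev;
}

/* Models netif_stop_queue() followed by synchronize_net(). */
static void model_stop_and_sync(void)
{
    atomic_store(&queue_stopped, true);
    while (atomic_load(&tx_in_flight) > 0)
        ;                           /* wait for running handlers to drain */
}

/* Models netif_wake_queue(). */
static void model_wake(void)
{
    atomic_store(&queue_stopped, false);
}
```

Once model_stop_and_sync() returns, no handler is inside the TX path and none can enter until model_wake() is called, which is the property the periodic work handler relies on.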

 When I look through the mutex_lock code, I don't find any rcu code. What did 
 I miss?

This is not about synchronizing bcm43xx, but the net layer.

 BTW, I still got NETDEV tx timeouts with a 30 second timeout.

So, well. I don't think synchronize_net can take 30 seconds.
That would be a big bug in the net layer.

 I'm currently testing with the netif_stop_queue and netif_wake_queue 
 statements restored, but 
 without the synchronize_net. It has run for over 11 hours with the 
 accelerated testing, which would 
 correspond to almost 4 weeks at regular rates.
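
The "almost 4 weeks" figure can be checked with a short calculation. The sketch below assumes a speedup factor of 60, taken from the earlier statement that the problem code ran once per second instead of once per minute; the function name is hypothetical.

```c
#include <assert.h>

/* Wall-clock hours of accelerated testing, converted to the equivalent
 * number of days at the normal rate. 'speedup' is how many times more
 * often the problem code runs during the accelerated test. */
static double equivalent_days(double test_hours, double speedup)
{
    assert(speedup > 0.0);
    return test_hours * speedup / 24.0;
}
```

With 11 hours and a factor of 60 this gives 660 hours, or 27.5 days, which is just under 4 weeks.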

Well, we can brute-force it to death, or we can ask someone who actually has
a clue what is going on. Some networking guru, please help. :)

-- 
Greetings Michael.


Re: [take17 1/4] kevent: Core files.

2006-09-08 Thread Evgeniy Polyakov
On Thu, Sep 07, 2006 at 09:05:16PM -0700, [EMAIL PROTECTED] ([EMAIL PROTECTED]) 
wrote:
  +static int __devinit kevent_user_init(void)
  +{
  +   int err = 0;
  +
  +   kevent_cache = kmem_cache_create("kevent_cache",
  +   sizeof(struct kevent), 0, SLAB_PANIC, NULL, NULL);
  +
  +   err = misc_register(&kevent_miscdev);
  +   if (err) {
  +   printk(KERN_ERR "Failed to register kevent miscdev: err=%d.\n", err);
  +   goto err_out_exit;
  +   }
  +
  +   printk("KEVENT subsystem has been successfully registered.\n");
  +
  +   return 0;
  +
  +err_out_exit:
  +   kmem_cache_destroy(kevent_cache);
  +   return err;
  +}
 
 It's probably best to treat kmem_cache_create like a black box and check for 
 it returning null.

It cannot return NULL; it will panic instead, since I use the SLAB_PANIC
flag.
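
A small userspace sketch of that contract, for illustration only. `MODEL_PANIC`, `struct model_cache` and the `model_cache_*` functions are hypothetical stand-ins, not the slab API; the point is that a "panic on failure" flag means the caller never sees a NULL return.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MODEL_PANIC 0x1u   /* hypothetical flag, stands in for SLAB_PANIC */

struct model_cache {
    size_t obj_size;
};

/* Create a cache. With MODEL_PANIC set, failure aborts the program,
 * so the function either returns a valid pointer or does not return. */
static struct model_cache *model_cache_create(size_t obj_size, unsigned flags)
{
    struct model_cache *c = obj_size ? malloc(sizeof(*c)) : NULL;

    if (!c) {
        if (flags & MODEL_PANIC) {
            fprintf(stderr, "model_cache_create: cannot create cache\n");
            abort();
        }
        return NULL;       /* only reachable without MODEL_PANIC */
    }
    c->obj_size = obj_size;
    return c;
}

static void model_cache_destroy(struct model_cache *c)
{
    free(c);
}
```

A caller that passes the panic flag can skip the NULL check; a caller that does not pass it still has to check.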

 Thanks,
 Shaw

-- 
Evgeniy Polyakov


Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Benjamin Herrenschmidt

 Please send me lspci and tg3 probing output so that I know what
 tg3 hardware you're using.  I also want to look at the tcpdump or
 ethereal on the mirrored port that shows the packet being corrupted.

Hi Michael !

It's the dual controller of an Apple Quad G5, thus afaik in an HT2000
chip:

0001:05:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5780 
Gigabit Ethernet (rev 03)
Subsystem: Apple Computer Inc. Unknown device 0085
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- 
TAbort- MAbort- SERR- PERR-
Latency: 64 (16000ns min)
Interrupt: pin A routed to IRQ 66
Region 0: Memory at fa53 (64-bit, non-prefetchable) [size=64K]
Region 2: Memory at fa52 (64-bit, non-prefetchable) [size=64K]
Capabilities: [40] PCI-X non-bridge device
Command: DPERE- ERO- RBC=512 OST=1
Status: Dev=05:04.0 64bit+ 133MHz+ SCD- USC- DC=simple 
DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
Capabilities: [48] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable+ DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data
Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+ Queue=0/3 
Enable-
Address: 00011aa5c8ce4904  Data: 18d8

0001:05:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5780 
Gigabit Ethernet (rev 03)
Subsystem: Apple Computer Inc. Unknown device 0085
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- 
TAbort- MAbort- SERR- PERR-
Latency: 16 (16000ns min), Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 67
Region 0: Memory at fa51 (64-bit, non-prefetchable) [size=64K]
Region 2: Memory at fa50 (64-bit, non-prefetchable) [size=64K]
Capabilities: [40] PCI-X non-bridge device
Command: DPERE- ERO+ RBC=512 OST=1
Status: Dev=05:04.1 64bit+ 133MHz+ SCD- USC- DC=simple 
DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
Capabilities: [48] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data
Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+ Queue=0/3 
Enable-
Address: 4e001a0002804460  Data: 00a2

And the dmesg bits:

tg3.c:v3.65 (August 07, 2006)
eth0: Tigon3 [partno(BCM95780) rev 8003 PHY(5780)] (PCIX:133MHz:64-bit) 
10/100/1000BaseT Ethernet 00:14:51:65:e6:90
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[76144000] dma_mask[40-bit]
eth1: Tigon3 [partno(BCM95780) rev 8003 PHY(5780)] (PCIX:133MHz:64-bit) 
10/100/1000BaseT Ethernet 00:14:51:65:e6:91
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[76144000] dma_mask[40-bit]

As for the tcpdump output, well, I have a 3Gb file for now :) I need to do a
bit of surgery on it to get only the interesting part. I'll try to do that
later today (but it may have to wait for Monday).

Cheers,
Ben.




Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Michael Chan
On Sat, 2006-09-09 at 07:41 +1000, Benjamin Herrenschmidt wrote:

 As for the tcpdump output, well, I have a 3Gb file for now :) I need to do a 
 bit of surgery on it to
 get only the interesting part. I'll try to do that later today (but it may 
 have to wait for monday).
 
Ben, We probably don't need the tcpdump anymore now that we know it's a
memory ordering issue.



Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Michael Chan
On Sat, 2006-09-09 at 07:46 +1000, Benjamin Herrenschmidt wrote:

 The PowerPC writel has a full sync _after_ the write, mostly to prevent
 it from leaking out of a spinlock, and for ordering it vs. other
 writel's or readl's. It doesn't provide any ordering guarantee vs
 cacheable storage (and was never intended to do so afaik). Such ordering
 shall be provided explicitly. It's possible that 2.4 used a big hammer
 approach but we've since been actively fixing drivers for that. It's to
 be noted that PowerPC might not be the only architecture affected as I
 don't think that in general, you have ordering guarantees between
 cacheable and non-cacheable stores unless you use explicit barriers.

I think 2.4 might have an additional sync before the write which will
guarantee that the buffer descriptor is written before telling the chip
to DMA it.
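
A userspace sketch of the ordering requirement being discussed: the descriptor must be visible before the "go" write that tells the consumer to fetch it. This is an analogy only. C11 fences stand in for a kernel barrier such as wmb(), a real DMA engine is not a C11 thread, and `model_desc`, `model_post` and `model_fetch` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* 'desc' plays the buffer descriptor in cacheable memory, 'doorbell'
 * plays the device register that the driver writes with writel(). */
struct model_desc {
    uint64_t addr;
    uint32_t len;
};

static struct model_desc desc;
static atomic_uint doorbell;

/* Producer (driver side): fill the descriptor, then ring the doorbell. */
static void model_post(uint64_t addr, uint32_t len, unsigned idx)
{
    desc.addr = addr;
    desc.len = len;
    /* Without this fence the doorbell store may become visible first. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&doorbell, idx, memory_order_relaxed);
}

/* Consumer (device side): read the descriptor only after seeing the
 * doorbell value. Returns 1 if a descriptor was fetched, 0 otherwise. */
static int model_fetch(unsigned idx, struct model_desc *out)
{
    if (atomic_load_explicit(&doorbell, memory_order_relaxed) != idx)
        return 0;
    atomic_thread_fence(memory_order_acquire);
    *out = desc;
    return 1;
}
```

The disagreement in this thread is about where the equivalent of the release fence should live: inside writel() itself, or explicitly in the driver before the doorbell write.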

 
 Thus I disagree with fixing the powerpc writel(). The barriers shall
 definitely go into tg3.
 

You'll have to take this up with David.



Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Benjamin Herrenschmidt
On Fri, 2006-09-08 at 15:07 -0700, Michael Chan wrote:
 On Sat, 2006-09-09 at 07:41 +1000, Benjamin Herrenschmidt wrote:
 
  As for the tcpdump output, well, I have a 3Gb file for now :) I need to do 
  a bit of surgery on it to
  get only the interesting part. I'll try to do that later today (but it may 
  have to wait for monday).
  
 Ben, We probably don't need the tcpdump anymore now that we know it's a
 memory ordering issue.

OK. I'm trying to figure out the best way to fix that. I can
see the flamewar coming on whether stores to memory vs. writel shall be
ordered or not :)

I'm very reluctant to add another sync instruction to our writel though.
It needs one already after the stores to prevent leaking out of
spinlocks (and thus possible mmio vs. mmio order issues on SMP with
stores from different CPUs being re-ordered). Fixing the above would
require one before the store as well. We already pay a pretty high price
for that sync, having 2 would be a real shame.

(Unfortunately, there is no cheap barrier available for ordering
cacheable vs. non cacheable storage on PowerPC, they are completely
separate domains).

One option I was discussing with others would be to drop that sync after
the store, and instead start requiring drivers to use mmiowb() (as
defined by the ia64 folks) to provide ordering of writel's vs. locks.
But that probably means breaking and then having to fix a whole bunch of
drivers in the tree that haven't been updated to use it...

I'd rather not have to do that, or even if I go that way, not have to
add that sync at all before the store and thus get back the few percent
of performance lost to those syncs on some heavy I/O benchmarks.

Ben.




Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Michael Chan
On Sat, 2006-09-09 at 08:25 +1000, Benjamin Herrenschmidt wrote:
 Ok. I'm trying to figure out what's the best way with fixing that. I can
 see the flamewar coming on whether stores to memory vs. writel shall be
 ordered or not :)
 
 I'm very reluctant to add another sync instruction to our writel though.
 It needs one already after the stores to prevent leaking out of
 spinlocks (and thus possible mmio vs. mmio order issues on SMP with
 stores from different CPUs being re-ordered). Fixing the above would
 require one before the store as well. We already pay a pretty high price
 for that sync, having 2 would be a real shame.
 
 (Unfortunately, there is no cheap barrier available for ordering
 cacheable vs. non cacheable storage on PowerPC, they are completely
 separate domains).
 
 One option I was discussing with others would be to drop that sync after
 the store, and instead start requiring drivers to use mmiowb() (as
 defined by the ia64 folks) to provide ordering of writel's vs. locks.
 But that probably means breaking and then having to fix a whole bunch of
 drivers in the tree that haven't been updated to use it...
 
 I'd rather not have to do that, or even if I go that way, not have to
 add that sync at all before the store and thus get back the few percent
 of perfs lost due to those sync's on some heavy IO benchmarks.
 
Another way to fix this without requiring drivers to add all kinds of
barriers in the driver code is to add a writel_sync() variant.  So on
powerpc, writel_sync() will have a sync before and after the write.  On
most other architectures, writel_sync() is the same as writel() if the
ordering is guaranteed.  We'll then convert tg3 and other drivers to use
writel_sync() in places where they're needed.
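
A rough sketch of what the two accessor flavours might look like, modelled in userspace. All names here are hypothetical (writel_sync() is only a proposal in this thread), `model_sync()` stands in for the PowerPC sync instruction, and the counter exists just to make the extra cost visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static unsigned sync_count;     /* number of heavy barriers issued so far */

/* Stand-in for the PowerPC 'sync' instruction. */
static void model_sync(void)
{
    atomic_thread_fence(memory_order_seq_cst);
    sync_count++;
}

/* Models today's powerpc writel(): one sync after the store, which keeps
 * the MMIO store from leaking out of a spinlock. */
static void model_writel(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
    model_sync();
}

/* Models the proposed writel_sync(): an extra sync before the store so
 * that earlier stores to memory are ordered before the MMIO store. */
static void model_writel_sync(uint32_t val, volatile uint32_t *reg)
{
    model_sync();
    *reg = val;
    model_sync();
}
```

Under this model only the call sites that publish memory to the device (for example the write that hands a descriptor to the chip) would pay for the second barrier; every other register write keeps the cheaper variant.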



Re: TG3 data corruption (TSO ?)

2006-09-08 Thread Benjamin Herrenschmidt

  I'd rather not have to do that, or even if I go that way, not have to
  add that sync at all before the store and thus get back the few percent
  of perfs lost due to those sync's on some heavy IO benchmarks.
  
 Another way to fix this without requiring drivers to add all kinds of
 barriers in the driver code is to add a writel_sync() variant.  So on
 powerpc, writel_sync() will have a sync before and after the write.  On
 most other architectures, writel_sync() is the same as writel() if the
 ordering is guaranteed.  We'll then convert tg3 and other drivers to use
 writel_sync() in places where they're needed.

I think the preferred approach for that sort of thing is to have writel
be the sync version and add a special relaxed version. Now there have
been talks and debates about relaxed IOs but they generally map to
something different, typically IOs that are relaxed vs. DMA (PCI-X/PCIe
relaxed ordering options for example).

Adding yet another round of IO accessors sounds a bit nasty to me;
driver writers will potentially not understand which ones to use, etc.

Anyway, I think I'll let Anton and Paulus argue that one for now.

Cheers,
Ben.




Re: e100 fails, eepro100 works

2006-09-08 Thread Jan Kiszka
Auke Kok wrote:
 Can you include a full `dmesg` and `lspci -vv -s 00:12.0`?
 
 Also you're using 3.5.10-k2, can you try the current git tree version
 instead? I can send you the e100.c if wanted.

Yes, please, to make sure that we'll really discuss the same version.
Will then try to collect the additional information on Monday.

Jan





Re: RFC/T: Possible fix for bcm43xx periodic work bug

2006-09-08 Thread Erik Mouw
On Fri, Sep 08, 2006 at 11:45:27AM +0200, Michael Buesch wrote:
 The crash is fixed in wireless-2.6.
 The actual cause of the controller restart is not. So at least it
 does not crash anymore.

Thanks for the information, pulled wireless-2.6 and recompiling kernel.
If this really fixes the problem, can we try to get it merged before
2.6.18 closes? I don't know if vanilla 2.6.18-rc6 locks up on other
hardware as well, but if it does it would be a major regression against
2.6.17.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


Re: RFC/T: Possible fix for bcm43xx periodic work bug

2006-09-08 Thread Larry Finger

Erik Mouw wrote:

On Fri, Sep 08, 2006 at 11:45:27AM +0200, Michael Buesch wrote:

The crash is fixed in wireless-2.6.
The actual cause of the controller restart is not. So at least it
does not crash anymore.


Thanks for the information, pulled wireless-2.6 and recompiling kernel.
If this really fixes the problem, can we try to get it merged before
2.6.18 closes? I don't know if vanilla 2.6.18-rc6 locks up on other
hardware as well, but if it does it would be a major regression against
2.6.17.


I'm trying. At the moment, I'm testing a patch against 2.6.18-rc6. Once it works here, I'll be 
putting it out for testing.


Larry


[PATCH 0/2] RDMA: merge iWARP support

2006-09-08 Thread Roland Dreier
Here is a series of patches that adds iWARP (RDMA over IP) support to
the InfiniBand support already in the kernel.  Since the iWARP RDMA
model is quite close to the InfiniBand model, the changes are not that
large.  The biggest difference is in how connections are established,
since iWARP connections are TCP connections, while IB uses a different
(native IB) mechanism for establishing a connection.

The first patch in the series adds an iWARP connection manager, which
handles establishing and tearing down connections for iWARP devices.
The second patch is all the small changes required to hook in the
connection manager and make the rest of the IB stuff also work with
iWARP devices.  The third patch (compressed due to its size) adds the
first driver for an iWARP device, the Ammasso 1100 1 Gb/sec RNIC.

My current plan is to merge this stuff for 2.6.19.  Please let me know
if you see anything (major or minor) that needs to be fixed up.

Thanks,
  Roland


[PATCH 2/2] RDMA: iWARP changes to IB core

2006-09-08 Thread Roland Dreier
From: Tom Tucker [EMAIL PROTECTED]

Modifications to the existing rdma header files, core files, drivers,
and ulp files to support iWARP, including:
 - Hook iWARP CM into the build system and use it in rdma_cm.
 - Convert enum ib_node_type to enum rdma_node_type, which includes
   the possibility of RDMA_NODE_RNIC, and update everything for this.

Signed-off-by: Tom Tucker [EMAIL PROTECTED]
Signed-off-by: Steve Wise [EMAIL PROTECTED]
Signed-off-by: Roland Dreier [EMAIL PROTECTED]
---
 drivers/infiniband/core/Makefile |4 
 drivers/infiniband/core/addr.c   |   18 +
 drivers/infiniband/core/cache.c  |5 
 drivers/infiniband/core/cm.c |3 
 drivers/infiniband/core/cma.c|  355 +++---
 drivers/infiniband/core/device.c |4 
 drivers/infiniband/core/mad.c|7 -
 drivers/infiniband/core/sa_query.c   |5 
 drivers/infiniband/core/smi.c|   16 +
 drivers/infiniband/core/sysfs.c  |   11 -
 drivers/infiniband/core/ucm.c|3 
 drivers/infiniband/core/user_mad.c   |5 
 drivers/infiniband/core/verbs.c  |   17 +
 drivers/infiniband/hw/ehca/ehca_main.c   |2 
 drivers/infiniband/hw/ipath/ipath_verbs.c|2 
 drivers/infiniband/hw/mthca/mthca_provider.c |2 
 drivers/infiniband/ulp/ipoib/ipoib_main.c|8 +
 drivers/infiniband/ulp/srp/ib_srp.c  |2 
 include/rdma/ib_addr.h   |   17 +
 include/rdma/ib_verbs.h  |   25 ++
 20 files changed, 430 insertions(+), 81 deletions(-)

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 68e73ec..163d991 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -1,7 +1,7 @@
 infiniband-$(CONFIG_INFINIBAND_ADDR_TRANS) := ib_addr.o rdma_cm.o
 
 obj-$(CONFIG_INFINIBAND) +=ib_core.o ib_mad.o ib_sa.o \
-   ib_cm.o $(infiniband-y)
+   ib_cm.o iw_cm.o $(infiniband-y)
 obj-$(CONFIG_INFINIBAND_USER_MAD) +=   ib_umad.o
 obj-$(CONFIG_INFINIBAND_USER_ACCESS) +=ib_uverbs.o ib_ucm.o
 
@@ -14,6 +14,8 @@ ib_sa-y :=sa_query.o
 
 ib_cm-y := cm.o
 
+iw_cm-y := iwcm.o
+
 rdma_cm-y :=   cma.o
 
 ib_addr-y :=   addr.o
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index d8e54e0..9cbf09e 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -61,12 +61,15 @@ static LIST_HEAD(req_list);
 static DECLARE_WORK(work, process_req, NULL);
 static struct workqueue_struct *addr_wq;
 
-static int copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev,
-unsigned char *dst_dev_addr)
+int rdma_copy_addr(struct rdma_dev_addr *dev_addr, struct net_device *dev,
+const unsigned char *dst_dev_addr)
 {
switch (dev->type) {
case ARPHRD_INFINIBAND:
-   dev_addr->dev_type = IB_NODE_CA;
+   dev_addr->dev_type = RDMA_NODE_IB_CA;
+   break;
+   case ARPHRD_ETHER:
+   dev_addr->dev_type = RDMA_NODE_RNIC;
break;
default:
return -EADDRNOTAVAIL;
@@ -78,6 +81,7 @@ static int copy_addr(struct rdma_dev_add
memcpy(dev_addr->dst_dev_addr, dst_dev_addr, MAX_ADDR_LEN);
return 0;
 }
+EXPORT_SYMBOL(rdma_copy_addr);
 
 int rdma_translate_ip(struct sockaddr *addr, struct rdma_dev_addr *dev_addr)
 {
@@ -89,7 +93,7 @@ int rdma_translate_ip(struct sockaddr *a
if (!dev)
return -EADDRNOTAVAIL;
 
-   ret = copy_addr(dev_addr, dev, NULL);
+   ret = rdma_copy_addr(dev_addr, dev, NULL);
dev_put(dev);
return ret;
 }
@@ -161,7 +165,7 @@ static int addr_resolve_remote(struct so
 
/* If the device does ARP internally, return 'done' */
if (rt->idev->dev->flags & IFF_NOARP) {
-   copy_addr(addr, rt->idev->dev, NULL);
+   rdma_copy_addr(addr, rt->idev->dev, NULL);
goto put;
}
 
@@ -181,7 +185,7 @@ static int addr_resolve_remote(struct so
src_in->sin_addr.s_addr = rt->rt_src;
}
 
-   ret = copy_addr(addr, neigh->dev, neigh->ha);
+   ret = rdma_copy_addr(addr, neigh->dev, neigh->ha);
 release:
neigh_release(neigh);
 put:
@@ -245,7 +249,7 @@ static int addr_resolve_local(struct soc
if (ZERONET(src_ip)) {
src_in->sin_family = dst_in->sin_family;
src_in->sin_addr.s_addr = dst_ip;
-   ret = copy_addr(addr, dev, dev->dev_addr);
+   ret = rdma_copy_addr(addr, dev, dev->dev_addr);
} else if (LOOPBACK(src_ip)) {
ret = rdma_translate_ip((struct sockaddr *)dst_in, addr);
if (!ret)
diff 

[PATCH 3/2] RDMA: Ammasso 1100 RNIC driver

2006-09-08 Thread Roland Dreier
Here's the compressed patch adding the amso1100 driver.  You can also
find this in my git tree at

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git

in the for-2.6.19 branch.



0003-RDMA-amso1100-Add-driver-for-Ammasso-1100-RNIC.txt.bz2
Description: application/bzip


[PATCH 1/2] RDMA: iWARP connection manager

2006-09-08 Thread Roland Dreier
From: Tom Tucker [EMAIL PROTECTED]

Add an iWARP Connection Manager (CM), which abstracts connection
management for iWARP devices (RNICs).  It is a logical instance of the
xx_cm where xx is the transport type (ib or iw).  The symbols exported
are used by the transport independent rdma_cm module, and are
available also for transport dependent ULPs.

Signed-off-by: Tom Tucker [EMAIL PROTECTED]
Signed-off-by: Steve Wise [EMAIL PROTECTED]
Signed-off-by: Roland Dreier [EMAIL PROTECTED]
---
 drivers/infiniband/core/iwcm.c | 1019 
 drivers/infiniband/core/iwcm.h |   62 ++
 include/rdma/iw_cm.h   |  258 ++
 3 files changed, 1339 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
new file mode 100644
index 000..c3fb304
--- /dev/null
+++ b/drivers/infiniband/core/iwcm.c
@@ -0,0 +1,1019 @@
+/*
+ * Copyright (c) 2004, 2005 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2004 Topspin Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Voltaire Corporation.  All rights reserved.
+ * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ * Copyright (c) 2005 Network Appliance, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/idr.h>
+#include <linux/interrupt.h>
+#include <linux/pci.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+#include <linux/completion.h>
+
+#include <rdma/iw_cm.h>
+#include <rdma/ib_addr.h>
+
+#include "iwcm.h"
+
+MODULE_AUTHOR("Tom Tucker");
+MODULE_DESCRIPTION("iWARP CM");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static struct workqueue_struct *iwcm_wq;
+struct iwcm_work {
+   struct work_struct work;
+   struct iwcm_id_private *cm_id;
+   struct list_head list;
+   struct iw_cm_event event;
+   struct list_head free_list;
+};
+
+/*
+ * The following services provide a mechanism for pre-allocating iwcm_work
+ * elements.  The design pre-allocates them  based on the cm_id type:
+ * LISTENING IDS:  Get enough elements preallocated to handle the
+ * listen backlog.
+ * ACTIVE IDS: 4: CONNECT_REPLY, ESTABLISHED, DISCONNECT, CLOSE
+ * PASSIVE IDS:3: ESTABLISHED, DISCONNECT, CLOSE
+ *
+ * Allocating them in connect and listen avoids having to deal
+ * with allocation failures on the event upcall from the provider (which
+ * is called in the interrupt context).
+ *
+ * One exception is when creating the cm_id for incoming connection requests.
+ * There are two cases:
+ * 1) in the event upcall, cm_event_handler(), for a listening cm_id.  If
+ *the backlog is exceeded, then no more connection request events will
+ *be processed.  cm_event_handler() returns -ENOMEM in this case.  It's up
+ *to the provider to reject the connection request.
+ * 2) in the connection request workqueue handler, cm_conn_req_handler().
+ *If work elements cannot be allocated for the new connect request cm_id,
+ *then IWCM will call the provider reject method.  This is ok since
+ *cm_conn_req_handler() runs in the workqueue thread context.
+ */
+
+static struct iwcm_work *get_work(struct iwcm_id_private *cm_id_priv)
+{
+   struct iwcm_work *work;
+
+   if (list_empty(&cm_id_priv->work_free_list))
+   return NULL;
+   work = list_entry(cm_id_priv->work_free_list.next, struct iwcm_work,
+ free_list);
+   list_del_init(&work->free_list);
+   return work;
+}
+