Re: Udev coldplugging loads 8139too driver instead of 8139cp

2008-01-29 Thread Michael Tokarev
Stephen Hemminger wrote:
 On Tue, 29 Jan 2008 03:46:08 +0300
 Michael Tokarev [EMAIL PROTECTED] wrote:
[]
 There are 2 drivers for 8139-based NICs, for two really different kinds
 of hardware which both use the same PCI identifiers.  Both drivers
 claim to work with all NICs with those PCI ids, because externally
 (by means of udev for example) it's impossible to distinguish the two
 kinds of hardware; it becomes clear only when a driver (either of the
 two) loads and actually checks which hardware we have here.
 
 Is there any chance of using subdevice or subversion to tell them apart?
 That worked for other vendors like DLINK who slapped same ID on different
 cards.

If it were that simple... ;)

No.  The difference is in the PCI revision number (byte #8 in PCI config space).
If it's <= 0x40 - it's 8139too, > 0x40 - 8139cp.  Or 0x20 - I forgot.

Here's a code snippet from a shell script I used ages ago to automatically
load modules (similar to what udev does nowadays):

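  # special hack for 8139{too,cp} stuff
  case $modalias in
  *v000010ECd00008139*)
    # byte #8 of the PCI config space is the revision; it is read as a raw
    # character, so the comparisons below are against " " (0x20)
    rev="$(dd if="$1/config" bs=1 skip=8 count=1 2>/dev/null)"
    if [ -n "$rev" ]; then
      list=
      for module in $modlist; do
        case $module in
        8139cp)
          # revision below 0x20: not a C+ chip
          if [ ".$rev" \< ". " ]; then
            $vecho1 "$TAG: not loading $module for this device"
            continue
          fi
          ;;
        8139too)
          # revision above 0x20: C+ chip
          if [ ".$rev" \> ". " ]; then
            $vecho1 "$TAG: not loading $module for this device"
            continue
          fi
          ;;
        esac
        list="$list $module"
      done
      modlist="$list"
    fi
    ;;
  esac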
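
A minimal sketch of the same revision check in Python, for trying it outside the script. It assumes the split is at revision 0x20 (the kernel message quoted elsewhere in this thread calls a rev 20 device an 8139C+); the function names and the threshold constant are illustrative and not taken from udev or the drivers.

```python
from pathlib import Path

CPLUS_MIN_REVISION = 0x20  # assumed threshold between 8139 and 8139C+


def pick_8139_driver(revision: int) -> str:
    """Return the module expected to claim a 10ec:8139 device."""
    return "8139cp" if revision >= CPLUS_MIN_REVISION else "8139too"


def read_pci_revision(device_dir: str) -> int:
    """Read the revision ID, which is byte 8 of the PCI config space."""
    config = (Path(device_dir) / "config").read_bytes()
    return config[8]
```

On a live system this could be called as pick_8139_driver(read_pci_revision("/sys/bus/pci/devices/<address>")), where <address> is the PCI address of the NIC.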

/mjt
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Udev coldplugging loads 8139too driver instead of 8139cp

2008-01-28 Thread Michael Tokarev
Frederik Himpe wrote:
 Linux 2.6.24 kernel gives the following messages when udev coldplugging
 loads the driver for my NIC:
 
 8139too 0000:00:0b.0: This (id 10ec:8139 rev 20) is an enhanced 8139C+ chip
 8139too 0000:00:0b.0: Use the 8139cp driver for improved performance and
 stability.

There are 2 drivers for 8139-based NICs, for two really different kinds
of hardware which both use the same PCI identifiers.  Both drivers
claim to work with all NICs with those PCI ids, because externally
(by means of udev for example) it's impossible to distinguish the two
kinds of hardware; it becomes clear only when a driver (either of the
two) loads and actually checks which hardware we have here.

Udev in fact loads both - 8139cp and 8139too.  The difference is the ORDER
in which it loads them - if for cp-handled hardware it loads 8139too first,
8139too will complain as above and will NOT claim the device.  The same is
true in the opposite case.
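
The effect of the load order can be modelled in a few lines. This is only a toy simulation under the assumption that each driver's probe accepts or declines purely on the revision, with 0x20 as the dividing line; the probe functions are hypothetical stand-ins, not the real driver code.

```python
CPLUS_MIN_REVISION = 0x20  # assumed threshold between the two chip families


def probe_8139too(revision):
    """Hypothetical stand-in for the 8139too probe: decline C+ chips."""
    return revision < CPLUS_MIN_REVISION


def probe_8139cp(revision):
    """Hypothetical stand-in for the 8139cp probe: accept only C+ chips."""
    return revision >= CPLUS_MIN_REVISION


PROBES = {"8139too": probe_8139too, "8139cp": probe_8139cp}


def coldplug(revision, load_order):
    """Probe in load order; return (bound driver, drivers that complained)."""
    bound, declined = None, []
    for name in load_order:
        if bound is not None:
            break  # device already claimed, later modules are not probed for it
        if PROBES[name](revision):
            bound = name
        else:
            declined.append(name)  # this driver logs a message and steps aside
    return bound, declined
```

In this model the device ends up with the same driver either way; the order only decides whether the "wrong" driver gets to print its complaint first.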

So - in short - things have always been this way (thanks to Realtek).
I've seen similar (but opposite) effects on my systems, which should all
be serviced by the 8139too driver but had 8139cp loaded first - up
till I gave up and just disabled 8139cp...

I don't know what happened in 2.6.24, but my guess is that since 8139too-based
hw is now a lot more common, the two drivers are listed in the opposite
order.

In short: NotABug, or ComplainToRealtec (but that's wy too late and
will not help anyway) ;)

/mjt



Re: why would EPIPE cause socket port to change?

2007-01-23 Thread Michael Tokarev
Herbert Xu wrote:
 dean gaudet [EMAIL PROTECTED] wrote:
 in the test program below the getsockname result on a TCP socket changes 
 across a write which produces EPIPE... here's a fragment of the strace:

 getsockname(3, {sa_family=AF_INET, sin_port=htons(37636),
 sin_addr=inet_addr("127.0.0.1")}, [17863593746633850896]) = 0
 ...
 write(3, "hi!\n", 4) = 4
 write(3, "hi!\n", 4) = -1 EPIPE (Broken pipe)
 --- SIGPIPE (Broken pipe) @ 0 (0) ---
 getsockname(3, {sa_family=AF_INET, sin_port=htons(59882),
 sin_addr=inet_addr("127.0.0.1")}, [16927060683038654480]) = 0

 why does the port# change?  this is on 2.6.19.1.
 
 Prior to the last write, the socket entered the CLOSED state meaning
 that the old port is no longer allocated to it.  As a result, the
 last write operates on an unconnected socket which causes a new local
 port to be allocated as an autobind.  It then fails because the socket
 is still not connected.

Well, but why didn't getsockname() just return ENOTCONN?

 So any attempt to run getsockname after an error on the socket is
 simply buggy.

Yes it is.  But so is not returning ENOTCONN from getsockname().  I think.

/mjt


Re: why would EPIPE cause socket port to change?

2007-01-23 Thread Michael Tokarev
Herbert Xu wrote:
 On Tue, Jan 23, 2007 at 02:10:39PM +0300, Michael Tokarev wrote:
 Well, but why didn't getsockname() just return ENOTCONN?
 
 It's perfectly valid to have a local port number without being connected.

Er.  You're right - I was confusing getSOCKname() and getPEERname().
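
The distinction is easy to see on a socket that is bound but never connected. A minimal sketch, assuming a Linux-like stack where getpeername() on an unconnected TCP socket fails with ENOTCONN; the helper name is illustrative.

```python
import errno
import socket


def local_and_peer(sock):
    """Return (local port, peer address or the errno from getpeername())."""
    local_port = sock.getsockname()[1]
    try:
        peer = sock.getpeername()
    except OSError as exc:
        peer = exc.errno
    return local_port, peer


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free local port
    port, peer = local_and_peer(s)
```

Here getsockname() reports a valid local port even though there is no connection, while getpeername() is the call that reports ENOTCONN.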

Still, after the connection has been closed, there's no chance to do
anything with the filedescriptor but to close it as well, right?  Or
can the fd be reused by making new connection with it, as if it were
just returned from socket() call?

If it's the former, then there's no reason to assign a new local address
to it.

/mjt


Re: [PATCH] tcp_output: Re: rare bad TCP checksum with 2.6.19?

2007-01-21 Thread Michael Tokarev
Jarek Poplawski wrote:
 On Fri, Jan 19, 2007 at 03:08:20PM +0100, Jarek Poplawski wrote:
 ...
 You are welcome! But you probably didn't read this with
 attention: if it works, you should thank mainly to that
 other guy...

 Btw. I can't remember I've seen such ferocious testing
 ever!
 
 After checking in the dictionary I found my btw. could
 be rather confusing:
 
 ferocious:
 1.savagely fierce, as a wild beast, person, action,
  or aspect; violently cruel: a ferocious beating.
 2.extreme or intense: a ferocious thirst. 
 
 I've only meant #2 - and nothing like #1.
 If you were confused - sorry!

Heh.

Jarek, thank you for your appreciation of my efforts.

And no, I noticed your statement in the first place
(in a good sense - like #2 above) -- I wanted to comment on
it at first, but didn't.

I was only running tcpdump - yes, it was running almost the
whole day, with different options.  I did almost nothing.

You over-estimate my contribution, really ;)

The very good thing is that this bug is now found, and *that*
is what matters.

/mjt


Re: [PATCH] tcp_output: Re: rare bad TCP checksum with 2.6.19?

2007-01-19 Thread Michael Tokarev
Jarek Poplawski wrote:
 On 17-01-2007 15:12, Michael Tokarev wrote:
[]
 Here's another sample, which may be more useful.  I've seen quite
 a lot of very similar stuff while running tcpdump.

   http://www.corpit.ru/mjt/bad-cksum-session3-dmp.bin

 The scenario looks like this.

 A client (82.84.172.37 -- a zombie machine trying to send us spam
 in this case) connects to port 25 here (81.13.94.6:25).  The SYN+ACK
 sequence completes.  Next, our server sends an initial SMTP greeting
 message, but almost right after that, the client sends a FIN packet,
 WITHOUT acknowledging that it received the (first and only) data
 packet.  So some time later our machine re-sends the data, AND adds
 the FIN flag to the packet (also replying to the FIN received from the
 client).  And *that* packet - the original data packet, modified
 to also include FIN - has an incorrect checksum.

 So it looks like the checksum isn't being updated WHEN ADDING MORE
 FLAGS to the original data packet.

 
 Hi,
 
 Here is my patch proposal. If I'm not totally wrong,
 there is a possibility that, during collapsing, empty
 skb with FIN is added to normal packet and changes
 its ip_summed field to CHECKSUM_NONE.
 
 Regards,
 Jarek P.
 
 PS: probably there are also other possibilities...

Well..  I just tried it - with this patch applied, no more bad checksums
are shown.  Tried from the network that triggers it most reliably - and
wasn't able to reproduce the bad behavior.

I'm running a tcpdump right now, and so far it only captured a few bad-cksum
packets from other hosts (which are also running 2.6.19 ;)

Thanks Jarek!

/mjt


Re: [PATCH] tcp_output: Re: rare bad TCP checksum with 2.6.19?

2007-01-19 Thread Michael Tokarev
Patrick McHardy wrote:
 Jarek Poplawski wrote:
 Here is my patch proposal. If I'm not totally wrong,
 there is a possibility that, during collapsing, empty
 skb with FIN is added to normal packet and changes
 its ip_summed field to CHECKSUM_NONE.

 diff -Nurp linux-2.6.19-/net/ipv4/tcp_output.c linux-2.6.19/net/ipv4/tcp_output.c
 --- linux-2.6.19-/net/ipv4/tcp_output.c	2006-11-29 22:57:37.000000000 +0100
 +++ linux-2.6.19/net/ipv4/tcp_output.c	2007-01-19 07:58:39.000000000 +0100
 @@ -1590,7 +1590,8 @@ static void tcp_retrans_try_collapse(str
  
  		memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
  
 -		skb->ip_summed = next_skb->ip_summed;
 +		if (next_skb->ip_summed == CHECKSUM_PARTIAL)
 +			skb->ip_summed = CHECKSUM_PARTIAL;
  
  		if (skb->ip_summed != CHECKSUM_PARTIAL)
  			skb->csum = csum_block_add(skb->csum, next_skb->csum, skb_size);

 
 I noticed this too, but I can't see how it could lead to
 a partial checksum on the wire since the checksumming is
 done after changing ip_summed to CHECKSUM_NONE. Is this
 patch verified to fix Michael's problem?

It seems to fix my problem, yes - at least I can't reproduce it anymore.
Tcpdump is running however - let's see... :)
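
The one-line change in the quoted patch can be modelled outside the kernel. This is a toy model only: the two constants are plain strings standing in for the kernel's ip_summed values, and the function names are made up for illustration.

```python
CHECKSUM_NONE = "CHECKSUM_NONE"        # no deferred checksum work pending
CHECKSUM_PARTIAL = "CHECKSUM_PARTIAL"  # checksum still to be completed later


def collapse_before_patch(skb_summed, next_summed):
    """Old behaviour: the merged skb simply inherits next_skb's state."""
    return next_summed


def collapse_after_patch(skb_summed, next_summed):
    """Patched behaviour: only ever switch the merged skb to PARTIAL."""
    if next_summed == CHECKSUM_PARTIAL:
        return CHECKSUM_PARTIAL
    return skb_summed
```

With a data skb in the PARTIAL state and an empty FIN skb in the NONE state, the old code returns NONE (the state Jarek suspected), while the patched code keeps PARTIAL.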

/mjt


Re: rare bad TCP checksum with 2.6.19?

2007-01-17 Thread Michael Tokarev
Herbert Xu wrote:
 On Tue, Jan 16, 2007 at 11:08:51AM +0300, Michael Tokarev wrote:
 Ok.  Here's another trace, from that remote network that triggers
 this thing more-or-less reliable (every 2nd transfer at least) --
 http://www.corpit.ru/mjt/bh-bad-cksum-dmp.bin . It's a full session
 between 216.168.29.244 - the requesting/receiving side -- and
 81.13.94.6 -- our sending side (the file being transferred is some
 trojan horse I found on a friend's PC, so be careful ;)
 
 I'll have a look at this tomorrow.
 
 Since you're certain that this is being seen on the wire, one
 possibility is that we've got a bug somewhere that's zeroing
 skb-ip_summed on a packet with a partial checksum.

Here's another sample, which may be more useful.  I've seen quite
a lot of very similar stuff while running tcpdump.

  http://www.corpit.ru/mjt/bad-cksum-session3-dmp.bin

The scenario looks like this.

A client (82.84.172.37 -- a zombie machine trying to send us spam
in this case) connects to port 25 here (81.13.94.6:25).  The SYN+ACK
sequence completes.  Next, our server sends an initial SMTP greeting
message, but almost right after that, the client sends a FIN packet,
WITHOUT acknowledging that it received the (first and only) data
packet.  So some time later our machine re-sends the data, AND adds
the FIN flag to the packet (also replying to the FIN received from the
client).  And *that* packet - the original data packet, modified
to also include FIN - has an incorrect checksum.

So it looks like the checksum isn't being updated WHEN ADDING MORE
FLAGS to the original data packet.
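
To illustrate why a changed flags field with a stale checksum is rejected by the receiver, here is a small sketch of the 16-bit one's-complement Internet checksum. The "segment" layout (a flags word, a checksum word, then payload) is a simplified assumption for the example, not the real TCP header or the kernel's code.

```python
import struct

FIN, PSH, ACK = 0x01, 0x08, 0x10


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum to store, computed with the checksum field set to zero."""
    return ~ones_complement_sum(data) & 0xFFFF


def build(flags: int, payload: bytes, checksum: int = 0) -> bytes:
    """Toy segment: 16-bit flags word, 16-bit checksum, then payload."""
    return struct.pack("!HH", flags, checksum) + payload


def is_valid(segment: bytes) -> bool:
    """A segment that includes a correct checksum sums to 0xFFFF."""
    return ones_complement_sum(segment) == 0xFFFF


payload = b"220 hello\r\n"
csum = internet_checksum(build(PSH | ACK, payload))
original = build(PSH | ACK, payload, csum)
# Retransmission with FIN added but the old checksum reused
stale = build(PSH | ACK | FIN, payload, csum)
# Same retransmission with the checksum recomputed
fixed = build(PSH | ACK | FIN, payload,
              internet_checksum(build(PSH | ACK | FIN, payload)))
```

Only the segment whose checksum was recomputed after adding FIN verifies; the one carrying the old value does not.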

/mjt



Re: rare bad TCP checksum with 2.6.19?

2007-01-16 Thread Michael Tokarev
Herbert Xu wrote:
 On Tue, Jan 16, 2007 at 02:27:39PM +1100, Herbert Xu wrote:
 I'm sorry but this dump does NOT look like it was taken from an
 intermediate box.  I verified two bad checksums (chosen randomly)
 and they were both correct but partial checksums.  This means that
 this dump was most likely taken from the sending host.
 
 I did see one strange bit:
 
 02:39:51.758803 IP (tos 0x0, ttl  63, id 41084, offset 0, flags [DF], length: 102)
 192.168.1.1.25 > 81.13.94.6.21350: FP [bad tcp cksum 81b0 (->9ee8)!]
 4271854025:4271854075(50) ack 3772789166 win 272 <nop,nop,timestamp 145420525 6279830>
 0x0000:  4500 0066 a07c 4000 3f06 2a59 c0a8 0101  E..f.|@.?.*Y....
 0x0010:  510d 5e06 0019 5366 fe9f 51c9 e0e0 31ae  Q.^...Sf..Q...1.
 0x0020:  8019 0110 81b0 0000 0101 080a 08aa f0ed  ................
 0x0030:  005f d296 3235 3020 322e 302e 3020 4f6b  ._..250.2.0.0.Ok
 0x0040:  3a20 7175 6575 6564 2061 7320 3631 3345  :.queued.as.613E
 0x0050:  4137 4637 440d 0a32 3231 2032 2e30 2e30  A7F7D..221.2.0.0
 0x0060:  2042 7965 0d0a                           .Bye..
 
 Most of the bad checksums are from 81.13.94.6, which I presume is
 the host you were dumping on.  However, this packet is destined
 for it instead and yet it too has a partial (but correct) checksum.
 
 So the question is where in your network is 192.168.1.1 and how is
 your network setup in terms of NAT?
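
The TCP header in the hex dump quoted above can be decoded by hand to cross-check what tcpdump printed (ports, sequence numbers, the FP flags and the 81b0 checksum field). A small sketch, assuming the TCP header starts at offset 0x14 (after a 20-byte IP header) and that the urgent-pointer word is 0000; the helper name is illustrative.

```python
import struct

# Fixed 20-byte TCP header copied from offset 0x14 of the quoted dump
# (the urgent-pointer word is assumed to be 0000)
TCP_HEADER = bytes.fromhex(
    "0019 5366 fe9f 51c9 e0e0 31ae 8019 0110 81b0 0000".replace(" ", "")
)

FLAG_BITS = [(0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"),
             (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG")]


def decode_tcp_header(raw: bytes) -> dict:
    """Decode the fixed part of a TCP header into named fields."""
    (sport, dport, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset is in 32-bit words
        "flags": [name for bit, name in FLAG_BITS if off_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


fields = decode_tcp_header(TCP_HEADER)
```

The decoded values match the tcpdump line: port 25 to port 21350, FIN and PSH set, window 272, and a 32-byte header (20 bytes plus the timestamp option).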

This 192.168.* network is internal, and this very packet - I didn't
think it'll be there, but.. hum.

The network looks like this:

     internet
        |            81.13.94.6 etc
   [ router ]  -  [ DMZ ]
        |
     [ LAN ] 192.168.1.1 etc

The capture has been made on the router, on the interface which is
connected to the DMZ segment (so no netfilter stuff should be involved
at all; besides, there's no fancy netfilter setup between the DMZ and
the external interface, and many packets don't even go to conntrack).

81.13.94.6 is a machine in the DMZ segment (it's www.corpit.ru, by the
way).


192.168.1.1 is a machine in LAN.

So the packet you're referring to belongs to a connection between
the internal (LAN) mailserver and the DMZ mailserver - and I didn't
intend to capture *that* traffic.  Most of the packets were between
the DMZ and the external interface.  That is to say, the 192.168.1.1
machine also has this problem (as I mentioned before, it happens on
several different machines with different kernels (all are 2.6.19
still - it doesn't happen with 2.6.18 or before)), but it wasn't
the main machine I did the testing on.

Ok.  Here's another trace, from that remote network that triggers
this thing more-or-less reliable (every 2nd transfer at least) --
http://www.corpit.ru/mjt/bh-bad-cksum-dmp.bin . It's a full session
between 216.168.29.244 - the requesting/receiving side -- and
81.13.94.6 -- our sending side (the file being transferred is some
trojan horse I found on a friend's PC, so be careful ;)

The last packet(s) -- they're repeated many times, ad infinitum,
because the receiving side discards incorrectly checksummed packets
and thus never sees the final part of the data -- here it's as
captured on the router (above, included in the trace):

10:52:35.702649 IP (tos 0x0, ttl  64, id 61117, offset 0, flags [DF], proto: TCP (6), length: 82)
 81.13.94.6.80 > 216.168.29.244.55354: FP, cksum 0x9185 (incorrect (-> 0x5c56),
 140062:140092(30) ack 125 win 2896 <nop,nop,timestamp 12118000 265951653>

And here it is again, captured on the RECEIVING side (on 216.168.29.244):

07:52:35.816545 IP (tos 0x0, ttl  48, id 61117, offset 0, flags [DF], proto: TCP (6), length: 82)
 81.13.94.6.80 > 216.168.29.244.55354: FP, cksum 0x9185 (incorrect (-> 0x5c56),
 140062:140092(30) ack 125 win 2896 <nop,nop,timestamp 12118000 265951653>

(the only difference in headers I see is in the TTL, which is expected).

The transfer never finishes, it sits at 98% or so.  On the receiving side
(which is running FreeBSD), the bad checksum statistics counter increases with
every FP packet.  It also makes no difference whether tcpdump is running on
either side or on an intermediate host or not.

Thanks!

/mjt


Re: rare bad TCP checksum with 2.6.19?

2007-01-16 Thread Michael Tokarev
Patrick McHardy wrote:
 Herbert Xu wrote:
[]
 Since you're certain that this is being seen on the wire, one
 possibility is that we've got a bug somewhere that's zeroing
 skb-ip_summed on a packet with a partial checksum.

 One potential spot where this could happen is netfilter.
 Patrick, do you know of any recent changes (this is happening
 with 2.6.19) that might cause this?
 
 The incremental HW checksum update stuff went in 2.6.19, so thats
 a prime suspect. Can't see where this could be happening though.
 
 Michael, how exactly is netfilter involved in your setup?

I think it isn't involved.

The captures I did were done on a router box, which indeed has some
netfilter stuff.  But:

 1) the capture has been done on an interface directly connected to
   the segment where the testing machine is located (not on the
   external interface)

 2) the testing machine itself does not have any netfilter modules
   loaded

 3) the packets look exactly the same in at least 3 places (modulo
   the TTL values): on the sending machine, on the router (on the
   interface connected to the sending machine - in those 2 places,
   the TTL is the same), and at the receiving side, which is 20+
   hops away.

 4) I tried another machine today (upgraded from 2.6.17 to 2.6.19) -
   stand-alone, without any netfilter modules loaded (but it's under
   quite.. some load - see http://j.ns.dsbl.org/nsg/ -- with this load
   it'll die right after iptables module loading, it's a 600MHz Celeron
   box replying to 15000 DNS packets every second) - it started showing
   the same behavior.

/mjt


Re: rare bad TCP checksum with 2.6.19?

2007-01-15 Thread Michael Tokarev
Herbert Xu wrote:
 Michael Tokarev [EMAIL PROTECTED] wrote:
 Note there's no funny/interesting hardware involved, like network cards with
 tcp checksumming offload capabilities (this is plain dumb 8139 card).
 
 The 8139 card might be dumb, but the driver isn't :) It emulates
 checksum offload in software, meaning that tcpdump will show bogus
 checksums.
 
 So please disable hardware checksum offload with ethtool -K and
 then try again.

# ethtool -k eth0
Offload parameters for eth0:
Cannot get device rx csum settings: Operation not supported
Cannot get device tx csum settings: Operation not supported
Cannot get device scatter-gather settings: Operation not supported
Cannot get device tcp segmentation offload settings: Operation not supported
no offload info available

# ethtool -K eth0 rx off tx off tso off
Cannot set device rx csum settings: Operation not supported

So I guess the problem is not related to hw checksumming offloading.

Meanwhile, I tried many times to reproduce the problem - with little
success.  With different sizings, options, et al - I can't force the
sending side to send some data within a FIN packet.  I.e, most of the
time, the thing just works, because no data goes with FIN packet.
But once every 50..100 tries, I see single FIN-with-data packet, and
that one ALWAYS has bad checksum.

I was never able to reproduce the problem on a LAN, only when going from
a distant host.  And even with that distant host, it's very difficult to
reproduce.

At least one network (also distant) triggers this problem on every 2nd
try or so (the one I experimented with yesterday).  But I've no access
to that network - I kindly asked for help yesterday, but I can't abuse
their willingness to help more.

And another thing I noticed.  Right now I'm experimenting with another
machine, running 2.6.17(.13) - it also shows similar behavior with bad
csums, but MUCH rarer than this 2.6.19.  Like this:

16:29:32.490976 IP (tos 0x60, ttl  48, id 14110, offset 0, flags [DF], length: 80)
 69.42.67.34.2612 > 81.13.94.6.1234: . [bad tcp cksum f4b4 (->c1cc)!] ack 93407 win 9821
 <nop,nop,timestamp 1046528199 5497679,nop,nop,sack sack 3 {104991:109335}{110783:112231}{104991:109335} >
16:29:32.525988 IP (tos 0x60, ttl  48, id 14112, offset 0, flags [DF], length: 80)
 69.42.67.34.2612 > 81.13.94.6.1234: . [bad tcp cksum 3fb1 (->1819)!] ack 93407 win 9821
 <nop,nop,timestamp 1046528202 5497679,nop,nop,sack sack 3 {110783:113679}{122367:123815}{110783:113679} >
16:29:32.561407 IP (tos 0x60, ttl  48, id 14116, offset 0, flags [DF], length: 80)
 69.42.67.34.2612 > 81.13.94.6.1234: . [bad tcp cksum 87c0 (->2610)!] ack 93407 win 9821
 <nop,nop,timestamp 1046528205 5497679,nop,nop,sack sack 3 {122367:127103}{128551:129572}{122367:127103} >

Here, 69.42.67.34 is 2.6.17 from which I'm requesting data, and
81.13.94.6 is the sender.  This behavior so far is demonstrated with
sack packets only, but I've seen it in other direction too (also with
sack), at least once.

Any idea how to force sending FIN-with-data?

Thanks!

/mjt


Re: rare bad TCP checksum with 2.6.19?

2007-01-15 Thread Michael Tokarev
Herbert Xu wrote:
 On Mon, Jan 15, 2007 at 04:34:41PM +0300, Michael Tokarev wrote:
[]
 So I guess the problem is not related to hw checksumming offloading.
 
 Nope, it just means that 8139too doesn't provide ethtool handlers to
 disable checksum offloading.
 
 So I suggest that you try doing the tcpdump on the receive side as
 that should show the real checksum.

I'm doing the capture on an intermediate host - the whole day today ;)

 BTW, the reason tcpdump only shows some packets with bogus checksums
 is because it cuts packets off at 100 bytes by default so for most
 packets it can't verify the checksum at all.  If you run it with
 -s 1600 you should see bogus checksums on every packet with payload.

And I'm capturing with -s 2000.  By the way, tcpdump just does not
verify the checksum of truncated (due to capture size) packets.  At
least not the version I'm using (which is 3.9.5).

Herbert, the problem IS real, it's not due to some bad behavior due
to improper capturing or something like that.  Yes it's difficult to
come to it, but it is real.

I've saved quite a lot of packets today, but it's all quite.. useless
as the thing is difficult to hit.  Here are some traces made with the
following filter:

 proto TCP and tcp[tcpflags] & (tcp-fin|tcp-push) == (tcp-fin|tcp-push)

(I've chosen FIN+PUSH because this combination is where the problem
is seen most - to be fair, it looks like I haven't seen it with other
flags).
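
The filter is a plain mask-and-compare on the TCP flags byte. The same test, sketched in Python with the standard TCP flag bit values; the function name is illustrative.

```python
TCP_FIN = 0x01
TCP_PUSH = 0x08
TCP_ACK = 0x10


def has_fin_and_push(flags: int) -> bool:
    """True when both FIN and PUSH are set, whatever other flags are present."""
    mask = TCP_FIN | TCP_PUSH
    return (flags & mask) == mask
```

A packet with flags 0x19 (FIN, PUSH and ACK) matches; FIN+ACK alone or PUSH+ACK alone does not.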

In there, some packets are ok, but some are not.  So - again, it seems
like - I was wrong about 100% hit ratio -- ie, that the bad checksum
is ALWAYS the case with packets where some data goes in FIN packets --
this is incorrect, because the trace shows quite a few examples of right
behavior.

The trace is here: http://www.corpit.ru/mjt/bad-tcp-cksum-dmp.bin

(it contains some data which it shouldn't - but I hope there's nothing
confidential in there ;)

So, after the whole day digging around, I still don't have any more-or-less
clean way to reproduce it.  But I've noticed another thing as well: many
different machines here, with different kernels, behave the same way.
So it can't be a hardware problem for example.

And only in VERY rare cases does the thing cause noticeable transfer slowdowns
or stalls.  But some networks trigger those rare cases more often than others
(so the only more or less sane conclusion I can come up with is that it's
somehow timing-related).

Thanks!

/mjt


airo driver: still can't deal with interface renames?

2006-11-12 Thread Michael Tokarev
A long time ago, in kernel-2.4 days, I noticed that
airo module (for various aironet etc cards) can't
handle interface renames (at that time, the kernel
crashed after `ip link set eth1 name xxx' command
and some packets going to/from the interface).

The problem seems to be that the driver uses some
sort of private data in /proc/driver/airo/ethN/ which
gets created when the driver discovers an interface
and names it as usual, giving next available N in
ethN scheme.

And currently (2.6.18) the situation is still similar.
Well, it does not crash, but it still keeps the directory
in /proc/driver/airo named after the automatic interface
name, which leads to confusion at least, especially if
I load airo.ko, rename the iface, and then load some other
network driver module, which grabs THE SAME ethN as was
first grabbed by airo.

The fix seems to be to follow interface renames with name
change in /proc/driver/airo/.  Or, better longterm but
leads to incompatibilities, switch to using sysfs interface,
where the change will be made automagically.

I don't understand the driver internals well enough to
fix the problem in the right way.  But I can fix it
with a hack, which seems to be simple enough and will suit
at least my needs: by introducing another array of charp
for module parameters, giving the name(s) of interface(s)
it should create, just like currently it accepts ssids
and rates parameters.

Can someone comment on the situation, please?

Thanks.

/mjt


Re: [PATCH 4/6] net: use bitrev8

2006-10-20 Thread Michael Tokarev
Andrew Morton wrote:
 On Thu, 19 Oct 2006 01:46:47 +0900
 Akinobu Mita [EMAIL PROTECTED] wrote:
 
 Use bitrev8 for bmac, mace, macmace, macsonic, and skfp drivers.

[]
 ===
 --- work-fault-inject.orig/drivers/net/Kconfig
 +++ work-fault-inject/drivers/net/Kconfig
 @@ -2500,6 +2500,7 @@ config DEFXX
  config SKFP
  tristate "SysKonnect FDDI PCI support"
  depends on FDDI && PCI
 +select BITREVERSE
  ---help---
Say Y here if you have a SysKonnect FDDI PCI adapter.
The following adapters are supported by this driver:
 
[]
 But select is problematic and I do wonder whether it'd be simpler to just
 link the thing into vmlinux.

Why is it problematic?  Maintenance costs of various missing selects?
I don't want extra stuff in kernel (vmlinux) if it's not used.

/mjt


Re: mii-tool gigabit support.

2006-10-01 Thread Michael Tokarev
Rick Jones wrote:
 2) develop some style
 of register description definition type of text file, maybe XML, maybe
 INI style or something stored in /etc/ethtool as drivername.conf or
 something like that.  This way, ethtool doesn't have to be
 changed/updated/patched/likely-bug-added for every single device known
 to man. 
 Just a thought.
  
 We could switch to shared libraries like 'tc' uses.
 
 From a practical standpoint is shipping a new config file or a new
 shared library all that much different from a new ethtool binary?

New config - certainly yes.  After all, it's trivial to change a line
in a config file locally.  But yes, new shared lib is the same as a
new binary.

. o O { /sys/class/net/$iface/config.xml };)

But seriously, I don't think it's that bad an idea.  Maybe not xml,
but a plain list of registers with their names - a static string
known to each driver isn't much bloat really.  I think, anyway.

/mjt


Re: [NET 00/06]: Increase number of possible routing tables

2006-08-11 Thread Michael Tokarev
Patrick McHardy wrote:
 These are the updated patches (against net-2.6.19) to increase the number
 of possible routing tables to 2^32. They basically consist of four parts:
 
 - Use u32 for routing table IDs everywhere inside the kernel

Just out of curiosity: why isn't the current limit of 2^31 sufficient?
Or am I missing the point?

Thanks.

/mjt


Re: [NET 00/06]: Increase number of possible routing tables

2006-08-11 Thread Michael Tokarev
David Miller wrote:
 From: Michael Tokarev [EMAIL PROTECTED]
[]
 - Use u32 for routing table IDs everywhere inside the kernel
 Just out of curiosity: why isn't the current limit of 2^31 sufficient?
 Or am I missing the point?
 
 The current limit is 256 because the table member of the struct
 used to configure them is an 8-bit quantity.
 
 That's the whole purpose of Patrick's patch set, to provide a new
 optional attribute that allows specifying a 32-bit rather than
 the 8-bit table ID.

Aha, it was 256, not 2^31.  I remember now.

So the question probably should have been: why u32 and an additional
attribute (to represent the former -1) instead of the current int?  I mean,
it probably makes no difference whether there are 2^32 or 2^31 tables
(both values are pretty large), but 2^32 requires more changes to the
existing code.

And while we're at it...  How about using table *names* instead of
numbers in the kernel too, a-la iptables?  Once the possible number of tables
is large, and we're using hashes for tables now anyway, keeping a
name inside the table structure won't hurt ;)
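
The numbers being compared here are easy to check. A small sketch, assuming only what the thread states (an 8-bit table field today, and a choice between a signed int and a u32 for the new attribute); the helper name is illustrative.

```python
import struct


def fits_u8(table_id: int) -> bool:
    """Can table_id be stored in an 8-bit unsigned field?"""
    try:
        struct.pack("!B", table_id)
        return True
    except struct.error:
        return False


U8_TABLES = 2 ** 8            # IDs representable in the 8-bit field
INT_TABLES = 2 ** 31          # non-negative values of a 32-bit signed int
U32_TABLES = 2 ** 32          # values of a 32-bit unsigned int
```

Table 255 fits in the old field and 256 does not; going from a signed int to u32 only doubles an already huge ID space.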

/mjt


Re: [PATCH 2/2] [IPV4] route: Dynamic hash table sizing.

2006-08-09 Thread Michael Tokarev
On Wednesday 09 August 2006 09:53, David Miller wrote:
[]
 -   mult = ((u64)ip_rt_gc_interval) << long_log2(hmask + 1);
 +   mult = ((u64)(hmask + 1)) << (u64)ip_rt_gc_interval;

Hmm.. shift *by* a 64-bit number?
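
Why the new line looks suspicious can be checked with small numbers. A sketch, assuming hmask + 1 is a power of two (a hash table size) and using a made-up interval value of 60; the function names are illustrative.

```python
def interval_shifted_by_log2(hmask: int, gc_interval: int) -> int:
    """gc_interval << log2(hmask + 1), i.e. gc_interval * (hmask + 1)."""
    buckets = hmask + 1
    assert (buckets & (buckets - 1)) == 0, "hmask + 1 must be a power of two"
    return gc_interval << (buckets.bit_length() - 1)


def buckets_shifted_by_interval(hmask: int, gc_interval: int) -> int:
    """(hmask + 1) << gc_interval, the form in the '+' line."""
    return (hmask + 1) << gc_interval
```

With hmask = 0xFF and an interval of 60, the first form gives 60 * 256 = 15360, while the second shifts 256 left by 60 bits, which is a very different (and far larger) number.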

/mjt


Re: Is the qla3xxx driver in the mainline?

2006-07-26 Thread Michael Tokarev
Albert Lee wrote:
[]
 (I am curious to have some performance comparison of
 qla3xxx + open iscsi v.s. qla4xxx + on board TOE/iscsi.)

On which card?  I've been told that ISP4010 for example
isn't supported by qla3xxx.

BTW, I found qla4xxx (on ISP4010) performs noticeably
worse than the open-iscsi stack on top of a tigon (non-jumbo-
frames-capable) GigE NIC (on an eserver xSeries 346
machine).

/mjt


Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-20 Thread Michael Tokarev
By the way, should it work with ISP4010 controllers?
Those expose network interface card subdevice too,
but aren't listed in pci_device_table of the driver,
and after adding the device ID to the driver, it still
does not quite work (I tried, just out of curiosity) -
the NIC on ISP4010 is - it seems - close but not exactly
the same as the driver expects.

Thanks.

/mjt


Re: [PATCH] Qlogic qla3xxx driver v2.02.00-k36 for upstream inclusion.

2006-07-20 Thread Michael Tokarev
Andrew Vasquez wrote:
 On Thu, 20 Jul 2006, Ron Mercer wrote:
 
 qla3xxx driver  does not support ISP4010.
 
 Exactly...  The qla3xxx driver supports the NIC function only.

...which the ISP4010 card provides, as it appears on the PCI bus:

04:04.0 Ethernet controller: QLogic Corp. QLA3010 Network Adapter (rev 05)
04:04.1 Network controller: QLogic Corp. QLA4010 iSCSI TOE Adapter (rev 05)

(the first (sub)device).  So it *looks* like the card has *both*
a NIC and an iSCSI TOE adapter, and the NIC part is pretty much similar
to what the qla3xxx driver expects...  Hence my curiosity. ;)

(Not that it matters much, just.. curious, really.
Well.  Not exactly.  It'd be nice to compare a NIC w/o jumbo
frames support (which we have on all machines connected to
the iSCSI segment) with something more.. advanced.  So I
wondered if I can utilize the NIC part of the ISP4010 for
the test.  The iSCSI part of the card works significantly slower
than the open-iscsi stack on a non-jumbo-frames-aware Tigon GigE NIC.)

[]
 You'll need to use the qla4xxx driver to drive the iSCSI function.

Yeah, I know.  I posted some results to open-iscsi@ list about a week
ago.  It basically works (the new one, with open-iscsi infrastructure),
but is slooow... ;)

Thanks.

/mjt


Re: [PATCH] airo: make debug-like messages printed by airo_print_dbg()

2006-07-09 Thread Michael Tokarev
Robert Schulze wrote:
 Hi,
 
 Dan Williams schrieb:
 This message will only happen if the card hangs up and stops responding
 to commands anyway, so we don't necessarily care about making the
 message decipherable to anyone other than developers.
 
 Well, I get this message each time I insert my Cisco Aironet 350 PCMCIA
 card, which works obviously fine.
[]
 Besides, the messages can be read by issuing dmesg even after the patch,
 so no information gets lost.

The fact that you're getting that message indicates that something's wrong,
at least from the kernel's point of view.  So it had better be understood and
fixed, instead of being hidden in debugging output.  If it's visible in
dmesg but isn't visible in syslog (the default syslog configuration does not
capture any debugging messages), far fewer people will notice it.

I'd vote for making it one-line, but with the current KERN_ERR priority.

/mjt (who doesn't even have the hardware in question)


Re: http://bugzilla.kernel.org/show_bug.cgi?id=6197

2006-06-13 Thread Michael Tokarev
Patrick McHardy wrote:
[]
 He patched his kernel with the IMQ device, which is known to cause all
 kinds of weird problems.

Which problems?  Known to whom?

I was considering using imq for our needs (not done yet), and from the
FAQ at http://www.linuximq.net/faq.html (item #3, "Is it stable?") it
seems there are no problems except with GRE tunnels and locally generated
traffic...

Googling for "imq linux problem" shows the usual pile of various user
support questions (how to configure.. what did I do wrong.. etc),
but nothing relevant.

So... I'm curious whether the claim on the linuximq.net site about its
stability is true, or whether there are in fact some real issues...

Thanks.

/mjt


Re: Safe remote kernel install howto (Re: [Bugme-new] [Bug 6613] New: iptables broken on 32-bit PReP (ARCH=ppc))

2006-05-26 Thread Michael Tokarev
Ingo Oeser wrote:
 Hi Meelis,
 
 Unfortunately, 2.6.15 does not boot on this machine so I'm locked out
 remotely at the moment.
 
 Here is my paranoid boot setup:
 
 1. Use lilo -R new-kernel, to boot a kernel only
 once and reboot the default kernel next time.
 
 2. Force reboot on any panic after 10 seconds:
   append="panic=10" in /etc/lilo.conf
 
 3. Schedule automatic reboot in case of impossible login
   echo "/bin/sync; /sbin/reboot -f" | at now + 15min

Instead of this, I usually use a system startup script like this:

case "$(cat /proc/cmdline)" in
 *linux-test*)
   (sleep 300; [ -f /var/run/noreboot ] || reboot) &
   ;;
esac

which means that if the kernel image is named 'linux-test', the machine
will be rebooted 5 minutes (the sleep 300) after booting if no
/var/run/noreboot file exists.  So if I'm able to log in, I just touch
/var/run/noreboot and am done with it.

And oh, yes, for this to work, in lilo.conf the new entry should be
labeled linux-test -- ie, install new kernel, add new entry into lilo.conf
with label=linux-test, run `lilo && lilo -R linux-test && init 6' and..
wait ;)  After a successful reboot (and touching /var/run/noreboot), edit
lilo.conf, restore the proper label, set the proper order of entries if needed
and re-run lilo.
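
A minimal sketch of the same idea as reusable functions, so the logic can be dry-run before trusting it on a remote box. The variables CMDLINE_FILE, NOREBOOT_FLAG, WATCHDOG_DELAY and REBOOT_CMD, and the function names, are illustrative additions and not part of the original script:

```shell
#!/bin/sh
# "Reboot unless told otherwise" startup hook, parameterised for dry runs.
# All four variables below are hypothetical knobs with the original defaults.
CMDLINE_FILE=${CMDLINE_FILE:-/proc/cmdline}
NOREBOOT_FLAG=${NOREBOOT_FLAG:-/var/run/noreboot}
WATCHDOG_DELAY=${WATCHDOG_DELAY:-300}
REBOOT_CMD=${REBOOT_CMD:-reboot}

# Succeeds when the kernel command line mentions the test label.
is_test_boot() {
    case "$(cat "$CMDLINE_FILE" 2>/dev/null)" in
        *linux-test*) return 0 ;;
        *) return 1 ;;
    esac
}

# Waits, then reboots unless the flag file has appeared in the meantime.
watchdog() {
    sleep "$WATCHDOG_DELAY"
    [ -f "$NOREBOOT_FLAG" ] || $REBOOT_CMD
}

# Arms the watchdog in the background so the boot sequence is not blocked.
arm_watchdog() {
    if is_test_boot; then
        watchdog &
    fi
}
```

A startup script would source this and call arm_watchdog; for a dry run, point CMDLINE_FILE at a test file and set REBOOT_CMD to something harmless such as "echo would reboot".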

/mjt


Re: [PATCH 2/2] add a dev_ioctl fallback to sock_ioctl

2005-12-27 Thread Michael Tokarev

Christoph Hellwig wrote:

Currently all network protocols need to call dev_ioctl as the default
fallback in their ioctl implementations.  This patch adds a fallback
to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
This way all the protocol ioctl handlers can be simplified and we don't
need to export dev_ioctl.

Signed-off-by: Christoph Hellwig [EMAIL PROTECTED]


[]

--- linux-2.6.orig/net/wanrouter/af_wanpipe.c   2005-12-25 14:12:07.0 +0100
+++ linux-2.6/net/wanrouter/af_wanpipe.c   2005-12-25 14:13:42.0 +0100
@@ -1839,7 +1839,7 @@
 #endif
 
 		default:

-   return dev_ioctl(cmd,(void __user *) arg);
+   return �ENOIOCTLCMD;
}
/*NOTREACHED*/
 }


There's something wrong with this new `return' statement.
On my screen it looks like a 'diamond' character instead
of a minus sign -- a character with code 0xAD ;)
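
One way to catch this sort of thing before sending a patch is to count the bytes outside the 7-bit ASCII range. A rough sketch; the function name count_non_ascii is made up for the example:

```shell
#!/bin/sh
# Prints the number of non-ASCII bytes (0x80-0xFF) in the file given as $1.
# LC_ALL=C makes tr operate on raw bytes rather than multibyte characters.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}
```

Running count_non_ascii on a patch file should print 0 for a clean ASCII patch; a stray 0xAD byte in place of a minus sign would be counted.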

/mjt


IPSEC tunnel: more than two networks?

2005-12-13 Thread Michael Tokarev
I'm not sure if this is the right list for such questions..
But still.

Recently we tried to set up an IPSEC tunnel in a branch office
of a large company, using linux (currently 2.6.14).  The remote endpoint
is some Cisco PIX device.  Everything's fine, using any
userspace tools available, except for one oddity: the company
routes several networks to their central hub (that Cisco PIX)
from each branch office, instead of the usual one.

Ie, in all HOWTOs etc about this topic (IPSEC tunneling), there
are exactly two networks mentioned: the local network and the remote
network, usually with private 10/8 or 192.168/16 addresses, to
be routed through the IPSEC tunnel.  I haven't found any reference so
far where more than two networks are involved this way.
Moreover, in some configurations/software, it isn't even possible
to configure a tunnel for more than two networks - eg, for the
isakmpd package (available on linux too), there are exactly
*two* config settings -- something akin to 'my-net' and 'their-net'.

As far as I understand (from Cisco documentation - I'm a total
newbie in the cisco world), the SAs in their equipment are based on
ACLs (a somewhat broader term than "access control list", as it
covers routes etc as well), and each ACL entry is based on two
network ranges - 'from' and 'to' networks.

So, the Cisco device tries to establish several tunnels to each
endpoint, one for every network being routed.  And this is clearly
seen in racoon traces - when I ping local network L from the first
remote network A, racoon receives a request to establish tunnel
At, and when I ping it from the second remote network B, racoon sees
a request to establish tunnel Bt.  So this is working fine, at least
according to the Cisco docs.

But what happens next is somewhat weird.

With two tunnels established, the linux box tunnels replies
through a seemingly random one of the two.  *Usually* (but not
always), the first established tunnel (A/At) works OK, but replies
to pings coming from network B *sometimes* go out with the SPI belonging
to Bt, and *sometimes* with the one belonging to At -- and in this last
case, the Cisco just drops the wrong packets.

From the SPD dump (setkey -DP), I see both entries (actually 4,
as each tunnel is bi-directional), but I see no way to determine
which tunnel instance to use in each case (IP addresses and
network ranges changed to letters for readability; LE = local
'external' ip-address, and RE = remote 'external' IP, ie, the
IP addresses of the IPSEC endpoints):

For A = L tunnel (At):

A[any] L[any] any
in ipsec
esp/tunnel/RE-LE/require
created: Dec 12 16:06:44 2005  lastused: Dec 13 17:47:21 2005
lifetime: 0(s) validtime: 0(s)
spid=544 seq=9 pid=6430
refcnt=2

L[any] A[any] any
out ipsec
esp/tunnel/LE-RE/require
created: Dec 12 16:06:44 2005  lastused: Dec 13 17:47:21 2005
lifetime: 0(s) validtime: 0(s)
spid=537 seq=6 pid=6430
refcnt=2

For B = L tunnel (Bt):

B[any] L[any] any
in ipsec
esp/tunnel/RE-LE/require
created: Dec 12 16:06:44 2005  lastused: Dec 13 17:47:23 2005
lifetime: 0(s) validtime: 0(s)
spid=568 seq=8 pid=6430
refcnt=2

L[any] B[any] any
out ipsec
esp/tunnel/LE-RE/require
created: Dec 12 16:06:44 2005  lastused: Dec 13 17:47:23 2005
lifetime: 0(s) validtime: 0(s)
spid=561 seq=5 pid=6430
refcnt=2
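
For reference, policies shaped like the four above could be written by hand in setkey's spdadd syntax. The sketch below only prints the lines; every address in it is a placeholder (RFC 1918 networks for L, A and B, documentation addresses for LE and RE) and the function name tunnel_policies is made up:

```shell
#!/bin/sh
# Prints setkey spdadd lines for one bidirectional tunnel policy pair.
#   $1 = local network, $2 = remote network,
#   $3 = local endpoint (LE), $4 = remote endpoint (RE)
tunnel_policies() {
    printf 'spdadd %s %s any -P in ipsec esp/tunnel/%s-%s/require;\n' \
        "$2" "$1" "$4" "$3"
    printf 'spdadd %s %s any -P out ipsec esp/tunnel/%s-%s/require;\n' \
        "$1" "$2" "$3" "$4"
}

# Placeholder addresses, not the real networks from this report.
L=10.1.0.0/24; A=10.2.0.0/24; B=10.3.0.0/24
LE=192.0.2.1;  RE=198.51.100.1

tunnel_policies "$L" "$A" "$LE" "$RE"   # At
tunnel_policies "$L" "$B" "$LE" "$RE"   # Bt
```

The output could be fed to `setkey -c` on a test box. Note that both pairs name the same LE-RE endpoints, which is the situation described above: the policies differ only in their network selectors.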


And here are the SAD entries:

LE RE
esp mode=tunnel spi=2591183185(0x9a725151) reqid=0(0x)
E: des-cbc  bc64154f 70de4e2d
A: hmac-md5  c7db01fd 85e076f3 cbe9997e 56345808
seq=0x replay=4 flags=0x state=mature
created: Dec 13 17:47:10 2005   current: Dec 13 17:52:58 2005
diff: 348(s)  hard: 28800(s)  soft: 23040(s)
last: Dec 13 17:47:11 2005  hard: 0(s)  soft: 0(s)
current: 2312(bytes)  hard: 0(bytes)  soft: 0(bytes)
allocated: 17   hard: 0 soft: 0
sadb_seq=3 pid=6437 refcnt=0
LE RE
esp mode=tunnel spi=2352252189(0x8c34851d) reqid=0(0x)
E: des-cbc  5dd71a3d 50632fe7
A: hmac-md5  7cf08c5d 26b45e6e 9a6d6cf2 2c1fe68c
seq=0x replay=4 flags=0x state=mature
created: Dec 13 17:47:05 2005   current: Dec 13 17:52:58 2005
diff: 353(s)  hard: 28800(s)  soft: 23040(s)
last: Dec 13 17:47:06 2005  hard: 0(s)  soft: 0(s)
current: 1632(bytes)  hard: 0(bytes)  soft: 0(bytes)
allocated: 12   hard: 0 soft: 0
sadb_seq=2 pid=6437 refcnt=0
RE LE
esp mode=tunnel spi=209233273(0x0c78a579) reqid=0(0x)
E: des-cbc  bd1c8e9c 038d38d4
A: hmac-md5  8738c7c5 5035559b 718ae30b 9e1f7192
seq=0x replay=4 flags=0x state=mature
created: Dec 13 17:47:10 2005   current: Dec 13 17:52:58 2005
diff: 348(s)  hard: 28800(s)  soft: 23040(s)
last: Dec 13 17:47:11 2005  hard: