Re: [IPv6] PROBLEM? Network unreachable despite correct route

2007-01-17 Thread Jarek Poplawski
On Thu, Jan 11, 2007 at 02:08:16PM +0100, Bernhard Schmidt wrote:
 Jarek Poplawski wrote:
 
 ip -6 route:
 2001:4ca0:0:f000::/64 dev eth0  proto kernel  metric 256  expires 86322sec mtu 1500 advmss 1440 fragtimeout 4294967295
 fe80::/64 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
 ff00::/8 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
 default via fe80::2d0:4ff:fe12:2400 dev eth0  proto kernel  metric 1024  expires 1717sec mtu 1500 advmss 1440 fragtimeout 64
 unreachable default dev lo  proto none  metric -1  error -101 fragtimeout 255

I've had a look at this once more and have one more doubt:
it's probably some other IPv6 trick again, but why doesn't
this default router have a normal address in the same
segment (2001:4ca0:.../64)?

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [IPv6] PROBLEM? Network unreachable despite correct route

2007-01-17 Thread Jarek Poplawski
On Wed, Jan 17, 2007 at 09:14:20AM +0100, Jarek Poplawski wrote:
...
 I've a look at this once more and have one more doubt:
 probably it's some other ip6 trick again, but why this
 default router doesn't have normal address in the
 same segment (2001:4ca0:.../64)? 

Sorry, I see it's OK (except my silly question!).
I should definitely read first - then write...

Jarek P. 


Re: [IPv6] PROBLEM? Network unreachable despite correct route

2007-01-11 Thread Jarek Poplawski
On 10-01-2007 01:23, Bernhard Schmidt wrote:
 On Tue, Jan 09, 2007 at 08:36:24PM +0100, Bernhard Schmidt wrote:
...
 I'm having a really ugly problem I'm trying to pinpoint, but failed so
 far. I'm neither completely convinced it is not related to my local
 setup(s), nor do I have any clue how this might be caused.
...
 I managed to pull ip -6 route, ip -6 neigh and ip -6 addr while the box
 was not responding:
 
 ip -6 route:
 2001:4ca0:0:f000::/64 dev eth0  proto kernel  metric 256  expires 86322sec mtu 1500 advmss 1440 fragtimeout 4294967295
 fe80::/64 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
 ff00::/8 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
 default via fe80::2d0:4ff:fe12:2400 dev eth0  proto kernel  metric 1024  expires 1717sec mtu 1500 advmss 1440 fragtimeout 64
 unreachable default dev lo  proto none  metric -1  error -101 fragtimeout 255

Did you analyze this dev lo warning? 

Regards,
Jarek P.


Re: [IPv6] PROBLEM? Network unreachable despite correct route

2007-01-11 Thread Bernhard Schmidt

Jarek Poplawski wrote:

 ip -6 route:
 2001:4ca0:0:f000::/64 dev eth0  proto kernel  metric 256  expires 86322sec mtu 1500 advmss 1440 fragtimeout 4294967295
 fe80::/64 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
 ff00::/8 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
 default via fe80::2d0:4ff:fe12:2400 dev eth0  proto kernel  metric 1024  expires 1717sec mtu 1500 advmss 1440 fragtimeout 64
 unreachable default dev lo  proto none  metric -1  error -101 fragtimeout 255

 Did you analyze this dev lo warning?


That one is there by default. Recent kernels (since 2.6.12 or so, I think
when the default on-link assumption was killed) have a default route
pointing to unreachable default lo on bootup. Routes learned from RA or
added statically are installed with a better metric and are preferred
that way.

I think its purpose is to have a Network unreachable returned immediately
if no IPv6 router is present.
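
The preference described above can be sketched roughly as follows. This is only an illustration, assuming that the lowest metric wins among default routes and that the "metric -1" in the listing is the unsigned 32-bit value 4294967295 printed as a signed number. The Route type and the as_u32 and pick_default helpers are hypothetical names, not kernel code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    target: str                # "default" or a prefix
    dev: str
    metric: int                # as printed by `ip -6 route`
    unreachable: bool = False


def as_u32(metric: int) -> int:
    """Reinterpret a printed (possibly negative) metric as unsigned 32-bit."""
    return metric & 0xFFFFFFFF


def pick_default(routes: List[Route]) -> Optional[Route]:
    """Pick the default route with the lowest (unsigned) metric, if any."""
    defaults = [r for r in routes if r.target == "default"]
    return min(defaults, key=lambda r: as_u32(r.metric), default=None)


routes = [
    Route("default", "eth0", 1024),               # learned from RA
    Route("default", "lo", -1, unreachable=True), # installed at bootup
]
best = pick_default(routes)
print(best.dev, as_u32(-1))
```

With both entries present the eth0 route is chosen; only when it is gone does the unreachable entry on lo match, which would give the immediate Network unreachable.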


Regards,
Bernhard


[IPv6] PROBLEM? Network unreachable despite correct route

2007-01-09 Thread Bernhard Schmidt
Hi,

I'm having a really ugly problem I'm trying to pinpoint, but failed so
far. I'm neither completely convinced it is not related to my local
setup(s), nor do I have any clue how this might be caused.

I have several boxes with native IPv6 connectivity at various places.
Some of them show symptoms of a lost default route for small periods of
time (10-15 seconds several times a day). By symptoms I mean

- traceroute6 from the affected box to any other host dies immediately
  (the network unreachable does not come from the first hop (the
  upstream router), but from the local stack itself)

- a locally running OpenVPN 2.1_rc1b with UDPv6 transport patched in shows
  the following output in the syslog file:

  Tue Jan  9 16:48:28 2007 write UDPv6 []: Network is unreachable (code=101)

- mtr from the outside to the machine shows that the affected box does
  not respond anymore, while the hop before (the router) is clean.

- new connects to the box (e.g. ssh) from the outside are stuck (packets
  get lost; since I'm running my client with tcp_retries=1, I get a
  timeout)

At the same time, established ssh connections to the box work fine. I
can do ip -6 route and it shows the default route, both preferred and
valid lifetime not exceeded (far from that). 
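
The code=101 in the OpenVPN line above is, on Linux, the errno value ENETUNREACH, which matches the error -101 on the unreachable route. A minimal sketch of how a program could tell this locally generated error apart from others when a send fails; describe_send_error is a hypothetical helper for illustration, not part of OpenVPN:

```python
import errno
import os


def describe_send_error(exc: OSError) -> str:
    """Classify a failed send by its errno value."""
    if exc.errno == errno.ENETUNREACH:
        # The routing lookup failed, so no packet was handed to a device.
        return "local stack: " + os.strerror(exc.errno)
    if exc.errno == errno.EHOSTUNREACH:
        return "host unreachable: " + os.strerror(exc.errno)
    return "other error: " + os.strerror(exc.errno)


# Simulate the error from the syslog line instead of touching the network.
err = OSError(errno.ENETUNREACH, os.strerror(errno.ENETUNREACH))
print(describe_send_error(err))
```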

The systems on which I'm observing this are:

- Dell PowerEdge 750 (P4 with HT), Debian Etch, self compiled kernel
  2.6.17.11, connected (e1000) to two upstream Cisco 7200, default route
  is learned from RIPng (Quagga), static addresses

- Dell OptiPlex GXsomething (P4 with HT, Single Core), SuSE 10.2,
  distribution kernel 2.6.18.5-3-default, connected (tg3) to one
  upstream Cisco 6500/Sup720, default route learned through stateless
  autoconfiguration (RA)

- self built AMD Athlon64 (x86_64), Ubuntu Edgy, Distribution kernel
  2.6.17-10-generic, connected (forcedeth) to an upstream Linux box
  (2.6.20-rc3), default route learned through stateless
  autoconfiguration (RA) as well.

My current belief is that this is a regression introduced in 2.6.17.
I have searched for several weeks now why box #1 (the PowerEdge) shows
signs of unreachability in the monitoring, but could not find any clue
(or verify any reachability problems when I got the monitoring alert).
At the same time, a sibling (same hardware, same switch, same network
segment, route also learned through Quagga, but different kernel (2.6.16))
of this box did not show any symptoms, so I ruled out the local network.

Also, I upgraded box #2 from SuSE 10.1 (distribution kernel
2.6.16-something) to SuSE 10.2 yesterday. While it was running the
OpenVPN/UDPv6 daemon the whole time, there has been exactly _one_
occurrence of the Network is unreachable message in the past two weeks
before the upgrade (and I can correlate this message with network
maintenance where the VPN endpoint was indeed unreachable). Since the
upgrade, I have at least 50 lines of that sort in syslog (in about a
day).

It is pretty hard to trace this. It happens rarely, it does not last
long, and I cannot trigger it by putting more network load on that
machine or by doing anything else. IPv4 is fine and without
loss in all cases. All network components are dual-stacked, so if there
was an L2 issue between the router and the host it would affect IPv4 as
well.

Is anyone aware of any issue which might cause this? I've upgraded the
PowerEdge to 2.6.19.1 now, but it is too early to tell whether this
problem still exists. Does anyone recall a bug report and maybe a fix for
it? A patch or a link to a changeset would be even better, so I could
report that to SuSE and Ubuntu to have it included in future kernels. 

Thanks,
Bernhard


Re: [IPv6] PROBLEM? Network unreachable despite correct route

2007-01-09 Thread Bernhard Schmidt
On Tue, Jan 09, 2007 at 08:36:24PM +0100, Bernhard Schmidt wrote:

Hi,

I did some additional testing.

 I'm having a really ugly problem I'm trying to pinpoint, but failed so
 far. I'm neither completely convinced it is not related to my local
 setup(s), nor do I have any clue how this might be caused.
[...]
 - Dell OptiPlex GXsomething (P4 with HT, Single Core), SuSE 10.2,
   distribution kernel 2.6.18.5-3-default, connected (tg3) to one
   upstream Cisco 6500/Sup720, default route learned through stateless
   autoconfiguration (RA)

Running tcpdump on this (target) box shows that ICMPv6 echo requests
(which is what mtr sends to the target box) are received by the box, but
not replied to:

01:02:09.884692 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 54173, length 64
01:02:09.884706 IP6 2001:4ca0:0:f000:211:43ff:fe7e: > 2001:a60:f001:1:218:f3ff:fe66:: ICMP6, echo reply, seq 54173, length 64
01:02:10.428063 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 55453, length 64
01:02:11.056871 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 56733, length 64
01:02:11.700772 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 58013, length 64
[...]
01:02:17.301169 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 3998, length 64
01:02:17.941020 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 5278, length 64
01:02:18.581037 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 6558, length 64
01:02:18.581050 IP6 2001:4ca0:0:f000:211:43ff:fe7e: > 2001:a60:f001:1:218:f3ff:fe66:: ICMP6, echo reply, seq 6558, length 64

While this is happening, the SSH session (between the very same hosts) is
perfectly fine. ip6_tables.ko is not loaded, and there is no other ICMPv6
packet (e.g. neighbor solicitation or router advertisement) anywhere near
the beginning of my problem. Incoming TCP SYNs (from an additional SSH
session I tried to establish when I saw the box was not responding) are
also visible on the interface, but not answered:

01:18:35.638744 IP6 2001:a60:f001:1:218:f3ff:fe66:.57045 > 2001:4ca0:0:f000:211:43ff:fe7e:.22: SWE 1448406153:1448406153(0) win 5760 <mss 1440,sackOK,timestamp 13958554 0,nop,wscale 2>
01:18:35.701523 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 41148, length 64
01:18:36.328728 IP6 2001:a60:f001:1:218:f3ff:fe66: > 2001:4ca0:0:f000:211:43ff:fe7e:: ICMP6, echo request, seq 42428, length 64
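
The request/reply matching done by eye on the ICMPv6 capture further up can be sketched as a small script. It is only an illustration: it keys on the "echo request"/"echo reply" and "seq" fields of tcpdump's text output, the unanswered helper is a hypothetical name, and the sample lines abbreviate the two hosts as A and B.

```python
import re
from typing import List

ECHO = re.compile(r"ICMP6, echo (request|reply), seq (\d+)")


def unanswered(lines: List[str]) -> List[int]:
    """Return seq numbers of echo requests that have no matching echo reply."""
    requests, replies = [], set()
    for line in lines:
        match = ECHO.search(line)
        if not match:
            continue  # not an ICMPv6 echo line (e.g. a TCP SYN)
        kind, seq = match.group(1), int(match.group(2))
        if kind == "request":
            requests.append(seq)
        else:
            replies.add(seq)
    return [seq for seq in requests if seq not in replies]


capture = [
    "01:02:09.884692 IP6 A > B: ICMP6, echo request, seq 54173, length 64",
    "01:02:09.884706 IP6 B > A: ICMP6, echo reply, seq 54173, length 64",
    "01:02:10.428063 IP6 A > B: ICMP6, echo request, seq 55453, length 64",
    "01:02:11.056871 IP6 A > B: ICMP6, echo request, seq 56733, length 64",
    "01:02:18.581037 IP6 A > B: ICMP6, echo request, seq 6558, length 64",
    "01:02:18.581050 IP6 B > A: ICMP6, echo reply, seq 6558, length 64",
]
print(unanswered(capture))
```

Fed with a full capture, the first and last unanswered sequence numbers bracket the window in which the box stopped replying.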

I managed to pull ip -6 route, ip -6 neigh and ip -6 addr while the box
was not responding:

ip -6 route:
2001:4ca0:0:f000::/64 dev eth0  proto kernel  metric 256  expires 86322sec mtu 1500 advmss 1440 fragtimeout 4294967295
fe80::/64 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
ff00::/8 dev eth0  metric 256  expires 21225804sec mtu 1500 advmss 1440 fragtimeout 4294967295
default via fe80::2d0:4ff:fe12:2400 dev eth0  proto kernel  metric 1024  expires 1717sec mtu 1500 advmss 1440 fragtimeout 64
unreachable default dev lo  proto none  metric -1  error -101 fragtimeout 255

ip -6 neigh:
fe80::2d0:4ff:fe12:2400 dev eth0 lladdr 00:d0:04:12:24:00 router REACHABLE
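
The neighbour entry above pairs the default router's link-local address with its MAC address, and the two are consistent with the modified EUI-64 derivation. Routers send their RAs from a link-local source, which is why the default route points at an fe80:: next hop rather than at an address inside 2001:4ca0:0:f000::/64. A small sketch of the derivation; link_local_from_mac is a hypothetical helper name:

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Build the fe80::/64 address with a modified EUI-64 interface ID."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the 48-bit MAC.
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80:0:0:0
    return ipaddress.IPv6Address(prefix + iid)


addr = link_local_from_mac("00:d0:04:12:24:00")
print(addr, addr.is_link_local)
```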

ip -6 addr:
2: eth0: <BROADCAST,MULTICAST,NOTRAILERS,UP,1> mtu 1500 qlen 1000
    inet6 2001:4ca0:0:f000:211:43ff:fe7e:/64 scope global dynamic 
       valid_lft 86318sec preferred_lft 14318sec
    inet6 fe80::211:43ff:fe7e:/64 scope link 
       valid_lft forever preferred_lft forever

Nothing in dmesg or any file in /var/log (except the notorious Network
is unreachable messages from OpenVPN).

I was wrong before, by the way: some outgoing connections from the
affected machine still work. I was able to ping6, traceroute6 and telnet.
At least on this particular machine, though, I am very sure I have seen
Network unreachable on outgoing connects at some point.

I'll try to downgrade this machine to 2.6.16 (and eventually upgrade to 
2.6.19.1) and have a look whether the problem is gone.

 - Dell PowerEdge 750 (P4 with HT), Debian Etch, self compiled kernel
   2.6.17.11, connected (e1000) to two upstream Cisco 7200, default route
   is learned from RIPng (Quagga), static addresses

Still too soon to be absolutely sure, but I think the problem is gone
since the upgrade to 2.6.19.1.

Regards,
Bernhard