Re: Losing IPv6 connectivity after 20min, why?

2017-06-27 Thread Alexander Koeppe
On 20.04.2017 at 03:53, Info wrote:
> Hello Olaf,
> 
> I've run into exactly the same issue, but the delay on my end is around 1 min.
> I'm also on a fresh, brand-new Hetzner root server with no changes applied;
> IPv6 wasn't working from the beginning.
> 
> Also, for some weird reason fe80::1 is missing its lladdr (MAC) in the neighbour table:
> 
> # ip neigh
> fe80::1 dev eth0  FAILED
> 176.9.XX.XX dev eth0 lladdr 00:31:XX:XX:XX:XX REACHABLE
> 
> Anyway, could you tell me: have you solved this issue? If so, how?
> 
> Sincerely.
> 

Have you ever thought about using EUI-64 instead of setting the
link-local address manually?

A possible reason may be a duplicate address on the same link, which
is not unlikely when using fe80::1.
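
For reference, a modified EUI-64 link-local address is derived from the
interface's MAC by flipping the universal/local bit of the first octet and
inserting ff:fe in the middle. The following is only a minimal shell sketch
of that derivation; the function name mac_to_eui64_ll is made up for
illustration, and it assumes a colon-separated 48-bit MAC as input.

```shell
# Hypothetical helper: derive the modified EUI-64 link-local address
# from a colon-separated MAC address (assumption: six hex octets).
mac_to_eui64_ll() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split the MAC into $1..$6
    IFS=$old_ifs
    # Flip the universal/local bit (0x02) in the first octet.
    g1=$(( ((0x$1 ^ 0x02) << 8) | 0x$2 ))
    # Insert ff:fe between the third and fourth octets.
    g2=$(( (0x$3 << 8) | 0xff ))
    g3=$(( 0xfe00 | 0x$4 ))
    g4=$(( (0x$5 << 8) | 0x$6 ))
    printf 'fe80::%x:%x:%x:%x\n' "$g1" "$g2" "$g3" "$g4"
}
```

Called as `mac_to_eui64_ll 50:46:5d:9f:f7:52`, this prints
fe80::5246:5dff:fe9f:f752, which is the eth0 link-local address in the
`ip -6 ad` output elsewhere in this thread (the MAC here is inferred
backwards from that address, it is not quoted in the thread).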

Cheers Alex



Re: Re: Losing IPv6 connectivity after 20min, why?

2017-04-19 Thread Info
Hello Olaf,

I've run into exactly the same issue, but the delay on my end is around 1 min.
I'm also on a fresh, brand-new Hetzner root server with no changes applied;
IPv6 wasn't working from the beginning.

Also, for some weird reason fe80::1 is missing its lladdr (MAC) in the neighbour table:

# ip neigh
fe80::1 dev eth0  FAILED
176.9.XX.XX dev eth0 lladdr 00:31:XX:XX:XX:XX REACHABLE

Anyway, could you tell me: have you solved this issue? If so, how?

Sincerely.


Re: Losing IPv6 connectivity after 20min, why?

2016-09-20 Thread Marc Haber
On Mon, Sep 19, 2016 at 10:58:56PM +0200, Olaf Schreck wrote:
> > Check ip neigh output. Does the entry for your default gateway go
> > STALE after those 20 minutes?
> 
> Yes, exactly:
> 
> # ip -6 nei
> fe80::1 dev eth0 lladdr 0c:86:10:ed:31:ca router STALE

So it is the neighbor table entry going stale. Does your system retry
neighbor solicitation (see tcpdump) or does it sit quietly with a
STALE entry? I had this behavior in similarly styled hosting networks
a few years ago, and since it has fixed itself I suspect a kernel
issue that was fixed since then. I do use more recent kernels than
Debian stable though.
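
One way to watch for this is to capture only Neighbor Solicitation and
Neighbor Advertisement packets. The sketch below just builds a capture
filter for that; the helper name nd_filter is invented here, and the byte
offset assumes that no IPv6 extension headers precede the ICMPv6 header.

```shell
# Hypothetical helper: print a pcap filter that matches ICMPv6
# Neighbor Solicitation (type 135) and Neighbor Advertisement (type 136).
# ip6[40] is the ICMPv6 type byte only when the ICMPv6 header directly
# follows the fixed 40-byte IPv6 header.
nd_filter() {
    printf '%s\n' 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'
}

# Live capture (needs root; eth0 is the interface named in this thread):
#   tcpdump -n -i eth0 "$(nd_filter)"
```

If solicitations for fe80::1 go out and nothing comes back, the problem is
more likely on the link or the gateway; if none go out at all, the host
side is the more likely suspect.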

> > Do you really need to meddle with the fe80::1 route?
> 
> I had no plans to do so.  I just learned during debugging that this would 
> reestablish IPv6 connectivity without reboot.

It would be interesting to see whether this causes your system to do a
neighbor solicitation for the default gateway.

> > Do you really
> > need an explicit route for fe80::1%eth0? Will it work without?
> 
> My hoster's docs (Hetzner) recommend that.  They don't specify an IPv6 
> default gateway.

They're unfortunately not very clueful with regard to IPv6. I haven't
been on their network for years though.

> > or is it
> > really necessary to remove the default route and to re-add it?
> 
> yep, connectivity is back

Interesting.

> > No need, an fe80::/64 IP address is only usable when an interface is
> > specified along with it:
> [...]
> > Is the other interface connected? eth1 should not play a role here at
> > all.
> 
> I had no plans to fiddle with link-local addresses, and of course eth1 
> settings should not matter.  I just disabled IPv6 on eth1 for debugging, 
> and suddenly IPv6 worked >20min.  Maybe coincidence rather than causality. 

Check whether your system re-solicits for the default gateway with
eth1 down and/or up. If its re-solicitation behavior on eth0 differs
depending on eth1's state, we have a clear system software issue here.
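
To compare the two cases, it may help to log the gateway's neighbour state
at intervals. A minimal sketch, assuming the unfiltered `ip -6 neigh show`
line format quoted in this thread (address, "dev", interface, ..., state);
the function name neigh_state is made up for illustration.

```shell
# Hypothetical helper: read `ip -6 neigh show` output on stdin and print
# the state (last field) of the entry for address $1 on interface $2.
neigh_state() {
    awk -v addr="$1" -v dev="$2" '$1 == addr && $3 == dev { print $NF }'
}

# Live use (the address and interface are the ones from this thread):
#   ip -6 neigh show | neigh_state fe80::1 eth0
```

STALE on its own is a normal state; once traffic is sent the entry is
expected to pass through DELAY and PROBE and return to REACHABLE. An entry
that stays STALE under traffic, or ends up FAILED, is the interesting case.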

If we're not talking about jessie but something more recent,
systemd-networkd might play a role. systemd-networkd has recently
started to take over IPv6 mechanics, and does so in a quite broken way.

> I'd like to learn what's going on here,

Me too

> thanks for your comments.

You're welcome

Greetings
Marc

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Kenyon Ralph
On 2016-09-20T00:37:26+0200, Olaf Schreck  wrote:
> > As someone else asked, showing us the output of 'ip -6 a' and 'ip -6
> > r' could be helpful.
> 
> Ok, sorry.  I don't see anything special here:

Thanks. You said it breaks when you configure eth1 though. What is the
configuration you are putting on eth1 that causes the breakage?

> # ip -6 ro
> 2a01:4f8:191:::/64 dev eth0  proto kernel  metric 256
> fe80::/64 dev eth0  proto kernel  metric 256
> fe80::/64 dev eth1  proto kernel  metric 256
> fe80::/64 dev vif1.0  proto kernel  metric 256
> fe80::/64 dev vif2.0  proto kernel  metric 256
> fe80::/64 dev vif3.0  proto kernel  metric 256
> fe80::/64 dev vif4.0  proto kernel  metric 256
> default via fe80::1 dev eth0  metric 1024
> 
> # ip -6 ad
> 1: lo:  mtu 16436
> inet6 ::1/128 scope host
>valid_lft forever preferred_lft forever
> 2: eth0:  mtu 1500 qlen 1000
> inet6 2a01:4f8:191:::4/64 scope global
>valid_lft forever preferred_lft forever
> inet6 fe80::5246:5dff:fe9f:f752/64 scope link
>valid_lft forever preferred_lft forever
> 3: eth1:  mtu 1500 qlen 1000
> inet6 fe80::6a05:caff:fe18:596/64 scope link
>valid_lft forever preferred_lft forever
> 4: vif1.0:  mtu 1500 qlen 32
> inet6 fe80::fcff::feff:/64 scope link
>valid_lft forever preferred_lft forever
> 5: vif2.0:  mtu 1500 qlen 32
> inet6 fe80::fcff::feff:/64 scope link
>valid_lft forever preferred_lft forever
> 6: vif3.0:  mtu 1500 qlen 32
> inet6 fe80::fcff::feff:/64 scope link
>valid_lft forever preferred_lft forever
> 7: vif4.0:  mtu 1500 qlen 32
> inet6 fe80::fcff::feff:/64 scope link
>valid_lft forever preferred_lft forever


-- 
Kenyon Ralph




Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Olaf Schreck
> As someone else asked, showing us the output of 'ip -6 a' and 'ip -6
> r' could be helpful.

Ok, sorry.  I don't see anything special here:

# ip -6 ro
2a01:4f8:191:::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
fe80::/64 dev vif1.0  proto kernel  metric 256
fe80::/64 dev vif2.0  proto kernel  metric 256
fe80::/64 dev vif3.0  proto kernel  metric 256
fe80::/64 dev vif4.0  proto kernel  metric 256
default via fe80::1 dev eth0  metric 1024

# ip -6 ad
1: lo:  mtu 16436
inet6 ::1/128 scope host
   valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qlen 1000
inet6 2a01:4f8:191:::4/64 scope global
   valid_lft forever preferred_lft forever
inet6 fe80::5246:5dff:fe9f:f752/64 scope link
   valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qlen 1000
inet6 fe80::6a05:caff:fe18:596/64 scope link
   valid_lft forever preferred_lft forever
4: vif1.0:  mtu 1500 qlen 32
inet6 fe80::fcff::feff:/64 scope link
   valid_lft forever preferred_lft forever
5: vif2.0:  mtu 1500 qlen 32
inet6 fe80::fcff::feff:/64 scope link
   valid_lft forever preferred_lft forever
6: vif3.0:  mtu 1500 qlen 32
inet6 fe80::fcff::feff:/64 scope link
   valid_lft forever preferred_lft forever
7: vif4.0:  mtu 1500 qlen 32
inet6 fe80::fcff::feff:/64 scope link
   valid_lft forever preferred_lft forever



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Kenyon Ralph
On 2016-09-19T22:58:56+0200, Olaf Schreck  wrote:
> I had no plans to fiddle with link-local addresses, and of course eth1 
> settings should not matter.  I just disabled IPv6 on eth1 for debugging, 
> and suddenly IPv6 worked >20min.  Maybe coincidence rather than causality. 
> 
> I'd like to learn what's going on here, thanks for your comments.

As someone else asked, showing us the output of 'ip -6 a' and 'ip -6
r' could be helpful.

-- 
Kenyon Ralph




Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Olaf Schreck
> Check ip neigh output. Does the entry for your default gateway go
> STALE after those 20 minutes?

Yes, exactly:

# ip -6 nei
fe80::1 dev eth0 lladdr 0c:86:10:ed:31:ca router STALE

> Also check the lifetime of any SLAAC ip addresses given in ip addr
> output.

forever

> Do you really need to meddle with the fe80::1 route?

I had no plans to do so.  I just learned during debugging that this would 
reestablish IPv6 connectivity without reboot.

> Do you really
> need an explicit route for fe80::1%eth0? Will it work without?

My hoster's docs (Hetzner) recommend that.  They don't specify an IPv6 
default gateway.

> Does adding a route for 2000::/3 via fe80::1 dev eth0 help,

nope (just tried it)

> or is it
> really necessary to remove the default route and to re-add it?

yep, connectivity is back

> No need, an fe80::/64 IP address is only usable when an interface is
> specified along with it:
[...]
> Is the other interface connected? eth1 should not play a role here at
> all.

I had no plans to fiddle with link-local addresses, and of course eth1 
settings should not matter.  I just disabled IPv6 on eth1 for debugging, 
and suddenly IPv6 worked >20min.  Maybe coincidence rather than causality. 

I'd like to learn what's going on here, thanks for your comments.


Olaf



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Olaf Schreck
I didn't reply yet because I'm still testing stuff.

But privacy extensions are not the problem; they're turned off, aren't they?

# sysctl -a | grep tempad
net.ipv6.conf.all.use_tempaddr = 0
net.ipv6.conf.default.use_tempaddr = 0
net.ipv6.conf.lo.use_tempaddr = -1
net.ipv6.conf.eth0.use_tempaddr = 0
net.ipv6.conf.eth1.use_tempaddr = 0
net.ipv6.conf.vif1/0.use_tempaddr = 0
net.ipv6.conf.vif2/0.use_tempaddr = 0
net.ipv6.conf.vif3/0.use_tempaddr = 0
net.ipv6.conf.vif4/0.use_tempaddr = 0
net.ipv6.conf.vif5/0.use_tempaddr = 0
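
For reading such output: in the Linux kernel a use_tempaddr value of 0 or
less means privacy extensions are off, 1 means temporary addresses are
generated but public ones are preferred, and 2 or more means temporary
addresses are preferred. A small sketch that lists only the interfaces
where they are on; the function name tempaddr_enabled is made up here, and
it expects `sysctl -a` style lines on stdin.

```shell
# Hypothetical helper: read "net.ipv6.conf.<if>.use_tempaddr = N" lines
# on stdin and print each <if> whose value is greater than 0.
tempaddr_enabled() {
    awk -F' = ' '/use_tempaddr/ && $2 > 0 {
        sub(/^net\.ipv6\.conf\./, "", $1)
        sub(/\.use_tempaddr$/, "", $1)
        print $1
    }'
}

# Live use:
#   sysctl -a 2>/dev/null | tempaddr_enabled
```

Fed the values listed above, it prints nothing, which agrees with the
reading that privacy extensions are off on every interface.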

> I have DEFINITELY run into that problem. To the point where I generally 
> disable the privacy extension stuff to prevent it. A whole lot of equipment 
> doesn't handle it right and breaks everything.



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Matthew Hall
On Mon, Sep 19, 2016 at 02:06:40PM +0200, Gerdriaan Mulder wrote:
> Could you also check whether privacy extensions are enabled on eth0
> and eth1 (/proc/sys/net/ipv6/conf/*/use_tempaddr)? I have a hunch that
> this might explain the 20 minutes lifetime.
> 
> ~ Gerdriaan

I have DEFINITELY run into that problem. To the point where I generally 
disable the privacy extension stuff to prevent it. A whole lot of equipment 
doesn't handle it right and breaks everything.

Matthew.



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Marc Haber
Hi,

On Mon, Sep 19, 2016 at 02:06:40PM +0200, Gerdriaan Mulder wrote:
> Depending on whether you need the link-local on the other interface
> (e.g. eth1), you could try a couple of things:
> * remove that address from the interface (which also removes the
> fe80::/64 route on that interface)
> * remove the fe80::/64 route on eth1 (although the OS might add it
> again at some point)

This should not matter. It should actually make things worse.
Generally, do not remove fe80::/64 addresses from interfaces.

> Could you also check whether privacy extensions are enabled on eth0
> and eth1 (/proc/sys/net/ipv6/conf/*/use_tempaddr)? I have a hunch that
> this might explain the 20 minutes lifetime.

Good point. Yes.

Greetings
Marc

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Marc Haber
On Mon, Sep 19, 2016 at 12:55:11PM +0200, Olaf Schreck wrote:
> My hoster (Hetzner) routes the 2a01:4f8:191::/64 network to the 
> server.  Following their instructions, I assign a static address from that 
> block and set the default route to fe80::1, either manually 
>  ip -6 addr add 2a01:4f8:191:::5/64 dev eth0
>  ip -6 route add default via fe80::1 dev eth0
> 
> or in /etc/network/interfaces like this
>  iface eth0 inet6 static
>   address 2a01:4f8:191:::5
>   netmask 64
>   gateway fe80::1
> 
> This works, I can ping6.  But it reproducibly stops working after 20min, 
> confirmed using this command
>  while true; do date; ping6 -c3 -w5 www.google.com; sleep 10; done

Check ip neigh output. Does the entry for your default gateway go
STALE after those 20 minutes?

Also check the lifetime of any SLAAC ip addresses given in ip addr
output.

> I'm sure it's not Google rate-limiting my pings, I get the same results with 
> various IPv6 addresses that I'm authorized to ping.

fyi, google does not rate-limit pings against 8.8.8.8, 8.8.4.4 and
their IPv6 counterparts which I simply cannot memorize.

> To restore IPv6 connectivity (IPv4 is still working), I can either reboot or 
> re-add the default route with these commands (order is important):
>  ip -6 route del default
>  ip -6 route del fe80::1 dev eth0
>  ip -6 route add fe80::1 dev eth0
>  ip -6 route add default via fe80::1 dev eth0

Do you really need to meddle with the fe80::1 route? Do you really
need an explicit route for fe80::1%eth0? Will it work without?

Does adding a route for 2000::/3 via fe80::1 dev eth0 help, or is it
really necessary to remove the default route and to re-add it?

> Important data point: This server has 2 ethernet interfaces, so there are 
> 2 link-local fe80::/64 routes to eth0 and eth1.  I was suspicious that the 
> problem might be related, so I disabled IPv6 on the second interface 
> completely with sysctl net.ipv6.conf.eth1.disable_ipv6 = 1.

No need, an fe80::/64 IP address is only usable when an interface is
specified along with it:

[2/501]mh@parada:~$ ping6 fe80::1
connect: Invalid argument
[3/502]mh@parada:~$ ping6 fe80::1%eth0
PING fe80::1%eth0(fe80::1) 56 data bytes
64 bytes from fe80::1: icmp_seq=1 ttl=64 time=2.25 ms
^C
--- fe80::1%eth0 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.252/2.252/2.252/0.000 ms
[4/503]mh@parada:~$

Same reason why I think that your explicit fe80::1 route is unnecessary.
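
The same rule can be written down as a small check: a link-local address
(fe80::/10) is ambiguous until an interface is attached with the %zone
suffix. This is only a sketch; the function name needs_zone is made up,
and it matches on the textual form with all four hex digits of the first
group written out.

```shell
# Hypothetical helper: succeed (exit 0) if $1 is a link-local IPv6
# address written without a %zone suffix, fail (exit 1) otherwise.
needs_zone() {
    case $1 in
        *%*) return 1 ;;                  # zone already given, e.g. fe80::1%eth0
        [fF][eE][89aAbB]?:*) return 0 ;;  # fe80::/10 without a zone
        *) return 1 ;;                    # not link-local
    esac
}

# Example:
#   needs_zone fe80::1 && echo "append %<interface> before using this address"
```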

> And that resulted in stable and flawless IPv6 connectivity!

Is the other interface connected? eth1 should not play a role here at
all.

Greetings
Marc

-- 
-
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421



Re: Losing IPv6 connectivity after 20min, why?

2016-09-19 Thread Gerdriaan Mulder
Hi Olaf,

Depending on whether you need the link-local on the other interface
(e.g. eth1), you could try a couple of things:
* remove that address from the interface (which also removes the
fe80::/64 route on that interface)
* remove the fe80::/64 route on eth1 (although the OS might add it
again at some point)

Either of those could mean that stuff like neighbour discovery
probably fails, but I guess you have to check that in your specific
situation.

Could you post the (relevant) output of `ip -6 route` and `ip -6 addr`
during the 20 minutes of 'working' IPv6 (i.e. with 2 fe80::/64 routes
and whether there are temporary addresses assigned to eth0 and eth1)?
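
Temporary (privacy) addresses carry the "temporary" flag in `ip -6 addr`
output, so they can be picked out mechanically. A minimal sketch, with the
made-up function name temp_addrs, reading that output on stdin:

```shell
# Hypothetical helper: read `ip -6 addr` output on stdin and print every
# inet6 address/prefix whose line carries the "temporary" flag.
temp_addrs() {
    awk '$1 == "inet6" && / temporary/ { print $2 }'
}

# Live use:
#   ip -6 addr show dev eth0 | temp_addrs
```

Empty output means no temporary addresses are currently assigned on that
interface.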

Could you also check whether privacy extensions are enabled on eth0
and eth1 (/proc/sys/net/ipv6/conf/*/use_tempaddr)? I have a hunch that
this might explain the 20 minutes lifetime.

~ Gerdriaan

On 19 September 2016 at 12:55, Olaf Schreck  wrote:
> I have configured a Debian 7 server for IPv6 (in addition to IPv4).
> I can ping6 www.google.com and other addresses, fine.  BUT the server
> reproducibly loses IPv6 connectivity after roughly 20min, and I can't
> figure out why this happens.  Clues anyone?
>
> My hoster (Hetzner) routes the 2a01:4f8:191::/64 network to the
> server.  Following their instructions, I assign a static address from that
> block and set the default route to fe80::1, either manually
>  ip -6 addr add 2a01:4f8:191:::5/64 dev eth0
>  ip -6 route add default via fe80::1 dev eth0
>
> or in /etc/network/interfaces like this
>  iface eth0 inet6 static
>   address 2a01:4f8:191:::5
>   netmask 64
>   gateway fe80::1
>
> This works, I can ping6.  But it reproducibly stops working after 20min,
> confirmed using this command
>  while true; do date; ping6 -c3 -w5 www.google.com; sleep 10; done
>
> I'm sure it's not Google rate-limiting my pings, I get the same results with
> various IPv6 addresses that I'm authorized to ping.
>
> To restore IPv6 connectivity (IPv4 is still working), I can either reboot or
> re-add the default route with these commands (order is important):
>  ip -6 route del default
>  ip -6 route del fe80::1 dev eth0
>  ip -6 route add fe80::1 dev eth0
>  ip -6 route add default via fe80::1 dev eth0
>
> Which will give another 20min of IPv6. 100% reproducible. And it's just the
> routing that needs to be fixed.
>
> I have excluded:
> - no router advertisements used by the hoster, no such packets seen with
>   tcpdump, server is not configured to accept RAs
> - no ip6tables rules, and default ACCEPT everywhere
> - no cronjob or other periodical script that could be responsible
> - no "security software" or similar that would interfere
>
> Important data point: This server has 2 ethernet interfaces, so there are
> 2 link-local fe80::/64 routes to eth0 and eth1.  I was suspicious that the
> problem might be related, so I disabled IPv6 on the second interface
> completely with sysctl net.ipv6.conf.eth1.disable_ipv6 = 1.
>
> And that resulted in stable and flawless IPv6 connectivity!
>
> While this workaround is ok for this server, I have another one that shows
> the same symptoms.  But for that server I need IPv6 on the other interfaces,
> so the workaround does not apply.
>
> I'd rather like to learn why this happens, or what config part I may be
> missing. Clues or further debugging hints very welcome. Thanks!
>
> Olaf
>