Re: [systemd-devel] networkd losing dhcp lease with dracut / nfs root

2014-08-14 Thread Tom Gundersen
On Tue, Jul 15, 2014 at 3:50 AM, Rich Freeman
r-syst...@thefreemanclan.net wrote:
 At 150 seconds left it renews DHCP, but does not update valid_lft.
 A minute later it again renews DHCP, but also does not update valid_lft.
 51 seconds later it again renews DHCP, and this time it updates valid_lft.

 So, the interface never drops, but it isn't really maintaining
 valid_lft at all points where it could.  I don't know what would have
 happened if it didn't get the lease at the last update - at that point
 there was around 30s left.  I guess I could test that if necessary by
 shutting down the dhcp server.

Hi Rich,

Sorry for not getting back to you sooner.

I had another look at this, and I cannot reproduce it. Whenever my DHCP
lease is renewed, this is immediately reflected in the valid_lft. I
added a bit more debugging to networkd, so if you are able to
reproduce this with current git, please post your debug logs.

Cheers,

Tom
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] networkd losing dhcp lease with dracut / nfs root

2014-07-14 Thread Rich Freeman
On Sun, Jun 29, 2014 at 10:27 AM, Tom Gundersen t...@jklm.no wrote:
 On Sat, Jun 28, 2014 at 11:29 AM, Tom Gundersen t...@jklm.no wrote:
 Your analysis is correct. networkd is not updating the lft.

 We should change two things: dracut (or whatever is being used on your
 machine) should set an infinite lifetime when using NFS root (IMHO),
 and networkd should update the lft (and in particular force-set it to
 infinite if CriticalConnection is being used).

 The latter is on my TODO.

 I just pushed a fix for this in networkd, please let me know if you
 are still having issues.

Did this make it into 215?  If so, I'm still seeing odd behavior
though it no longer crashes.

I have a 5min dhcp lease (for testing).

If I set CriticalConnection then it sets valid_lft to forever, and if
not it starts at 300s - this seems right.

At 150 seconds left it renews DHCP, but does not update valid_lft.
A minute later it again renews DHCP, but also does not update valid_lft.
51 seconds later it again renews DHCP, and this time it updates valid_lft.

So, the interface never drops, but it isn't really maintaining
valid_lft at all points where it could.  I don't know what would have
happened if it didn't get the lease at the last update - at that point
there was around 30s left.  I guess I could test that if necessary by
shutting down the dhcp server.
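
For reference, the intervals above are close to the default timers in
RFC 2131: T1 (renew) at 0.5 of the lease and T2 (rebind) at 0.875, with
unanswered requests retried after half the time remaining to T2 but no
sooner than 60 seconds. A minimal Python sketch of that schedule follows.
The function name is hypothetical, the random fuzz the RFC allows is left
out, and whether networkd follows exactly this schedule is an assumption.

```python
def renewal_schedule(lease, t1_frac=0.5, t2_frac=0.875, min_wait=60.0):
    """Elapsed times (seconds) at which renewal requests would be sent,
    using RFC 2131 default timers and retrying until T2 is reached.

    Assumption: every request before T2 goes unanswered, so the client
    keeps retrying. Randomized fuzz is deliberately omitted.
    """
    t1 = lease * t1_frac          # start of RENEWING
    t2 = lease * t2_frac          # start of REBINDING
    times = [t1]
    now = t1
    while True:
        # Retry after half the remaining time to T2, but at least min_wait.
        wait = max((t2 - now) / 2, min_wait)
        if now + wait >= t2:
            break
        now += wait
        times.append(now)
    times.append(t2)
    return times


if __name__ == "__main__":
    lease = 300
    for t in renewal_schedule(lease):
        print(f"request at {t:.1f}s elapsed, {lease - t:.1f}s of lease left")
```

For a 300 second lease this yields requests with 150, 90 and 37.5 seconds
of lifetime left. It only models the retry timing; it says nothing about
why valid_lft was not updated on the first two renewals.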

Rich


Re: [systemd-devel] networkd losing dhcp lease with dracut / nfs root

2014-06-29 Thread Tom Gundersen
On Sat, Jun 28, 2014 at 11:29 AM, Tom Gundersen t...@jklm.no wrote:
 Your analysis is correct. networkd is not updating the lft.

 We should change two things: dracut (or whatever is being used on your
 machine) should set an infinite lifetime when using NFS root (IMHO),
 and networkd should update the lft (and in particular force-set it to
 infinite if CriticalConnection is being used).

 The latter is on my TODO.

I just pushed a fix for this in networkd, please let me know if you
are still having issues.

Cheers,

Tom


Re: [systemd-devel] networkd losing dhcp lease with dracut / nfs root

2014-06-28 Thread Tom Gundersen
Hi Rich,

Your analysis is correct. networkd is not updating the lft.

We should change two things: dracut (or whatever is being used on your
machine) should set an infinite lifetime when using NFS root (IMHO),
and networkd should update the lft (and in particular force-set it to
infinite if CriticalConnection is being used).

The latter is on my TODO.
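
Until both changes land, the lifetime of an already configured address
can be set by hand with iproute2. The Python sketch below only builds and
runs an "ip addr change" command; the helper names are hypothetical, the
address and interface are the ones from the report, and this is a manual
workaround, not what dracut or networkd do internally.

```python
import subprocess


def lifetime_cmd(address, dev, valid="forever", preferred="forever"):
    """Build an iproute2 command that rewrites an address's lifetimes.

    valid/preferred are either "forever" or a number of seconds.
    """
    return ["ip", "addr", "change", address, "dev", dev,
            "valid_lft", str(valid), "preferred_lft", str(preferred)]


def set_lifetime(address, dev, **lifetimes):
    """Run the command; needs root (CAP_NET_ADMIN) on the target box."""
    subprocess.run(lifetime_cmd(address, dev, **lifetimes), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try.
    print(" ".join(lifetime_cmd("192.168.0.10/24", "eth0")))
```

Note that forcing the lifetime to forever only stops the kernel from
removing the address; it does not renew the lease with the DHCP server.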

Cheers,

Tom

On Sat, Jun 28, 2014 at 5:19 AM, Rich Freeman
r-syst...@thefreemanclan.net wrote:
 I'm running systemd-212 and dracut-037, on a diskless box with an nfs
 root and pxe boot.

 After a number of updates I noticed that the box would freeze up after
 24h uptime - almost exactly.  This behavior is the same whether I have
 systemd-networkd running or not (it is configured to set up any
 interface matching e* with dhcp).

 I traced this to the dhcp lease time - if I set the lease to 10min the
 box freezes in 10min, with errors spewing to the network console
 shortly after about not being able to reach the nfs server.

 After some research, I suspect it is the result of:
 https://bugzilla.redhat.com/show_bug.cgi?id=1097523

 I monitored the box more closely and discovered that with a 10 minute
 lease the box is renewing the lease after 5 minutes.  However, if I
 run watch ip addr the box counts down the valid_lft from 600 seconds
 down to 1 second with no change after 5 minutes.

 If I disable systemd-networkd then the box doesn't renew the lease at
 all, and valid_lft counts down just the same.

 I suspect that systemd-networkd is renewing the lease but not updating
 the valid_lft on the interface, and thus after the original lease
 expires the kernel brings it down.

 The only other thing that is odd is that my interface has two IPs
 assigned, and I have no idea where one is coming from:
 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
     link/ether 00:01:2e:31:04:dc brd ff:ff:ff:ff:ff:ff
     inet 200.0.0.0/24 brd 200.0.0.255 scope global eth0
        valid_lft forever preferred_lft forever
     inet 192.168.0.10/24 brd 192.168.0.255 scope global dynamic eth0
        valid_lft 220sec preferred_lft 220sec
     inet6 fe80::201:2eff:fe31:4dc/64 scope link
        valid_lft forever preferred_lft forever

 Clearly systemd-networkd is managing 192.168.0.10:
 Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: link is up
 Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: carrier on
 Jun 27 23:12:43 mythliv2 systemd[1]: Started Network Service.
 Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: DHCPv4 address
 192.168.0.10/24 via 192.168.0.101
 Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: link configured

 I'm not sure where the other IP is coming from - it shows up even if I
 don't enable systemd-networkd, so perhaps dracut is setting it up.
 I'm not sure if its valid_lft of forever is causing any confusion
 though.

 My network config:
 [Match]
 Name=e*

 [Network]
 DHCP=yes

 [DHCPv4]
 CriticalConnection=yes

 (I get the same behavior if I drop the CriticalConnection=yes)

 Any thoughts as to what is going wrong here?  I'm happy to test patches/etc.

 Rich


[systemd-devel] networkd losing dhcp lease with dracut / nfs root

2014-06-27 Thread Rich Freeman
I'm running systemd-212 and dracut-037, on a diskless box with an nfs
root and pxe boot.

After a number of updates I noticed that the box would freeze up after
24h uptime - almost exactly.  This behavior is the same whether I have
systemd-networkd running or not (it is configured to set up any
interface matching e* with dhcp).

I traced this to the dhcp lease time - if I set the lease to 10min the
box freezes in 10min, with errors spewing to the network console
shortly after about not being able to reach the nfs server.

After some research, I suspect it is the result of:
https://bugzilla.redhat.com/show_bug.cgi?id=1097523

I monitored the box more closely and discovered that with a 10 minute
lease the box is renewing the lease after 5 minutes.  However, if I
run watch ip addr the box counts down the valid_lft from 600 seconds
down to 1 second with no change after 5 minutes.

If I disable systemd-networkd then the box doesn't renew the lease at
all, and valid_lft counts down just the same.

I suspect that systemd-networkd is renewing the lease but not updating
the valid_lft on the interface, and thus after the original lease
expires the kernel brings it down.

The only other thing that is odd is that my interface has two IPs
assigned, and I have no idea where one is coming from:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:01:2e:31:04:dc brd ff:ff:ff:ff:ff:ff
    inet 200.0.0.0/24 brd 200.0.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.0.10/24 brd 192.168.0.255 scope global dynamic eth0
       valid_lft 220sec preferred_lft 220sec
    inet6 fe80::201:2eff:fe31:4dc/64 scope link
       valid_lft forever preferred_lft forever

Clearly systemd-networkd is managing 192.168.0.10:
Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: link is up
Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: carrier on
Jun 27 23:12:43 mythliv2 systemd[1]: Started Network Service.
Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: DHCPv4 address
192.168.0.10/24 via 192.168.0.101
Jun 27 23:12:43 mythliv2 systemd-networkd[442]: eth0: link configured

I'm not sure where the other IP is coming from - it shows up even if I
don't enable systemd-networkd, so perhaps dracut is setting it up.
I'm not sure if its valid_lft of forever is causing any confusion
though.

My network config:
[Match]
Name=e*

[Network]
DHCP=yes

[DHCPv4]
CriticalConnection=yes

(I get the same behavior if I drop the CriticalConnection=yes)

Any thoughts as to what is going wrong here?  I'm happy to test patches/etc.

Rich