Re: Network services started before NIC UP.
Erwan David wrote:
> Everything in /etc/network/interfaces.
>
> It is a bit complicated, let me explain the situation before going to
> configuration:

Actually your situation sounds pretty normal to me.

> # The primary network interface
> auto eth0
> iface eth0 inet static
>     address 88.190.17.120
>     netmask 255.255.255.0
>     gateway 88.190.17.1
>     up ip addr add 88.191.245.121/32 dev eth0 label eth0:0
>     up ip -6 addr add 2001:0bc8:30d3::1/64 dev eth0
>     down ip addr del 88.191.245.121/32 dev eth0 label eth0:0
>     down ip -6 addr del 2001:0bc8:30d3::1/64 dev eth0

I don't see anything unusual there. However I am not an IPv6 expert
and still need to learn the details of it. The IPv4 parts look
perfectly reasonable. I have no reason to doubt the IPv6 parts.

> 88.190.17.120 is the "private" address (if I change server I will get
> another address) 88.191.245.121 and 2001:0bc8:30d3::1 are the "public
> addresses", because I may migrate them to another machine at same
> hoster, making them more robust for public facing services (web email
> and ntp server in pool.ntp.org for this one)

Yes. A common strategy. Looks good.

> The router for IPv6 is given through the RA (I have the correct sysctl
> set up for accepting the RA *and* routing IPv6)

I will assume it is good.

The important thing is that it will start up using ifupdown. It is
set to use "auto", meaning that it will start synchronously at system
boot time. If it were using "allow-hotplug" then it would use the
current standard event driven interface. The two startup paths should
both work but they are different. It is certainly possible for them
to behave differently, with one path working and one not working. For
example, I have problems with NIS/yp with the allow-hotplug event
driven path but it works with the auto path. (I need to debug that to
root cause some day.)

> > Just for the purposes of debugging if you are using "allow-hotplug"
> > then try switching that to "auto". In theory allow-hotplug should
> > always work but since it is the newer event driven method sometimes
> > there are still bugs to be found. It is possible that your case is
> > one of those. Try "auto" instead and see if that older start ordering
> > causes things to work in the correct way.
>
> I always use auto for fixed machines, like this server.

I see by this that you are already aware of the issues and understand
the differences between them. I will still say a lot for the archive
because it might help someone else looking at the problem later.

But then my question would be the reverse. If you were to switch to
allow-hotplug would that cause things to happen differently and
perhaps work? It would be something to try. Although I am sure you
don't want to thrash your production server. Trying these experiments
on a local victim development machine or VM would be good.

Since you are using "auto", the dependencies declared in the LSB
headers in the /etc/init.d/* scripts should drive the placement in the
boot order of the /etc/rc2.d/S* symlinks. Things should work in that
order. If things do not work in that order then that is the problem
to find and fix.

Also when the interface starts up it will execute the scripts
registered in /etc/network/if-*.d/* and those will happen at the time
when the interface status changes. But I doubt that is the problem
here since by definition if-up.d/foo would happen after the interface
is up and your problem is something happening before then.

> resolv.conf is
>
> search rail.eu.org
> nameserver 127.0.0.1

Just to verify, no "resolvconf" installed?

> unbound listens on loopback when it is started:
>
> unbound 3048 unbound 3u IPv4 11035 0t0 UDP 127.0.0.1:domain
> unbound 3048 unbound 4u IPv4 11036 0t0 TCP 127.0.0.1:domain (LISTEN)
> unbound 3048 unbound 5u IPv6 11037 0t0 UDP [::1]:domain
> unbound 3048 unbound 6u IPv6 11038 0t0 TCP [::1]:domain (LISTEN)

I will guess that the problem is that "auto" is the old path through
the system boot. Something in your use of 'unbound' isn't set up for
that path. Dig into how unbound starts.

  $ ls -1 /etc/rcS.d/S*
  $ ls -1 /etc/rc2.d/S*

Look over that list and verify that networking is started from
/etc/rcS.d/S*networking and that unbound starts up when it is supposed
to start up. For example for me:

  /etc/rcS.d/S15networking
  /etc/rc2.d/S03bind9

Everyone's numbers will be different of course since those are
determined by the installed set of LSB headers from the /etc/init.d/*
files. The numbers themselves do not matter. They are set dynamically
by 'insserv'.

> > The errors you showed in the log file were from dns name resolution
> > failures. How are nameservers configured for your machine? Are you
> > using DHCP to set them? Or are they statically defined? Are you
> > running a local machine nameserver daemon such as bind9 or dnsmasq or
> > other? What is in the /etc/resolv.conf file?
>
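
The symlink check described above could be scripted roughly as follows. This is only a sketch: the function name start_links is made up for illustration and is not part of sysv-rc or insserv, and the service names in the usage comment (networking, unbound) are taken from the discussion rather than verified on the machine in question.

```shell
#!/bin/sh
# start_links DIR SERVICE
# Print the names of the start symlinks (SNNname) for SERVICE in DIR.
# Hypothetical helper for illustration only.
start_links() {
    dir="$1"
    svc="$2"
    for link in "$dir"/S[0-9][0-9]"$svc"; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$link" ] || [ -L "$link" ] || continue
        echo "${link##*/}"
    done
}

# Possible use on a sysv-rc system (paths as in the message above):
#   start_links /etc/rcS.d networking
#   start_links /etc/rc2.d unbound
```

Comparing the two results shows at a glance whether a service has a start link at all and in which runlevel directory it lives; an empty result for a service is itself a hint worth following up.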
Re: Network services started before NIC UP.
On Wed, Dec 11, 2013 at 07:02:14PM CET, Bob Proulx said:
>
> In that case we will need to keep peeling back layers until the root
> cause is found. How are you starting the network? Is this a section
> listed in /etc/network/interfaces? Please show us the section. Or is
> this using NetworkManager / wicd? (If NetworkManager/wicd then there
> will be no section in /etc/network/interfaces for your network
> device. No config there means that NM/wicd manages it.)

Everything in /etc/network/interfaces.

It is a bit complicated, let me explain the situation before going to
configuration:

The machine has 2 IPv4 addresses and 1 IPv6 address on the same
physical interface. IPv6 is fixed, but I must get a prefix delegation
by a dhcpv6 client for the prefix I get from the hoster to be routed
to the machine (I use dibbler per the hoster doc, but I could use
another one).

So here is /etc/network/interfaces

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
    address 88.190.17.120
    netmask 255.255.255.0
    gateway 88.190.17.1
    up ip addr add 88.191.245.121/32 dev eth0 label eth0:0
    up ip -6 addr add 2001:0bc8:30d3::1/64 dev eth0
    down ip addr del 88.191.245.121/32 dev eth0 label eth0:0
    down ip -6 addr del 2001:0bc8:30d3::1/64 dev eth0

88.190.17.120 is the "private" address (if I change server I will get
another address). 88.191.245.121 and 2001:0bc8:30d3::1 are the "public
addresses", because I may migrate them to another machine at the same
hoster, making them more robust for public facing services (web, email
and ntp server in pool.ntp.org for this one).

The router for IPv6 is given through the RA (I have the correct sysctl
set up for accepting the RA *and* routing IPv6).

resolv.conf is

search rail.eu.org
nameserver 127.0.0.1

unbound listens on loopback when it is started:

unbound 3048 unbound 3u IPv4 11035 0t0 UDP 127.0.0.1:domain
unbound 3048 unbound 4u IPv4 11036 0t0 TCP 127.0.0.1:domain (LISTEN)
unbound 3048 unbound 5u IPv6 11037 0t0 UDP [::1]:domain
unbound 3048 unbound 6u IPv6 11038 0t0 TCP [::1]:domain (LISTEN)

> To get ahead in the discussion I will suggest this excellent reference
> for understanding the /etc/network/interfaces file.
>
> http://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_basic_syntax_of_etc_network_interfaces

I know it, thanks.

> Just for the purposes of debugging if you are using "allow-hotplug"
> then try switching that to "auto". In theory allow-hotplug should
> always work but since it is the newer event driven method sometimes
> there are still bugs to be found. It is possible that your case is
> one of those. Try "auto" instead and see if that older start ordering
> causes things to work in the correct way.

I always use auto for fixed machines, like this server.

> The errors you showed in the log file were from dns name resolution
> failures. How are nameservers configured for your machine? Are you
> using DHCP to set them? Or are they statically defined? Are you
> running a local machine nameserver daemon such as bind9 or dnsmasq or
> other? What is in the /etc/resolv.conf file?

I use 2 dns servers, on different IP addresses: NSD on the public
addresses, authoritative for the rail.eu.org zone and the
2001:0bc8:30d3::/48 reverse zone, and unbound on loopback and
88.190.17.120 as recursive server for my small infrastructure.

But the problem is not here. I realize that my choice of logs was
rather poor. Here is another excerpt that I will comment on:

Dec 10 18:21:24 tee kernel: [ 14.722600] loop: module loaded
Dec 10 18:21:24 tee kernel: [ 15.276191] bnx2 :02:00.0: firmware: agent loaded bnx2/bnx2-mips-09-6.2.1b.fw into memory
Dec 10 18:21:24 tee kernel: [ 15.282623] bnx2 :02:00.0: firmware: agent loaded bnx2/bnx2-rv2p-09-6.0.17.fw into memory
Dec 10 18:21:24 tee kernel: [ 15.282687] bnx2 :02:00.0: irq 49 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282699] bnx2 :02:00.0: irq 50 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282712] bnx2 :02:00.0: irq 51 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282718] bnx2 :02:00.0: irq 52 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282725] bnx2 :02:00.0: irq 53 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282732] bnx2 :02:00.0: irq 54 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282739] bnx2 :02:00.0: irq 55 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282746] bnx2 :02:00.0: irq 56 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.282752] bnx2 :02:00.0: irq 57 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [ 15.347478] bnx2 :02:00.0 eth0: using MSIX
Dec 10 18:21:24 tee kernel: [ 15.347685] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready -> link not re
Re: Network services started before NIC UP.
Erwan David wrote:
> Bob Proulx said:
> > Erwan David wrote:
> > > I have a problem that services are started on a server I manage before
> > > link is UP. This leads to some services failing, or not bound to every
> > > address:
> >
> > Please say what release track you are using? Unstable, Testing, Stable?
>
> I use testing.

Testing. Gotcha.

> > Please say whether you are using parallel boot or not.
>
> I use the standard testing sysVrc with "make style parallel boot".
>
> > Running
> > 'insserv' manually and noting any errors and reporting them would be
> > very useful.
> >
> > # insserv -v
> > insserv: creating .depend.boot
> > insserv: creating .depend.start
> > insserv: creating .depend.stop
>
> No error.

Very good. Many systems upgrading from Squeeze 6 to Wheezy 7 are held
back to the legacy boot ordering due to "obsolete conffiles" in
/etc/init.d that don't contain LSB headers.

> > The 'ntpdate' package is obsolete. You should remove it and let ntpd
> > do the task itself. I think that is very likely the problem. I think
> > you have an old ntpdate package installed when it should have been
> > removed. I think this old package is one (of possibly several) that
> > is causing the boot sequence to be legacy mode instead of the current
> > parallel mode controlled by insserv. I think because of those things
> > the boot order is unhappy on your system.
>
> I may remove ntpdate, but other services are touched by the
> problem. The most annoying are the dibbler dhcpv6 client (which does not
> get any answer) and the NSD name server (which binds to the IPv4 address
> but not to the IPv6 one, which is at that time not yet configured).

In that case we will need to keep peeling back layers until the root
cause is found. How are you starting the network? Is this a section
listed in /etc/network/interfaces? Please show us the section. Or is
this using NetworkManager / wicd? (If NetworkManager/wicd then there
will be no section in /etc/network/interfaces for your network
device. No config there means that NM/wicd manages it.)

To get ahead in the discussion I will suggest this excellent reference
for understanding the /etc/network/interfaces file.

  http://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_basic_syntax_of_etc_network_interfaces

Just for the purposes of debugging, if you are using "allow-hotplug"
then try switching that to "auto". In theory allow-hotplug should
always work but since it is the newer event driven method sometimes
there are still bugs to be found. It is possible that your case is
one of those. Try "auto" instead and see if that older start ordering
causes things to work in the correct way.

The errors you showed in the log file were from dns name resolution
failures. How are nameservers configured for your machine? Are you
using DHCP to set them? Or are they statically defined? Are you
running a local machine nameserver daemon such as bind9 or dnsmasq or
other? What is in the /etc/resolv.conf file?

Bob
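
To see which of the two startup paths an interface is on, something like the following sketch could be used. The helper name iface_start_mode is invented for this example; it only looks at "auto" and "allow-hotplug" lines in the file it is given and does not follow any "source" directives, so treat an empty result as "not declared here" rather than as proof that NetworkManager or wicd manages the device.

```shell
#!/bin/sh
# iface_start_mode FILE IFACE
# Print "auto" or "allow-hotplug" for each such line in FILE that names IFACE.
# Prints nothing when the interface is not listed on either kind of line.
# Hypothetical helper for illustration only.
iface_start_mode() {
    awk -v ifname="$2" '
        $1 == "auto" || $1 == "allow-hotplug" {
            for (i = 2; i <= NF; i++)
                if ($i == ifname) print $1
        }
    ' "$1"
}

# Possible use:
#   iface_start_mode /etc/network/interfaces eth0
```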
Re: Network services started before NIC UP.
On Wed, Dec 11, 2013 at 12:12:14AM CET, Bob Proulx said:
> Erwan David wrote:
> > I have a problem that services are started on a server I manage before
> > link is UP. This leads to some services failing, or not bound to every
> > address:
>
> Please say what release track you are using? Unstable, Testing, Stable?

I use testing.

> Please say whether you are using parallel boot or not.

I use the standard testing sysVrc with "make style parallel boot".

> Running
> 'insserv' manually and noting any errors and reporting them would be
> very useful.
>
> # insserv -v
> insserv: creating .depend.boot
> insserv: creating .depend.start
> insserv: creating .depend.stop

No error.

> > You can see that ntpdate is started before network is ready. What is
> > more annoying is that I get an IPv6 prefix through dhcpv6 prefix
> > delegation, and even though the logs do not appear here, it seems the
> > dhcpv6 request is sent (by dibbler-client) before NIC is UP, which leads
> > to my prefix not being routed to my server.
> >
> > Against which package should I report a bug, and is there a workaround
> > for next reboot ?
>
> The 'ntpdate' package is obsolete. You should remove it and let ntpd
> do the task itself. I think that is very likely the problem. I think
> you have an old ntpdate package installed when it should have been
> removed. I think this old package is one (of possibly several) that
> is causing the boot sequence to be legacy mode instead of the current
> parallel mode controlled by insserv. I think because of those things
> the boot order is unhappy on your system.

I may remove ntpdate, but other services are touched by the problem.
The most annoying are the dibbler dhcpv6 client (which does not get
any answer) and the NSD name server (which binds to the IPv4 address
but not to the IPv6 one, which is at that time not yet configured).

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20131211075544.ga4...@rail.eu.org
Re: Network services started before NIC UP.
Erwan David wrote:
> I have a problem that services are started on a server I manage before
> link is UP. This leads to some services failing, or not bound to every
> address:

Please say what release track you are using? Unstable, Testing, Stable?

Please say whether you are using parallel boot or not. Running
'insserv' manually and noting any errors and reporting them would be
very useful.

  # insserv -v
  insserv: creating .depend.boot
  insserv: creating .depend.start
  insserv: creating .depend.stop

> You can see that ntpdate is started before network is ready. What is
> more annoying is that I get an IPv6 prefix through dhcpv6 prefix
> delegation, and even though the logs do not appear here, it seems the
> dhcpv6 request is sent (by dibbler-client) before NIC is UP, which leads
> to my prefix not being routed to my server.
>
> Against which package should I report a bug, and is there a workaround
> for next reboot ?

The 'ntpdate' package is obsolete. You should remove it and let ntpd
do the task itself. I think that is very likely the problem. I think
you have an old ntpdate package installed when it should have been
removed. I think this old package is one (of possibly several) that
is causing the boot sequence to be legacy mode instead of the current
parallel mode controlled by insserv. I think because of those things
the boot order is unhappy on your system.

If the above guesses are correct then you will need to remove ntpdate.
Then clean up other leftover files in /etc/init.d from packages that
have been removed but not purged, leaving their config files behind.

  dpkg -l | grep ^rc

Then, once 'insserv' runs clean with no errors, things should be
working correctly.

Bob
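
The cleanup suggested above can be sketched as two small shell helpers. Both function names are invented for this example. The first filters `dpkg -l` style output for packages in the "rc" state (removed, config files remaining); the second lists files in an init directory that have no "### BEGIN INIT INFO" line, which is the marker that opens an LSB header block. Files such as a README in /etc/init.d would also be reported, so the output is a list of candidates to look at, not a list of things to delete.

```shell
#!/bin/sh
# list_rc_packages
# Read `dpkg -l` style output on stdin and print the names of packages
# whose state column is "rc" (removed but not purged).
# Hypothetical helper for illustration only.
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# scripts_without_lsb [DIR]
# Print regular files in DIR (default /etc/init.d) that contain no
# "### BEGIN INIT INFO" line. Hypothetical helper for illustration only.
scripts_without_lsb() {
    dir="${1:-/etc/init.d}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        grep -q '^### BEGIN INIT INFO' "$f" || echo "$f"
    done
}

# Possible use:
#   dpkg -l | list_rc_packages
#   scripts_without_lsb /etc/init.d
```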
Network services started before NIC UP.
I have a problem: services are started on a server I manage before the
link is UP. This leads to some services failing, or not being bound to
every address.

Here is an excerpt from the logs:

Dec 10 18:21:24 tee acpid: starting up with netlink and the input layer
Dec 10 18:21:24 tee acpid: 1 rule loaded
Dec 10 18:21:24 tee acpid: waiting for events: event logging is off
Dec 10 18:21:24 tee ntpd[2361]: ntpd 4.2.6p5@1.2349-o Mon May 20 14:24:35 UTC 2013 (1)
Dec 10 18:21:24 tee ntpd[2479]: proto: precision = 0.134 usec
Dec 10 18:21:24 tee ntpd[2479]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen and drop on 1 v6wildcard :: UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 2 lo 127.0.0.1 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 3 eth0 88.190.17.120 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 4 eth0:0 88.191.245.121 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 5 lo ::1 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: peers refreshed
Dec 10 18:21:24 tee ntpd[2479]: Listening on routing socket on fd #22 for interface updates
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for canon.inria.fr 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for ntp.obspm.fr 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for ntp.sceen.net 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for clock.tix.ch 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for ntp.proserve.nl 1
Dec 10 18:21:24 tee ntpd[2519]: signal_no_reset: signal 17 had flags 400
Dec 10 18:21:25 tee nsd[2521]: can't bind udp socket: Cannot assign requested address
Dec 10 18:21:25 tee nsd[2521]: server initialization failed, nsd could not be started
Dec 10 18:21:26 tee kernel: [ 18.584193] bnx2 :02:00.0 eth0: NIC Copper Link is Up, 1000 Mbps full duplex
Dec 10 18:21:26 tee kernel: [ 18.584196] , receive & transmit flow control ON
Dec 10 18:21:26 tee kernel: [ 18.584281] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: canon.inria.fr
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: ntp.obspm.fr
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: ntp.sceen.net
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: clock.tix.ch
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: ntp.proserve.nl
Dec 10 18:21:27 tee kernel: [ 19.684249] Netfilter messages via NETLINK v0.30.
Dec 10 18:21:27 tee kernel: [ 19.686807] ip_set: protocol 6
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host canon.inria.fr: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host ntp.obspm.fr: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host ntp.sceen.net: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host clock.tix.ch: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host ntp.proserve.nl: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: no servers can be used, exiting
Dec 10 18:21:30 tee ntpd[2479]: Listen normally on 6 eth0 2001:bc8:30d3::1 UDP 123
Dec 10 18:21:30 tee ntpd[2479]: Listen normally on 7 eth0 fe80::be30:5bff:fecf:8bcb UDP 123
Dec 10 18:21:30 tee ntpd[2479]: peers refreshed

You can see that ntpdate is started before the network is ready. What
is more annoying is that I get an IPv6 prefix through dhcpv6 prefix
delegation, and even though the logs do not appear here, it seems the
dhcpv6 request is sent (by dibbler-client) before the NIC is UP, which
leads to my prefix not being routed to my server.

Against which package should I report a bug, and is there a workaround
for the next reboot?

Thank you.

Archive: http://lists.debian.org/52a7852b.1010...@rail.eu.org
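
One way to pull the relevant part out of a log like the one above is to print every non-kernel message that appears before the link-up line. The sketch below assumes the traditional syslog layout seen in the excerpt (month, day, time, host, then the program tag as the fifth field) and uses the bnx2 driver's "NIC Copper Link is Up" text as the default marker; other drivers word that message differently, so the pattern is a parameter. The function name is invented for this example. Note also that line order in a syslog file only approximates the real order of events.

```shell
#!/bin/sh
# lines_before_link_up [PATTERN]
# Read syslog lines on stdin and print the non-kernel lines that occur
# before the first line matching PATTERN (an awk regular expression).
# The default pattern is the link-up text from the log excerpt above.
# Hypothetical helper for illustration only.
lines_before_link_up() {
    pattern="${1:-NIC Copper Link is Up}"
    awk -v pat="$pattern" '
        $0 ~ pat         { exit }    # stop at the link-up message
        $5 !~ /^kernel/  { print }   # field 5 is the program tag
    '
}

# Possible use:
#   lines_before_link_up < /var/log/syslog
```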