Re: Network services started before NIC UP.

2013-12-21 Thread Bob Proulx
Erwan David wrote:
> Everything is in /etc/network/interfaces.
> 
> It is a bit complicated, so let me explain the situation before going to
> the configuration:

Actually your situation sounds pretty normal to me.

> # The primary network interface
> auto eth0
> iface eth0 inet static
> address 88.190.17.120
> netmask 255.255.255.0
> gateway 88.190.17.1
> up ip addr add 88.191.245.121/32 dev eth0 label eth0:0
> up ip -6 addr add 2001:0bc8:30d3::1/64 dev eth0
> down ip addr del 88.191.245.121/32 dev eth0 label eth0:0
> down ip -6 addr del 2001:0bc8:30d3::1/64 dev eth0

I don't see anything unusual there.  However I am not an IPv6 expert
and still need to learn the details of it.  The IPv4 parts look
perfectly reasonable.  I have no reason to doubt the IPv6 parts.

> 88.190.17.120 is the "private" address (if I change server I will get
> another address). 88.191.245.121 and 2001:0bc8:30d3::1 are the "public
> addresses", because I may migrate them to another machine at the same
> hoster, making them more robust for public facing services (web, email
> and an NTP server in pool.ntp.org for this one).

Yes.  A common strategy.  Looks good.

> The router for IPv6 is given through the RA (I have the correct sysctl
> set up for accepting the RA *and* routing IPv6)

I will assume it is good.

The important thing is that it will start up using ifupdown.  It is
set to use "auto" meaning that it will start synchronously at system
boot time.  If it were using "allow-hotplug" then it would use the
current standard event driven interface.  The two startup paths should
both work but they are different.  It is certainly possible for them
to behave differently with one path working and one not working.  I
have problems with NIS/yp with the allow-hotplug event driven path but
it works with the auto path for example.  (I need to debug that to
root cause some day.)
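
For the archive, here is a rough sketch of a helper that reports which
start method each interface is set to use.  The function name
list_start_methods is my own invention.  It only looks at "auto" and
"allow-hotplug" lines in the file it is given (defaulting to
/etc/network/interfaces) and does not follow "source" directives.

```shell
# Print "<interface>: <start method>" for every interface named on an
# "auto" or "allow-hotplug" line of an interfaces(5) file.
list_start_methods() {
    awk '$1 == "auto" || $1 == "allow-hotplug" {
        # One such line may name several interfaces.
        for (i = 2; i <= NF; i++) print $i ": " $1
    }' "${1:-/etc/network/interfaces}"
}
```

Run it as "list_start_methods" with no argument to check the live
file.  An interface listed as "auto" goes through the synchronous boot
path described above.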

> > Just for the purposes of debugging if you are using "allow-hotplug"
> > then try switching that to "auto".  In theory allow-hotplug should
> > always work but since it is the newer event driven method sometimes
> > there are still bugs to be found.  It is possible that your case is
> > one of those.  Try "auto" instead and see if that older start ordering
> > causes things to work in the correct way.
> 
> I always use auto for fixed machines, like this server.

I see by this that you are already aware of the issues and understand
the differences between the two.  I will still say a lot for the archive
because it might help someone else looking at the problem later.

But then my question would be the reverse.  If you were to switch to
allow-hotplug would that cause things to happen differently and
perhaps work?  It would be something to try.  Although I am sure you
don't want to thrash your production server.  Trying these experiments
on a local victim development machine or VM would be good.

Since you are using "auto" then the numbers defined in the LSB headers
in the /etc/init.d/* scripts should drive the placement in the boot
order in the /etc/rc2.d/S* symlinks.  Things should work in that
order.  If things do not work in that order then that is the problem
to find and fix.
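
To see what a given script declares, something like the following
sketch can pull the LSB header out of an init script.  The function
names lsb_header and required_start are hypothetical helpers of mine,
not part of any package.  A service that needs the network would
normally be expected to list the $network facility on its
Required-Start line.

```shell
# Print the LSB header block of an init script.
lsb_header() {
    sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$1"
}

# Print only the Required-Start line of an init script.
required_start() {
    lsb_header "$1" | grep '^# Required-Start:'
}
```

For example "required_start /etc/init.d/nsd" would show what nsd asks
to be started before it, assuming the script is installed under that
name.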

Also when the interface starts up it will execute the scripts
registered in /etc/network/if-*.d/* and those will happen at the time
when the interface status changes.  But I doubt that is the problem
here since by definition if-up.d/foo would happen after the interface
is up and your problem is something happening before then.
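
Those hooks can still be useful for debugging the timing.  Here is a
minimal sketch of a hook that logs when ifup considers the interface
configured and whether the link had carrier at that moment.  The file
name and the "ifup-debug" tag are assumptions of mine.  IFACE and MODE
are exported to hook scripts by ifupdown, and the carrier file is the
kernel's sysfs view of the link (it reads 1 when the link is up).

```shell
#!/bin/sh
# Hypothetical debugging hook, for example installed as
# /etc/network/if-up.d/zz-log-state (the name is an assumption).

# Build one log line describing the event and the carrier state.
ifup_message() {
    # Reading the file fails if the interface is missing or still down.
    carrier=$(cat "/sys/class/net/${IFACE:-none}/carrier" 2>/dev/null) || carrier=unknown
    echo "iface=${IFACE:-none} mode=${MODE:-none} carrier=${carrier}"
}

# Log to syslog.  A logging failure must never make ifup fail.
logger -t ifup-debug "$(ifup_message)" 2>/dev/null || true
```

If the syslog then shows carrier=0 for eth0 followed later by the
kernel's "Link is Up" message, that would suggest ifup returned before
the link was usable.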

> resolv.conf is 
> 
> search rail.eu.org
> nameserver 127.0.0.1

Just to verify, no "resolvconf" installed?

> unbound listen on loopback when it is started:
> 
> unbound 3048 unbound 3u  IPv4  11035  0t0  UDP 127.0.0.1:domain
> unbound 3048 unbound 4u  IPv4  11036  0t0  TCP 127.0.0.1:domain (LISTEN)
> unbound 3048 unbound 5u  IPv6  11037  0t0  UDP [::1]:domain
> unbound 3048 unbound 6u  IPv6  11038  0t0  TCP [::1]:domain (LISTEN)

I think I will guess that the problem is that "auto" is the old path
through the system boot.  Something in your use of 'unbound' isn't set
up for that path.  Dig into how unbound starts.

  $ ls -1 /etc/rcS.d/S*
  $ ls -1 /etc/rc2.d/S*

Look over that list and verify that it should be starting networking
in /etc/rcS.d/S*networking and that unbound starts up when it is
supposed to start up.  For example for me:

  /etc/rcS.d/S15networking
  /etc/rc2.d/S03bind9

Everyone's numbers will be different of course since those are
determined by the installed set of LSB headers from the /etc/init.d/*
files.  The exact numbers do not matter, only the resulting order.
They are set dynamically by 'insserv'.
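
To pick out just the services of interest from those listings, a small
sketch like this may help.  The function name start_links is
hypothetical.  It assumes the usual two-digit Debian sysv-rc naming
(SNNname) and only looks at rcS.d and rc2.d.

```shell
# Print the start symlinks for the named services below a given root
# directory (normally /etc).
start_links() {
    root=$1; shift
    for svc in "$@"; do
        for link in "$root"/rcS.d/S[0-9][0-9]"$svc" "$root"/rc2.d/S[0-9][0-9]"$svc"; do
            # An unmatched glob stays literal, so test that the link exists.
            if [ -e "$link" ] || [ -L "$link" ]; then
                echo "$link"
            fi
        done
    done
}
```

For this thread that would be "start_links /etc networking unbound nsd",
assuming the init scripts are installed under those names.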

> > The errors you showed in the log file were from dns name resolution
> > failures.  How are nameservers configured for your machine?  Are you
> > using DHCP to set them?  Or are they statically defined?  Are you
> > running a local machine nameserver daemon such as bind9 or dnsmasq or
> > other?  What is in the /etc/resolv.conf file?
> 

Re: Network services started before NIC UP.

2013-12-11 Thread Erwan David
On Wed, Dec 11, 2013 at 07:02:14PM CET, Bob Proulx  said:

> 
> In that case we will need to keep peeling back layers until the root
> cause is found.  How are you starting the network?  Is this a section
> listed in /etc/network/interfaces?  Please show us the section.  Or is
> this using NetworkManager / wicd?  (If NetworkManager/wicd then there
> will be no section in /etc/network/interfaces for your network
> device.  No config there means that NM/wicd manages it.)

Everything is in /etc/network/interfaces.

It is a bit complicated, so let me explain the situation before going to
the configuration:

The machine has 2 IPv4 addresses and 1 IPv6 address on the same physical
interface.  The IPv6 address is fixed, but I must get a prefix delegation
through a DHCPv6 client for the prefix I get from the hoster to be routed
to the machine (I use dibbler per the hoster's documentation, but I could
use another one).

So here is /etc/network/interfaces

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
address 88.190.17.120
netmask 255.255.255.0
gateway 88.190.17.1
up ip addr add 88.191.245.121/32 dev eth0 label eth0:0
up ip -6 addr add 2001:0bc8:30d3::1/64 dev eth0
down ip addr del 88.191.245.121/32 dev eth0 label eth0:0
down ip -6 addr del 2001:0bc8:30d3::1/64 dev eth0

88.190.17.120 is the "private" address (if I change server I will get
another address). 88.191.245.121 and 2001:0bc8:30d3::1 are the "public
addresses", because I may migrate them to another machine at the same
hoster, making them more robust for public facing services (web, email
and an NTP server in pool.ntp.org for this one).

The router for IPv6 is given through the RA (I have the correct sysctl
set up for accepting the RA *and* routing IPv6)

resolv.conf is 

search rail.eu.org
nameserver 127.0.0.1


unbound listen on loopback when it is started:

unbound 3048 unbound 3u  IPv4  11035  0t0  UDP 127.0.0.1:domain
unbound 3048 unbound 4u  IPv4  11036  0t0  TCP 127.0.0.1:domain (LISTEN)
unbound 3048 unbound 5u  IPv6  11037  0t0  UDP [::1]:domain
unbound 3048 unbound 6u  IPv6  11038  0t0  TCP [::1]:domain (LISTEN)



> To get ahead in the discussion I will suggest this excellent reference
> for understanding the /etc/network/interfaces file.
> 
>   
> http://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_basic_syntax_of_etc_network_interfaces

I know it thanks

> 
> Just for the purposes of debugging if you are using "allow-hotplug"
> then try switching that to "auto".  In theory allow-hotplug should
> always work but since it is the newer event driven method sometimes
> there are still bugs to be found.  It is possible that your case is
> one of those.  Try "auto" instead and see if that older start ordering
> causes things to work in the correct way.

I always use auto for fixed machines, like this server.

> The errors you showed in the log file were from dns name resolution
> failures.  How are nameservers configured for your machine?  Are you
> using DHCP to set them?  Or are they statically defined?  Are you
> running a local machine nameserver daemon such as bind9 or dnsmasq or
> other?  What is in the /etc/resolv.conf file?

I use 2 dns servers, on different IP addresses : NSD on public
addresses, authoritative for the rail.eu.org zone and
2001:0bc8:30d3::/48 reverse zone, unbound on loopback and
88.190.17.120 as recursive server for my small infrastructure

But that is not where the problem lies. I realize that my choice of logs
was rather poor. Here is another excerpt, which I will comment on:

Dec 10 18:21:24 tee kernel: [   14.722600] loop: module loaded
Dec 10 18:21:24 tee kernel: [   15.276191] bnx2 :02:00.0: firmware: agent loaded bnx2/bnx2-mips-09-6.2.1b.fw into memory
Dec 10 18:21:24 tee kernel: [   15.282623] bnx2 :02:00.0: firmware: agent loaded bnx2/bnx2-rv2p-09-6.0.17.fw into memory
Dec 10 18:21:24 tee kernel: [   15.282687] bnx2 :02:00.0: irq 49 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282699] bnx2 :02:00.0: irq 50 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282712] bnx2 :02:00.0: irq 51 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282718] bnx2 :02:00.0: irq 52 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282725] bnx2 :02:00.0: irq 53 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282732] bnx2 :02:00.0: irq 54 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282739] bnx2 :02:00.0: irq 55 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282746] bnx2 :02:00.0: irq 56 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.282752] bnx2 :02:00.0: irq 57 for MSI/MSI-X
Dec 10 18:21:24 tee kernel: [   15.347478] bnx2 :02:00.0 eth0: using MSIX
Dec 10 18:21:24 tee kernel: [   15.347685] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

-> link not re

Re: Network services started before NIC UP.

2013-12-11 Thread Bob Proulx
Erwan David wrote:
> Bob Proulx said:
> > Erwan David wrote:
> > > I have a problem: services are started on a server I manage before
> > > the link is UP. This leads to some services failing, or not binding
> > > to every address:
> > 
> > Please say what release track you are using?  Unstable, Testing, Stable?
> 
> I use testing.

Testing.  Gotcha.

> > Please say whether you are using parallel boot or not. 
> 
> I use the standard testing sysVrc with "make style parallel boot".
> 
> > Running
> > 'insserv' manually and noting any errors and reporting them would be
> > very useful.
> > 
> >   # insserv -v
> >   insserv: creating .depend.boot
> >   insserv: creating .depend.start
> >   insserv: creating .depend.stop
> 
> No error.

Very good.  Many systems upgrading from Squeeze 6 to Wheezy 7 are held
back to the legacy boot ordering due to "obsolete conffiles" in
/etc/init.d that don't contain LSB headers.
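
A quick way to look for such scripts, as a minimal sketch: the
function name missing_lsb_headers is hypothetical, and it simply flags
every regular file in the directory that lacks a "### BEGIN INIT INFO"
line.  Files that are not init scripts at all (a README, for instance)
may show up as false positives.

```shell
# Print every regular file in an init.d directory that has no LSB
# header.  Defaults to /etc/init.d when no directory is given.
missing_lsb_headers() {
    for script in "${1:-/etc/init.d}"/*; do
        [ -f "$script" ] || continue
        grep -q '^### BEGIN INIT INFO' "$script" || echo "$script"
    done
}
```

Anything it prints is worth a look with "dpkg -S" to see which
package, if any, still owns the file.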

> > The 'ntpdate' package is obsolete.  You should remove it and let ntpd
> > do the task itself.  I think that is very likely the problem.  I think
> > you have an old ntpdate package installed when it should have been
> > removed.  I think this old package is one (of possibly several) that
> > is causing the boot sequence to be legacy mode instead of the current
> > parallel mode controlled by insserv.  I think because of those things
> > the boot order is unhappy on your system.
> 
> I may remove ntpdate, but other services are affected by the
> problem. The most annoying are the dibbler DHCPv6 client (which does not
> get any answer) and the NSD name server (which binds to the IPv4 address
> but not to the IPv6 one, which is not yet configured at that time).

In that case we will need to keep peeling back layers until the root
cause is found.  How are you starting the network?  Is this a section
listed in /etc/network/interfaces?  Please show us the section.  Or is
this using NetworkManager / wicd?  (If NetworkManager/wicd then there
will be no section in /etc/network/interfaces for your network
device.  No config there means that NM/wicd manages it.)

To get ahead in the discussion I will suggest this excellent reference
for understanding the /etc/network/interfaces file.

  
http://www.debian.org/doc/manuals/debian-reference/ch05.en.html#_the_basic_syntax_of_etc_network_interfaces

Just for the purposes of debugging if you are using "allow-hotplug"
then try switching that to "auto".  In theory allow-hotplug should
always work but since it is the newer event driven method sometimes
there are still bugs to be found.  It is possible that your case is
one of those.  Try "auto" instead and see if that older start ordering
causes things to work in the correct way.

The errors you showed in the log file were from dns name resolution
failures.  How are nameservers configured for your machine?  Are you
using DHCP to set them?  Or are they statically defined?  Are you
running a local machine nameserver daemon such as bind9 or dnsmasq or
other?  What is in the /etc/resolv.conf file?

Bob




Re: Network services started before NIC UP.

2013-12-10 Thread Erwan David
On Wed, Dec 11, 2013 at 12:12:14AM CET, Bob Proulx  said:
> Erwan David wrote:
> > I have a problem: services are started on a server I manage before
> > the link is UP. This leads to some services failing, or not binding
> > to every address:
> 
> Please say what release track you are using?  Unstable, Testing, Stable?

I use testing.
 
> Please say whether you are using parallel boot or not. 

I use the standard testing sysVrc with "make style parallel boot".

> Running
> 'insserv' manually and noting any errors and reporting them would be
> very useful.
> 
>   # insserv -v
>   insserv: creating .depend.boot
>   insserv: creating .depend.start
>   insserv: creating .depend.stop

No error.
 
> > You can see that ntpdate is started before network is ready. What is
> > more annoying is that I get an IPv6 prefix through dhcpv6 prefix
> > delegation, and even though the logs do not appear here, it seems the
> > dhcpv6 request is sent (by dibbler-client) before NIC is UP, which leads
> > to my prefix not being routed to my server.
> > 
> > Against which package should I report a bug, and is there a workaround
> > for next reboot ?
> 
> The 'ntpdate' package is obsolete.  You should remove it and let ntpd
> do the task itself.  I think that is very likely the problem.  I think
> you have an old ntpdate package installed when it should have been
> removed.  I think this old package is one (of possibly several) that
> is causing the boot sequence to be legacy mode instead of the current
> parallel mode controlled by insserv.  I think because of those things
> the boot order is unhappy on your system.

I may remove ntpdate, but other services are affected by the
problem. The most annoying are the dibbler DHCPv6 client (which does not
get any answer) and the NSD name server (which binds to the IPv4 address
but not to the IPv6 one, which is not yet configured at that time).




-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20131211075544.ga4...@rail.eu.org



Re: Network services started before NIC UP.

2013-12-10 Thread Bob Proulx
Erwan David wrote:
> I have a problem: services are started on a server I manage before
> the link is UP. This leads to some services failing, or not binding
> to every address:

Please say what release track you are using?  Unstable, Testing, Stable?

Please say whether you are using parallel boot or not.  Running
'insserv' manually and noting any errors and reporting them would be
very useful.

  # insserv -v
  insserv: creating .depend.boot
  insserv: creating .depend.start
  insserv: creating .depend.stop

> You can see that ntpdate is started before network is ready. What is
> more annoying is that I get an IPv6 prefix through dhcpv6 prefix
> delegation, and even though the logs do not appear here, it seems the
> dhcpv6 request is sent (by dibbler-client) before NIC is UP, which leads
> to my prefix not being routed to my server.
> 
> Against which package should I report a bug, and is there a workaround
> for next reboot ?

The 'ntpdate' package is obsolete.  You should remove it and let ntpd
do the task itself.  I think that is very likely the problem.  I think
you have an old ntpdate package installed when it should have been
removed.  I think this old package is one (of possibly several) that
is causing the boot sequence to be legacy mode instead of the current
parallel mode controlled by insserv.  I think because of those things
the boot order is unhappy on your system.

If the above guesses are correct then you will need to remove
ntpdate.  Then clean up other leftover files in /etc/init.d from
packages that are possibly removed but not purged leaving the config
files behind.

  dpkg -l | grep ^rc

Then when 'insserv' runs clean with no errors, things should be
working correctly.
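
If the list is long, a small filter can reduce it to bare package
names.  This is only a sketch and rc_packages is a name I made up.  It
reads "dpkg -l" output on standard input and prints the second column
of every line whose status is exactly "rc" (removed, config files
remaining).

```shell
# Read "dpkg -l" output on stdin and print the names of packages that
# are removed but not purged.
rc_packages() {
    awk '$1 == "rc" { print $2 }'
}
```

Use it as "dpkg -l | rc_packages" and review the result.  Once you are
happy with the list it can be fed to "xargs -r dpkg --purge" as root.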

Bob




Network services started before NIC UP.

2013-12-10 Thread Erwan David
I have a problem: services are started on a server I manage before
the link is UP. This leads to some services failing, or not binding to
every address.

Here is an excerpt from the logs:

Dec 10 18:21:24 tee acpid: starting up with netlink and the input layer
Dec 10 18:21:24 tee acpid: 1 rule loaded
Dec 10 18:21:24 tee acpid: waiting for events: event logging is off
Dec 10 18:21:24 tee ntpd[2361]: ntpd 4.2.6p5@1.2349-o Mon May 20 14:24:35 UTC 2013 (1)
Dec 10 18:21:24 tee ntpd[2479]: proto: precision = 0.134 usec
Dec 10 18:21:24 tee ntpd[2479]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen and drop on 1 v6wildcard :: UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 2 lo 127.0.0.1 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 3 eth0 88.190.17.120 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 4 eth0:0 88.191.245.121 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: Listen normally on 5 lo ::1 UDP 123
Dec 10 18:21:24 tee ntpd[2479]: peers refreshed
Dec 10 18:21:24 tee ntpd[2479]: Listening on routing socket on fd #22 for interface updates
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for canon.inria.fr 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for ntp.obspm.fr 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for ntp.sceen.net 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for clock.tix.ch 1
Dec 10 18:21:24 tee ntpd[2479]: Deferring DNS for ntp.proserve.nl 1
Dec 10 18:21:24 tee ntpd[2519]: signal_no_reset: signal 17 had flags 400
Dec 10 18:21:25 tee nsd[2521]: can't bind udp socket: Cannot assign requested address
Dec 10 18:21:25 tee nsd[2521]: server initialization failed, nsd could not be started
Dec 10 18:21:26 tee kernel: [   18.584193] bnx2 :02:00.0 eth0: NIC Copper Link is Up, 1000 Mbps full duplex
Dec 10 18:21:26 tee kernel: [   18.584196] , receive & transmit flow control ON
Dec 10 18:21:26 tee kernel: [   18.584281] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: canon.inria.fr
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: ntp.obspm.fr
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: ntp.sceen.net
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: clock.tix.ch
Dec 10 18:21:26 tee ntpd_intres[2519]: host name not found: ntp.proserve.nl
Dec 10 18:21:27 tee kernel: [   19.684249] Netfilter messages via NETLINK v0.30.
Dec 10 18:21:27 tee kernel: [   19.686807] ip_set: protocol 6
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host canon.inria.fr: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host ntp.obspm.fr: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host ntp.sceen.net: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host clock.tix.ch: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: Can't find host ntp.proserve.nl: Name or service not known (-2)
Dec 10 18:21:28 tee ntpdate[2592]: no servers can be used, exiting
Dec 10 18:21:30 tee ntpd[2479]: Listen normally on 6 eth0 2001:bc8:30d3::1 UDP 123
Dec 10 18:21:30 tee ntpd[2479]: Listen normally on 7 eth0 fe80::be30:5bff:fecf:8bcb UDP 123
Dec 10 18:21:30 tee ntpd[2479]: peers refreshed

You can see that ntpdate is started before the network is ready. What is
more annoying is that I get an IPv6 prefix through DHCPv6 prefix
delegation, and even though the logs do not appear here, it seems the
DHCPv6 request is sent (by dibbler-client) before the NIC is UP, which
leads to my prefix not being routed to my server.

Against which package should I report a bug, and is there a workaround
for the next reboot?

Thank you.


-- 
Archive: http://lists.debian.org/52a7852b.1010...@rail.eu.org