Re: autopkgtest-build-lxd failing with bionic

2018-02-25 Thread Robie Basak
On Sun, Feb 25, 2018 at 11:08:24PM +0100, Martin Pitt wrote:
>   
> https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/commit/?id=563eac74595

> + if ! echo '[ ! -d /run/systemd/system ] ||

Note that, further to a thread here a while ago, this test is broken on
Trusty (false positive). I'm not sure whether it matters for you or not
in this case though.


signature.asc
Description: PGP signature
-- 
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel


Re: autopkgtest-build-lxd failing with bionic

2018-02-25 Thread Martin Pitt
Hello all,

Steve Langasek [2018-02-22 16:36 -0800]:
> > OK, so I suppose we could replace the check with
> 
> >   if running_systemd
> >       wait for network-online.target
> >   else
> >       wait for runlevel 2

I modified Iain's initial patch and tested/applied:

  
https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/commit/?id=563eac74595

@Iain: After "systemctl start" you don't need to wait for the unit to be active
-- start already blocks on that (otherwise, use --no-block).
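
A minimal sketch of that point, assuming the container is reached through
`lxc exec` and has coreutils `timeout` available; the function name
`start_network_online` and the 60-second default are illustrative and are not
taken from the commit above. Because `systemctl start` returns only after the
start job has finished, one bounded call can replace a start-then-poll loop:

```shell
# Hypothetical helper: start network-online.target inside a container and
# rely on "systemctl start" blocking until the start job has finished.
# $1 = container name, $2 = upper bound in seconds (default 60, an assumption).
start_network_online() {
    container=$1
    limit=${2:-60}
    # "timeout" runs inside the container and aborts a start job that hangs.
    if lxc exec "$container" -- \
        timeout "$limit" systemctl start network-online.target; then
        return 0
    fi
    echo "network-online.target did not become active" >&2
    return 1
}
```

It would be called as `start_network_online "$CONTAINER" 60`. With
`--no-block` the start call returns immediately, so a separate `is-active`
poll would be needed again.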

Martin




Re: autopkgtest-build-lxd failing with bionic

2018-02-22 Thread Steve Langasek
On Thu, Feb 22, 2018 at 10:49:29PM +0100, Martin Pitt wrote:

> > >   - it's supposed to be a SysV backwards compat shim for LSB's "network"
> > >     dependency, and not well-defined

> > From my POV, the sane definition is:

> >  - DNS setup is complete
> >  - all "required" network interfaces (implementation-defined) have completed
> >    their configuration
> >  - if no network interfaces are defined to be "required", then at least one
> >    interface is up

> > This is broad enough to encompass everything from VPNs to captive
> > portals to proxy-only networks, and provides a clear separation of
> > responsibilities.

> Since you are much more on top of the current netplan/networkd
> implementation in Ubuntu containers: does that currently match this
> definition?

I'm confident that it matches the implementation with respect to containers.
There are probably corner cases on desktop where the implementation is not
quite correct yet, but given an agreed definition of what the target
*should* do, the rest can be worked through as ordinary bugs.

> > >   - These tools should also work with Debian containers, which in theory
> > >     could also run sysvinit.  This is also the reason why they still use
> > >     `runlevel` instead of `systemctl is-system-running` or something
> > >     similar.

> > Sure, but in principle, once you've reached runlevel 2 under sysvinit you
> > can rely on the network being up because that's part of the definition of
> > the runlevel.  So the systemd code doesn't need to have a sysvinit
> > equivalent.

> OK, so I suppose we could replace the check with

>   if running_systemd
>       wait for network-online.target
>   else
>       wait for runlevel 2

> which would still support non-systemd containers (like Ubuntu 14.04 or custom
> configs in Debian).

Yes, I think so.
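
The pseudocode agreed on above could be sketched in shell roughly as follows.
This is an illustration under assumptions, not the code that was committed:
the function names `running_systemd`, `network_ready` and `wait_for_boot` are
hypothetical, and polling `is-active` only succeeds if something (a unit's
Wants= or an explicit `systemctl start`) has already pulled the target in.

```shell
# Hypothetical helpers; names are illustrative, not taken from autopkgtest.
# $1 is the container name in every function.

# True if the container was booted with systemd as init.
running_systemd() {
    lxc exec "$1" -- test -d /run/systemd/system
}

# True once the container counts as booted with networking up.
network_ready() {
    if running_systemd "$1"; then
        lxc exec "$1" -- systemctl is-active --quiet network-online.target
    else
        # "runlevel" prints "<previous> <current>", e.g. "N 2".
        lxc exec "$1" -- runlevel 2>/dev/null | grep -q ' 2$'
    fi
}

# Poll network_ready once per second for up to $2 seconds (default 60).
wait_for_boot() {
    remaining=${2:-60}
    while [ "$remaining" -gt 0 ]; do
        network_ready "$1" && return 0
        remaining=$((remaining - 1))
        sleep 1
    done
    echo "Timed out waiting for network to come up" >&2
    return 1
}
```

Keeping the init-system test in its own function means the sysvinit branch
never has to call `systemctl` at all.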

Cheers,
-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: autopkgtest-build-lxd failing with bionic

2018-02-22 Thread Dimitri John Ledkov
On 22 February 2018 at 20:15, Steve Langasek wrote:
> On Thu, Feb 15, 2018 at 02:16:01PM -0600, Ryan Harper wrote:
>
>> Or, invoke wait-online directly:
>
>>  /lib/systemd/systemd-networkd-wait-online
>
> That also hard-codes an implementation detail; we may be using something
> other than systemd-networkd in the runners.
>
> If assuming systemd is unpalatable to autopkgtest upstream, assuming
> systemd-networkd is right out :)
>

Yet systemd-networkd-wait-online is somewhat agnostic to the technology
used to configure the interfaces themselves. E.g. it happily blocks and
waits for any interface to magically become configured, since it was
introduced before all the classic networking tools had equivalent
wait-online integration (e.g. it waits for the container hypervisor to
drop in a veth and configure it...).

-- 
Regards,

Dimitri.



Re: autopkgtest-build-lxd failing with bionic

2018-02-22 Thread Steve Langasek
On Thu, Feb 15, 2018 at 02:16:01PM -0600, Ryan Harper wrote:

> Or, invoke wait-online directly:

>  /lib/systemd/systemd-networkd-wait-online

That also hard-codes an implementation detail; we may be using something
other than systemd-networkd in the runners.

If assuming systemd is unpalatable to autopkgtest upstream, assuming
systemd-networkd is right out :)

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: autopkgtest-build-lxd failing with bionic

2018-02-21 Thread Martin Pitt
Antonio Terceiro [2018-02-21 10:39 -0300]:
> > Cheers! I reworked it a bit, applied the same strategy to LXC (which is
> > equally affected), tested it, and landed
> > 
> >
> > https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/commit/?id=20f479254
> 
> Aren't _all_ types of testbed affected by this in some way or another?

schroot isn't as it uses the host network, and for ssh there is no way to know;
usually that's a function of their setup scripts, and the most common case
there is an OpenStack instance where cloud-init already does that waiting.

qemu is also affected in principle; in practice I haven't heard reports or seen
instances of failed network communication there yet.

Martin



Re: autopkgtest-build-lxd failing with bionic

2018-02-21 Thread Ryan Harper
On Thu, Feb 15, 2018 at 2:00 PM, Steve Langasek wrote:

> On Thu, Feb 15, 2018 at 06:48:31PM +, Iain Lane wrote:
> > [ autopkgtest-devel, this is
> >   https://lists.ubuntu.com/archives/ubuntu-devel/2018-February/040138.html
> >   and thread FYI - Reply-To / Mail-Followup-To set to exclude
> >   ubuntu-devel from this subthread so reviews go to the right place ]
>
> > On Thu, Feb 15, 2018 at 10:28:05AM -0500, Stéphane Graber wrote:
> > > […]
> > > And confirmed that networking inside both of them works fine here.
>
> > > I wonder if it's a netplan vs ifupdown thing hitting autopkgtest in this
> > > case?
>
> > I can build images: images(!) quite fine here, but when actually using
> > them I see these temporary resolution failures most of the time during
> > the initial apt-get update.
>
> > I tracked this down to a race condition - basically we try to do the
> > `apt-get update' before networking is fully up. (OK, I just saw Julian's
> > post which came in while I was writing this and says the same thing...)
>
> > There's a patch attached here which fixes the problem for me. I'm not
> > sure if there's a better way to do this - basically it starts
> > network-online.target and waits for it to become active, with a timeout.
> > Review appreciated.
>
> It's a bit odd to be "start"ing a target in this manner.  Is it even
> necessary to start the target, or would it be sufficient to just check
> is-active in a loop?
>
> In that case, I would suggest:
>
> timeout=60
> while ! lxc exec "$CONTAINER" -- systemctl is-active network-online.target \
>   && [ $timeout -ge 0 ]
> do
>     timeout=$((timeout - 1))
>     sleep 1
> done
> [ $timeout -ge 0 ] || {
>     echo "Timed out waiting for network to come up" >&2
>     exit 1
> }
>

Or, invoke wait-online directly:

 /lib/systemd/systemd-networkd-wait-online

lxc launch ubuntu-daily:bionic b3 &&
  sleep 0.2 &&
  lxc exec b3 -- /bin/bash -c 'sleep 2; echo "waiting on network";
/lib/systemd/systemd-networkd-wait-online &&  apt update'

systemd-networkd-wait-online isn't happy to start really early: it doesn't
attempt to reconnect to dbus if you run it before dbus is up, which is why
the sleep 2 is in there.

networkd is enabled only on artful and newer, so this won't help w.r.t
xenial and older.
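
Instead of a fixed `sleep 2`, the early-start problem could also be handled by
retrying. A rough sketch, assuming that an invocation made too early exits
non-zero rather than hanging; the function name `wait_online_networkd` and the
default counts are hypothetical. `--timeout=` is the helper's own option for
bounding each attempt:

```shell
# Hypothetical helper: run systemd-networkd-wait-online inside a container,
# retrying because an invocation made very early in boot can fail outright.
# $1 = container, $2 = attempts (default 5), $3 = per-attempt timeout in seconds.
wait_online_networkd() {
    container=$1
    attempts=${2:-5}
    per_try=${3:-30}
    while [ "$attempts" -gt 0 ]; do
        if lxc exec "$container" -- \
            /lib/systemd/systemd-networkd-wait-online --timeout="$per_try"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}
```

For example, `wait_online_networkd b3 && lxc exec b3 -- apt update`. This
still only helps where networkd manages the interfaces.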

Also, networkd comes up just as fast as networking in xenial *if* you
don't have IPv6 and accept-ra enabled. networkd spends approx 10 seconds
waiting for a router advertisement (RA) on my Xenial machine.

If I disable dhcp6 and set accept-ra to false, then we do a v4 DHCP in less
than 2 seconds.

# cat /etc/netplan/50-cloud-init.yaml
# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        eth0:
            dhcp4: true
            dhcp6: false
            accept-ra: off

root@b1:~# systemd-analyze blame
   725ms systemd-networkd-wait-online.service

With RA enabled (the default in networkd)

root@b1:~# systemd-analyze blame
 13.141s systemd-networkd-wait-online.service
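
To compare the two timings without reading the whole listing, the relevant
line can be filtered out of `systemd-analyze blame`. A small sketch; the
function name `unit_startup_cost` is hypothetical, and it assumes the unit
name is the last field on each line, as in the output shown above:

```shell
# Hypothetical filter: read "systemd-analyze blame" output on stdin and print
# the time column for the unit named in $1.
unit_startup_cost() {
    awk -v unit="$1" '$NF == unit { $NF = ""; sub(/ +$/, ""); print }'
}
```

Usage would be
`systemd-analyze blame | unit_startup_cost systemd-networkd-wait-online.service`.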





Re: autopkgtest-build-lxd failing with bionic

2018-02-20 Thread Scott Kitterman
On Tuesday, February 20, 2018 10:44:42 PM Martin Pitt wrote:
> Steve Langasek [2018-02-16 11:12 -0800]:
...
> > I think the network-online.target is the better thing to key on.
> 
> I still don't like that much, though:
>   - there is no requirement that this actually gets "implemented" or even
>     started (it's a passive target)
> 
>   - it's supposed to be a SysV backwards compat shim for LSB's "network"
>     dependency, and not well-defined
> 
>   - These tools should also work with Debian containers, which in theory
>     could also run sysvinit. This is also the reason why they still use
>     `runlevel` instead of `systemctl is-system-running` or something similar.
> 
> All of these are just heuristics, though; you could have all sorts of cases
> where all of these break, like sharing the host's network namespace, having
> no default route but a route to the configured apt proxy, etc. Maybe the
> closest approximation to this would be to grab the archive URL from
> /etc/apt/sources.list and put it in a curl loop, but (1) neither wget nor
> curl are in minimal installs, and (2) at that point it could just as well
> be an apt-get retry loop.

So what's the right systemd way to ensure the network is up?  I continue to 
fight bugs in the postfix unit file both in Debian and Ubuntu over things 
happening before the network is up.  As far as I can determine from the 
documentation, network-online.target should work, but I agree it doesn't do so 
reliably.

Currently postfix@.service has:

After=network-online.target nss-lookup.target
Wants=network-online.target

If inet_interfaces has been set to a specific IP address (which is a 
legitimate use), then if postfix tries to start before that IP address is 
available errors ensue.
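
One way to make the "specific IP address" case explicit is to poll for the
address itself rather than for a target. A sketch only, assuming iproute2's
`ip` is available; the function names and the 30-second default are
hypothetical, and nothing here is taken from the postfix packaging:

```shell
# True if address $1 (IPv4 or IPv6, without prefix length) is assigned locally.
# In "ip -o addr show" output the fourth field is "<address>/<prefixlen>".
have_address() {
    ip -o addr show |
        awk -v want="$1" '{ split($4, a, "/"); if (a[1] == want) found = 1 }
                          END { exit !found }'
}

# Poll once per second for up to $2 seconds (default 30) until $1 is assigned.
wait_for_address() {
    addr_left=${2:-30}
    while [ "$addr_left" -gt 0 ]; do
        have_address "$1" && return 0
        addr_left=$((addr_left - 1))
        sleep 1
    done
    return 1
}
```

For example, `wait_for_address 192.0.2.10 30` (a documentation address used
purely for illustration) before starting a service bound to that address.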

Scott K



Re: autopkgtest-build-lxd failing with bionic

2018-02-20 Thread Martin Pitt
Steve Langasek [2018-02-16 11:12 -0800]:
> > >   [ -n "$(ip route show to 0/0)" ]
> 
> > This is better though, and works too. Please take a look at the attached
> > patch. Thanks! :-)
> 
> Actually no, this is racy, because the route comes up before DNS resolution
> is in place.

I'm not sure whether network-online.target would actually guard against
that with all implementations. But in practice, in most cases you'll get DNS
either via static configuration (in which case there's nothing further to wait
for) or via DHCP (in which case your address and DNS resolvers ought to arrive at
the same time). And there's still the "apt retries several times" fallback
(which is why I do see the initial apt failure, but the retry works).

> It's also not forwards-compatible with ipv6-only deploys.

Right now the container network config created by lxc/lxd/netplan assumes IPv4
only, so let's cross that bridge when we get to it. Indeed, adding an
alternative `ip -6 route show ...` check would easily rectify that.

> I think the network-online.target is the better thing to key on.

I still don't like that much, though:
  - there is no requirement that this actually gets "implemented" or even
    started (it's a passive target)

  - it's supposed to be a SysV backwards compat shim for LSB's "network"
    dependency, and not well-defined

  - These tools should also work with Debian containers, which in theory could
    also run sysvinit. This is also the reason why they still use `runlevel`
    instead of `systemctl is-system-running` or something similar.

All of these are just heuristics, though; you could have all sorts of cases
where all of these break, like sharing the host's network namespace, having no
default route but a route to the configured apt proxy, etc. Maybe the closest
approximation to this would be to grab the archive URL from
/etc/apt/sources.list and put it in a curl loop, but (1) neither wget nor curl
are in minimal installs, and (2) at that point it could just as well be an
apt-get retry loop.

So in summary, IMHO the "wait for default route" heuristic is simple and
effective enough for now.
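
For illustration, the default-route heuristic with an IPv6 alternative might
look like the sketch below when run inside the testbed. The function names and
the 60-second default are hypothetical and this is not the code from the
commit; as noted in the thread, a route can exist before DNS works, so this
remains a heuristic.

```shell
# True if a default route exists for IPv4 or for IPv6.
have_default_route() {
    [ -n "$(ip route show to 0/0)" ] || [ -n "$(ip -6 route show to ::/0)" ]
}

# Poll once per second for up to $1 seconds (default 60).
wait_for_default_route() {
    route_left=${1:-60}
    while [ "$route_left" -gt 0 ]; do
        have_default_route && return 0
        route_left=$((route_left - 1))
        sleep 1
    done
    echo "Timed out waiting for a default route" >&2
    return 1
}
```

From the host this would be wrapped in something like
`lxc exec "$CONTAINER" -- sh -c '...'`.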

Martin




Re: autopkgtest-build-lxd failing with bionic

2018-02-20 Thread Martin Pitt
Hello all,

Iain Lane [2018-02-16 11:52 +]:
> > I wouldn't pick on any of these: network-online.target is a sloppily defined
> > shim for SysV init backwards compatibility, and may not ever get started (in
> > fact, that's the goal ☺); and the container might not use networkd, so I
> > wouldn't use s-n-wait-online either. I think querying
> 
> Interesting. I thought that it was the systemd way to say 'I am online
> now' --- i.e. nm-online or systemd-networkd-wait-online, which is the
> question I wanted to get a positive answer to. I can see that the SysV
> implementation isn't great, but it's not clear to me that it was ill
> defined for this case.

"ill defined" is too strong, but it's "sloppy", just like the very question of
what "the network is up" means in a world of dynamic interfaces, proxies, VPNs,
dynamic resolvers, etc.

> >   [ -n "$(ip route show to 0/0)" ]
> 
> This is better though, and works too. Please take a look at the attached
> patch. Thanks! :-)

Cheers! I reworked it a bit, applied the same strategy to LXC (which is
equally affected), tested it, and landed

   
https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/commit/?id=20f479254

I'm going to overhaul setup-testbed too, as it still creates an ifupdown config
for modern (netplan) Ubuntu containers - I want to teach it to stop that.

Martin





Re: autopkgtest-build-lxd failing with bionic

2018-02-16 Thread Robie Basak
On Fri, Feb 16, 2018 at 08:15:35PM +0100, Julian Andres Klode wrote:
> > I think the network-online.target is the better thing to key on.
> 
> I think we should just grep the apt output and retry if it fails with
> connection error messages.

The problem is a general one though. It's not specific to apt. Any time
we use automation on a container or VM, we need to wait until it's
finished booting.

In uvtool this is what "uvt-kvm wait" provides, which currently waits
for upstart runlevel 2 or systemd runlevel 5 and then asks cloud-init
(since a script might also have asked cloud-init to do things it expects
done when the container is "ready"). Of course that's cloud-init
specific.

The script may need fixing, but Ubuntu should agree upon a general
answer to the common question, even if that answer specifies several
points in time because different points are appropriate for solving
different problems.

So, to me, retrying apt is a hacky workaround.

Robie




Re: autopkgtest-build-lxd failing with bionic

2018-02-16 Thread Julian Andres Klode
On Fri, Feb 16, 2018 at 11:12:32AM -0800, Steve Langasek wrote:
> On Fri, Feb 16, 2018 at 11:52:05AM +, Iain Lane wrote:
> > On Thu, Feb 15, 2018 at 09:55:47PM +0100, Martin Pitt wrote:
> > > Hello Iain, all,
> 
> > > Iain Lane [2018-02-15 18:48 +]:
> > > > There's a patch attached here which fixes the problem for me. I'm not
> > > > sure if there's a better way to do this - basically it starts
> > > > network-online.target and waits for it to become active, with a timeout.
> > > > Review appreciated.
> 
> > > I wouldn't pick on any of these: network-online.target is a sloppily
> > > defined shim for SysV init backwards compatibility, and may not ever get
> > > started (in fact, that's the goal ☺); and the container might not use
> > > networkd, so I wouldn't use s-n-wait-online either. I think querying
> 
> > Interesting. I thought that it was the systemd way to say 'I am online
> > now' --- i.e. nm-online or systemd-networkd-wait-online, which is the
> > question I wanted to get a positive answer to. I can see that the SysV
> > implementation isn't great, but it's not clear to me that it was ill
> > defined for this case.
> 
> > >   [ -n "$(ip route show to 0/0)" ]
> 
> > This is better though, and works too. Please take a look at the attached
> > patch. Thanks! :-)
> 
> Actually no, this is racy, because the route comes up before DNS resolution
> is in place.
> 
> It's also not forwards-compatible with ipv6-only deploys.
> 
> I think the network-online.target is the better thing to key on.

I think we should just grep the apt output and retry if it fails with
connection error messages. This should be fine until I have an improved
solution in apt itself, one of

(1) "there are no transient errors"
(2) one source must have updated
(3) all sources must have updated

Not sure on details. Could be an option for all three.
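
As a stopgap, the "grep the output and retry" idea could look roughly like
this. It is a sketch under assumptions: the function name, the
`APT_UPDATE_CMD` override and the retry count are hypothetical, and the
matched strings are typical apt messages for transient failures rather than an
exhaustive or guaranteed list.

```shell
# Hypothetical wrapper: rerun "apt-get update" while its output shows
# transient network errors, even though the exit status may be 0.
# $1 = maximum attempts (default 5). APT_UPDATE_CMD may override the command.
apt_update_with_retry() {
    tries=${1:-5}
    update_cmd=${APT_UPDATE_CMD:-apt-get update}
    pattern='Temporary failure resolving|Failed to fetch|Some index files failed to download'
    while [ "$tries" -gt 0 ]; do
        # Treat the run as good only if it exits 0 and prints none of the patterns.
        if output=$($update_cmd 2>&1) &&
           ! printf '%s\n' "$output" | grep -Eq "$pattern"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 2
    done
    printf '%s\n' "$output" >&2
    return 1
}
```

Matching on message text is fragile (it depends on locale and apt version),
which is why a proper error mode in apt itself would be preferable.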

-- 
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer  i speak de, en



Re: autopkgtest-build-lxd failing with bionic

2018-02-16 Thread Steve Langasek
On Fri, Feb 16, 2018 at 11:52:05AM +, Iain Lane wrote:
> On Thu, Feb 15, 2018 at 09:55:47PM +0100, Martin Pitt wrote:
> > Hello Iain, all,

> > Iain Lane [2018-02-15 18:48 +]:
> > > There's a patch attached here which fixes the problem for me. I'm not
> > > sure if there's a better way to do this - basically it starts
> > > network-online.target and waits for it to become active, with a timeout.
> > > Review appreciated.

> > I wouldn't pick on any of these: network-online.target is a sloppily defined
> > shim for SysV init backwards compatibility, and may not ever get started (in
> > fact, that's the goal ☺); and the container might not use networkd, so I
> > wouldn't use s-n-wait-online either. I think querying

> Interesting. I thought that it was the systemd way to say 'I am online
> now' --- i.e. nm-online or systemd-networkd-wait-online, which is the
> question I wanted to get a positive answer to. I can see that the SysV
> implementation isn't great, but it's not clear to me that it was ill
> defined for this case.

> >   [ -n "$(ip route show to 0/0)" ]

> This is better though, and works too. Please take a look at the attached
> patch. Thanks! :-)

Actually no, this is racy, because the route comes up before DNS resolution
is in place.

It's also not forwards-compatible with ipv6-only deploys.

I think the network-online.target is the better thing to key on.

-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: autopkgtest-build-lxd failing with bionic

2018-02-16 Thread Iain Lane
On Thu, Feb 15, 2018 at 12:00:41PM -0800, Steve Langasek wrote:
> It's a bit odd to be "start"ing a target in this manner.  Is it even
> necessary to start the target, or would it be sufficient to just check
> is-active in a loop?

Yeah, it is necessary: the target needs to be pulled in by something to get
started, and in this case nothing does, so we do the same thing in code. It
works like this so you don't end up blocking the boot unnecessarily, waiting
for the network to be "up" when nothing needs it to be.

Doesn't matter any more though for this case. :-)

-- 
Iain Lane  [ i...@orangesquash.org.uk ]
Debian Developer   [ la...@debian.org ]
Ubuntu Developer   [ la...@ubuntu.com ]




Re: autopkgtest-build-lxd failing with bionic

2018-02-15 Thread Steve Langasek
On Thu, Feb 15, 2018 at 06:48:31PM +, Iain Lane wrote:
> [ autopkgtest-devel, this is
>   https://lists.ubuntu.com/archives/ubuntu-devel/2018-February/040138.html
>   and thread FYI - Reply-To / Mail-Followup-To set to exclude
>   ubuntu-devel from this subthread so reviews go to the right place ]

> On Thu, Feb 15, 2018 at 10:28:05AM -0500, Stéphane Graber wrote:
> > […]
> > And confirmed that networking inside both of them works fine here.

> > I wonder if it's a netplan vs ifupdown thing hitting autopkgtest in this 
> > case?

> I can build images: images(!) quite fine here, but when actually using
> them I see these temporary resolution failures most of the time during
> the initial apt-get update.

> I tracked this down to a race condition - basically we try to do the
> `apt-get update' before networking is fully up. (OK, I just saw Julian's
> post which came in while I was writing this and says the same thing...)

> There's a patch attached here which fixes the problem for me. I'm not
> sure if there's a better way to do this - basically it starts
> network-online.target and waits for it to become active, with a timeout.
> Review appreciated.

It's a bit odd to be "start"ing a target in this manner.  Is it even
necessary to start the target, or would it be sufficient to just check
is-active in a loop?

In that case, I would suggest:

timeout=60
while ! lxc exec "$CONTAINER" -- systemctl is-active network-online.target \
  && [ $timeout -ge 0 ]
do
    timeout=$((timeout - 1))
    sleep 1
done
[ $timeout -ge 0 ] || {
    echo "Timed out waiting for network to come up" >&2
    exit 1
}


-- 
Steve Langasek   Give me a lever long enough and a Free OS
Debian Developer   to set it on, and I can move the world.
Ubuntu Developer   http://www.debian.org/
slanga...@ubuntu.com vor...@debian.org




Re: autopkgtest-build-lxd failing with bionic

2018-02-15 Thread Iain Lane
[ autopkgtest-devel, this is
  https://lists.ubuntu.com/archives/ubuntu-devel/2018-February/040138.html
  and thread FYI - Reply-To / Mail-Followup-To set to exclude
  ubuntu-devel from this subthread so reviews go to the right place ]

On Thu, Feb 15, 2018 at 10:28:05AM -0500, Stéphane Graber wrote:
> […]
> And confirmed that networking inside both of them works fine here.
> 
> I wonder if it's a netplan vs ifupdown thing hitting autopkgtest in this case?

I can build images: images(!) quite fine here, but when actually using
them I see these temporary resolution failures most of the time during
the initial apt-get update.

I tracked this down to a race condition - basically we try to do the
`apt-get update' before networking is fully up. (OK, I just saw Julian's
post which came in while I was writing this and says the same thing...)

There's a patch attached here which fixes the problem for me. I'm not
sure if there's a better way to do this - basically it starts
network-online.target and waits for it to become active, with a timeout.
Review appreciated.

Cheers,

-- 
Iain Lane  [ i...@orangesquash.org.uk ]
Debian Developer   [ la...@debian.org ]
Ubuntu Developer   [ la...@ubuntu.com ]
From c1924280973123c618fc07762b063abaf64d9d26 Mon Sep 17 00:00:00 2001
From: Iain Lane 
Date: Thu, 15 Feb 2018 16:21:59 +
Subject: [PATCH] lxd: If we're running systemd, wait until the network is up

We execute `apt-get update' more or less as soon as the container is
started. In some situations this is too early: it can be before network
is fully working.

If we have systemd, use network-online.target to wait until it thinks
networking is up.
---
 tools/autopkgtest-build-lxd | 19 ++++++++++++++++++-
 virt/autopkgtest-virt-lxd   |  2 ++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/autopkgtest-build-lxd b/tools/autopkgtest-build-lxd
index 623d5eb..9350a81 100755
--- a/tools/autopkgtest-build-lxd
+++ b/tools/autopkgtest-build-lxd
@@ -68,7 +68,7 @@ setup() {
         lxc exec "$CONTAINER" -- chmod 644 /etc/apt/apt.conf.d/01proxy
     fi
 
-    # wait until it is booted: lxc exec works and we get a numeric runlevel
+    # wait until it is booted: lxc exec works, we get a numeric runlevel and networking is up
     timeout=60
     while [ $timeout -ge 0 ]; do
         timeout=$((timeout - 1))
@@ -81,6 +81,23 @@ setup() {
         exit 1
     }
 
+    # only if we're running systemd
+    if lxc exec "$CONTAINER" -- test -d /run/systemd/system; then
+        lxc exec "$CONTAINER" -- systemctl start network-online.target
+        timeout=60
+        while [ $timeout -ge 0 ]; do
+            timeout=$((timeout - 1))
+            if lxc exec "$CONTAINER" -- systemctl is-active network-online.target; then
+                break
+            fi
+            sleep 1
+        done
+        [ $timeout -ge 0 ] || {
+            echo "Timed out waiting for network to come up" >&2
+            exit 1
+        }
+    fi
+
 ARCH=$(lxc exec "$CONTAINER" -- dpkg --print-architecture /dev/null || . /etc/os-release; echo "${NAME% *}"' /dev/null || awk "/^deb/ {sub(/\\[.*\\]/, \"\", \$0); print \$3; quit}" /etc/apt/sources.list' 



Re: autopkgtest-build-lxd failing with bionic

2018-02-15 Thread Julian Andres Klode
On Thu, Feb 15, 2018 at 04:10:01PM +0100, Martin Pitt wrote:
> Hello Timo,
> 
> Timo Aaltonen [2018-02-15 16:50 +0200]:
> > On 14.02.2018 22:03, Dimitri John Ledkov wrote:
> > > Hi,
> > > 
> > > I am on bionic and managed to build bionic container for testing using:
> > > 
> > > $ autopkgtest-build-lxd ubuntu-daily:bionic/amd64
> > > 
> > > Note this uses Ubuntu Foundations provided container as the base,
> > > rather than the third-party image that you are using from "images"
> > > remote.
> > > 
> > > Why are you using images: remote?
> > 
> > Because that's what the manpage suggests :)
> 
> Right, and quite deliberately. At least back in "my days", the ubuntu: and
> ubuntu-daily: images had a lot of fat in them which made them both
> unnecessarily slow (extra download time, requires more RAM/disk, etc.) and
> also undesirable for test correctness, as having all of the unnecessary bits
> preinstalled easily hides missing dependencies.
> 
> The latter can be alleviated by purging stuff of course, and that's what
> happens for the cloud VM images in OpenStack:
> 
>   
> https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/tree/setup-commands/setup-testbed#n242
> 
> But this takes even more time, and so far just hasn't been necessary as the
> images: ones were just right - they contain exactly what a generic container
> image is supposed to contain and are pleasantly small and fast.
> 
> > > Is the failure reproducible with ubuntu-daily:bionic?
> > > 
> > > If you can build images with ubuntu-daily:bionic, then you need to
> > > contact and file an issue with images: remote provider.
> > 
> > ubuntu-daily: works, images: fails for artful and bionic while xenial
> > works, and the image server is:
> > 
> > https://images.linuxcontainers.org/
> 
> These are being advertised and used a lot, so maybe Stephane's LXD team can
> help with fixing these? Them having no network at all sounds like a grave bug
> which should be fixed either way.

That's not what's going on at all. They do have working networking, but the
network does not come up fast enough. The apt update is not retried because
it exits with 0 because all it sees are transient errors.

-- 
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer  i speak de, en



Re: autopkgtest-build-lxd failing with bionic

2018-02-15 Thread Stéphane Graber
On Thu, Feb 15, 2018 at 04:10:01PM +0100, Martin Pitt wrote:
> Hello Timo,
> 
> Timo Aaltonen [2018-02-15 16:50 +0200]:
> > On 14.02.2018 22:03, Dimitri John Ledkov wrote:
> > > Hi,
> > > 
> > > I am on bionic and managed to build bionic container for testing using:
> > > 
> > > $ autopkgtest-build-lxd ubuntu-daily:bionic/amd64
> > > 
> > > Note this uses Ubuntu Foundations provided container as the base,
> > > rather than the third-party image that you are using from "images"
> > > remote.
> > > 
> > > Why are you using images: remote?
> > 
> > Because that's what the manpage suggests :)
> 
> Right, and quite deliberately. At least back in "my days", the ubuntu: and
> ubuntu-daily: images had a lot of fat in them which made them both
> unnecessarily slow (extra download time, requires more RAM/disk, etc.) and
> also undesirable for test correctness, as having all of the unnecessary bits
> preinstalled easily hides missing dependencies.
> 
> The latter can be alleviated by purging stuff of course, and that's what
> happens for the cloud VM images in OpenStack:
> 
>   
> https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/tree/setup-commands/setup-testbed#n242
> 
> But this takes even more time, and so far just hasn't been necessary as the
> images: ones were just right - they contain exactly what a generic container
> image is supposed to contain and are pleasantly small and fast.
> 
> > > Is the failure reproducible with ubuntu-daily:bionic?
> > > 
> > > If you can build images with ubuntu-daily:bionic, then you need to
> > > contact and file an issue with images: remote provider.
> > 
> > ubuntu-daily: works, images: fails for artful and bionic while xenial
> > works, and the image server is:
> > 
> > https://images.linuxcontainers.org/
> 
> These are being advertised and used a lot, so maybe Stephane's LXD team can
> help with fixing these? Them having no network at all sounds like a grave bug
> which should be fixed either way.
> 
> That said, it could of course be that the setup script just needs some
> adjustments for the netplan changes:
> https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/tree/setup-commands/setup-testbed
> As this doesn't know about netplan at all, just ifupdown.
> 
> Martin

stgraber@castiana:~$ lxc launch images:ubuntu/bionic/amd64 bionic
Creating bionic
Starting bionic

stgraber@castiana:~$ lxc launch images:ubuntu/artful/amd64 artful
Creating artful
Starting artful

stgraber@castiana:~$ lxc list
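+--------+---------+-----------------------+----------------------------------------------+------------+-----------+
|  NAME  |  STATE  |         IPV4          |                     IPV6                     |    TYPE    | SNAPSHOTS |
+--------+---------+-----------------------+----------------------------------------------+------------+-----------+
| artful | RUNNING | 10.204.119.187 (eth0) | 2001:470:b368:4242:216:3eff:fe27:799b (eth0) | PERSISTENT | 0         |
+--------+---------+-----------------------+----------------------------------------------+------------+-----------+
| bionic | RUNNING | 10.204.119.248 (eth0) | 2001:470:b368:4242:216:3eff:fe8c:7741 (eth0) | PERSISTENT | 0         |
+--------+---------+-----------------------+----------------------------------------------+------------+-----------+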


And confirmed that networking inside both of them works fine here.

I wonder if it's a netplan vs ifupdown thing hitting autopkgtest in this case?
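
For illustration, a rough sketch of how a script could tell the two
configuration styles apart. It assumes netplan configuration lives in
/etc/netplan/*.yaml and ifupdown configuration in /etc/network/interfaces.
The function name detect_net_config is hypothetical and this is not what
setup-testbed currently does:

```shell
#!/bin/sh
# Hypothetical helper: guess which network configuration style a root
# filesystem uses. The optional first argument is the root directory to
# inspect and defaults to "/".
detect_net_config() {
    root=${1:-/}
    # If the glob matches nothing it stays literal and ls fails, so the
    # netplan branch is only taken when at least one .yaml file exists.
    if ls "$root"/etc/netplan/*.yaml >/dev/null 2>&1; then
        echo netplan
    elif [ -e "$root/etc/network/interfaces" ]; then
        echo ifupdown
    else
        echo unknown
    fi
}
```

Run inside each container (for example through lxc exec), something like
this would show whether the artful and bionic images differ from xenial in
the way suspected above.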

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com




Re: autopkgtest-build-lxd failing with bionic

2018-02-15 Thread Martin Pitt
Hello Timo,

Timo Aaltonen [2018-02-15 16:50 +0200]:
> On 14.02.2018 22:03, Dimitri John Ledkov wrote:
> > Hi,
> > 
> > I am on bionic and managed to build bionic container for testing using:
> > 
> > $ autopkgtest-build-lxd ubuntu-daily:bionic/amd64
> > 
> > Note this uses Ubuntu Foundations provided container as the base,
> > rather than the third-party image that you are using from "images"
> > remote.
> > 
> > Why are you using images: remote?
> 
> Because that's what the manpage suggests :)

Right, and quite deliberately. At least back in "my days", the ubuntu: and
ubuntu-daily: images had a lot of fat in them which made them both
unnecessarily slow (extra download time, requires more RAM/disk, etc.) and also
undesirable for test correctness, as having all of the unnecessary bits
preinstalled easily hides missing dependencies.

The latter can be alleviated by purging stuff of course, and that's what
happens for the cloud VM images in OpenStack:

  
https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/tree/setup-commands/setup-testbed#n242

But this takes even more time, and so far just hasn't been necessary as the
images: ones were just right - they contain exactly what a generic container
image is supposed to contain and are pleasantly small and fast.

> > Is the failure reproducible with ubuntu-daily:bionic?
> > 
> > If you can build images with ubuntu-daily:bionic, then you need to
> > contact and file an issue with images: remote provider.
> 
> ubuntu-daily: works, images: fails for artful and bionic while xenial
> works, and the image server is:
> 
> https://images.linuxcontainers.org/

These are being advertised and used a lot, so maybe Stephane's LXD team can
help with fixing these? Them having no network at all sounds like a grave bug
which should be fixed either way.

That said, it could of course be that the setup script just needs some
adjustments for the netplan changes:
https://anonscm.debian.org/cgit/autopkgtest/autopkgtest.git/tree/setup-commands/setup-testbed
As this doesn't know about netplan at all, just ifupdown.

Martin



Re: autopkgtest-build-lxd failing with bionic

2018-02-15 Thread Timo Aaltonen
On 14.02.2018 22:03, Dimitri John Ledkov wrote:
> Hi,
> 
> I am on bionic and managed to build bionic container for testing using:
> 
> $ autopkgtest-build-lxd ubuntu-daily:bionic/amd64
> 
> Note this uses Ubuntu Foundations provided container as the base,
> rather than the third-party image that you are using from "images"
> remote.
> 
> Why are you using images: remote?

Because that's what the manpage suggests :)

> Is the failure reproducible with ubuntu-daily:bionic?
> 
> If you can build images with ubuntu-daily:bionic, then you need to
> contact and file an issue with images: remote provider.

ubuntu-daily: works, images: fails for artful and bionic while xenial
works, and the image server is:

https://images.linuxcontainers.org/




-- 
t



Re: autopkgtest-build-lxd failing with bionic

2018-02-14 Thread Dimitri John Ledkov
Hi,

On 14 February 2018 at 12:01, Timo Aaltonen wrote:
>
> Hi
>
> I'm not able to build a bionic container:
>
> autopkgtest-build-lxd images:ubuntu/bionic/amd64
> Creating autopkgtest-prepare-1ay
> Starting autopkgtest-prepare-1ay
> Container finished booting. Distribution Ubuntu, release bionic, architecture amd64
> Running setup script /usr/share/autopkgtest/setup-commands/setup-testbed...
> grep: //etc/network/interfaces: No such file or directory
> Err:1 http://archive.ubuntu.com/ubuntu bionic InRelease
>   Temporary failure resolving 'archive.ubuntu.com'
> Err:2 http://archive.ubuntu.com/ubuntu bionic-updates InRelease
>   Temporary failure resolving 'archive.ubuntu.com'
> Reading package lists...
> W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease  Temporary failure resolving 'archive.ubuntu.com'
> W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-updates/InRelease  Temporary failure resolving 'archive.ubuntu.com'
> W: Some index files failed to download. They have been ignored, or old ones used instead.
> Reading package lists...
> Building dependency tree...
> Reading state information...
> dbus is already the newest version (1.12.2-1ubuntu1).
> The following NEW packages will be installed:
>   eatmydata libeatmydata1
> 0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
> Need to get 12.1 kB of archives.
> After this operation, 54.3 kB of additional disk space will be used.
> Err:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 libeatmydata1 amd64 105-5
>   Temporary failure resolving 'archive.ubuntu.com'
> Err:2 http://archive.ubuntu.com/ubuntu bionic/main amd64 eatmydata all 105-5
>   Temporary failure resolving 'archive.ubuntu.com'
> E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/libe/libeatmydata/libeatmydata1_105-5_amd64.deb  Temporary failure resolving 'archive.ubuntu.com'
> E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/libe/libeatmydata/eatmydata_105-5_all.deb  Temporary failure resolving 'archive.ubuntu.com'
> E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
>
> while building a sid container worked fine. Is there a bug open I've not
> found yet, or should I file one?

I am on bionic and managed to build bionic container for testing using:

$ autopkgtest-build-lxd ubuntu-daily:bionic/amd64

Note this uses Ubuntu Foundations provided container as the base,
rather than the third-party image that you are using from "images"
remote.

Why are you using images: remote?

Is the failure reproducible with ubuntu-daily:bionic?

If you can build images with ubuntu-daily:bionic, then you need to
contact and file an issue with images: remote provider.

-- 
Regards,

Dimitri.
