Bug#964596: cloud.debian.org: Debian 10 EC2: IPv4 address suddenly flushed

2020-07-09 Thread Martin Olsson
Package: cloud.debian.org
Severity: major
User: cloud.debian@packages.debian.org
Usertags: aws

Problem:
Production systems in AWS lose all network connectivity 1h after boot, once a
dist-upgrade from Debian 9 to Debian 10 has been performed.
One can't ssh in to investigate and no remote console exists in AWS.
Fortunately, you *can* restart the EC2 instance, which will generate a new
dhcp lease and give you another 1h of access before the access is cut again.

How to reproduce:

Install a Debian 9 machine using the official Debian 9 AMI.

During the hardening of the machine, disable IPv6 completely:
# cat /etc/sysctl.d/disable_ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.eth0.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

This hardened Debian 9 server works perfectly for a year.

Now perform a dist-upgrade to Debian 10.

Everything looks good. No errors during the upgrade.
After the final reboot, the server comes online as it should.

BUT...
After 1 hour we suddenly lose all access to the server.

A reset of the EC2 brings the access back, only to be lost again 1h later.

(unfortunately, neither dhclient nor the cloud-init scripts syslogged any
error, so it was pretty hard to figure out what was wrong)

It turns out to be the IPv6 hardening that generates problems for
dhclient/ifup.

I believe the problem lies in /sbin/dhclient-script :
if [ -n "$old_ip_address" ] &&
   [ "$old_ip_address" != "$new_ip_address" ]; then
    # leased IP has changed => flush it
    ip -4 addr flush dev ${interface} label ${interface}
fi

My guess is that when dhclient fails to set an IPv6 IP, the above code
flushes the current IPv4 configured on the machine, making it lose all
network connectivity.
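
To make the suspected trigger concrete, here is a minimal sketch of how the
quoted test evaluates for a few inputs. The function name needs_flush is made
up for illustration, and how dhclient actually sets old_ip_address and
new_ip_address in this failure case is an open assumption, not something
confirmed by the logs.

```shell
#!/bin/sh
# Hypothetical helper that mirrors only the condition quoted above.
# It does not touch any interface.
needs_flush() {
    old_ip_address=$1
    new_ip_address=$2
    [ -n "$old_ip_address" ] && [ "$old_ip_address" != "$new_ip_address" ]
}

# Same address on renewal: no flush.
needs_flush 10.75.75.75 10.75.75.75 || echo "keep address"
# Old address set but new address empty: the flush branch would run.
needs_flush 10.75.75.75 "" && echo "flush"
```

If the script is ever invoked with an empty new_ip_address while an old
address is known, the flush branch is taken, which would match the symptom.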



My current workaround is to *not* do the above IPv6 hardening, then the
server works fine.
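
For anyone who wants to check whether a machine carries this kind of
hardening before upgrading, a rough sketch follows. The helper name
find_ipv6_disable is hypothetical; the default directory matches the
sysctl drop-in shown above.

```shell
#!/bin/sh
# Sketch: list sysctl drop-in files that set disable_ipv6 = 1.
# find_ipv6_disable is a made-up name; pass another directory to override
# the default of /etc/sysctl.d.
find_ipv6_disable() {
    dir=${1:-/etc/sysctl.d}
    # -l prints matching file names only, -s hides "no such file" errors,
    # -E enables extended regular expressions.
    grep -lsE '^[[:space:]]*net\.ipv6\.conf\.[^.]+\.disable_ipv6[[:space:]]*=[[:space:]]*1' \
        "$dir"/*.conf
}

# Example (not run here): find_ipv6_disable /etc/sysctl.d
```

This only inspects files on disk; settings applied by other means (for
example /etc/sysctl.conf or a kernel command line) are not covered.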




My /etc/network/interfaces configuration:
# interfaces(5) file used by ifup(8) and ifdown(8)
# Include files from /etc/network/interfaces.d:
source-directory /etc/network/interfaces.d
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
allow-hotplug eth0
iface eth0 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth1 inet dhcp
allow-hotplug eth1
iface eth1 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth2 inet dhcp
allow-hotplug eth2
iface eth2 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth3 inet dhcp
allow-hotplug eth3
iface eth3 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth4 inet dhcp
allow-hotplug eth4
iface eth4 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth5 inet dhcp
allow-hotplug eth5
iface eth5 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth6 inet dhcp
allow-hotplug eth6
iface eth6 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth7 inet dhcp
allow-hotplug eth7
iface eth7 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper
iface eth8 inet dhcp
allow-hotplug eth8
iface eth8 inet6 manual
  up /usr/local/sbin/inet6-ifup-helper
  down /usr/local/sbin/inet6-ifup-helper


Log:
Jul 8 10:13:36 foobar ifup[363]: RTNETLINK answers: File exists
Jul 8 10:13:36 foobar ifup[363]: invoke-rc.d: could not determine current runlevel
Jul 8 10:13:36 foobar dhclient[571]: bound to 10.75.75.75 -- renewal in 1491 seconds.
Jul 8 10:13:36 foobar ifup[363]: bound to 10.75.75.75 -- renewal in 1491 seconds.
Jul 8 10:13:36 foobar ifup[363]: Could not get a link-local address
Jul 8 10:13:36 foobar ifup[363]: ifup: failed to bring up eth0

1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0:  mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 06:ce:43:75:75:75 brd ff:ff:ff:ff:ff:ff


Additional findings:
If I compare the contents of the dir /etc/network/ of this 9-->10
dist-upgraded machine, it differs from a machine that is installed directly
with the Debian 10 AMI:
dist-upgraded:/etc/network> ls
if-down.d/  if-post-down.d/  if-pre-up.d/  if-up.d/  interfaces  interfaces.d/

pure deb10:/etc/network> ls
cloud-ifupdown-helper* if-down.d/   if-pre-up.d/  interfaces
cloud-interfaces-template  if-post-down.d/  if-up.d/  interfaces.d/

This makes me think that the cloud-init package for Debian 10 does
something wrong.


Somewhat related bug: #846583

/Martin


Bug#949735: closed by Thomas Goirand (Re: Bug#949735: AWS Debian9 AMI: Problems with /etc/hosts when user-data set 'manage_etc_hosts: false')

2020-01-27 Thread Martin Olsson
Hi!

Well, yes it doesn't match my expectation but that's not the bigger issue here.

We're all used to having conf files with helpful comments and default
options commented out. Therefore we trust the information in the file.
In this case the information is misleading.

The first piece of garbage, "Your system has configured 'manage_etc_hosts' as
True", made me very confused since I trusted the information.
I immediately started debugging my system, only to realize that I had done
nothing wrong and the message was just garbage from the original AMI.


Hence, I think this ticket should be re-opened and at least Action 1 be
resolved.


New Action 1:
Make the header say "Your system has configured 'manage_etc_hosts' as
False. Therefore this file is not managed by cloud-init." or something
similar.

or

Make /etc/hosts contain only the bare minimum entries, i.e. the 127.0.0.1
and ::1.

or

empty the file completely


In any case, do not leave the garbage in there.

/Martin


On 27 Jan 2020 at 10:51, Debian Bug Tracking System wrote:
>
> This is an automatic notification regarding your Bug report
> which was filed against the cloud.debian.org package:
>
> #949735: AWS Debian9 AMI: Problems with /etc/hosts when user-data set 
> 'manage_etc_hosts: false'
>
> It has been closed by Thomas Goirand .
>
> Their explanation is attached below along with your original report.
> If this explanation is unsatisfactory and you have not received a
> better one in a separate message then please contact Thomas Goirand 
>  by
> replying to this email.
>
>
> --
> 949735: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=949735
> Debian Bug Tracking System
> Contact ow...@bugs.debian.org with problems
>
>
>
> -- Forwarded message --
> From: Thomas Goirand 
> To: Martin Olsson , 
> 949735-d...@bugs.debian.org
> Cc:
> Bcc:
> Date: Mon, 27 Jan 2020 10:40:06 +0100
> Subject: Re: Bug#949735: AWS Debian9 AMI: Problems with /etc/hosts when 
> user-data set 'manage_etc_hosts: false'
> On 1/24/20 11:51 AM, Martin Olsson wrote:
> > Action 1:
> > Please make the header state "Your system has configured
> > 'manage_etc_hosts' as False.
> > Therefore this file is not managed by cloud-init." or something similar.
> >
> > Action 2:
> > Please remove this line. Or actually, see 3) below.
> >
> > Action 3:
> > If 'manage_etc_hosts' is set to "false", do a one-time write to
> > /etc/hosts, setting the fqdn and hostname to 127.0.0.1.
> > Only do this the first time the EC2 is booted. After that, the file is
> > managed manually.
>
> Hi,
>
> So, basically, you're complaining that when you set manage_etc_hosts:
> false, you're getting a /etc/hosts that doesn't match your expectation.
> Well, that is exactly what the feature is about... I guess then, you're
> suppose to edit /etc/hosts by hand, and fix it the way it pleases you.
>
> Then you're talking about is features to add to cloud-init upstream.
> Feel free to get in touch with them, or contribute the code. As it
> stands, I don't think anyone from the Debian cloud team feels like
> patching cloud-init to do what you're suggesting.
>
> Usually, the normal way to use the cloud is to *not* set
> manage_etc_hosts and let cloud-init do it, then maybe after the first
> boot, change the value.
>
> I'm closing this bug, because I really don't see how we should address
> it in a reasonable way.
>
> Cheers,
>
> Thomas Goirand
>
>

Bug#949735: AWS Debian9 AMI: Problems with /etc/hosts when user-data set 'manage_etc_hosts: false'

2020-01-24 Thread Martin Olsson
Package: cloud.debian.org

I set up an EC2 instance in AWS using the Debian 9 AMI.
I pass along this user-data:
cat /var/lib/cloud/instance/user-data.txt
#cloud-config
fqdn: foo.bar.cloud
timezone: Europe/Stockholm
manage_etc_hosts: false
ssh_authorized_keys:
  - ssh-rsa ...

Sure enough, /etc/hosts is not managed. But...

1)
The created /etc/hosts file gives confusing and conflicting information.
It says:
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
# /etc/cloud/cloud.cfg or cloud-config from user-data

This is wrong. I have configured it as 'false'.
I guess this erroneous information comes from some default /etc/hosts
template in your AMI.

Action 1:
Please make the header state "Your system has configured
'manage_etc_hosts' as False.
Therefore this file is not managed by cloud-init." or something similar.


2)
The created /etc/hosts file contains wrong IP information.
It says:
127.0.1.1   ip-10-0-3-4.ec2.internal ip-10-0-3-4

This is wrong. My EC2 instance doesn't have this IP. And if I did use the
subnet 10.0.3.0/24, this line would confuse me since I'm not the one who
added it.
Again, I guess this is a leftover from some default /etc/hosts
template in your AMI (you probably used IP 10.0.3.4 when creating the
AMI).

Action 2:
Please remove this line. Or actually, see 3) below.


3)
Even if I set 'manage_etc_hosts: false' I would still like the
installed Debian EC2 machine to get an /etc/hosts similar to
a manually installed Debian machine, using netinst or CD1.
That is, I don't want /etc/hosts to be *managed* over time (after
reboots), but I *do* want the initial /etc/hosts to get the line
127.0.0.1  
127.0.0.1   localhost
...just like the normal debian-installer would do.

Action 3:
If 'manage_etc_hosts' is set to "false", do a one-time write to
/etc/hosts, setting the fqdn and hostname to 127.0.0.1.
Only do this the first time the EC2 is booted. After that, the file is
managed manually.
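
A rough sketch of what such a one-time write could look like is given below.
It is not cloud-init code: the function seed_hosts_once, the marker file and
the way the short hostname is derived from the fqdn are all assumptions made
for illustration.

```shell
#!/bin/sh
# Sketch of a one-time /etc/hosts seeding step (hypothetical helper).
seed_hosts_once() {
    hosts_file=$1       # e.g. /etc/hosts
    marker=$2           # stamp file recording that seeding already ran
    fqdn=$3
    short=${fqdn%%.*}   # host part before the first dot

    # On later boots the marker exists: leave the file to manual management.
    [ -e "$marker" ] && return 0

    printf '127.0.0.1\t%s %s\n127.0.0.1\tlocalhost\n' "$fqdn" "$short" \
        > "$hosts_file" && : > "$marker"
}

# Example (not run here):
# seed_hosts_once /etc/hosts /var/lib/hosts-seeded foo.bar.cloud
```

The marker file is what makes the write happen only on first boot; any edits
made afterwards are left alone.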



Bug#929860: apt-get hangs forever when invoked via timeout

2019-06-03 Thread Martin Olsson
Update:

5)
The problem only affects 'apt-get install', not 'apt-get update'.

6)
A workaround is to use --foreground:
  timeout --foreground 1m apt-get -y install bc
This finishes successfully without any delay or strange exit code.
However, it is only a workaround: 'apt-get install' launches child
processes, and if they hang, only the parent process will be killed.



Bug#929860: apt-get hangs forever when invoked via timeout

2019-06-01 Thread Martin Olsson
Package: apt
Version: 1.4.9

Hi!
I've found a reproducible bug when using 'apt-get' together with
'timeout' within a shell script.

How to reproduce:

I install a system using the debian-9.9.0-amd64-xfce-CD-1.iso.
I don't install anything at all except a ssh-server (not even any
sys-utils, just the base OS and sshd).
I login and create this script /root/foo.sh:   ('bc' can be replaced
with any package)

  #!/bin/sh
  timeout 1m apt-get -y -qq install bc

I run 'chmod 755 /root/foo.sh' and then execute it:
# /root/foo.sh
Selecting previously unselected package bc.
(Reading database ... 18047 files and directories currently installed.)
Preparing to unpack .../bc_1.06.95-9+b3_amd64.deb ...
Unpacking bc (1.06.95-9+b3) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up bc (1.06.95-9+b3) ...

Here it freezes!
Nothing more happens, and it would hang forever if my timeout did not
expire after a minute, letting foo.sh continue.
foo.sh has nothing more to do so it terminates and I get a new prompt.

But for some reason, the tty is no longer echoing my keystrokes!
I blindly run 'echo $?' and get the response "124".
This is the return code from 'timeout' after it killed the apt-get job.


1)
What is happening here? The package is successfully installed but
apt-get never terminates.

2)
Running 'timeout 1m apt-get -y -qq install bc' directly on the
commandline works fine.
Running 'apt-get -y -qq install bc' directly in the foo.sh shell,
without any timeout, works fine.
So why doesn't apt-get terminate when started via 'timeout', when run
inside a shell?

3)
Why is the tty no longer echoing my keystrokes?

4)
I'm pretty sure my install-script in Debian 9.6 worked just fine with
'apt-get' together with 'timeout' within a shell script.
That is, 'apt-get' did its job and terminated immediately and successfully.
Now when I install a new machine using Debian 9.9 it doesn't work as expected.
So I guess the problem was introduced somewhere between Debian 9.6 and 9.9.



Anyhow... I can now reproduce the bug over and over:
I run 'reset' to get a normal tty again, purge the installed 'bc' tool
and then install it again via foo.sh:

# reset
# apt-get -y purge bc
# /root/foo.sh

Again, apt-get hangs after installing 'bc'.



(
Why do I call this a bug and not just an annoyance?
Here's my use-case (that used to work fine):

I have a script that declares a function to add a new APT source,
update and then install a package using that source:

  install_foobar() {
    cd /tmp
    timeout 1m wget "https://apt.foobar.com/foobar.deb" || return $?
    dpkg -i foobar.deb || return $?
    timeout 1m apt-get update || return $?
    timeout 4m apt-get -y install foobar-agent || return $?
    return 0
  }

The function is called by the script, and depending on its exit status
it does things differently:

  install_foobar || touch /root/foobar_install_failed
  if [ ! -f /root/foobar_install_failed ]; then
    ...perform cleanup...
    ...disable unnecessary services...
    ...create a foobar user...
    ...set permissions...
  fi

Now when the apt-get command times out instead of exiting normally, code
124 is returned.
This in turn creates the 'foobar_install_failed' file, and the user
creation, settings, etc. are skipped even though the foobar-agent
package was successfully installed.
)
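
One way a script could avoid treating a timeout as a failed install is
sketched below. This does not fix the hang itself; it only separates "timed
out" (status 124 from timeout) from "failed" and then checks the result
independently. The names run_or_verify and foobar_installed are hypothetical.

```shell
#!/bin/sh
# Sketch: run a command under timeout(1); on status 124 (time limit hit),
# decide success with a separate verification command instead.
run_or_verify() {
    limit=$1     # duration for timeout(1), e.g. 4m
    verify=$2    # command or function that succeeds if the work was done
    shift 2
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        "$verify"
        return $?
    fi
    return "$status"
}

# Hypothetical verification for the use-case above: ask dpkg whether the
# package ended up installed.
foobar_installed() {
    dpkg-query -W -f='${Status}' foobar-agent 2>/dev/null | grep -q 'ok installed'
}

# Example (not run here):
# run_or_verify 4m foobar_installed apt-get -y install foobar-agent
```

A command that itself exits with 124 for another reason would be
indistinguishable from a timeout here, so this is only a stopgap.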



Bug#928982: Bug: 'systemctl is-enabled' return enabled/true when alias symlinks exist

2019-05-14 Thread Martin Olsson
Hi!

I don't know why you need it, since the bug is in systemctl, but here it is:

#cat /etc/systemd/system/myfoobar.target
[Unit]
Description=MyFooBar (with 2 workers)
Wants=foo1of2.service foo2of2.service

[Install]
WantedBy=multi-user.target


Since Puppet uses the exit status from 'systemctl is-enabled', it is
important that this query can be trusted.
If you have a unit that is currently NOT enabled, the 'systemctl
is-enabled' query should not say "enabled" just because an alias
symlink exists.

If I delete the alias symlink, the 'systemctl is-enabled' query says
"disabled". Correct.
If I create the alias symlink, the 'systemctl is-enabled' query says
"enabled". Wrong!

If I remove the alias symlink and manually run 'systemctl enable
myfoobar.target', then the query says "enabled". Correct (since a
wants-symlink now exists).




On Tue, 14 May 2019 at 17:19, Michael Biebl wrote:
>
> On 14.05.19 at 16:58, Martin Olsson wrote:
> > Package: systemd
> > Version: 232-25+deb9u11
> >
> > Problem:
> > The command 'systemctl is-enabled myfoobar.target' return "enabled"
> > (exit code 0) when it should return "disabled" (code >0).
>
>
> Please share the full myfoobar.target
>
>
> --
> Why is it that all of the instruments seeking intelligent life in the
> universe are pointed away from Earth?
>



Bug#928982: Bug: 'systemctl is-enabled' return enabled/true when alias symlinks exist

2019-05-14 Thread Martin Olsson
Package: systemd
Version: 232-25+deb9u11

Problem:
The command 'systemctl is-enabled myfoobar.target' returns "enabled"
(exit code 0) when it should return "disabled" (code >0).

How to reproduce:
Create a symlink /etc/systemd/system/foo.target -->
/etc/systemd/system/myfoobar.target
either manually with 'ln -s' or via an "Alias=" in your unit file.

Without the alias symlink, 'systemctl is-enabled myfoobar.target'
returns "disabled" just as it should.
When adding the symlink, 'systemctl is-enabled myfoobar.target'
suddenly returns "enabled".
I think this is wrong.

The manual states:
--
is-enabled NAME...
Checks whether any of the specified unit files are enabled (as with
enable). Returns an exit code of 0 if at least one is enabled,
non-zero otherwise.
Prints the current enable status (see table). To suppress this output,
use --quiet. To show installation targets, use --full.
Result "enabled" (exit code 0) = Enabled via .wants/, .requires/ or
alias symlinks (permanently in /etc/systemd/system/, or transiently in
/run/systemd/system/).
--

Why should is-enabled report "enabled" on alias symlinks in
/etc/systemd/system/?
Aliases are just aliases, they don't automatically enable the
service/target/unit on boot.



How I found this issue:
I use Puppet to handle the state of my custom service (which is
actually a .target, with multiple services as Wants).
When Puppet checks whether the service 'myfoobar.target' is enabled,
it runs the command 'systemctl is-enabled myfoobar.target'. This
returns true (since I have an alias symlink
/etc/systemd/system/foo.target), so Puppet never forces the service to
become enabled. :-(
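
Until the query can be trusted, a check that looks only at .wants/ and
.requires/ symlinks could serve as a stopgap. The sketch below is an
assumption-laden illustration: has_wants_link is a made-up helper, and it
inspects a single directory rather than every path systemd searches.

```shell
#!/bin/sh
# Sketch: succeed only if the unit is linked from a *.wants/ or *.requires/
# directory, ignoring alias symlinks such as foo.target.
has_wants_link() {
    unit=$1
    dir=${2:-/etc/systemd/system}
    for link in "$dir"/*.wants/"$unit" "$dir"/*.requires/"$unit"; do
        # An unmatched glob stays literal, so test that the entry exists.
        if [ -L "$link" ] || [ -e "$link" ]; then
            return 0
        fi
    done
    return 1
}

# Example (not run here): has_wants_link myfoobar.target && echo enabled
```

With only the alias symlink present this returns non-zero; after
'systemctl enable myfoobar.target' creates the wants-symlink it returns 0.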



Bug#927281: openvpn: systemd service won't start (silent fail)

2019-04-17 Thread Martin Olsson


Package: openvpn
Version: 2.4.0-6+deb9u1


Hi Alberto!

I just discovered a problem/bug in openvpn (installed on a fresh 
Debian 9 system).


After the package has been installed, I have to manually run 'systemctl 
daemon-reload', or the service will silently fail to start.




History / how to reproduce:
I install Debian 9 and uncheck everything to get a minimal install.
I run 'apt-get install openvpn'
I install my certificates and openvpn.conf.

I try to start the service. Nothing happens.

I tried 'systemctl stop openvpn' followed by 'systemctl start openvpn'.
Nothing happens.
Nothing is logged under /var/log/ 
In journalctl I see:

Apr 17 11:18:59 foobar systemd[1]: Stopped OpenVPN service.
Apr 17 11:19:07 foobar systemd[1]: Starting OpenVPN service...
Apr 17 11:19:07 foobar systemd[1]: Started OpenVPN service.
Looks good, but no openvpn process is started!

Silent fail. :-(



# systemctl status openvpn
● openvpn.service - OpenVPN service
   Loaded: loaded (/lib/systemd/system/openvpn.service; enabled; vendor preset: enabled)
   Active: active (exited) since Wed 2019-04-17 11:19:07 CEST; 13min ago
  Process: 443 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
 Main PID: 443 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/openvpn.service

(the same thing (nothing) happens if I use '/etc/init.d/openvpn start')

# ps fax | grep openvpn
No process found.




But if I manually run '/usr/sbin/openvpn --config 
/etc/openvpn/openvpn.conf', everything is working just fine.


So the problem is not with openvpn itself, but with systemd/init.



After a reboot, sometimes it starts working, probably because something 
executed a 'systemctl daemon-reload'. However, on a minimal install of 
Debian 9, with only openvpn installed, even a reboot doesn't fix the 
problem. I have to manually issue the daemon-reload command.




So...

I think the solution is for you to add the command 'systemctl 
daemon-reload' to openvpn.postinst.
You already have it in openvpn.postrm, so you can copy the code from 
there.
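
For illustration, the kind of fragment meant here might look like the sketch
below. It is not copied from openvpn.postrm; the function name
reload_systemd is hypothetical, and the directory test is a pattern commonly
seen in Debian maintainer scripts for detecting a running systemd.

```shell
#!/bin/sh
# Sketch of a maintainer-script fragment (hypothetical function name).
reload_systemd() {
    # /run/systemd/system exists only when systemd is the running init.
    if [ -d /run/systemd/system ]; then
        # Never let a failed reload abort the package installation.
        systemctl --system daemon-reload >/dev/null 2>&1 || true
    fi
}

reload_systemd
```

On systems without systemd the function is a no-op, so the same postinst
would keep working under other init systems.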


/Martin

Bug#520876: general: new FTP app needs packaging ... http://bareftp.org

2009-03-23 Thread Martin Olsson
Package: general
Severity: normal

Hi,

There is a new FTP client called BareFTP. It would be nice to have it packaged 
into Debian unstable.

This application is available at:
http://bareftp.org/


-- System Information:
Debian Release: 5.0
  APT prefers jaunty-updates
  APT policy: (500, 'jaunty-updates'), (500, 'jaunty-security'), (500, 'jaunty')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.29-020629rc8-generic (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


