On Fri 03 Mar 2023 at 17:03:49 (+0000), daven...@tuxfamily.org wrote:
> On 2023-03-03 05:21, David Wright wrote:
> > On Thu 02 Mar 2023 at 11:44:17 (+0100), daven...@tuxfamily.org wrote:
> > > 
> > > I finally had the time to dig into the logs from two days ago, when
> > > the problem happens again
> > > It seems that dhclient does it requests trying different interfaces
> > > EXCEPT tun0 (see syslog below)
> > > 
> > > I looked into /etc/dhcp/dhclient.conf, which mostly commented. In the
> > > non-commented part shown below
> > > I don't see anything related to which interfaces are to be used as
> > > arguments when dhclient is executed
> > > But I'm not a dhclient expert.
> > > 
> > > At boot time/without VPN, dhclient is executed with enp.* interface
> > > accondinng to systemctl/ps -eFH.
> > > But It still tries different interfaces according to logs. And I have:
> > > no idea how it detects
> > > which interfaces to use for it requests and why it doesn't
> > > detect tun0.
> > > Maybe because tun0 is not yet exposed by the system at boot time when
> > > dhclient is started?
> > > 
> > > It stills try wlan interface even if it is down, which is weird, but
> > > doesn't seem to do harm except
> > > spamming syslog and spending a few additional seconds before trying
> > > another interface.
> > 
> > It's not clear what the status of your vpn was when the logs were
> > written, AFAICT. It's obviously not soon after booting, as you have
> > processes running with PIDs in the 40,000s.
> 
> The VPN was obviously running at that time.

I'm writing from what I've gleaned from the logs. You're writing from
the viewpoint of running the machine. If you'd originally written some
of what's below, I wouldn't have had to. Why do I have to? Because
there's no point in my writing anything unless I give the assumptions
on which it's based, enabling you to correct any false assumptions.

> IF the VPN was not running the 28th february (almost allday, including
> around 14:30), it wouldn't make sense to
> - make /etc/resolv.conf immutable so dhlient doesn't overwrite it
> while my home's resolver address
> - complain about dhclient ignoring tun0, which obviously exists only
> when the vpn is running
> - searching for logs for the 28th february around 14:30
> 
> I never said those logs were soon after running.

Nor me. I presumed that they weren't, and wrote that down in case
my presumption was incorrect.

> I'd love to start my
> workday after 14:00/2 pm… but obviously it not the case.
> These logs are from after dhcp leases ends and dhlicent tries to
> overwrite resolv.conf.
> Several hours after openconnect is executed (around 9:00 am).
> 
> I said when the system boots, before the VPN is up and ruuning, I can
> see dhclient has only one (eth0/ens-whatever_its-name_is) as an
> argument
> So I don't how dhclient determines what interfaces to use, and why it
> ignores tun0 BUT still detects and try to use ALL other interfaces,
> even when there down/not used
> 
> > 
> > It would make sense to start and stop the vnc from a terminal in which
> > script (bsdutils) was running, preferably with -T to add timestamps to
> > each command. You could then reconcile the logs with what you were
> > doing.
> 
> I guess you mean vpnc, or vpn? not vnc? or am I misinterpreting?

Yes, I used vpnc as a shorthand for starting your VPN session. I know
openconnect only as a package name, and I had no idea whether you
start it in a GUI (assuming X is running) or with some terminal command.
So again, I'm second-guessing pretty much what you already do.

Yes, this keyboard doesn't have a good touch, so I often have to add
missing letters while I read through what I've written.

> vpnc_script is started by openconnect (vpn client for cisco, among
> other things). OpenConnect is executed from terminal
> OpenConnect has timestamp but there is no output from openconnect when
> /etc/resolv is overwritten by dhclient. As OpenConnect doesn't know
> about it when it happens
> 
> That why I used auditd and ausearch with timestamp, and reconciled
> logs from auditd
> showing a process failing (as expected) to overwrite /etc/resolv.conf
> with logs from syslog showing dhclient trying to renew DHCP leases
> 
> Around 9:00 am, openconnect is executed manually from the terminal and
> works as expected.
> Right after the VPN is connected, I make resolv.conf file immutable
> and tell auditd to monitor write or append actions to this file
> 
> Everything works until around 14:30, dhclient tries to overwrites
> /etc/resolv and fails as expect.
> Auditd logs that action. The VPN is still running. I ^C openconnect
> only at the end of the day
> (unless something breaks, like temporary network issues forcing me to
> reconnect)
> 
> At that moment, things still works just because I made /etc/resolv.conf
> But it's annoying to have to think about adding/removing the immutable
> flag each time…

As you start your VPN in a terminal, it would be easy to script it.
Perhaps something like:

  connect
  (sleep 5; chattr)

where 5 is tuned to how much time before the connection settles.

  chattr; disconnect

is obviously simpler.

> > > ------------
> > > dave@debian-laptop:~$ systemctl status
> > > 
> > >  ├─ifup@enp2s0f0.service
> > >              │ └─1093 /sbin/dhclient -4 -v -i -pf
> > > /run/dhclient.enp2s0f0.pid -lf /var/lib/dhcp/dhclient.enp2s0f0.leases
> > > -I -df /var/lib/dhcp/dhclient6.enp2s0f0.leases enp2s0f0
> > > 
> > > ------------
> > 
> > I don't know why ifupdown is running, as I thought you were
> > controlling the network with connman.
> 
> That's the default debian behaviour.
> A colleague with debian stable and LXQT (therefore connmann as well,
> just like for LXDE, which I use)
> also has ifup running as parent process of dhclient. ifupdown is
> installed by default.

It might help to clean up your networking then. ConnMan should be able
to run wpasupplicant, and AFAIK can do its own DHCP.

I have an ancient laptop running systemd-networkd and wpasupplicant
for its ethernet and wifi. My other wifis use iwd. So I've removed
ifupdown, isc-dhcp-*, and avahi-autoipd from all my machines as
none of these is now required. All part of a cleanup following the
demise of wicd in bullseye.

> > > dave@debian-laptop:~$ sudo ausearch --interpret --file
> > > /etc/resolv.conf
> > > ----
> > > type=PROCTITLE msg=audit(28/02/2023 14:30:57.787:658) : proctitle=mv
> > > -f /etc/resolv.conf.dhclient-new.46082 /etc/resolv.conf
> > > type=PATH msg=audit(28/02/2023 14:30:57.787:658) : item=3
> > > name=/etc/resolv.conf inode=786809 dev=fd:01 mode=file,644 ouid=root
> > > ogid=root rdev=00:00 nametype=DELETE cap_fp=none cap_fi=none cap_fe=0
> > > cap_fver=0 cap_frootid=0
> > > type=PATH msg=audit(28/02/2023 14:30:57.787:658) : item=2
> > > name=/etc/resolv.conf.dhclient-new.46082 inode=786811 dev=fd:01
> > > mode=file,644 ouid=root ogid=root rdev=00:00 nametype=DELETE
> > > cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0
> > > type=PATH msg=audit(28/02/2023 14:30:57.787:658) : item=1 name=/etc/
> > > inode=783361 dev=fd:01 mode=dir,755 ouid=root ogid=root rdev=00:00
> > > nametype=PARENT cap_fp=none cap_fi=none cap_fe=0 cap_fver=0
> > > cap_frootid=0
> > > type=PATH msg=audit(28/02/2023 14:30:57.787:658) : item=0 name=/etc/
> > > inode=783361 dev=fd:01 mode=dir,755 ouid=root ogid=root rdev=00:00
> > > nametype=PARENT cap_fp=none cap_fi=none cap_fe=0 cap_fver=0
> > > cap_frootid=0
> > > type=CWD msg=audit(28/02/2023 14:30:57.787:658) : cwd=/
> > > type=SYSCALL msg=audit(28/02/2023 14:30:57.787:658) : arch=x86_64
> > > syscall=rename success=no exit=EPERM(Opération non permise)
> > > a0=0x7ffc38ee0a9e a1=0x7ffc38ee0ac2 a2=0x0 a3=0xfffffffffffffa4d
> > > items=4 ppid=46082 pid=46093 auid=unset uid=root gid=root euid=root
> > > suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none)
> > > ses=unset comm=mv exe=/usr/bin/mv subj=unconfined key=(null)
> > > ----
> > > 
> > > Note : exit=EPERM can be ignored, because it happened while I made
> > > /etc/resolv.conf immutable
> > 
> > This appears to show mv -f /etc/resolv.conf.dhclient-new.46082
> > /etc/resolv.conf
> > failing, as expected.
> > 
> > > ------------
> > > 
> > > dave@debian-laptop:~$ less /var/log/syslog
> > > Feb 28 14:30:48 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 3
> > > Feb 28 14:30:51 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > enx8c8caa616f43 to 255.255.255.255 port 67 interval 5
> > > Feb 28 14:30:51 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 4
> > > Feb 28 14:30:55 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 11
> > > Feb 28 14:30:56 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > enx8c8caa616f43 to 255.255.255.255 port 67 interval 9
> > > Feb 28 14:30:57 debian-laptop dhclient[1003]: DHCPREQUEST for
> > > 192.168.1.192 on enp2s0f0 to 192.168.1.1 port 67
> > > Feb 28 14:30:57 debian-laptop dhclient[1003]: DHCPACK of 192.168.1.192
> > > from 192.168.1.1
> > > Feb 28 14:30:57 debian-laptop connmand[1038]: enp2s0f0 {add} address
> > > 192.168.1.192/24 label enp2s0f0 family 2
> > > Feb 28 14:30:57 debian-laptop avahi-daemon[1034]: Registering new
> > > address record for 192.168.1.192 on enp2s0f0.IPv4.
> > > Feb 28 14:30:57 debian-laptop dhclient[1003]: bound to 192.168.1.192
> > > -- renewal in 17866 seconds.
> > 
> > One might expect this to be where mv -f fails.
> 
> It is not what bothers me… making /etc/resolv.conf immutable is just a
> temporary work around, but necessary until I have a working solution

So you wrote, 35 lines above. But at the top you wrote, "showing a
process failing (as expected) to overwrite /etc/resolv.conf with logs
from syslog showing dhclient trying to renew DHCP leases". So this
is an expected DHCP as the lease has expired. Aren't we interested
in your "popup" DHCPs that occur sometimes just minutes apart.

> > > Feb 28 14:31:05 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > enx8c8caa616f43 to 255.255.255.255 port 67 interval 12
> > > Feb 28 14:31:06 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 11
> > > Feb 28 14:31:17 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > enx8c8caa616f43 to 255.255.255.255 port 67 interval 14
> > > Feb 28 14:31:17 debian-laptop systemd[1]: Started Run anacron jobs.
> > > Feb 28 14:31:17 debian-laptop anacron[46104]: Anacron 2.3 started on
> > > 2023-02-28
> > > Feb 28 14:31:17 debian-laptop anacron[46104]: Normal exit (0 jobs run)
> > > Feb 28 14:31:17 debian-laptop systemd[1]: anacron.service: Succeeded.
> > > Feb 28 14:31:17 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 13
> > > Feb 28 14:31:24 debian-laptop dhclient[39759]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 8
> > > Feb 28 14:31:30 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 8
> > > Feb 28 14:31:31 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > enx8c8caa616f43 to 255.255.255.255 port 67 interval 16
> > > Feb 28 14:31:32 debian-laptop dhclient[39759]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 17
> > > Feb 28 14:31:37 debian-laptop rtkit-daemon[1368]: Supervising 18
> > > threads of 8 processes of 1 users.
> > > Feb 28 14:31:37 debian-laptop rtkit-daemon[1368]: Supervising 18
> > > threads of 8 processes of 1 users.
> > > Feb 28 14:31:38 debian-laptop dhclient[2016]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 11
> > > Feb 28 14:31:47 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > enx8c8caa616f43 to 255.255.255.255 port 67 interval 5
> > > Feb 28 14:31:49 debian-laptop dhclient[2016]: No DHCPOFFERS received.
> > > Feb 28 14:31:49 debian-laptop dhclient[2016]: No working leases in
> > > persistent database - sleeping.
> > > Feb 28 14:31:49 debian-laptop dave:
> > > /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero
> > > exit status 1
> > > Feb 28 14:31:49 debian-laptop dhclient[39759]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 19
> > > Feb 28 14:31:52 debian-laptop dhclient[39759]: No DHCPOFFERS received.
> > > Feb 28 14:31:52 debian-laptop dhclient[39759]: No working leases in
> > > persistent database - sleeping.
> > > Feb 28 14:31:52 debian-laptop dave:
> > > /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero
> > > exit status 1
> > > Feb 28 14:32:08 debian-laptop dhclient[39759]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 15
> > > Feb 28 14:32:23 debian-laptop dhclient[39759]: DHCPDISCOVER on wlp3s0
> > > to 255.255.255.255 port 67 interval 2
> > > Feb 28 14:32:25 debian-laptop dhclient[39759]: No DHCPOFFERS received.
> > > Feb 28 14:32:25 debian-laptop dhclient[39759]: No working leases in
> > > persistent database - sleeping.
> > > Feb 28 14:32:25 debian-laptop dave:
> > > /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero
> > > exit status 1
> > > Feb 28 14:32:48 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > wlp3s0:avahi to 255.255.255.255 port 67 interval 8
> > > Feb 28 14:32:56 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > wlp3s0:avahi to 255.255.255.255 port 67 interval 11
> > > Feb 28 14:33:07 debian-laptop dhclient[39759]: DHCPDISCOVER on
> > > wlp3s0:avahi to 255.255.255.255 port 67 interval 12
> > > 
> > > ------------
> > 
> > There seems to be a process, 1003, that succeeds in obtaining an
> > IP address of 192.168.1.192 on ethernet, and another, 2016, that
> > has set up the wireless interface, but presumably has no
> > connection to the DHCP server. These processes have been running
> > since soon after booting, perhaps within seconds.
> > 
> > The third process, 39759, is obviously much more recent, and is
> > trying to get an address for the interface enx8c…, but also the
> > wifi interface at the same time as PID 2016. Perhaps this
> > process was started when connecting or disconnecting the vpn, IDK.
> 
> No it wasn't, the vpn was connected hours before that. and
> disconnected hours after that.

Well, that was a guess I got wrong. I wouldn't have had to try
guessing if you'd explained what you were doing in the post, rather
than leaving it until we've attempted to decode it, unravelling
all these interspersed DHCP runs.

For all I know, you could have been downing the VPN for coffee-,
lunch- and tea-time.

> This is exactly my point, thus the topic name "Forcing dhclient to not
> ignore tun0 interface when it's available"

I hadn't got to that yet. I was still looking at your "popup"
problem. But OK, let's look at that.

> vpn is up and running, tun0 is available and being used by almost
> everything on the coputer and it is working
> BUT dhclient sends dhcp requests throught all interfaces, even
> unused/down ones, except tun0
> 
> Wireless interface is down, disabled in the software level (not
> physical killswitch). I fail to see why
> dhclient try to use it to send DHCP requests at all, while the
> interface is down.
> 
> It blindly tries litteraly all interfaces except, interface tun0 and
> tun0 IS available at that time
> 
> dhclient even tries the weird long hexadecimal name interface, the one
> starting with enx8c.
> I just learned (yesterday actually) from a colleague that the work's
> issued laptop has a 4G modem.
> I don't use it, there's no SIM card in it, and the interface is down.
> enx8c being I namenclature I never seen before, on other laptops, I
> never used it or seen it UP on my work's laptop and it has "NO
> CARRIER"
> So my guess enx8c is the 4G modem

Presumably one of the hardware enuerators could confirm that;
inxi, lshw and so on.

> > In any case, when both 2016 (at 31:49) and 39759 (at 32:25) fail
> > to get leases, it looks as if avahi-autoipd kicks in, and 39759
> > starts sending to the link-local address.
> > 
> > All this is guesswork: I don't run ifupdown, dhclient or
> > avahi-autoipd, so my logs aren't comparable in many respects.
> > Also, the immutable resolv.conf might cause subsequent commands
> > that consult it to go awry.
> 
> I wouldn't make it immutable temprarily, if I wasn't forced to
> Until I find a working solution, I have no choice when it comes to
> making resolv.conf immutable
> Sometimes dhclient overwrites several times in 20 minutes… or a few
> secondes after I stop openclient,
> and execute it again, so it rewrites proper /etc/resolv.conf… which
> then doesn't last depending on timing…

As I wrote a while back, I think you need to install a resolv.conf
manager like openresolv. (I'm thinking of trying to configure
systemd-resolved again, which I think I screwed up last time.)
The idea is that addresses on your 192.168.1.0 network are
resolved locally even while your VPN is up, but addresses outside
that are resolved by the "works" nameservers through the tunnel.

Cheers,
David.

Reply via email to