I find the MariaDB periodically stopping in my current configuration.
It stops and it doesn’t restart on its own. I can start it again
without difficulty using ‘systemctl start mariadb’.
I have the impression that it is a weekly occurrence. I thought it
was every Saturday, but when I paid attention, it occurred Friday,
June 5 and then again on Thursday, June 11. I noticed that the time,
06:02, is consistent.
I did review ‘systemctl status mariadb’ while it was down. I failed
to paste it here and it is lost for now, but it only indicated that
the service was stopping without offering a reason. The entry
occurred at 06:02:xx and led me to check other logs for more
information at that time. At the end of this message is some log data
surrounding the event, but all I see is confirmation that the database
became unavailable. Hopefully, there’s some place else to look for a
clue.
I’ve noticed that I should potentially be using the service name
packetfence-mariadb, not just mariadb. Currently, with the service
successfully started, ‘systemctl status mariadb’ and ‘systemctl status
packetfence-mariadb’ both successfully return the same results. And,
I’ve since realized I can start, stop and get status using either of
those service names.
Before sending this message, I’ve dug a bit further. I find that the
file /etc/systemd/system/packetfence-mariadb.service is improperly
linked to /lib/systemd/system/mariadb.service. I think it should be
linked to /lib/systemd/system/packetfence-mariadb.service. I believe
I created the improper link during some earlier testing of the
PacketFence upgrade process. It was a difficult experience to
complete the upgrade for some reason and, in the end, I removed all
the packages and reinstalled them and struggled to get it all working
again. I did so, but along the way I believe I changed that symbolic
link to get the database to come up.
This suggests to me that maybe PacketFence is not managing the mariadb
as it should.
I’ve tried changing the link to
/etc/systemd/system/packetfence-mariadb.service ->
/lib/systemd/system/packetfence-mariadb.service. There are errors and
the database will not start in this configuration. The error states
the /usr/local/pf/var/conf/mariadb.conf file is not found. When I
have the opportunity to check (I reboot when I make changes to the
links and such) the file is there. I’ve learned the defaults file is
created during database startup by ‘pfcmd generatemariadbconfig’ and
that process is reported to be successful.
● packetfence-mariadb.service - PacketFence MariaDB instance
Loaded: loaded (/lib/systemd/system/packetfence-mariadb.service;
enabled; vendor preset: enabled)
Active: activating (start) since Thu 2020-06-11 11:09:22 EDT; 30s ago
Process: 1011 ExecStartPre=/usr/local/pf/bin/pfcmd
generatemariadbconfig (code=exited, status=0/SUCCESS)
Main PID: 1019 (pf-mariadb)
Tasks: 7 (limit: 36864)
CGroup:
/packetfence.slice/packetfence-base.slice/packetfence-mariadb.service
├─ 1019 pf-mariadb
├─14588 /bin/bash /usr/bin/mysqld_safe
--defaults-file=/usr/local/pf/var/conf/mariadb.conf
└─14590 /bin/bash /usr/bin/mysqld_safe
--defaults-file=/usr/local/pf/var/conf/mariadb.conf
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: Fatal error in
defaults handling. Program aborted
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: WARNING: Defaults
file '/usr/local/pf/var/conf/mariadb.conf' not found!
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: Could not open
required defaults file: /usr/local/pf/var/conf/mariadb.conf
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: Fatal error in
defaults handling. Program aborted
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: WARNING: Defaults
file '/usr/local/pf/var/conf/mariadb.conf' not found!
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: 200611 11:09:53
mysqld_safe Logging to '/var/lib/mysql/nadc1-pfence-01.err'.
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: 200611 11:09:53
mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: Failed starting with
mode: standalone
Jun 11 11:09:53 nadc1-pfence-01 pf-mariadb[1019]: MariaDB is not alive
Here’s the folder listing showing the mariadb.conf in existence.
Created at the same time as the service start was attempted. You
might notice the permissions of the file are 0722, not 0622 as
expected. It’s something I tried by modifying generatemariadbconfig.
I only changed (increased) the permissions. It had no effect.
root@nadc1-pfence-01:/usr/local/pf/logs# ls -al /usr/local/pf/var/conf/
total 152
drwxrwsr-x 4 pf pf 4096 Jun 11 11:12 .
drwxrwsr-x 16 pf pf 4096 Jun 11 11:09 ..
-rwxrwxr-x 1 pf pf 1485 Jun 9 16:11 captive-portal-common
-rwxrwxr-x 1 pf pf 71 Nov 26 2019 .gitignore
-rwxrwxr-x 1 pf pf 1279 Jun 9 16:15 haproxy-db.conf
-rwxrwxr-x 1 pf pf 8275 Jun 11 11:12 haproxy-portal.conf
drwxrwsr-x 2 pf pf 4096 Feb 5 10:32 httpd.conf.d
-rwxrwxr-x 1 pf pf 13399 Jun 9 09:59 iptables.conf
-rwxrwxr-x 1 pf pf 1889 Jun 9 10:03 keepalived.conf
-rwxr--r-- 1 pf pf 3888 Jun 11 11:09 mariadb.conf
drwxrwsr-x 7 pf pf 4096 Jan 13 11:56 monitoring
-rwxrwxr-x 1 pf pf 1323 Jun 9 16:11 passthrough.lua
-rwxrwxr-x 1 pf pf 395 Jun 9 10:03 pfdns.conf
-rwxrwxr-x 1 pf pf 36446 Jun 9 10:04 redis_ntlm_cache.conf
-rwxrwxr-x 1 pf pf 36354 Jun 9 10:04 redis_queue.conf
-rwxrwxr-x 1 pf pf 632 Jun 11 06:01 snmptrapd.conf
-rwxrwxr-x 1 pf pf 451 Jun 9 16:11 ssl-certificates.conf
I compared with a different instance of PacketFence and find that the
symbolic link /etc/systemd/system/packetfence-mariadb.service ->
/lib/systemd/system/packetfence-mariadb.service shouldn’t exist at
all. This symbolic link actually exists as
/etc/systemd/system/*packetfence-base.target.wants*/packetfence-mariadb.service->/lib/systemd/system/packetfence-mariadb.service.
That, and there’s a symbolic link as
/etc/systemd/system/mariadb.service -> /dev/null. So, I’ve created
these conditions and get exactly the same result. The database fails
to start because the defaults file cannot be found or accessed.
The main process for starting the database appears to be pf-mariadb.
This process is running as root.
*There doesn’t seem to be an issue with permissions. The defaults
file does exist. Any thoughts on where I can go from here to try and
resolve the issue? To get PacketFence properly managing the MariaDB
service?*
Thank you.
Jeff
Log data at the time of the database shutdown:
Packetfence.log
---------------------
Jun 5 06:00:36 nadc1-pfence-01 packetfence: INFO -e(81425):
generating /usr/local/pf/var/conf/snmptrapd.conf
(pf::services::manager::snmptrapd::generateConfig)
Jun 5 06:00:36 nadc1-pfence-01 packetfence: INFO -e(81425):
generating /usr/local/pf/var/conf/snmptrapd.conf
(pf::services::manager::snmptrapd::generateConfig)
Jun 5 06:01:01 nadc1-pfence-01 pfipset[38826]:
t=2020-06-05T06:01:01-0400 lvl=info msg="No Inline Network bypass
ipsets reload" pid=38826
Jun 5 06:01:01 nadc1-pfence-01 pfipset[38826]:
t=2020-06-05T06:01:01-0400 lvl=info msg="No Inline Network bypass
ipsets reload" pid=38826
Jun 5 06:02:12 nadc1-pfence-01 packetfence: INFO -e(81578):
generating /usr/local/pf/var/conf/snmptrapd.conf
(pf::services::manager::snmptrapd::generateConfig)
Jun 5 06:02:12 nadc1-pfence-01 packetfence: INFO -e(81578):
generating /usr/local/pf/var/conf/snmptrapd.conf
(pf::services::manager::snmptrapd::generateConfig)
Jun 5 06:02:50 nadc1-pfence-01 pfqueue: pfqueue(80748) WARN:
[mac:unknown] database query failed with: MySQL server has gone away
(errno: 2006), will try again (pf::dal::db_execute)
Jun 5 06:02:50 nadc1-pfence-01 pfqueue: pfqueue(80748) WARN:
[mac:unknown] database query failed with: MySQL server has gone away
(errno: 2006), will try again (pf::dal::db_execute)
Jun 5 06:02:50 nadc1-pfence-01 pfqueue: pfqueue(80748) WARN:
[mac:unknown] database query failed with: MySQL server has gone away
(errno: 2006), will try again (pf::dal::db_execute)
Jun 5 06:02:50 nadc1-pfence-01 pfqueue: pfqueue(80748) ERROR:
[mac:unknown] Problem looking up tenant ID (1) in database
(pf::dal::set_tenant)
Jun 5 06:02:50 nadc1-pfence-01 pfqueue: pfqueue(80748) FATAL:
[mac:unknown] unable to connect to database: Can't connect to local
MySQL server through socket '/var/run/mysqld/mysqld.sock' (2 "No such
file or directory") at /usr/share/perl5/CHI/Driver.pm line 546.
Jun 5 06:02:50 nadc1-pfence-01 pfqueue: pfqueue(80748) FATAL:
[mac:unknown] unable to connect to database: Can't connect to local
MySQL server through socket '/var/run/mysqld/mysqld.sock' (2 "No such
file or directory") at /usr/local/pf/lib/pf/dal.pm line 69.
mariadb_error.log
------------------------
Empty
Pfmon.log
--------------
Jun 5 06:02:39 nadc1-pfence-01 pfmon: pfmon(80800) INFO:
[mac:unknown] deleted 0 entries from chi_cache for batch_delete
(1591351359 1591351359) (pf::dal::batch_remove)
Jun 5 06:02:48 nadc1-pfence-01 pfmon: pfmon(80800) INFO:
[mac:unknown] deleted 0 entries from chi_cache for batch_delete
(1591351368 1591351368) (pf::dal::batch_remove)
Jun 5 06:02:48 nadc1-pfence-01 pfmon: pfmon(80800) INFO:
[mac:unknown] 80800 cleanup_chi_database_cache shutting down
(main::_runtask)
Jun 5 06:02:48 nadc1-pfence-01 pfmon: pfmon(80800) INFO:
[mac:unknown] deleted 0 entries from chi_cache for batch_delete
(1591351368 1591351368) (pf::dal::batch_remove)
Jun 5 06:02:48 nadc1-pfence-01 pfmon: pfmon(80800) INFO:
[mac:unknown] 80800 cleanup_chi_database_cache shutting down
(main::_runtask)
Jun 5 06:02:56 nadc1-pfence-01 pfmon: pfmon(78149) FATAL:
[mac:unknown] unable to connect to database: Can't connect to local
MySQL server through socket '/var/run/mysqld/mysqld.sock' (2 "No such
file or directory") at /usr/share/perl5/CHI/Driver.pm line 546.
Jun 5 06:02:56 nadc1-pfence-01 pfmon: pfmon(78149) FATAL:
[mac:unknown] unable to connect to database: Can't connect to local
MySQL server through socket '/var/run/mysqld/mysqld.sock' (2 "No such
file or directory") at /usr/share/perl5/CHI/Driver.pm line 546.
Jun 5 06:02:57 nadc1-pfence-01 pfmon: pfmon(81402) FATAL:
[mac:unknown] unable to connect to database: Can't connect to local
MySQL server through socket '/var/run/mysqld/mysqld.sock' (2 "No such
file or directory") at /usr/share/perl5/CHI/Driver.pm line 546.
Jun 5 06:02:57 nadc1-pfence-01 pfmon: pfmon(81402) INFO:
[mac:unknown] getting security_events triggers for accounting cleanup
(pf::accounting::acct_maintenance)
Jun 5 06:02:57 nadc1-pfence-01 pfmon: pfmon(81402) FATAL:
[mac:unknown] unable to connect to database: Can't connect to local
MySQL server through socket '/var/run/mysqld/mysqld.sock' (2 "No such
file or directory") at /usr/share/perl5/CHI/Driver.pm line 546.
Pfstats.log
-------------
Jun 5 06:02:23 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:23-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/queues/stats" pid=39357
Jun 5 06:02:30 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:30-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/queues/stats" pid=39357
Jun 5 06:02:30 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:30-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/queues/stats" pid=39357
Jun 5 06:02:40 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:40-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/queues/stats" pid=39357
Jun 5 06:02:40 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:40-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/queues/stats" pid=39357
Jun 5 06:02:48 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:48-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/dhcp/stats" pid=39357
Jun 5 06:02:48 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:48-0400 lvl=info msg="Calling Unified API on uri:
https://127.0.0.1:9999/api/v1/dhcp/stats" pid=39357
Jun 5 06:02:50 nadc1-pfence-01 pfstats[39357]: [mysql] 2020/06/05
06:02:50 packets.go:112: closing bad idle connection: unexpected read
from socket
Jun 5 06:02:50 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:50-0400 lvl=eror msg="Error while performing SQL
query: dial unix /var/run/mysqld/mysqld.sock: connect: no such file or
directory" pid=39357
Jun 5 06:02:50 nadc1-pfence-01 pfstats: [mysql] 2020/06/05 06:02:50
packets.go:112: closing bad idle connection: unexpected read from socket
Jun 5 06:02:50 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:50-0400 lvl=eror msg="Error while performing SQL
query: dial unix /var/run/mysqld/mysqld.sock: connect: no such file or
directory" pid=39357
Jun 5 06:02:51 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:51-0400 lvl=eror msg="Error while performing SQL
query: dial unix /var/run/mysqld/mysqld.sock: connect: no such file or
directory" pid=39357
Jun 5 06:02:51 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:51-0400 lvl=eror msg="Error while performing SQL
query: dial unix /var/run/mysqld/mysqld.sock: connect: no such file or
directory" pid=39357
Jun 5 06:02:52 nadc1-pfence-01 pfstats[39357]:
t=2020-06-05T06:02:52-0400 lvl=eror msg="Error while performing SQL
query: dial unix /var/run/mysqld/mysqld.sock: connect: no such file or
directory" pid=39357
/var/log/mysql/error.log
---------------------------------
Empty
/var/log/daemon.log
----------------------------
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: Time has been changed
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 7h
34min 29.704814s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: anacron.timer: Adding
42.370601s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd: Time has been changed
Jun 5 06:02:49 nadc1-pfence-01 systemd: apt-daily.timer: Adding 7h
34min 29.704814s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd: anacron.timer: Adding
42.370601s random time.
Jun 5 06:02:51 nadc1-pfence-01 systemd[1]: Stopped MariaDB 10.1.41
database server.
Jun 5 06:02:51 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:51 nadc1-pfence-01 systemd: Stopped MariaDB 10.1.41
database server.
Jun 5 06:02:51 nadc1-pfence-01 systemd: Reloading.
Jun 5 06:02:52 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 1h
9min 19.806442s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 1min
21.451889s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:52 nadc1-pfence-01 systemd: apt-daily.timer: Adding 1h
9min 19.806442s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd: anacron.timer: Adding 1min
21.451889s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd: Reloading.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 5h
51min 13.477790s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 3min
11.205977s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: Reloading.
/var/log/syslog
-------------------
Jun 5 06:02:48 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 2h
15min 30.360084s random time.
Jun 5 06:02:48 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 3min
32.833603s random time.
Jun 5 06:02:48 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:48 nadc1-pfence-01 systemd: apt-daily.timer: Adding 2h
15min 30.360084s random time.
Jun 5 06:02:48 nadc1-pfence-01 systemd: anacron.timer: Adding 3min
32.833603s random time.
Jun 5 06:02:48 nadc1-pfence-01 systemd: Reloading.
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding
10h 7min 24.310921s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 3min
36.452655s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: Stopping MariaDB 10.1.41
database server...
Jun 5 06:02:49 nadc1-pfence-01 systemd: apt-daily.timer: Adding 10h
7min 24.310921s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd: anacron.timer: Adding 3min
36.452655s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd: Stopping MariaDB 10.1.41
database server...
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: Time has been changed
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 7h
34min 29.704814s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd[1]: anacron.timer: Adding
42.370601s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd: Time has been changed
Jun 5 06:02:49 nadc1-pfence-01 systemd: apt-daily.timer: Adding 7h
34min 29.704814s random time.
Jun 5 06:02:49 nadc1-pfence-01 systemd: anacron.timer: Adding
42.370601s random time.
Jun 5 06:02:51 nadc1-pfence-01 systemd[1]: Stopped MariaDB 10.1.41
database server.
Jun 5 06:02:51 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:51 nadc1-pfence-01 systemd: Stopped MariaDB 10.1.41
database server.
Jun 5 06:02:51 nadc1-pfence-01 systemd: Reloading.
Jun 5 06:02:52 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 1h
9min 19.806442s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 1min
21.451889s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:52 nadc1-pfence-01 systemd: apt-daily.timer: Adding 1h
9min 19.806442s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd: anacron.timer: Adding 1min
21.451889s random time.
Jun 5 06:02:52 nadc1-pfence-01 systemd: Reloading.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding 5h
51min 13.477790s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 3min
11.205977s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:53 nadc1-pfence-01 systemd: apt-daily.timer: Adding 5h
51min 13.477790s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd: anacron.timer: Adding 3min
11.205977s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd: Reloading.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: apt-daily.timer: Adding
15min 34.916876s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd[1]: anacron.timer: Adding 1min
5.946330s random time.
Jun 5 06:02:54 nadc1-pfence-01 systemd[1]: Reloading.
Jun 5 06:02:53 nadc1-pfence-01 systemd: apt-daily.timer: Adding 15min
34.916876s random time.
Jun 5 06:02:53 nadc1-pfence-01 systemd: anacron.timer: Adding 1min
5.946330s random time.
Jun 5 06:02:54 nadc1-pfence-01 systemd: Reloading.
PRIVACY NOTICE: The information contained in this e-mail, including
any attachments, is confidential and intended only for the named
recipient(s). Unauthorized use, disclosure, forwarding, or copying is
strictly prohibited and may be unlawful. If you are not the intended
recipient, please delete the e-mail and any attachments and notify us
immediately by return e-mail.
_______________________________________________
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users