On 13-09-18 05:57, Fajar A. Nugraha wrote:
> On Wed, Sep 12, 2018 at 9:33 PM, Kees Bakker <[email protected]> wrote:
>
> > Hey,
> >
> > This is with LXD/LXC on an Ubuntu 18.04 server. Storage is done
> > with LVM. It was installed as a cluster with just one node.
> > It was also added as a remote for three other LXD servers (all
> > Ubuntu 16.04 and LXD 2.0.x). These old servers have BTRFS storage.
>
> Only added as a remote? Not LXD clustering
> (https://lxd.readthedocs.io/en/latest/clustering/)?

Yes, only as a remote.

> > Suddenly I cannot run any lxc command anymore. They all give
> >
> >     Error: failed to begin transaction: database is locked
> >
> > In /var/log/lxd/lxd.log it prints the following message every 10 seconds
> >
> >     lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T16:28:44+0200
> >
> > Extra information. This afternoon I upgraded one of the "old" servers
> > to LXD 3.0 (from xenial-backports). This was triggered by the problems we
> > have with a container in ERROR state and a kworker at 100% CPU load.
>
> Do package versions on upgraded servers match? I.e. are lxd, liblxc1, etc. all
> 3.0 from xenial-backports, without any 2.x or PPA packages mixed in?
>
> Have you restarted lxd on the upgraded server?

Not manually, no.

The upgrade wasn't totally smooth. It ran into a timeout while setting up
some lxc network config. Then I did a reboot, and the shutdown was hanging
on something with ebtables (a new package because of the move to 3.0).
I forced a power-down and luckily the server came up normally.

After that I noticed the problem described above.

Restarting the lxd server solved it, and it is back to normal.
(( I didn't know for sure that the LXD server can be restarted
without killing the containers. But it worked. ))

Here are a few lines from lxd.log at the time it started giving the problem.

lvl=info msg="Raft: Snapshot to 597621 complete" t=2018-09-12T15:47:01+0200
lvl=info msg="Raft: Starting snapshot up to 597696" t=2018-09-12T15:52:15+0200
lvl=info msg="Raft: Compacting logs from 597494 to 597568" t=2018-09-12T15:52:16+0200
lvl=info msg="Raft: Snapshot to 597696 complete" t=2018-09-12T15:52:16+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:56:55+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:04+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:13+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:22+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:31+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:40+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:49+0200
lvl=info msg="Raft: Starting snapshot up to 597760" t=2018-09-12T15:57:52+0200
lvl=warn msg="Raft: Unable to get address for server id 1, using fallback address 0: failed to begin transaction: database is locked" t=2018-09-12T15:57:57+0200
lvl=info msg="Raft: Compacting logs from 597569 to 597632" t=2018-09-12T15:57:57+0200
lvl=info msg="Raft: Snapshot to 597760 complete" t=2018-09-12T15:57:57+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:57:58+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:58:07+0200
lvl=warn msg="Failed to get current raft nodes: failed to fetch raft server address: failed to begin transaction: database is locked" t=2018-09-12T15:58:16+0200

> If you temporarily move ~/.config/lxc somewhere else (to "remove" all the
> remotes, among other things), does the lxc command work?

I'll remember that for next time. Right now the server is working again.
Thanks
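
For anyone who wants to check how often the "database is locked" warnings recur in a log like the one above, here is a minimal sketch in Python. It assumes only the key=value layout (lvl, msg, t) visible in those lines; the function names parse_line, locked_warnings and gaps_seconds are made up for illustration and are not part of LXD.

```python
import re
from datetime import datetime

# One key=value pair; the value is either a double-quoted string or a bare token.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Split one log line into a dict, dropping the quotes around quoted values."""
    record = {}
    for key, value in PAIR.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        record[key] = value
    return record


def locked_warnings(lines):
    """Return the timestamps of warn-level lines that mention 'database is locked'."""
    stamps = []
    for line in lines:
        rec = parse_line(line)
        if rec.get("lvl") == "warn" and "database is locked" in rec.get("msg", ""):
            # Timestamps look like 2018-09-12T15:56:55+0200
            stamps.append(datetime.strptime(rec["t"], "%Y-%m-%dT%H:%M:%S%z"))
    return stamps


def gaps_seconds(stamps):
    """Seconds between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Given three consecutive warnings stamped 15:56:55, 15:57:04 and 15:57:13, gaps_seconds(locked_warnings(lines)) returns [9.0, 9.0]. Info lines are skipped, so the Raft snapshot messages do not affect the result.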
_______________________________________________
lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users
