>>> Donat Zenichev <donat.zenic...@gmail.com> wrote on 21.10.2019 at 09:12 in
message <CANLwQCn2MC60R9LpVqaz85w7-ozDvYKmgiNqjx-LZRXo+m=x...@mail.gmail.com>:
> Hello, and sorry for my very late response; I somehow missed your answer.
>
> Sure, let me share some useful information on that.
> First of all, the system-specific details are:
> - Hypervisor: a standard VMware product, vSphere
> - VM OS: Ubuntu 18.04 LTS
> - Pacemaker version: 1.1.18-0ubuntu1.1
>
> And yes, it's iproute2, version 4.15.0-2ubuntu1
>
> It should be mentioned that since I moved to another way of handling this
> (setting failure-timeout) I haven't seen any errors so far; the on-fail action
> still remains "restart".
> But that's expected: failure-timeout just clears all fail counters for me, so
> I don't see any failures now.

Failures should still be logged in the logfiles. Also, failure-timeout does not
prevent a restart on failure; it just extends the number of restart attempts.

>
> Another thing to mention: the monitor operation of the IPaddr2 resource was
> also failing in past years; I just didn't pay much attention to it.
> At that time the VMs under my control were running Ubuntu 14.04 and the
> hypervisor was Proxmox, branch 5+ (I cannot remember the exact version,
> perhaps 5.4+).
>
> For some this could indeed be a critical case, since the absence of an
> IP address (e.g. for a DB handling hundreds of thousands of
> SQL requests) can sometimes lead to a huge outage.
> I don't have the first idea of how to investigate this further. But I have
> a staging setup where my hands are not tied, so let me know if we can
> research something.

We had a similar case with the NFS server, and I added a script that does the
same monitoring as the RA, but logs what the command outputs whenever the output
changes. Unfortunately I have not seen the error since I added the script ;-)

>
> And have a nice day!

Regards,
Ulrich

>
> On Mon, Oct 7, 2019 at 7:21 PM Jan Pokorný <jpoko...@redhat.com> wrote:
>
> >> Donat,
> >>
> >> On 07/10/19 09:24 -0500, Ken Gaillot wrote:
> >> > If this always happens when the VM is being snapshotted, you can put
> >> > the cluster in maintenance mode (or even unmanage just the IP
> >> > resource) while the snapshotting is happening. I don't know of any
> >> > reason why snapshotting would affect only an IP, though.
> >>
> >> it might be interesting if you could share the details to grow the
> >> shared knowledge and experience in case there are some instances of
> >> these problems reported in the future.
> >>
> >> In particular, it'd be interesting to hear:
> >>
> >> - hypervisor
> >>
> >> - VM OS + if plain oblivious to running virtualized,
> >>   or "the optimal arrangement" (e.g., specialized drivers, virtio,
> >>   "guest additions", etc.)
> >>
> >> (I think IPaddr2 is iproute2-only, hence in turn, VM OS must be Linux)
> >>
> >> Of course, there might be more specific things to look at if anyone
> >> here is an expert with particular hypervisor technology and the way
> >> the networking works with it (no, not me at all).
> >>
> >> --
> >> Poki
> >> _______________________________________________
> >> Manage your subscription:
> >> https://lists.clusterlabs.org/mailman/listinfo/users
> >>
> >> ClusterLabs home: https://www.clusterlabs.org/
>
>
>
> --
>
> Best regards,
> Donat Zenichev
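
For anyone wanting to try the change-logging check mentioned in the reply above (a script that repeats the RA's monitoring and logs the command output when it changes), here is a minimal sketch in Python. It is not Ulrich's actual script. The command, the interface name eth0, the polling interval, and the log file name are all assumptions chosen for illustration, and the `ip` invocation only approximates what the IPaddr2 monitor looks at.

```python
#!/usr/bin/env python3
"""Log the output of a monitoring command whenever it changes."""
import logging
import subprocess
import time

# Assumption: IPaddr2 relies on iproute2, so listing IPv4 addresses with `ip`
# roughly mirrors what its monitor inspects. "eth0" is a hypothetical name.
COMMAND = ["ip", "-o", "-f", "inet", "addr", "show", "dev", "eth0"]
INTERVAL_SECONDS = 10  # hypothetical polling interval


def run_command(command):
    """Run the command; return its exit status and combined output as text."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # works on Python 3.6 (Ubuntu 18.04)
    )
    return "rc={}\n{}".format(result.returncode, result.stdout)


def check_once(previous, runner=run_command, log=logging.warning):
    """Run one check and log old and new output if they differ."""
    current = runner(COMMAND)
    if previous is not None and current != previous:
        log(
            "monitor output changed:\n--- before ---\n%s--- after ---\n%s",
            previous,
            current,
        )
    return current


def watch(iterations=None):
    """Poll forever (or `iterations` times), logging changes to a file."""
    logging.basicConfig(
        filename="ip-monitor.log",  # hypothetical log path
        level=logging.INFO,
        format="%(asctime)s %(message)s",
    )
    previous = None
    count = 0
    while iterations is None or count < iterations:
        previous = check_once(previous)
        time.sleep(INTERVAL_SECONDS)
        count += 1

# Call watch() to start polling; it is left uncalled so the file can be imported.
```

The first run only records a baseline; nothing is logged until the output differs from the previous poll, so the log stays quiet unless the address (or the exit status of `ip`) actually changes.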