Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: keepalived (Ubuntu)
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1792298

Title:
  keepalived: MISC healthchecker's exit status is erroneously treated as
  a permanent error

Status in keepalived package in Ubuntu:
  Confirmed

Bug description:
  1) The release of Ubuntu we are using
  $ lsb_release -rd
  Description:    Ubuntu 16.04.5 LTS
  Release:        16.04

  2) The version of the package we are using
  $ apt-cache policy keepalived
  keepalived:
    Installed: 1:1.2.24-1ubuntu0.16.04.1
  ...

  3) What we expected to happen
  A MISC healthchecker's exit status would be interpreted as documented
  (2-255 meaning the check succeeded), not as a keepalived error code.

  4) What happened instead
  We are using Ubuntu 16.04's keepalived with our own MISC healthchecker,
  which exits with exit code 3, and we get the following log messages
  endlessly:

  --- Note: some IP fields are masked ---
  Sep 12 06:55:09 devsvr Keepalived[16705]: Healthcheck child process(34232) died: Respawning
  Sep 12 06:55:09 devsvr Keepalived[16705]: Starting Healthcheck child process, pid=34239
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Initializing ipvs
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Registering Kernel netlink reflector
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Registering Kernel netlink command channel
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Opening file '/etc/keepalived/keepalived.conf'.
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Using LinkWatch kernel netlink reflector...
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Activating healthchecker for service [XX.XX.XX.18]:80
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Activating healthchecker for service [XX.XX.XX.19]:80
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Activating healthchecker for service [XX.XX.XX.18]:443
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Activating healthchecker for service [XX.XX.XX.19]:443
  ...
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Activating healthchecker for service [XX.XX.XX.52]:443
  Sep 12 06:55:09 devsvr Keepalived_healthcheckers[34239]: Activating healthchecker for service [XX.XX.XX.53]:443
  Sep 12 06:55:10 devsvr Keepalived_healthcheckers[34239]: pid 34257 exited with permanent error CONFIG. Terminating
  Sep 12 06:55:10 devsvr Keepalived_healthcheckers[34239]: Removing service [XX.XX.XX.24]:25 from VS [YY.YY.YY.YY]:0
  Sep 12 06:55:10 devsvr Keepalived_healthcheckers[34239]: Removing service [XX.XX.XX.25]:25 from VS [YY.YY.YY.YY]:0
  Sep 12 06:55:10 devsvr Keepalived_healthcheckers[34239]: Removing service [XX.XX.XX.21]:56667 from VS [ZZ.ZZ.ZZ.ZZ]:0
  Sep 12 06:55:10 devsvr Keepalived_healthcheckers[34239]: Removing service [XX.XX.XX.52]:443 from VS [WW.WW.WW.WW]:0
  Sep 12 06:55:10 devsvr Keepalived[16705]: Healthcheck child process(34239) died: Respawning
  Sep 12 06:55:10 devsvr Keepalived[16705]: Starting Healthcheck child process, pid=34260
  ...
  ---

  It looks like our MISC healthchecker's exit code 3, which should be a
  valid value according to the following description, is treated as a
  permanent error since it is equal to KEEPALIVED_EXIT_CONFIG defined in
  keepalived's lib/scheduler.h:

  ---
             # MISC healthchecker, run a program
             MISC_CHECK
             {
                 # External script or program
                 ...
                 #   exit status 2-255: svc check success, weight
                 #     changed to 2 less than exit status.
                 #   (for example: exit status of 255 would set
                 #     weight to 253)
                 misc_dynamic
             }
  ---
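
The collision can be made concrete with a small sketch. It assumes only what this report states: with misc_dynamic, an exit status of 2-255 means the check succeeded and the weight becomes the status minus 2, and KEEPALIVED_EXIT_CONFIG equals 3. The name `dynamic_weight` is hypothetical and not part of keepalived; exit statuses 0 and 1 fall in the elided part of the description and are deliberately not modelled.

```python
# Value taken from this report: exit code 3 equals KEEPALIVED_EXIT_CONFIG.
KEEPALIVED_EXIT_CONFIG = 3


def dynamic_weight(exit_status: int) -> int:
    """Weight implied by a misc_dynamic exit status in the range 2-255."""
    if not 2 <= exit_status <= 255:
        # Statuses outside 2-255 are not covered by the quoted description.
        raise ValueError("only exit statuses 2-255 are modelled here")
    return exit_status - 2


if __name__ == "__main__":
    status = 3  # the exit code our healthchecker uses
    print("weight requested by the checker:", dynamic_weight(status))
    print("equals KEEPALIVED_EXIT_CONFIG:", status == KEEPALIVED_EXIT_CONFIG)
```

Under these assumptions, exit status 3 is a legitimate request for weight 1, yet it is numerically identical to the code keepalived uses to signal a permanent configuration error.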

  We think the problem started with this patch (we did not see the
  problem in Ubuntu 14.04):
    Stop respawning children repeatedly after permanent error
    - https://github.com/acassen/keepalived/commit/4ae9314af448eb8ea4f3d8ef39bcc469779b0fec

  The problem will be fixed by this patch (not included in Ubuntu 16.04):
    Make report_child_status() check for vrrp and checker child processes
    - https://github.com/acassen/keepalived/commit/ca955a7c1a6af324428ff04e24be68a180be127f

  Please consider backporting it to Ubuntu 16.04's keepalived.
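
Judging only from its title, the fix makes report_child_status() distinguish keepalived's own vrrp and checker child processes from other children such as MISC scripts. The sketch below illustrates that idea; it is an assumption about the intent, not the upstream code, and `is_permanent_error` and `daemon_child_pids` are hypothetical names.

```python
KEEPALIVED_EXIT_CONFIG = 3  # value as stated in this report


def is_permanent_error(pid: int, exit_status: int, daemon_child_pids: set) -> bool:
    """Treat an exit status as a keepalived error code only for daemon children."""
    if pid not in daemon_child_pids:
        # Any other child (for example a MISC check script): its exit
        # status carries check/weight meaning, not a keepalived error.
        return False
    return exit_status == KEEPALIVED_EXIT_CONFIG
```

With illustrative pids, `is_permanent_error(200, 3, {100})` is False for a script child, while `is_permanent_error(100, 3, {100})` is True for the daemon child itself.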

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1792298/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp
