[Ubuntu-ha] [Bug 1884149] Update Released

2020-07-13 Thread Łukasz Zemczak
The verification of the Stable Release Update for haproxy has completed
successfully and the package is now being released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1884149

Title:
  haproxy crashes in __pool_get_first if unique-id-header is used

Status in HAProxy:
  Fix Released
Status in haproxy package in Ubuntu:
  Fix Released
Status in haproxy source package in Bionic:
  Fix Released
Status in haproxy package in Debian:
  Fix Released

Bug description:
  [Impact]

   * The handling of locks in haproxy led to a state where, between idle
     HTTP connections, a connection could be indicated as destroyed. In that
     case the code went on and accessed the just-freed resource. As upstream
     puts it: "It can have random implications between requests as it may
     lead a wrong connection's polling to be re-enabled or disabled for
     example, especially with threads."

   * Backport the fix from upstream's 1.8 stable branch

  [Test Case]

   * It is a race and might be hard to trigger. A haproxy config that fronts
     three webservers can be seen below; a rough load-generation sketch
     follows this list. Setting up three apaches locally didn't trigger the
     same bug, but we know it is timing sensitive.

   * Simon (anbox) has a setup which reliably triggers this and will run the
     tests there.

   * The bad case will trigger a crash as reported below.
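
   * A rough sketch of how one might try to provoke the race locally (an
     assumption for illustration, not the reproducer Simon uses): point the
     config shown further below at three backends and hammer the frontend
     with many parallel keep-alive connections so connections are constantly
     created, idled and torn down, e.g. with ApacheBench:

     $ sudo apt install apache2-utils
     $ ab -n 100000 -c 200 -k http://127.0.0.1:80/

     Without the fix this may (but is not guaranteed to) end in a crash like
     the backtrace below; with the fix haproxy should survive the run.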

  [Regression Potential]

   * This change is in >=Disco and has no further bugs reported against it
     (no follow-on change), which should make it rather safe. Also, no other
     change has touched that file context on the 1.8 stable branch since
     then. The change is in the locking of connections, so if we expect
     regressions they would be in the handling of concurrent connections.

  [Other Info]
   
   * Strictly speaking it is a race, so triggering it depends on load and
     machine CPU/IO speed.

  
  ---

  
  Version 1.8.8-1ubuntu0.10 of haproxy in Ubuntu 18.04 (bionic) crashes with

  

  Thread 2.1 "haproxy" received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0xf77b1010 (LWP 17174)]
  __pool_get_first (pool=0xaac6ddd0, pool=0xaac6ddd0) at 
include/common/memory.h:124
  124   include/common/memory.h: No such file or directory.
  (gdb) bt
  #0  __pool_get_first (pool=0xaac6ddd0, pool=0xaac6ddd0) at 
include/common/memory.h:124
  #1  pool_alloc_dirty (pool=0xaac6ddd0) at include/common/memory.h:154
  #2  pool_alloc (pool=0xaac6ddd0) at include/common/memory.h:229
  #3  conn_new () at include/proto/connection.h:655
  #4  cs_new (conn=0x0) at include/proto/connection.h:683
  #5  connect_conn_chk (t=0xaacb8820) at src/checks.c:1553
  #6  process_chk_conn (t=0xaacb8820) at src/checks.c:2135
  #7  process_chk (t=0xaacb8820) at src/checks.c:2281
  #8  0xaabca0b4 in process_runnable_tasks () at src/task.c:231
  #9  0xaab76f44 in run_poll_loop () at src/haproxy.c:2399
  #10 run_thread_poll_loop (data=) at src/haproxy.c:2461
  #11 0xaaad79ec in main (argc=, argv=0xaac61b30) at 
src/haproxy.c:3050

  

  when running on an ARM64 system. The haproxy.cfg looks like this:

  

  global
      log /dev/log local0
      log /dev/log local1 notice
      maxconn 4096
      user haproxy
      group haproxy
      spread-checks 0
      tune.ssl.default-dh-param 1024
      ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:!DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:!DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:!CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA

  defaults
      log global
      mode tcp
      option httplog
      option dontlognull
      retries 3
      timeout queue 2
      timeout client 5
      timeout connect 5000
      timeout server 5

  frontend anbox-stream-gateway-lb-5-80
      bind 0.0.0.0:80
      default_backend api_http
      mode http

[Ubuntu-ha] [Bug 1877280] Re: attrd can segfault on exit

2020-05-11 Thread Łukasz Zemczak
Hello Dan, or anyone else affected,

Accepted pacemaker into xenial-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/pacemaker/1.1.14-2ubuntu1.8 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from verification-needed-
xenial to verification-done-xenial. If it does not fix the bug for you,
please add a comment stating that, and change the tag to verification-
failed-xenial. In either case, without details of your testing we will
not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: pacemaker (Ubuntu Xenial)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1877280

Title:
  attrd can segfault on exit

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Committed

Bug description:
  [impact]

  pacemaker's attrd may segfault on exit.

  [test case]

  this is a follow-on to bug 1871166; the patches added there prevented
  one segfault, but this one emerged.  As with that bug, I can't
  reproduce this myself, but the original reporter is able to reproduce it
  intermittently.

  [regression potential]

  any regression would likely impact the exit path of attrd, possibly
  causing a segfault or other incorrect exit.

  [scope]

  this is needed only for Xenial.

  this is fixed upstream by commit 3c62fb1d0d which is included in
  Bionic and later.

  [other info]

  this is a follow-on to bug 1871166.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1877280/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1871166] Update Released

2020-05-11 Thread Łukasz Zemczak
The verification of the Stable Release Update for pacemaker has
completed successfully and the package is now being released to
-updates.  Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1871166

Title:
  lrmd crashes

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Released

Bug description:
  [impact]

  lrmd crashes and dumps core.

  [test case]

  I cannot reproduce this, but it is reproducible in the specific setup of
  the person reporting the bug to me.

  [regression potential]

  this patches the cancel/cleanup part of the code, so regressions would
  likely involve possible memory leaks (instead of use-after-free
  segfaults), failure to correctly cancel or cleanup operations, or
  other failure during cancel action.

  [scope]

  this is fixed by commits:
  933d46ef20591757301784773a37e06b78906584
  94a4c58f675d163085a055f59fd6c3a2c9f57c43
  dc36d4375c049024a6f9e4d2277a3e6444fad05b
  deabcc5a6aa93dadf0b20364715b559a5b9848ac
  b85037b75255061a41d0ec3fd9b64f271351b43e

  which are all included starting with version 1.1.17, and Bionic
  includes version 1.1.18, so this is fixed already in Bionic and later.

  This is needed only for Xenial.

  [other info]

  As mentioned in the test case section, I do not have a setup where I'm
  able to reproduce this, but I can ask the initial reporter to test and
  verify the fix, and they have verified a test build fixed the problem
  for them.

  Also, the upstream commits removed two symbols, which I elided from
  the backported patches; those symbols are still available and, while
  it is unlikely there were any users of those symbols outside pacemaker
  itself, this change should not break any possible external users.  See
  patch 0002 header in the upload for more detail.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1871166/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1866119] Update Released

2020-04-21 Thread Łukasz Zemczak
The verification of the Stable Release Update for pacemaker has
completed successfully and the package is now being released to
-updates.  Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1866119

Title:
  [bionic] fence_scsi not working properly with Pacemaker
  1.1.18-2ubuntu1.1

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Bionic:
  Fix Released

Bug description:
  OBS: This bug was originally part of LP: #1865523 but was split out.

   SRU: pacemaker

  [Impact]

   * fence_scsi is not currently working in a shared-disk environment

   * all clusters relying on fence_scsi and/or fence_scsi + watchdog
  won't be able to start the fencing agents OR, in worst-case scenarios,
  the fence_scsi agent might start but won't make SCSI reservations on
  the shared SCSI disk.

   * this bug takes care of the pacemaker 1.1.18 issues with fence_scsi,
  since fence_scsi itself was fixed in LP: #1865523.

  [Test Case]

   * with a 3-node setup (nodes called "clubionic01, clubionic02,
  clubionic03"), a shared SCSI disk /dev/sda that fully supports persistent
  reservations, and corosync and pacemaker operational and running, one
  might try:

  rafaeldtinoco@clubionic01:~$ crm configure
  crm(live)configure# property stonith-enabled=on
  crm(live)configure# property stonith-action=off
  crm(live)configure# property no-quorum-policy=stop
  crm(live)configure# property have-watchdog=true
  crm(live)configure# commit
  crm(live)configure# end
  crm(live)# end

  rafaeldtinoco@clubionic01:~$ crm configure primitive fence_clubionic \
  stonith:fence_scsi params \
  pcmk_host_list="clubionic01 clubionic02 clubionic03" \
  devices="/dev/sda" \
  meta provides=unfencing

  And see the following errors:

  Failed Actions:
  * fence_clubionic_start_0 on clubionic02 'unknown error' (1): call=6, status=Error, exitreason='',
  last-rc-change='Wed Mar  4 19:53:12 2020', queued=0ms, exec=1105ms
  * fence_clubionic_start_0 on clubionic03 'unknown error' (1): call=6, status=Error, exitreason='',
  last-rc-change='Wed Mar  4 19:53:13 2020', queued=0ms, exec=1109ms
  * fence_clubionic_start_0 on clubionic01 'unknown error' (1): call=6, status=Error, exitreason='',
  last-rc-change='Wed Mar  4 19:53:11 2020', queued=0ms, exec=1108ms

  and corosync.log will show:

  warning: unpack_rsc_op_failure: Processing failed op start for
  fence_clubionic on clubionic01: unknown error (1)
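
  With a fixed pacemaker the fence_clubionic resource should start on all
  three nodes instead. One additional (hedged) way to confirm the agent
  really placed its SCSI reservations, assuming sg3-utils is installed
  (this check is not part of the original test case), is to list the
  registered keys on the shared disk:

  rafaeldtinoco@clubionic01:~$ sudo sg_persist --in --read-keys --device=/dev/sda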

  [Regression Potential]

   * LP: #1865523 shows fence_scsi fully operational after SRU for that
  bug is done.

   * LP: #1865523 used pacemaker 1.1.19 (vanilla) in order to fix
  fence_scsi.

   * There are changes to the cluster resource manager daemon, the local
  resource manager daemon and the policy engine. Of all the changes, the
  policy engine fix is the biggest, but still not too big for an SRU. This
  could cause the policy engine, and thus cluster decisions, to malfunction.

   * All patches are based on upstream fixes made right after
  Pacemaker-1.1.18 (the version used by Ubuntu Bionic) and were tested with
  fence_scsi to make sure they fixed the issues.

  [Other Info]

   * Original Description:

  Trying to setup a cluster with an iscsi shared disk, using fence_scsi
  as the fencing mechanism, I realized that fence_scsi is not working in
  Ubuntu Bionic. I first thought it was related to Azure environment
  (LP: #1864419), where I was trying this environment, but then, trying
  locally, I figured out that somehow pacemaker 1.1.18 is not fencing
  the shared scsi disk properly.

  Note: I was able to "backport" vanilla 1.1.19 from upstream and
  fence_scsi worked. I then tried 1.1.18 without all the quilt patches
  and it didn't work either. I think that bisecting 1.1.18 <-> 1.1.19
  might tell us which commit fixed the behaviour needed by the
  fence_scsi agent.

  (k)rafaeldtinoco@clubionic01:~$ crm conf show
  node 1: clubionic01.private
  node 2: clubionic02.private
  node 3: clubionic03.private
  primitive fence_clubionic stonith:fence_scsi \
  params pcmk_host_list="10.250.3.10 10.250.3.11 10.250.3.12" devices="/dev/sda" \
  meta provides=unfencing
  property cib-bootstrap-options: \
  have-watchdog=false \
  dc-version=1.1.18-2b07d5c5a9 \
  cluster-infrastructure=corosync \
  cluster-name=clubionic \
  stonith-enabled=on \
  stonith-action=off \
  no-quorum-policy=stop \
  symmetric-cluster=true

  

  (k)rafaeldtinoco@clubionic02:~$ sudo crm_mon -1
  Stack: 

[Ubuntu-ha] [Bug 1871166] Re: lrmd crashes

2020-04-14 Thread Łukasz Zemczak
This looks good, though this is quite a lot of code changes (and
refactoring) for a bug without a clear reproduction scenario. Would it
be possible to, along with the verification to be done by the reporting
person, perform some sanity runs to make sure the cancel/cleanup parts
of the code did not regress? Thanks!
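
As a hedged illustration of what such sanity runs could look like (the
resource name and iteration count below are placeholders, not a prescribed
procedure), one could configure a throwaway Dummy resource and repeatedly
exercise the cancel/cleanup paths that were patched, watching that lrmd
stays up and no new core dumps appear:

  $ sudo crm configure primitive dummy-sanity ocf:pacemaker:Dummy op monitor interval=10s
  $ for i in $(seq 1 50); do sudo crm_resource --cleanup --resource dummy-sanity; done
  $ sudo crm resource stop dummy-sanity && sudo crm resource start dummy-sanity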

** Changed in: pacemaker (Ubuntu Xenial)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1871166

Title:
  lrmd crashes

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Committed

Bug description:
  [impact]

  lrmd crashes and dumps core.

  [test case]

  I cannot reproduce this, but it is reproducible in the specific setup of
  the person reporting the bug to me.

  [regression potential]

  this patches the cancel/cleanup part of the code, so regressions would
  likely involve possible memory leaks (instead of use-after-free
  segfaults), failure to correctly cancel or cleanup operations, or
  other failure during cancel action.

  [scope]

  this is fixed by commits:
  933d46ef20591757301784773a37e06b78906584
  94a4c58f675d163085a055f59fd6c3a2c9f57c43
  dc36d4375c049024a6f9e4d2277a3e6444fad05b
  deabcc5a6aa93dadf0b20364715b559a5b9848ac
  b85037b75255061a41d0ec3fd9b64f271351b43e

  which are all included starting with version 1.1.17, and Bionic
  includes version 1.1.18, so this is fixed already in Bionic and later.

  This is needed only for Xenial.

  [other info]

  As mentioned in the test case section, I do not have a setup where I'm
  able to reproduce this, but I can ask the initial reporter to test and
  verify the fix, and they have verified a test build fixed the problem
  for them.

  Also, the upstream commits removed two symbols, which I elided from
  the backported patches; those symbols are still available and, while
  it is unlikely there were any users of those symbols outside pacemaker
  itself, this change should not break any possible external users.  See
  patch 0002 header in the upload for more detail.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1871166/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1871166] Please test proposed package

2020-04-14 Thread Łukasz Zemczak
Hello Dan, or anyone else affected,

Accepted pacemaker into xenial-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/pacemaker/1.1.14-2ubuntu1.7 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, what testing has been
performed on the package and change the tag from verification-needed-
xenial to verification-done-xenial. If it does not fix the bug for you,
please add a comment stating that, and change the tag to verification-
failed-xenial. In either case, without details of your testing we will
not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1871166

Title:
  lrmd crashes

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Committed

Bug description:
  [impact]

  lrmd crashes and dumps core.

  [test case]

  I cannot reproduce this, but it is reproducible in the specific setup of
  the person reporting the bug to me.

  [regression potential]

  this patches the cancel/cleanup part of the code, so regressions would
  likely involve possible memory leaks (instead of use-after-free
  segfaults), failure to correctly cancel or cleanup operations, or
  other failure during cancel action.

  [scope]

  this is fixed by commits:
  933d46ef20591757301784773a37e06b78906584
  94a4c58f675d163085a055f59fd6c3a2c9f57c43
  dc36d4375c049024a6f9e4d2277a3e6444fad05b
  deabcc5a6aa93dadf0b20364715b559a5b9848ac
  b85037b75255061a41d0ec3fd9b64f271351b43e

  which are all included starting with version 1.1.17, and Bionic
  includes version 1.1.18, so this is fixed already in Bionic and later.

  This is needed only for Xenial.

  [other info]

  As mentioned in the test case section, I do not have a setup where I'm
  able to reproduce this, but I can ask the initial reporter to test and
  verify the fix, and they have verified a test build fixed the problem
  for them.

  Also, the upstream commits removed two symbols, which I elided from
  the backported patches; those symbols are still available and, while
  it is unlikely there were any users of those symbols outside pacemaker
  itself, this change should not break any possible external users.  See
  patch 0002 header in the upload for more detail.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1871166/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1848902] Update Released

2019-12-02 Thread Łukasz Zemczak
The verification of the Stable Release Update for haproxy has completed
successfully and the package is now being released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1848902

Title:
  haproxy in bionic can get stuck

Status in haproxy package in Ubuntu:
  Fix Released
Status in haproxy source package in Bionic:
  Fix Released

Bug description:
  [Impact]

   * The master process will exit with the status of the last worker.
     When the worker is killed with SIGTERM, it is expected to get 143 as an
     exit status. Therefore, we consider this exit status as normal from a
     systemd point of view. If it happens when not stopping, the systemd
     unit is configured to always restart, so it has no adverse effect.

   * Backport the upstream fix, which adds another accepted return code (RC)
     to the systemd service
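
   * For illustration, the kind of unit change this refers to is a single
     line in the [Service] section of haproxy.service (a sketch based on the
     upstream patch linked in the original description below; the exact
     shipped unit may differ):

     [Service]
     SuccessExitStatus=143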

  [Test Case]

   * Install haproxy and have it running, then SIGTERM it repeatedly.
     With the fix, systemd restarts the service every time (until the
     restart limit kicks in). In the bad case it just stays down and does
     not even try to restart.

     $ apt install haproxy
     $ for x in {1..100}; do pkill -TERM -x haproxy ; sleep 0.1 ; done
     $ systemctl status haproxy

 The above is a hacky way to trigger A/B behaviour of the fix.
 It isn't perfect, as systemd restart counters will kick in and you
 essentially check a secondary symptom.
 I'd recommend additionally running the following:

     $ apt install haproxy
     $ for x in {1..1000}; do pkill -TERM -x haproxy ; sleep 0.001 ; systemctl reset-failed haproxy.service; done
     $ systemctl status haproxy

 You can do so with even smaller sleeps; that should keep the service up
 and running (this behaviour isn't changing with the fix, but should work
 with the new code).

  [Regression Potential]

   * This is ultimately a conffile modification, so users who have made
     other modifications will get a prompt; but that isn't a regression. I
     checked the code and can't think of another path exiting with RC=143
     that would, due to this change, no longer be detected as an error. I
     really think that, other than the update itself triggering a restart
     (as usual for services), there is no further regression potential here.

  [Other Info]

   * The fix has already been active in the IS hosted cloud for a while
     without issues
   * Reports (comment #5) also show that others use this in production as
     well

  ---

  On a Bionic/Stein cloud, after a network partition, we saw several
  units (glance, swift-proxy and cinder) fail to start haproxy, like so:

  root@juju-df624b-6-lxd-4:~# systemctl status haproxy.service
  ● haproxy.service - HAProxy Load Balancer
     Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sun 2019-10-20 00:23:18 UTC; 1h 35min ago
       Docs: man:haproxy(1)
             file:/usr/share/doc/haproxy/configuration.txt.gz
    Process: 2002655 ExecStart=/usr/sbin/haproxy -Ws -f $CONFIG -p $PIDFILE $EXTRAOPTS (code=exited, status=143)
    Process: 2002649 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS)
   Main PID: 2002655 (code=exited, status=143)

  Oct 20 00:16:52 juju-df624b-6-lxd-4 systemd[1]: Starting HAProxy Load Balancer...
  Oct 20 00:16:52 juju-df624b-6-lxd-4 systemd[1]: Started HAProxy Load Balancer.
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: Stopping HAProxy Load Balancer...
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [WARNING] 292/001652 (2002655) : Exiting Master process...
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [ALERT] 292/001652 (2002655) : Current worker 2002661 exited with code 143
  Oct 20 00:23:18 juju-df624b-6-lxd-4 haproxy[2002655]: [WARNING] 292/001652 (2002655) : All workers exited. Exiting... (143)
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: haproxy.service: Main process exited, code=exited, status=143/n/a
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: haproxy.service: Failed with result 'exit-code'.
  Oct 20 00:23:18 juju-df624b-6-lxd-4 systemd[1]: Stopped HAProxy Load Balancer.
  root@juju-df624b-6-lxd-4:~#

  The Debian maintainer came up with the following patch for this:

    https://www.mail-archive.com/haproxy@formilux.org/msg30477.html

  Which was added to the 1.8.10-1 Debian upload and merged into upstream 1.8.13.
  Unfortunately Bionic is on 1.8.8-1ubuntu0.4 and doesn't have this patch.

  Please consider pulling this patch into an SRU for 

[Ubuntu-ha] [Bug 1815101] Update Released

2019-11-25 Thread Łukasz Zemczak
The verification of the Stable Release Update for systemd has completed
successfully and the package is now being released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1815101

Title:
  [master] Restarting systemd-networkd breaks keepalived, heartbeat,
  corosync, pacemaker (interface aliases are restarted)

Status in Keepalived Charm:
  New
Status in netplan:
  Confirmed
Status in heartbeat package in Ubuntu:
  Won't Fix
Status in keepalived package in Ubuntu:
  In Progress
Status in systemd package in Ubuntu:
  In Progress
Status in keepalived source package in Bionic:
  Confirmed
Status in systemd source package in Bionic:
  Confirmed
Status in keepalived source package in Disco:
  Confirmed
Status in systemd source package in Disco:
  Confirmed
Status in keepalived source package in Eoan:
  In Progress
Status in systemd source package in Eoan:
  Fix Released

Bug description:
  [impact]

  - ALL related HA software has a small problem if interfaces are being
  managed by systemd-networkd: NIC restarts/reconfigs will always wipe
  all interface aliases when the HA software is not expecting it
  (there is no coordination between them).

  - keepalived, Samba ctdb and pacemaker all suffer from this. Pacemaker is
  smarter in this case because it has a service monitor that will
  restart the virtual IP resource, on the affected node & NIC, before
  considering it a real failure, but other HA services might report a real
  failure when there is none.

  [test case]

  - comment #14 is a full test case: set up a 3-node pacemaker cluster, as
  in that example, and cause a networkd service restart; it will trigger a
  failure of the virtual IP resource monitor.

  - another example is given in the original description for keepalived;
  both suffer from the same issue (as does other HA software).

  [regression potential]

  - this backports KeepConfiguration parameter, which adds some
  significant complexity to networkd's configuration and behavior, which
  could lead to regressions in correctly configuring the network at
  networkd start, or incorrectly maintaining configuration at networkd
  restart, or losing network state at networkd stop.

  - Any regressions are most likely to occur during networkd start,
  restart, or stop, and most likely to involve missing or incorrect ip
  address(es).

  - the change is based on upstream patches adding the exact feature we
  needed to fix this issue & it will be integrated with a netplan change
  to add the needed stanza to the systemd NIC configuration file
  (KeepConfiguration=)
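
  - for illustration, a hedged sketch of the kind of stanza this refers to
  in a systemd-networkd .network file (the interface match is an assumption;
  netplan generates the actual file):

  [Match]
  Name=eth0

  [Network]
  KeepConfiguration=static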

  [other info]

  original description:
  ---

  Configure netplan for interfaces, for example (a working config with
  IP addresses obfuscated)

  network:
    ethernets:
      eth0:
        addresses: [192.168.0.5/24]
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
      eth2:
        addresses:
          - 12.13.14.18/29
          - 12.13.14.19/29
        gateway4: 12.13.14.17
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
      eth3:
        addresses: [10.22.11.6/24]
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
      eth4:
        addresses: [10.22.14.6/24]
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
      eth7:
        addresses: [9.5.17.34/29]
        dhcp4: false
        optional: true
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
    version: 2

  Configure keepalived (again, a working config with IP addresses
  obfuscated)

  global_defs   # Block id
  {
      notification_email {
          sysadm...@blah.com
      }
      notification_email_from keepali...@system3.hq.blah.com
      smtp_server 10.22.11.7 # IP
      smtp_connect_timeout 30  # integer, seconds
      router_id system3  # string identifying the machine,

[Ubuntu-ha] [Bug 1815101] Re: [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

2019-11-07 Thread Łukasz Zemczak
Hello Leroy, or anyone else affected,

Accepted systemd into eoan-proposed. The package will build now and be
available at https://launchpad.net/ubuntu/+source/systemd/242-7ubuntu3.2
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-eoan to verification-done-eoan. If it does not fix
the bug for you, please add a comment stating that, and change the tag
to verification-failed-eoan. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: systemd (Ubuntu Eoan)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-eoan

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1815101

Title:
  [master] Restarting systemd-networkd breaks keepalived, heartbeat,
  corosync, pacemaker (interface aliases are restarted)

Status in Keepalived Charm:
  New
Status in netplan:
  Confirmed
Status in heartbeat package in Ubuntu:
  Triaged
Status in keepalived package in Ubuntu:
  In Progress
Status in systemd package in Ubuntu:
  In Progress
Status in heartbeat source package in Bionic:
  Triaged
Status in keepalived source package in Bionic:
  Confirmed
Status in systemd source package in Bionic:
  Confirmed
Status in heartbeat source package in Disco:
  Triaged
Status in keepalived source package in Disco:
  Confirmed
Status in systemd source package in Disco:
  Confirmed
Status in heartbeat source package in Eoan:
  Triaged
Status in keepalived source package in Eoan:
  In Progress
Status in systemd source package in Eoan:
  Fix Committed

Bug description:
  [impact]

  - ALL related HA software has a small problem if interfaces are being
  managed by systemd-networkd: NIC restarts/reconfigs will always wipe
  all interface aliases when the HA software is not expecting it
  (there is no coordination between them).

  - keepalived, Samba ctdb and pacemaker all suffer from this. Pacemaker is
  smarter in this case because it has a service monitor that will
  restart the virtual IP resource, on the affected node & NIC, before
  considering it a real failure, but other HA services might report a real
  failure when there is none.

  [test case]

  - comment #14 is a full test case: set up a 3-node pacemaker cluster, as
  in that example, and cause a networkd service restart; it will trigger a
  failure of the virtual IP resource monitor.

  - another example is given in the original description for keepalived;
  both suffer from the same issue (as does other HA software).

  [regression potential]

  - this backports KeepConfiguration parameter, which adds some
  significant complexity to networkd's configuration and behavior, which
  could lead to regressions in correctly configuring the network at
  networkd start, or incorrectly maintaining configuration at networkd
  restart, or losing network state at networkd stop.

  - Any regressions are most likely to occur during networkd start,
  restart, or stop, and most likely to involve missing or incorrect ip
  address(es).

  - the change is based on upstream patches adding the exact feature we
  needed to fix this issue & it will be integrated with a netplan change
  to add the needed stanza to the systemd NIC configuration file
  (KeepConfiguration=)

  [other info]

  original description:
  ---

  Configure netplan for interfaces, for example (a working config with
  IP addresses obfuscated)

  network:
    ethernets:
      eth0:
        addresses: [192.168.0.5/24]
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
      eth2:
        addresses:
          - 12.13.14.18/29
          - 12.13.14.19/29
        gateway4: 12.13.14.17
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
          addresses: [10.22.11.1]
      eth3:
        addresses: [10.22.11.6/24]
        dhcp4: false
        nameservers:
          search: [blah.com, other.blah.com, hq.blah.com, 

[Ubuntu-ha] [Bug 1841936] Update Released

2019-11-04 Thread Łukasz Zemczak
The verification of the Stable Release Update for haproxy has completed
successfully and the package is now being released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1841936

Title:
  Rebuild openssl 1.1.1 to pickup TLSv1.3 (bionic) and unbreak existing
  builds against 1.1.1 (dh key size)

Status in HAProxy:
  Fix Released
Status in haproxy package in Ubuntu:
  Fix Committed
Status in haproxy source package in Bionic:
  Fix Committed
Status in haproxy source package in Disco:
  Fix Released
Status in haproxy source package in Eoan:
  Fix Released
Status in haproxy source package in Focal:
  Fix Committed

Bug description:
  [Impact-Bionic]

   * openssl 1.1.1 has been backported to Bionic for its longer
     support upstream period

   * That would allow consuming packages to gain the extra feature of
     TLSv1.3 seemingly "for free": a no-change rebuild would pick it up.

  [Impact Disco-Focal]

   * openssl >=1.1.1 is already in Disco-Focal, so haproxy there was built
     against it. That made it pick up TLSv1.3, but also a related bug that
     broke the ability to control the DHE key: it was always in
     "ECDH auto" mode, so the daemon no longer followed the config.
     Upgraders would regress by having their DH key behaviour changed
     unexpectedly.

  [Test Case]

   A)
   * run "haproxy -vv" and check the reported TLS versions to include 1.3
   B)
   * download https://github.com/drwetter/testssl.sh
   * Install haproxy
 * ./testssl.sh --pfs :443
 * Check the reported DH key/group (shoudl stay 1024)
 * Check if settings work to bump it like
 tune.ssl.default-dh-param 2048
   into
 /etc/haproxy/haproxy.cfg
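
   * For illustration, a minimal sketch of where that setting lives in
     /etc/haproxy/haproxy.cfg (only the relevant part of the global section
     is shown):

     global
         tune.ssl.default-dh-param 2048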

  [Regression Potential-Bionic]

   * This should be low, the code already runs against the .so of the newer
     openssl library. This would only make it recognize the newer TLS
     support.
     I'd expect more trouble as-is, with the somewhat big delta between what
     it was built against vs what it runs with, than afterwards.
   * [1] and [2]  indicate that any config that would have been made for
     TLSv1.2 [1] would not apply to the v1.3 as it would be configured in
     [2].
     It is good to have no entry for [2] yet as following the defaults of
     openssl is the safest as that would be updated if new insights/CVEs are
     known.
     But this could IMHO be the "regression that I'd expect": one explicitly
     configured the v1.2 settings, and once both ends support v1.3 that
     might be auto-negotiated instead. One can then set "force-tlsv12", but
     that is an administrative action [3] (a one-line sketch follows at the
     end of this section).
   * Yet AFAIK this fine grained control [2] for TLSv1.3 only exists in
     >=1.8.15 [4] and Bionic is on haproxy 1.8.8. So any user of TLSv1.3 in
     Bionic haproxy would have to stay without that. There are further
     changes to TLS v1.3 handling enhancements [5] but also fixes [6] which
     aren't in 1.8.8 in Bionic.
     So one could say enabling this will enable an inferior TLSv1.3 and one
     might better not enable it, for an SRU the bar to not break old
     behavior is intentionally high - I tried to provide as much as possible
     background, the decision is up to the SRU team.
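
   * For illustration, the "force-tlsv12" administrative action mentioned
     above would be a hedged one-line change on the affected bind line (the
     certificate path and port are assumptions):

     bind :443 ssl crt /etc/haproxy/certs/site.pem force-tlsv12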

  [Regression Potential-Disco-Focal]

   * The fixes let the admin regain control of the DH key configuration
     which is the fix. But remember that the default config didn't specify
     any. Therefore we have two scenarios:
     a) an admin had set custom DH parameters which were ignored. He had no
    chance to control them and needs the fix. He might have been under
    the impression that his keys are safe (there is a CVE against small
    ones) and only now is he really safe -> gain high, regression low
     b) an admin had not set anything, the default config is meant to use
    (compatibility) and the program reported "I'm using 1024, but you
    should set it higher". But what really happened was ECDH auto mode
    which has longer keys and different settings. Those systems will
    be "fixed" by finally following the config, but that means the key
    will "now" after the fix be vulnerable.
    -> for their POV a formerly secure setup will become vulnerable
     I'd expect that any professional setup would use explicit config as it
     has seen the warning since day #1 and also any kind of deployment
     recipes should use big keys. So the majority of users should be in (a).
     c) And OTOH there are people like the reporter 

[Ubuntu-ha] [Bug 1819046] Update Released

2019-03-21 Thread Łukasz Zemczak
The verification of the Stable Release Update for pacemaker has
completed successfully and the package has now been released to
-updates.  Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1819046

Title:
  Systemd unit file reads settings from wrong path

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Released

Bug description:
  [Impact]
  Systemd Unit file doesn't read any settings by default

  [Description]
  The unit file shipped with the Xenial pacemaker package tries to read 
environment settings from /etc/sysconfig/ instead of /etc/default/. The result 
is that settings defined in /etc/default/pacemaker are not effective.
  Since the /etc/default/pacemaker file is created with default values when the 
pacemaker package is installed, we should source that in the systemd unit file.
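
  For illustration, the relevant line in the [Service] section of the
  pacemaker unit would change roughly as follows (a sketch; the "-" prefix
  makes a missing file non-fatal, as noted under Regression Potential):

  Before: EnvironmentFile=-/etc/sysconfig/pacemaker
  After:  EnvironmentFile=-/etc/default/pacemaker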

  [Test Case]
  1) Deploy a Xenial container:
  $ lxc launch ubuntu:xenial pacemaker
  2) Update container and install pacemaker:
  root@pacemaker:~# apt update && apt install pacemaker -y
  3) Change default pacemaker log location:
  root@pacemaker:~# echo "PCMK_logfile=/tmp/pacemaker.log" >> /etc/default/pacemaker
  4) Restart pacemaker service and verify that log file exists:
  root@pacemaker:~# systemctl restart pacemaker.service
  root@pacemaker:~# ls -l /tmp/pacemaker.log
  ls: cannot access '/tmp/pacemaker.log': No such file or directory

  After fixing the systemd unit, changes to /etc/default/pacemaker get picked 
up correctly:
  root@pacemaker:~# ls -l /tmp/pacemaker.log
  -rw-rw 1 hacluster haclient 27335 Mar  7 20:46 /tmp/pacemaker.log

  
  [Regression Potential]
  The regression potential for this should be very low, since the configuration 
file is already being created by default and other systemd unit files are using 
the /etc/default config. In case the file doesn't exist or the user removed it, 
the "-" prefix will gracefully ignore the missing file according to the 
systemd.exec manual [0].
  Nonetheless, the new package will be tested with autopkgtests and the fix 
will be validated in a reproduction environment.

  [0] https://www.freedesktop.org/software/systemd/man/systemd.exec.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1819046/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1819046] Re: Systemd unit file reads settings from wrong path

2019-03-14 Thread Łukasz Zemczak
Hello Heitor, or anyone else affected,

Accepted pacemaker into xenial-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/pacemaker/1.1.14-2ubuntu1.5 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-xenial to verification-done-xenial. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-xenial. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: pacemaker (Ubuntu Xenial)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1819046

Title:
  Systemd unit file reads settings from wrong path

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Committed

Bug description:
  [Impact]
  Systemd Unit file doesn't read any settings by default

  [Description]
  The unit file shipped with the Xenial pacemaker package tries to read 
environment settings from /etc/sysconfig/ instead of /etc/default/. The result 
is that settings defined in /etc/default/pacemaker are not effective.
  Since the /etc/default/pacemaker file is created with default values when the 
pacemaker package is installed, we should source that in the systemd unit file.

  [Test Case]
  1) Deploy a Xenial container:
  $ lxc launch ubuntu:xenial pacemaker
  2) Update container and install pacemaker:
  root@pacemaker:~# apt update && apt install pacemaker -y
  3) Change default pacemaker log location:
  root@pacemaker:~# echo "PCMK_logfile=/tmp/pacemaker.log" >> /etc/default/pacemaker
  4) Restart pacemaker service and verify that log file exists:
  root@pacemaker:~# systemctl restart pacemaker.service
  root@pacemaker:~# ls -l /tmp/pacemaker.log
  ls: cannot access '/tmp/pacemaker.log': No such file or directory

  After fixing the systemd unit, changes to /etc/default/pacemaker get picked 
up correctly:
  root@pacemaker:~# ls -l /tmp/pacemaker.log
  -rw-rw 1 hacluster haclient 27335 Mar  7 20:46 /tmp/pacemaker.log

  
  [Regression Potential]
  The regression potential for this should be very low, since the configuration 
file is already being created by default and other systemd unit files are using 
the /etc/default config. In case the file doesn't exist or the user removed it, 
the "-" prefix will gracefully ignore the missing file according to the 
systemd.exec manual [0].
  Nonetheless, the new package will be tested with autopkgtests and the fix 
will be validated in a reproduction environment.

  [0] https://www.freedesktop.org/software/systemd/man/systemd.exec.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1819046/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1804069] Update Released

2019-02-07 Thread Łukasz Zemczak
The verification of the Stable Release Update for haproxy has completed
successfully and the package has now been released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1804069

Title:
  haproxy fails on arm64 due to alignment error

Status in haproxy package in Ubuntu:
  Fix Released
Status in haproxy source package in Bionic:
  Fix Committed
Status in haproxy source package in Cosmic:
  Fix Released

Bug description:
  [Impact]
  haproxy as shipped with bionic and cosmic doesn't work on arm64 
architectures, crashing the moment it serves a request.

  [Test Case]

  * install haproxy and apache in the up-to-date Ubuntu release you are testing, on an arm64 system:
  sudo apt update && sudo apt dist-upgrade -y && sudo apt install haproxy apache2 -y

  * Create /etc/haproxy/haproxy.cfg with the following contents:

  global
      chroot /var/lib/haproxy
      user haproxy
      group haproxy
      daemon
      maxconn 4096

  defaults
      log global
      option dontlognull
      option redispatch
      retries 3
      timeout client 50s
      timeout connect 10s
      timeout http-request 5s
      timeout server 50s
      maxconn 4096

  frontend test-front
      bind *:8080
      mode http
      default_backend test-back

  backend test-back
      mode http
      stick store-request src
      stick-table type ip size 256k expire 30m
      server test-1 localhost:80

  * in one terminal, keep tailing the (still nonexistent) haproxy log file:
  tail -F /var/log/haproxy.log

  * in another terminal, restart haproxy:
  sudo systemctl restart haproxy

  * The haproxy log will become live, and already show errors:
  Jan 24 19:22:23 cosmic-haproxy-1804069 haproxy[2286]: [WARNING] 023/191958 
(2286) : Exiting Master process...
  Jan 24 19:22:23 cosmic-haproxy-1804069 haproxy[2286]: [ALERT] 023/191958 
(2286) : Current worker 2287 exited with code 143
  Jan 24 19:22:23 cosmic-haproxy-1804069 haproxy[2286]: [WARNING] 023/191958 
(2286) : All workers exited. Exiting... (143)

  * Run wget to try to fetch the apache frontpage, via haproxy, limited to one 
attempt. It will fail:
  $ wget -t1 http://localhost:8080
  --2019-01-24 19:23:51--  http://localhost:8080/
  Resolving localhost (localhost)... 127.0.0.1
  Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
  HTTP request sent, awaiting response... No data received.
  Giving up.
  $ echo $?
  4

  * the haproxy logs will show errors:
  Jan 24 19:24:36 cosmic-haproxy-1804069 haproxy[6411]: [ALERT] 023/192351 
(6411) : Current worker 6412 exited with code 135
  Jan 24 19:24:36 cosmic-haproxy-1804069 haproxy[6411]: [ALERT] 023/192351 
(6411) : exit-on-failure: killing every workers with SIGTERM
  Jan 24 19:24:36 cosmic-haproxy-1804069 haproxy[6411]: [WARNING] 023/192351 
(6411) : All workers exited. Exiting... (135)

  * Update the haproxy package and try the wget one more time. This time
  it will work, and the haproxy logs will stay silent:

  $ wget -t1 http://localhost:8080
  --2019-01-24 19:26:14--  http://localhost:8080/
  Resolving localhost (localhost)... 127.0.0.1
  Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
  HTTP request sent, awaiting response... 200 OK
  Length: 10918 (11K) [text/html]
  Saving to: ‘index.html’

  index.html    100%[>]  10.66K  --.-KB/s  in 0s

  2019-01-24 19:26:14 (75.3 MB/s) - ‘index.html’ saved [10918/10918]

  [Regression Potential]
  Patch was applied upstream in 1.8.15 and is available in the same form in the 
latest 1.8.17 release. The patch is a bit low level, but seems to have been 
well understood.

  [Other Info]
  After writing the testing instructions for this bug, I decided they could be 
easily converted to a DEP8 test, which I did and included in this SRU. This new 
test, very simple but effective, shows that arm64 is working, and that the 
other architectures didn't break.
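
  For illustration only, a hedged sketch of what such a DEP8 hook could look
  like (file names and the test body are assumptions; the actual test shipped
  in the SRU may differ):

  # debian/tests/control
  Tests: proxy-localhost
  Depends: haproxy, apache2, wget
  Restrictions: needs-root, isolation-container

  # debian/tests/proxy-localhost (sketch)
  #!/bin/sh
  set -e
  cp debian/tests/haproxy.cfg /etc/haproxy/haproxy.cfg   # config from the test case above
  systemctl restart haproxy
  wget -t1 -O /dev/null http://localhost:8080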

  [Original Description]

  This fault was reported via the haproxy mailing list https://www.mail-
  archive.com/hapr...@formilux.org/msg31749.html

  And then patched in the haproxy code here
  
https://github.com/haproxy/haproxy/commit/52dabbc4fad338233c7f0c96f977a43f8f81452a

  Without this patch haproxy is not functional on aarch64/arm64.
  Experimental deployments of openstack-ansible on arm64 fail because of
  this bug, and without a fix applied to the ubuntu 

[Ubuntu-ha] [Bug 1755061] Re: HAProxyContext on Ubuntu 14.04 generates config that fails to start on boot

2019-01-28 Thread Łukasz Zemczak
The version of haproxy in the proposed pocket of Trusty that was
purported to fix this bug report has been removed because the bugs that
were to be fixed by the upload were not verified in a timely (105 days)
fashion.

** Tags removed: verification-needed-trusty

** Changed in: haproxy (Ubuntu Trusty)
   Status: Fix Committed => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1755061

Title:
  HAProxyContext on Ubuntu 14.04 generates config that fails to start on
  boot

Status in haproxy package in Ubuntu:
  Fix Released
Status in haproxy source package in Trusty:
  Won't Fix

Bug description:
  [Impact]
  Valid haproxy configuration directives don't work on trusty as /run/haproxy
  does not survive reboots and is not re-created on daemon start.

  [Test Case]
  sudo apt install haproxy
  configure /etc/haproxy.cfg with an admin socket in /run/haproxy:

  global
      log /var/lib/haproxy/dev/log local0
      log /var/lib/haproxy/dev/log local1 notice
      maxconn 2
      user haproxy
      group haproxy
      spread-checks 0
      stats socket /var/run/haproxy/admin.sock mode 600 level admin
      stats timeout 2m

  Restart haproxy (will fail as /{,var/}run/haproxy does not exist)

  [Regression Potential]
  Minimal - same fix is in later package revisions

  [Original Bug Report]
  While testing upgrades of an Ubuntu 14.04 deployment of OpenStack from ~15.04 
to 17.11 charms, I noticed that a number of the OpenStack charmed services 
failed to start haproxy when I rebooted their units: cinder, glance, keystone, 
neutron-api, nova-cloud-controller, and swift-proxy.

  The following was in /var/log/boot.log:

  [ALERT] 069/225906 (1100) : cannot bind socket for UNIX listener 
(/var/run/haproxy/admin.sock). Aborting.
  [ALERT] 069/225906 (1100) : [/usr/sbin/haproxy.main()] Some protocols failed 
to start their listeners! Exiting.
   * Starting haproxy haproxy  
[fail]

  The charm created /var/run/haproxy, but since /var/run (really /run)
  is a tmpfs, this did not survive the reboot and so haproxy could not
  create the socket.

  I compared the haproxy.cfg the charm creates with the default config
  shipped by the Ubuntu 16.04 haproxy package, and it seems that
  charmhelpers/contrib/openstack/templates/haproxy.cfg is closely based
  on the package, including the admin.sock directive.  However, on
  Ubuntu 16.04, /etc/init.d/haproxy ensures that /var/run/haproxy exists
  before it starts haproxy:

  [agnew(work)] diff -u haproxy-1.4.24/debian/haproxy.init haproxy-1.6.3/debian/haproxy.init
  --- haproxy-1.4.24/debian/haproxy.init  2015-12-16 03:55:29.0 +1300
  +++ haproxy-1.6.3/debian/haproxy.init   2015-12-31 20:10:38.0 +1300
  [...]
  @@ -50,6 +41,10 @@

   haproxy_start()
   {
  +   [ -d "$RUNDIR" ] || mkdir "$RUNDIR"
  +   chown haproxy:haproxy "$RUNDIR"
  +   chmod 2775 "$RUNDIR"
  +
  check_haproxy_config

  start-stop-daemon --quiet --oknodo --start --pidfile "$PIDFILE" \
  [...]

  charm-helpers or the OpenStack charms or both should be updated so
  that haproxy will start on boot when running on Ubuntu 14.04.
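
  As a hedged interim workaround on affected 14.04 units (an illustration,
  not necessarily what the charms adopted), a charm hook or boot-time script
  could perform the same preparation the newer init script does before
  starting haproxy:

  install -d -o haproxy -g haproxy -m 2775 /var/run/haproxy
  service haproxy start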

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/haproxy/+bug/1755061/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1755061] Proposed package removed from archive

2019-01-28 Thread Łukasz Zemczak
The version of haproxy in the proposed pocket of Trusty that was
purported to fix this bug report has been removed because the bugs that
were to be fixed by the upload were not verified in a timely (105 days)
fashion.

** Tags removed: verification-needed

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1755061

Title:
  HAProxyContext on Ubuntu 14.04 generates config that fails to start on
  boot

Status in haproxy package in Ubuntu:
  Fix Released
Status in haproxy source package in Trusty:
  Won't Fix

Bug description:
  [Impact]
  Valid haproxy configuration directives don't work on Trusty, as /run/haproxy
  does not survive reboots and is not recreated on daemon start.

  [Test Case]
  sudo apt install haproxy
  Configure /etc/haproxy.cfg with an admin socket in /run/haproxy:

  global
  log /var/lib/haproxy/dev/log local0
  log /var/lib/haproxy/dev/log local1 notice
  maxconn 2
  user haproxy
  group haproxy
  spread-checks 0
  stats socket /var/run/haproxy/admin.sock mode 600 level admin
  stats timeout 2m

  Restart haproxy (will fail as /{,var/}run/haproxy does not exist)

  [Regression Potential]
  Minimal - same fix is in later package revisions

  [Original Bug Report]
  While testing upgrades of an Ubuntu 14.04 deployment of OpenStack from ~15.04 
to 17.11 charms, I noticed that a number of the OpenStack charmed services 
failed to start haproxy when I rebooted their units: cinder, glance, keystone, 
neutron-api, nova-cloud-controller, and swift-proxy.

  The following was in /var/log/boot.log:

  [ALERT] 069/225906 (1100) : cannot bind socket for UNIX listener 
(/var/run/haproxy/admin.sock). Aborting.
  [ALERT] 069/225906 (1100) : [/usr/sbin/haproxy.main()] Some protocols failed 
to start their listeners! Exiting.
   * Starting haproxy haproxy  
[fail]

  The charm created /var/run/haproxy, but since /var/run (really /run)
  is a tmpfs, this did not survive the reboot and so haproxy could not
  create the socket.

  I compared the haproxy.cfg the charm creates with the default config
  shipped by the Ubuntu 16.04 haproxy package, and it seems that
  charmhelpers/contrib/openstack/templates/haproxy.cfg is closely based
  on the package, including the admin.sock directive.  However, on
  Ubuntu 16.04, /etc/init.d/haproxy ensures that /var/run/haproxy exists
  before it starts haproxy:

  [agnew(work)] diff -u haproxy-1.4.24/debian/haproxy.init 
haproxy-1.6.3/debian/haproxy.init
  --- haproxy-1.4.24/debian/haproxy.init  2015-12-16 03:55:29.0 +1300
  +++ haproxy-1.6.3/debian/haproxy.init   2015-12-31 20:10:38.0 +1300
  [...]
  @@ -50,6 +41,10 @@

   haproxy_start()
   {
  +   [ -d "$RUNDIR" ] || mkdir "$RUNDIR"
  +   chown haproxy:haproxy "$RUNDIR"
  +   chmod 2775 "$RUNDIR"
  +
  check_haproxy_config

  start-stop-daemon --quiet --oknodo --start --pidfile "$PIDFILE" \
  [...]

  charm-helpers or the OpenStack charms or both should be updated so
  that haproxy will start on boot when running on Ubuntu 14.04.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/haproxy/+bug/1755061/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1744062] Re: [SRU] L3 HA: multiple agents are active at the same time

2018-08-06 Thread Łukasz Zemczak
Hello Corey, or anyone else affected,

Accepted keepalived into xenial-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/keepalived/1:1.2.24-1ubuntu0.16.04.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-xenial to verification-done-xenial. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-xenial. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: keepalived (Ubuntu Xenial)
   Status: Triaged => Fix Committed

** Tags added: verification-needed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1744062

Title:
  [SRU] L3 HA: multiple agents are active at the same time

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive mitaka series:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Fix Committed
Status in neutron:
  New
Status in keepalived package in Ubuntu:
  Fix Released
Status in neutron package in Ubuntu:
  Invalid
Status in keepalived source package in Xenial:
  Fix Committed
Status in neutron source package in Xenial:
  Invalid
Status in keepalived source package in Bionic:
  Fix Released
Status in neutron source package in Bionic:
  Invalid

Bug description:
  [Impact]

  This is the same issue reported in
  https://bugs.launchpad.net/neutron/+bug/1731595, however that is
  marked as 'Fix Released' and the issue is still occurring and I can't
  change back to 'New' so it seems best to just open a new bug.

  It seems as if this bug surfaces due to load issues. While the fix
  provided by Venkata in https://bugs.launchpad.net/neutron/+bug/1731595
  (https://review.openstack.org/#/c/522641/) should help clean things up
  at the time of l3 agent restart, issues seem to come back later down
  the line in some circumstances. xavpaice mentioned he saw multiple
  routers active at the same time when they had 464 routers configured
  on 3 neutron gateway hosts using L3HA, and each router was scheduled
  to all 3 hosts. However, jhebden mentions that things seem stable at
  the 400 L3HA router mark, and it's worth noting this is the same
  deployment that xavpaice was referring to.

  keepalived has a patch upstream in 1.4.0 that provides a fix for
  removing left-over addresses if keepalived aborts. That patch will be
  cherry-picked to Ubuntu keepalived packages.
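  As a rough illustration of the symptom (not part of the SRU itself, and
  assuming the classic neutron CLI is available), a router that is active on
  more than one gateway can be spotted like this; the router UUID is a
  placeholder:

  # Run on each neutron gateway host: a healthy L3HA router should hold its
  # addresses in the qrouter namespace of exactly one host.
  ROUTER_ID="<router-uuid>"   # placeholder: the router's UUID
  sudo ip netns exec "qrouter-${ROUTER_ID}" ip -4 addr show

  # Or ask neutron which agents host the router and their HA state; only
  # one should report "active".
  neutron l3-agent-list-hosting-router "${ROUTER_ID}"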

  [Test Case]
  The following SRU process will be followed:
  https://wiki.ubuntu.com/OpenStackUpdates

  In order to avoid regression of existing consumers, the OpenStack team
  will run their continuous integration test against the packages that
  are in -proposed. A successful run of all available tests will be
  required before the proposed packages can be let into -updates.

  The OpenStack team will be in charge of attaching the output summary
  of the executed tests. The OpenStack team members will not mark
  ‘verification-done’ until this has happened.

  [Regression Potential]
  The regression potential is lowered as the fix is cherry-picked without 
change from upstream. In order to mitigate the regression potential, the 
results of the aforementioned tests are attached to this bug.

  [Discussion]

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1744062/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1744062] Update Released

2018-07-30 Thread Łukasz Zemczak
The verification of the Stable Release Update for keepalived has
completed successfully and the package has now been released to
-updates.  Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1744062

Title:
  [SRU] L3 HA: multiple agents are active at the same time

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive mitaka series:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Fix Committed
Status in neutron:
  New
Status in keepalived package in Ubuntu:
  Fix Released
Status in neutron package in Ubuntu:
  Invalid
Status in keepalived source package in Xenial:
  Triaged
Status in neutron source package in Xenial:
  Invalid
Status in keepalived source package in Bionic:
  Fix Committed
Status in neutron source package in Bionic:
  Invalid

Bug description:
  [Impact]

  This is the same issue reported in
  https://bugs.launchpad.net/neutron/+bug/1731595, however that is
  marked as 'Fix Released' and the issue is still occurring and I can't
  change back to 'New' so it seems best to just open a new bug.

  It seems as if this bug surfaces due to load issues. While the fix
  provided by Venkata in https://bugs.launchpad.net/neutron/+bug/1731595
  (https://review.openstack.org/#/c/522641/) should help clean things up
  at the time of l3 agent restart, issues seem to come back later down
  the line in some circumstances. xavpaice mentioned he saw multiple
  routers active at the same time when they had 464 routers configured
  on 3 neutron gateway hosts using L3HA, and each router was scheduled
  to all 3 hosts. However, jhebden mentions that things seem stable at
  the 400 L3HA router mark, and it's worth noting this is the same
  deployment that xavpaice was referring to.

  keepalived has a patch upstream in 1.4.0 that provides a fix for
  removing left-over addresses if keepalived aborts. That patch will be
  cherry-picked to Ubuntu keepalived packages.

  [Test Case]
  The following SRU process will be followed:
  https://wiki.ubuntu.com/OpenStackUpdates

  In order to avoid regression of existing consumers, the OpenStack team
  will run their continuous integration test against the packages that
  are in -proposed. A successful run of all available tests will be
  required before the proposed packages can be let into -updates.

  The OpenStack team will be in charge of attaching the output summary
  of the executed tests. The OpenStack team members will not mark
  ‘verification-done’ until this has happened.

  [Regression Potential]
  The regression potential is lowered as the fix is cherry-picked without 
change from upstream. In order to mitigate the regression potential, the 
results of the aforementioned tests are attached to this bug.

  [Discussion]

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1744062/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1744062] Re: [SRU] L3 HA: multiple agents are active at the same time

2018-07-09 Thread Łukasz Zemczak
Hello Corey, or anyone else affected,

Accepted keepalived into bionic-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/keepalived/1:1.3.9-1ubuntu0.18.04.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-bionic to verification-done-bionic. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-bionic. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: keepalived (Ubuntu Bionic)
   Status: Triaged => Fix Committed

** Tags added: verification-needed verification-needed-bionic

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1744062

Title:
  [SRU] L3 HA: multiple agents are active at the same time

Status in Ubuntu Cloud Archive:
  Triaged
Status in Ubuntu Cloud Archive mitaka series:
  Triaged
Status in Ubuntu Cloud Archive ocata series:
  Triaged
Status in Ubuntu Cloud Archive pike series:
  Triaged
Status in Ubuntu Cloud Archive queens series:
  Triaged
Status in neutron:
  New
Status in keepalived package in Ubuntu:
  Fix Released
Status in neutron package in Ubuntu:
  New
Status in keepalived source package in Xenial:
  Triaged
Status in neutron source package in Xenial:
  New
Status in keepalived source package in Bionic:
  Fix Committed
Status in neutron source package in Bionic:
  New

Bug description:
  [Impact]

  This is the same issue reported in
  https://bugs.launchpad.net/neutron/+bug/1731595, however that is
  marked as 'Fix Released' and the issue is still occurring and I can't
  change back to 'New' so it seems best to just open a new bug.

  It seems as if this bug surfaces due to load issues. While the fix
  provided by Venkata in https://bugs.launchpad.net/neutron/+bug/1731595
  (https://review.openstack.org/#/c/522641/) should help clean things up
  at the time of l3 agent restart, issues seem to come back later down
  the line in some circumstances. xavpaice mentioned he saw multiple
  routers active at the same time when they had 464 routers configured
  on 3 neutron gateway hosts using L3HA, and each router was scheduled
  to all 3 hosts. However, jhebden mentions that things seem stable at
  the 400 L3HA router mark, and it's worth noting this is the same
  deployment that xavpaice was referring to.

  keepalived has a patch upstream in 1.4.0 that provides a fix for
  removing left-over addresses if keepalived aborts. That patch will be
  cherry-picked to Ubuntu keepalived packages.

  [Test Case]
  The following SRU process will be followed:
  https://wiki.ubuntu.com/OpenStackUpdates

  In order to avoid regression of existing consumers, the OpenStack team
  will run their continuous integration test against the packages that
  are in -proposed. A successful run of all available tests will be
  required before the proposed packages can be let into -updates.

  The OpenStack team will be in charge of attaching the output summary
  of the executed tests. The OpenStack team members will not mark
  ‘verification-done’ until this has happened.

  [Regression Potential]
  The regression potential is lowered as the fix is cherry-picked without 
change from upstream. In order to mitigate the regression potential, the 
results of the aforementioned tests are attached to this bug.

  [Discussion]

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1744062/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1316970] Update Released

2018-04-12 Thread Łukasz Zemczak
The verification of the Stable Release Update for pacemaker has
completed successfully and the package has now been released to
-updates.  Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1316970

Title:
  g_dbus memory leak in lrmd

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Trusty:
  Fix Released

Bug description:
  [Impact]
  The lrmd daemon has a memory leak in Trusty when managing an upstart resource.

  Affects pacemaker 1.1.10.
  Also affects glib2.0 2.40.2-0ubuntu1; a new Launchpad bug [1] was created for glib2.0.

  Please note that the patch for pacemaker was created by myself.

  [Test Case]

  https://pastebin.ubuntu.com/p/fqK6Cx3SKK/
  You can check for the memory leak with this script.
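  The pastebin script itself is not reproduced here; as a simple stand-in
  (an assumption, not the script above), lrmd's resident memory can be
  polled while Pacemaker monitors an upstart resource:

  # With the leak present, VmRSS keeps growing over time.
  while pid=$(pidof lrmd); do
      echo "$(date '+%F %T') $(grep VmRSS /proc/$pid/status)"
      sleep 60
  done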

  [Regression]
  Restarting the daemon after upgrading this package will be needed. This patch
  adds a free for dynamically allocated memory that was never freed, so it
  resolves the memory leak.

  [Others]

  This patch was written and tested by myself.

  Please review it carefully to confirm it is correct.

  [1] https://bugs.launchpad.net/ubuntu/+source/glib2.0/+bug/1750741

  [Original Description]

  I'm running Pacemaker 1.1.10+git20130802-1ubuntu1 on Ubuntu Saucy
  (13.10) and have encountered a memory leak in lrmd.

  The details of the bug are covered here in this thread
  (http://oss.clusterlabs.org/pipermail/pacemaker/2014-May/021689.html)
  but to summarise, the Pacemaker developers believe the leak is caused
  by the g_dbus API, the use of which was removed in Pacemaker 1.11.

  I've also attached the Valgrind output from the run that exposed the
  issue.

  Given that this issue affects production stability (a periodic restart
  of Pacemaker is required), will a version of 1.11 be released for
  Trusty? (I'm happy to upgrade the OS to Trusty to get it).

  If not, can you advise which version of the OS will be the first to
  take 1.11 please?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1316970/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1316970] Re: g_dbus memory leak in lrmd

2018-04-05 Thread Łukasz Zemczak
Hello Greg, or anyone else affected,

Accepted pacemaker into trusty-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/pacemaker/1.1.10+git20130802-1ubuntu2.5
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-trusty to verification-done-trusty. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-trusty. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: pacemaker (Ubuntu Trusty)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-trusty

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1316970

Title:
  g_dbus memory leak in lrmd

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Trusty:
  Fix Committed

Bug description:
  [Impact]
  The lrmd daemon has a memory leak in Trusty when managing an upstart resource.

  Affects pacemaker 1.1.10.
  Also affects glib2.0 2.40.2-0ubuntu1; a new Launchpad bug [1] was created for glib2.0.

  Please note that the patch for pacemaker was created by myself.

  [Test Case]

  https://pastebin.ubuntu.com/p/fqK6Cx3SKK/
  You can check for the memory leak with this script.

  [Regression]
  Restarting the daemon after upgrading this package will be needed. This patch
  adds a free for dynamically allocated memory that was never freed, so it
  resolves the memory leak.

  [Others]

  This patch was written and tested by myself.

  Please review it carefully to confirm it is correct.

  [1] https://bugs.launchpad.net/ubuntu/+source/glib2.0/+bug/1750741

  [Original Description]

  I'm running Pacemaker 1.1.10+git20130802-1ubuntu1 on Ubuntu Saucy
  (13.10) and have encountered a memory leak in lrmd.

  The details of the bug are covered here in this thread
  (http://oss.clusterlabs.org/pipermail/pacemaker/2014-May/021689.html)
  but to summarise, the Pacemaker developers believe the leak is caused
  by the g_dbus API, the use of which was removed in Pacemaker 1.11.

  I've also attached the Valgrind output from the run that exposed the
  issue.

  Given that this issue affects production stability (a periodic restart
  of Pacemaker is required), will a version of 1.11 be released for
  Trusty? (I'm happy to upgrade the OS to Trusty to get it).

  If not, can you advise which version of the OS will be the first to
  take 1.11 please?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1316970/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp


[Ubuntu-ha] [Bug 1740892] Update Released

2018-03-05 Thread Łukasz Zemczak
The verification of the Stable Release Update for corosync has completed
successfully and the package has now been released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm:
  Invalid
Status in corosync package in Ubuntu:
  Fix Released
Status in pacemaker package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  Won't Fix
Status in pacemaker source package in Trusty:
  Won't Fix
Status in corosync source package in Xenial:
  Fix Released
Status in pacemaker source package in Xenial:
  Fix Released
Status in corosync source package in Artful:
  Fix Released
Status in pacemaker source package in Artful:
  Fix Released
Status in corosync source package in Bionic:
  Fix Released
Status in corosync package in Debian:
  New

Bug description:
  [Impact]

  When corosync and pacemaker are both installed, a corosync upgrade
  causes pacemaker to fail. pacemaker will need to be restarted manually
  to work again; it won't recover by itself.

  [Test Case]

  1) Have corosync (< 2.3.5-3ubuntu2) and pacemaker (< 1.1.14-2ubuntu1.3) 
installed
  2) Make sure corosync & pacemaker are running via systemctl status cmd.
  3) Upgrade corosync
  4) Check corosync and pacemaker via the systemctl status cmd again.

  You will notice pacemaker is dead (inactive) and doesn't recover,
  unless a systemctl start pacemaker is done manually.
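  Expressed as commands (a sketch of the steps above; the exact versions come
  from -proposed when verifying the SRU):

  # 1+2) both services are active before the upgrade
  systemctl is-active corosync pacemaker

  # 3) upgrade only corosync
  sudo apt-get update
  sudo apt-get install --only-upgrade corosync

  # 4) on unfixed packages pacemaker is now inactive and stays down
  systemctl is-active corosync pacemaker
  sudo systemctl start pacemaker     # manual recovery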

  [Regression Potential]

  Regression potential is low; the change doesn't touch corosync/pacemaker core
  functionality. This patch makes sure things go smoother at the
  packaging level during a corosync upgrade where pacemaker is
  installed/involved.

  This is also useful in particular where the system has
  "unattended-upgrades" enabled (software upgrades without
  supervision) and no sysadmin is available to start pacemaker manually,
  because this isn't a scheduled maintenance.

  For the symbol tag change to (optional) in Artful, please refer
  to comment #60 from slangasek.

  For the asctime change in Artful, please refer to comment #51
  and comment #52.

  Note that both Artful changes in pacemaker above are only necessary
  for the package to build (even as-is, without this patch). They aren't
  a requirement for the patch to work, but for the source package to build.
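  As a very rough sketch of the packaging-level idea (hypothetical, for
  illustration only, not the actual maintainer-script change that was
  uploaded), corosync's postinst could bring pacemaker back up once its own
  upgrade has finished:

  # Hypothetical postinst fragment.
  case "$1" in
      configure)
          if systemctl is-enabled --quiet pacemaker 2>/dev/null && \
             ! systemctl is-active --quiet pacemaker; then
              systemctl start pacemaker || true
          fi
          ;;
  esac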

  [Other Info]

  XENIAL Merge-proposal:
  
https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336338
  
https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336339

  [Original Description]

  During upgrades on 2018-01-02, corosync and its libs were upgraded:

  (from a trusty/mitaka cloud)

  Upgrade: libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcfg6:amd64
  (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcpg4:amd64 (2.3.3-1ubuntu3,
  2.3.3-1ubuntu4), libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libvotequorum6:amd64
  (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libtotem-pg5:amd64 (2.3.3-1ubuntu3,
  2.3.3-1ubuntu4)

  During this process, it appears that the pacemaker service is restarted
  and errors out:

  syslog:Jan  2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
crm_update_peer_state: pcmk_quorum_notification: Node 
juju-machine-1-lxc-3[1001] - state is now lost (was member)
  syslog:Jan  2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
crm_update_peer_state: pcmk_quorum_notification: Node 
juju-machine-1-lxc-3[1001] - state is now member (was lost)
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: 
cfg_connection_destroy: Connection destroyed
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
pcmk_shutdown_worker: Shuting down Pacemaker
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
stop_child: Stopping crmd: Sent -15 to process 2050
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: 
pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: 
mcp_cpg_destroy: Connection destroyed

  Also affected xenial/ocata

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-hacluster/+bug/1740892/+subscriptions

___
Mailing list: 

[Ubuntu-ha] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail

2018-02-26 Thread Łukasz Zemczak
Hello Drew, or anyone else affected,

Accepted pacemaker into artful-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/pacemaker/1.1.17+really1.1.16-1ubuntu2
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-artful to verification-done-artful. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-artful. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: pacemaker (Ubuntu Artful)
   Status: In Progress => Fix Committed

** Changed in: corosync (Ubuntu Xenial)
   Status: In Progress => Fix Committed

** Tags added: verification-needed-xenial

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm:
  Invalid
Status in corosync package in Ubuntu:
  Fix Released
Status in pacemaker package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  Won't Fix
Status in pacemaker source package in Trusty:
  Won't Fix
Status in corosync source package in Xenial:
  Fix Committed
Status in pacemaker source package in Xenial:
  Fix Committed
Status in corosync source package in Artful:
  Fix Committed
Status in pacemaker source package in Artful:
  Fix Committed
Status in corosync source package in Bionic:
  Fix Released
Status in corosync package in Debian:
  New

Bug description:
  [Impact]

  When corosync and pacemaker are both installed, a corosync upgrade
  causes pacemaker to fail. pacemaker will need to be restarted manually
  to work again; it won't recover by itself.

  [Test Case]

  1) Have corosync (< 2.3.5-3ubuntu2) and pacemaker (< 1.1.14-2ubuntu1.3) 
installed
  2) Make sure corosync & pacemaker are running via systemctl status cmd.
  3) Upgrade corosync
  4) Check corosync and pacemaker via the systemctl status cmd again.

  You will notice pacemaker is dead (inactive) and doesn't recover,
  unless a systemctl start pacemaker is done manually.

  [Regression Potential]

  Regression potential is low; the change doesn't touch corosync/pacemaker core
  functionality. This patch makes sure things go smoother at the
  packaging level during a corosync upgrade where pacemaker is
  installed/involved.

  This is also useful in particular where the system has
  "unattended-upgrades" enabled (software upgrades without
  supervision) and no sysadmin is available to start pacemaker manually,
  because this isn't a scheduled maintenance.

  For the symbol tag change to (optional) in Artful, please refer
  to comment #60 from slangasek.

  For the asctime change in Artful, please refer to comment #51
  and comment #52.

  Note that both Artful changes in pacemaker above are only necessary
  for the package to build (even as-is, without this patch). They aren't
  a requirement for the patch to work, but for the source package to build.

  [Other Info]

  XENIAL Merge-proposal:
  
https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336338
  
https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336339

  [Original Description]

  During upgrades on 2018-01-02, corosync and its libs were upgraded:

  (from a trusty/mitaka cloud)

  Upgrade: libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcfg6:amd64
  (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcpg4:amd64 (2.3.3-1ubuntu3,
  2.3.3-1ubuntu4), libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libvotequorum6:amd64
  (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libtotem-pg5:amd64 (2.3.3-1ubuntu3,
  2.3.3-1ubuntu4)

  During this process, it appears that the pacemaker service is restarted
  and errors out:

  syslog:Jan  2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
crm_update_peer_state: pcmk_quorum_notification: Node 
juju-machine-1-lxc-3[1001] - state is now lost (was member)
  syslog:Jan  2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
crm_update_peer_state: pcmk_quorum_notification: Node 
juju-machine-1-lxc-3[1001] - state is now member (was lost)
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 

[Ubuntu-ha] [Bug 1740892] Re: corosync upgrade on 2018-01-02 caused pacemaker to fail

2018-02-26 Thread Łukasz Zemczak
Hello Drew, or anyone else affected,

Accepted corosync into artful-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/corosync/2.4.2-3ubuntu0.17.10.1 in
a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and change the tag from
verification-needed-artful to verification-done-artful. If it does not
fix the bug for you, please add a comment stating that, and change the
tag to verification-failed-artful. In either case, without details of
your testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: corosync (Ubuntu Artful)
   Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-artful

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail

Status in OpenStack hacluster charm:
  Invalid
Status in corosync package in Ubuntu:
  Fix Released
Status in pacemaker package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  Won't Fix
Status in pacemaker source package in Trusty:
  Won't Fix
Status in corosync source package in Xenial:
  In Progress
Status in pacemaker source package in Xenial:
  In Progress
Status in corosync source package in Artful:
  Fix Committed
Status in pacemaker source package in Artful:
  In Progress
Status in corosync source package in Bionic:
  Fix Released
Status in corosync package in Debian:
  New

Bug description:
  [Impact]

  When corosync and pacemaker are both installed, a corosync upgrade
  causes pacemaker to fail. pacemaker will need to be restarted manually
  to work again; it won't recover by itself.

  [Test Case]

  1) Have corosync (< 2.3.5-3ubuntu2) and pacemaker (< 1.1.14-2ubuntu1.3) 
installed
  2) Make sure corosync & pacemaker are running via systemctl status cmd.
  3) Upgrade corosync
  4) Check corosync and pacemaker via the systemctl status cmd again.

  You will notice pacemaker is dead (inactive) and doesn't recover,
  unless a systemctl start pacemaker is done manually.

  [Regression Potential]

  Regression potential is low; the change doesn't touch corosync/pacemaker core
  functionality. This patch makes sure things go smoother at the
  packaging level during a corosync upgrade where pacemaker is
  installed/involved.

  This is also useful in particular where the system has
  "unattended-upgrades" enabled (software upgrades without
  supervision) and no sysadmin is available to start pacemaker manually,
  because this isn't a scheduled maintenance.

  For the symbol tag change to (optional) in Artful, please refer
  to comment #60 from slangasek.

  For the asctime change in Artful, please refer to comment #51
  and comment #52.

  Note that both Artful changes in pacemaker above are only necessary
  for the package to build (even as-is, without this patch). They aren't
  a requirement for the patch to work, but for the source package to build.

  [Other Info]

  XENIAL Merge-proposal:
  
https://code.launchpad.net/~nacc/ubuntu/+source/corosync/+git/corosync/+merge/336338
  
https://code.launchpad.net/~nacc/ubuntu/+source/pacemaker/+git/pacemaker/+merge/336339

  [Original Description]

  During upgrades on 2018-01-02, corosync and its libs were upgraded:

  (from a trusty/mitaka cloud)

  Upgrade: libcmap4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  corosync:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcfg6:amd64
  (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libcpg4:amd64 (2.3.3-1ubuntu3,
  2.3.3-1ubuntu4), libquorum5:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  libcorosync-common4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4),
  libsam4:amd64 (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libvotequorum6:amd64
  (2.3.3-1ubuntu3, 2.3.3-1ubuntu4), libtotem-pg5:amd64 (2.3.3-1ubuntu3,
  2.3.3-1ubuntu4)

  During this process, it appears that the pacemaker service is restarted
  and errors out:

  syslog:Jan  2 16:09:33 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
crm_update_peer_state: pcmk_quorum_notification: Node 
juju-machine-1-lxc-3[1001] - state is now lost (was member)
  syslog:Jan  2 16:09:34 juju-machine-0-lxc-4 pacemakerd[1994]:   notice: 
crm_update_peer_state: pcmk_quorum_notification: Node 
juju-machine-1-lxc-3[1001] - state is now member (was lost)
  syslog:Jan  2 16:14:32 juju-machine-0-lxc-4 pacemakerd[1994]:error: 
cfg_connection_destroy: Connection destroyed
  syslog:Jan  2 

[Ubuntu-ha] [Bug 1739033] Update Released

2018-01-02 Thread Łukasz Zemczak
The verification of the Stable Release Update for corosync has completed
successfully and the package has now been released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1739033

Title:
  Corosync: Assertion 'sender_node != NULL' failed when bind iface is
  ready after corosync boots

Status in corosync package in Ubuntu:
  Fix Released
Status in corosync source package in Trusty:
  Fix Committed
Status in corosync source package in Xenial:
  Fix Released
Status in corosync source package in Zesty:
  Fix Released
Status in corosync source package in Artful:
  Fix Released

Bug description:
  [Impact]

  Corosync aborts with SIGABRT if it starts before the interface it has to bind
  to is ready.

  On boot, if no interface in the bindnetaddr range is up/configured,
  corosync binds to lo (127.0.0.1). Once an applicable interface is up,
  corosync crashes with the following error message:

  corosync: votequorum.c:2019: message_handler_req_exec_votequorum_nodeinfo: 
Assertion `sender_node != NULL' failed.
  Aborted (core dumped)

  The last log entries show that the interface is trying to join the
  cluster:

  Dec 19 11:36:05 [22167] xenial-pacemaker corosync debug   [TOTEM ] 
totemsrp.c:2089 entering OPERATIONAL state.
  Dec 19 11:36:05 [22167] xenial-pacemaker corosync notice  [TOTEM ] 
totemsrp.c:2095 A new membership (169.254.241.10:444) was formed. Members 
joined: 704573706

  During the quorum calculation, the generated nodeid (704573706) for
  the node is being used instead of the nodeid specified in the
  configuration file (1), and the assert fails because the nodeid is not
  present in the member list. Corosync should use the correct nodeid and
  continue running after the interface is up, as shown in a fixed
  corosync boot:

  Dec 19 11:50:56 [4824] xenial-corosync corosync notice  [TOTEM ]
  totemsrp.c:2095 A new membership (169.254.241.10:80) was formed.
  Members joined: 1

  [Environment]

  Xenial 16.04.3

  Packages:

  ii  corosync                   2.3.5-3ubuntu1  amd64  cluster engine daemon and utilities
  ii  libcorosync-common4:amd64  2.3.5-3ubuntu1  amd64  cluster engine common library

  [Test Case]

  Config:

  totem {
      version: 2
      member {
          memberaddr: 169.254.241.10
      }
      member {
          memberaddr: 169.254.241.20
      }
      transport: udpu

      crypto_cipher: none
      crypto_hash: none
      nodeid: 1
      interface {
          ringnumber: 0
          bindnetaddr: 169.254.241.0
          mcastport: 5405
          ttl: 1
      }
  }

  quorum {
      provider: corosync_votequorum
      expected_votes: 2
  }

  nodelist {
      node {
          ring0_addr: 169.254.241.10
          nodeid: 1
      }
      node {
          ring0_addr: 169.254.241.20
          nodeid: 2
      }
  }

  1. ifdown interface (169.254.241.10)
  2. start corosync (/usr/sbin/corosync -f)
  3. ifup interface
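  A command-level sketch of the reproduction (the interface name eth1 is a
  placeholder for whichever interface carries 169.254.241.10):

  # 1) take the bind interface down so corosync falls back to 127.0.0.1
  sudo ifdown eth1

  # 2) start corosync in the foreground with the config above
  sudo /usr/sbin/corosync -f &

  # 3) bring the interface back up; an unfixed corosync aborts with
  #    "Assertion 'sender_node != NULL' failed"
  sudo ifup eth1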

  [Regression Potential]

  This patch affects corosync boot; the regression potential is for
  other problems during corosync startup and/or configuration parsing.

  [Other info]

  # Upstream corosync commit :
  
https://github.com/corosync/corosync/commit/aab55a004bb12ebe78db341dc56759dfe710c1b2

  # git describe aab55a004bb12ebe78db341dc56759dfe710c1b2
  v2.3.5-45-gaab55a0

  # rmadison corosync
  corosync | 2.3.3-1ubuntu1   | trusty  | source, amd64, arm64, armhf, 
i386, powerpc, ppc64el
  corosync | 2.3.3-1ubuntu3   | trusty-updates  | source, amd64, arm64, armhf, 
i386, powerpc, ppc64el
  corosync | 2.3.5-3ubuntu1   | xenial  | source, amd64, arm64, armhf, 
i386, powerpc, ppc64el, s390x
  corosync | 2.4.2-3build1| zesty   | source, amd64, arm64, armhf, 
i386, ppc64el, s390x
  corosync | 2.4.2-3build1| artful  | source, amd64, arm64, armhf, 
i386, ppc64el, s390x
  corosync | 2.4.2-3build1| bionic  | source, amd64, arm64, armhf, 
i386, ppc64el, s390x

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1739033/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp