[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-08-27 Thread Jorge Niedbalski
Hello,

I've verified that this problem doesn't reproduce with the package
contained in -proposed.

1) Deployed this bionic-queens bundle.

2) Upgraded to the following versions:


root@juju-51d6ad-1751923-6:/home/ubuntu# dpkg -l | grep nova
ii  nova-api-os-compute  2:17.0.13-0ubuntu3  all  OpenStack Compute - OpenStack Compute API frontend
ii  nova-common          2:17.0.13-0ubuntu3  all  OpenStack Compute - common files
ii  nova-conductor       2:17.0.13-0ubuntu3  all  OpenStack Compute - conductor service
ii  nova-placement-api   2:17.0.13-0ubuntu3  all  OpenStack Compute - placement API frontend
ii  nova-scheduler       2:17.0.13-0ubuntu3  all  OpenStack Compute - virtual machine scheduler
ii  python-nova          2:17.0.13-0ubuntu3  all  OpenStack Compute Python libraries


root@juju-51d6ad-1751923-7:/home/ubuntu# dpkg -l | grep nova
ii  nova-api-metadata     2:17.0.13-0ubuntu3  all  OpenStack Compute - metadata API frontend
ii  nova-common           2:17.0.13-0ubuntu3  all  OpenStack Compute - common files
ii  nova-compute          2:17.0.13-0ubuntu3  all  OpenStack Compute - compute node base
ii  nova-compute-kvm      2:17.0.13-0ubuntu3  all  OpenStack Compute - compute node (KVM)
ii  nova-compute-libvirt  2:17.0.13-0ubuntu3  all  OpenStack Compute - compute node libvirt support
ii  python-nova           2:17.0.13-0ubuntu3  all  OpenStack Compute Python libraries
ii  python-novaclient     2:9.1.1-0ubuntu1    all  client library for OpenStack Compute API - Python 2.7
ii  python3-novaclient    2:9.1.1-0ubuntu1    all  client library for OpenStack Compute API - 3.x


root@juju-51d6ad-1751923-6:/home/ubuntu# systemctl status nova*|grep -i active
   Active: active (running) since Fri 2021-08-27 22:02:25 UTC; 1h 7min ago
   Active: active (running) since Fri 2021-08-27 22:02:12 UTC; 1h 8min ago
   Active: active (running) since Fri 2021-08-27 22:02:25 UTC; 1h 7min ago


3) Created a server with 4 private ports, 1 public one.

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ openstack server list
+--------------------------------------+---------------+--------+--------------------------------------------------------------------------------+--------+-----------+
| ID                                   | Name          | Status | Networks                                                                       | Image  | Flavor    |
+--------------------------------------+---------------+--------+--------------------------------------------------------------------------------+--------+-----------+
| 5843e6b5-e1a7-4208-9f19-1d051c032afb | cirros-232302 | ACTIVE | private=192.168.21.22, 192.168.21.6, 192.168.21.10, 192.168.21.13, 10.5.150.1  | cirros | m1.cirros |
+--------------------------------------+---------------+--------+--------------------------------------------------------------------------------+--------+-----------+

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ nova interface-list 5843e6b5-e1a7-4208-9f19-1d051c032afb
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
| Port State | Port ID                              | Net ID                               | IP addresses  | MAC Addr          |
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+
| ACTIVE     | 1680b164-14d7-4d6e-b085-94292ece82cf | 8d91e266-0925-4c29-8039-0d71862df4fc | 192.168.21.13 | fa:16:3e:cf:f8:c8 |
| ACTIVE     | 5865a40e-36fa-4cf9-bd40-85a1e78031f5 | 8d91e266-0925-4c29-8039-0d71862df4fc | 192.168.21.6  | fa:16:3e:eb:73:b1 |
| ACTIVE     | 5f400107-d9eb-4a1b-a37b-3bd034d8f995 | 8d91e266-0925-4c29-8039-0d71862df4fc | 192.168.21.10 | fa:16:3e:95:9a:78 |
| ACTIVE     | b11d1c8e-d42a-41e0-a7ad-e34a7a93d020 | 8d91e266-0925-4c29-8039-0d71862df4fc | 192.168.21.22 | fa:16:3e:a3:45:45 |
+------------+--------------------------------------+--------------------------------------+---------------+-------------------+

4) I can see the 4 tap devices.

root@juju-51d6ad-1751923-7:/home/ubuntu# virsh dumpxml instance-0001 | grep -i tap
  [four <target dev='tap...'/> entries, one per port; the XML tags were stripped by the list archive]

5) I modified the instance_info_caches row, removing one of the interfaces.

Database changed
mysql>  update instance_info_caches set 
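(The UPDATE statement above was truncated by the list archive. For
illustration only, a minimal sketch of the kind of statement involved; the
JSON value is hypothetical, the real one being the cached VIF list with one
entry removed:)

mysql> -- instance_info_caches.network_info holds the serialized VIF list
mysql> UPDATE instance_info_caches
    ->   SET network_info = '[ ...the three remaining VIF entries... ]'
    ->   WHERE instance_uuid = '5843e6b5-e1a7-4208-9f19-1d051c032afb';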

[Bug 1879798] Re: designate-manage pool update doesn't reflects targets master dns servers into zones.

2021-08-25 Thread Jorge Niedbalski
I see the change landed in the prior releases (Stein/Train/Ussuri):

https://github.com/openstack/designate/commit/b967e9f706373f1aad6db882c2295fbbe1fadfc9
https://github.com/openstack/designate/commit/953492904772933f5f8e265d1ae6cc1e6385fcc6
https://github.com/openstack/designate/commit/0b5634643b4b69cd0a7d5499f258602604741d22

Can this be backported into the cloud-archive releases?

Thanks

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  designate-manage pool update doesn't reflects targets master dns
  servers into zones.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-07-28 Thread Jorge Niedbalski
I am in the process of verifying the bionic/rocky/queens releases.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1751923

Title:
  [SRU]_heal_instance_info_cache periodic task bases on port list from
  nova db, not from neutron server

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1751923/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1927868] Re: vRouter not working after update to 16.3.1

2021-06-24 Thread Jorge Niedbalski
Hello,

I reviewed the code path and the upgrade in my reproducer. Upgrading
neutron-gateway first and neutron-api afterwards doesn't work: a mismatch
in the migrations/RPC versions causes the HA port to fail to be
created/updated, so the keepalived process cannot be spawned, and finally
the state-change-monitor fails to find the PID for that keepalived process.

If I upgrade neutron-api, run the migrations to head and then upgrade
the gateways, all seems correct.
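A minimal sketch of that working order on a plain package-based install
(the package names, config paths and neutron-db-manage invocation are my
assumptions here; a charmed deployment drives all of this via the charms):

# 1) Upgrade the neutron-api node(s) and run the migrations to head
sudo apt-get install --only-upgrade neutron-server neutron-common
sudo neutron-db-manage --config-file /etc/neutron/neutron.conf \
    --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade heads
sudo systemctl restart neutron-server

# 2) Only then upgrade and restart the gateway nodes
sudo apt-get install --only-upgrade neutron-l3-agent neutron-common
sudo systemctl restart neutron-l3-agent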

I upgraded from the following versions:

root@juju-da864d-1927868-5:/home/ubuntu# dpkg -l |grep keepalived
ii  keepalived      1:1.3.9-1ubuntu0.18.04.2  amd64  Failover and monitoring daemon for LVS clusters

root@juju-da864d-1927868-5:/home/ubuntu# dpkg -l |grep neutron-common
ii  neutron-common  2:15.3.3-0ubuntu1~cloud0  all    Neutron is a virtual network service for Openstack - common

--> To

root@juju-da864d-1927868-5:/home/ubuntu# dpkg -l |grep neutron-common
ii  neutron-common  2:16.3.2-0ubuntu3~cloud0  all    Neutron is a virtual network service for Openstack - common


I created a router with HA enabled, as follows:

$ openstack router list
+--------------------------------------+-----------------+--------+-------+----------------------------------+-------------+------+
| ID                                   | Name            | Status | State | Project                          | Distributed | HA   |
+--------------------------------------+-----------------+--------+-------+----------------------------------+-------------+------+
| 09fa811f-410c-4360-8cae-687e7e73ff21 | provider-router | ACTIVE | UP    | 6f5aaf5130764305a5d37862e3ff18ce | False       | True |
+--------------------------------------+-----------------+--------+-------+----------------------------------+-------------+------+


===> Prior to the upgrade I can list the keepalived processes linked to the HA router:

root     22999  0.0  0.0  91816  3052 ?  Ss  19:17  0:00 keepalived -P -f /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21/keepalived.conf -p /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived -r /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived-vrrp -D

root     23001  0.0  0.1  92084  4088 ?  S   19:17  0:00 keepalived -P -f /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21/keepalived.conf -p /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived -r /var/lib/neutron/ha_confs/09fa811f-410c-4360-8cae-687e7e73ff21.pid.keepalived-vrrp -D


===> After upgrading, None is returned; in fact, the keepalived processes
aren't spawned after neutron-* is upgraded.

Pre-upgrade:
Jun 24 19:17:07 juju-da864d-1927868-5 Keepalived[22997]: Starting Keepalived v1.3.9 (10/21,2017)
Jun 24 19:17:07 juju-da864d-1927868-5 Keepalived[22999]: Starting VRRP child process, pid=23001

Post-upgrade -- not started:

Jun 24 19:30:41 juju-da864d-1927868-5 Keepalived[22999]: Stopping
Jun 24 19:30:42 juju-da864d-1927868-5 Keepalived_vrrp[23001]: Stopped
Jun 24 19:30:42 juju-da864d-1927868-5 Keepalived[22999]: Stopped Keepalived v1.3.9 (10/21,2017)

The reason those keepalived processes are not re-spawned:

1) The ml2 agent starts the router devices by issuing an RPC call for the
device details. This call fails when the oslo target versions differ.

Therefore the neutron-api migrations must be applied before the gateways
are upgraded.

9819:2021-06-24 19:31:09.935 31744 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] Starting to process devices in:{'current': {'87cfdd45-fea7-4c06-aa13-174cb71b294f', 'b8e18ba0-c65b-498e-9a8b-34c0fcc42d07', '926b7377-30f4-4b2c-9064-8aab3918a385'}, 'added': {'87cfdd45-fea7-4c06-aa13-174cb71b294f'}, 'removed': set(), 'updated': set(), 're_added': set()} rpc_loop /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:2685

9821:2021-06-24 19:31:10.028 31744 ERROR neutron.agent.rpc [req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] Failed to get details for device 87cfdd45-fea7-4c06-aa13-174cb71b294f: oslo_messaging.rpc.client.RemoteError: Remote error: InvalidTargetVersion Invalid target version 1.1

9869:2021-06-24 19:31:10.510 31744 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-14f31407-6342-4f71-98b8-4437e166dbaa - - - - -] retrying failed devices {'87cfdd45-fea7-4c06-aa13-174cb71b294f'} _update_port_info_failed_devices_stats /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:1674

2) Then the L3 HA router creation mechanism can't process the HA router
because the HA port id

[Bug 1879798] Re: designate-manage pool update doesn't reflects targets master dns servers into zones.

2021-06-09 Thread Jorge Niedbalski
Master/Train/Ussuri/Stein are fixed upstream:
https://review.opendev.org/q/topic:%22bug%252F1879798%22+(status:open%20OR%20status:merged)

This needs backports for the UCA (Ubuntu Cloud Archive).

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/wallaby
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/victoria
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/xena
   Importance: Undecided
   Status: New

** Changed in: cloud-archive/xena
   Status: New => Fix Released

** Changed in: cloud-archive/wallaby
   Status: New => Fix Released

** Changed in: cloud-archive/victoria
   Status: New => Fix Committed

** Changed in: cloud-archive/ussuri
   Status: New => Fix Committed

** Also affects: cloud-archive/stein
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/train
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  designate-manage pool update doesn't reflects targets master dns
  servers into zones.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-06-01 Thread Jorge Niedbalski
@corey is there anything specific you need from my end to get this SRU reviewed?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1751923

Title:
  [SRU]_heal_instance_info_cache periodic task bases on port list from
  nova db, not from neutron server

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1751923/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-05-17 Thread Jorge Niedbalski
** Description changed:

  [Impact]
  
  * During the periodic task _heal_instance_info_cache the instance_info_caches are not updated using instance port_ids taken from neutron, but from the nova db.
  * This causes existing VMs to lose their network interfaces after reboot.
  
  [Test Plan]
  
  * This bug is reproducible on Bionic/Queens clouds.
  
  1) Deploy the following Juju bundle: https://paste.ubuntu.com/p/HgsqZfsDGh/
- 2) Run the following script: https://paste.ubuntu.com/p/DrFcDXZGSt/
+ 2) Run the following script: https://paste.ubuntu.com/p/c4VDkqyR2z/
  3) If the script finishes with "Port not found", the bug is still present.
  
  [Where problems could occur]
  
  ** No specific regression potential has been identified.
  ** Check the other info section ***
- 
  
  [Other Info]
  
  How does it look now?
  =
  
  _heal_instance_info_cache during crontask:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/compute/manager.py#L6525
  
  is using network_api to get instance_nw_info (instance_info_caches):
  
              try:
                  # Call to network API to get instance info.. this will
                  # force an update to the instance's info_cache
                  self.network_api.get_instance_nw_info(context, instance)
  
  self.network_api.get_instance_nw_info() is listed below:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L1377
  
  and it uses _build_network_info_model() without networks and port_ids
  parameters (because we're not adding any new interface to instance):
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L2356
  
  Next: _gather_port_ids_and_networks() generates the list of instance
  networks and port_ids:
  
        networks, port_ids = self._gather_port_ids_and_networks(
                  context, instance, networks, port_ids, client)
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L2389-L2390
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L1393
  
  As we can see, _gather_port_ids_and_networks() takes the port list from
  the DB:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/objects/instance.py#L1173-L1176
  
  And that's it. When we lose a port it's not possible to add it again with this periodic task.
  The only way is to clean the device_id field in the neutron port object and re-attach the interface using `nova interface-attach`.
  
  When the interface is missing and there is no port configured on the compute host (for example after a compute reboot), the interface is not added to the instance and, from neutron's point of view, the port state is DOWN.
  
  When the interface is missing in the cache and we hard-reboot the instance, it's not added as a tap interface in the XML file, so we don't have the network on the host.
  
  Steps to reproduce
  ==
  1. Spawn devstack
  2. Spawn VM inside devstack with multiple ports (for example also from 2 
different networks)
  3. Update the DB row, drop one interface from interfaces_list
  4. Hard-Reboot the instance
  5. See that nova list shows the instance without one address, but nova interface-list shows all addresses
  https://launchpad.net/~niedbalski/+archive/ubuntu/lp1751923/+packages
  6. See that one port is missing in the instance xml files
  7. In theory the _heal_instance_info_cache task should fix these things, but it relies on memory, not on a fresh list of instance ports taken from neutron.
  
  Reproduced Example
  ==
  1. Spawn VM with 1 private network port
  nova boot --flavor m1.small --image cirros-0.3.5-x86_64-disk --nic 
net-name=private  test-2
  2. Attach ports to have 2 private and 2 public interfaces
  nova list:
  | a64ed18d-9868-4bf0-90d3-d710d278922d | test-2 | ACTIVE | -  | 
Running | public=2001:db8::e, 172.24.4.15, 2001:db8::c, 172.24.4.16; 
private=fdda:5d77:e18e:0:f816:3eff:fee8:, 10.0.0.3, 
fdda:5d77:e18e:0:f816:3eff:fe53:231c, 10.0.0.5 |
  
  So we see 4 ports:
  stack@mjozefcz-devstack-ptg:~$ nova interface-list 
a64ed18d-9868-4bf0-90d3-d710d278922d
  
++--+--+---+---+
  | Port State | Port ID  | Net ID  
 | 

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-05-17 Thread Jorge Niedbalski
** Patch added: "lp1751923_bionic.debdiff"
   
https://bugs.launchpad.net/nova/+bug/1751923/+attachment/5498309/+files/lp1751923_bionic.debdiff

** Description changed:

  [Impact]
  
  * During the periodic task _heal_instance_info_cache the instance_info_caches are not updated using instance port_ids taken from neutron, but from the nova db.
  * This causes existing VMs to lose their network interfaces after reboot.
  
  [Test Plan]
  
  * This bug is reproducible on Bionic/Queens clouds.
  
  1) Deploy the following Juju bundle: https://paste.ubuntu.com/p/HgsqZfsDGh/
  2) Run the following script: https://paste.ubuntu.com/p/DrFcDXZGSt/
  3) If the script finishes with "Port not found", the bug is still present.
  
  [Where problems could occur]
  
+ ** No specific regression potential has been identified.
  ** Check the other info section ***
  
  
  [Other Info]
  
  How does it look now?
  =
  
  _heal_instance_info_cache during crontask:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/compute/manager.py#L6525
  
  is using network_api to get instance_nw_info (instance_info_caches):
  
              try:
                  # Call to network API to get instance info.. this will
                  # force an update to the instance's info_cache
                  self.network_api.get_instance_nw_info(context, instance)
  
  self.network_api.get_instance_nw_info() is listed below:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L1377
  
  and it uses _build_network_info_model() without networks and port_ids
  parameters (because we're not adding any new interface to instance):
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L2356
  
  Next: _gather_port_ids_and_networks() generates the list of instance
  networks and port_ids:
  
        networks, port_ids = self._gather_port_ids_and_networks(
                  context, instance, networks, port_ids, client)
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L2389-L2390
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L1393
  
  As we can see, _gather_port_ids_and_networks() takes the port list from
  the DB:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/objects/instance.py#L1173-L1176
  
  And that's it. When we lose a port it's not possible to add it again with this periodic task.
  The only way is to clean the device_id field in the neutron port object and re-attach the interface using `nova interface-attach`.
  
  When the interface is missing and there is no port configured on the compute host (for example after a compute reboot), the interface is not added to the instance and, from neutron's point of view, the port state is DOWN.
  
  When the interface is missing in the cache and we hard-reboot the instance, it's not added as a tap interface in the XML file, so we don't have the network on the host.
  
  Steps to reproduce
  ==
  1. Spawn devstack
  2. Spawn VM inside devstack with multiple ports (for example also from 2 
different networks)
  3. Update the DB row, drop one interface from interfaces_list
  4. Hard-Reboot the instance
- 5. See that nova list shows instance without one address, but nova interface-list shows all addresses
+ 5. See that nova list shows instance without one address, but nova interface-list shows all addresses
+ https://launchpad.net/~niedbalski/+archive/ubuntu/lp1751923/+packages
  6. See that one port is missing in instance xml files
  7. In theory the _heal_instance_info_cache task should fix these things, but it relies on memory, not on a fresh list of instance ports taken from neutron.
  
  Reproduced Example
  ==
  1. Spawn VM with 1 private network port
  nova boot --flavor m1.small --image cirros-0.3.5-x86_64-disk --nic 
net-name=private  test-2
  2. Attach ports to have 2 private and 2 public interfaces
  nova list:
  | a64ed18d-9868-4bf0-90d3-d710d278922d | test-2 | ACTIVE | -  | 
Running | public=2001:db8::e, 172.24.4.15, 2001:db8::c, 172.24.4.16; 
private=fdda:5d77:e18e:0:f816:3eff:fee8:, 10.0.0.3, 
fdda:5d77:e18e:0:f816:3eff:fe53:231c, 10.0.0.5 |
  
  So we see 4 ports:
  stack@mjozefcz-devstack-ptg:~$ nova interface-list 
a64ed18d-9868-4bf0-90d3-d710d278922d
  

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-05-17 Thread Jorge Niedbalski
Hello,

I've prepared a PPA for testing the proposed patch on B/Queens
https://launchpad.net/~niedbalski/+archive/ubuntu/lp1751923/+packages

Attached is the debdiff for bionic.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1751923

Title:
  [SRU]_heal_instance_info_cache periodic task bases on port list from
  nova db, not from neutron server

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1751923/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-05-17 Thread Jorge Niedbalski
** Changed in: cloud-archive/rocky
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1751923

Title:
  [SRU]_heal_instance_info_cache periodic task bases on port list from
  nova db, not from neutron server

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1751923/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1751923] Re: [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-05-15 Thread Jorge Niedbalski
** Description changed:

- Description
- ===
+ [Impact]
  
- During periodic task _heal_instance_info_cache the instance_info_caches
- are not updated using instance port_ids taken from neutron, but from
- nova db.
+ * During the periodic task _heal_instance_info_cache the instance_info_caches are not updated using instance port_ids taken from neutron, but from the nova db.
+ * This causes existing VMs to lose their network interfaces after reboot.
  
- Sometimes, perhaps because of some race-condition, its possible to lose
- some ports from instance_info_caches. Periodic task
- _heal_instance_info_cache should clean this up (add missing records),
- but in fact it's not working this way.
+ [Test Plan]
+ 
+ * This bug is reproducible on Bionic/Queens clouds.
+ 
+ 1) Deploy the following Juju bundle: https://paste.ubuntu.com/p/HgsqZfsDGh/
+ 2) Run the following script: https://paste.ubuntu.com/p/DrFcDXZGSt/
+ 3) If the script finishes with "Port not found", the bug is still present.
+ 
+ [Where problems could occur]
+ 
+ ** Check the other info section ***
+ 
+ 
+ [Other Info]
  
  How does it look now?
  =
  
  _heal_instance_info_cache during crontask:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/compute/manager.py#L6525
  
  is using network_api to get instance_nw_info (instance_info_caches):
  
- try:
- # Call to network API to get instance info.. this will
- # force an update to the instance's info_cache
- self.network_api.get_instance_nw_info(context, instance)
+             try:
+                 # Call to network API to get instance info.. this will
+                 # force an update to the instance's info_cache
+                 self.network_api.get_instance_nw_info(context, instance)
  
  self.network_api.get_instance_nw_info() is listed below:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L1377
  
  and it uses _build_network_info_model() without networks and port_ids
  parameters (because we're not adding any new interface to instance):
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L2356
  
  Next: _gather_port_ids_and_networks() generates the list of instance
  networks and port_ids:
  
-   networks, port_ids = self._gather_port_ids_and_networks(
- context, instance, networks, port_ids, client)
+       networks, port_ids = self._gather_port_ids_and_networks(
+                 context, instance, networks, port_ids, client)
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L2389-L2390
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/network/neutronv2/api.py#L1393
  
  As we can see, _gather_port_ids_and_networks() takes the port list from
  the DB:
  
  
https://github.com/openstack/nova/blob/ef4000a0d326deb004843ee51d18030224c5630f/nova/objects/instance.py#L1173-L1176
  
  And that's it. When we lose a port it's not possible to add it again with this periodic task.
  The only way is to clean the device_id field in the neutron port object and re-attach the interface using `nova interface-attach`.
  
  When the interface is missing and there is no port configured on the compute host (for example after a compute reboot), the interface is not added to the instance and, from neutron's point of view, the port state is DOWN.
  
  When the interface is missing in the cache and we hard-reboot the instance, it's not added as a tap interface in the XML file, so we don't have the network on the host.
  
  Steps to reproduce
  ==
  1. Spawn devstack
  2. Spawn VM inside devstack with multiple ports (for example also from 2 
different networks)
  3. Update the DB row, drop one interface from interfaces_list
  4. Hard-Reboot the instance
  5. See that nova list shows instance without one address, but nova 
interface-list shows all addresses
  6. See that one port is missing in instance xml files
  7. In theory the _heal_instance_info_cache task should fix these things, but it relies on memory, not on a fresh list of instance ports taken from neutron.
  
  Reproduced Example
  ==
  1. Spawn VM with 1 private network port
  nova boot --flavor m1.small --image cirros-0.3.5-x86_64-disk --nic 
net-name=private  test-2
  2. Attach ports to have 2 private and 2 public interfaces
  nova list:
  | a64ed18d-9868-4bf0-90d3-d710d278922d | 

[Bug 1751923] Re: _heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server

2021-05-15 Thread Jorge Niedbalski
** Changed in: nova (Ubuntu)
   Status: Confirmed => Fix Released

** Changed in: cloud-archive/queens
   Status: New => In Progress

** Changed in: cloud-archive/queens
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Changed in: nova (Ubuntu Bionic)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Summary changed:

- _heal_instance_info_cache periodic task bases on port list from nova db, not 
from neutron server
+ [SRU]_heal_instance_info_cache periodic task bases on port list from nova db, 
not from neutron server

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1751923

Title:
  [SRU]_heal_instance_info_cache periodic task bases on port list from
  nova db, not from neutron server

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1751923/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

2021-04-15 Thread Jorge Niedbalski
Verified victoria on focal and groovy.

* focal/victoria => https://pastebin.ubuntu.com/p/XPVQbwKY7v/
* groovy-proposed => https://pastebin.ubuntu.com/p/ZHsvzXR7QH/


** Tags removed: verification-needed verification-needed-focal 
verification-needed-groovy verification-victoria-needed
** Tags added: verification-done verification-done-focal 
verification-done-groovy verification-victoria-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1885430/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

2021-04-15 Thread Jorge Niedbalski
Verified train, stein, ussuri.

* train, results: (https://pastebin.ubuntu.com/p/Y7sD9w3rWz/)
* stein, results: (https://pastebin.ubuntu.com/p/ZHsvzXR7QH/)
* ussuri, results: (https://pastebin.ubuntu.com/p/jymSscH3TS/)



** Tags removed: verification-train-needed verification-ussuri-needed
** Tags added: verification-train-done verification-ussuri-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1885430/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890491] Re: A pacemaker node fails monitor (probe) and stop /start operations on a resource because it returns "rc=189

2021-04-09 Thread Jorge Niedbalski
Hello Lucas,

I'll reformat the patch accordingly and re-submit. Thanks.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890491

Title:
  A pacemaker node fails monitor (probe) and stop /start operations on a
  resource because it returns "rc=189

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

2021-03-18 Thread Jorge Niedbalski
---> Installed version

root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# dpkg -l |grep -i ceilometer
ii  ceilometer-agent-compute  1:12.1.1-0ubuntu1~cloud1  all  ceilometer compute agent
ii  ceilometer-common         1:12.1.1-0ubuntu1~cloud1  all  ceilometer common files
ii  python3-ceilometer        1:12.1.1-0ubuntu1~cloud1  all  ceilometer python libraries


I ran through two cases:

1) Service restart
2) Reboot

---> Service restart case


root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:20:01 UTC; 2min 35s ago
 Main PID: 27650 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─27650 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/cei
           └─27735 ceilometer-polling: AgentManager worker(0)

Mar 18 21:20:01 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped Ceilometer Agent Compute.
Mar 18 21:20:01 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:20:03 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[27650]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 18:46:56 UTC; 2h 35min ago
 Main PID: 2199 (nova-compute)
    Tasks: 22 (limit: 4702)
   CGroup: /system.slice/nova-compute.service
           └─2199 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log

Mar 18 18:46:56 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.


--

root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl stop nova-compute
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl disable nova-compute.service
Synchronizing state of nova-compute.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable nova-compute
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: inactive (dead) since Thu 2021-03-18 21:23:30 UTC; 7s ago
 Main PID: 2199 (code=exited, status=0/SUCCESS)

Mar 18 18:46:56 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopping OpenStack Compute...
Mar 18 21:23:30 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped OpenStack Compute.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# 


root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Thu 2021-03-18 21:23:24 UTC; 29s ago
 Main PID: 761 (code=exited, status=0/SUCCESS)

Mar 18 21:23:13 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 18 21:23:14 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[761]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopping Ceilometer Agent Compute...
Mar 18 21:23:24 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Stopped Ceilometer Agent Compute.



root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# /etc/init.d/ceilometer-agent-compute restart
[ ok ] Restarting ceilometer-agent-compute (via systemctl): ceilometer-agent-compute.service.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-18 21:24:13 UTC; 2s ago
 Main PID: 1549 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─1549 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─1604 ceilometer-polling: AgentManager worker(0)

Mar 18 

[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

2021-03-11 Thread Jorge Niedbalski
With the proposed patch on stein:

--- nova-compute disabled / no requires  --- machine rebooted


root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: inactive (dead)

root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-11 21:54:08 UTC; 57s ago
 Main PID: 851 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─ 851 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─3114 ceilometer-polling: AgentManager worker(0)

Mar 11 21:54:08 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 11 21:54:25 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[851]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".


--- nova-compute disabled / no required --- machine rebooted


ubuntu@juju-bf8c6a-lm-ceilometer-7:~$ sudo su
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# uptime
 21:56:25 up 0 min,  1 user,  load average: 1.67, 0.41, 0.14
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status nova-compute
● nova-compute.service - OpenStack Compute
   Loaded: loaded (/lib/systemd/system/nova-compute.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-11 21:56:07 UTC; 20s ago
 Main PID: 2743 (nova-compute)
    Tasks: 22 (limit: 4702)
   CGroup: /system.slice/nova-compute.service
           └─2743 /usr/bin/python3 /usr/bin/nova-compute --config-file=/etc/nova/nova.conf --config-file=/etc/nova/nova-compute.conf --log-file=/var/log/nova/nova-compute.log

Mar 11 21:56:07 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started OpenStack Compute.
root@juju-bf8c6a-lm-ceilometer-7:/home/ubuntu# systemctl status ceilometer-agent-compute
● ceilometer-agent-compute.service - Ceilometer Agent Compute
   Loaded: loaded (/lib/systemd/system/ceilometer-agent-compute.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-03-11 21:56:00 UTC; 32s ago
 Main PID: 861 (ceilometer-poll)
    Tasks: 6 (limit: 4702)
   CGroup: /system.slice/ceilometer-agent-compute.service
           ├─ 861 ceilometer-polling: master process [/usr/bin/ceilometer-polling --config-file=/etc/ceilometer/ceilometer.conf --polling-namespaces compute --log-file=/var/log/ceil
           └─1583 ceilometer-polling: AgentManager worker(0)

Mar 11 21:56:00 juju-bf8c6a-lm-ceilometer-7 systemd[1]: Started Ceilometer Agent Compute.
Mar 11 21:56:05 juju-bf8c6a-lm-ceilometer-7 ceilometer-agent-compute[861]: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1885430/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1885430] Re: [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

2021-03-10 Thread Jorge Niedbalski
Note: the only way that works reliably in both cases (restart, bootup)
is by adding Requires=nova-compute.service to the systemd service file
/lib/systemd/system/ceilometer-agent-compute.service

I've tried modifying the sysvinit file and rebooted with only the
required-start entry in the sysvinit file, and it doesn't work either:
https://pastebin.canonical.com/p/STTRFyw9Wy/

I agree with Drew; the options are Wants= or Requires= (more strict). I
tested the second and it works OK after service restart and machine bootup
(with nova-compute both enabled and disabled).
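
A minimal sketch of that change as a systemd drop-in (the drop-in path is
my suggestion for testing; it is not what the package ships):

sudo mkdir -p /etc/systemd/system/ceilometer-agent-compute.service.d
sudo tee /etc/systemd/system/ceilometer-agent-compute.service.d/override.conf <<'EOF'
[Unit]
# Hard dependency: ceilometer-agent-compute stops/starts with nova-compute
Requires=nova-compute.service
After=nova-compute.service
EOF
sudo systemctl daemon-reload
sudo systemctl restart ceilometer-agent-compute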

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1885430

Title:
  [Bionic/Stein] Ceilometer-agent fails to collect metrics after restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-ceilometer-agent/+bug/1885430/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1906720] [NEW] Fix the disable_ssl_certificate_validation option

2020-12-03 Thread Jorge Niedbalski
Public bug reported:

[Environment]

Bionic 
python3-httplib2 | 0.9.2+dfsg-1ubuntu0.2 

[Description]

The maas cli fails to work with APIs over HTTPS with self-signed
certificates, due to the lack of a working
disable_ssl_certificate_validation option with Python 3.5.
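
(For illustration, a minimal sketch of the option this bug is about; the
endpoint URL is hypothetical. On the affected bionic package, certificate
verification still fails on Python 3.5 even with the flag set:)

python3 - <<'EOF'
import httplib2
# Ask httplib2 to skip certificate verification (self-signed endpoint)
h = httplib2.Http(disable_ssl_certificate_validation=True)
resp, content = h.request("https://maas.example.com:5240/MAAS/api/2.0/version/")
print(resp.status)
EOF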


[Distribution/Release, Package versions, Platform]
cat /etc/lsb-release; dpkg -l | grep maas
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
ii maas 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cli 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS client and command-line interface
ii maas-common 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS server common files
ii maas-dhcp 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS DHCP server
ii maas-proxy 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS Caching Proxy
ii maas-rack-controller 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all Rack Controller for MAAS
ii maas-region-api 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all Region Controller for MAAS
ii python3-django-maas 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS server provisioning libraries (Python 3)

[Steps to Reproduce]

- prepare a maas server (installed from packages for me and the customer); it doesn't have to be HA to reproduce
- prepare a set of certificate, key and ca-bundle
- place a new conf[2] in /etc/nginx/sites-enabled and `sudo systemctl restart nginx`
- add the ca certificates to the host:
sudo mkdir /usr/share/ca-certificates/extra
sudo cp -v ca-bundle.crt /usr/share/ca-certificates/extra/
dpkg-reconfigure ca-certificates
- login with a new profile over the https url
- when the ca-bundle has not been added to the trusted ca cert store, it fails to login, and the '--insecure' flag doesn't work either[3] (see the login sketch below)
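
For reference, the login that fails looks like this (profile name, URL and
API key are hypothetical):

maas login admin https://maas.example.com:5240/MAAS/api/2.0/ "$APIKEY" --insecure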

[Known Workarounds]
None

** Affects: python-httplib2 (Ubuntu)
 Importance: Undecided
 Status: Fix Released

** Affects: python-httplib2 (Ubuntu Bionic)
 Importance: Undecided
 Status: Confirmed

** Affects: python-httplib2 (Ubuntu Focal)
 Importance: Undecided
 Status: Fix Released

** Affects: python-httplib2 (Ubuntu Groovy)
 Importance: Undecided
 Status: Fix Released

** Affects: python-httplib2 (Ubuntu Hirsute)
 Importance: Undecided
 Status: Fix Released

** Also affects: python-httplib2 (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: python-httplib2 (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: python-httplib2 (Ubuntu Hirsute)
   Importance: Undecided
   Status: New

** Also affects: python-httplib2 (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Changed in: python-httplib2 (Ubuntu Hirsute)
   Status: New => Fix Released

** Changed in: python-httplib2 (Ubuntu Groovy)
   Status: New => Fix Released

** Changed in: python-httplib2 (Ubuntu Focal)
   Status: New => Fix Released

** Changed in: python-httplib2 (Ubuntu Bionic)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1906720

Title:
  Fix the disable_ssl_certificate_validation option

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/python-httplib2/+bug/1906720/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1906720] Re: Fix the disable_ssl_certificate_validation option

2020-12-03 Thread Jorge Niedbalski
Backport fix https://github.com/httplib2/httplib2/pull/15 into bionic

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1906720

Title:
  Fix the disable_ssl_certificate_validation option

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/python-httplib2/+bug/1906720/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1906719] [NEW] Fix the disable_ssl_certificate_validation option

2020-12-03 Thread Jorge Niedbalski
Public bug reported:

[Environment]

Bionic 
python3-httplib2 | 0.9.2+dfsg-1ubuntu0.2 

[Description]

The maas cli fails to work with APIs over HTTPS with self-signed
certificates, due to the lack of a working
disable_ssl_certificate_validation option with Python 3.5.


[Distribution/Release, Package versions, Platform]
cat /etc/lsb-release; dpkg -l | grep maas
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
ii maas 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cli 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS client and command-line interface
ii maas-common 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS server common files
ii maas-dhcp 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS DHCP server
ii maas-proxy 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS Caching Proxy
ii maas-rack-controller 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all Rack Controller for MAAS
ii maas-region-api 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all Region Controller for MAAS
ii python3-django-maas 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.8.2-8577-g.a3e674063-0ubuntu1~18.04.1 all MAAS server provisioning libraries (Python 3)

[Steps to Reproduce]

- prepare a maas server (installed from packages for me and the customer); it doesn't have to be HA to reproduce
- prepare a set of certificate, key and ca-bundle
- place a new conf[2] in /etc/nginx/sites-enabled and `sudo systemctl restart nginx`
- add the ca certificates to the host:
sudo mkdir /usr/share/ca-certificates/extra
sudo cp -v ca-bundle.crt /usr/share/ca-certificates/extra/
dpkg-reconfigure ca-certificates
- login with a new profile over the https url
- when the ca-bundle has not been added to the trusted ca cert store, it fails to login, and the '--insecure' flag doesn't work either[3]

[Known Workarounds]
None

** Affects: python-httplib2 (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1906719

Title:
  Fix the disable_ssl_certificate_validation option

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/python-httplib2/+bug/1906719/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872106] Re: isc-dhcp-server crashing constantly [Ubuntu 20.04]

2020-11-04 Thread Jorge Niedbalski
Hello Karsten,

Can you check comments
https://bugs.launchpad.net/dhcp/+bug/1872118/comments/62 and
https://bugs.launchpad.net/dhcp/+bug/1872118/comments/63 and validate the
versions?

* Also, would it be possible to upload the crash report here, along with
the output of dpkg -l?

Thanks,

Jorge

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872106

Title:
  isc-dhcp-server crashing constantly [Ubuntu 20.04]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1872106/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1899064] [NEW] Decrease time_threshold for zone_purge task.

2020-10-08 Thread Jorge Niedbalski
Public bug reported:

[Environment]

Ussuri
Charms 20.08 

[Description]

After deleting a zone on a designate-bind backend, the zone
remains active until the zone purge producer task gets executed.

$ openstack zone show 98c02cfb-5a2f-
Could not find Zone

$ openstack zone delete 98c02cfb-5a2f-
Could not find Zone

mysql> select * from zones where id="98c02cfb-5a2f-";
Empty set (0.01 sec)


363:2020-09-25 05:23:41.154 1685647 DEBUG designate.central.service 
[req-8223a934-84df-44eb-97bd-a0194343955a - - - - -] Performing purge with 
limit of 100 and criterion of {u'deleted': u'!0', u'deleted_at': u'<=2020-09-18 
05:23:41.143790', u'shard': u'BETWEEN 1365,2729'} purge_zones 
/usr/lib/python2.7/dist-packages/designate/central/service.py:1131

No zone was found by these criteria; therefore the hard zone delete from the database,
https://github.com/openstack/designate/blob/2e3d8ab80daac00bad7d2b46246660592163bf17/designate/storage/impl_sqlalchemy/__init__.py#L454,
didn't apply.

This delta is governed by time_threshold,
https://github.com/openstack/designate/blob/89435416a1dcb6df2a347f43680cfe57d1eb0a82/designate/conf/producer.py#L100,
which defaults to 1 week.

### Proposed actions

1) Make the time_threshold shorter, a 1 hour span by default (see the sketch below).
2) Don't list deleted_at zones in the designate list operation.
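
A minimal sketch of action 1 as a deployment-side override, assuming the
standard [producer_task:zone_purge] section of /etc/designate/designate.conf:

sudo tee -a /etc/designate/designate.conf <<'EOF'
[producer_task:zone_purge]
# purge deleted zones after 1 hour instead of the 604800s (1 week) default
time_threshold = 3600
EOF
sudo systemctl restart designate-producer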

** Affects: charm-designate
 Importance: Undecided
 Status: New

** Affects: designate (Ubuntu)
 Importance: Undecided
 Status: New

** Also affects: designate (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1899064

Title:
  Decrease time_threshold for zone_purge task.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1899064/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890491] Re: A pacemaker node fails monitor (probe) and stop /start operations on a resource because it returns "rc=189

2020-09-08 Thread Jorge Niedbalski
** Patch added: "lp1890491-bionic.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+attachment/5408794/+files/lp1890491-bionic.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890491

Title:
  A pacemaker node fails monitor (probe) and stop /start operations on a
  resource because it returns "rc=189

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890491] Re: A pacemaker node fails monitor (probe) and stop /start operations on a resource because it returns "rc=189

2020-08-18 Thread Jorge Niedbalski
** Changed in: pacemaker (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: pacemaker (Ubuntu Bionic)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890491

Title:
  A pacemaker node fails monitor (probe) and stop /start operations on a
  resource because it returns "rc=189

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

2020-08-17 Thread Jorge Niedbalski
** No longer affects: isc-dhcp (Ubuntu)

** No longer affects: isc-dhcp (Ubuntu Focal)

** No longer affects: isc-dhcp (Ubuntu Groovy)

** Changed in: bind9-libs (Ubuntu Groovy)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  [SRU] DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890491] Re: A pacemaker node fails monitor (probe) and stop /start operations on a resource because it returns "rc=189

2020-08-14 Thread Jorge Niedbalski
Hello,

I am testing a couple of patches (both imported from master), through
this PPA: https://launchpad.net/~niedbalski/+archive/ubuntu/fix-1890491

c20f8920 - don't order implied stops relative to a remote connection
938e99f2 - remote state is failed if node is shutting down with connection 
failure

I'll report back here if these patches fix the behavior described in my
previous comment.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890491

Title:
  A pacemaker node fails monitor (probe) and stop /start operations on a
  resource because it returns "rc=189

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890491] Re: A pacemaker node fails monitor (probe) and stop /start operations on a resource because it returns "rc=189

2020-08-13 Thread Jorge Niedbalski
I am able to reproduce a similar issue with the following bundle:
https://paste.ubuntu.com/p/VJ3m7nMN79/

Resource created with:
sudo pcs resource create test2 ocf:pacemaker:Dummy op_sleep=10 op monitor interval=30s timeout=30s op start timeout=30s op stop timeout=30s

juju ssh nova-cloud-controller/2 "sudo pcs constraint location test2 prefers juju-acda3d-pacemaker-remote-10.cloud.sts"
juju ssh nova-cloud-controller/2 "sudo pcs constraint location test2 prefers juju-acda3d-pacemaker-remote-11.cloud.sts"
juju ssh nova-cloud-controller/2 "sudo pcs constraint location test2 prefers juju-acda3d-pacemaker-remote-12.cloud.sts"


Online: [ juju-acda3d-pacemaker-remote-7 juju-acda3d-pacemaker-remote-8 
juju-acda3d-pacemaker-remote-9 ]
RemoteOnline: [ juju-acda3d-pacemaker-remote-10.cloud.sts 
juju-acda3d-pacemaker-remote-11.cloud.sts 
juju-acda3d-pacemaker-remote-12.cloud.sts ]

Full list of resources:

Resource Group: grp_nova_vips
res_nova_bf9661e_vip (ocf::heartbeat:IPaddr2): Started 
juju-acda3d-pacemaker-remote-7
Clone Set: cl_nova_haproxy [res_nova_haproxy]
Started: [ juju-acda3d-pacemaker-remote-7 juju-acda3d-pacemaker-remote-8 
juju-acda3d-pacemaker-remote-9 ]
juju-acda3d-pacemaker-remote-10.cloud.sts (ocf::pacemaker:remote): Started 
juju-acda3d-pacemaker-remote-8
juju-acda3d-pacemaker-remote-12.cloud.sts (ocf::pacemaker:remote): Started 
juju-acda3d-pacemaker-remote-8
juju-acda3d-pacemaker-remote-11.cloud.sts (ocf::pacemaker:remote): Started 
juju-acda3d-pacemaker-remote-7

test2 (ocf::pacemaker:Dummy): Started juju-acda3d-pacemaker-remote-10.cloud.sts

## After running the following commands on juju-acda3d-pacemaker-remote-10.cloud.sts

1) sudo systemctl stop pacemaker_remote
2) forcefully shut down the machine (openstack server stop) less than 10
seconds after the pacemaker_remote stop gets executed.

The remote is now shut down:

RemoteOFFLINE: [ juju-acda3d-pacemaker-remote-10.cloud.sts ]

The resource status remains stopped across the 3 machines, and it
doesn't recover.

$ juju run --application nova-cloud-controller "sudo pcs resource show | grep 
-i test2"
- Stdout: " test2\t(ocf::pacemaker:Dummy):\tStopped\n"
UnitId: nova-cloud-controller/0
- Stdout: " test2\t(ocf::pacemaker:Dummy):\tStopped\n"
UnitId: nova-cloud-controller/1
- Stdout: " test2\t(ocf::pacemaker:Dummy):\tStopped\n"
UnitId: nova-cloud-controller/2

However, if I do a clean shutdown (without interrupting the
pacemaker_remote fence), the resource ends up migrated correctly to
another node.

6 nodes configured
9 resources configured

Online: [ juju-acda3d-pacemaker-remote-7 juju-acda3d-pacemaker-remote-8 
juju-acda3d-pacemaker-remote-9 ]
RemoteOnline: [ juju-acda3d-pacemaker-remote-11.cloud.sts 
juju-acda3d-pacemaker-remote-12.cloud.sts ]
RemoteOFFLINE: [ juju-acda3d-pacemaker-remote-10.cloud.sts ]

Full list of resources:

[...]
test2 (ocf::pacemaker:Dummy): Started juju-acda3d-pacemaker-remote-12.cloud.sts

I will keep investigating this behavior and determine whether it is
linked to the reported bug.
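
A sketch of the checks used while reproducing (commands assumed, not from
the original report):

  # one-shot cluster view, including fail counts and inactive resources
  juju ssh nova-cloud-controller/2 "sudo crm_mon -1rf"
  # operation history pacemaker recorded for the Dummy resource
  juju ssh nova-cloud-controller/2 "sudo crm_resource --resource test2 --list-operations"
  # clear the failed state once the remote is back online
  juju ssh nova-cloud-controller/2 "sudo pcs resource cleanup test2"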

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890491

Title:
  A pacemaker node fails monitor (probe) and stop /start operations on a
  resource because it returns "rc=189

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

2020-08-11 Thread Jorge Niedbalski
** Patch added: "lp-1872118-groovy.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1872118/+attachment/5400760/+files/lp-1872118-groovy.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  [SRU] DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

2020-08-11 Thread Jorge Niedbalski
** Patch added: "lp-1872118-focal.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1872118/+attachment/5400761/+files/lp-1872118-focal.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  [SRU] DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

2020-08-11 Thread Jorge Niedbalski
Uploaded debdiff(s) for groovy and focal. This will require a follow-up
rebuild of isc-dhcp once the library change lands.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  [SRU] DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

2020-08-11 Thread Jorge Niedbalski
** Description changed:

+ [Description]
  
- I have a pair of DHCP serevrs running in a cluster on ubuntu 20.04, All 
worked perfectly until recently, when they started stopping with code=killed, 
status=6/ABRT.
- This is being fixed by 
+ isc-dhcp-server uses libisc-export (coming from the bind9-libs package) for 
handling the socket event(s) when configured in peer mode (master/secondary). 
It's possible that a sequence of messages dispatched by the master that 
requires acknowledgment from its peers leaves a socket
+ in a pending-send state; a timer or a subsequent write request can then be 
scheduled onto this socket, and the !sock->pending_send assertion
+ will be raised on the next write attempt, when the data hasn't been 
flushed entirely and the pending_send flag hasn't been reset to 0.
  
- https://bugs.launchpad.net/bugs/1870729
+ If this race condition happens, the following stacktrace will be
+ hit:
  
- However now one stops after a few hours with the following errors. One
- can stay on line but not both.
+ (gdb) info threads
+   Id Target Id Frame
+ * 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:50
+   2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait 
(futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
+   3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, 
processes_to_wake=1, futex_word=<optimized out>) at 
../sysdeps/nptl/futex-internal.h:364
+   4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable 
(private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at 
../sysdeps/nptl/futex-internal.h:183
+ 
+ (gdb) frame 2
+ #2 0x7fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 
"../../../../lib/isc/unix/socket.c", line=line@entry=3361, 
type=type@entry=isc_assertiontype_insist,
+ cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at 
../../../lib/isc/assertions.c:52
+ (gdb) bt
+ #1 0x7fb4deaa7859 in __GI_abort () at abort.c:79
+ #2 0x7fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 
"../../../../lib/isc/unix/socket.c", line=line@entry=3361, 
type=type@entry=isc_assertiontype_insist,
+ cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at 
../../../lib/isc/assertions.c:52
+ #3 0x7fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at 
../../../../lib/isc/unix/socket.c:4041
+ #4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, 
manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
+ #5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, 
manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
+ #6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
+ #7 0x7fb4dea68609 in start_thread (arg=<optimized out>) at 
pthread_create.c:477
+ #8 0x7fb4deba4103 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
+ 
+ (gdb) frame 3
+ #3 0x7fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at 
../../../../lib/isc/unix/socket.c:4041
+ 4041 in ../../../../lib/isc/unix/socket.c
+ (gdb) p sock->pending_send
+ $2 = 1
+ 
+ [TEST CASE]
+ 
+ 1) Install isc-dhcp-server in 2 focal machine(s).
+ 2) Configure peer/cluster mode as follows:
+Primary configuration: https://pastebin.ubuntu.com/p/XYj648MghK/
+Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/
+ 2) Run dhcpd as follows in both machine(s)
+ 
+ # dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4
+ 
+ 3) Leave the cluster running for a long (2h) period until the crash/race
+ condition is reproduced.
  
  
+ [REGRESSION POTENTIAL]
  
- Syslog shows 
- Apr 10 17:20:15 dhcp-primary sh[6828]: 
../../../../lib/isc/unix/socket.c:3361: INSIST(!sock->pending_send) failed, 
back trace
- Apr 10 17:20:15 dhcp-primary sh[6828]: #0 0x7fbe78702a4a in ??
- Apr 10 17:20:15 dhcp-primary sh[6828]: #1 0x7fbe78702980 in ??
- Apr 10 17:20:15 dhcp-primary sh[6828]: #2 0x7fbe7873e7e1 in ??
- Apr 10 17:20:15 dhcp-primary sh[6828]: #3 0x7fbe784e5609 in ??
- Apr 10 17:20:15 dhcp-primary sh[6828]: #4 0x7fbe78621103 in ??
- 
- 
- nothing in kern.log
- 
- 
- apport.log shows
- ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: called for pid 6828, 
signal 6, core limit 0, dump mode 2
- ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: not creating core for pid 
with dump mode of 2
- ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: executable: 
/usr/sbin/dhcpd (command line "dhcpd -user dhcpd -group dhcpd -f -4 -pf 
/run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf")
- ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: is_closing_session(): no 
DBUS_SESSION_BUS_ADDRESS in environment
- ERROR: apport (pid 6850) Fri Apr 10 17:20:15 2020: wrote report 
/var/crash/_usr_sbin_dhcpd.0.crash
- 
- 
- /var/crash/_usr_sbin_dhcpd.0.crash shows
- 
- ProblemType: Crash
- Architecture: amd64
- CrashCounter: 1
- Date: Fri Apr 10 17:20:15 2020
- DistroRelease: Ubuntu 20.04
- ExecutablePath: /usr/sbin/dhcpd
- ExecutableTimestamp: 1586210315
- ProcCmdline: dhcpd -user dhcpd -group dhcpd -f -4 -pf 

[Bug 1872118] Re: [SRU] DHCP Cluster crashes after a few hours

2020-08-11 Thread Jorge Niedbalski
** Summary changed:

- DHCP Cluster crashes after a few hours
+ [SRU] DHCP Cluster crashes after a few hours

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  [SRU] DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-11 Thread Jorge Niedbalski
** Changed in: bind9-libs (Ubuntu Focal)
   Status: New => In Progress

** Changed in: bind9-libs (Ubuntu Groovy)
   Status: New => In Progress

** Changed in: isc-dhcp (Ubuntu Focal)
   Status: New => In Progress

** Changed in: isc-dhcp (Ubuntu Groovy)
   Status: Confirmed => In Progress

** Changed in: bind9-libs (Ubuntu Focal)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Changed in: bind9-libs (Ubuntu Groovy)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Changed in: isc-dhcp (Ubuntu Focal)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Changed in: isc-dhcp (Ubuntu Groovy)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-06 Thread Jorge Niedbalski
Hello @Andrew, @rlaager,


Any crashes to report before I propose this patch? My environment has been
running this patch for close to 3 days without any failures.


Thanks,

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-06 Thread Jorge Niedbalski
Hello Andrew,

I just reviewed the core file that you provided. Thread 1 is the thread
that panics on the assertion because sock.pending_send is already set.
This is the condition I prevented in the PPA, so it *shouldn't* be
hitting frame 3.

In my test systems I don't hit this condition; dispatch_send isn't called
if pending_send is set.

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f39a41f5700 (LWP 18780))]
#1  0x7f39a4dd1859 in __GI_abort () at abort.c:79
79  in abort.c
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7f39a4dd1859 in __GI_abort () at abort.c:79
#2  0x7f39a4faf985 in isc_assertion_failed (file=<optimized out>, 
line=<optimized out>, type=<optimized out>, cond=<optimized out>) at 
../../../lib/isc/assertions.c:52
#3  0x7f39a4feb7e1 in dispatch_send (sock=0x7f39a4a03730) at 
../../../../lib/isc/unix/socket.c:3380
#4  process_fd (writeable=<optimized out>, readable=<optimized out>, fd=0, 
manager=0x7f39a49fa010) at ../../../../lib/isc/unix/socket.c:4054
#5  process_fds (writefds=<optimized out>, readfds=0x16, maxfd=-1533038191, 
manager=0x7f39a49fa010) at ../../../../lib/isc/unix/socket.c:4211
#6  watcher (uap=0x7f39a49fa010) at ../../../../lib/isc/unix/socket.c:4397
[...]
(gdb) frame 3
#3  0x7f39a4feb7e1 in dispatch_send (sock=0x7f39a4a03730) at 
../../../../lib/isc/unix/socket.c:3380
3380../../../../lib/isc/unix/socket.c: No such file or directory.
(gdb) info locals
iev = 0x0
ev = <optimized out>
sender = 0x2
iev = <optimized out>
ev = <optimized out>
sender = <optimized out>

(gdb) p sock
$1 = (isc__socket_t *) 0x7f39a4a03730
(gdb) p sock.pending_send
$2 = 1

Can you check your library links, etc?

ubuntu@dhcpd1:~$ ldd /usr/sbin/dhcpd | grep export
libirs-export.so.161 => /lib/x86_64-linux-gnu/libirs-export.so.161 
(0x7f5cb62e5000)
libdns-export.so.1109 => /lib/x86_64-linux-gnu/libdns-export.so.1109 
(0x7f5cb60b)
libisc-export.so.1105 => /lib/x86_64-linux-gnu/libisc-export.so.1105 
(0x7f5cb6039000)
libisccfg-export.so.163 => 
/lib/x86_64-linux-gnu/libisccfg-export.so.163 (0x7f5cb5df5000)
ubuntu@dhcpd1:~$ dpkg -S /lib/x86_64-linux-gnu/libisc-export.so.1105
libisc-export1105:amd64: /lib/x86_64-linux-gnu/libisc-export.so.1105
ubuntu@dhcpd1:~$ apt-cache policy libisc-export1105  | grep -i ppa
  Installed: 1:9.11.16+dfsg-3~ppa1
  Candidate: 1:9.11.16+dfsg-3~ppa1
 *** 1:9.11.16+dfsg-3~ppa1 500
500 http://ppa.launchpad.net/niedbalski/1872188-dbg/ubuntu focal/main 
amd64 Packages

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890491] Re: A pacemaker node fails monitor (probe) and stop /start operations on a resource because it returns "rc=189

2020-08-05 Thread Jorge Niedbalski
** Also affects: pacemaker (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Also affects: pacemaker (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: pacemaker (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: pacemaker (Ubuntu Groovy)
   Status: New => Fix Released

** Changed in: pacemaker (Ubuntu Focal)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890491

Title:
  A pacemaker node fails monitor (probe) and stop /start operations on a
  resource because it returns "rc=189

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1890491/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-05 Thread Jorge Niedbalski
OK, I have no crashes to report for the last 24 hours with the PPA
included here.

● isc-dhcp-server.service - ISC DHCP IPv4 server
 Loaded: loaded (/lib/systemd/system/isc-dhcp-server.service; enabled; 
vendor preset: enabled)
 Active: active (running) since Tue 2020-08-04 14:58:11 UTC; 1 day 1h ago
   Docs: man:dhcpd(8)
   Main PID: 1202 (dhcpd)
  Tasks: 5 (limit: 5882)
 Memory: 6.3M
 CGroup: /system.slice/isc-dhcp-server.service
 └─592 dhcpd -user dhcpd -group dhcpd -f -4 -pf 
/run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf

root@dhcpd1:/home/ubuntu# dpkg -l | grep ppa1
ii isc-dhcp-server 4.4.1-2.1ubuntu6~ppa1 amd64 ISC DHCP server for automatic IP 
address assignment
ii libirs-export161 1:9.11.16+dfsg-3~ppa1 amd64 Exported IRS Shared Library
ii libisc-export1105:amd64 1:9.11.16+dfsg-3~ppa1 amd64 Exported ISC Shared 
Library
ii libisccfg-export163 1:9.11.16+dfsg-3~ppa1 amd64 Exported ISC CFG Shared 
Library


Andrew, what's the current version of libisc-export1105:amd64 installed in
your system? Can you provide a dpkg -l output and a systemctl status for
the dhcpd service? Did you restart/kill the dhcpd processes after
upgrading that library?

Thanks in advance.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-04 Thread Jorge Niedbalski
Andrew,

Thank you for your input.

** Do you have any logs or a crash report I can take a look at from after
you upgraded these systems?

In my test lab, I am counting 3+ hours without a crash.

root@dhcpd1:/home/ubuntu# dpkg -l | grep ppa1
ii  isc-dhcp-server4.4.1-2.1ubuntu6~ppa1 amd64  
  ISC DHCP server for automatic IP address assignment
ii  libirs-export161   1:9.11.16+dfsg-3~ppa1 amd64  
  Exported IRS Shared Library
ii  libisc-export1105:amd641:9.11.16+dfsg-3~ppa1 amd64  
  Exported ISC Shared Library
ii  libisccfg-export1631:9.11.16+dfsg-3~ppa1 amd64  
  Exported ISC CFG Shared Library

---

DHCPACK on 10.19.101.120 to 52:54:00:d1:eb:66 (sleek-kodiak) via ens4
balancing pool 555643e55f40 12  total 221  free 111  backup 110  lts 0  max-own 
(+/-)22
balanced pool 555643e55f40 12  total 221  free 111  backup 110  lts 0  
max-misbal 33
balancing pool 555643e55f40 12  total 221  free 111  backup 110  lts 0  max-own 
(+/-)22
balanced pool 555643e55f40 12  total 221  free 111  backup 110  lts 0  
max-misbal 33


---

balancing pool 5595dff0df10 12  total 221  free 111  backup 110  lts 0  max-own 
(+/-)22
balanced pool 5595dff0df10 12  total 221  free 111  backup 110  lts 0  
max-misbal 33
balancing pool 5595dff0df10 12  total 221  free 111  backup 110  lts 0  max-own 
(+/-)22
balanced pool 5595dff0df10 12  total 221  free 111  backup 110  lts 0  
max-misbal 33

---

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-04 Thread Jorge Niedbalski
Hello Andrew,

Correct me if I am wrong but it seems your system isn't running with
libisc-export1105:amd64 1:9.11.16+dfsg-3~ppa1 (?)

I am running the following packages from the PPA, please note that 
libisc-export1105 is required (that's
where the fix is located).

root@dhcpd1:/home/ubuntu# dpkg -l | grep ppa1
ii isc-dhcp-server 4.4.1-2.1ubuntu6~ppa1 amd64 ISC DHCP server for automatic IP 
address assignment
ii libirs-export161 1:9.11.16+dfsg-3~ppa1 amd64 Exported IRS Shared Library
ii libisc-export1105:amd64 1:9.11.16+dfsg-3~ppa1 amd64 Exported ISC Shared 
Library
ii libisccfg-export163 1:9.11.16+dfsg-3~ppa1 amd64 Exported ISC CFG Shared 
Library

---

Sent update done message to failover-partner
failover peer failover-partner: peer moves from recover to recover-done
failover peer failover-partner: peer update completed.
failover peer failover-partner: I move from recover to recover-done
failover peer failover-partner: peer moves from recover-done to normal
failover peer failover-partner: I move from recover-done to normal
failover peer failover-partner: Both servers normal
balancing pool 55d0a88a4f10 12 total 221 free 221 backup 0 lts -110 max-own 
(+/-)22
balanced pool 55d0a88a4f10 12 total 221 free 221 backup 0 lts -110 max-misbal 33


---

balanced pool 55eb2fe58f40 12 total 221 free 111 backup 110 lts 0 max-misbal 33
Sending updates to failover-partner.
failover peer failover-partner: peer moves from recover-done to normal
failover peer failover-partner: Both servers normal


---
DHCPDISCOVER from 52:54:00:d1:eb:66 via ens4: load balance to peer 
failover-partner
DHCPREQUEST for 10.19.101.120 (10.19.101.233) from 52:54:00:d1:eb:66 via ens4: 
lease owned by peer

On failover:

peer failover-partner: disconnected
failover peer failover-partner: I move from normal to communications-interrupted
DHCPDISCOVER from 52:54:00:e8:14:0a via ens4
DHCPOFFER on 10.19.101.10 to 52:54:00:e8:14:0a (shapely-peccary) via ens4
DHCPREQUEST for 10.19.101.10 (10.19.101.127) from 52:54:00:e8:14:0a 
(shapely-peccary) via ens4
DHCPACK on 10.19.101.10 to 52:54:00:e8:14:0a (shapely-peccary) via ens4


I'll leave this running until I can reproduce the crash or assume that the fix
works.

Please let me know if you can reproduce with those packages.

Thanks,

Jorge Niedbalski

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-04 Thread Jorge Niedbalski
Hello Andrew,

The fix is in the libisc-export1105 library (check dependencies on: 
https://launchpad.net/~niedbalski/+archive/ubuntu/fix-1872118/+packages); 
just replacing the dhcpd binary won't be enough.

If you can install isc-dhcp-server and its dependencies from the PPA and
test, that would be great.

Thanks for any feedback.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-03 Thread Jorge Niedbalski
** Following up from my previous comment, I've built a PPA with a fix
similar to the one pointed out in Debian #430065.

https://launchpad.net/~niedbalski/+archive/ubuntu/fix-1872118

* I'd appreciate it if anyone could test that PPA on focal and report back
whether the problem is still reproducible with that version.

In case it does, please upload the crash file / coredump and the configuration
file used.

Thank you.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-03 Thread Jorge Niedbalski
Hello,

I checked the backtrace of a crashed dhcpd running on 4.4.1-2.1ubuntu5.

(gdb) info threads
  Id Target Id Frame
* 1 Thread 0x7fb4ddecb700 (LWP 3170) __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:50
  2 Thread 0x7fb4dd6ca700 (LWP 3171) __lll_lock_wait 
(futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
  3 Thread 0x7fb4de6cc700 (LWP 3169) futex_wake (private=<optimized out>, 
processes_to_wake=1, futex_word=<optimized out>) at 
../sysdeps/nptl/futex-internal.h:364
  4 Thread 0x7fb4de74f740 (LWP 3148) futex_wait_cancelable 
(private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at 
../sysdeps/nptl/futex-internal.h:183

(gdb) frame 2
#2 0x7fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 
"../../../../lib/isc/unix/socket.c", line=line@entry=3361, 
type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at 
../../../lib/isc/assertions.c:52
(gdb) bt
#1 0x7fb4deaa7859 in __GI_abort () at abort.c:79
#2 0x7fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 
"../../../../lib/isc/unix/socket.c", line=line@entry=3361, 
type=type@entry=isc_assertiontype_insist,
cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at 
../../../lib/isc/assertions.c:52
#3 0x7fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at 
../../../../lib/isc/unix/socket.c:4041
#4 process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, 
manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5 process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, 
manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6 watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7 0x7fb4dea68609 in start_thread (arg=<optimized out>) at 
pthread_create.c:477
#8 0x7fb4deba4103 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) frame 3
#3 0x7fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at 
../../../../lib/isc/unix/socket.c:4041
4041 in ../../../../lib/isc/unix/socket.c
(gdb) p sock->pending_send
$2 = 1

The code is crashing on this assertion: 
https://gitlab.isc.org/isc-projects/bind9/-/blob/v9_11_3/lib/isc/unix/socket.c#L3364
This was already reported and marked as fixed in Debian (?) via [0]:

""Now if a wakeup event occurres the socket would be dispatched for
processing regardless which kind of event (timer?) triggered the wakeup.
At least I did not find any sanity checks in process_fds() except
SOCK_DEAD(sock).

This leads to the following situation: The sock is not dead yet but it
is still pending when it is dispatched again.

I would now check sock->pending_send before calling dispatch_send().This
 would at least prevent the assertion failure - well knowing that the
situation described above ( not dead but still pending and alerting ) is
not a very pleasant one - until someone comes up with a better solution.
"""

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=430065#20

** Follow-up questions:

0) The reproducer doesn't seem consistent, and the crash seems to be related
to a race condition associated with an internal timer/futex.
1) Can anyone confirm that a pristine upstream 4.4.1 doesn't reproduce the
issue?

** Bug watch added: Debian Bug tracker #430065
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=430065
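
For anyone wanting to repeat this analysis from the apport report mentioned
in the bug description, a rough sketch (paths assumed; the matching debug
symbol packages need to be installed first):

  apport-unpack /var/crash/_usr_sbin_dhcpd.0.crash /tmp/dhcpd-crash
  gdb /usr/sbin/dhcpd /tmp/dhcpd-crash/CoreDump
  (gdb) bt
  (gdb) frame 3
  (gdb) p sock->pending_send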

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-03 Thread Jorge Niedbalski
** Also affects: isc-dhcp (Ubuntu Groovy)
   Importance: Undecided
   Status: Confirmed

** Also affects: isc-dhcp (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: bind9-libs (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1872118] Re: DHCP Cluster crashes after a few hours

2020-08-03 Thread Jorge Niedbalski
Hello,

I am trying to set up a reproducer for the mentioned issue. I have 2
machines acting as peers with the following versions:


# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:Ubuntu 20.04.1 LTS
Release:20.04
Codename:   focal

# dpkg -l |grep -i isc-dh
ii  isc-dhcp-client4.4.1-2.1ubuntu5  amd64  
  DHCP client for automatically obtaining an IP address
ii  isc-dhcp-common4.4.1-2.1ubuntu5  amd64  
  common manpages relevant to all of the isc-dhcp packages
ii  isc-dhcp-server4.4.1-2.1ubuntu5  amd64  
  ISC DHCP server for automatic IP address assignment

=

Primary configuration:  https://pastebin.ubuntu.com/p/XYj648MghK/
Secondary configuration: https://pastebin.ubuntu.com/p/PYkcshZCWK/

Started with:

# dhcpd -f -d -4 -cf /etc/dhcp/dhcpd.conf --no-pid ens4

---> Raised some DHCP requests to these servers.

balanced pool 560b8c263f40 12  total 221  free 111  backup 110  lts 0  
max-misbal 33
Sending updates to failover-partner.
failover peer failover-partner: peer moves from recover-done to normal
failover peer failover-partner: Both servers normal
DHCPDISCOVER from 52:54:00:2d:53:93 via ens4
DHCPOFFER on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.120 (10.19.101.236) from 52:54:00:2d:53:93 
(glistening-elephant) via ens4
DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4

DHCPREQUEST for 10.19.101.120 from 52:54:00:2d:53:93 (glistening-elephant) via 
ens4
DHCPACK on 10.19.101.120 to 52:54:00:2d:53:93 (glistening-elephant) via ens4
DHCPREQUEST for 10.19.101.121 from 52:54:00:53:a3:d8 (valiant-motmot) via ens4
DHCPACK on 10.19.101.121 to 52:54:00:53:a3:d8 (valiant-motmot) via ens4


---


failover peer failover-partner: Both servers normal
balancing pool 5606b2c95f10 12  total 221  free 221  backup 0  lts -110  
max-own (+/-)22
balanced pool 5606b2c95f10 12  total 221  free 221  backup 0  lts -110  
max-misbal 33
balancing pool 5606b2c95f10 12  total 221  free 111  backup 110  lts 0  max-own 
(+/-)22
balanced pool 5606b2c95f10 12  total 221  free 111  backup 110  lts 0  
max-misbal 33
DHCPDISCOVER from 52:54:00:2d:53:93 via ens4: load balance to peer 
failover-partner
DHCPREQUEST for 10.19.101.120 (10.19.101.236) from 52:54:00:2d:53:93 via ens4: 
lease owned by peer


So far (after 1.5h) no crash has been reported in any of the servers. 

Questions:

1) Is anything missing from the provided configuration?
2) Is this load- or concurrency-related, meaning a specific amount of leases
needs to be allocated for this crash to happen?

I will take a look at an existing crash/coredump.
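
For the record, a sketch of the client-side churn used to raise those DHCP
requests (interface name assumed; run from a client VM):

  sudo dhclient -r -v ens4   # release the current lease
  sudo dhclient -4 -v ens4   # rediscover, letting the peers load-balance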

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours

To manage notifications about this bug go to:
https://bugs.launchpad.net/dhcp/+bug/1872118/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1888926] Re: tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

2020-08-03 Thread Jorge Niedbalski
** Tags added: sts-sru-needed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1888926

Title:
  tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/1888926/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1888926] Re: tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

2020-08-03 Thread Jorge Niedbalski
gh a more
  complete fix would be to raise the Build-Depends from librelp-dev (>=
  1.4.0) to librelp-dev (>= 1.5.0).
+ 
+ [Risk potential]
+ 
+ * None identified, as this is a rebuild that should have been done on all 
+ reverse dependencies of librelp-dev when it was upgraded from 1.4.0 to 1.5.0
+ 
+ 
+ [Fix]
+ 
+ Provide a rebuild SRU for focal.

** Changed in: rsyslog (Ubuntu Groovy)
   Status: New => Fix Released

** Changed in: rsyslog (Ubuntu Focal)
   Status: New => In Progress

** Changed in: rsyslog (Ubuntu Focal)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Changed in: rsyslog (Ubuntu Focal)
   Importance: Undecided => Medium
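
A quick way to verify that a rebuilt rsyslog actually picked up librelp
1.5.0 (module path assumed):

  ldd /usr/lib/x86_64-linux-gnu/rsyslog/omrelp.so | grep librelp
  apt-cache policy librelp0 rsyslog-relp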

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1888926

Title:
  tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/1888926/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1888926] Re: tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

2020-08-03 Thread Jorge Niedbalski
** Patch added: "lp1888926-focal.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/1888926/+attachment/5398376/+files/lp1888926-focal.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1888926

Title:
  tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/1888926/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1888926] Re: tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

2020-08-03 Thread Jorge Niedbalski
** Also affects: rsyslog (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Also affects: rsyslog (Ubuntu Focal)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1888926

Title:
  tls.tlscfgcmd not recognized; rebuild rsyslog against librelp 1.5.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/1888926/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1879798] Re: designate-manage pool update doesn't reflects targets master dns servers into zones.

2020-07-28 Thread Jorge Niedbalski
** Proposed change https://review.opendev.org/#/c/731603/ has been
merged already.

** Changed in: designate (Ubuntu)
   Status: In Progress => Fix Committed

** Changed in: charm-designate
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  designate-manage pool update doesn't reflects targets master dns
  servers into zones.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1866085] Re: Not possible to create listeners that use barbican secret containers

2020-07-17 Thread Jorge Niedbalski
*** This bug is a duplicate of bug 1867676 ***
https://bugs.launchpad.net/bugs/1867676

** This bug has been marked a duplicate of bug 1867676
   Fetching by secret container doesn't raises 404 exception

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1866085

Title:
  Not possible to create listeners that use barbican secret containers

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/octavia/+bug/1866085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-06-19 Thread Jorge Niedbalski
Hello Corey,

This package is available on B, E, F, G. I don't think it is strictly
required to be backported into the queens cloud archive other than to
have this package in the correct pocket (UCA).

As I expressed in the bug description, the combination that triggers
this issue is octavia + barbican when creating listeners, but octavia
only exists >= Rocky. So I doubt that this could be hit in any
situation other than a specific user using the standalone barbicanclient
library, which would be a VERY rare use case.

The particular issue reported here was reported on Bionic (LTS) >= Rocky
clouds (where the UCA package was missed), and the bionic package in
universe was missing the fix.

I think it's OK if you want to move this very same missing package
version into the UCA, but I think it is safe to assume that the
verification done for Bionic would apply to anyone brave enough to
deploy a version of Queens barbican + Rocky Octavia.

I hope this clarifies the situation.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1867676

Title:
  Fetching by secret container doesn't raises 404 exception

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1867676/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1879798] Re: designate-manage pool update doesn't reflects targets master dns servers into zones.

2020-05-28 Thread Jorge Niedbalski
### Notes 

* Proposed fix https://review.opendev.org/#/c/731603/ 
* Updated bug description


** Changed in: designate (Ubuntu)
   Status: New => In Progress

** Changed in: designate (Ubuntu)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  designate-manage pool update doesn't reflects targets master dns
  servers into zones.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1879798] Re: replacing designate units causes issues previously created zones

2020-05-28 Thread Jorge Niedbalski
** Changed in: charm-designate-bind
   Status: Confirmed => Invalid

** Summary changed:

- replacing designate units causes issues previously created zones 
+ designate-manage pool update doesn't reflects targets master dns servers into 
zones.

** Description changed:

+ [Environment]
+ 
+ Ubuntu + Ussuri
+ 
+ [Description]
+ 
+ If running designate-manage pool update with new targets, those targets
+ get properly updated in the pool target masters list, but they aren't
+ reflected in the zones that belong to this pool; therefore, the masters
+ associated with those zones aren't updated, causing failures like the
+ ones described in the Further Information section.
+ 
+ designate-manage pool update should offer an option to update the zones
+ associated with the pools with the new target masters, and be able to
+ apply these changes to existing zones.
+ 
+ For the bind9 backend, the current workaround is to manually run the
+ rndc modzone command with the new masters, but that isn't suitable for
+ large installations with multiple zones and pools.
+ 
+ 
+ [Further information]
+ 
  We have a designate/designate-bind setup. We migrated designate units to
  different machines, replacing 3 designate units with 3 new units.
  However, this caused issues with existing zones, including creating new
  recordsets for these zones. The zone would result in having an ERROR
  status and a CREATE action.
  
  Looking at the designate bind units, we see that designate is attempting
  to run:
  
  'addzone $zone { type slave; masters {$new_designate_ips port 5354;};
  file "slave.$zone.$hash"; };'
  
  This addzone fails due to the zone already existing. However, we found
  that the zone configuration (using 'rndc showzone $zone' from designate-
  bind unit) still had the old designate ips for its masters. There are
  also logs in /var/log/syslog like the following:
  
  May 20 06:27:10 juju-c27f05-15-lxd-1 named[72648]: transfer of '$zone'
  from $old_designate_ip#5354: failed to connect: host unreachable
  
  We were able to resolve this issue by modifying the zone config on all
  designate-bind units:
  
  juju run -a designate-bind -- rndc modzone $zone '{ type slave; file
  "slave.$zone.$hash"; masters { $new_designate_ip_1 port 5354;
  $new_designate_ip_2 port 5354; $new_designate_ip_3 port 5354; }; };'
  
  After modifying the zone, the recordset creations completed and resolved
  almost immediately.
  
  Would this be something the charm could do in an automated way when
  masters are removed/replaced, or is there a better way of fixing the
  zone configurations? For these designate migrations, we will have to
  enumerate over every zone to fix their configurations.
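
A sketch of that per-zone enumeration for the bind9 backend (the zone
listing command, new master IPs and $hash are assumed; $hash must match
the file name printed by 'rndc showzone $zone'):

  for zone in $(openstack zone list -f value -c name); do
      juju run -a designate-bind -- rndc modzone $zone "{ type slave; file
      \"slave.$zone.$hash\"; masters { $new_ip_1 port 5354; $new_ip_2 port
      5354; $new_ip_3 port 5354; }; };"
  done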

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  designate-manage pool update doesn't reflects targets master dns
  servers into zones.

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1879798] Re: replacing designate units causes issues previously created zones

2020-05-27 Thread Jorge Niedbalski
### Further observations 

I am able to partially reproduce the problem.

Bundle used: http://paste.ubuntu.com/p/myxQJnJvyn/


$ openstack zone create --email dnsmas...@example.com example.com.
$ rndc showzone example.com.

zone "example.com" { type slave; file "slave.example.com.f3e3fdaa-
857e-4786-afef-2b4cb2d03357"; masters { 10.5.0.10 port 5354; 10.5.0.41
port 5354; 10.5.0.31 port 5354; }; };


$ juju remove-unit designate/0 designate/1 designate/2 --force
removing unit designate/0
removing unit designate/1
removing unit designate/2


root@juju-54f98f-1879798-4:/home/ubuntu# ack master /var/log/syslog 
May 27 19:21:20 juju-54f98f-1879798-4 named[1505]: received control channel 
command 'addzone example.com  { type slave; masters { 10.5.0.10 port 5354; 
10.5.0.41 port 5354; 10.5.0.31 port 5354;}; file 
"slave.example.com.f3e3fdaa-857e-4786-afef-2b4cb2d03357"; };'
May 27 19:55:32 juju-54f98f-1879798-4 named[6653]: zone example.com/IN: 
refresh: timeout retrying without EDNS master 10.5.0.10#5354 (source 0.0.0.0#0)
May 27 19:55:47 juju-54f98f-1879798-4 named[6653]: zone example.com/IN: 
refresh: retry limit for master 10.5.0.10#5354 exceeded (source 0.0.0.0#0)
May 27 19:56:05 juju-54f98f-1879798-4 named[6653]: zone example.com/IN: 
refresh: retry limit for master 10.5.0.41#5354 exceeded (source 0.0.0.0#0)
May 27 19:56:23 juju-54f98f-1879798-4 named[6653]: zone example.com/IN: 
refresh: retry limit for master 10.5.0.31#5354 exceeded (source 0.0.0.0#0)


$ juju add-unit -n 3 designate


root@juju-54f98f-1879798-4:/home/ubuntu# ack addzone /var/log/syslog 
May 27 19:21:20 juju-54f98f-1879798-4 named[1505]: received control channel 
command 'addzone example.com  { type slave; masters { 10.5.0.10 port 5354; 
10.5.0.41 port 5354; 10.5.0.31 port 5354;}; file 
"slave.example.com.f3e3fdaa-857e-4786-afef-2b4cb2d03357"; };'
May 27 19:21:20 juju-54f98f-1879798-4 named[1505]: added zone example.com in 
view _default via addzone

---

(Continues).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  replacing designate units causes issues previously created zones

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1879798] Re: replacing designate units causes issues previously created zones

2020-05-27 Thread Jorge Niedbalski
** Also affects: designate (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879798

Title:
  replacing designate units causes issues previously created zones

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-designate/+bug/1879798/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-05-12 Thread Jorge Niedbalski
Hello,

I've verified that the current -proposed package fixes the issue for us, for the
given use case.

Using the following deployment bundle on a Bionic + Rocky cloud
http://paste.ubuntu.com/p/jnVdVvQg7k/

Without the patch, the problem is reproduced as expressed on the case
description:

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack/00268110$
openstack secret container create --type='certificate' --name "test-
tls-1" --secret="certificate=https://10.5.0.11:9312/v1/secrets/7aa7727d-
f39b-45f8-9310-f5c595ad4feb"
--secret="private_key=https://10.5.0.11:9312/v1/secrets/189736d1-51d8-4cbe-9638-ceadcbb664ac;
--secret="intermediates=https://10.5.0.11:9312/v1/secrets/70e2cf9c-8110-4d25-a1e3-f7b6f3950e64;


ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack/00268110$ openstack 
loadbalancer listener create --protocol-port 443 --protocol "TERMINATED_HTTPS" 
--name "test-listener" 
--default-tls-container="https://10.5.0.11:9312/v1/containers/b548ab63-474d-4a94-b121-4eae8193fcc1;
 -- lb1
The PKCS12 bundle is unreadable. Please check the PKCS12 bundle validity. In 
addition, make sure it does not require a pass phrase. Error: [('asn1 encoding 
routines', 'asn1_d2i_read_bio', 'not enough data')] (HTTP 400) (Request-ID: 
req-c79fbcb1-06d8-47e4-9754-8066596ba262)


With the patch applied in the following version:


root@juju-be44b9-barbican-10:/home/ubuntu# dpkg -l |grep barbican
ii  python3-barbicanclient   4.6.0-0ubuntu1.1   
 all  OpenStack Key Management API client - Python 3.x


| https://10.5.0.11:9312/v1/containers/bd67d6f4-3a82-4a86-9679-c97a66ceeb19 | 
None   | 2020-05-12T21:37:32+00:00 | ACTIVE | certificate | 
certificate=https://10.5.0.11:9312/v1/secrets/26ed5706-5f0a-4f9f-b226-e8595031515e
   | None  |
|   |   
 |   || | 
private_key=https://10.5.0.11:9312/v1/secrets/9a3bd926-6ba9-46be-8168-6b5e79e09b36
   |   |
+---++---++-+--+---+


The issue is no longer reproducible and listeners can be created.


ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack/00268110$ openstack 
loadbalancer listener create --protocol-port 443 --protocol "TERMINATED_HTTPS" 
--name "test-listener-2" 
--default-tls-container="https://10.5.0.11:9312/v1/containers/bd67d6f4-3a82-4a86-9679-c97a66ceeb19;
 -- lb2
+-+---+
| Field   | Value   
  |
+-+---+
| admin_state_up  | True
  |
| connection_limit| -1  
  |
| created_at  | 2020-05-12T21:38:28 
  |
| default_pool_id | None
  |
| default_tls_container_ref   | 
https://10.5.0.11:9312/v1/containers/bd67d6f4-3a82-4a86-9679-c97a66ceeb19 |
| description | 
  |
| id  | 971a679d-4a07-4012-8552-fac8f0f450ab
  |
| insert_headers  | None
  |
| l7policies  | 
  |
| loadbalancers   | 9a49ae4e-4bae-451d-bcec-b22dadf1df29
  |
| name| test-listener-2 
  |
| operating_status| OFFLINE 
  |
| project_id  | 2ab451be592d468bad963a95a342e099
  |
| protocol| TERMINATED_HTTPS
  |
| protocol_port   | 443 
  |
| provisioning_status | PENDING_CREATE  
  |
| sni_container_refs  | []  
  |
| timeout_client_data | 5   
  |
| timeout_member_connect  | 5000

[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-04-27 Thread Jorge Niedbalski
@sil2100 Hello Lukasz,

I've added an IMPACTED VERSIONS NOTE to the description. Any user
running Bionic may hit this issue if running the library standalone
and hitting the same endpoints. However, this is unlikely to be manifested
by any user unless the client is deployed with octavia (which is in the
cloud-archive). That component (octavia-api) makes extensive use of the
barbicanclient API, and therefore any cloud >= rocky
deployed on top of Bionic will manifest the issue.

I hope this clarifies the situation further; if not, please let me know
and I'll provide any further details.



** Description changed:

  [Impact]
  
  Users of Ubuntu bionic running openstack clouds >= rocky
  can't create octavia load balancers listeners anymore since the backport of 
the following patch:
  
  
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
  
  This change was introduced as part of the following backports and
  their posterior syncs into the current Bionic version.
  
- This fix being SRUed here is contained in 4.8.1-0ubuntu1 (disco onwards)
- but not on the Bionic version 4.6.0-0ubuntu1.
+  IMPACTED VERSIONS NOTE 
  
- The issue gets exposed with the following octavia
- packages from UCA + python-barbicanclient 4.6.0ubuntu1.
+ This issue can be triggered in standalone without any cloud-archive
+ dependency and affects python-barbicanclient 4.6.0ubuntu1, which is the
+ Bionic version. The issue was fixed in 4.8.1-0ubuntu1 (disco onwards).
  
- Please note that likely this python-barbicanclient dependency should
- be part of UCA and not of main/universe.
+ However, this exception gets easily manifested in OpenStack deployments
+ that use octavia packages from UCA + python-barbicanclient 4.6.0ubuntu1, as 
octavia provides direct interaction with the barbican client.
+ 
+ This means that any Ubuntu OpenStack cloud deployed from UCA on a release
+ >= rocky will manifest this issue when deployed on top of Bionic.
+ 
  
   octavia-api | 3.0.0-0ubuntu3~cloud0   | rocky  | all
   octavia-api | 4.0.0-0ubuntu1.1~cloud0 | stein  | all
   octavia-api | 4.0.0-0ubuntu1~cloud0   | train  | all
  
  This change added a new exception handler in the code
  that manages the decoding of the given PKCS12 certificate bundle when the 
listener is created. This handler now captures the PKCS12 decoding error and 
then raises it, preventing
  the listener creation from happening (when it's invoked with e.g. 
--default-tls-container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a"
 ); this was originally being hidden
  under the legacy code handler, as can be seen here:
  
  
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
  
  This exception is raised because the barbicanclient doesn't know how to 
distinguish between a given secret and a container; therefore, when the
  user specifies a container UUID the client tries to fetch a secret with that 
uuid (including the /containers/UUID path), and an error 400 (not the expected 
404 HTTP error) is returned.
  
  The change proposed in the SRU makes the client aware of container and
  secret UUID(s) and is able to split the path to distinguish a non-secret
  (such as a container); that way, if a container is passed, it fails the
  parsing validation and the right return code (404) is returned
  by the client.
  
  If an error 404 gets returned, then the except Exception block gets
  executed and the legacy driver code for decoding the PKCS12 certificate in 
octavia is invoked; this legacy
  driver is able to decode the container payloads and the decoding of the 
PKCS12 certificate succeeds.
  
  This differentiation was implemented here:
  
  https://github.com/openstack/python-
  barbicanclient/commit/6651c8ffce48ce7ff08f5563a8e6212677ea0468
  
  As an example (this worked before the latest bionic version was pushed)
  
  openstack loadbalancer listener create --protocol-port 443 --protocol
  "TERMINATED_HTTPS" --name "test-listener" --default-tls-
  container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-
  86eb3cc7fe1a" -- lb1
  
  With the newest package upgrade this creation will fail with the
  following exception:
  
  The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
  validity. In addition, make sure it does not require a pass phrase.
  Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough
  data')] (HTTP 400) (Request-ID: req-8e48d0b5-3f5b-
  4d26-9920-72b03343596a)
  
  Further rationale on this can be found on
  https://storyboard.openstack.org/#!/story/2007371
  
  [Test Case]
  
  1) Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
  
  2) Create self-signed certificate, key and ca
  (http://paste.ubuntu.com/p/xyyxHZGDFR/)
  
  3) Create the 3 certs at barbican
  
  $ openstack secret store --name "test-pk-1" --secret-type "private"
  --payload-content-type "text/plain" 

[Bug 1840844] Re: user with admin role gets logged out when trying to list images

2020-04-27 Thread Jorge Niedbalski
** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Focal)
   Importance: Undecided
   Status: New


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-04-13 Thread Jorge Niedbalski
The verification for eoan series has been performed using:

* Bundle http://paste.ubuntu.com/p/zBQfXWq77R/
* Following containerd config: https://paste.ubuntu.com/p/GDpjp2fd4t/

Annotations:  Status:  Pending
IP:   10.1.8.11
IPs:
  IP:  10.1.8.11
Containers:
  busybox:
Container ID:  
Image: niedbalski-bastion.cloud.sts:5000/busybox:latest
Image ID:  
Port:  
Host Port: 
Command:
  sleep
  3600
State:  Waiting
  Reason:   ErrImagePull
Ready:  False
Restart Count:  0
Environment:
Mounts:
  /var/run/secrets/kubernetes.io/serviceaccount from default-token-vwm4f 
(ro)
Conditions:
  Type  Status
  Initialized   True 
  Ready False 
  ContainersReady   False 
  PodScheduled  True 
Volumes:
  default-token-vwm4f:
Type:Secret (a volume populated by a Secret)
SecretName:  default-token-vwm4f
Optional:false
QoS Class:   BestEffort
Node-Selectors:  
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age   From                                Message
  ----     ------       ---   ----                                -------
  Normal   Scheduled  default-scheduler  
Successfully assigned default/busybox to juju-775746-00268738-1-4
  Warning  FailedMount  17skubelet, juju-775746-00268738-1-4  
MountVolume.SetUp failed for volume "default-token-vwm4f" : failed to sync 
secret cache: timed out waiting for the condition
  Normal   Pulling  11skubelet, juju-775746-00268738-1-4  Pulling 
image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
  Warning  Failed   7s kubelet, juju-775746-00268738-1-4  Failed to 
pull image "niedbalski-bastion.cloud.sts:5000/busybox:latest": rpc error: code 
= Unknown desc = failed to pull and unpack image 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to resolve reference 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to do request: Head 
niedbalski-bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported 
protocol scheme "niedbalski-bastion.cloud.sts"
  Warning  Failed   7s kubelet, juju-775746-00268738-1-4  Error: 
ErrImagePull
  Normal   BackOff  7s kubelet, juju-775746-00268738-1-4  Back-off 
pulling image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
  Warning  Failed   7s kubelet, juju-775746-00268738-1-4  Error: 
ImagePullBackOff
---

After applying the -proposed version 1.3.3-0ubuntu1~19.10.2

ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl apply 
-f busybox.yaml 
pod/busybox created
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
pod -o wide
NAME  READY   STATUSRESTARTS   AGE   IP  NODE   
NOMINATED NODE   READINESS GATES
busybox   1/1 Running   0  4s10.1.8.31   
juju-775746-00268738-1-4  
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ juju run 
--application kubernetes-worker "sudo dpkg -l | grep containerd"
- Stdout: |
ii  containerd   1.3.3-0ubuntu1~19.10.2 
   amd64daemon to control runC
  UnitId: kubernetes-worker/0


Therefore the patch fixes the regression.

** Tags removed: verification-needed verification-needed-eoan
** Tags added: verification-done verification-done-eoan


[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-04-13 Thread Jorge Niedbalski
Hello @racb @coreycb,

I've updated the description, testing and other sections. Is there
anything left to be done on my side to continue with the SRU?

Thank you.


[Bug 1870619] Re: rabbitmq-server startup does not wait long enough

2020-04-07 Thread Jorge Niedbalski
** No longer affects: rabbitmq-server (Ubuntu Disco)


[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-04-06 Thread Jorge Niedbalski
** Description changed:

  [Impact]
  
  Users of Ubuntu bionic running openstack clouds >= rocky
  can't create octavia load balancer listeners anymore since the backport of
  the following patch:
  
  
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
  
  This change was introduced as part of the following backports and
  their subsequent syncs into the current Bionic version.
  
  This fix being SRUed here is contained in 4.8.1-0ubuntu1 (disco onwards)
  but not on the Bionic version 4.6.0-0ubuntu1.
  
  The issue gets exposed with the following octavia
  packages from UCA + python-barbicanclient 4.6.0ubuntu1.
  
  Please note that likely this python-barbicanclient dependency should
- be part of UCA and not of main/universe. 
+ be part of UCA and not of main/universe.
  
-  octavia-api | 3.0.0-0ubuntu3~cloud0   | rocky  | all
-  octavia-api | 4.0.0-0ubuntu1.1~cloud0 | stein  | all
-  octavia-api | 4.0.0-0ubuntu1~cloud0   | train  | all
- 
+  octavia-api | 3.0.0-0ubuntu3~cloud0   | rocky  | all
+  octavia-api | 4.0.0-0ubuntu1.1~cloud0 | stein  | all
+  octavia-api | 4.0.0-0ubuntu1~cloud0   | train  | all
  
  This change added a new exception handler in the code that manages the
  decoding of the given PKCS12 certificate bundle when the listener is
  created. The handler now captures the PKCS12 decoding error and re-raises
  it, preventing the listener creation from happening (when invoked with,
  e.g., --default-tls-container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a");
  this error was originally hidden under the legacy code handler, as can be
  seen here:
  
  https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
  
  This exception is raised because the barbicanclient doesn't know how to
  distinguish between a given secret and a container; therefore, when the
  user specifies a container UUID, the client tries to fetch a secret with
  that UUID (including the /containers/UUID path) and an error 400 (not the
  expected 404 HTTP error) is returned.
  
  The change proposed on the SRU makes the client aware of container and
  secret UUID(s) and able to split the path to distinguish a non-secret
  (such as a container); that way, if a container is passed, it fails the
  parsing validation and the right return code (404) is returned by the
  client.
  
  If an error 404 gets returned, then the except Exception block gets
  executed and the legacy driver code for decoding the PKCS12 certificate
  in octavia is invoked; this legacy driver is able to decode the container
  payloads, and the decoding of the PKCS12 certificate succeeds.
  
  This differentiation was implemented here:
  
  https://github.com/openstack/python-
  barbicanclient/commit/6651c8ffce48ce7ff08f5563a8e6212677ea0468
  
  As an example (this worked before the latest bionic version was pushed)
  
  openstack loadbalancer listener create --protocol-port 443 --protocol
  "TERMINATED_HTTPS" --name "test-listener" --default-tls-
  container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-
  86eb3cc7fe1a" -- lb1
  
  With the newest package upgrade this creation will fail with the
  following exception:
  
  The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
  validity. In addition, make sure it does not require a pass phrase.
  Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough
  data')] (HTTP 400) (Request-ID: req-8e48d0b5-3f5b-
  4d26-9920-72b03343596a)
  
  Further rationale on this can be found on
  https://storyboard.openstack.org/#!/story/2007371
- 
  
  [Test Case]
  
  1) Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
  
  2) Create self-signed certificate, key and ca
  (http://paste.ubuntu.com/p/xyyxHZGDFR/)
  
  3) Create the 3 certs at barbican
  
  $ openstack secret store --name "test-pk-1" --secret-type "private"
  --payload-content-type "text/plain" --payload="$(cat
  ./keys/controller_key.pem)"
  
  $ openstack secret store --name "test-ca-1" --secret-type "certificate"
  --payload-content-type "text/plain" --payload="$(cat
  ./keys/controller_ca.pem)"
  
  $ openstack secret store --name "test-pub-1" --secret-type "certificate"
  --payload-content-type "text/plain" --payload="$(cat
  ./keys/controller_cert.pem)"
  
  4) Create a loadbalancer
  $ openstack loadbalancer create --name lb1 --vip-subnet-id private_subnet
  
  5) Create a secrets container
  
  $ openstack secret container create --type='certificate' --name "test-
  tls-1"
  
--secret="certificate=https://10.5.0.4:9312/v1/secrets/3c9109d9-05e0-45fe-9661-087c50061c00"
  --secret="private_key=https://10.5.0.4:9312/v1/secrets/378e8f8c-81f5
  -4b5a-bffd-c0c43a41b4a8"
  --secret="intermediates=https://10.5.0.4:9312/v1/secrets/07a7564d-
  b5c6-4433-a0a9-a195e2d54c57"
  
  6) Try to create the listener
  
  openstack loadbalancer listener create 

[Bug 1870619] Re: rabbitmq-server startup does not wait long enough

2020-04-03 Thread Jorge Niedbalski
** Also affects: rabbitmq-server (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Disco)
   Importance: Undecided
   Status: New


[Bug 1828988] Re: rabbitmq server fails to start after cluster reboot

2020-04-03 Thread Jorge Niedbalski
** Also affects: rabbitmq-server (Ubuntu)
   Importance: Undecided
   Status: New


[Bug 1828988] Re: rabbitmq server fails to start after cluster reboot

2020-04-02 Thread Jorge Niedbalski
** Also affects: rabbitmq-server (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: rabbitmq-server (Ubuntu Eoan)
   Importance: Undecided
   Status: New


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-26 Thread Jorge Niedbalski
** Patch added: "Backport of Bug fix for Focal"
   
https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1867398/+attachment/5341943/+files/fix-1867398-focal.debdiff


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-26 Thread Jorge Niedbalski
** Patch added: "Backport of Bug fix for eoan"
   
https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1867398/+attachment/5341942/+files/fix-1867398-eoan.debdiff


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-26 Thread Jorge Niedbalski
** Changed in: containerd (Ubuntu Eoan)
   Status: Fix Released => Confirmed

** Changed in: containerd (Ubuntu Focal)
   Status: Fix Released => Confirmed

** Changed in: containerd (Ubuntu Eoan)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)

** Changed in: containerd (Ubuntu Focal)
 Assignee: (unassigned) => Jorge Niedbalski (niedbalski)


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-26 Thread Jorge Niedbalski
Deployed the following bundle: http://paste.ubuntu.com/p/tdjqQ3GjJ2/

Followed the reproducer steps.

### With current bionic-updates version 1.3.3-0ubuntu1~18.04.1, problem
reproduced. #

ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl 
delete pod --all
pod "busybox" deleted
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ juju run 
--application kubernetes-worker "sudo grep -i niedbalski /etc/containerd/* | 
grep -i endpoint" 
- Stdout: |
/etc/containerd/config.toml:  endpoint = 
["niedbalski-bastion.cloud.sts:5000"]
  UnitId: kubernetes-worker/0
- ReturnCode: 1
  Stdout: ""
  UnitId: kubernetes-worker/1

ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ juju run 
--application kubernetes-worker "sudo dpkg -l |grep containerd"
- Stdout: |
ii  containerd  1.3.3-0ubuntu1~18.04.1  
amd64daemon to control runC
  UnitId: kubernetes-worker/0

ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl 
delete pod --all
pod "busybox" deleted
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
nodes
NAME STATUS ROLESAGE   VERSION
juju-3a79d2-00268738-4   Ready 13d   v1.16.8
juju-3a79d2-00268738-5   Ready,SchedulingDisabled  13d   v1.16.8
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
pod -o wide
No resources found in default namespace.

ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl apply 
-f busybox.yaml 
pod/busybox created
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
pod -o wide
NAME  READY   STATUS RESTARTS   AGE   IP  NODE  
   NOMINATED NODE   READINESS GATES
busybox   0/1 ErrImagePull   0  3s10.1.84.4   
juju-3a79d2-00268738-4  


### With current -proposed version 1.3.3-0ubuntu1~18.04.2, problem is fixed. 
#


ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
nodes
NAME STATUS ROLESAGE   VERSION
juju-3a79d2-00268738-4   Ready 13d   v1.16.8
juju-3a79d2-00268738-5   Ready,SchedulingDisabled  13d   v1.16.8
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
pod -o wide
No resources found in default namespace.
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl apply 
-f busybox.yaml 
pod/busybox created
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
nodes
NAME STATUS ROLESAGE   VERSION
juju-3a79d2-00268738-4   Ready 13d   v1.16.8
juju-3a79d2-00268738-5   Ready,SchedulingDisabled  13d   v1.16.8
ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ kubectl get 
pod -o wide
NAME  READY   STATUSRESTARTS   AGE   IP  NODE   
  NOMINATED NODE   READINESS GATES
busybox   1/1 Running   0  6s10.1.84.3   juju-3a79d2-00268738-4 
 


ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ juju run 
--application kubernetes-worker "sudo dpkg -l |grep containerd"
- Stdout: |
ii  containerd  1.3.3-0ubuntu1~18.04.2  
amd64daemon to control runC
  UnitId: kubernetes-worker/0
- Stdout: |
ii  containerd  1.3.3-0ubuntu1~18.04.1  
amd64daemon to control runC
  UnitId: kubernetes-worker/1

ubuntu@niedbalski-bastion:~/stsstack-bundles/kubernetes/00268738$ more 
busybox.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
- name: busybox
  image: niedbalski-bastion.cloud.sts:5000/busybox:latest
  command:
- sleep
- "3600"
  imagePullSecrets:
- name: regcred
  restartPolicy: Always

Therefore the verification is done.

** Tags removed: verification-needed verification-needed-bionic
** Tags added: verification-done verification-done-bionic


[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-03-26 Thread Jorge Niedbalski
@racb, thank you very much for the review. I've updated the description,
versions, and the regression potential of the SRU template.

** Description changed:

  [Impact]
  
- Users of Ubuntu bionic running openstack clouds >= rocky 
- can't create octavia load balancers listeners anymore since the backport of 
the following patch: 
+ Users of Ubuntu bionic running openstack clouds >= rocky
+ can't create octavia load balancer listeners anymore since the backport of
+ the following patch:
  
  
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
  
  This change was introduced as part of the following backports and
  their subsequent syncs into the current Bionic version.
+ 
+ This fix being SRUed here is contained in 4.8.1-0ubuntu1 (disco onwards)
+ but not on the Bionic version 4.6.0-0ubuntu1.
+ 
+ The issue gets exposed with the following octavia
+ packages from UCA + python-barbicanclient 4.6.0ubuntu1.
+ 
+ Please note that likely this python-barbicanclient dependency should
+ be part of UCA and not of main/universe. 
+ 
+  octavia-api | 3.0.0-0ubuntu3~cloud0   | rocky  | all
+  octavia-api | 4.0.0-0ubuntu1.1~cloud0 | stein  | all
+  octavia-api | 4.0.0-0ubuntu1~cloud0   | train  | all
+ 
  
  This change added a new exception handler in the code that manages the
  decoding of the given PKCS12 certificate bundle when the listener is
  created. The handler now captures the PKCS12 decoding error and re-raises
  it, preventing the listener creation from happening (when invoked with,
  e.g., --default-tls-container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a");
  this error was originally hidden under the legacy code handler, as can be
  seen here:
  
  https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
- 
  
  This exception is raised because the barbicanclient doesn't know how to
  distinguish between a given secret and a container; therefore, when the
  user specifies a container UUID, the client tries to fetch a secret with
  that UUID (including the /containers/UUID path) and an error 400 (not the
  expected 404 HTTP error) is returned.
  
  The change proposed on the SRU makes the client aware of container and
  secret UUID(s) and able to split the path to distinguish a non-secret
  (such as a container); that way, if a container is passed, it fails the
  parsing validation and the right return code (404) is returned by the
  client.
  
  If an error 404 gets returned, then the except Exception block gets
  executed and the legacy driver code for decoding the PKCS12 certificate
  in octavia is invoked; this legacy driver is able to decode the container
  payloads, and the decoding of the PKCS12 certificate succeeds.
  
  This differentiation was implemented here:
  
  https://github.com/openstack/python-
  barbicanclient/commit/6651c8ffce48ce7ff08f5563a8e6212677ea0468
  
  As an example (this worked before the latest bionic version was pushed)
  
  openstack loadbalancer listener create --protocol-port 443 --protocol
  "TERMINATED_HTTPS" --name "test-listener" --default-tls-
  container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-
  86eb3cc7fe1a" -- lb1
  
  With the newest package upgrade this creation will fail with the
  following exception:
  
  The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
  validity. In addition, make sure it does not require a pass phrase.
  Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough
  data')] (HTTP 400) (Request-ID: req-8e48d0b5-3f5b-
  4d26-9920-72b03343596a)
  
+ Further rationale on this can be found on
+ https://storyboard.openstack.org/#!/story/2007371
  
- Further rationale on this can be found on 
https://storyboard.openstack.org/#!/story/2007371
- 
- 
- ---
- [Impact]
- 
- As per https://storyboard.openstack.org/#!/story/2007371 we identified that
- ubuntu clouds running the version 4.6.0 (bionic) aren't raising a 404
- error when a secret container is passed.
- 
- This causes the code to not fall back into the legacy mode
  
  [Test Case]
  
  1) Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
  
  2) Create self-signed certificate, key and ca
  (http://paste.ubuntu.com/p/xyyxHZGDFR/)
- 
  
  3) Create the 3 certs at barbican
  
  $ openstack secret store --name "test-pk-1" --secret-type "private"
  --payload-content-type "text/plain" --payload="$(cat
  ./keys/controller_key.pem)"
  
  $ openstack secret store --name "test-ca-1" --secret-type "certificate"
  --payload-content-type "text/plain" --payload="$(cat
  ./keys/controller_ca.pem)"
  
  $ openstack secret store --name "test-pub-1" --secret-type "certificate"
  --payload-content-type "text/plain" --payload="$(cat
  ./keys/controller_cert.pem)"
  
  4) Create a loadbalancer
  $ openstack loadbalancer create --name lb1 --vip-subnet-id private_subnet
  
- 
  5) Create a secrets container
  
  $ openstack secret 

[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-25 Thread Jorge Niedbalski
** Description changed:

  [Description]
  
  Kubernetes 1.16.17
  Containerd 1.3.3
  Ubuntu Bionic
  
  [Affected Releases]
  
   containerd | 1.3.3-0ubuntu1~18.04.1 | bionic-updates/universe  | source, 
amd64, arm64, armhf, i386, ppc64el, s390x
   containerd | 1.3.3-0ubuntu1~19.10.1 | eoan-updates/universe| source, 
amd64, arm64, armhf, i386, ppc64el, s390x
   containerd | 1.3.3-0ubuntu1 | focal| source, 
amd64, arm64, armhf, ppc64el, s390x
  
  [Impact]
  
  Reported upstream: https://github.com/containerd/containerd/issues/4108
  
- The bump of to version 1.3.3 through [0]
- https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1854841
+ User Impact:
  
- Caused a regression.
+ Since the Ubuntu bionic-updates bump to version 1.3.3 through [0]
+ https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1854841,
+ a regression was introduced.
  
- The following endpoint description works with containerd 1.2.X without 
defining
- a protocol scheme. (/etc/containerd/config.toml).
+ The following endpoint description stopped working when scheduling pods
+ with k8s 1.16-1.17.
  
  
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."niedbalski-bastion.cloud.sts:5000"]
    endpoint = ["niedbalski-bastion.cloud.sts:5000"]
- This stopped working on 1.3.X , scheduling pods with k8s 1.16-1.17 doesn't
- works using the same registry mirror definition.
  
- The pod definition is:
+ 
+ As an example, A pod defined as following:
  
  apiVersion: v1
  kind: Pod
  metadata:
    name: busybox
    namespace: default
  spec:
    containers:
  - name: busybox
    image: niedbalski-bastion.cloud.sts:5000/busybox:latest
    command:
  - sleep
  - "3600"
    imagePullSecrets:
  - name: regcred
    restartPolicy: Always
- New pods fail with the following error:
+ 
+ Will fail with the following error:
  
  " failed to do request: Head niedbalski-
  bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported protocol
  scheme "niedbalski-bastion.cloud.sts"
  
  Normal Scheduled default-scheduler Successfully assigned default/busybox to 
juju-3a79d2-00268738-4
  Normal Pulling 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Pulling 
image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
  Warning Failed 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Failed to 
pull image "niedbalski-bastion.cloud.sts:5000/busybox:latest": rpc error: code 
= Unknown desc = failed to pull and unpack image 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to resolve reference 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to do request: Head 
niedbalski-bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported 
protocol scheme "niedbalski-bastion.cloud.sts"
  Warning Failed 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Error: 
ErrImagePull
  Warning Failed 8m27s (x6 over 10m) kubelet, juju-3a79d2-00268738-4 Error: 
ImagePullBackOff
  Normal BackOff 4m56s (x21 over 10m) kubelet, juju-3a79d2-00268738-4 Back-off 
pulling image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
  
  [Test Case]
  
- Configure a private docker repository repository
+ 1) Configure a private docker repository
  
- Modify the containerd registry mirror config as follows:
+ 2)  Modify the containerd registry mirror config as follows:
  ** http://paste.ubuntu.com/p/yP63WMkVT6/
  
- Execute the following pod (http://paste.ubuntu.com/p/BVYQFMfCmk/)
+ 3) Execute the following pod (http://paste.ubuntu.com/p/BVYQFMfCmk/)
  
  Status of the scheduled pod should be ImagePullBackOff
  and the before mentioned error should be raised.
+ 
  
  [Possible workaround and solution]
  
  As a workaround, change the endpoint to include the scheme (https://).
  Provide a fallback mechanism for URL parsing validation to fall back to
  http or https.
  I suspect that this change, introduced on 1.3.3 through commit
  0b29c9c, may be the offending commit.
  
  [Regression Potential]
  
- ** Not identified yet any regression potential, this functionality fixes
- an existing regression introduced in the latest update.
+ ** The change proposed on the SRU takes into consideration both cases:
+ 1) an endpoint without a scheme, 2) an endpoint with a scheme.
+ 
+ 1) worked in 1.2.6 as explained in the "Impact" section and stopped
+ being supported with the current Bionic version 1.3.3; 2) should work
+ in both cases.
+ 
+ This should break neither existing endpoint definitions
+ nor new deployments of containerd.
  
  [Other Info]
  
  ** This commit upstream
  
https://github.com/containerd/containerd/commit/a022c218194c05449ad69b69c48fc6cac9d6f0b3
  addresses the issue.

** Description changed:

  [Description]
  
  Kubernetes 1.16.17
  Containerd 1.3.3
  Ubuntu Bionic
  
  [Affected Releases]
  
   containerd | 1.3.3-0ubuntu1~18.04.1 | bionic-updates/universe  | source, 
amd64, arm64, armhf, i386, ppc64el, s390x
   containerd | 1.3.3-0ubuntu1~19.10.1 | 

[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-03-25 Thread Jorge Niedbalski
** Description changed:

+ [Impact]
+ 
+ Users of Ubuntu bionic running openstack clouds >= rocky
+ can't create octavia load balancer listeners anymore since the backport of
+ the following patch:
+ 
+ 
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
+ 
+ This change was introduced as part of the following backports and
+ their subsequent syncs into the current Bionic version.
+ 
+ This change added a new exception handler in the code that manages the
+ decoding of the given PKCS12 certificate bundle when the listener is
+ created. The handler now captures the PKCS12 decoding error and re-raises
+ it, preventing the listener creation from happening (when invoked with,
+ e.g., --default-tls-container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a");
+ this error was originally hidden under the legacy code handler, as can be
+ seen here:
+ 
+ https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
+ 
+ This exception is raised because the barbicanclient doesn't know how to
+ distinguish between a given secret and a container; therefore, when the
+ user specifies a container UUID, the client tries to fetch a secret with
+ that UUID (including the /containers/UUID path) and an error 400 (not the
+ expected 404 HTTP error) is returned.
+ 
+ The change proposed on the SRU makes the client aware of container and
+ secret UUID(s) and able to split the path to distinguish a non-secret
+ (such as a container); that way, if a container is passed, it fails the
+ parsing validation and the right return code (404) is returned by the
+ client.
+ 
+ If an error 404 gets returned, then the except Exception block gets
+ executed and the legacy driver code for decoding the PKCS12 certificate
+ in octavia is invoked; this legacy driver is able to decode the container
+ payloads, and the decoding of the PKCS12 certificate succeeds.
+ 
+ This differentiation was implemented here:
+ 
+ https://github.com/openstack/python-
+ barbicanclient/commit/6651c8ffce48ce7ff08f5563a8e6212677ea0468
+ 
+ As an example (this worked before the latest bionic version was pushed)
+ 
+ openstack loadbalancer listener create --protocol-port 443 --protocol
+ "TERMINATED_HTTPS" --name "test-listener" --default-tls-
+ container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-
+ 86eb3cc7fe1a" -- lb1
+ 
+ With the newest package upgrade this creation will fail with the
+ following exception:
+ 
+ The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
+ validity. In addition, make sure it does not require a pass phrase.
+ Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough
+ data')] (HTTP 400) (Request-ID: req-8e48d0b5-3f5b-
+ 4d26-9920-72b03343596a)
+ 
+ 
+ Further rationale on this can be found on 
https://storyboard.openstack.org/#!/story/2007371
+ 
+ 
+ ---
  [Impact]
  
  As per https://storyboard.openstack.org/#!/story/2007371 we identified that
  ubuntu clouds running the version 4.6.0 (bionic) aren't raising a 404
  error when a secret container is passed.
  
  This causes the code to not fall back into the legacy mode
  
  [Test Case]
  
- Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
- Create self-signed certificate, key and ca 
(http://paste.ubuntu.com/p/xyyxHZGDFR/)
- Create the 3 certs at barbican
- $ openstack secret store --name "test-pk-1" --secret-type "private" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_key.pem)"
- $ openstack secret store --name "test-ca-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_ca.pem)"
- $ openstack secret store --name "test-pub-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat 
./keys/controller_cert.pem)"
+ 1) Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
  
- Create a loadbalancer
+ 2) Create self-signed certificate, key and ca
+ (http://paste.ubuntu.com/p/xyyxHZGDFR/)
+ 
+ 
+ 3) Create the 3 certs at barbican
+ 
+ $ openstack secret store --name "test-pk-1" --secret-type "private"
+ --payload-content-type "text/plain" --payload="$(cat
+ ./keys/controller_key.pem)"
+ 
+ $ openstack secret store --name "test-ca-1" --secret-type "certificate"
+ --payload-content-type "text/plain" --payload="$(cat
+ ./keys/controller_ca.pem)"
+ 
+ $ openstack secret store --name "test-pub-1" --secret-type "certificate"
+ --payload-content-type "text/plain" --payload="$(cat
+ ./keys/controller_cert.pem)"
+ 
+ 4) Create a loadbalancer
  $ openstack loadbalancer create --name lb1 --vip-subnet-id private_subnet
  
- Create a secrets container
+ 
+ 5) Create a secrets container
  
  $ openstack secret container create --type='certificate' --name "test-
  tls-1"
  
--secret="certificate=https://10.5.0.4:9312/v1/secrets/3c9109d9-05e0-45fe-9661-087c50061c00"
  --secret="private_key=https://10.5.0.4:9312/v1/secrets/378e8f8c-81f5
  

[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-23 Thread Jorge Niedbalski
** Description changed:

- [Environment]
+ [Description]
  
  Kubernetes 1.16.17
  Containerd 1.3.3
  Ubuntu Bionic
  
  [Affected Releases]
  
-  containerd | 1.3.3-0ubuntu1~18.04.1 | bionic-updates/universe  | source, 
amd64, arm64, armhf, i386, ppc64el, s390x
-  containerd | 1.3.3-0ubuntu1~19.10.1 | eoan-updates/universe| source, 
amd64, arm64, armhf, i386, ppc64el, s390x
-  containerd | 1.3.3-0ubuntu1 | focal| source, 
amd64, arm64, armhf, ppc64el, s390x
+  containerd | 1.3.3-0ubuntu1~18.04.1 | bionic-updates/universe  | source, 
amd64, arm64, armhf, i386, ppc64el, s390x
+  containerd | 1.3.3-0ubuntu1~19.10.1 | eoan-updates/universe| source, 
amd64, arm64, armhf, i386, ppc64el, s390x
+  containerd | 1.3.3-0ubuntu1 | focal| source, 
amd64, arm64, armhf, ppc64el, s390x
  
- 
- [Description]
+ [Impact]
  
  Reported upstream: https://github.com/containerd/containerd/issues/4108
  
  The bump to version 1.3.3 through [0]
  https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1854841
  caused a regression.
  
  The following endpoint description works with containerd 1.2.X without
  defining a protocol scheme (/etc/containerd/config.toml):
  
  
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."niedbalski-bastion.cloud.sts:5000"]
    endpoint = ["niedbalski-bastion.cloud.sts:5000"]
  
  This stopped working on 1.3.X; scheduling pods with k8s 1.16-1.17 doesn't
  work using the same registry mirror definition.
  
  The pod definition is:
  
  apiVersion: v1
  kind: Pod
  metadata:
    name: busybox
    namespace: default
  spec:
    containers:
  - name: busybox
    image: niedbalski-bastion.cloud.sts:5000/busybox:latest
    command:
  - sleep
  - "3600"
    imagePullSecrets:
  - name: regcred
    restartPolicy: Always
  New pods fail with the following error:
  
  " failed to do request: Head niedbalski-
  bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported protocol
  scheme "niedbalski-bastion.cloud.sts"
  
  Normal Scheduled default-scheduler Successfully assigned default/busybox to 
juju-3a79d2-00268738-4
  Normal Pulling 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Pulling 
image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
  Warning Failed 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Failed to 
pull image "niedbalski-bastion.cloud.sts:5000/busybox:latest": rpc error: code 
= Unknown desc = failed to pull and unpack image 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to resolve reference 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to do request: Head 
niedbalski-bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported 
protocol scheme "niedbalski-bastion.cloud.sts"
  Warning Failed 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Error: 
ErrImagePull
  Warning Failed 8m27s (x6 over 10m) kubelet, juju-3a79d2-00268738-4 Error: 
ImagePullBackOff
  Normal BackOff 4m56s (x21 over 10m) kubelet, juju-3a79d2-00268738-4 Back-off 
pulling image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
  
- [Steps to reproduce]
+ [Test Case]
  
  Configure a private docker repository repository
  
  Modify the containerd registry mirror config as follows:
  ** http://paste.ubuntu.com/p/yP63WMkVT6/
  
  Execute the following pod (http://paste.ubuntu.com/p/BVYQFMfCmk/)
  
  Status of the scheduled pod should be ImagePullBackOff
  and the before mentioned error should be raised.
  
  [Possible workaround and solution]
  
  As a workaround, change the endpoint to include the scheme (https://).
  Provide a fallback mechanism for URL parsing validation to fall back to
  http or https.
  I suspect that this change, introduced on 1.3.3 through commit
  0b29c9c, may be the offending commit.
+ 
+ [Regression Potential]
+ 
+ ** No regression potential has been identified yet; this change fixes
+ an existing regression introduced in the latest update.
+ 
+ [Other Info]
+ 
+ ** This commit upstream
+ 
https://github.com/containerd/containerd/commit/a022c218194c05449ad69b69c48fc6cac9d6f0b3
+ addresses the issue.

** Patch added: "Patch for bionic - Iter 2"
   
https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1867398/+attachment/5340471/+files/fix-1867398-bionic.debdiff


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-23 Thread Jorge Niedbalski
Thank you for reviewing @dgadomski, Uploaded a debdiff that address your
comments and updated the SRU template accordingly.


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-20 Thread Jorge Niedbalski
** Tags removed: sts-sponsor-dgadomski
** Tags added: sts-needs-sponsor sts-sponsors


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-19 Thread Jorge Niedbalski
** Patch added: "Bionic debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1867398/+attachment/5339108/+files/fix-1867398-bionic.debdiff

** Changed in: containerd (Ubuntu Focal)
   Status: New => Fix Released

** Changed in: containerd (Ubuntu Eoan)
   Status: New => Fix Released


[Bug 1867398] Re: [Regression] unsupported protocol scheme

2020-03-19 Thread Jorge Niedbalski
Attached is the debdiff that fixes this problem in bionic, it's a
backport of commit [0]

[0]
https://github.com/containerd/containerd/commit/a022c218194c05449ad69b69c48fc6cac9d6f0b3

** Tags added: sts-needs-sponsor


[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-03-17 Thread Jorge Niedbalski
** Patch added: "Patch for bionic"
   
https://bugs.launchpad.net/ubuntu/+source/python-barbicanclient/+bug/1867676/+attachment/5338009/+files/lp-1867676-bionic.debdiff


[Bug 1867676] Re: Fetching by secret container doesn't raises 404 exception

2020-03-17 Thread Jorge Niedbalski
** Changed in: python-barbicanclient (Ubuntu Bionic)
   Status: New => Won't Fix

** Changed in: python-barbicanclient (Ubuntu Bionic)
   Status: Won't Fix => Confirmed


[Bug 1867676] [NEW] Fetching by secret container doesn't raises 404 exception

2020-03-16 Thread Jorge Niedbalski
Public bug reported:

[Description]

As per https://storyboard.openstack.org/#!/story/2007371 we identified that
ubuntu clouds running version 4.6.0 (bionic) aren't raising a 404
error when a secret container is passed.

This causes the code to not fall back into the legacy mode.

[Reproducer]

Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
Create self-signed certificate, key and ca 
(http://paste.ubuntu.com/p/xyyxHZGDFR/)
Create the 3 certs at barbican
$ openstack secret store --name "test-pk-1" --secret-type "private" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_key.pem)"
$ openstack secret store --name "test-ca-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_ca.pem)"
$ openstack secret store --name "test-pub-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat 
./keys/controller_cert.pem)"

Create a loadbalancer
$ openstack loadbalancer create --name lb1 --vip-subnet-id private_subnet

Create a secrets container

$ openstack secret container create --type='certificate' --name "test-
tls-1"
--secret="certificate=https://10.5.0.4:9312/v1/secrets/3c9109d9-05e0-45fe-9661-087c50061c00"
--secret="private_key=https://10.5.0.4:9312/v1/secrets/378e8f8c-81f5
-4b5a-bffd-c0c43a41b4a8"
--secret="intermediates=https://10.5.0.4:9312/v1/secrets/07a7564d-
b5c6-4433-a0a9-a195e2d54c57"

Create the listener
openstack loadbalancer listener create --protocol-port 443 --protocol 
"TERMINATED_HTTPS" --name "test-listener" 
--default-tls-container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a"
 -- lb1

This creation will fail with the following exception:

The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
validity. In addition, make sure it does not require a pass phrase.
Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough
data')] (HTTP 400) (Request-ID: req-8e48d0b5-3f5b-
4d26-9920-72b03343596a)

[Possible Regressions]

* No regressions identified so far.

[Fix]

The following changesets need to be backported into the bionic version
4.6.0-0ubuntu1.

All of those are part of 4.8.0 onward.

** 
https://github.com/openstack/python-barbicanclient/commit/6651c8ffce48ce7ff08f5563a8e6212677ea0468
** 
https://github.com/openstack/python-barbicanclient/commit/4eec7121b39de3849b469c56d85b95520aab7bad

Corresponding reviews

https://review.opendev.org/#/c/602810/
https://review.opendev.org/#/c/628046/
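
For illustration, the following is a minimal Python sketch of the idea
behind those changesets. It is a sketch only: the function name, error
type and exact path handling below are stand-ins, not the exact
barbicanclient internals.

# Illustrative sketch: a reference handed to the secrets API must point
# at /v1/secrets/<uuid>; a container href fails the client-side parsing
# validation, so it can be reported as "not found" (404) instead of the
# server's HTTP 400.
from urllib import parse

def validate_ref_and_return_uuid(ref, entity):
    parts = parse.urlparse(ref).path.rstrip('/').split('/')
    # Expected path tail: [..., 'v1', '<entity>', '<uuid>']
    if len(parts) < 3 or parts[-2] != entity:
        raise ValueError("%s is not a valid %s reference" % (ref, entity))
    return parts[-1]

# A secret href passes the validation and yields the UUID:
validate_ref_and_return_uuid(
    "https://10.5.0.4:9312/v1/secrets/3c9109d9-05e0-45fe-9661-087c50061c00",
    "secrets")

# A container href passed where a secret is expected fails, which is what
# lets octavia's except block fall back to the legacy certificate manager:
try:
    validate_ref_and_return_uuid(
        "https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a",
        "secrets")
except ValueError:
    pass  # the client reports this as a 404-style "secret not found"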

** Affects: python-barbicanclient (Ubuntu)
 Importance: Undecided
 Status: Fix Released

** Affects: python-barbicanclient (Ubuntu Bionic)
 Importance: Undecided
 Status: New

** Affects: python-barbicanclient (Ubuntu Disco)
 Importance: Undecided
 Status: Fix Released

** Affects: python-barbicanclient (Ubuntu Eoan)
 Importance: Undecided
 Status: Fix Released

** Affects: python-barbicanclient (Ubuntu Focal)
 Importance: Undecided
 Status: Fix Released

** Also affects: python-barbicanclient (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: python-barbicanclient (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: python-barbicanclient (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: python-barbicanclient (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Changed in: python-barbicanclient (Ubuntu Focal)
   Status: New => Fix Released

** Changed in: python-barbicanclient (Ubuntu Eoan)
   Status: New => Fix Released

** Changed in: python-barbicanclient (Ubuntu Disco)
   Status: New => Fix Released

** Description changed:

  [Description]
  
  As per https://storyboard.openstack.org/#!/story/2007371 we identified that
  ubuntu clouds running the version 4.6.0 (bionic) aren't raising a 404
  error when a secret container is passed.
  
  This causes the code to not fall back into the legacy mode
  
  [Reproducer]
  
  Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
  Create self-signed certificate, key and ca 
(http://paste.ubuntu.com/p/xyyxHZGDFR/)
  Create the 3 certs at barbican
  $ openstack secret store --name "test-pk-1" --secret-type "private" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_key.pem)"
  $ openstack secret store --name "test-ca-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_ca.pem)"
  $ openstack secret store --name "test-pub-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat 
./keys/controller_cert.pem)"
  
  Create a loadbalancer
  $ openstack loadbalancer create --name lb1 --vip-subnet-id private_subnet
  
  Create a secrets container
  
  $ openstack secret container create --type='certificate' --name "test-
  tls-1"
  
--secret="certificate=https://10.5.0.4:9312/v1/secrets/3c9109d9-05e0-45fe-9661-087c50061c00"
  

[Bug 1867398] [NEW] [Regression] unsupported protocol scheme

2020-03-13 Thread Jorge Niedbalski
Public bug reported:

[Environment]

Kubernetes 1.16.17
Containerd 1.3.3
Ubuntu Bionic

[Affected Releases]

 containerd | 1.3.3-0ubuntu1~18.04.1 | bionic-updates/universe  | source, 
amd64, arm64, armhf, i386, ppc64el, s390x
 containerd | 1.3.3-0ubuntu1~19.10.1 | eoan-updates/universe| source, 
amd64, arm64, armhf, i386, ppc64el, s390x
 containerd | 1.3.3-0ubuntu1 | focal| source, 
amd64, arm64, armhf, ppc64el, s390x


[Description]

Reported upstream: https://github.com/containerd/containerd/issues/4108

The bump to version 1.3.3 through [0]
https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1854841
caused a regression.

The following endpoint description works with containerd 1.2.X without
defining a protocol scheme (/etc/containerd/config.toml):


[plugins."io.containerd.grpc.v1.cri".registry.mirrors."niedbalski-bastion.cloud.sts:5000"]
  endpoint = ["niedbalski-bastion.cloud.sts:5000"]
This stopped working on 1.3.X; scheduling pods with k8s 1.16-1.17 doesn't
work using the same registry mirror definition.

The pod definition is:

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
- name: busybox
  image: niedbalski-bastion.cloud.sts:5000/busybox:latest
  command:
- sleep
- "3600"
  imagePullSecrets:
- name: regcred
  restartPolicy: Always
New pods fail with the following error:

" failed to do request: Head niedbalski-
bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported protocol
scheme "niedbalski-bastion.cloud.sts"

Normal Scheduled default-scheduler Successfully assigned default/busybox to 
juju-3a79d2-00268738-4
Normal Pulling 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Pulling 
image "niedbalski-bastion.cloud.sts:5000/busybox:latest"
Warning Failed 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Failed to 
pull image "niedbalski-bastion.cloud.sts:5000/busybox:latest": rpc error: code 
= Unknown desc = failed to pull and unpack image 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to resolve reference 
"niedbalski-bastion.cloud.sts:5000/busybox:latest": failed to do request: Head 
niedbalski-bastion.cloud.sts:///v2/busybox/manifests/latest: unsupported 
protocol scheme "niedbalski-bastion.cloud.sts"
Warning Failed 8m39s (x4 over 10m) kubelet, juju-3a79d2-00268738-4 Error: 
ErrImagePull
Warning Failed 8m27s (x6 over 10m) kubelet, juju-3a79d2-00268738-4 Error: 
ImagePullBackOff
Normal BackOff 4m56s (x21 over 10m) kubelet, juju-3a79d2-00268738-4 Back-off 
pulling image "niedbalski-bastion.cloud.sts:5000/busybox:latest"

[Steps to reproduce]

Configure a private docker repository

Modify the containerd registry mirror config as follows:
** http://paste.ubuntu.com/p/yP63WMkVT6/

Execute the following pod (http://paste.ubuntu.com/p/BVYQFMfCmk/)

Status of the scheduled pod should be ImagePullBackOff
and the before mentioned error should be raised.

[Possible workaround and solution]

As a workaround, change the endpoint to include the scheme (https://).
Provide a fallback mechanism for URL parsing validation to fall back to
http or https.
I suspect that this change, introduced on 1.3.3 through commit 0b29c9c,
may be the offending commit.
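
As an illustration only (containerd is written in Go and the exact
upstream logic may differ), the failure mode and the fallback idea can be
sketched in Python as follows; resolve_endpoint is a made-up name:

# A scheme-less "host:port" mirror endpoint is ambiguous to a URL parser:
# everything before the first colon can be read as the protocol scheme,
# which is how "niedbalski-bastion.cloud.sts" ends up rejected as an
# unsupported scheme by the HTTP client in 1.3.3.

def resolve_endpoint(endpoint):
    # Keep an explicit scheme as-is. Spelling it out in config.toml,
    # e.g. endpoint = ["https://niedbalski-bastion.cloud.sts:5000"],
    # is the workaround described above.
    if endpoint.startswith(("http://", "https://")):
        return endpoint
    # Otherwise default a scheme instead of letting "host:" be parsed as
    # one -- effectively the tolerant behaviour containerd 1.2.x had and
    # that the upstream fix restores.
    return "https://" + endpoint

print(resolve_endpoint("niedbalski-bastion.cloud.sts:5000"))
print(resolve_endpoint("https://registry.example.com:5000"))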

** Affects: containerd (Ubuntu)
 Importance: Undecided
 Status: New

** Affects: containerd (Ubuntu Bionic)
 Importance: Undecided
 Status: New

** Affects: containerd (Ubuntu Eoan)
 Importance: Undecided
 Status: New

** Affects: containerd (Ubuntu Focal)
 Importance: Undecided
 Status: New

** Also affects: containerd (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Description changed:

  [Environment]
  
  Kubernetes 1.16.17
  Containerd 1.3.3
  Ubuntu Bionic
+ 
+ [Affected Releases]
+ 
+  containerd | 1.3.3-0ubuntu1~18.04.1 | bionic-updates/universe  | source, 
amd64, arm64, armhf, i386, ppc64el, s390x
+  containerd | 1.3.3-0ubuntu1~19.10.1 | eoan-updates/universe| source, 
amd64, arm64, armhf, i386, ppc64el, s390x
+  containerd | 1.3.3-0ubuntu1 | focal| source, 
amd64, arm64, armhf, ppc64el, s390x
+ 
  
  [Description]
  
  Reported upstream: https://github.com/containerd/containerd/issues/4108
  
  The bump to version 1.3.3 through [0]
  https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1854841
  caused a regression.
  
  The following endpoint description works with containerd 1.2.X without
  defining a protocol scheme (/etc/containerd/config.toml):
  
- 
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."niedbalski-bastion.cloud.sts:5000"]
-   endpoint = ["niedbalski-bastion.cloud.sts:5000"]
+ 
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."niedbalski-bastion.cloud.sts:5000"]
+   endpoint = ["niedbalski-bastion.cloud.sts:5000"]
  This stopped working on 1.3.X; scheduling pods with k8s 1.16-1.17 doesn't
  work using the same registry mirror definition.
  
  The 

[Bug 1866085] [NEW] Not possible to create listeners that use barbican secret containers

2020-03-04 Thread Jorge Niedbalski
Public bug reported:

[Description]

On train and stein, with the addition of this change
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
(and its subsequent backport to stable releases) it is no longer possible
to create listeners that use barbican secret containers, except for single
secrets exported directly as PKCS12.

Before that change, any exception raised when trying to decode the PKCS12
bundle would have resulted in falling back to the legacy barbican
certificate manager code, which supports secret containers [2]; with the
addition of this line, the exception is raised and the code no longer
falls back to the legacy handler.

When this exception is raised, the following error is displayed to the
user:

$ openstack loadbalancer listener create --protocol-port 443 --protocol
"TERMINATED_HTTPS" --name "test-listener" --default-tls-
container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-
86eb3cc7fe1a" -- lb1

The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
validity. In addition, make sure it does not require a pass phrase.
Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough
data')] (HTTP 400) (Request-ID: req-8e48d0b5-3f5b-
4d26-9920-72b03343596a)

In fact, I've tested creating a listener after removing the patch [0], thereby
falling back to the legacy mode, and it works; a simplified sketch of that
fallback pattern follows.
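
A minimal, self-contained sketch of that pre-patch fallback pattern (the
function name and the legacy_loader callable are hypothetical, not the exact
Octavia internals):

from cryptography.hazmat.primitives.serialization import pkcs12

def load_certificate(payload, legacy_loader):
    # Try to decode the payload as a PKCS12 bundle first.
    try:
        return pkcs12.load_key_and_certificates(payload, None)
    except ValueError:
        # Pre-patch: decode errors fell through to the legacy barbican
        # certificate manager, which supports secret containers [2].
        # Post-patch: the exception propagates instead, producing the
        # HTTP 400 shown above.
        return legacy_loader(payload)

print(load_certificate(b"not-a-pkcs12-bundle", lambda payload: "legacy container path"))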

[Reproducer]

0) Deploy this bundle or similar (http://paste.ubuntu.com/p/cgbwKNZHbW/)
1) Create self-signed certificate, key and ca 
(http://paste.ubuntu.com/p/xyyxHZGDFR/)
2) Create the 3 certs at barbican

$ openstack secret store --name "test-pk-1" --secret-type "private" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_key.pem)"
$ openstack secret store --name "test-ca-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat ./keys/controller_ca.pem)"
$ openstack secret store --name "test-pub-1" --secret-type "certificate" 
--payload-content-type "text/plain" --payload="$(cat 
./keys/controller_cert.pem)"

3) Create a loadbalancer
$ openstack loadbalancer create --name lb1 --vip-subnet-id private_subnet

4) Create a secrets container

$ openstack secret container create --type='certificate' --name "test-tls-1" --secret="certificate=https://10.5.0.4:9312/v1/secrets/3c9109d9-05e0-45fe-9661-087c50061c00" --secret="private_key=https://10.5.0.4:9312/v1/secrets/378e8f8c-81f5-4b5a-bffd-c0c43a41b4a8" --secret="intermediates=https://10.5.0.4:9312/v1/secrets/07a7564d-b5c6-4433-a0a9-a195e2d54c57"

5) Create the listener

$ openstack loadbalancer listener create --protocol-port 443 --protocol "TERMINATED_HTTPS" --name "test-listener" --default-tls-container="https://10.5.0.4:9312/v1/containers/68154f38-fccf-4990-b88c-86eb3cc7fe1a" -- lb1

This creation will fail with the following exception:

The PKCS12 bundle is unreadable. Please check the PKCS12 bundle
validity. In addition, make sure it does not require a pass phrase.
Error: [('asn1 encoding routines', 'asn1_d2i_read_bio', 'not enough data')]
(HTTP 400) (Request-ID: req-8e48d0b5-3f5b-4d26-9920-72b03343596a)

[Possible solutions]

* Revert this backport on stable releases.
* Fix the current master code to support secret containers, not only plain
PKCS12 certs.


[0] 
https://review.opendev.org/#/c/683954/1/octavia/certificates/manager/barbican.py
[1] 
https://opendev.org/openstack/octavia/commit/a501714a76e04b33dfb24c4ead9956ed4696d1df
[2] 
https://github.com/openstack/octavia/blob/master/octavia/certificates/manager/barbican_legacy.py#L141

** Affects: octavia (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1866085

Title:
  Not possible to create listeners that use barbican secret containers

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/octavia/+bug/1866085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1866085] Re: Not possible to create listeners that use barbican secret containers

2020-03-04 Thread Jorge Niedbalski
Storyboard link: https://storyboard.openstack.org/#!/story/2007371

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1866085

Title:
  Not possible to create listeners that use barbican secret containers

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/octavia/+bug/1866085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2020-02-19 Thread Jorge Niedbalski
@corey.bryant, yes Corey, please go ahead. If this issue resurfaces, I'll
provide a new proposal that fixes the case mentioned before.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1841700

Title:
  instance ingress bandwidth limiting doesn't works in ocata.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1841700/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1822416] Re: resolve: do not hit CNAME or DNAME entry in NODATA cache

2020-01-21 Thread Jorge Niedbalski
** Also affects: systemd (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: systemd (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: systemd (Ubuntu Focal)
   Importance: Undecided
   Status: Confirmed

** Also affects: systemd (Ubuntu Disco)
   Importance: Undecided
   Status: New

** This bug is no longer a duplicate of bug 1818527
   Stub resolver cache is corrupted

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1822416

Title:
  resolve: do not hit CNAME or DNAME entry in NODATA cache

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1822416/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1848286] Re: octavia is not reporting metrics like lbaasv2

2019-12-02 Thread Jorge Niedbalski
** Tags removed: verification-needed verification-needed-disco 
verification-needed-eoan verification-rocky-needed verification-stein-needed 
verification-train-needed
** Tags added: verification-done verification-done-disco verification-done-eoan 
verification-rocky-done verification-stein-done verification-train-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1848286

Title:
  octavia is not reporting metrics like lbaasv2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1848286/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1848286] Re: octavia is not reporting metrics like lbaasv2

2019-12-02 Thread Jorge Niedbalski
I've deployed this bundle[0] with proposed and the following configuration:

1) juju config ceilometer enable-all-pollsters=true
2) sudo systemctl restart ceilometer*
3) sudo ceilometer-upgrade

train
-
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
0

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
72

stein
-
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
0

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
72


rocky
-
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
0
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
72


eoan
-
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
0
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
72


disco
-
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
0
ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
72

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1848286

Title:
  octavia is not reporting metrics like lbaasv2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1848286/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-11-29 Thread Jorge Niedbalski
I have tested the different upgrade scenarios outlined in the spreadsheet
linked to this case.

I found an error with existing QoS policies when upgrading from Pike to the
patched Ocata, which might require a further change to the patchset.

I will mark this verification as failed and re-submit a new patch with the
required bits.

The results for all other combinations are shared next.

Bundle deployed: http://paste.ubuntu.com/p/tMNX577zCk/

Steps executed:

juju config neutron-api enable-qos=true


ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ openstack network qos 
policy create bw-limit
+-+--+
| Field   | Value|
+-+--+
| description |  |
| id  | 7e6d0652-6cd4-4d5c-b58e-b7664c1c4587 |
| is_default  | None |
| name| bw-limit |
| project_id  | 2c16c3a423444a43a39e11fcc768ad22 |
| rules   | []   |
| shared  | False|
+-+--+

openstack network qos rule create --type bandwidth-limit --max-kbps 300 
--max-burst-kbits 300 --ingress bw-limit
++--+
| Field  | Value|
++--+
| direction  | ingress  |
| id | eda4481f-61d0-4f52-91f7-fe979f776705 |
| max_burst_kbps | 300  |
| max_kbps   | 300  |
| name   | None |
| project_id |  |
++--+

openstack network qos rule list bw-limit
+--------------------------------------+--------------------------------------+-----------------+----------+-----------------+----------+-----------+-----------+
| ID                                   | QoS Policy ID                        | Type            | Max Kbps | Max Burst Kbits | Min Kbps | DSCP mark | Direction |
+--------------------------------------+--------------------------------------+-----------------+----------+-----------------+----------+-----------+-----------+
| eda4481f-61d0-4f52-91f7-fe979f776705 | 7e6d0652-6cd4-4d5c-b58e-b7664c1c4587 | bandwidth_limit |      300 |             300 |          |           | ingress   |
+--------------------------------------+--------------------------------------+-----------------+----------+-----------------+----------+-----------+-----------+

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ openstack port list -f 
value | grep 10.5.150.5
7c010ca8-1e96-4419-a4e9-4c56da05c806  fa:16:3e:a1:38:50 
ip_address='10.5.150.5', subnet_id='5bb6de3b-6c72-4ef5-a0c2-821dfafaa822' N/A

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ openstack port list -f 
value | grep 10.5.150.0
8c44909b-a4cf-4c7b-903f-f9f31f4a8045  fa:16:3e:4d:53:4e 
ip_address='10.5.150.0', subnet_id='5bb6de3b-6c72-4ef5-a0c2-821dfafaa822' N/A

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ openstack port set 7c010ca8-1e96-4419-a4e9-4c56da05c806 --qos-policy 7e6d0652-6cd4-4d5c-b58e-b7664c1c4587

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ openstack port set 8c44909b-a4cf-4c7b-903f-f9f31f4a8045 --qos-policy 7e6d0652-6cd4-4d5c-b58e-b7664c1c4587

ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ for port in $(openstack port list -f value -c ID); do openstack port show $port | grep qos; done
| qos_policy_id | 7e6d0652-6cd4-4d5c-b58e-b7664c1c4587 |
| qos_policy_id | 7e6d0652-6cd4-4d5c-b58e-b7664c1c4587 |


Tempest run:

http://paste.ubuntu.com/p/RPgrvvz25J/



Pre-patched version

ubuntu@niedbalski-bastion:~/tempest$ openstack network qos policy show 
30c09b4b-0c51-4679-99b9-23b62ba247d7
+-+--+
| Field   | Value|
+-+--+
| description |  |
| id  | 30c09b4b-0c51-4679-99b9-23b62ba247d7 |
| is_default  | None |
| name| bw-limit |
| project_id  | 2c16c3a423444a43a39e11fcc768ad22 |
| rules   | []   |
| shared  | False|
+-+--+


ubuntu@niedbalski-bastion:~/tempest$ openstack network qos rule create --type bandwidth-limit --max-kbps 300 --max-burst-kbits 300 30c09b4b-0c51-4679-99b9-23b62ba247d7
++--+
| Field  | Value 

[Bug 1848286] Re: octavia is not reporting metrics like lbaasv2

2019-11-27 Thread Jorge Niedbalski
Patches have landed in stable branches. Thanks

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1848286

Title:
  octavia is not reporting metrics like lbaasv2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1848286/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1848286] Re: octavia is not reporting metrics like lbaasv2

2019-11-27 Thread Jorge Niedbalski
** Description changed:

  [Environment]
  
  Stein OpenStack
  Ubuntu Bionic
  
  [Description]
  
  From my understanding the current Octavia API should be backwards
  compatible with LBaaS v2, therefore, the current LBaaS v2 pollster [1]
  should suffice to gather the following meters:
  
  meters:
  
  - network.services.lb.outgoing.bytes
  - network.services.lb.incoming.bytes
  - network.services.lb.pool
  - network.services.lb.listener
  - network.services.lb.member
  - network.services.lb.health_monitor
  - network.services.lb.loadbalancer
  - network.services.lb.total.connections
  - network.services.lb.active.connections
  
  However, the following warning is noticed when restarting the ceilometer
  services.
  
  2999:2019-11-07 15:05:41.665 22487 WARNING ceilometer.publisher.gnocchi [-] 
metric network.services.lb.loadbalancer is not handled by Gnocchi
  3467:2019-11-07 15:06:21.830 3916 WARNING ceilometer.publisher.gnocchi [-] 
metric network.services.lb.loadbalancer is not handled by Gnocchi
  
  In fact, checking the gnocchi metric list
  (https://pastebin.canonical.com/p/zyMTFv8vww/) shows that the
  network.services.lb metrics/resources haven't been created.
  
  The reason is that for a gnocchi resource to exist, ceilometer has to create
  the resource type explicitly.
  This action is driven by the ceilometer gnocchi client [2], which uses the file
  /usr/lib/python3/dist-packages/ceilometer/publisher/data/gnocchi_resources.yaml
  for the resource definitions; that file doesn't include the lbaasv2 directives.
  
+ [Test case]
+ 
+ 1) Deploy a gnocchi backed ceilometer unit with octavia deployed.
+ 2) check that lb meters aren't being gathered
+ 
+ $ gnocchi metric list|grep loadbalancer | wc -l
+ 0
+ 3) install the patched version
+ 4) configure all pollsters
+ $  juju config ceilometer enable-all-pollsters=true
+ 5) sudo systemctl restart ceilometer*
+ 6) sudo ceilometer-upgrade
+ 7) Check that lb metrics are being gathered
+ 
+ ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep loadbalancer | wc -l
+ 72
+ 
+ ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep network.services
+ | 3947b249-b1ed-489d-9f51-b7549b5b78ce | ceilometer-low | 
network.services.lb.loadbalancer | loadbalancer | 
f97838f1-215d-4fcb-bd78-61e1f9859507 |
+ | 5527c5b6-d17e-4218-868b-c8d5326575a6 | ceilometer-low | 
network.services.lb.active.connections | connection | 
f97838f1-215d-4fcb-bd78-61e1f9859507 |
+ | 57d63bcb-48e1-47fb-825c-1109ad7d966d | ceilometer-low | 
network.services.lb.loadbalancer | loadbalancer | 
82d129f3-64a5-40d6-a72b-7ae704046176 |
+ | 5d93c751-0645-4b7b-a315-26b0e96dda3d | ceilometer-low | 
network.services.lb.outgoing.bytes | B | 82d129f3-64a5-40d6-a72b-7ae704046176 |
+ | 71ace776-9216-40a4-848c-42e9cc6f93a0 | ceilometer-low | 
network.services.lb.listener | listener | e7a4dd96-73a1-4aca-946d-8e87ba8f7a37 |
+ | 738b1dd0-0930-4bfa-b358-1a27fb63121e | ceilometer-low | 
network.services.lb.total.connections | connection | 
f97838f1-215d-4fcb-bd78-61e1f9859507 |
+ | 7b4726f6-9490-4d6c-a586-7e12b738b001 | ceilometer-low | 
network.services.lb.health_monitor | health_monitor | 
3ebc3ac0-8e1c-47d9-8789-bb51e6d29eba |
+ | 8689e872-c6d0-450f-bb73-d19197c62418 | ceilometer-low | 
network.services.lb.incoming.bytes | B | 82d129f3-64a5-40d6-a72b-7ae704046176 |
+ | ae72c5f3-4d8b-400b-ba86-2aaf068c20fc | ceilometer-low | 
network.services.lb.incoming.bytes | B | f97838f1-215d-4fcb-bd78-61e1f9859507 |
+ | b59ef5ae-9d71-4884-bb7c-66e0d46f5ecb | ceilometer-low | 
network.services.lb.outgoing.bytes | B | f97838f1-215d-4fcb-bd78-61e1f9859507 |
+ | bdf259a1-7471-4c23-ab2b-cac11f522f45 | ceilometer-low | 
network.services.lb.active.connections | connection | 
82d129f3-64a5-40d6-a72b-7ae704046176 |
+ | e5d2e433-9482-4e2a-9485-e73d7a13271a | ceilometer-low | 
network.services.lb.listener | listener | c87e0f88-26bf-496f-a882-71b1c67956b4 |
+ | eb634e4e-8e94-4ee4-8fac-4f4db12d56aa | ceilometer-low | 
network.services.lb.total.connections | connection | 
82d129f3-64a5-40d6-a72b-7ae7040
+ 
+  
+ [Regression Potential]
+ 
+ There is no regression potential identified in this patch, as it uses
+ the new metrics upgrade path and it doesn't
+ enable any new metrics if the lbaasv2 pollster isn't configured.
+ 
+ 
  [Proposed solution]
  
  Enable the resources in
  /usr/lib/python3/dist-packages/ceilometer/publisher/data/gnocchi_resources.yaml
  and modify the gnocchi client provider. With that, I am able to gather the
  metrics for the octavia load balancers.
  
  ubuntu@niedbalski-bastion:~/stsstack-bundles/openstack$ gnocchi metric 
list|grep network.services
  | 3947b249-b1ed-489d-9f51-b7549b5b78ce | ceilometer-low | 
network.services.lb.loadbalancer | loadbalancer | 
f97838f1-215d-4fcb-bd78-61e1f9859507 |
  | 5527c5b6-d17e-4218-868b-c8d5326575a6 | ceilometer-low | 
network.services.lb.active.connections | connection | 
f97838f1-215d-4fcb-bd78-61e1f9859507 |
  | 

[Bug 1848286] Re: octavia is not reporting metrics like lbaasv2

2019-11-21 Thread Jorge Niedbalski
Stein debdiff

** Patch added: "lp1848286-stein.debdiff"
   
https://bugs.launchpad.net/charm-octavia/+bug/1848286/+attachment/5306922/+files/lp1848286-stein.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1848286

Title:
  octavia is not reporting metrics like lbaasv2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1848286/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-09-23 Thread Jorge Niedbalski
Test results will be collected through
https://docs.google.com/spreadsheets/d/1ajozctmyHAKEBlG2Aet-Zdv7v9uFcRlIRiW5KeOk778/edit#gid=0;
please feel free to add any missing test case to that document.

** Description changed:

  [Environment]
  
  Xenial-Ocata deployment
  
  [Description]
  
  The instance ingress bandwidth limit implementation was targeted for
  Ocata [0], but the full ingress/egress implementation was done during
  the Pike [1] cycle.

  However, it isn't reported or made explicit that the ingress direction isn't
  supported in Ocata, which causes the following exception when --ingress
  is specified.
  
+ It would be desirable for this feature to be available on Ocata, to be able
+ to set ingress/egress bandwidth limits on the ports.
+ 
+ [Testing]
+ 
+ Without these patches, trying to set a ingress bandwidth-limit rule
+ the following exception will be raised.
+ 
  $ openstack network qos rule create --type bandwidth-limit --max-kbps 300 --ingress bw-limiter
  Failed to create Network QoS rule: BadRequestException: 400: Client Error for url: https://openstack:9696/v2.0/qos/policies//bandwidth_limit_rules, Unrecognized attribute(s) 'direction'
  
- It would be desirable for this feature to be available on Ocata for being able to
- set ingress/egress bandwidth limits on the ports.
+ A single policy set (without the --ingress parameter), as supported in
+ Ocata, will just create a limiter on the egress side.
+ 
+ 
+ 1) Check the policy list
+ 
+ $ openstack network qos policy list
+ 
+--+++-+--+
+ | ID | Name | Shared | Default | Project |
+ 
+--+++-+--+
+ | 2c9c85e2-4b65-4146-b7bf-47895379c938 | bw-limiter | False | None | 
c45b1c0a681d4d9788f911e29166056d |
+ 
+--+++-+--+
+ 
+ 2) Check that the qoes rule is set to 300 kbps.
+ 
+ $ openstack network qos rule list 2c9c85e2-4b65-4146-b7bf-47895379c938
+ 
+ | 01eb228d-5803-4095-9e8e-f13d4312b2ef | 2c9c85e2-4b65-4146-b7bf-
+ 47895379c938 | bandwidth_limit | 300 | 300 | | | |
+ 
+ 
+ 3) Set the Qos policy on any port.
+ 
+ $ openstack port set 9a74b3c8-9ed8-4670-ad1f-932febfcf059 --qos-policy
+ 2c9c85e2-4b65-4146-b7bf-47895379c938
+ 
+ $ openstack port show 9a74b3c8-9ed8-4670-ad1f-932febfcf059 | grep qos
+ | qos_policy_id | 2c9c85e2-4b65-4146-b7bf-47895379c938 |
+ 
+ 
+ 4) Check that the egress traffic rules have been applied
+ 
+ # iperf3 -c 192.168.21.9 -t 10
+ Connecting to host 192.168.21.9, port 5201
+ [ 4] local 192.168.21.3 port 34528 connected to 192.168.21.9 port 5201
+ [ ID] Interval Transfer Bandwidth Retr Cwnd
+ [ 4] 0.00-1.00 sec 121 KBytes 988 Kbits/sec 23 2.44 KBytes
+ [ 4] 7.00-8.00 sec 40.2 KBytes 330 Kbits/sec 14 3.66 KBytes
+ [ 4] 8.00-9.00 sec 36.6 KBytes 299 Kbits/sec 15 2.44 KBytes
+ [ 4] 9.00-10.00 sec 39.0 KBytes 320 Kbits/sec 18 3.66 KBytes
+ - - - - - - - - - - - - - - - - - - - - - - - - -
+ [ ID] Interval Transfer Bandwidth Retr
+ [ 4] 0.00-10.00 sec 435 KBytes 356 Kbits/sec 159 sender
+ [ 4] 0.00-10.00 sec 384 KBytes 314 Kbits/sec receiver
+ 
+ iperf Done.
+ 
+ 5) Check that no ingress traffic limit has been applied.
+ 
+ # iperf3 -c 192.168.21.9 -R -t 10
+ Connecting to host 192.168.21.9, port 5201
+ Reverse mode, remote host 192.168.21.9 is sending
+ [ 4] local 192.168.21.3 port 34524 connected to 192.168.21.9 port 5201
+ [ ID] Interval Transfer Bandwidth
+ [ 4] 0.00-1.00 sec 38.1 MBytes 319 Mbits/sec
+ [ 4] 8.00-9.00 sec 74.6 MBytes 626 Mbits/sec
+ [ 4] 9.00-10.00 sec 73.2 MBytes 614 Mbits/sec
+ - - - - - - - - - - - - - - - - - - - - - - - - -
+ [ ID] Interval Transfer Bandwidth Retr
+ [ 4] 0.00-10.00 sec 1.07 GBytes 918 Mbits/sec 1045 sender
+ [ 4] 0.00-10.00 sec 1.07 GBytes 916 Mbits/sec receiver
+ 
+ 
+ --->
+ 
+ 6) With the patches applied from the PPA or proposed, run the migration
+ steps on the neutron-api node, repeat the previous steps, but make sure
+ to specify the traffic direction with --ingress as follows:
+ 
+ $ openstack network qos rule create --type bandwidth-limit --max-kbps 300 
--ingress testing-policy
+ ++--+
+ | Field | Value |
+ ++--+
+ | direction | ingress |
+ | id | 6d01cefa-0042-40cd-ae74-bcb723ca7ca4 |
+ | max_burst_kbps | 0 |
+ | max_kbps | 300 |
+ | name | None |
+ | project_id | |
+ ++--+
+ 
+ 7) Set the policy into any server port.
+ 
+ $ openstack port set 50b8f714-3ee4-4260-8359-820420471bdb --qos-policy
+ fac2be5e-64e0-4308-b477-0f8c0096c0b8
+ 
+ 8) Check that the policy has been applied
+ 
+ $ openstack port show 50b8f714-3ee4-4260-8359-820420471bdb | grep qos
+ | qos_policy_id | 

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-09-09 Thread Jorge Niedbalski
** Patch added: "lp1841700-xenial-ocata.debdiff"
   
https://bugs.launchpad.net/neutron/+bug/1841700/+attachment/5287521/+files/lp1841700-xenial-ocata.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1841700

Title:
  instance ingress bandwidth limiting doesn't works in ocata.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1841700/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-09-09 Thread Jorge Niedbalski
Waiting on OE approval.

* Testing cases comments have been updated, the xenial-ocata debdiff
patch is on the previous comment.


** Patch removed: "lp1841700-xenial-ocata.debdiff"
   
https://bugs.launchpad.net/neutron/+bug/1841700/+attachment/5287520/+files/lp1841700-xenial-ocata.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1841700

Title:
  instance ingress bandwidth limiting doesn't works in ocata.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1841700/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-09-09 Thread Jorge Niedbalski
** Patch added: "lp1841700-xenial-ocata.debdiff"
   
https://bugs.launchpad.net/neutron/+bug/1841700/+attachment/5287520/+files/lp1841700-xenial-ocata.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1841700

Title:
  instance ingress bandwidth limiting doesn't works in ocata.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1841700/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-09-09 Thread Jorge Niedbalski
Hello,

I've tested the PPA at
https://launchpad.net/~niedbalski/+archive/ubuntu/hf-00237904/ with a
xenial-ocata deployed cloud.

Please note that it's a requirement to run the migration steps up to
heads.
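
For example, something like the following (the config file paths are
illustrative and depend on the deployment):

$ neutron-db-manage --config-file /etc/neutron/neutron.conf \
    --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade heads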

ubuntu@niedbalski-bastion:~/octavia/openstack$ juju config neutron-api
enable-qos=True

| 50b8f714-3ee4-4260-8359-820420471bdb |  | fa:16:3e:88:c1:a0 | ip_address='10.5.150.6', subnet_id='7daca73f-31ab-4401-bc6e-e74dd38b7fc1' | N/A |


ubuntu@niedbalski-bastion:~/octavia/openstack$ openstack network qos policy 
create testing-policy --share 
+-+--+
| Field   | Value|
+-+--+
| description |  |
| id  | fac2be5e-64e0-4308-b477-0f8c0096c0b8 |
| is_default  | None |
| name| testing-policy   |
| project_id  | d215199a727a4384ad7d43825e86ff69 |
| rules   | []   |
| shared  | True |
| tags| []   |
+-+--+

ubuntu@niedbalski-bastion:~/octavia/openstack$ openstack network qos rule create --type bandwidth-limit --max-kbps 300 --ingress testing-policy
Failed to create Network QoS rule: BadRequestException: 400: Client Error for url: http://10.5.0.66:9696/v2.0/qos/policies/fac2be5e-64e0-4308-b477-0f8c0096c0b8/bandwidth_limit_rules, {"NeutronError": {"message": "Unrecognized attribute(s) 'direction'", "type": "HTTPBadRequest", "detail": ""}}


> With the patch applied


neutron-api

root@juju-cd6736-1841700-4:/home/ubuntu# systemctl status neutron*
● neutron-server.service - OpenStack Neutron Server
   Loaded: loaded (/lib/systemd/system/neutron-server.service; enabled; vendor 
preset: enabled)
   Active: active (running) since Mon 2019-09-09 15:47:00 UTC; 1min 39s ago
  Process: 13597 ExecStartPre=/bin/chown neutron:adm /var/log/neutron 
(code=exited, status=0/SUCCESS)

ubuntu@juju-cd6736-1841700-4:~$ sudo su
root@juju-cd6736-1841700-4:/home/ubuntu# dpkg -l |grep neutron
ii  neutron-common   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - common
ii  neutron-plugin-ml2   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - ML2 plugin
ii  neutron-server   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - server
ii  python-neutron   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - Python library

neutron-gateway

root@juju-cd6736-1841700-5:/home/ubuntu# dpkg -l | grep neutron
ii  neutron-common   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - common
ii  neutron-dhcp-agent   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - DHCP agent
ii  neutron-l3-agent 
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - l3 agent
ii  neutron-lbaas-common 2:10.0.1-0ubuntu1~cloud0   
   all  Neutron is a virtual network service for Openstack - 
common
ii  neutron-lbaasv2-agent2:10.0.1-0ubuntu1~cloud0   
   all  Neutron is a virtual network service for Openstack - 
LBaaSv2 agent
ii  neutron-metadata-agent   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - metadata agent
ii  neutron-metering-agent   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - metering agent
ii  neutron-openvswitch-agent
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - Open vSwitch plugin agent
ii  python-neutron   
2:10.0.7-0ubuntu1~cloud1ubuntu1+hf1560961v20190826.13 all  Neutron is a 
virtual network service for Openstack - Python library

root@juju-cd6736-1841700-5:/home/ubuntu# systemctl status 
neutron-openvswitch-agent
● neutron-openvswitch-agent.service - Openstack Neutron Open vSwitch Plugin 
Agent
   Loaded: loaded (/lib/systemd/system/neutron-openvswitch-agent.service; 
enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-09-09 15:52:12 UTC; 1min 58s ago


compute-node

root@juju-cd6736-1841700-8:/home/ubuntu# systemctl list-unit-files neutron*
UNIT FILE 

[Bug 1831181] Re: [aodh.notifier] Not setting user_domain_id raises keystone error: The resource could not be found.

2019-09-02 Thread Jorge Niedbalski
Deployed on q/r/b; the problem is no longer reproduced
(https://pastebin.canonical.com/p/hPV8SVv2Th/). Marking the verification
as completed.

** Tags removed: sts-sru-needed verification-needed verification-needed-bionic 
verification-queens-needed verification-rocky-needed
** Tags added: sts-sru-done verification-done verification-done-bionic 
verification-queens-done verification-rocky-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1831181

Title:
  [aodh.notifier] Not setting user_domain_id raises keystone error: The
  resource could not be found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/aodh/+bug/1831181/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1841700] Re: instance ingress bandwidth limiting doesn't works in ocata.

2019-09-02 Thread Jorge Niedbalski
** Also affects: neutron (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: neutron (Ubuntu)
   Status: New => Fix Released

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Changed in: neutron
   Status: Invalid => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1841700

Title:
  instance ingress bandwidth limiting doesn't works in ocata.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1841700/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
