[Yahoo-eng-team] [Bug 2042089] [NEW] neutron : going to shared network is working, going back not
Public bug reported:

We have admin-generated provider networks. Projects are allowed to create ports and instances on these networks. When we set the "shared" property on such a network, we are no longer allowed to unset it; we get the error:

"Unable to reconfigure sharing settings for network net.vlan10.provider. Multiple tenants are using it."

Only once all ports and instances created by non-admin projects are removed can we unset the "shared" property again. So we are allowed to set a parameter that afterwards can no longer be unset. We now have a network that is visible to everyone, which is not the situation we want. Removing the corresponding RBAC policy is not allowed either.

This is an OpenStack-Ansible installation running the Yoga release.

** Affects: neutron Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/2042089
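A minimal reproduction sketch with the openstack CLI (the network name is the one from the report above; what the RBAC listing shows depends on the deployment):

```
# Setting the flag succeeds:
openstack network set --share net.vlan10.provider

# Unsetting it fails while other projects still have ports on the network:
openstack network set --no-share net.vlan10.provider

# The implicit access_as_shared RBAC entry can be listed, but deleting it
# is rejected for the same reason:
openstack network rbac list --type network | grep access_as_shared
```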
[Yahoo-eng-team] [Bug 2012104] [NEW] Neutron picking incorrect ovn records
Public bug reported:

For one of our compute machines I'm seeing two network agents that appear unhealthy:

```
$ os network agent list | fgrep "register deleted"
| compute1                             | OVN Controller agent | ("Chassis" register deleted) | | XXX | UP | ovn-controller             |
| c085d57a-3a2b-4f97-8250-23d3f914b078 | OVN Metadata agent   | ("Chassis" register deleted) | | XXX | UP | neutron-ovn-metadata-agent |
```

The ("Chassis" register deleted) message appears to come from the fix for this: https://bugs.launchpad.net/neutron/+bug/1951149

Searching for that external ID I can find this private chassis, and its chassis column indeed seems empty:

```
$ sudo ovn-sbctl find chassis-private | grep -A 5 e621e0fb-83d3-4a18-82b3-c842996548ed
_uuid            : e621e0fb-83d3-4a18-82b3-c842996548ed
chassis          : []
external_ids     : {"neutron:liveness_check_at"="2022-06-17T08:43:33.393639+00:00", "neutron:metadata_liveness_check_at"="2022-06-17T02:27:21.309718+00:00", "neutron:ovn-metadata-id"="c085d57a-3a2b-4f97-8250-23d3f914b078", "neutron:ovn-metadata-sb-cfg"="150397"}
name             : compute1
nb_cfg           : 150397
nb_cfg_timestamp : 1657729945956
```

But there's also:

```
$ sudo ovn-sbctl find chassis hostname=compute1.stack
_uuid            : 164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
encaps           : [c442312a-9dfa-4ffe-9db7-afe5f9055962]
external_ids     : {datapath-type=system, iface-types="bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", "neutron:ovn-metadata-sb-cfg"="250161", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
hostname         : compute1.stack
name             : compute1.stack
nb_cfg           : 0
other_config     : {datapath-type=system, iface-types="bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"}
transport_zones  : []
vtep_logical_switches : []

$ sudo ovn-sbctl find chassis-private chassis=164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
_uuid            : cbec617d-19dc-481c-ba99-b4132244773c
chassis          : 164cb56b-1a3c-4401-bc52-6fa5e58d8f2a
external_ids     : {"neutron:ovn-metadata-id"="3328a0c7-081b-58a9-9e91-baf5c8c259cd", "neutron:ovn-metadata-sb-cfg"="312321"}
name             : compute1.stack
nb_cfg           : 312321
nb_cfg_timestamp : 1679042105359
```

This seems to be a correct entry -- shouldn't neutron pick up this entry rather than the one with "chassis : []"?

Software versions:

```
ii  neutron-server  2:20.2.0-0ubuntu1~cloud0  all    Neutron is a virtual network service for Openstack - server
ii  ovn-central     22.03.0-0ubuntu1~cloud0   amd64  OVN central components

Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:        20.04
Codename:       focal
```

Please let me know if I can provide more diagnostics.
** Affects: neutron Importance: Undecided Status: New
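If the empty record really is a leftover, a manual cleanup sketch is possible, since ovn-sbctl supports generic find/destroy database commands (the UUID is the one from this report; verify the row before destroying anything):

```
# List Chassis_Private rows that no longer reference a Chassis:
sudo ovn-sbctl find chassis-private chassis='[]'

# Remove the stale row so the agents resolve to the live record:
sudo ovn-sbctl destroy chassis-private e621e0fb-83d3-4a18-82b3-c842996548ed
```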
[Yahoo-eng-team] [Bug 2011298] [NEW] network config can't override kernel command line arguments
Public bug reported:

This isn't really a bug; it's more of a missing feature. The short description: if you specify the network configuration via a kernel command line parameter (`ip=blahblahblah`), there doesn't appear to be a way to override it by defining a different network setup in cloud-init. The kernel command line will always take precedence, no matter what you define in `meta-data` or `user-data`. This is the conclusion I drew both from the documentation and from reading the source on GitHub: the kernel command line has absolute priority and can't be overridden.

I found a workaround: writing the config file (`/etc/netplan/50-cloud-init.yml` in my case) directly through "write_files" and then executing "netplan apply" in "runcmd". This does what I want, but it doesn't look nice.

Perhaps you're wondering why I need this. I boot Ubuntu MAAS images over the network. In some data centers DHCP isn't available, and at the same time I need to assign more than one IP address to the system and/or configure bonding. So the bootloader constructs the kernel command line and the machine boots, but it only has one ethernet interface configured, with one IP address. I can't remove the kernel command line argument, as then cloud-init doesn't know how to configure the network in order to download the `user-data` / `meta-data`.

Please consider providing an alternative approach to the workaround explained above.

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/2011298
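What the described workaround could look like as user-data (a minimal sketch; the interface name and addresses below are placeholders, and the netplan file path is the one mentioned in the report):

```
#cloud-config
write_files:
  - path: /etc/netplan/50-cloud-init.yml
    permissions: '0600'
    content: |
      network:
        version: 2
        ethernets:
          eno1:
            addresses:
              - 192.0.2.10/24
              - 192.0.2.11/24
            routes:
              - to: 0.0.0.0/0
                via: 192.0.2.1
runcmd:
  - netplan apply
```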
[Yahoo-eng-team] [Bug 1983306] [NEW] schema validation not in line with specification
Public bug reported:

Related to #1983303. My user-data begins with #include, as it's not "Cloud Config Data" but an "Include File" as described in the official documentation. However, this causes the validator `cloud-init schema --system` to complain:

```
Error: Cloud config schema errors: format-l1.c1: File None needs to begin with "#cloud-config"
```

Fine, I thought, I'll just manually add "#cloud-config" at the top and re-test:

```
Error: Cloud-config is not a YAML dict.
```

Well, it's not a YAML dict, because it's not cloud config data but an include file, which isn't in YAML format. See the specification: https://cloudinit.readthedocs.io/en/latest/topics/format.html

Also look at the implementation in `user_data.py`, function `_do_include`: as you can see, this file isn't processed as YAML but parsed line by line. So the specification and implementation agree, but the schema validator doesn't, and thinks it should process the file as YAML.

This wouldn't be a practical problem for me, but due to #1983303 I get mangled logs and can't work around it.

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1983306
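For reference, an include file in the sense of the format documentation looks like this (the URLs are placeholders): one URL per line, fetched and processed in order, with no YAML involved.

```
#include
https://example.com/base-cloud-config.yaml
https://example.com/extra-user-data
```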
[Yahoo-eng-team] [Bug 1983303] [NEW] logs mangled
Public bug reported:

The fix for #1978422 mangles logs, as `sed` modifies them while they are still open (at least that's my deduction). This applies to both cloud-init.log and cloud-init-output.log.

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1983303
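A minimal illustration of the suspected mechanism (a sketch, not taken from the cloud-init code): `sed -i` writes a new file and renames it over the original, so a process still holding the old file descriptor keeps appending to the unlinked inode and those writes never show up in the visible file.

```
logfile=/tmp/demo.log
( for i in $(seq 1 10); do echo "line $i"; sleep 1; done ) >> "$logfile" &
sleep 3
sed -i 's/line/LINE/' "$logfile"   # replaces the file behind the writer's back
wait
cat "$logfile"                      # lines written after the rename are missing
```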
[Yahoo-eng-team] [Bug 1953139] [NEW] On IPv6 overlay networks linuxbridge vxlans are created always on loopback device
Public bug reported:

Neutron looks up vxlan parent devices by the IFA_LABEL attribute returned from pyroute2. If I set up an IPv4 overlay network, this works without issues. But when I set up an IPv6 overlay, the device with index 0 (usually "lo") is always returned by get_devices_with_ip, because the device structure returned for IPv6 addresses doesn't contain IFA_LABEL.

If IFA_LABEL is not found, neutron's ip_lib.py tries to find the name of the 'owner' of the address by device index (the index is already known here):

```
for ip_address in ip_addresses:
    index = ip_address['index']
    name = get_attr(ip_address, 'IFA_LABEL') or devices.get(index)
    if not name:
        device = get_devices_info(namespace, index=index)
        if not device:
            continue
        name = device[0]['name']
```

However, privileged/agent/linux/ip_lib.py get_link_devices() doesn't use the index kwarg correctly: it returns all devices in the system, so the code above always returns the first device in the list.

My solution for now is to change get_link_devices() to pass arguments to ip.get_links() correctly:

```
--- ip_lib.py.orig      2021-12-03 10:28:40.312266929 +0000
+++ ip_lib.py   2021-12-03 10:26:33.337486559 +0000
@@ -564,7 +564,10 @@
     """
     try:
         with get_iproute(namespace) as ip:
-            return make_serializable(ip.get_links(**kwargs))
+            if "index" in kwargs:
+                return make_serializable(ip.get_links(kwargs['index']))
+            else:
+                return make_serializable(ip.get_links(**kwargs))
     except OSError as e:
         if e.errno == errno.ENOENT:
             raise NetworkNamespaceNotFound(netns_name=namespace)
```

** Affects: neutron Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1953139
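A quick way to confirm the underlying pyroute2 behaviour outside neutron (a standalone sketch; requires the pyroute2 package):

```python
from pyroute2 import IPRoute

# IPv4 address records carry IFA_LABEL; IPv6 records generally do not,
# which is what sends neutron down the index-lookup path described above.
with IPRoute() as ipr:
    for addr in ipr.get_addr():
        label = dict(addr.get('attrs') or []).get('IFA_LABEL')
        print(addr['family'], addr['index'], label)
```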
[Yahoo-eng-team] [Bug 1931392] [NEW] sensitive metadata and jinja templates
Public bug reported:

The documentation doesn't explain well how to use sanitized metadata (that will show up in instance-data-sensitive.json rather than instance-data.json) with jinja templates inside user-data. As far as I can see, it doesn't work.

The source code mentions two magic keys that are sanitized: "merged_cfg" and "security-credentials". Defining variables with these names inside meta-data correctly sanitizes them and puts them only into files readable solely by root; however, they then don't work inside user-data as jinja templates (as "{{ds.meta_data.security-credentials}}", for example) -- they are instead replaced by CI_MISSING_JINJA_VAR. Using differently named variables makes the template work, but those aren't sanitized in the logs/runtime files.

In what way, if any, is this supposed to work? Should I instead just chmod the relevant log/runtime files through an entry in bootcmd?

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1931392
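The chmod fallback mentioned above could look roughly like this (a sketch only; the paths assume a default Ubuntu layout and may differ per distro):

```
#cloud-config
bootcmd:
  - [ sh, -c, "chmod 600 /var/log/cloud-init.log /var/log/cloud-init-output.log || true" ]
```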
[Yahoo-eng-team] [Bug 1915216] [NEW] Can't find proper metadata source IP - Interoperability problem with CentOS8/Stream, NetworkManager and Apache CloudStack
Public bug reported:

System environment: Apache CloudStack 4.11; KVM zone. CentOS 8 (and CentOS Stream) use NetworkManager; the cloud-init currently packaged there is 20.3-9.el8. This concerns the code of the CloudStack datasource.

What we observe is that on our CentOS test systems, cloud-init falls through to the default_gateway() method and returns the VR IP address 192.102.146.1. This is wrong, however: that IP does not serve metadata. By comparison, an Ubuntu 20.04 instance deployed on the same network resolves to 192.102.146.5. That IP can be found under /run/NetworkManager:

```
./NetworkManager/resolv.conf:nameserver 192.102.146.5
./NetworkManager/no-stub-resolv.conf:nameserver 192.102.146.5
./NetworkManager/devices/2:next-server=192.102.146.5
```

While the CloudStack datasource follows several approaches to find the IP, the code does not seem to handle the situation where NetworkManager is in use. What happens instead:

- The first approach is to try the data-server DNS entry; this is up to our system, and we will try that out as well.
- Then it looks for the DHCP lease file location "/run/systemd/netif/leases". For some reason, this value is a hardcoded variable in net/dhcp.py: NETWORKD_LEASES_DIR = '/run/systemd/netif/leases'.
- Then it finds the lease file /var/lib/NetworkManager/internal-ea2b5464-7c5e-3243-aa40-7d77805f41ee-ens3.lease, but (as opposed to what we see on Ubuntu) it contains just one line, "ADDRESS=192.102.146.34". Why this file does not also contain the expected entry "SERVER_ADDRESS=192.102.146.5", I am not sure.
- Finally, it falls back to the default gateway method.

Would you say this is a bug, or a missing feature needed for interoperability with NetworkManager (in the sense that cloud-init does not look under /run/NetworkManager/)?

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1915216
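A way to check what NetworkManager itself knows about the DHCP server (a diagnostic sketch; the device name ens3 is taken from the lease file name above):

```
# DHCP options NetworkManager received, including the server identifier:
nmcli -f DHCP4 device show ens3

# Runtime state files cloud-init could, in principle, consult:
grep -r next-server /run/NetworkManager/devices/
```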
[Yahoo-eng-team] [Bug 1906266] Re: After upgrade: "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified"
It appears Ubuntu bionic/focal ship with libvirt < 6.1.0; marking nova as affected in those distributions.

** Also affects: nova (Ubuntu) Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1906266

Status in OpenStack Compute (nova): Won't Fix
Status in nova package in Ubuntu: New
[Yahoo-eng-team] [Bug 1906266] [NEW] After upgrade: "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified"
Public bug reported:

In a site upgraded to Ussuri we are getting faults when starting instances:

```
2020-11-30 13:41:40.586 232871 ERROR oslo_messaging.rpc.server libvirt.libvirtError: Requested operation is not valid: format of backing image '/var/lib/nova/instances/_base/xxx' of image '/var/lib/nova/instances/xxx' was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting)
```

Bug #1864020 reports similar symptoms: due to an upstream change in libvirt v6.0.0+, images need the backing format specified. The fix for Bug #1864020 handles the case for new instances. However, for upgraded instances we're hitting the same problem, as those still don't have the backing format specified.

** Affects: nova Importance: Undecided Status: New

** Summary changed:
- libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified
+ After upgrade: "libvirt.libvirtError: Requested operation is not valid: format of backing image %s of image %s was not specified"

https://bugs.launchpad.net/bugs/1906266
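One hand-applied workaround for existing disks is to stamp the backing format into the qcow2 header (a sketch, not the nova fix itself; the paths are the redacted ones from the error above, and the actual backing format must be verified per image with `qemu-img info` first):

```
# -u rewrites only the header (no data movement); -F declares the backing
# file's format, which libvirt >= 6.0 insists on knowing.
qemu-img rebase -u -F raw -b /var/lib/nova/instances/_base/xxx \
    /var/lib/nova/instances/xxx
```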
[Yahoo-eng-team] [Bug 1905587] [NEW] SGs shared via RBAC for an instance are not handled in Horizon
Public bug reported:

It appears we can't edit security groups for an instance when those groups are shared via RBAC. Sharing SGs via RBAC is done with:

`openstack network rbac create --target-project $project --action access_as_shared --type security_group ...`

We can see the shared SGs in the CLI when doing `openstack server show ...`. However, I can't see those SGs on the instance in Horizon. Also, when adding another SG in Horizon, the one shared via RBAC is removed from the instance.

One thing we noticed that might be related: in https://github.com/openstack/horizon/blob/stable/stein/openstack_dashboard/api/neutron.py#L370 the SGs seem to be fetched only for the project itself.

Versions: OpenStack Stein, python3-django-horizon 3:15.2.0-0ubuntu1~cloud0

** Affects: horizon Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1905587
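A quick way to see the discrepancy from the CLI (a sketch; which groups appear depends on the RBAC entries in place): an unscoped listing includes RBAC-shared groups, while a project-scoped query returns only the project's own groups, which is roughly what the Horizon code path above does.

```
openstack security group list                        # includes SGs shared to us
openstack security group list --project $project     # only SGs owned by the project
```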
[Yahoo-eng-team] [Bug 1901054] [NEW] cloud-init at sles does not require wget
Public bug reported:

This might be a subtle error which is neither a direct package dependency issue nor explicitly an error in the cloud-init software, but it does lead to problems in practice. It also does not seem to be a distro-specific problem.

Scenario:
1. The user of cloud-init is not aware it will use wget to download meta-data.
2. There is no direct dependency on wget in the cloud-init package (checked for Ubuntu and SLES 15).
3. cloud-init will not work properly, as wget is missing.

Error output from /etc/cloud-init.log (observed with the latest patch level for version 19.4 on SLES 15):

```
2020-10-22 14:08:05,095 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: send_my_password', '192.168.1.1:8080'] with allowed return codes [0] (shell=False, capture=True)
FileNotFoundError: [Errno 2] No such file or directory: b'wget': b'wget'
Command: ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: send_my_password', '192.168.1.1:8080']
Reason: [Errno 2] No such file or directory: b'wget': b'wget'
```

Suggestion: possibly cloud-init should exit immediately if wget is not detected on the system, and/or wget should be a dependency in the distro packages?

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1901054
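The suggested fail-fast check could be as small as this (a sketch only; this is not the actual cloud-init code path, and the message text is made up):

```python
import shutil

# Abort early with a clear message instead of a FileNotFoundError mid-run.
if shutil.which("wget") is None:
    raise RuntimeError("cloud-init requires 'wget' for this datasource; "
                       "please install it or add it as a package dependency")
```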
[Yahoo-eng-team] [Bug 1892361] [NEW] SRIOV instance gets type-PF interface, libvirt kvm fails
Public bug reported:

When spawning an SR-IOV enabled instance on a newly deployed host, nova attempts to spawn it with a type-PF pci device. This fails with the stack trace below. After restarting the neutron-sriov-agent and nova-compute services on the compute node and spawning an SR-IOV instance again, a type-VF pci device is selected and instance spawning succeeds.

Stack trace:

```
2020-08-20 08:29:09.558 7624 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 6db8011e6ecd4fd0aaa53c8f89f08b1b __call__ /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:400
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [req-e3e49d07-24c6-4c62-916e-f830f70983a2 ddcfb3640535428798aa3c8545362bd4 dd99e7950a5b46b5b924ccd1720b6257 - 015e4fd7db304665ab5378caa691bb8b 015e4fd7db304665ab5378caa691bb8b] [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Instance failed to spawn: libvirtError: unsupported configuration: Interface type hostdev is currently supported on SR-IOV Virtual Functions only
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Traceback (most recent call last):
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2274, in _build_resources
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     yield resources
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2054, in _build_and_run_instance
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     block_device_info=block_device_info)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3147, in spawn
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     destroy_disks_on_failure=True)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5651, in _create_domain_and_network
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     destroy_disks_on_failure)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     self.force_reraise()
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     six.reraise(self.type_, self.value, self.tb)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5620, in _create_domain_and_network
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     post_xml_callback=post_xml_callback)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line , in _create_domain
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     guest.launch(pause=pause)
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/guest.py", line 144, in launch
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     self._encoded_xml, errors='ignore')
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     self.force_reraise()
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11]     six.reraise(self.type_, self.value, self.tb)
2020-08-20
```
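One thing worth checking on a freshly deployed host (a diagnostic sketch; the interface name ens2f0 is a placeholder) is whether any VFs existed when nova-compute and neutron-sriov-agent first started -- the restart behaviour described above would be consistent with the services having come up before the VFs were created:

```
# Number of VFs currently configured on the PF (0 on a pristine host):
cat /sys/class/net/ens2f0/device/sriov_numvfs

# Create VFs, then restart the agents:
echo 8 > /sys/class/net/ens2f0/device/sriov_numvfs
lspci -nn | grep -i "Virtual Function"
```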
[Yahoo-eng-team] [Bug 1888256] [NEW] Neutron start radvd and mess up the routing table when: ipv6_ra_mode=not set ipv6-address-mode=slaac
Public bug reported:

Hello! I would like to report a possible bug. We are currently using Rocky with Ubuntu 18.04 and custom ansible for deployment. We have a setup where the upstream core Cisco Nexus DC switches answer RAs. This works fine with a network we have had for years (upgraded from kilo). Now we made a new region, with new network nodes etc., and IPv6 does not work as in the old region. In the new region, we have this subnet:

```
[PROD][root(cc1:0)] <~> openstack subnet show Flat1-subnet-v6
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| allocation_pools  | 2001:738:0:527::2-2001:738:0:527:::: |
| cidr              | 2001:738:0:527::/64                  |
| created_at        | 2020-07-01T22:59:53Z                 |
| description       |                                      |
| dns_nameservers   |                                      |
| enable_dhcp       | True                                 |
| gateway_ip        | 2001:738:0:527::1                    |
| host_routes       |                                      |
| id                | a5a9991c-62f3-4f46-b1ef-e293dc0fb781 |
| ip_version        | 6                                    |
| ipv6_address_mode | slaac                                |
| ipv6_ra_mode      | None                                 |
| name              | Flat1-subnet-v6                      |
| network_id        | fa55bfc7-ab42-4d97-987e-645cca7a0601 |
| project_id        | b48a9319a66e45f3b04cc8bb70e3113c     |
| revision_number   | 0                                    |
| segment_id        | None                                 |
| service_types     |                                      |
| subnetpool_id     | None                                 |
| tags              |                                      |
| updated_at        | 2020-07-01T22:59:53Z                 |
+-------------------+--------------------------------------+
```

As you can see, the address mode is SLAAC and the RA mode is None. Checking from the network node, we see the qrouter:

```
[PROD][root(net1:0)] ip netns exec qrouter-4ffa4f55-95aa-4ce1-b4f8-8bbb2f9d53e1 ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
35: ha-5dfb8647-f7: mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:1c:4d:8d brd ff:ff:ff:ff:ff:ff
    inet 169.254.192.3/18 brd 169.254.255.255 scope global ha-5dfb8647-f7
       valid_lft forever preferred_lft forever
    inet 169.254.0.162/24 scope global ha-5dfb8647-f7
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe1c:4d8d/64 scope link
       valid_lft forever preferred_lft forever
36: qr-a6d7ceab-80: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:a1:7e:69 brd ff:ff:ff:f
```

On a guest, the interface and IPv6 routing table look like this:

```
...24 brd 193.224.218.255 scope global dynamic eth0
       valid_lft 86353sec preferred_lft 86353sec
    inet6 2001:738:0:527:f816:3eff:fe71:ca8d/64 scope global dynamic mngtmpaddr
       valid_lft 2591994sec preferred_lft 604794sec
    inet6 fe80::f816:3eff:fe71:ca8d/64 scope link
       valid_lft forever preferred_lft forever

debian@test:~$ ip -6 route
::1 dev lo proto kernel metric 256 pref medium
2001:738:0:527::/64 dev eth0 proto kernel metric 256 expires 2591990sec pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
default via fe80::f816:3eff:fea1:7e69 dev eth0 proto ra metric 1024 expires 251sec hoplimit 64 pref medium
default via fe80::5:73ff:fea0:2cf dev eth0 proto ra metric 1024 expires 1790sec hoplimit 64 pref medium
```

As you can see, I've got two default routes, and the upper one is not meant to be there. Could you point out something I missed, or is there some kind of bug which causes this?

Thanks: Peter ERDOSI (Fazy)

** Affects: neutron Importance: Undecided Status: New
** Tags: ipv6 ra-mode

https://bugs.launchpad.net/bugs/1888256
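One thing worth checking on the network node (a diagnostic sketch; the router ID is the one from the output above, and the path assumes the default neutron state directory) is whether the L3 agent spawned radvd for this router at all -- with ipv6_ra_mode unset, only the external switches should be sending RAs:

```
ps aux | grep 'radvd.*4ffa4f55-95aa-4ce1-b4f8-8bbb2f9d53e1'
cat /var/lib/neutron/ra/4ffa4f55-95aa-4ce1-b4f8-8bbb2f9d53e1.radvd.conf
```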
[Yahoo-eng-team] [Bug 1884571] [NEW] Horizon Attach interface exception, if subnet contains letters with ACUTE
Public bug reported:

Hi! We ran into a small "problem" (using Rocky). A subnet in a project was named with the letter "é", and Horizon threw an exception when we tried to attach a new interface to an instance. Spinning up a new instance and adding it to this network works just fine. After the subnet was renamed, "Attach interface" started working again. Not a big deal, but maybe someone can reproduce it and it can be fixed.

Thanks, Peter

The error message:

```
[Mon Jun 22 17:51:28.619521 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116] Internal Server Error: /project/instances/b88b44f1-2b3d-4539-b21c-b45029f959a1/attach_interface
[Mon Jun 22 17:51:28.619638 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116] Traceback (most recent call last):
[Mon Jun 22 17:51:28.619665 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/django/core/handlers/exception.py", line 41, in inner
[Mon Jun 22 17:51:28.619685 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     response = get_response(request)
[Mon Jun 22 17:51:28.619706 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 187, in _get_response
[Mon Jun 22 17:51:28.619724 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     response = self.process_exception_by_middleware(e, request)
[Mon Jun 22 17:51:28.619743 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 185, in _get_response
[Mon Jun 22 17:51:28.619761 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Mon Jun 22 17:51:28.619780 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 36, in dec
[Mon Jun 22 17:51:28.619798 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return view_func(request, *args, **kwargs)
[Mon Jun 22 17:51:28.619816 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 52, in dec
[Mon Jun 22 17:51:28.619834 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return view_func(request, *args, **kwargs)
[Mon Jun 22 17:51:28.619853 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 36, in dec
[Mon Jun 22 17:51:28.619871 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return view_func(request, *args, **kwargs)
[Mon Jun 22 17:51:28.619931 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 113, in dec
[Mon Jun 22 17:51:28.619955 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return view_func(request, *args, **kwargs)
[Mon Jun 22 17:51:28.619973 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 84, in dec
[Mon Jun 22 17:51:28.619992 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return view_func(request, *args, **kwargs)
[Mon Jun 22 17:51:28.620010 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/django/views/generic/base.py", line 68, in view
[Mon Jun 22 17:51:28.620029 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return self.dispatch(request, *args, **kwargs)
[Mon Jun 22 17:51:28.620061 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/django/views/generic/base.py", line 88, in dispatch
[Mon Jun 22 17:51:28.620081 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return handler(request, *args, **kwargs)
[Mon Jun 22 17:51:28.620123 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]   File "/usr/lib/python2.7/dist-packages/django/views/generic/edit.py", line 174, in get
[Mon Jun 22 17:51:28.620152 2020] [wsgi:error] [pid 1629918:tid 140338093852416] [remote 192.168.51.253:47116]     return self.render_to_response(self.get_context_data())
```
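A reproduction sketch from the CLI (the accented name below is just an example; any subnet name containing a letter like "é" should do):

```
openstack subnet set --name 'subnet-é' <subnet-id>
# then open the instance's "Attach Interface" dialog in Horizon
```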
[Yahoo-eng-team] [Bug 1878481] Re: server add volume fails
** Changed in: nova Status: Invalid => New

https://bugs.launchpad.net/bugs/1878481

Status in OpenStack Compute (nova): New
[Yahoo-eng-team] [Bug 1878481] [NEW] server add volume fails
Public bug reported:

Description
===========
(HTTP 500) It seems to start by not finding the instance in nova.

Environment
===========
Environment (Rocky):

```
openstack-nova-common-18.3.0-1.el7.noarch
openstack-nova-conductor-18.3.0-1.el7.noarch
openstack-nova-api-18.3.0-1.el7.noarch
openstack-nova-scheduler-18.3.0-1.el7.noarch
openstack-nova-console-18.3.0-1.el7.noarch
openstack-nova-placement-api-18.3.0-1.el7.noarch
python2-novaclient-11.0.1-1.el7.noarch
openstack-nova-novncproxy-18.3.0-1.el7.noarch
python-nova-18.3.0-1.el7.noarch
```

Hypervisor: Linux KVM
Storage: Ceph 14.2.8 / RBD volumes
Network: Neutron (OpenVSwitch)

Steps to reproduce
==================
I did:

```
openstack --debug server add volume c331814d-b758-460e-9972-bc1e987b933d 0fc5bec7-9364-458c-a809-f38389890a60
```

I also did:

```
openstack --debug server add volume allalal peter-test-001
```

I received:

```
ClientException: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible. (HTTP 500) (Request-ID: req-fb493f88-61c5-4ebd-829b-305b31aaa010)
```

Expected result
===============
Associate the volume and exit cleanly.

Actual result
=============
Nova fails to find the existing instance and fails the attempt to add the volume. All other server commands for the instance work as expected (stop, start, migrate, etc.).

Logs & Configs
==============
/etc/nova/nova.conf:

```
[DEFAULT]
cpu_allocation_ratio = 16.0
debug = true
enabled_apis = osapi_compute,metadata
metadata_proxy_shared_secret =
metadata_workers = 4
my_ip = 10.176.0.42
ram_allocation_ratio = 1.0
scheduler_host_subset_size = 2
transport_url = rabbit://openstack:@mq-a01.xxx.cloud,openstack:@mq-a02.xxx.cloud,openstack:@mq-a03.xxx.cloud
cinder_catalog_info=volumev3:cinderv3:publicURL

[api]
auth_strategy = keystone

[api_database]
connection = mysql+pymysql://atx_nova_api:@db-a00.xxx.cloud/atx_nova_api

[filter_scheduler]
enabled_filters = AvailabilityZoneFilter,ComputeCapabilitiesFilter,ComputeFilter,ImagePropertiesFilter,RamFilter,RetryFilter,ServerGroupAffinityFilter,ServerGroupAntiAffinityFilter
host_subset_size = 4
io_ops_weight_multiplier = 10.0
max_instances_per_host = 100
ram_weight_multiplier = 6.0
soft_affinity_weight_multiplier = 10.0
soft_anti_affinity_weight_multiplier = 10.0
weight_classes = nova.scheduler.weights.all_weighers

[database]
connection = mysql+pymysql://atx_nova:@db-a00.xxx.cloud/atx_nova

[glance]
api_servers = https://glance-a00.xxx.cloud

[keystone_authtoken]
auth_type = password
www_authenticate_uri = https://keystone-a00.xxx.cloud
auth_url = https://keystone-a00.xxx.cloud
memcached_servers = memcached-a01.xxx.cloud:11211,memcached-a02.xxx.cloud:11211,memcached-a03.xxx.cloud:11211
password =
project_domain_name = default
project_name = service
user_domain_name = default
username = atx_nova
service_token_roles_required = true

[neutron]
auth_type = password
auth_url = https://keystone-a00.xxx.cloud
metadata_proxy_shared_secret =
password =
project_domain_name = Default
project_name = service
region_name = atx
service_metadata_proxy = true
#uses keystoneauth1:
url = https://neutron-a00.xxx.cloud
user_domain_name = Default
username = atx_neutron

[oslo_concurrency]
lock_path = /var/lib/nova/tmp

[placement]
auth_type = password
auth_url = https://keystone-a00.xxx.cloud
os_region_name = atx
password =
project_domain_name = Default
project_name = service
user_domain_name = Default
username = atx_placement

[scheduler]
discover_hosts_in_cells_interval = 300

[oslo_notifications_group]
driver = messaging
topics = notifications

[cache]
backend=oslo_cache.memcache_pool
enable = true

[oslo_messaging_rabbit]
amqp_durable_queues = true
rabbit_ha_queues = true
rabbit_retry_backoff = 2
rabbit_retry_interval = 1
```

```
[peisch@jump ~]$ openstack volume list
+--------------------------------------+----------------+-----------+------+-------------+
| ID                                   | Name           | Status    | Size | Attached to |
+--------------------------------------+----------------+-----------+------+-------------+
| ...                                  | ...            | available |   10 |             |
| 0fc5bec7-9364-458c-a809-f38389890a60 | peter-test-001 | available |   10 |             |
| ...                                  | ...            | available |   10 |             |
+--------------------------------------+----------------+-----------+------+-------------+

[peisch@hopslam ~]$ openstack server list
+--------------------------------------+---------+--------+----------------+----------+--------+
| ID                                   | Name    | Status | Networks       | Image    | Flavor |
+--------------------------------------+---------+--------+----------------+----------+--------+
| ...                                  | ...     | ACTIVE | admin=10.1.2.5 | Centos77 | 2x2x20 |
| c331814d-b758-460e-9972-bc1e987b933d | allalal | ACTIVE | admin=10.1
```
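Since the client only sees an HTTP 500, the server-side traceback is the interesting part (a triage sketch; the log path assumes a standard RPM layout, and the request ID is the one from the error above):

```
grep -B 2 -A 40 'req-fb493f88-61c5-4ebd-829b-305b31aaa010' /var/log/nova/nova-api.log
```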
[Yahoo-eng-team] [Bug 1855462] [NEW] Can't use advanced log configuration with Ubuntu and CloudStack
Public bug reported:

System environment information: Apache CloudStack datasource with VMware hypervisor and a minimal Ubuntu 18.04 OS via net install from an online mirror.

```
#cloud-config
datasource:
  CloudStack:
    max_wait: 120
    timeout: 50
datasource_list:
  - CloudStack
system_info:
  distro: ubuntu
  default_user:
    name: cloud
    lock_passwd: False
    sudo: ["ALL=(ALL) NOPASSWD:ALL"]
disable_ec2_metadata: true
ssh_pwauth: yes
disable_root: true
preserve_hostname: false
output:
  init:
    output: ">> /var/log/cloud-init.out"
    error: ">> /var/log/cloud-init.err"
  config: ">> /var/log/cloud-config.log"
  final:
    - ">> /var/log/cloud-final.out"
    - ">>/var/log/cloud-final.err"
packages:
  - ca-certificates
  - pastebinit
```

Results:
- networking is disabled, the network interface is down
- no cloud-init log
- no sudo rights for the cloud user
- hostname remains "localhost"
- cloud-init status: disabled
- cloud-init init: enforces cloud-init, with the following messages in the logs, which are then created as configured above:
  -> in "cloud-init.err": "No init modules to run under section cloud_init_modules"
  -> in "cloud-init.log": "finish: init-network: FAIL: searching for network datasources"

To be sure, I've tested the YAML config and indented it in a way that yamllint produces only warnings, but no errors (a strict YAML document would start with ---; not sure whether that would be OK in this context).

** Affects: cloud-init Importance: Undecided Status: New

https://bugs.launchpad.net/bugs/1855462
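Given the "cloud-init status: disabled" result above, a useful first check is why cloud-init considers itself disabled (a diagnostic sketch; file locations follow the upstream defaults):

```
cloud-init status --long
cat /run/cloud-init/ds-identify.log
```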
[Yahoo-eng-team] [Bug 1855430] [NEW] Unclear documentation on what a complete minimal working configuration could look like for Ubuntu and Apache CloudStack
Public bug reported: It's certainly clear that there are lots of possible configuration combinations (OS x provider). Initially I had thought that picking up the relevant configuration piece by piece would work, but reading the manual leaves some uncertainties.

- The information that you can place the configuration either in /etc/cloud/cloud.cfg or in /etc/cloud/cloud.cfg.d/* should be placed more prominently in an opening section, at the least with an explained possible layout. Are both ways equivalent, i.e. could you place everything in the single config file?
- For example, for network configuration the documentation https://cloudinit.readthedocs.io/en/latest/topics/network-config-format-v2.html says "ethernets: []" is a valid default piece of configuration. Is it equivalent to using the "match" keyword with an asterisk?
- For CloudStack, is the data source configuration also a valid complete minimal configuration, so that the machine is able to configure a default DHCP network and get metadata?
- debug: "verbose: true/false (defaulting to true)". Does this mean "verbose" is not needed, but the "debug" statement has to be there to produce output?
- logging: the default config is given as "output: { all: "| tee -a /var/log/cloud-init-output.log" }". Does this mean that for default logging I do not need to provide the "output" statement, or that exactly this statement is required to enable it?

** Affects: cloud-init Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1855430

Title: Unclear documentation on what a complete minimal working configuration could look like for Ubuntu and Apache CloudStack
Status in cloud-init: New

Bug description: It's certainly clear that there are lots of possible configuration combinations (OS x provider). Initially I had thought that picking up the relevant configuration piece by piece would work, but reading the manual leaves some uncertainties.

- The information that you can place the configuration either in /etc/cloud/cloud.cfg or in /etc/cloud/cloud.cfg.d/* should be placed more prominently in an opening section, at the least with an explained possible layout. Are both ways equivalent, i.e. could you place everything in the single config file?
- For example, for network configuration the documentation https://cloudinit.readthedocs.io/en/latest/topics/network-config-format-v2.html says "ethernets: []" is a valid default piece of configuration. Is it equivalent to using the "match" keyword with an asterisk?
- For CloudStack, is the data source configuration also a valid complete minimal configuration, so that the machine is able to configure a default DHCP network and get metadata?
- debug: "verbose: true/false (defaulting to true)". Does this mean "verbose" is not needed, but the "debug" statement has to be there to produce output?
- logging: the default config is given as "output: { all: "| tee -a /var/log/cloud-init-output.log" }". Does this mean that for default logging I do not need to provide the "output" statement, or that exactly this statement is required to enable it?

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1855430/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
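Regarding the first question: cloud-init reads /etc/cloud/cloud.cfg and then /etc/cloud/cloud.cfg.d/*.cfg in lexical order, with later files overriding earlier keys, so the two placements are interchangeable for simple keys. A minimal sketch of that collapse (not cloud-init's actual code, whose mergers are recursive and type-aware) can be used to inspect what a split configuration amounts to:

```python
#!/usr/bin/env python3
"""Sketch: approximate the effective cloud-init base config.
Assumes the documented read order; cloud-init's real merge rules
are deeper and type-aware, so treat this only as an inspection aid."""
import glob
import yaml

def merged_cloud_config():
    paths = (["/etc/cloud/cloud.cfg"] +
             sorted(glob.glob("/etc/cloud/cloud.cfg.d/*.cfg")))
    merged = {}
    for path in paths:
        with open(path) as fh:
            data = yaml.safe_load(fh) or {}
        merged.update(data)  # shallow merge: keys from later files win
    return merged

if __name__ == "__main__":
    print(yaml.safe_dump(merged_cloud_config(), default_flow_style=False))
```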
[Yahoo-eng-team] [Bug 1848026] Re: Can't reset password on Ubuntu LTS 18.04.3
No longer relevant.

** Changed in: cloud-init Status: Incomplete => Invalid

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1848026

Title: Can't reset password on Ubuntu LTS 18.04.3
Status in cloud-init: Invalid

Bug description: We have the following CI/CD setup with CloudStack, where each iteration does the following:
- deploy a local bootstrap VM using an online network installation image, plus seed and sysprep scripts to install cloud-init, configuring it for the CloudStack data source
- create a CloudStack template from this bootstrap VM, actually two templates (for KVM and VMWare hypervisors)
- use this template to create a test CloudStack VM
- run acceptance tests, for example reset the password and test login with it; on failure, extract logs and other available cloud-init intelligence

For password reset, the test is like the following:
- call the "resetPasswordForVirtualMachine()" API
- save the new VM password
- login to the VM (custom user cloud; I've also manually tested root and ubuntu, and also sudo)

Error: password denied. To get logs, we log in with the original password of the user "cloud".

Observations:
- /run/cloud-init/instance-data.json gets written
- there are log messages like

2019-10-14 12:41:04,126 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: send_my_password', '192.102.146.5:8080'] with allowed return codes [0] (shell=False, capture=True)
2019-10-14 12:41:04,144 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: saved_password', '192.102.146.5:8080'] with allowed return codes [0] (shell=False, capture=True)

Which errors should appear if the password could not be changed?

Relevant version information:
- Ubuntu 18.04.3 LTS
- Apache CloudStack fork by Accelerite based on ACS 4.10, versioned as custom 4.11.0.7
- Cloud-init v. 19.2-36-g059d049c-0ubuntu2~18.04.1

Contents of the cloud-init configuration (I've removed the obsolete custom cloud-init replacements mentioned in the previous bug report #1847604, so there should be no other custom configuration):

datasource:
  CloudStack: {}
  None: {}
datasource_list:
  - CloudStack

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1848026/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1853839] [NEW] Make signing error logging more detailed
Public bug reported: In case of signing faults the cms_sign_data() function in keystoneclient/common/cms.py currently includes little detail on the failing `openssl cms` operation. To aid operations in troubleshooting, it would be helpful if the following were logged in case of faults: - The cert and key files being used - SSL message_digest - The full error message of openssl Version: seen in current master, 925c2c1 ** Affects: python-keystoneclient Importance: Undecided Status: New ** Project changed: keystone => python-keystoneclient ** Description changed: In case of signing faults the cms_sign_data() function in keystoneclient/common/cms.py currently includes little detail on the failing `openssl cms` operation. To aid operations in troubleshooting, it would be helpful if the following were logged in case of faults: - The cert and key files being used - SSL message_digest - The full error message of openssl + + + Version: seen in current master, 925c2c1 -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1853839 Title: Make signing error logging more detailed Status in python-keystoneclient: New Bug description: In case of signing faults the cms_sign_data() function in keystoneclient/common/cms.py currently includes little detail on the failing `openssl cms` operation. To aid operations in troubleshooting, it would be helpful if the following were logged in case of faults: - The cert and key files being used - SSL message_digest - The full error message of openssl Version: seen in current master, 925c2c1 To manage notifications about this bug go to: https://bugs.launchpad.net/python-keystoneclient/+bug/1853839/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
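A sketch of the kind of logging being asked for, wrapping the same `openssl cms -sign` invocation that cms_sign_data() performs; the function and argument names here are illustrative, not keystoneclient's actual API:

```python
import logging
import subprocess

LOG = logging.getLogger(__name__)

def sign_data_verbose(data, signing_cert_file, signing_key_file,
                      message_digest="sha256"):
    """Illustrative wrapper: run `openssl cms -sign` and, on failure, log
    the cert/key paths, the digest and openssl's full stderr, as the
    report requests. `data` is expected as bytes."""
    cmd = ["openssl", "cms", "-sign",
           "-signer", signing_cert_file,
           "-inkey", signing_key_file,
           "-md", message_digest,
           "-outform", "PEM", "-nosmimecap", "-nodetach",
           "-nocerts", "-noattr"]
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate(data)
    if proc.returncode != 0:
        LOG.error("openssl cms signing failed (rc=%d): cert=%s key=%s "
                  "digest=%s stderr=%r", proc.returncode,
                  signing_cert_file, signing_key_file, message_digest, err)
        raise RuntimeError("CMS signing failed, see log for details")
    return out
```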
[Yahoo-eng-team] [Bug 1850988] [NEW] [Cloud-init 18.5][CentOS 7 on vSphere] Crash when configuring static dual-stack (IPv4 + IPv6) networking
Public bug reported: Environment:
- Stock CentOS 7 image template (comes with OpenVM tools) with cloud-init 18.5 installed
- Single NIC VM
- vSphere 6.5 hypervisor

Repro steps:
- Customize the VM with a vSphere customization spec that has a NIC setting with static IPv4 and IPv6 information
- OpenVM tools running inside the guest will delegate guest customization to cloud-init
- Cloud-init crashes with ValueError: Unknown subnet type 'static6' found for interface 'ens192'. See the following relevant excerpts and stacktrace (found in /var/log/cloud-init.log):

[...snip...]
2019-11-01 02:23:41,899 - DataSourceOVF.py[DEBUG]: Found VMware Customization Config File at /var/run/vmware-imc/cust.cfg
2019-11-01 02:23:41,899 - config_file.py[INFO]: Parsing the config file /var/run/vmware-imc/cust.cfg.
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: FOUND CATEGORY = 'NETWORK'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|NETWORKING' = 'yes'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|BOOTPROTO' = 'dhcp'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|HOSTNAME' = 'pr-centos-ci'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NETWORK|DOMAINNAME' = 'gsslabs.local'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: FOUND CATEGORY = 'NIC-CONFIG'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC-CONFIG|NICS' = 'NIC1'
2019-11-01 02:23:41,900 - config_file.py[DEBUG]: FOUND CATEGORY = 'NIC1'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|MACADDR' = '00:50:56:89:b7:48'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|ONBOOT' = 'yes'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv4_MODE' = 'BACKWARDS_COMPATIBLE'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|BOOTPROTO' = 'static'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPADDR' = '1.1.1.4'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|NETMASK' = '255.255.255.0'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6ADDR|1' = '2600::10'
2019-11-01 02:23:41,902 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6NETMASK|1' = '64'
2019-11-01 02:23:41,903 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'NIC1|IPv6GATEWAY|1' = '2600::1'
2019-11-01 02:23:41,903 - config_file.py[DEBUG]: FOUND CATEGORY = 'DNS'
2019-11-01 02:23:41,903 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|DNSFROMDHCP' = 'no'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|SUFFIX|1' = 'sqa.local'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|NAMESERVER|1' = '192.168.0.10'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DNS|NAMESERVER|2' = 'fc00:10:118:192:250:56ff:fe89:64a8'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: FOUND CATEGORY = 'DATETIME'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DATETIME|TIMEZONE' = 'Asia/Kolkata'
2019-11-01 02:23:41,904 - config_file.py[DEBUG]: ADDED KEY-VAL :: 'DATETIME|UTC' = 'no'
2019-11-01 02:23:41,904 - DataSourceOVF.py[DEBUG]: Preparing the Network configuration
2019-11-01 02:23:41,907 - util.py[DEBUG]: Running command ['ip', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2019-11-01 02:23:41,926 - config_nic.py[INFO]: Configuring the interfaces file
2019-11-01 02:23:41,927 - config_nic.py[INFO]: Debian OS not detected.
Skipping the configure step
2019-11-01 02:23:41,927 - util.py[DEBUG]: Recursively deleting /var/run/vmware-imc
[...snip...]
2019-11-01 02:23:43,225 - stages.py[INFO]: Applying network configuration from ds bringup=False: {'version': 1, 'config': [{'subnets': [{'control': 'auto', 'netmask': '255.255.255.0', 'type': 'static', 'address': '1.1.1.4'}, {'netmask': '64', 'type': 'static6', 'address': '2600::10'}], 'type': 'physical', 'name': u'ens192', 'mac_address': '00:50:56:89:b7:48'}, {'search': ['sqa.local'], 'type': 'nameserver', 'address': ['192.168.0.10', 'fc00:10:118:192:250:56ff:fe89:64a8']}]}
2019-11-01 02:23:43,226 - __init__.py[DEBUG]: Selected renderer 'sysconfig' from priority list: None
2019-11-01 02:23:43,244 - util.py[WARNING]: failed stage init-local
2019-11-01 02:23:43,249 - util.py[DEBUG]: failed stage init-local
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 652, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python2.7/site-packages/cloudinit/cmd/main.py", line 362, in main_init
    init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
  File "/usr/lib/python2.7/site-packages/cloudinit/stages.py", line 672, in apply_network_config
    return self.distro.apply_network_config(netcfg, bring_up=bring_up)
  File "/usr/lib/python2.7/site-packages/cloudinit/distros/__init__.py", line 178, in
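The traceback is cut off above, but the ValueError in the summary pinpoints the failure: this cloud-init's sysconfig path has no branch for the 'static6' subnet type that DataSourceOVF emits for the IPv6 address. A minimal sketch of the missing dispatch, using a plain dict rather than cloud-init's real renderer classes and assuming standard ifcfg keys (IPV6INIT, IPV6ADDR, IPV6_DEFAULTGW):

```python
def subnet_to_sysconfig(subnet, cfg):
    """Illustrative sketch, not cloud-init's actual renderer code:
    treat 'static6' like 'static' but emit IPv6 ifcfg keys instead of
    raising ValueError. `cfg` is a plain dict standing in for the
    ifcfg key/value store."""
    stype = subnet["type"]
    if stype == "static":
        cfg["BOOTPROTO"] = "none"
        cfg["IPADDR"] = subnet["address"]
        cfg["NETMASK"] = subnet["netmask"]
    elif stype == "static6":
        cfg["IPV6INIT"] = "yes"
        # ifcfg expects the prefix length appended to the address.
        cfg["IPV6ADDR"] = "%s/%s" % (subnet["address"], subnet["netmask"])
        if subnet.get("gateway"):
            cfg["IPV6_DEFAULTGW"] = subnet["gateway"]
    else:
        raise ValueError("Unknown subnet type '%s'" % stype)
    return cfg

if __name__ == "__main__":
    # The IPv6 subnet exactly as it appears in the network config above.
    print(subnet_to_sysconfig(
        {"type": "static6", "address": "2600::10", "netmask": "64"}, {}))
```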
[Yahoo-eng-team] [Bug 1848026] [NEW] Can't reset password on Ubuntu LTS 18.04.3
Public bug reported: We have the following CI/CD setup with CloudStack, where each iteration does the following:
- deploy a local bootstrap VM using an online network installation image, plus seed and sysprep scripts to install cloud-init, configuring it for the CloudStack data source
- create a CloudStack template from this bootstrap VM, actually two templates (for KVM and VMWare hypervisors)
- use this template to create a test CloudStack VM
- run acceptance tests, for example reset the password and test login with it; on failure, extract logs and other available cloud-init intelligence

For password reset, the test is like the following:
- call the "resetPasswordForVirtualMachine()" API
- save the new VM password
- login to the VM (custom user cloud; I've also manually tested root and ubuntu, and also sudo)

Error: password denied. To get logs, we log in with the original password of the user "cloud".

Observations:
- /run/cloud-init/instance-data.json gets written
- there are log messages like

2019-10-14 12:41:04,126 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: send_my_password', '192.102.146.5:8080'] with allowed return codes [0] (shell=False, capture=True)
2019-10-14 12:41:04,144 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: saved_password', '192.102.146.5:8080'] with allowed return codes [0] (shell=False, capture=True)

Which errors should appear if the password could not be changed?

Relevant version information:
- Ubuntu 18.04.3 LTS
- Apache CloudStack fork by Accelerite based on ACS 4.10, versioned as custom 4.11.0.7
- Cloud-init v. 19.2-36-g059d049c-0ubuntu2~18.04.1

Contents of the cloud-init configuration (I've removed the obsolete custom cloud-init replacements mentioned in the previous bug report #1847604, so there should be no other custom configuration):

datasource:
  CloudStack: {}
  None: {}
datasource_list:
  - CloudStack

** Affects: cloud-init Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1848026

Title: Can't reset password on Ubuntu LTS 18.04.3
Status in cloud-init: New

Bug description: We have the following CI/CD setup with CloudStack, where each iteration does the following:
- deploy a local bootstrap VM using an online network installation image, plus seed and sysprep scripts to install cloud-init, configuring it for the CloudStack data source
- create a CloudStack template from this bootstrap VM, actually two templates (for KVM and VMWare hypervisors)
- use this template to create a test CloudStack VM
- run acceptance tests, for example reset the password and test login with it; on failure, extract logs and other available cloud-init intelligence

For password reset, the test is like the following:
- call the "resetPasswordForVirtualMachine()" API
- save the new VM password
- login to the VM (custom user cloud; I've also manually tested root and ubuntu, and also sudo)

Error: password denied. To get logs, we log in with the original password of the user "cloud".
Observations:
- /run/cloud-init/instance-data.json gets written
- there are log messages like

2019-10-14 12:41:04,126 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: send_my_password', '192.102.146.5:8080'] with allowed return codes [0] (shell=False, capture=True)
2019-10-14 12:41:04,144 - util.py[DEBUG]: Running command ['wget', '--quiet', '--tries', '3', '--timeout', '20', '--output-document', '-', '--header', 'DomU_Request: saved_password', '192.102.146.5:8080'] with allowed return codes [0] (shell=False, capture=True)

Which errors should appear if the password could not be changed?

Relevant version information:
- Ubuntu 18.04.3 LTS
- Apache CloudStack fork by Accelerite based on ACS 4.10, versioned as custom 4.11.0.7
- Cloud-init v. 19.2-36-g059d049c-0ubuntu2~18.04.1

Contents of the cloud-init configuration (I've removed the obsolete custom cloud-init replacements mentioned in the previous bug report #1847604, so there should be no other custom configuration):

datasource:
  CloudStack: {}
  None: {}
datasource_list:
  - CloudStack

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1848026/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
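For reference, the wget calls in the log are queries to the CloudStack password server on the virtual router; it typically replies with the password itself, or with markers such as 'saved_password' (already fetched once) or 'bad_request'. A small Python 3 sketch (server address taken from the log above) to replay the request by hand and see what the server actually returns:

```python
import urllib.request

def fetch_cloudstack_password(server="192.102.146.5:8080"):
    """Replay the same request cloud-init's wget performs: a plain
    HTTP GET against the virtual router with the DomU_Request header.
    The default server address is the one from the log above."""
    req = urllib.request.Request(
        "http://%s" % server,
        headers={"DomU_Request": "send_my_password"})
    with urllib.request.urlopen(req, timeout=20) as resp:
        return resp.read().decode(errors="replace").strip()

if __name__ == "__main__":
    print(fetch_cloudstack_password())
```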
[Yahoo-eng-team] [Bug 1847604] Re: Can't reset password for VM instance
It seems that our custom scripts patching cloud-init originate from an earlier period, so we will test without them.

** Changed in: cloud-init Status: Incomplete => Invalid

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1847604

Title: Can't reset password for VM instance
Status in cloud-init: Invalid

Bug description: We use Apache CloudStack. Since we moved to Ubuntu 18.04 it has not been possible to change the default password of the user. The only issue I can see in the logs is that /sbin/restorecon can't be found; indeed there is no such binary, even though the package "restorecond" is installed. Attachment: relevant logs.

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1847604/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1847604] [NEW] Can't reset password for VM instance
Public bug reported: We use Apache CloudStack. Since we moved to Ubuntu 18.04 it has not been possible to change the default password of the user. The only issue I can see in the logs is that /sbin/restorecon can't be found; indeed there is no such binary, even though the package "restorecond" is installed. Attachment: relevant logs.

** Affects: cloud-init Importance: Undecided Status: New

** Attachment added: "cloud-init.log" https://bugs.launchpad.net/bugs/1847604/+attachment/5296215/+files/cloud-init.log

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1847604

Title: Can't reset password for VM instance
Status in cloud-init: New

Bug description: We use Apache CloudStack. Since we moved to Ubuntu 18.04 it has not been possible to change the default password of the user. The only issue I can see in the logs is that /sbin/restorecon can't be found; indeed there is no such binary, even though the package "restorecond" is installed. Attachment: relevant logs.

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1847604/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1794569] Re: DVR with static routes may cause routed traffic to be dropped
** Changed in: neutron Status: Invalid => New ** Description changed: - Neutron version: 9.4.1 (EOL, but bug may still be present) + Neutron version: 10.0.7 Network scenario: Openvswitch with DVR Openvswitch version: 2.6.1 - OpenStack installation version: Newton + OpenStack installation version: Ocata Operating system: Ubuntu 16.04.5 LTS Kernel: 4.4.0-135 x86_64 Symptoms: Instances whose default gateway is a DVR interface (10.10.255.1 in our case) occasionally lose connectivity to non-local networks. Meaning, any packet that had to pass through the local virtual router is dropped. Sometimes this behavior lasts for a few milliseconds, sometimes tens of seconds. Since floating-ip traffic is a subset of those cases, north-south connectivity breaks too. Steps to reproduce: - Use DVR routing mode - Configure at least one static route in the virtual router, whose next hop is NOT an address managed by Neutron (e.g. a physical interface on a VPN gateway; in our case 10.2.0.0/24 with next-hop 10.10.0.254) - Have an instance plugged into a Flat or VLAN network, use the virtual router as the default gateway - Try to reach a host inside the statically-routed network from within the instance Possible explanation: Distributed routers get their ARP caches populated by neutron-l3-agent at its startup. The agent takes all the ports in a given subnet and fills in their IP-to-MAC mappings inside the qrouter- namespace, as permanent entries (meaning they won't expire from the cache). However, if Neutron doesn't manage an IP (as is the case with our static route's next-hop 10.10.0.254), a permanent record isn't created, naturally. So when we try to reach a host in the statically-routed network (e.g. 10.2.0.10) from inside the instance, the packet goes to the default gateway (10.10.255.1). After it arrives at the qrouter- namespace, there is a static route for this host pointing to 10.10.0.254 as next-hop. However qrouter- doesn't have its MAC address, so what it does is it sends out an ARP request with the source MAC of the distributed router's qr- interface. And that's the problem. Since ARP requests are usually broadcasts, they land on pretty much every hypervisor in the network within the same VLAN. Combined with the fact that qr- interfaces in a given qrouter- namespace have the same MAC address on every host, this leads to a disaster: every integration bridge will receive that ARP request on the port that connects it to the Flat/VLAN network and learns that the qr- interface's MAC address is actually there - not on the qr- port also attached to br-int. From this moment on, packets from instances that need to pass via qrouter- are forwarded to the Flat/VLAN network interface, circumventing the qrouter- namespace. This is especially problematic with traffic that needs to be SNAT-ed on its way out. Workarounds: - The workaround that we used is creating stub Neutron ports for next-hop addresses, with correct MACs. After restarting neutron-l3-agents, they got populated into the qrouter- ARP cache as permanent entries. - Next option is setting the static route into the instances' routing tables instead of the virtual router. This way it's the instance that performs ARP discovery and not the qrouter- namespace. - Another workaround might consist of using ebtables/arptables on hypervisors to block incoming ARP requests from qrouters.
Possible long-term solution: Maybe it would help if ancillary bridges (those connecting Flat/VLAN network interfaces to br-int) contained an OVS flow that drops ARP requests with source MAC addresses of qr- interfaces originating from the physical interface. Since their IPs and MACs are well defined (their device_owner is "network:router_interface_distributed"), it shouldn't be a problem setting these flows up. However, I'm not sure of the shortcomings of this approach. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1794569 Title: DVR with static routes may cause routed traffic to be dropped Status in neutron: New Bug description: Neutron version: 10.0.7 Network scenario: Openvswitch with DVR Openvswitch version: 2.6.1 OpenStack installation version: Ocata Operating system: Ubuntu 16.04.5 LTS Kernel: 4.4.0-135 x86_64 Symptoms: Instances whose default gateway is a DVR interface (10.10.255.1 in our case) occasionally lose connectivity to non-local networks. Meaning, any packet that had to pass through the local virtual router is dropped. Sometimes this behavior lasts for a few milliseconds, sometimes tens of seconds. Since floating-ip traffic is a subset of those cases, north-south connectivity breaks too. Steps to reproduce: - Use DVR routing mode - Configure at least one static route in the virtual router, whose next hop is NOT an address
[Yahoo-eng-team] [Bug 1839621] [NEW] Inappropriate split of transport_url string
Public bug reported: In /etc/nova/nova.conf (line 3085), if your password for the messaging provider (such as rabbit) contains a "#" character, the string will be split incorrectly, preventing the nova services from starting.

Steps to reproduce:
1. In /etc/nova/nova.conf set the transport url to transport_url=rabbit://openstack:test#passw...@controller.host.example.com
2. systemctl start openstack-nova-api.service openstack-nova-consoleauth.service openstack-nova-scheduler.service openstack-nova-conductor.service openstack-nova-novncproxy.service

This will produce:
Job for openstack-nova-consoleauth.service failed because the control process exited with error code. See "systemctl status openstack-nova-consoleauth.service" and "journalctl -xe" for details.
Job for openstack-nova-api.service failed because the control process exited with error code. See "systemctl status openstack-nova-api.service" and "journalctl -xe" for details.
Job for openstack-nova-conductor.service failed because the control process exited with error code. See "systemctl status openstack-nova-conductor.service" and "journalctl -xe" for details.
Job for openstack-nova-scheduler.service failed because the control process exited with error code. See "systemctl status openstack-nova-scheduler.service" and "journalctl -xe" for details.

3. Check journalctl -xe logs and notice:
nova-conductor[31437]: ValueError: invalid literal for int() with base 10: 'test'
systemd[1]: openstack-nova-conductor.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Failed to start OpenStack Nova Conductor Server.

Environment:
OS: CentOS Linux release 7.6.1810
kernel: 3.10.0-957.21.3.el7.x86_64

rpm -qa | grep nova
python2-novaclient-13.0.1-1.el7.noarch
openstack-nova-conductor-19.0.1-1.el7.noarch
openstack-nova-console-19.0.1-1.el7.noarch
openstack-nova-common-19.0.1-1.el7.noarch
openstack-nova-novncproxy-19.0.1-1.el7.noarch
python2-nova-19.0.1-1.el7.noarch
openstack-nova-api-19.0.1-1.el7.noarch
openstack-nova-scheduler-19.0.1-1.el7.noarch

** Affects: nova Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1839621

Title: Inappropriate split of transport_url string
Status in OpenStack Compute (nova): New

Bug description: In /etc/nova/nova.conf (line 3085), if your password for the messaging provider (such as rabbit) contains a "#" character, the string will be split incorrectly, preventing the nova services from starting.

Steps to reproduce:
1. In /etc/nova/nova.conf set the transport url to transport_url=rabbit://openstack:test#passw...@controller.host.example.com
2. systemctl start openstack-nova-api.service openstack-nova-consoleauth.service openstack-nova-scheduler.service openstack-nova-conductor.service openstack-nova-novncproxy.service

This will produce:
Job for openstack-nova-consoleauth.service failed because the control process exited with error code. See "systemctl status openstack-nova-consoleauth.service" and "journalctl -xe" for details.
Job for openstack-nova-api.service failed because the control process exited with error code. See "systemctl status openstack-nova-api.service" and "journalctl -xe" for details.
Job for openstack-nova-conductor.service failed because the control process exited with error code. See "systemctl status openstack-nova-conductor.service" and "journalctl -xe" for details.
Job for openstack-nova-scheduler.service failed because the control process exited with error code. See "systemctl status openstack-nova-scheduler.service" and "journalctl -xe" for details.

3. Check journalctl -xe logs and notice:
nova-conductor[31437]: ValueError: invalid literal for int() with base 10: 'test'
systemd[1]: openstack-nova-conductor.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Failed to start OpenStack Nova Conductor Server.

Environment:
OS: CentOS Linux release 7.6.1810
kernel: 3.10.0-957.21.3.el7.x86_64

rpm -qa | grep nova
python2-novaclient-13.0.1-1.el7.noarch
openstack-nova-conductor-19.0.1-1.el7.noarch
openstack-nova-console-19.0.1-1.el7.noarch
openstack-nova-common-19.0.1-1.el7.noarch
openstack-nova-novncproxy-19.0.1-1.el7.noarch
python2-nova-19.0.1-1.el7.noarch
openstack-nova-api-19.0.1-1.el7.noarch
openstack-nova-scheduler-19.0.1-1.el7.noarch

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1839621/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
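The ValueError is consistent with how the URL parses: everything after the unescaped "#" becomes a fragment, leaving "openstack:test" as the authority, which is then read as host:port, and int('test') fails. A sketch of the percent-encoding workaround (the password here is hypothetical, since the real one is elided above):

```python
from urllib.parse import quote

user = "openstack"
password = "test#password"  # illustrative; '#' starts a URL fragment unless escaped
host = "controller.host.example.com"

# Percent-encode the password so '#' becomes '%23' and survives URL parsing.
transport_url = "rabbit://%s:%s@%s" % (user, quote(password, safe=""), host)
print(transport_url)
# rabbit://openstack:test%23password@controller.host.example.com
```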
[Yahoo-eng-team] [Bug 1821016] [NEW] Race on reboot, fault on spawning sriov instance after reboot
Public bug reported: There appears to be some race between nova-compute and neutron-sriov-agent when rebooting. When trying to bring up an sriov-enabled instance, this intermittently (but often) fails after a fresh reboot. Restarting neutron-sriov-agent and nova-compute seems to fix this, i.e. one can again spawn instances w/ sriov ports.

* Pre-conditions:
- Sriov interfaces configured and functional, i.e. can spawn functional sriov enabled instances

* Step-by-step reproduction steps:
- Verify an sriov enabled instance can be spawned as the admin user
- Reboot the compute
- Attempt to spawn an sriov enabled instance A, wait for the fault
- Restart: $ sudo service neutron-sriov-agent restart ; sleep 1 ; sudo service nova-compute restart
- Attempt to spawn an sriov enabled instance B

* Expected output: 2x ACTIVE instances A and B

* Actual output:
ERRORed instance A
nova-compute.log contains

2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [req-eddf8d43-ca07-491f-931c-96b20cce7ef7 a0f1548a1b3f45379155ca1fb21c1599 7881e5796b2e4f80a9e5a7e089029bc3 - - -] [instance: 48f54a52-8cb7-4963-93fa-c412954a2086] Failed to allocate network(s)
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086] Traceback (most recent call last):
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1939, in _build_and_run_instance
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]     block_device_info=block_device_info)
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2798, in spawn
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]     destroy_disks_on_failure=True)
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5321, in _create_domain_and_network
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]     raise exception.VirtualInterfaceCreateException()
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086] VirtualInterfaceCreateException: Virtual Interface creation failed
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]

ACTIVE instance B

* Version: Ocata running on Ubuntu xenial
neutron 10.0.7-0ubuntu1~cloud1
nova 15.1.5-0ubuntu1~cloud1

** Affects: neutron Importance: Undecided Status: New
** Affects: nova Importance: Undecided Status: New
** Tags: canonical-bootstack
** Also affects: nova Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1821016

Title: Race on reboot, fault on spawning sriov instance after reboot
Status in neutron: New
Status in OpenStack Compute (nova): New

Bug description: There appears to be some race between nova-compute and neutron-sriov-agent when rebooting. When trying to bring up an sriov-enabled instance, this intermittently (but often) fails after a fresh reboot. Restarting neutron-sriov-agent and nova-compute seems to fix this, i.e.
can again spawn instances w/ sriov ports.

* Pre-conditions:
- Sriov interfaces configured and functional, i.e. can spawn functional sriov enabled instances

* Step-by-step reproduction steps:
- Verify an sriov enabled instance can be spawned as the admin user
- Reboot the compute
- Attempt to spawn an sriov enabled instance A, wait for the fault
- Restart: $ sudo service neutron-sriov-agent restart ; sleep 1 ; sudo service nova-compute restart
- Attempt to spawn an sriov enabled instance B

* Expected output: 2x ACTIVE instances A and B

* Actual output:
ERRORed instance A
nova-compute.log contains

2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [req-eddf8d43-ca07-491f-931c-96b20cce7ef7 a0f1548a1b3f45379155ca1fb21c1599 7881e5796b2e4f80a9e5a7e089029bc3 - - -] [instance: 48f54a52-8cb7-4963-93fa-c412954a2086] Failed to allocate network(s)
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086] Traceback (most recent call last):
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance: 48f54a52-8cb7-4963-93fa-c412954a2086]   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1939, in _build_and_run_instance
2019-03-20 13:59:36.636 22577 ERROR nova.compute.manager [instance:
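Until the ordering race is fixed, one operator-side mitigation is to gate post-reboot instance spawning on the SR-IOV agent actually reporting alive, rather than relying on boot order. A sketch using openstacksdk; the cloud name "mycloud" and the timeout are assumptions, while the binary name matches the standard neutron-sriov-nic-agent:

```python
import time
import openstack

def wait_for_sriov_agent(host, timeout=300):
    """Poll Neutron until the SR-IOV agent on `host` reports alive.
    Cloud name and timeout are illustrative; adjust for your deployment."""
    conn = openstack.connect(cloud="mycloud")
    deadline = time.time() + timeout
    while time.time() < deadline:
        for agent in conn.network.agents(host=host):
            if agent.binary == "neutron-sriov-nic-agent" and agent.is_alive:
                return True
        time.sleep(5)
    return False

if __name__ == "__main__":
    print(wait_for_sriov_agent("compute1"))
```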
[Yahoo-eng-team] [Bug 1806701] [NEW] cloud-init may hang OS boot process due to grep for the entire ISO file when it is attached
Public bug reported: We have found in our test for SLES15 with cloud-init installed, if we attach an ISO file to the VM before the VM boots, it often takes more than 10 minutes to start the SLES OS. Sometimes it fails to start the SLES OS at all. We've root-caused it to the "is_cdrom_ovf()" func of "tools/ds-identify". In this function, there is the following logic to detect if an ISO contains a certain string:

>local idstr="http://schemas.dmtf.org/ovf/environment/1"
>grep --quiet --ignore-case "$idstr" "${PATH_ROOT}$dev"

ref: https://git.launchpad.net/cloud-init/tree/tools/ds-identify

It is trying to grep the whole ISO file for a certain string, which causes intense IO pressure for the system. What is worse is that sometimes the ISO file is large (e.g. >5GB for an installer DVD) and it is mounted over NFS. The "grep" process often consumes 99% CPU and seems to hang. Then systemd starts more and more "grep" processes, which smoke the CPU and consume all the IO bandwidth for the ISO file. Then the system may hang for a long time and sometimes fails to start.

To fix this issue, I suggest that we should not grep the entire ISO file. Rather, we should just check whether the file/dir exists with os.path.exists().

-debug log snip
pek2-gosv-16-dhcp180:~ # ps -ef
UID        PID  PPID  C STIME TTY      TIME     CMD
root         1     0  0 13:32 ?        00:00:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 24
…
root       474     1  0 13:34 ?        00:00:00 /bin/sh /usr/lib/cloud-init/ds-identify
root       482   474  2 13:34 ?        00:00:15 grep --quiet --ignore-case http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
root      1020     1  0 13:35 ?        00:00:00 /bin/sh /usr/lib/cloud-init/ds-identify
root      1039  1020  1 13:35 ?        00:00:07 grep --quiet --ignore-case http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
polkitd   1049     1  0 13:37 ?        00:00:00 /usr/lib/polkit-1/polkitd --no-debug
root      1051     1  0 13:37 ?        00:00:00 /usr/sbin/wickedd --systemd --foreground
root      1052     1  0 13:37 ?        00:00:00 /usr/lib/systemd/systemd-logind
root      1054     1  0 13:37 ?        00:00:00 /usr/sbin/wickedd-nanny --systemd --foreground
root      1073     1  0 13:37 ?        00:00:00 /usr/bin/vmtoolsd
root      1097     1  0 13:37 ?        00:00:00 /bin/sh /usr/lib/cloud-init/ds-identify
root      1110  1097  1 13:37 ?        00:00:04 grep --quiet --ignore-case http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
root      1304     1  0 13:38 ?        00:00:00 /bin/sh /usr/lib/cloud-init/ds-identify
root      1312  1304  1 13:38 ?        00:00:03 grep --quiet --ignore-case http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
root      1537     1  0 13:40 ?        00:00:00 /usr/bin/plymouth --wait
root      1613     1  0 13:40 ?        00:00:00 /bin/sh /usr/lib/cloud-init/ds-identify
root      1645  1613  0 13:40 ?        00:00:02 grep --quiet --ignore-case http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
…

grep uses nearly 100% CPU; the system is very slow.

top - 13:46:37 up 26 min,  2 users,  load average: 14.14, 15.03, 10.57
Tasks: 225 total,   6 running, 219 sleeping,   0 stopped,   0 zombie
%Cpu(s): 40.1 us, 49.3 sy, 0.0 ni, 0.0 id, 1.4 wa, 0.0 hi, 9.1 si, 0.0 st
KiB Mem :  1000916 total,    64600 free,   355880 used,   580436 buff/cache
KiB Swap:  1288168 total,  1285600 free,     2568 used.
492688 avail Mem

  PID USER  PR  NI   VIRT  RES  SHR S  %CPU  %MEM   TIME+  COMMAND
 4427 root  20   0  40100 3940 3084 R 99.90 0.394 0:27.41 top
 1016 root  20   0 197796 4852 3400 R 99.90 0.485 1:26.44 vmtoolsd
 1723 root  20   0   7256 1860 1556 D 99.90 0.186 0:28.44 grep
  484 root  20   0   7256 1684 1396 D 99.90 0.168 1:51.22 grep
 1278 root  20   0   7256 1856 1556 D 99.90 0.185 0:38.44 grep
 1398 root  20   0   7256 1860 1556 R 99.90 0.186 0:28.53 grep
 1061 root  20   0   7256 1856 1556 D 99.90 0.185 0:56.62 grep

-debug log snip

** Affects: cloud-init Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1806701

Title: cloud-init may hang OS boot process due to grep for the entire ISO file when it is attached
Status in cloud-init: New

Bug description: We have found in our test for SLES15 with cloud-init installed, if we attach an ISO file to the VM before the VM boots, it often takes more than 10 minutes to start the SLES OS. Sometimes it fails to start the SLES OS at all. We've root-caused it to the "is_cdrom_ovf()" func of "tools/ds-identify". In this function, there is the following logic to detect if an ISO contains a certain string:

>local
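A sketch of the suggested direction: instead of letting grep scan the raw block device end to end, mount the ISO read-only and test for the presence of an OVF descriptor file. The mount-point handling and candidate file names are illustrative assumptions (ds-identify itself is shell; this shows the same idea in Python):

```python
import glob
import os
import subprocess
import tempfile

def cdrom_has_ovf_descriptor(dev="/dev/sr1"):
    """Sketch (requires root): mount the ISO read-only and check for an
    OVF environment file instead of grepping the whole device. The device
    path and file-name patterns are illustrative, not ds-identify's
    actual logic."""
    mnt = tempfile.mkdtemp(prefix="ovf-probe-")
    try:
        subprocess.check_call(["mount", "-o", "ro", dev, mnt])
    except subprocess.CalledProcessError:
        os.rmdir(mnt)
        return False
    try:
        candidates = (glob.glob(os.path.join(mnt, "*.ovf")) +
                      glob.glob(os.path.join(mnt, "ovf-env.xml")))
        return bool(candidates)
    finally:
        subprocess.call(["umount", mnt])
        os.rmdir(mnt)
```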
[Yahoo-eng-team] [Bug 1794569] [NEW] DVR with static routes may cause routed traffic to be dropped
Public bug reported: Neutron version: 9.4.1 (EOL, but bug may still be present) Network scenario: Openvswitch with DVR Openvswitch version: 2.6.1 OpenStack installation version: Newton Operating system: Ubuntu 16.04.5 LTS Kernel: 4.4.0-135 x86_64 Symptoms: Instances whose default gateway is a DVR interface (10.10.255.1 in our case) occasionally lose connectivity to non-local networks. Meaning, any packet that had to pass through the local virtual router is dropped. Sometimes this behavior lasts for a few milliseconds, sometimes tens of seconds. Since floating-ip traffic is a subset of those cases, north-south connectivity breaks too. Steps to reproduce: - Use DVR routing mode - Configure at least one static route in the virtual router, whose next hop is NOT an address managed by Neutron (e.g. a physical interface on a VPN gateway; in our case 10.2.0.0/24 with next-hop 10.10.0.254) - Have an instance plugged into a Flat or VLAN network, use the virtual router as the default gateway - Try to reach a host inside the statically-routed network from within the instance Possible explanation: Distributed routers get their ARP caches populated by neutron-l3-agent at its startup. The agent takes all the ports in a given subnet and fills in their IP-to-MAC mappings inside the qrouter- namespace, as permanent entries (meaning they won't expire from the cache). However, if Neutron doesn't manage an IP (as is the case with our static route's next-hop 10.10.0.254), a permanent record isn't created, naturally. So when we try to reach a host in the statically-routed network (e.g. 10.2.0.10) from inside the instance, the packet goes to the default gateway (10.10.255.1). After it arrives at the qrouter- namespace, there is a static route for this host pointing to 10.10.0.254 as next-hop. However qrouter- doesn't have its MAC address, so what it does is it sends out an ARP request with the source MAC of the distributed router's qr- interface. And that's the problem. Since ARP requests are usually broadcasts, they land on pretty much every hypervisor in the network within the same VLAN. Combined with the fact that qr- interfaces in a given qrouter- namespace have the same MAC address on every host, this leads to a disaster: every integration bridge will receive that ARP request on the port that connects it to the Flat/VLAN network and learns that the qr- interface's MAC address is actually there - not on the qr- port also attached to br-int. From this moment on, packets from instances that need to pass via qrouter- are forwarded to the Flat/VLAN network interface, circumventing the qrouter- namespace. This is especially problematic with traffic that needs to be SNAT-ed on its way out. Workarounds: - The workaround that we used is creating stub Neutron ports for next-hop addresses, with correct MACs. After restarting neutron-l3-agents, they got populated into the qrouter- ARP cache as permanent entries. - Another workaround might consist of using ebtables/arptables on hypervisors to block incoming ARP requests from qrouters. Possible long-term solution: Maybe it would help if ancillary bridges (those connecting Flat/VLAN network interfaces to br-int) contained an OVS flow that drops ARP requests with source MAC addresses of qr- interfaces originating from the physical interface. Since their IPs and MACs are well defined (their device_owner is "network:router_interface_distributed"), it shouldn't be a problem setting these flows up. However, I'm not sure of the shortcomings of this approach.
** Affects: neutron Importance: Undecided Status: New ** Tags: drop dvr route static traffic -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1794569 Title: DVR with static routes may cause routed traffic to be dropped Status in neutron: New Bug description: Neutron version: 9.4.1 (EOL, but bug may still be present) Network scenario: Openvswitch with DVR Openvswitch version: 2.6.1 OpenStack installation version: Newton Operating system: Ubuntu 16.04.5 LTS Kernel: 4.4.0-135 x86_64 Symptoms: Instances whose default gateway is a DVR interface (10.10.255.1 in our case) occasionally lose connectivity to non-local networks. Meaning, any packet that had to pass through the local virtual router is dropped. Sometimes this behavior lasts for a few milliseconds, sometimes tens of seconds. Since floating-ip traffic is a subset of those cases, north-south connectivity breaks too. Steps to reproduce: - Use DVR routing mode - Configure at least one static route in the virtual router, whose next hop is NOT an address managed by Neutron (e.g. a physical interface on a VPN gateway; in our case 10.2.0.0/24 with next-hop 10.10.0.254) - Have an instance plugged into a Flat or VLAN network, use the virtual router as the default gateway - Try to reach a host inside the
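The stub-port workaround from the report can be scripted; a sketch using openstacksdk, where the cloud name, network/subnet IDs and MAC are placeholders, registering the external next-hop as a Neutron port so that neutron-l3-agent seeds a permanent ARP entry for it on restart:

```python
import openstack

# Sketch: register the external next-hop (10.10.0.254 in the report) as a
# Neutron port with its real MAC. NETWORK_ID, SUBNET_ID, the MAC and the
# cloud name "mycloud" are placeholders for your deployment's values.
conn = openstack.connect(cloud="mycloud")
port = conn.network.create_port(
    network_id="NETWORK_ID",
    fixed_ips=[{"subnet_id": "SUBNET_ID", "ip_address": "10.10.0.254"}],
    mac_address="aa:bb:cc:dd:ee:ff",  # the gateway's actual MAC address
    name="stub-vpn-gateway-nexthop")
print(port.id)
```

After creating the stub port, the l3-agents still need a restart for the permanent ARP entries to be repopulated, as the report notes.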
[Yahoo-eng-team] [Bug 1767267] [NEW] character of set image property multiqueue command is wrong
Public bug reported: This bug tracker is for errors with the documentation; use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes:

- [x] This doc is inaccurate in this way: the command example to set the image multiqueue property has a wrong character; the word "hw_vif_mutliqueue_enabled" in the example should be hw_vif_multiqueue_enabled
- [x] I have a fix to the document that I can paste below, including example input and output. The correct example command to set the image multiqueue property should be: $ openstack image set --property hw_vif_multiqueue_enabled=true IMAGE_NAME

If you have a troubleshooting or support issue, use the following resources:
- Ask OpenStack: http://ask.openstack.org
- The mailing list: http://lists.openstack.org
- IRC: 'openstack' channel on Freenode

--- Release: 11.0.4.dev14 on 2018-04-24 21:28 SHA: 1d3568da772dfba989f6b0f18a99f6d02860c2a6 Source: https://git.openstack.org/cgit/openstack/neutron/tree/doc/source/admin/config-ovs-dpdk.rst URL: https://docs.openstack.org/neutron/pike/admin/config-ovs-dpdk.html

** Affects: neutron Importance: Undecided Status: New ** Tags: doc

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1767267

Title: character of set image property multiqueue command is wrong
Status in neutron: New

Bug description: This bug tracker is for errors with the documentation; use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes:

- [x] This doc is inaccurate in this way: the command example to set the image multiqueue property has a wrong character; the word "hw_vif_mutliqueue_enabled" in the example should be hw_vif_multiqueue_enabled
- [x] I have a fix to the document that I can paste below, including example input and output. The correct example command to set the image multiqueue property should be: $ openstack image set --property hw_vif_multiqueue_enabled=true IMAGE_NAME

If you have a troubleshooting or support issue, use the following resources:
- Ask OpenStack: http://ask.openstack.org
- The mailing list: http://lists.openstack.org
- IRC: 'openstack' channel on Freenode

--- Release: 11.0.4.dev14 on 2018-04-24 21:28 SHA: 1d3568da772dfba989f6b0f18a99f6d02860c2a6 Source: https://git.openstack.org/cgit/openstack/neutron/tree/doc/source/admin/config-ovs-dpdk.rst URL: https://docs.openstack.org/neutron/pike/admin/config-ovs-dpdk.html

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1767267/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1746609] [NEW] test_boot_server_from_encrypted_volume_luks cannot detach an encrypted StorPool-backed volume
Public bug reported: Hi, First of all, thanks a lot for working on Nova! The StorPool third-party Cinder CI has been failing on every test run today with the same problem: the test_boot_server_from_encrypted_volume_luks Tempest test fails when trying to detach a volume with an exception in the nova-compute service log: "Failed to detach volume 645fd643-89fc-4b3d-9ea5-59c764fc39a2 from /dev/vdb: AttributeError: 'NoneType' object has no attribute 'format_dom'" An example stack trace may be seen at: - nova-compute log: http://logs.ci-openstack.storpool.com/18/539318/1/check/dsvm-tempest-storpool/c3daf58/logs/screen-n-cpu.txt.gz#_Jan_31_18_07_27_971552 - console log (with the list of tests run): http://logs.ci-openstack.storpool.com/18/539318/1/check/dsvm-tempest-storpool/c3daf58/console.html Actually, start from http://logs.ci-openstack.storpool.com/ - any of the recent five or six failures can be traced back to this problem. Of course, it is completely possible that the (recently merged) StorPool Nova volume attachment driver or the (also recently merged) StorPool os- brick connector is at fault; if there are any configuration fields or method parameters that we should be preserving, passing through, or handling in some other way, please let us know and we will modify our drivers. Also, our CI system is available for testing any suggested patches or workarounds. Thanks in advance for looking at this, and thanks for your work on Nova and OpenStack in general! Best regards, Peter ** Affects: nova Importance: Undecided Status: New ** Tags: libvirt queens-rc-potential volumes -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1746609 Title: test_boot_server_from_encrypted_volume_luks cannot detach an encrypted StorPool-backed volume Status in OpenStack Compute (nova): New Bug description: Hi, First of all, thanks a lot for working on Nova! The StorPool third-party Cinder CI has been failing on every test run today with the same problem: the test_boot_server_from_encrypted_volume_luks Tempest test fails when trying to detach a volume with an exception in the nova-compute service log: "Failed to detach volume 645fd643-89fc-4b3d- 9ea5-59c764fc39a2 from /dev/vdb: AttributeError: 'NoneType' object has no attribute 'format_dom'" An example stack trace may be seen at: - nova-compute log: http://logs.ci-openstack.storpool.com/18/539318/1/check/dsvm-tempest-storpool/c3daf58/logs/screen-n-cpu.txt.gz#_Jan_31_18_07_27_971552 - console log (with the list of tests run): http://logs.ci-openstack.storpool.com/18/539318/1/check/dsvm-tempest-storpool/c3daf58/console.html Actually, start from http://logs.ci-openstack.storpool.com/ - any of the recent five or six failures can be traced back to this problem. Of course, it is completely possible that the (recently merged) StorPool Nova volume attachment driver or the (also recently merged) StorPool os-brick connector is at fault; if there are any configuration fields or method parameters that we should be preserving, passing through, or handling in some other way, please let us know and we will modify our drivers. Also, our CI system is available for testing any suggested patches or workarounds. Thanks in advance for looking at this, and thanks for your work on Nova and OpenStack in general! 
Best regards, Peter To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1746609/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
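For anyone triaging: the AttributeError is consistent with the disk-device lookup during detach returning None and format_dom() later being invoked on the missing config object. A purely hypothetical guard, not nova's actual code, showing the shape of a fix that would surface a diagnosable error instead:

```python
def detach_volume_safely(guest, disk_dev):
    """Hypothetical sketch, not nova's actual detach path: if the device
    lookup comes back None (e.g. the disk already vanished from the
    domain XML), fail with a clear message instead of letting a later
    format_dom() call raise AttributeError on None."""
    conf = guest.get_disk(disk_dev)  # assumed lookup helper; may return None
    if conf is None:
        raise RuntimeError(
            "device %s not found in domain XML, refusing to build a "
            "detach request" % disk_dev)
    guest.detach_device(conf, persistent=True, live=True)
```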
[Yahoo-eng-team] [Bug 1735725] [NEW] No "default" physnet for vlan provider
Public bug reported: When creating a network with the vlan provider, the "Physical Network" field is prepopulated with the string "default", although such a physnet doesn't exist. The user actually needs to specify a real physnet here; trying to add a net with physnet "default" fails. UI-wise, leaving the field empty would make this clearer.

Version: Mitaka, 9.1.2

** Affects: horizon Importance: Undecided Status: New ** Tags: canonical-bootstack

** Description changed: When creating a network with the vlan provider, the "Physical Network" field is prepopulated with the string "default", although such a physnet doesn't exist. The user actually needs to specify a real physnet here; trying to add a net with physnet "default" fails. UI-wise, leaving the field empty would make this clearer. + + Version: Mitaka, 9.1.2

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1735725

Title: No "default" physnet for vlan provider
Status in OpenStack Dashboard (Horizon): New

Bug description: When creating a network with the vlan provider, the "Physical Network" field is prepopulated with the string "default", although such a physnet doesn't exist. The user actually needs to specify a real physnet here; trying to add a net with physnet "default" fails. UI-wise, leaving the field empty would make this clearer.

Version: Mitaka, 9.1.2

To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1735725/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1720163] [NEW] Openvswitch ports not getting recreated
Public bug reported: Due to an apparent communication issue between neutron-openvswitch-agent and openvswitch, we were in a situation where we had linux veth devices set up for a given instance (e.g. tapXXX, qvbXXX) but were missing the corresponding ovs ports and iptables rules. In this situation we had to manually delete the veth devices and restart nova-compute and neutron-openvswitch-agent to get the openvswitch ports and iptables rules recreated. It would be more robust if neutron and openvswitch automatically detected missing objects and synchronized their view of which devices and ports should be present.

Versions: Openstack Mitaka on Ubuntu xenial
neutron: 2:8.4.0-0ubuntu5
openvswitch-switch 2.5.2-0ubuntu0.16.04.1

** Affects: neutron Importance: Undecided Status: New ** Tags: canonical-bootstack

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1720163

Title: Openvswitch ports not getting recreated
Status in neutron: New

Bug description: Due to an apparent communication issue between neutron-openvswitch-agent and openvswitch, we were in a situation where we had linux veth devices set up for a given instance (e.g. tapXXX, qvbXXX) but were missing the corresponding ovs ports and iptables rules. In this situation we had to manually delete the veth devices and restart nova-compute and neutron-openvswitch-agent to get the openvswitch ports and iptables rules recreated. It would be more robust if neutron and openvswitch automatically detected missing objects and synchronized their view of which devices and ports should be present.

Versions: Openstack Mitaka on Ubuntu xenial
neutron: 2:8.4.0-0ubuntu5
openvswitch-switch 2.5.2-0ubuntu0.16.04.1

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1720163/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
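The missing reconciliation can be approximated from the outside. A sketch that diffs the veth endpoints the kernel knows about against the ports openvswitch actually has on br-int, assuming the hybrid-plug naming convention in which each qvbXXX veth peers with a qvoXXX port on the integration bridge:

```python
import re
import subprocess

def find_orphaned_veths(bridge="br-int"):
    """Sketch: list qvo* veth endpoints known to the kernel and compare
    them with the ports openvswitch has on the integration bridge.
    Anything present in the kernel but absent from ovs is a candidate
    for the stale state described above. Assumes hybrid-plug naming."""
    links = subprocess.check_output(["ip", "-o", "link", "show"], text=True)
    kernel_qvo = set(re.findall(r"\d+: (qvo[0-9a-f-]+)", links))
    ovs_ports = set(subprocess.check_output(
        ["ovs-vsctl", "list-ports", bridge], text=True).split())
    return sorted(kernel_qvo - ovs_ports)

if __name__ == "__main__":
    for dev in find_orphaned_veths():
        print("veth %s exists in the kernel but has no ovs port" % dev)
```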
[Yahoo-eng-team] [Bug 1711319] [NEW] Horizon secgroup misleading help text
Public bug reported: Hi,

One of my colleagues found some misleading information when he wanted to add a new rule to a secgroup with a wildcard IP protocol.

If you click on the question mark, it gives a hint to write "-1", but Horizon returns the "Not a valid IP protocol number" error without posting the form. (See the attached image.)

The problem can be solved if you leave the protocol field empty; the rule will then be added as "IP protocol - Any".

We are using Mitaka now: openstack-dashboard (2:9.1.2-0ubuntu1~cloud0)

Can somebody change the hint text in the packages?

Regards, Peter ERDOSI

** Affects: horizon Importance: Undecided Status: New

** Attachment added: "If you write nothing in the IP protocol field, the rule will be added as wanted" https://bugs.launchpad.net/bugs/1711319/+attachment/4934123/+files/secgroup_protocol_wildcard_hint.png

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1711319

Title: Horizon secgroup misleading help text
Status in OpenStack Dashboard (Horizon): New

Bug description: Hi,

One of my colleagues found some misleading information when he wanted to add a new rule to a secgroup with a wildcard IP protocol. If you click on the question mark, it gives a hint to write "-1", but Horizon returns the "Not a valid IP protocol number" error without posting the form. (See the attached image.) The problem can be solved if you leave the protocol field empty; the rule will then be added as "IP protocol - Any". We are using Mitaka now: openstack-dashboard (2:9.1.2-0ubuntu1~cloud0). Can somebody change the hint text in the packages?

Regards, Peter ERDOSI

To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1711319/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1702992] [NEW] get_diagnostics on a non-running instance raises unnecessary exception
Public bug reported: Calling get_diagnostics on an instance that is shut down, either via CLI or API, generates an exception and adds a row to the nova.instance_faults table. Reproduce: create an instance, power it off, run `nova diagnostics instance`. Expected result: a simple error message returned to the user. Actual result: a new row in the nova.instance_faults table and a lot of text in nova-compute.log:

```
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher [req-1c775f02-887d-42b6-974d-129dbe73c7ce ffa01db44dc4435dbb613f50da552315 b1744f8baaa44d4f9762b9a7eaffc61f - - -] Exception during message handling: Instance 6ab6cc05-bebf-4bc2-95ac-7daf317125d6 in power_state 4. Cannot get_diagnostics while the instance is in this state.
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     executor_callback))
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     executor_callback)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 89, in wrapped
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     payload)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/exception.py", line 72, in wrapped
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     return f(self, context, *args, **kw)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 378, in decorated_function
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     kwargs['instance'], e, sys.exc_info())
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     six.reraise(self.type_, self.value, self.tb)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 366, in decorated_function
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     return function(self, context, *args, **kwargs)
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 4089, in get_diagnostics
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher     method='get_diagnostics')
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher InstanceInvalidState: Instance 6ab6cc05-bebf-4bc2-95ac-7daf317125d6 in power_state 4. Cannot get_diagnostics while the instance is in this state.
2017-07-07 19:00:12.301 23077 ERROR oslo_messaging.rpc.dispatcher
2017-07-07 19:00:12.305 23077 ERROR oslo_messaging._drivers.common [req-1c775f02-887d-42b6-974d-129dbe73c7ce ffa01db44dc4435dbb613f50da552315 b1744f8baaa44d4f9762b9a7eaffc61f - - -] Returning exception Instance 6ab6cc05-bebf-4bc2-95ac-7daf317125d6 in power_state 4. Cannot get_diagnostics while the instance is in this state. to caller
2017-07-07 19:00:12.306 23077 ERROR oslo_messaging._drivers.common [req-1c775f02-887d-42b6-974d-129dbe73c7ce ffa01db44dc4435dbb613f50da552315 b1744f8baaa44d4f9762b9a7eaffc61f - - -] ['Traceback (most recent call last):\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n    executor_callback))\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n    executor_callback)\n', '  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch\n    result = func(ctxt, **new_args)\n', '  File "/usr/lib/python2.7/site-packages/nova/exception.py", line 89, in wrapped\n    payload)\n', '  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__\n    six.reraise(self.type_, self.value,
```
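What the reporter asks for amounts to catching the state error at the API layer instead of letting it travel through the fault-recording decorators. A minimal sketch of that idea, assuming nova's InstanceInvalidState exception and the webob-based API controllers; the handler function itself is hypothetical, not nova's actual code:

```
# Hypothetical API-layer handler: turn InstanceInvalidState into a clean
# HTTP 409 so no traceback is logged and no instance_faults row is written.
import webob.exc

from nova.exception import InstanceInvalidState

def get_diagnostics_action(compute_api, context, instance):
    try:
        return compute_api.get_diagnostics(context, instance)
    except InstanceInvalidState as exc:
        # "a simple error message returned to the user"
        raise webob.exc.HTTPConflict(explanation=str(exc))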
[Yahoo-eng-team] [Bug 1645597] [NEW] NoCloud module doesn't put GATEWAY and DNS* settings in ifcfg-* file in Fedora25/cloud-init 0.7.8
Public bug reported: Hello all! The NoCloud module omits setting the GATEWAY and DNS* variables while configuring the /etc/sysconfig/network-scripts/ifcfg-eth0 file with BOOTPROTO=static. As a result, the configured system lacks Internet connectivity. meta-data file used:

```
instance-id: $VMID
local-hostname: $VMNAME
hostname: $VMNAME
network-interfaces: |
  auto eth0
  iface eth0 inet static
  hwaddr $MAC
  address $IPADDRESS
  netmask $NETMASK
  gateway $GATEWAY
  dns-nameservers $DNSSERVERS
  dns-search $SEARCHDOMAINS
```

This problem takes place on Fedora 25 with cloud-init 0.7.8. ** Affects: cloud-init Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1645597 Title: NoCloud module doesn't put GATEWAY and DNS* settings in ifcfg-* file in Fedora25/cloud-init 0.7.8 Status in cloud-init: New Bug description: Hello all! The NoCloud module omits setting the GATEWAY and DNS* variables while configuring the /etc/sysconfig/network-scripts/ifcfg-eth0 file with BOOTPROTO=static. As a result, the configured system lacks Internet connectivity. meta-data file used:

```
instance-id: $VMID
local-hostname: $VMNAME
hostname: $VMNAME
network-interfaces: |
  auto eth0
  iface eth0 inet static
  hwaddr $MAC
  address $IPADDRESS
  netmask $NETMASK
  gateway $GATEWAY
  dns-nameservers $DNSSERVERS
  dns-search $SEARCHDOMAINS
```

This problem takes place on Fedora 25 with cloud-init 0.7.8. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1645597/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
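To illustrate what a complete rendering of the meta-data above should produce, here is a hypothetical renderer sketch (not cloud-init's actual code; the sample values at the bottom are placeholders), including the GATEWAY and DNS* keys the report says are omitted:

```
# Hypothetical ifcfg renderer: the point is that gateway, dns-nameservers
# and dns-search from the meta-data must end up in the ifcfg file too.
def render_ifcfg(name, cfg):
    entries = [
        ("DEVICE", name),
        ("BOOTPROTO", "static"),
        ("ONBOOT", "yes"),
        ("HWADDR", cfg.get("hwaddr")),
        ("IPADDR", cfg.get("address")),
        ("NETMASK", cfg.get("netmask")),
        ("GATEWAY", cfg.get("gateway")),  # omitted today -> no default route
    ]
    for i, ns in enumerate(cfg.get("dns-nameservers", []), start=1):
        entries.append(("DNS%d" % i, ns))
    if cfg.get("dns-search"):
        entries.append(("SEARCH", " ".join(cfg["dns-search"])))
    return "".join("%s=%s\n" % (k, v) for k, v in entries if v)

# example placeholder values
print(render_ifcfg("eth0", {
    "hwaddr": "52:54:00:12:34:56", "address": "192.0.2.10",
    "netmask": "255.255.255.0", "gateway": "192.0.2.1",
    "dns-nameservers": ["192.0.2.53"], "dns-search": ["example.com"]}))
```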
[Yahoo-eng-team] [Bug 1645404] [NEW] NoCloud module incompletely configures network interfaces which leads to wrong /etc/resolv.conf configuration in RH-based distros
Public bug reported: Hello all! The NoCloud module doesn't put the dns-nameservers and dns-search (and hwaddr as well) values into the /etc/sysconfig/network-scripts/ifcfg-* files during configuration in Red Hat-based distros, putting them into /etc/resolv.conf instead. As a result, having no DNS1, DNS2 and SEARCH defined in the ifcfg-* files with PEERDNS=on (the default value), the /etc/resolv.conf settings get overwritten with settings obtained from the DHCP server if there's one around, even if BOOTPROTO=static is set. On the other hand, the /etc/resolv.conf settings are ignored completely with PEERDNS=off and the settings made by cloud-init remain untouched; the downside is that settings obtained over DHCP (if configured) are not written to /etc/resolv.conf either, disabling the ability to perform DNS queries. It would be better if NoCloud would put the dns-nameservers and dns-search values into the DNS1, DNS2 and SEARCH parameters in the ifcfg-* files when they are set. The only option so far is to hard-code the DNS1 and DNS2 settings in a custom-made cloud image. That will always set these nameservers in /etc/resolv.conf (until you edit them manually in the ifcfg-* file), but it guarantees that the system will always be able to perform DNS queries. The problem takes place in CentOS 7 with cloud-init 7.5 installed. ** Affects: cloud-init Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1645404 Title: NoCloud module incompletely configures network interfaces which leads to wrong /etc/resolv.conf configuration in RH-based distros Status in cloud-init: New Bug description: Hello all! The NoCloud module doesn't put the dns-nameservers and dns-search (and hwaddr as well) values into the /etc/sysconfig/network-scripts/ifcfg-* files during configuration in Red Hat-based distros, putting them into /etc/resolv.conf instead. As a result, having no DNS1, DNS2 and SEARCH defined in the ifcfg-* files with PEERDNS=on (the default value), the /etc/resolv.conf settings get overwritten with settings obtained from the DHCP server if there's one around, even if BOOTPROTO=static is set. On the other hand, the /etc/resolv.conf settings are ignored completely with PEERDNS=off and the settings made by cloud-init remain untouched; the downside is that settings obtained over DHCP (if configured) are not written to /etc/resolv.conf either, disabling the ability to perform DNS queries. It would be better if NoCloud would put the dns-nameservers and dns-search values into the DNS1, DNS2 and SEARCH parameters in the ifcfg-* files when they are set. The only option so far is to hard-code the DNS1 and DNS2 settings in a custom-made cloud image. That will always set these nameservers in /etc/resolv.conf (until you edit them manually in the ifcfg-* file), but it guarantees that the system will always be able to perform DNS queries. The problem takes place in CentOS 7 with cloud-init 7.5 installed. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1645404/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1601986] [NEW] RuntimeError: osrandom engine already registered
Public bug reported: Horizon errors with 500 Internal Server Error. The apache error.log logs an exception "RuntimeError: osrandom engine already registered", cf. the traceback below. We need to restart apache2 to recover. This happens in a non-deterministic way, i.e. Horizon will function correctly for some time after throwing this error. Versions: python-django-horizon 2:8.0.1-0ubuntu1~cloud0 apache2 2.4.7-1ubuntu4.10 libapache2-mod-wsgi 3.4-4ubuntu2.1.14.04.2 Traceback:

```
[Mon Jul 11 20:16:46.373640 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908] mod_wsgi (pid=2045796): Exception occurred processing WSGI script '/usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi'.
[Mon Jul 11 20:16:46.373681 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908] Traceback (most recent call last):
[Mon Jul 11 20:16:46.373697 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/dist-packages/django/core/handlers/wsgi.py", line 168, in __call__
[Mon Jul 11 20:16:46.390398 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     self.load_middleware()
[Mon Jul 11 20:16:46.390420 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 46, in load_middleware
[Mon Jul 11 20:16:46.390515 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     mw_instance = mw_class()
[Mon Jul 11 20:16:46.390525 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/dist-packages/django/middleware/locale.py", line 23, in __init__
[Mon Jul 11 20:16:46.394033 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     for url_pattern in get_resolver(None).url_patterns:
[Mon Jul 11 20:16:46.394052 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/dist-packages/django/core/urlresolvers.py", line 372, in url_patterns
[Mon Jul 11 20:16:46.394500 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
[Mon Jul 11 20:16:46.394516 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/dist-packages/django/core/urlresolvers.py", line 366, in urlconf_module
[Mon Jul 11 20:16:46.394533 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     self._urlconf_module = import_module(self.urlconf_name)
[Mon Jul 11 20:16:46.394540 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
[Mon Jul 11 20:16:46.410602 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     __import__(name)
[Mon Jul 11 20:16:46.410618 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/urls.py", line 35, in <module>
[Mon Jul 11 20:16:46.416197 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     url(r'^api/', include('openstack_dashboard.api.rest.urls')),
[Mon Jul 11 20:16:46.416219 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/dist-packages/django/conf/urls/__init__.py", line 28, in include
[Mon Jul 11 20:16:46.422868 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     urlconf_module = import_module(urlconf_module)
[Mon Jul 11 20:16:46.422882 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
[Mon Jul 11 20:16:46.422899 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     __import__(name)
[Mon Jul 11 20:16:46.422905 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/api/__init__.py", line 36, in <module>
[Mon Jul 11 20:16:46.432789 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     from openstack_dashboard.api import cinder
[Mon Jul 11 20:16:46.432803 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/api/cinder.py", line 30, in <module>
[Mon Jul 11 20:16:46.440814 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]     from cinderclient.v2.contrib import list_extensions as cinder_list_extensions
[Mon Jul 11 20:16:46.440829 2016] [:error] [pid 2045796:tid 139828791035648] [remote 172.16.4.81:33908]   File
```
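The error text suggests two WSGI threads racing to register the osrandom engine during import. A minimal sketch of an idempotent guard, assuming the cryptography releases of that era which exposed activate_osrandom_engine() on the OpenSSL backend; whether this is the right fix for Horizon is an assumption, not something the report confirms:

```
# Idempotent registration guard (sketch): tolerate a second registration
# attempt instead of letting RuntimeError take down the request.
from cryptography.hazmat.backends.openssl import backend

def ensure_osrandom_engine():
    try:
        backend.activate_osrandom_engine()
    except RuntimeError as exc:
        if "already registered" not in str(exc):
            raise  # some other failure; don't mask it
```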
[Yahoo-eng-team] [Bug 1597844] [NEW] Broken links to 9.x versions in developer documentation
Public bug reported: Steps to reproduce: 1. Visit http://docs.openstack.org/developer/horizon/index.html. 2. Click any of the links under "Other Versions" 3. Receive 404 Not Found. ** Affects: horizon Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1597844 Title: Broken links to 9.x versions in developer documentation Status in OpenStack Dashboard (Horizon): New Bug description: Steps to reproduce: 1. Visit http://docs.openstack.org/developer/horizon/index.html. 2. Click any of the links under "Other Versions" 3. Receive 404 Not Found. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1597844/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1580357] [NEW] Launch Instance NG does not auto select default security group or default to selecting the only network in a Project
Public bug reported: In the Mitaka Release of Horizon, the Launch Instance NG panels do not select the 'default' security group and if there is only one network in a Project it does not default to that network. In the Launch Instance Legacy workflow, this defaulted to the 'default' security group and if you only had one network it defaulted to select that network. In Mitaka, the Launch Instance Legacy workflow no longer selects the 'default' security group since it is defaulting to using the string instead of the id of the 'default' network and this change broke that [0]. I'm not expecting the Launch Instance Legacy workflow to be fixed, just documenting that fact since I switched back to that workflow expecting it to work like the Kilo dashboard. [0]: https://github.com/openstack/horizon/commit/5562694 ** Affects: horizon Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1580357 Title: Launch Instance NG does not auto select default security group or default to selecting the only network in a Project Status in OpenStack Dashboard (Horizon): New Bug description: In the Mitaka Release of Horizon, the Launch Instance NG panels do not select the 'default' security group and if there is only one network in a Project it does not default to that network. In the Launch Instance Legacy workflow, this defaulted to the 'default' security group and if you only had one network it defaulted to select that network. In Mitaka, the Launch Instance Legacy workflow no longer selects the 'default' security group since it is defaulting to using the string instead of the id of the 'default' network and this change broke that [0]. I'm not expecting the Launch Instance Legacy workflow to be fixed, just documenting that fact since I switched back to that workflow expecting it to work like the Kilo dashboard. [0]: https://github.com/openstack/horizon/commit/5562694 To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1580357/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1579667] [NEW] deleting a shelved_offloaded server causes a failure in cinder
Public bug reported: When deleting a VM instance in SHELVED_OFFLOADED state with a volume attached, nova passes a connector dictionary: connector = {'ip': '127.0.0.1', 'initiator': 'iqn.fake'} to cinder for terminate_connection; this causes a KeyError in cinder driver code. https://github.com/openstack/nova/blame/master/nova/compute/api.py#L1803

```
1803     def _local_cleanup_bdm_volumes(self, bdms, instance, context):
1804         """The method deletes the bdm records and, if a bdm is a volume, call
1805         the terminate connection and the detach volume via the Volume API.
1806         Note that at this point we do not have the information about the
1807         correct connector so we pass a fake one.
1808         """
1809         elevated = context.elevated()
1810         for bdm in bdms:
1811             if bdm.is_volume:
1812                 # NOTE(vish): We don't have access to correct volume
1813                 #             connector info, so just pass a fake
1814                 #             connector. This can be improved when we
1815                 #             expose get_volume_connector to rpc.
1816                 connector = {'ip': '127.0.0.1', 'initiator': 'iqn.fake'}
1817                 try:
1818                     self.volume_api.terminate_connection(context,
1819                                                          bdm.volume_id,
1820                                                          connector)
1821                     self.volume_api.detach(elevated, bdm.volume_id,
1822                                            instance.uuid)
1823                     if bdm.delete_on_termination:
1824                         self.volume_api.delete(context, bdm.volume_id)
1825                 except Exception as exc:
1826                     err_str = _LW("Ignoring volume cleanup failure due to %s")
1827                     LOG.warn(err_str % exc, instance=instance)
1828             bdm.destroy()
1829
```

https://github.com/openstack/nova/blame/master/nova/compute/api.py#L1828 According to my debugging, the connector info for terminate_connection is already there (in the bdm object), so Nova should build the correct connection info for terminate_connection. Steps to reproduce: 1. create a server: nova boot 2. shelve the server: nova shelve 3. delete the server: nova delete Thanks Peter ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1579667 Title: deleting a shelved_offloaded server causes a failure in cinder Status in OpenStack Compute (nova): New Bug description: When deleting a VM instance in SHELVED_OFFLOADED state with a volume attached, nova passes a connector dictionary: connector = {'ip': '127.0.0.1', 'initiator': 'iqn.fake'} to cinder for terminate_connection; this causes a KeyError in cinder driver code. https://github.com/openstack/nova/blame/master/nova/compute/api.py#L1803

```
1803     def _local_cleanup_bdm_volumes(self, bdms, instance, context):
1804         """The method deletes the bdm records and, if a bdm is a volume, call
1805         the terminate connection and the detach volume via the Volume API.
1806         Note that at this point we do not have the information about the
1807         correct connector so we pass a fake one.
1808         """
1809         elevated = context.elevated()
1810         for bdm in bdms:
1811             if bdm.is_volume:
1812                 # NOTE(vish): We don't have access to correct volume
1813                 #             connector info, so just pass a fake
1814                 #             connector. This can be improved when we
1815                 #             expose get_volume_connector to rpc.
1816                 connector = {'ip': '127.0.0.1', 'initiator': 'iqn.fake'}
1817                 try:
1818                     self.volume_api.terminate_connection(context,
1819                                                          bdm.volume_id,
1820                                                          connector)
1821                     self.volume_api.detach(elevated, bdm.volume_id,
1822                                            instance.uuid)
1823                     if bdm.delete_on_termination:
1824                         self.volume_api.delete(context, bdm.volume_id)
1825                 except Exception as exc:
1826                     err_str = _LW("Ignoring volume cleanup failure due to %s")
1827                     LOG.warn(err_str % exc, instance=instance)
1828             bdm.destroy()
1829
```

https://github.com/openstack/nova/blame/master/nova/compute/api.py#L1828
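The reporter's point is that a usable connector can often be recovered instead of hardcoding a fake one. A hedged sketch of that idea follows; the helper is hypothetical (not nova's actual fix), and whether the attach-time connector is recoverable from the stored connection_info is an assumption that varies by deployment:

```
# Hypothetical helper: prefer whatever connector information the BDM still
# carries from attach time over the hardcoded fake connector.
from oslo_serialization import jsonutils

def connector_for_cleanup(bdm):
    if bdm.connection_info:
        info = jsonutils.loads(bdm.connection_info)
        stored = info.get('connector')  # assumption: stashed at attach time
        if stored:
            return stored
    return {'ip': '127.0.0.1', 'initiator': 'iqn.fake'}  # last resort
```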
[Yahoo-eng-team] [Bug 1576710] [NEW] Glance config generator files need to be updated for oslo_db
Public bug reported: The current config generator files at etc/oslo-config-generator/* need to be updated to import the correct packages for oslo_db.concurrency. They currently import oslo.db.concurrency which does not exist in the Mitaka version of oslo.db. I'm currently using the python2-oslo-db-4.6.0-1 package in the RDO repositories to attempt to build these configuration files. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1576710 Title: Glance config generator files need to be updated for oslo_db Status in Glance: New Bug description: The current config generator files at etc/oslo-config-generator/* need to be updated to import the correct packages for oslo_db.concurrency. They currently import oslo.db.concurrency which does not exist in the Mitaka version of oslo.db. I'm currently using the python2-oslo-db-4.6.0-1 package in the RDO repositories to attempt to build these configuration files. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1576710/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
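The rename is the oslo namespace-package retirement: in that generation of the library the module is importable only as oslo_db. A quick check of the situation the report describes, under the reporter's stated oslo.db 4.6.0:

```
# Demonstrates the namespace change the config generator files must follow.
import importlib

importlib.import_module("oslo_db.concurrency")      # new namespace: works
try:
    importlib.import_module("oslo.db.concurrency")  # old namespace
except ImportError:
    print("oslo.db.concurrency is gone; the generator files must "
          "reference oslo_db.concurrency instead")
```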
[Yahoo-eng-team] [Bug 1573875] [NEW] Nova able to start a VM twice after a failed live migration
Public bug reported: Hi, I've faced a strange problem with nova. A few environmental details: - We use Ubuntu 14.04 LTS - We use Kilo from the Ubuntu cloud archive - We use KVM as hypervisor with the stock qemu 2.2 - We have Ceph as shared storage with libvirt-rbd devices - OVS neutron-based networking, but I think it's all the same with other solutions. The workflow needed to reproduce the bug: - Start a Windows guest (Linux distros are not affected as far as I saw) - Live migrate this VM to another host (okay, I know it doesn't fit 100% into the cloud concept, but we must use it) What happens then is really wrong behavior: - The VM starts to migrate (virsh list shows it on the new host) - On the source side, virsh list tells me the instance is stopped - After a few seconds, the destination host just removes the instance, and the source changes its state back to running - The network becomes unavailable - Horizon reports the instance is in shut off state, and it definitely is not (the VNC is still available, for example) - The user can click on the 'Start instance' button, and the instance will be started at the destination - We see these lines in the relevant libvirt log: "qemu-system-x86_64: load of migration failed: Invalid argument" After a few Google searches with this error, I found this site: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1472500 It's not the exact error, but it tells us a really important fact: those errors came with qemu 2.2 and were fixed in 2.3... First of all, I installed 2 CentOS compute nodes, which come with qemu 2.3 by default, and Windows migration started to work as Linux guests did before. Unfortunately, we must use Ubuntu, so we needed to find a workaround, which was done yesterday... What I did: - Added the Mitaka repository (which came out two days before) - Ran this command (I cannot dist-upgrade openstack now): apt-get install qemu-system qemu-system-arm qemu-system-common qemu-system-mips qemu-system-misc qemu-system-ppc qemu-system-sparc qemu-system-x86 qemu-utils seabios libvirt-bin - Left qemu 2.5 installed - The migration tests showed us these new packages solve the issue What I want/advise to repair this: - First of all, it would be nice to be able to install qemu 2.5 from the original Kilo repository and upgrade without any 'quick and dirty' method (adding and removing the Mitaka repo just to install qemu). This is urgent for us, because if we don't get this by next weekend I'll have to choose the quick and dirty way (but I don't want to rush anybody... just telling :) ) - If nova is able to start instances twice with the same rbd block device, it's a really big hole in the system I think... we corrupted 2 test Windows 7 guests with just a few clicks... Some safety check should be implemented which collects the instances (and their states) from kvm at any VM start, and if the algorithm sees there is a guest running with the same name (or some kind of uuid maybe), it just doesn't start another copy... - Some kind of check would also be useful which automatically compares the VM states in the database with the hypervisor side at a given interval (this check could be disabled, and the checking interval should be configurable imho). I haven't found any clue that these things were fixed on the nova side in Liberty or Mitaka... am I right, or did something escape my attention?
If any further information is needed, feel free to ask :) Regards, Peter ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1573875 Title: Nova able to start a VM twice after a failed live migration Status in OpenStack Compute (nova): New Bug description: Hi, I've faced a strange problem with nova. A few environmental details: - We use Ubuntu 14.04 LTS - We use Kilo from the Ubuntu cloud archive - We use KVM as hypervisor with the stock qemu 2.2 - We have Ceph as shared storage with libvirt-rbd devices - OVS neutron-based networking, but I think it's all the same with other solutions. The workflow needed to reproduce the bug: - Start a Windows guest (Linux distros are not affected as far as I saw) - Live migrate this VM to another host (okay, I know it doesn't fit 100% into the cloud concept, but we must use it) What happens then is really wrong behavior: - The VM starts to migrate (virsh list shows it on the new host) - On the source side, virsh list tells me the instance is stopped - After a few seconds, the destination host just removes the instance, and the source changes its state back to running - The network becomes unavailable - Horizon reports the instance is in shut off state, and it definitely is not (the VNC is still available, for example) - The user can click
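The safety check suggested above could look roughly like this: before (re)starting an instance, ask libvirt whether a domain with that UUID is already active. A minimal sketch using the libvirt Python bindings; the function, and the idea of wiring it into nova's start path, are hypothetical:

```
# Hypothetical pre-start guard: refuse to start an instance whose libvirt
# domain is already active on the host we can reach via conn_uri.
import libvirt

def is_domain_active(conn_uri, instance_uuid):
    conn = libvirt.open(conn_uri)
    try:
        try:
            dom = conn.lookupByUUIDString(instance_uuid)
        except libvirt.libvirtError:
            return False  # no such domain on this host
        return dom.isActive() == 1
    finally:
        conn.close()
```

A full fix would also have to consult the other compute nodes (or the database), since the double-start in this report involves source and destination hosts disagreeing.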
[Yahoo-eng-team] [Bug 1568809] [NEW] Add new fixed IP to existing Neutron port, if one is attached
Public bug reported: Currently, the "Attach Interface" button in dashboard always creates and attaches a new interface, no matter whether the instance already has an interface attached from a given subnet. However, from the networking/routing standpoint, it doesn't make sense to have a VM connected to the same subnet via two different "physical" interfaces (eth0 and eth1 for example) - this makes the second and all the subsequent IPs from that subnet unreachable unless you set up source-based routing (inside the VM). So, instead of creating a new Neutron port for each fixed IP, it might be better to update an existing port in that subnet, if one has already been attached to the instance. This is compatible with the Unix view of more IPs in one subnet - they work on the same base MAC address and cooperate with DHCP nicely. Moreover, such aliases don't confuse the routing table and these IPs work "out of the box". ** Affects: horizon Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1568809 Title: Add new fixed IP to existing Neutron port, if one is attached Status in OpenStack Dashboard (Horizon): New Bug description: Currently, the "Attach Interface" button in dashboard always creates and attaches a new interface, no matter whether the instance already has an interface attached from a given subnet. However, from the networking/routing standpoint, it doesn't make sense to have a VM connected to the same subnet via two different "physical" interfaces (eth0 and eth1 for example) - this makes the second and all the subsequent IPs from that subnet unreachable unless you set up source-based routing (inside the VM). So, instead of creating a new Neutron port for each fixed IP, it might be better to update an existing port in that subnet, if one has already been attached to the instance. This is compatible with the Unix view of more IPs in one subnet - they work on the same base MAC address and cooperate with DHCP nicely. Moreover, such aliases don't confuse the routing table and these IPs work "out of the box". To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1568809/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
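The suggested behaviour maps to a straightforward port update in the Neutron API. A sketch of the idea with python-neutronclient; the helper is hypothetical and skips error handling:

```
# Hypothetical helper: if the instance already has a port on the target
# subnet, append a new fixed IP to that port instead of creating and
# attaching a second interface.
def add_ip_on_existing_port(neutron, device_id, subnet_id):
    ports = neutron.list_ports(device_id=device_id)["ports"]
    for port in ports:
        if any(ip["subnet_id"] == subnet_id for ip in port["fixed_ips"]):
            fixed_ips = port["fixed_ips"] + [{"subnet_id": subnet_id}]
            return neutron.update_port(port["id"],
                                       {"port": {"fixed_ips": fixed_ips}})
    return None  # no port on that subnet yet; fall back to attach-interface
```

The new address then appears on the same MAC, which is exactly the alias behaviour the report describes as working "out of the box".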
[Yahoo-eng-team] [Bug 1563529] [NEW] "Boot from volume" fails on image size restriction
Public bug reported: When launching an instance with options "Boot from volume" or "Boot from volume snapshot (creates a new volume)", Horizon (rightfully) doesn't make it possible to specify volume size. Yet it uses the default "volume_size" value of 1, which is defined as initial in the form [0]. When Nova API processes this request, it also tries to verify the new volume size vs. minimum disk size, because "volume_size" is set [1]. This check will fail if your image requires a disk more than 1 GiB in size. I naively tried just not sending "volume_size" to the Nova API, but this triggered some other errors. Preferably, Horizon should be setting the "volume_size" according to the volume or snapshot that has been chosen, similar to booting from image [2]. As a quick fix, you can set the disk requirements of your images to no more than 1 GiB. But you also have to re-create the affected volumes, since this min_disk metadata gets burned into them. -- Steps to reproduce: 0) Create a Glance image with a requirement of more than 1 GiB of disk space (2 if you don't trust yourself with the GB/GiB fuss) 1) Navigate to Compute -> Volumes and create a new bootable volume from that image, with plenty of space (10 GiB). 2) Launch an instance from that volume, either from Volumes or Instances screen. 3) Enjoy the VolumeSmallerThanMinDisk exception. Note that the CLI-equivalent process works as expected, for obvious reasons. -- [0] https://github.com/openstack/horizon/blob/stable/liberty/openstack_dashboard/dashboards/project/instances/workflows/create_instance.py#L116 [1] https://github.com/openstack/nova/blob/stable/liberty/nova/compute/api.py#L725 [2] https://github.com/openstack/horizon/blob/stable/liberty/openstack_dashboard/dashboards/project/instances/workflows/create_instance.py#L444 ** Affects: horizon Importance: Undecided Status: New ** Description changed:

  When launching an instance with options "Boot from volume" or "Boot from volume snapshot (creates a new volume)", Horizon (rightfully) doesn't make it possible to specify volume size. Yet it uses the default "volume_size" value of 1, which is defined as initial in the form [0]. When Nova API processes this request, it also tries to verify the new
- volume size vs. minimum image size, because "volume_size" is set [1].
+ volume size vs. minimum disk size, because "volume_size" is set [1].
  This check will fail if your image requires a disk more than 1 GiB in size. I naively tried just not sending "volume_size" to the Nova API, but this triggered some other errors. Preferably, Horizon should be setting the "volume_size" accordingly to the volume or snapshot that has been chosen, similar to booting from image [2]. As a quick fix, you can set the disk requirements of your images to no more than 1 GiB. But you also have to re-create the affected volumes, since this min_disk metadata gets burned into them. -- Steps to reproduce:
- 0) Create a Glance image with a requirement of more then 1 GiB of disk space (2 if you don't trust yourself with the GB/GiB fuss)
+ 0) Create a Glance image with a requirement of more than 1 GiB of disk space (2 if you don't trust yourself with the GB/GiB fuss)
  1) Navigate to Compute -> Volumes and create a new bootable volume from that image, with plenty of space (10 GiB). 2) Launch an instance from that volume, either from Volumes or Instances screen. 3) Enjoy the VolumeSmallerThanMinDisk exception. Note that the CLI-equivalent process works as expected, for obvious reasons.
-- [0] https://github.com/openstack/horizon/blob/stable/liberty/openstack_dashboard/dashboards/project/instances/workflows/create_instance.py#L116 [1] https://github.com/openstack/nova/blob/stable/liberty/nova/compute/api.py#L725 [2] https://github.com/openstack/horizon/blob/stable/liberty/openstack_dashboard/dashboards/project/instances/workflows/create_instance.py#L444 -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1563529 Title: "Boot from volume" fails on image size restriction Status in OpenStack Dashboard (Horizon): New Bug description: When launching an instance with options "Boot from volume" or "Boot from volume snapshot (creates a new volume)", Horizon (rightfully) doesn't make it possible to specify volume size. Yet it uses the default "volume_size" value of 1, which is defined as initial in the form [0]. When Nova API processes this request, it also tries to verify the new volume size vs. minimum disk size, because "volume_size" is set [1]. This check will fail if your image requires a disk more than 1 GiB in size. I naively tried just not sending "volume_size" to the Nova API, but this triggered some other errors. Preferably, Horizon should be setting
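The fix the reporter proposes is just to derive volume_size from the chosen source object instead of the form's initial value. A minimal sketch of that logic (hypothetical, not Horizon's actual code):

```
# Hypothetical helper: pick the volume_size nova should be given for each
# boot source, instead of always sending the form's initial value of 1.
def effective_volume_size(source_type, source, form_value=1):
    if source_type in ("volume", "volume_snapshot"):
        return source.size  # size in GiB carried by the cinder object
    return form_value       # image boot: the user-specified size applies
```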
[Yahoo-eng-team] [Bug 1486583] [NEW] Orchestration Stacks has no button descriptions in the Russian language
Public bug reported: Orchestration Stacks has no button descriptions in the Russian language for Check Stack, Suspend Stack, Resume Stack, Delete Stack. ** Affects: horizon Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1486583 Title: Orchestration Stacks has no button descriptions in the Russian language Status in OpenStack Dashboard (Horizon): New Bug description: Orchestration Stacks has no button descriptions in the Russian language for Check Stack, Suspend Stack, Resume Stack, Delete Stack. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1486583/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1474279] [NEW] FWaaS leaves connections open after deleting an allow rule, because of conntrack
Public bug reported: Hi, I've faced a problem with the FWaaS plugin in Neutron (Juno). The firewall works, but when I delete a rule from the policy, the connection still works because of conntrack... (I tried with ping and ssh.) It's okay for the connection to be kept alive if it's really alive (an active SSH, for example), but if I delete the ICMP rule, stop pinging, and restart pinging, the ping still works... If I go to my neutron server and run a conntrack -F command in the relevant qrouter namespace, the firewall starts working based on the valid rules... Is there any way to configure a conntrack cleanup when the FWaaS configuration is modified by the user? If not, can somebody help me with where to make changes in the code, to run that command in the proper namespace after the iptables rule generation? Regards, Peter ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1474279 Title: FWaaS leaves connections open after deleting an allow rule, because of conntrack Status in neutron: New Bug description: Hi, I've faced a problem with the FWaaS plugin in Neutron (Juno). The firewall works, but when I delete a rule from the policy, the connection still works because of conntrack... (I tried with ping and ssh.) It's okay for the connection to be kept alive if it's really alive (an active SSH, for example), but if I delete the ICMP rule, stop pinging, and restart pinging, the ping still works... If I go to my neutron server and run a conntrack -F command in the relevant qrouter namespace, the firewall starts working based on the valid rules... Is there any way to configure a conntrack cleanup when the FWaaS configuration is modified by the user? If not, can somebody help me with where to make changes in the code, to run that command in the proper namespace after the iptables rule generation? Regards, Peter To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1474279/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
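To the reporter's question about where such a cleanup would hook in: conceptually, after the FWaaS driver re-applies the iptables rules for a router, the agent would flush conntrack inside that router's namespace. A minimal sketch of the operation itself, assuming conntrack-tools is installed and the l3-agent's usual qrouter-<router_id> namespace naming; wiring it into the driver is left as the open question it is in the report:

```
# Flush conntrack entries inside a router namespace, which is what the
# reporter does by hand today after every FWaaS policy change.
import subprocess

def flush_router_conntrack(router_id):
    ns = "qrouter-%s" % router_id
    subprocess.check_call(["ip", "netns", "exec", ns, "conntrack", "-F"])
```

A more surgical variant would delete only the entries matching the removed rule (conntrack -D with protocol/port filters) rather than flushing everything, so unrelated established connections survive.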
[Yahoo-eng-team] [Bug 1451860] [NEW] Attached volume migration fails due to incorrect argument order passed to swap_volume
Public bug reported: Steps to reproduce: 1. create a volume in cinder 2. boot a server from an image in nova 3. attach this volume to the server 4. use 'cinder migrate --force-host-copy True 3fa956b6-ba59-46df-8a26-97fcbc18fc82 openstack-wangp11-02@pool_backend_1#Pool_1' Log from nova-compute (see the attachment for detailed info):

```
2015-05-05 00:33:31.768 ERROR root [req-b8424cde-e126-41b0-a27a-ef675e0c207f admin admin] Original exception being dropped: ['Traceback (most recent call last):\n', '  File /opt/stack/nova/nova/compute/manager.py, line 351, in decorated_function\n    return function(self, context, *args, **kwargs)\n', '  File /opt/stack/nova/nova/compute/manager.py, line 4982, in swap_volume\n    context, old_volume_id, instance_uuid=instance.uuid)\n', AttributeError: 'unicode' object has no attribute 'uuid'\n]
```

According to my debugging:

```
# here are the parameters passed to swap_volume
def swap_volume(self, ctxt, instance, old_volume_id, new_volume_id):
    return self.manager.swap_volume(ctxt, instance, old_volume_id,
                                    new_volume_id)

# the swap_volume function
@wrap_exception()
@reverts_task_state
@wrap_instance_fault
def swap_volume(self, context, old_volume_id, new_volume_id, instance):
    """Swap volume for an instance."""
    context = context.elevated()
    bdm = objects.BlockDeviceMapping.get_by_volume_id(
        context, old_volume_id, instance_uuid=instance.uuid)
    connector = self.driver.get_volume_connector(instance)
```

You can see that the order passed in is (self, ctxt, instance, old_volume_id, new_volume_id) while the function definition is (self, context, old_volume_id, new_volume_id, instance). This causes the "'unicode' object has no attribute 'uuid'" error when trying to access instance.uuid. BTW: this problem was introduced in https://review.openstack.org/#/c/172152 and affects both Kilo and master. Thanks Peter ** Affects: nova Importance: Undecided Status: New ** Tags: nova volume-migration ** Attachment added: screen-n-cpu.log https://bugs.launchpad.net/bugs/1451860/+attachment/4391417/+files/screen-n-cpu.log -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1451860 Title: Attached volume migration fails due to incorrect argument order passed to swap_volume Status in OpenStack Compute (Nova): New Bug description: Steps to reproduce: 1. create a volume in cinder 2. boot a server from an image in nova 3. attach this volume to the server 4. use 'cinder migrate --force-host-copy True 3fa956b6-ba59-46df-8a26-97fcbc18fc82 openstack-wangp11-02@pool_backend_1#Pool_1' Log from nova-compute (see the attachment for detailed info):

```
2015-05-05 00:33:31.768 ERROR root [req-b8424cde-e126-41b0-a27a-ef675e0c207f admin admin] Original exception being dropped: ['Traceback (most recent call last):\n', '  File /opt/stack/nova/nova/compute/manager.py, line 351, in decorated_function\n    return function(self, context, *args, **kwargs)\n', '  File /opt/stack/nova/nova/compute/manager.py, line 4982, in swap_volume\n    context, old_volume_id, instance_uuid=instance.uuid)\n', AttributeError: 'unicode' object has no attribute 'uuid'\n]
```

According to my debugging:

```
# here are the parameters passed to swap_volume
def swap_volume(self, ctxt, instance, old_volume_id, new_volume_id):
    return self.manager.swap_volume(ctxt, instance, old_volume_id,
                                    new_volume_id)

# the swap_volume function
@wrap_exception()
@reverts_task_state
@wrap_instance_fault
def swap_volume(self, context, old_volume_id, new_volume_id, instance):
    """Swap volume for an instance."""
    context = context.elevated()
    bdm = objects.BlockDeviceMapping.get_by_volume_id(
        context, old_volume_id, instance_uuid=instance.uuid)
    connector = self.driver.get_volume_connector(instance)
```

You can see that the order passed in is (self, ctxt, instance, old_volume_id, new_volume_id) while the function definition is (self, context, old_volume_id, new_volume_id, instance). This causes the "'unicode' object has no attribute 'uuid'" error when trying to access instance.uuid. BTW: this problem was introduced in https://review.openstack.org/#/c/172152 and affects both Kilo and master. Thanks Peter To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1451860/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
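The bug class is easy to reproduce in isolation, and keyword arguments make it impossible. A minimal standalone sketch (not nova code):

```
# Positional call sites silently rebind arguments when a signature changes;
# with keywords, the same mistake fails loudly at the call site instead.
def swap_volume(context, old_volume_id, new_volume_id, instance):
    return instance.uuid  # AttributeError when 'instance' is really a str

# Mirrors the bug: 'instance' lands in old_volume_id, and a volume-id
# string lands in 'instance':
#   swap_volume(ctxt, instance, old_vol, new_vol)
# Safe equivalent:
#   swap_volume(ctxt, old_volume_id=old_vol, new_volume_id=new_vol,
#               instance=instance)
```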
[Yahoo-eng-team] [Bug 1447490] [NEW] Deletion of instances will be stuck forever if any deletion hangs in 'multipath -r'
Public bug reported: I created about 25 VMs from bootable volumes; after finishing this, I ran a script to delete all of them in a very short time. What I saw was: all of the VMs were in 'deleting' status and would never be deleted even after waiting for hours. From the ps cmd:

```
stack@ubuntu-server13:/var/log/libvirt$ ps aux | grep multipath
root      8205  0.0  0.0 504988  5560 ?      SLl  Apr22  0:01 /sbin/multipathd
root    115515  0.0  0.0  64968  2144 pts/3  S+   Apr22  0:00 sudo nova-rootwrap /etc/nova/rootwrap.conf multipath -r
root    115516  0.0  0.0  42240  9488 pts/3  S+   Apr22  0:00 /usr/bin/python /usr/local/bin/nova-rootwrap /etc/nova/rootwrap.conf multipath -r
root    115525  0.0  0.0  41792  2592 pts/3  S+   Apr22  0:00 /sbin/multipath -r
stack   151825  0.0  0.0  11744   936 pts/0  S+   02:10  0:00 grep --color=auto multipath
```

Then I killed the multipath -r commands and all VMs ran into ERROR status. After digging into the nova code: nova always tries to take a global file lock:

```
@utils.synchronized('connect_volume')
def disconnect_volume(self, connection_info, disk_dev):
    """Detach the volume from instance_name."""
    iscsi_properties = connection_info['data']
    ..
    if self.use_multipath and multipath_device:
        return self._disconnect_volume_multipath_iscsi(iscsi_properties,
                                                       multipath_device)
```

and then rescans iscsi via 'multipath -r':

```
def _disconnect_volume_multipath_iscsi(self, iscsi_properties,
                                       multipath_device):
    self._rescan_iscsi()
    self._rescan_multipath()  # ---> self._run_multipath('-r', check_exit_code=[0, 1, 21])
```

In my case, 'multipath -r' hung for a very long time and did not exit for several hours; in addition, this blocked all deletion of VM instances on the same Nova node. IMO, Nova should not wait on the blocking command forever; at least a timeout is needed for commands such as 'multipath -r' and 'multipath -ll'. Or is there any other solution for my case? MY ENVIRONMENT: Ubuntu Server 14: multipath-tools, multipath enabled on the Nova node. Thanks Peter ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1447490 Title: Deletion of instances will be stuck forever if any deletion hangs in 'multipath -r' Status in OpenStack Compute (Nova): New Bug description: I created about 25 VMs from bootable volumes; after finishing this, I ran a script to delete all of them in a very short time. What I saw was: all of the VMs were in 'deleting' status and would never be deleted even after waiting for hours. From the ps cmd:

```
stack@ubuntu-server13:/var/log/libvirt$ ps aux | grep multipath
root      8205  0.0  0.0 504988  5560 ?      SLl  Apr22  0:01 /sbin/multipathd
root    115515  0.0  0.0  64968  2144 pts/3  S+   Apr22  0:00 sudo nova-rootwrap /etc/nova/rootwrap.conf multipath -r
root    115516  0.0  0.0  42240  9488 pts/3  S+   Apr22  0:00 /usr/bin/python /usr/local/bin/nova-rootwrap /etc/nova/rootwrap.conf multipath -r
root    115525  0.0  0.0  41792  2592 pts/3  S+   Apr22  0:00 /sbin/multipath -r
stack   151825  0.0  0.0  11744   936 pts/0  S+   02:10  0:00 grep --color=auto multipath
```

Then I killed the multipath -r commands and all VMs ran into ERROR status. After digging into the nova code: nova always tries to take a global file lock:

```
@utils.synchronized('connect_volume')
def disconnect_volume(self, connection_info, disk_dev):
    """Detach the volume from instance_name."""
    iscsi_properties = connection_info['data']
    ..
    if self.use_multipath and multipath_device:
        return self._disconnect_volume_multipath_iscsi(iscsi_properties,
                                                       multipath_device)
```

and then rescans iscsi via 'multipath -r':

```
def _disconnect_volume_multipath_iscsi(self, iscsi_properties,
                                       multipath_device):
    self._rescan_iscsi()
    self._rescan_multipath()  # ---> self._run_multipath('-r', check_exit_code=[0, 1, 21])
```

In my case, 'multipath -r' hung for a very long time and did not exit for several hours; in addition, this blocked all deletion of VM instances on the same Nova node. IMO, Nova should not wait on the blocking command forever; at least a timeout is needed for commands such as 'multipath -r' and 'multipath -ll'. Or is there any other solution for my case? MY ENVIRONMENT: Ubuntu Server 14: multipath-tools, multipath enabled on the Nova node. Thanks Peter To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1447490/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team
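The timeout the reporter asks for can be expressed directly at the process-execution layer. A minimal sketch, assuming Python 3's subprocess timeout rather than nova's own processutils wrapper (which would need an equivalent knob):

```
# Bound how long a multipath invocation may run so a hang cannot hold the
# global 'connect_volume' lock forever.
import subprocess

def run_multipath(args, timeout=60):
    try:
        return subprocess.run(
            ["multipath"] + list(args), timeout=timeout, check=False,
            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except subprocess.TimeoutExpired:
        # The child is killed; the caller can fail this one detach instead
        # of wedging every attach/detach on the compute node.
        raise
```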
[Yahoo-eng-team] [Bug 1434169] [NEW] Unable to provision instance without network
Public bug reported: When running nova boot without a --nic option, nova will try to provision an interface for it, even for shared networks where the user does not have permission to do so. There is no way to provision an instance without a network in an installation where a shared network is present. ** Affects: nova Importance: Undecided Status: New ** Tags: canonical-bootstack -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1434169 Title: Unable to provision instance without network Status in OpenStack Compute (Nova): New Bug description: When running nova boot without a --nic option, nova will try to provision an interface for it, even for shared networks where the user does not have permission to do so. There is no way to provision an instance without a network in an installation where a shared network is present. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1434169/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1366649] Re: Typo in keystone/common/base64utils.py
** Changed in: keystone Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Keystone. https://bugs.launchpad.net/bugs/1366649 Title: Typo in keystone/common/base64utils.py Status in OpenStack Identity (Keystone): Invalid Bug description: Typo in keystone/common/base64utils.py To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1366649/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1334622] [NEW] Javascript issues found by jshint
Public bug reported: Running the jshint code-analysis tool for JavaScript on the horizon codebase found 10 errors:

```
horizon/static/horizon/js/horizon.d3linechart.js: line 415, col 11, It's not necessary to initialize 'last_point' to 'undefined'.
horizon/static/horizon/js/horizon.d3linechart.js: line 415, col 35, It's not necessary to initialize 'last_point_color' to 'undefined'.
horizon/static/horizon/js/horizon.d3piechart.js: line 145, col 10, ['key'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 146, col 10, ['value'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 162, col 44, ['value'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 236, col 27, ['key'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 237, col 33, ['value'] is better written in dot notation.
horizon/static/horizon/js/horizon.forms.js: line 8, col 26, Use '===' to compare with ''.
horizon/static/horizon/js/horizon.forms.js: line 25, col 26, Use '===' to compare with ''.
horizon/static/horizon/js/horizon.forms.js: line 42, col 26, Use '===' to compare with ''.
```

The exact commands:

```
jshint horizon/static/horizon/js
jshint horizon/static/horizon/tests
```

** Affects: horizon Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1334622 Title: Javascript issues found by jshint Status in OpenStack Dashboard (Horizon): New Bug description: Running the jshint code-analysis tool for JavaScript on the horizon codebase found 10 errors:

```
horizon/static/horizon/js/horizon.d3linechart.js: line 415, col 11, It's not necessary to initialize 'last_point' to 'undefined'.
horizon/static/horizon/js/horizon.d3linechart.js: line 415, col 35, It's not necessary to initialize 'last_point_color' to 'undefined'.
horizon/static/horizon/js/horizon.d3piechart.js: line 145, col 10, ['key'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 146, col 10, ['value'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 162, col 44, ['value'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 236, col 27, ['key'] is better written in dot notation.
horizon/static/horizon/js/horizon.d3piechart.js: line 237, col 33, ['value'] is better written in dot notation.
horizon/static/horizon/js/horizon.forms.js: line 8, col 26, Use '===' to compare with ''.
horizon/static/horizon/js/horizon.forms.js: line 25, col 26, Use '===' to compare with ''.
horizon/static/horizon/js/horizon.forms.js: line 42, col 26, Use '===' to compare with ''.
```

The exact commands:

```
jshint horizon/static/horizon/js
jshint horizon/static/horizon/tests
```

To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1334622/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1309079] [NEW] Running update-grub-legacy-ec2 doesn't cause /boot/grub/menu.lst to be updated
Public bug reported: This bug has been observed with at least EC2 AMI: ami-ec50a19b ubuntu/images/ebs/ubuntu-saucy-13.10-amd64-server-20140212, and other Ubuntu saucy AMIs. Steps to reproduce: 1. sudo apt-get update 2. sudo apt-get dist-upgrade 3. Get prompted for one thing, whether to install the maintainer's version of /boot/grub/menu.lst (choose to use the maintainer's file) 4. observe the updated file 5. modify the file to remove the new entries 6. run update-grub-legacy-ec2 7. observe the file is not updated This is especially insidious because it claims the file is updated, with output like this:

```
Found kernel: /boot/vmlinuz-3.11.0-19-generic
Found kernel: /boot/vmlinuz-3.11.0-18-generic
Found kernel: /boot/vmlinuz-3.11.0-17-generic
Found kernel: /boot/vmlinuz-3.11.0-15-generic
Found kernel: /boot/vmlinuz-3.11.0-14-generic
Found kernel: /boot/vmlinuz-3.11.0-12-generic
Found kernel: /boot/memtest86+.bin
Updating /boot/grub/menu.lst ... done
```

And in fact, /boot/grub/menu.lst has its modified time updated to now, but it doesn't actually change the contents of the file. This is extremely confusing. I note that UCF_FORCE_CONFFNEW=1 and DEBIAN_PRIORITY=low and:

```
# debconf-get-selections | grep grub-l
grub-legacy-ec2 grub/update_grub_changeprompt_threeway select install_new
```

don't have any useful effect. I've emailed Ben Howard who confirmed he could reproduce this bug. ** Affects: cloud-init Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init. https://bugs.launchpad.net/bugs/1309079 Title: Running update-grub-legacy-ec2 doesn't cause /boot/grub/menu.lst to be updated Status in Init scripts for use on cloud images: New Bug description: This bug has been observed with at least EC2 AMI: ami-ec50a19b ubuntu/images/ebs/ubuntu-saucy-13.10-amd64-server-20140212, and other Ubuntu saucy AMIs. Steps to reproduce: 1. sudo apt-get update 2. sudo apt-get dist-upgrade 3. Get prompted for one thing, whether to install the maintainer's version of /boot/grub/menu.lst (choose to use the maintainer's file) 4. observe the updated file 5. modify the file to remove the new entries 6. run update-grub-legacy-ec2 7. observe the file is not updated This is especially insidious because it claims the file is updated, with output like this:

```
Found kernel: /boot/vmlinuz-3.11.0-19-generic
Found kernel: /boot/vmlinuz-3.11.0-18-generic
Found kernel: /boot/vmlinuz-3.11.0-17-generic
Found kernel: /boot/vmlinuz-3.11.0-15-generic
Found kernel: /boot/vmlinuz-3.11.0-14-generic
Found kernel: /boot/vmlinuz-3.11.0-12-generic
Found kernel: /boot/memtest86+.bin
Updating /boot/grub/menu.lst ... done
```

And in fact, /boot/grub/menu.lst has its modified time updated to now, but it doesn't actually change the contents of the file. This is extremely confusing. I note that UCF_FORCE_CONFFNEW=1 and DEBIAN_PRIORITY=low and:

```
# debconf-get-selections | grep grub-l
grub-legacy-ec2 grub/update_grub_changeprompt_threeway select install_new
```

don't have any useful effect. I've emailed Ben Howard who confirmed he could reproduce this bug. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1309079/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1291637] [NEW] memcache client race
Public bug reported: Nova uses thread-unsafe memcache client objects in multiple threads. For instance, nova-api's metadata WSGI server uses the same nova.api.metadata.handler.MetadataRequestHandler._cache object for every request. A memcache client object is thread-unsafe because it has a single open socket connection to memcached; multiple threads will therefore read from and write to the same socket fd.

Keystoneclient has the same bug. See https://bugs.launchpad.net/python-keystoneclient/+bug/1289074 for a patch to fix the problem.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1291637
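For context, below is a minimal sketch of the usual fix pattern, giving each thread its own client via threading.local. It assumes the python-memcached library; ThreadLocalCache is a hypothetical name for illustration, not the class from the keystoneclient patch.

```
# A minimal sketch of the thread-local-client pattern, assuming the
# python-memcached library. ThreadLocalCache is a hypothetical name,
# not the actual class from the keystoneclient patch.
import threading

import memcache


class ThreadLocalCache(object):
    """Give each thread its own memcache.Client.

    memcache.Client keeps one socket per server, so sharing a single
    instance across threads interleaves reads and writes on the same
    fd. A thread-local client avoids the race without any locking.
    """

    def __init__(self, servers):
        self.servers = servers
        self._local = threading.local()

    @property
    def client(self):
        # Lazily create one client per thread on first use.
        if not hasattr(self._local, 'client'):
            self._local.client = memcache.Client(self.servers)
        return self._local.client

    def get(self, key):
        return self.client.get(key)

    def set(self, key, value, time=0):
        return self.client.set(key, value, time)


# Usage: every request-handler thread transparently gets its own socket.
cache = ThreadLocalCache(['127.0.0.1:11211'])
cache.set('instance-uuid', 'metadata-blob')
print(cache.get('instance-uuid'))
```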
[Yahoo-eng-team] [Bug 1273273] [NEW] if 'keystone-manage db_sync' is run without 'sudo', there should be a prompt that tables were not created
Public bug reported: Below are the details. I have set up the keystone database in MySQL and granted all privileges to the keystone user. When I execute 'keystone-manage db_sync':

peter@openstack:~$ keystone-manage db_sync
peter@openstack:~$

the command appears to succeed, but no tables are actually created in the keystone database. [There should be an error indicating that no tables were created due to insufficient privileges.] Only when run as:

peter@openstack:~$ sudo keystone-manage db_sync
peter@openstack:~$

are all tables created in the keystone database correctly.

Summary: a prompt here would be very useful, letting users proceed with installing keystone successfully.

** Affects: keystone
   Importance: Undecided
   Status: New

** Tags: keystone

https://bugs.launchpad.net/bugs/1273273
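Until such a prompt exists, the silent failure can be detected by checking whether the schema is actually present. Below is a minimal sketch using SQLAlchemy; the connection URL and credentials are illustrative assumptions, not keystone defaults.

```
# A minimal sketch of verifying that db_sync actually created tables,
# assuming SQLAlchemy and a keystone database reachable with the URL
# below (the URL and credentials are illustrative).
import sys

from sqlalchemy import create_engine, inspect

engine = create_engine('mysql+pymysql://keystone:secret@localhost/keystone')
tables = inspect(engine).get_table_names()

if not tables:
    # This is the loud failure the bug asks keystone-manage to emit.
    sys.exit("db_sync left the keystone database empty; "
             "check privileges and which config file was read")

print("keystone schema present: %d tables" % len(tables))
```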
[Yahoo-eng-team] [Bug 1269204] [NEW] tempest.api.compute.servers.test_server_rescue.ServerRescueTestJSON.test_rescue_paused_instance fails sporadically in gate jobs
Public bug reported: tempest.api.compute.servers.test_server_rescue.ServerRescueTestJSON.test_rescue_paused_instance fails sporadically in gate jobs.

See: http://logs.openstack.org/08/66108/3/check/check-tempest-dsvm-full/fdbbfd3/console.html

2014-01-15 00:40:01.542 | FAIL: tempest.api.compute.servers.test_server_rescue.ServerRescueTestJSON.test_rescue_paused_instance[gate,negative]
2014-01-15 00:40:01.542 | tempest.api.compute.servers.test_server_rescue.ServerRescueTestJSON.test_rescue_paused_instance[gate,negative]
2014-01-15 00:40:01.544 | _StringException: Empty attachments:
2014-01-15 00:40:01.544 |   stderr
2014-01-15 00:40:01.544 |   stdout
2014-01-15 00:40:01.544 | pythonlogging:'': {{{
2014-01-15 00:40:01.544 | 2014-01-15 00:12:20,256 Request: POST http://127.0.0.1:8774/v2/64edf5122f2d486682ecfaa53bd071b6/servers/83997448-4700-4333-8022-99328e6b5e1f/action
2014-01-15 00:40:01.545 | 2014-01-15 00:12:20,256 Request Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': 'Token omitted'}
2014-01-15 00:40:01.545 | 2014-01-15 00:12:20,256 Request Body: {"pause": {}}
2014-01-15 00:40:01.545 | 2014-01-15 00:12:20,916 Response Status: 202
2014-01-15 00:40:01.545 | 2014-01-15 00:12:20,916 Nova request id: req-f716db87-36c7-46be-8672-9f5fbecd2c04
2014-01-15 00:40:01.545 | 2014-01-15 00:12:20,916 Response Headers: {'content-length': '0', 'date': 'Wed, 15 Jan 2014 00:12:20 GMT', 'content-type': 'text/html; charset=UTF-8', 'connection': 'close'}
. . .
2014-01-15 00:40:01.934 | 2014-01-15 00:15:37,090 Request: POST http://127.0.0.1:8774/v2/64edf5122f2d486682ecfaa53bd071b6/servers/83997448-4700-4333-8022-99328e6b5e1f/action
2014-01-15 00:40:01.934 | 2014-01-15 00:15:37,090 Request Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': 'Token omitted'}
2014-01-15 00:40:01.934 | 2014-01-15 00:15:37,090 Request Body: {"unpause": {}}
2014-01-15 00:40:01.935 | 2014-01-15 00:15:37,138 Response Status: 409
2014-01-15 00:40:01.935 | 2014-01-15 00:15:37,138 Nova request id: req-ff9827ce-fbae-4f3b-addf-8e0ddf6a29a7
2014-01-15 00:40:01.935 | 2014-01-15 00:15:37,138 Response Headers: {'content-length': '105', 'date': 'Wed, 15 Jan 2014 00:15:37 GMT', 'content-type': 'application/json; charset=UTF-8', 'connection': 'close'}
2014-01-15 00:40:01.935 | 2014-01-15 00:15:37,138 Response Body: {"conflictingRequest": {"message": "Cannot 'unpause' while instance is in vm_state active", "code": 409}}
2014-01-15 00:40:01.935 | }}}
2014-01-15 00:40:01.935 |
2014-01-15 00:40:01.935 | traceback-1: {{{
2014-01-15 00:40:01.936 | Traceback (most recent call last):
2014-01-15 00:40:01.936 |   File "tempest/api/compute/servers/test_server_rescue.py", line 107, in _unpause
2014-01-15 00:40:01.936 |     resp, body = self.servers_client.unpause_server(server_id)
2014-01-15 00:40:01.936 |   File "tempest/services/compute/json/servers_client.py", line 374, in unpause_server
2014-01-15 00:40:01.936 |     return self.action(server_id, 'unpause', None, **kwargs)
2014-01-15 00:40:01.936 |   File "tempest/services/compute/json/servers_client.py", line 198, in action
2014-01-15 00:40:01.936 |     post_body, self.headers)
2014-01-15 00:40:01.937 |   File "tempest/common/rest_client.py", line 302, in post
2014-01-15 00:40:01.937 |     return self.request('POST', url, headers, body)
2014-01-15 00:40:01.937 |   File "tempest/common/rest_client.py", line 436, in request
2014-01-15 00:40:01.937 |     resp, resp_body)
2014-01-15 00:40:01.937 |   File "tempest/common/rest_client.py", line 491, in _error_checker
2014-01-15 00:40:01.937 |     raise exceptions.Conflict(resp_body)
2014-01-15 00:40:01.937 | Conflict: An object with that identifier already exists
2014-01-15 00:40:01.938 | Details: {u'conflictingRequest': {u'message': u"Cannot 'unpause' while instance is in vm_state active", u'code': 409}}
2014-01-15 00:40:01.938 | }}}
2014-01-15 00:40:01.938 |
2014-01-15 00:40:01.938 | Traceback (most recent call last):
2014-01-15 00:40:01.938 |   File "tempest/api/compute/servers/test_server_rescue.py", line 128, in test_rescue_paused_instance
2014-01-15 00:40:01.938 |     self.servers_client.wait_for_server_status(self.server_id, 'PAUSED')
2014-01-15 00:40:01.938 |   File "tempest/services/compute/json/servers_client.py", line 162, in wait_for_server_status
2014-01-15 00:40:01.939 |     raise_on_error=raise_on_error)
2014-01-15 00:40:01.939 |   File "tempest/common/waiters.py", line 91, in wait_for_server_status
2014-01-15 00:40:01.939 |     raise exceptions.TimeoutException(message)
2014-01-15 00:40:01.939 | TimeoutException: Request timed out
2014-01-15 00:40:01.939 | Details: Server 83997448-4700-4333-8022-99328e6b5e1f failed to reach PAUSED status and task state
[Yahoo-eng-team] [Bug 1264848] [NEW] SSHTimeout
Public bug reported: See: http://logs.openstack.org/06/61006/8/check/check-grenade-dsvm/c1be113/console.html.gz

2013-12-26 17:12:03.092 | Traceback (most recent call last):
2013-12-26 17:12:03.092 |   File "tempest/scenario/test_minimum_basic.py", line 166, in test_minimum_basic_scenario
2013-12-26 17:12:03.093 |     self.check_partitions()
2013-12-26 17:12:03.093 |   File "tempest/scenario/test_minimum_basic.py", line 137, in check_partitions
2013-12-26 17:12:03.093 |     partitions = self.linux_client.get_partitions()
2013-12-26 17:12:03.093 |   File "tempest/common/utils/linux/remote_client.py", line 77, in get_partitions
2013-12-26 17:12:03.094 |     output = self.ssh_client.exec_command(command)
2013-12-26 17:12:03.094 |   File "tempest/common/ssh.py", line 123, in exec_command
2013-12-26 17:12:03.094 |     ssh = self._get_ssh_connection()
2013-12-26 17:12:03.095 |   File "tempest/common/ssh.py", line 93, in _get_ssh_connection
2013-12-26 17:12:03.095 |     password=self.password)
2013-12-26 17:12:03.095 | SSHTimeout: Connection to the 172.24.4.227 via SSH timed out.
2013-12-26 17:12:03.096 | User: cirros, Password: None

This might be similar to https://bugs.launchpad.net/nova/+bug/1200731

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1264848
[Yahoo-eng-team] [Bug 1259520] [NEW] tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario fails sporadically on check jobs
Public bug reported: tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario fails sporadically on check jobs.

See: http://logs.openstack.org/06/61006/1/check/check-tempest-dsvm-full/856ca74/console.html

2013-12-10 11:47:12.106 | FAIL: tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario[compute,image,network,volume]
2013-12-10 11:47:12.106 | tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario[compute,image,network,volume]
2013-12-10 11:47:12.108 | _StringException: Empty attachments:
2013-12-10 11:47:12.109 |   stderr
2013-12-10 11:47:12.109 |   stdout
2013-12-10 11:47:12.110 | pythonlogging:'': {{{
2013-12-10 11:47:12.110 | 2013-12-10 11:38:04,562 paths: ami: /opt/stack/new/devstack/files/images/cirros-0.3.1-x86_64-uec/cirros-0.3.1-x86_64-blank.img, ari: /opt/stack/new/devstack/files/images/cirros-0.3.1-x86_64-uec/cirros-0.3.1-x86_64-initrd, aki: /opt/stack/new/devstack/files/images/cirros-0.3.1-x86_64-uec/cirros-0.3.1-x86_64-vmlinuz
2013-12-10 11:47:12.112 | 2013-12-10 11:38:04,562 curl -i -X POST -H 'x-image-meta-container_format: aki' -H 'User-Agent: python-glanceclient' -H 'x-image-meta-is_public: True' -H 'X-Auth-Token: MIISwAY...QaF4s2cc+7HjRFML6+aI-Wynj0dVFhZTRfD-U4jnlaGdHVmiR4vJIa0quCSVhsSE8U7e--P3cjslc709RdE1hFVmSeuneZgMLIjn2fi3EYT4p9RENfYSE8aSRwV+5js8Mh177L8Nk-ZLyX7BWETWfreD6IscFe8yUZ4ns09YgTLTE1e0+ANhj5PQabOMDX8yPrbYzltPNNZcXeofL-t5R1iLr006F390xq4Ywi2t9HzwtnX1eWC1AB1AarAD5p5P6Fw9wVyE8wuqb8waozvvu3TX5lPt7lzDOOW6+sreQ1ItTonqnjPW0evD-1vVY1DZTyhY70+Ku7dyzciGWhjwA==' -H 'Content-Type: application/octet-stream' -H 'x-image-meta-disk_format: aki' -H 'x-image-meta-name: scenario-aki--tempest-820355222' http://127.0.0.1:9292/v1/images
2013-12-10 11:47:12.112 | 2013-12-10 11:38:05,078
2013-12-10 11:47:12.112 | HTTP/1.1 201 Created
2013-12-10 11:47:12.113 | date: Tue, 10 Dec 2013 11:38:05 GMT
2013-12-10 11:47:12.113 | content-length: 441
2013-12-10 11:47:12.113 | content-type: application/json
2013-12-10 11:47:12.114 | location: http://127.0.0.1:9292/v1/images/9f44a50e-7fbd-4674-b8f9-c2739f7b270a
2013-12-10 11:47:12.115 | x-openstack-request-id: req-58c90e26-96bd-49cd-ab41-e94cefb110b0
2013-12-10 11:47:12.116 |
2013-12-10 11:47:12.116 | {"image": {"status": "queued", "deleted": false, "container_format": "aki", "min_ram": 0, "updated_at": "2013-12-10T11:38:05", "owner": "0ebe8ba1a7c34e55a019908c5736493a", "min_disk": 0, "is_public": true, "deleted_at": null, "id": "9f44a50e-7fbd-4674-b8f9-c2739f7b270a", "size": 0, "name": "scenario-aki--tempest-820355222", "checksum": null, "created_at": "2013-12-10T11:38:05", "disk_format": "aki", "properties": {}, "protected": false}}
. . .
2013-12-10 11:47:14.034 | 2013-12-10 11:41:50,642 curl -i -X DELETE -H 'X-Auth-Token: MIISwAY...F4s2cc+7HjRFML6+aI-Wynj0dVFhZTRfD-U4jnlaGdHVmiR4vJIa0quCSVhsSE8U7e--P3cjslc709RdE1hFVmSeuneZgMLIjn2fi3EYT4p9RENfYSE8aSRwV+5js8Mh177L8Nk-ZLyX7BWETWfreD6IscFe8yUZ4ns09YgTLTE1e0+ANhj5PQabOMDX8yPrbYzltPNNZcXeofL-t5R1iLr006F390xq4Ywi2t9HzwtnX1eWC1AB1AarAD5p5P6Fw9wVyE8wuqb8waozvvu3TX5lPt7lzDOOW6+sreQ1ItTonqnjPW0evD-1vVY1DZTyhY70+Ku7dyzciGWhjwA==' -H 'Content-Type: application/octet-stream' -H 'User-Agent: python-glanceclient' http://127.0.0.1:9292/v1/images/9f44a50e-7fbd-4674-b8f9-c2739f7b270a
2013-12-10 11:47:14.035 | 2013-12-10 11:41:50,922
2013-12-10 11:47:14.035 | HTTP/1.1 200 OK
2013-12-10 11:47:14.035 | date: Tue, 10 Dec 2013 11:41:50 GMT
2013-12-10 11:47:14.035 | content-length: 0
2013-12-10 11:47:14.035 | content-type: text/html; charset=UTF-8
2013-12-10 11:47:14.035 | x-openstack-request-id: req-b3daa054-b695-4bc5-b0fe-90647731dd46
2013-12-10 11:47:14.036 | }}}
2013-12-10 11:47:14.036 |
2013-12-10 11:47:14.036 | Traceback (most recent call last):
2013-12-10 11:47:14.036 |   File "tempest/scenario/test_minimum_basic.py", line 160, in test_minimum_basic_scenario
2013-12-10 11:47:14.036 |     self.nova_reboot()
2013-12-10 11:47:14.037 |   File "tempest/scenario/test_minimum_basic.py", line 124, in nova_reboot
2013-12-10 11:47:14.037 |     self._wait_for_server_status('ACTIVE')
2013-12-10 11:47:14.037 |   File "tempest/scenario/test_minimum_basic.py", line 43, in _wait_for_server_status
2013-12-10 11:47:14.037 |     self.compute_client.servers, server_id, status)
2013-12-10 11:47:14.037 |   File "tempest/scenario/manager.py", line 304, in status_timeout
2013-12-10 11:47:14.037 |     not_found_exception=not_found_exception)
2013-12-10 11:47:14.038 |   File "tempest/scenario/manager.py", line 361, in _status_timeout
2013-12-10 11:47:14.038 |     raise exceptions.TimeoutException(message)
2013-12-10 11:47:14.038 | TimeoutException: Request timed out
2013-12-10 11:47:14.038 | Details: Timed out waiting for thing
[Yahoo-eng-team] [Bug 1258856] [NEW] tempest.api.compute.servers.test_server_actions.ServerActionsTestXML.test_stop_start_server fails with quota error
Public bug reported: See: http://logs.openstack.org/10/56710/10/check/check-tempest-dsvm-full/187705b/console.html.gz

2013-12-07 01:35:54.909 | FAIL: tempest.api.compute.servers.test_server_actions.ServerActionsTestXML.test_stop_start_server[gate]
2013-12-07 01:35:54.909 | tempest.api.compute.servers.test_server_actions.ServerActionsTestXML.test_stop_start_server[gate]
2013-12-07 01:35:54.909 | _StringException: Empty attachments:
2013-12-07 01:35:54.909 |   stderr
2013-12-07 01:35:54.909 |   stdout
2013-12-07 01:35:54.910 | pythonlogging:'': {{{
2013-12-07 01:35:54.910 | 2013-12-07 01:15:59,684 Request: GET http://127.0.0.1:8774/v2/04d6f70250e94a5e94e506386812bba3/servers/562fe7bb-f952-4cc3-af24-5d612dc3f022
2013-12-07 01:35:54.910 | 2013-12-07 01:15:59,685 Request Headers: {'Content-Type': 'application/xml', 'Accept': 'application/xml', 'X-Auth-Token': 'Token omitted'}
2013-12-07 01:35:54.910 | 2013-12-07 01:15:59,745 Response Status: 404
2013-12-07 01:35:54.910 | 2013-12-07 01:15:59,745 Nova request id: req-ee502842-f694-4b3f-b696-0b0d1a2ce925
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,745 Response Headers: {'content-length': '137', 'date': 'Sat, 07 Dec 2013 01:15:59 GMT', 'content-type': 'application/xml; charset=UTF-8', 'connection': 'close'}
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,746 Response Body: <itemNotFound code="404" xmlns="http://docs.openstack.org/compute/api/v1.1"><message>Instance could not be found</message></itemNotFound>
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,746 Request: DELETE http://127.0.0.1:8774/v2/04d6f70250e94a5e94e506386812bba3/servers/562fe7bb-f952-4cc3-af24-5d612dc3f022
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,746 Request Headers: {'X-Auth-Token': 'Token omitted'}
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,801 Response Status: 404
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,802 Nova request id: req-703c4944-902f-4a14-a27c-0ae724ce66b1
2013-12-07 01:35:54.911 | 2013-12-07 01:15:59,802 Response Headers: {'content-length': '73', 'date': 'Sat, 07 Dec 2013 01:15:59 GMT', 'content-type': 'application/json; charset=UTF-8', 'connection': 'close'}
2013-12-07 01:35:54.912 | 2013-12-07 01:15:59,802 Response Body: {"itemNotFound": {"message": "Instance could not be found", "code": 404}}
2013-12-07 01:35:54.912 | 2013-12-07 01:15:59,803 Object not found
2013-12-07 01:35:54.912 | Details: {"itemNotFound": {"message": "Instance could not be found", "code": 404}}
2013-12-07 01:35:54.912 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base Traceback (most recent call last):
2013-12-07 01:35:54.912 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base   File "tempest/api/compute/base.py", line 188, in rebuild_server
2013-12-07 01:35:54.912 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base     cls.servers_client.delete_server(server_id)
2013-12-07 01:35:54.912 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base   File "tempest/services/compute/xml/servers_client.py", line 238, in delete_server
2013-12-07 01:35:54.913 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base     return self.delete("servers/%s" % str(server_id))
2013-12-07 01:35:54.913 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base   File "tempest/common/rest_client.py", line 308, in delete
2013-12-07 01:35:54.913 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base     return self.request('DELETE', url, headers)
2013-12-07 01:35:54.913 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base   File "tempest/common/rest_client.py", line 436, in request
2013-12-07 01:35:54.913 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base     resp, resp_body)
2013-12-07 01:35:54.913 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base   File "tempest/common/rest_client.py", line 481, in _error_checker
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base     raise exceptions.NotFound(resp_body)
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base NotFound: Object not found
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base Details: {"itemNotFound": {"message": "Instance could not be found", "code": 404}}
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59.803 5212 TRACE tempest.api.compute.base
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59,803 Request: POST http://127.0.0.1:8774/v2/04d6f70250e94a5e94e506386812bba3/servers
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59,804 Request Headers: {'Content-Type': 'application/xml', 'Accept': 'application/xml', 'X-Auth-Token': 'Token omitted'}
2013-12-07 01:35:54.914 | 2013-12-07 01:15:59,804 Request Body: <?xml version="1.0" encoding="UTF-8"?>
2013-12-07 01:35:54.915 | <server
[Yahoo-eng-team] [Bug 1258319] [NEW] test_reboot_server_hard fails sporadically in swift check jobs
Public bug reported: test_reboot_server_hard fails sporadically in swift check jobs. I believe this has been reported before, but I was not able to find it.

See: http://logs.openstack.org/43/60343/1/gate/gate-tempest-dsvm-full/c92d206/console.html

2013-12-05 21:29:18.183 | FAIL: tempest.api.compute.servers.test_server_actions.ServerActionsTestXML.test_reboot_server_hard[gate,smoke]
2013-12-05 21:29:18.186 | tempest.api.compute.servers.test_server_actions.ServerActionsTestXML.test_reboot_server_hard[gate,smoke]
2013-12-05 21:29:18.206 | _StringException: Empty attachments:
2013-12-05 21:29:18.206 |   stderr
2013-12-05 21:29:18.207 |   stdout
2013-12-05 21:29:18.207 | pythonlogging:'': {{{
. . .
2013-12-05 21:29:19.174 | Traceback (most recent call last):
2013-12-05 21:29:19.175 |   File "tempest/api/compute/servers/test_server_actions.py", line 83, in test_reboot_server_hard
2013-12-05 21:29:19.175 |     self.client.wait_for_server_status(self.server_id, 'ACTIVE')
2013-12-05 21:29:19.175 |   File "tempest/services/compute/xml/servers_client.py", line 369, in wait_for_server_status
2013-12-05 21:29:19.175 |     extra_timeout=extra_timeout)
2013-12-05 21:29:19.176 |   File "tempest/common/waiters.py", line 82, in wait_for_server_status
2013-12-05 21:29:19.176 |     raise exceptions.TimeoutException(message)
2013-12-05 21:29:19.176 | TimeoutException: Request timed out
2013-12-05 21:29:19.177 | Details: Server f313af9a-8ec1-4f77-b63f-76d9317d6423 failed to reach ACTIVE status within the required time (196 s). Current status: HARD_REBOOT.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1258319
[Yahoo-eng-team] [Bug 1257799] [NEW] tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name fails in gate with BuildExceptionError
Public bug reported: tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name fails in gate with BuildExceptionError.

See: http://logs.openstack.org/79/59879/2/gate/gate-tempest-dsvm-postgres-full/ceb9759/console.html

2013-12-04 06:11:28.372 | FAIL: tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name[gate]
2013-12-04 06:11:28.372 | tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name[gate]
2013-12-04 06:11:28.373 | _StringException: Empty attachments:
2013-12-04 06:11:28.373 |   stderr
2013-12-04 06:11:28.374 |   stdout
2013-12-04 06:11:28.374 | pythonlogging:'': {{{
2013-12-04 06:11:28.374 | 2013-12-04 05:54:55,289 Request: POST http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers
2013-12-04 06:11:28.375 | 2013-12-04 05:54:55,289 Request Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': 'Token omitted'}
2013-12-04 06:11:28.375 | 2013-12-04 05:54:55,290 Request Body: {"server": {"flavorRef": "42", "name": "server-tempest-2040253626", "imageRef": "465be8b0-dd45-47c5-91d4-efb628aa375e"}}
2013-12-04 06:11:28.375 | 2013-12-04 05:54:55,594 Response Status: 202
2013-12-04 06:11:28.376 | 2013-12-04 05:54:55,594 Nova request id: req-57892045-1ed9-4f17-b96e-0babe4bebaf9
2013-12-04 06:11:28.376 | 2013-12-04 05:54:55,594 Response Headers: {'content-length': '434', 'location': 'http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac', 'date': 'Wed, 04 Dec 2013 05:54:55 GMT', 'content-type': 'application/json', 'connection': 'close'}
2013-12-04 06:11:28.376 | 2013-12-04 05:54:55,594 Response Body: {"server": {"security_groups": [{"name": "default"}], "OS-DCF:diskConfig": "MANUAL", "id": "3abd4be9-b3dc-4b87-9cf5-f5b597173cac", "links": [{"href": "http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac", "rel": "self"}, {"href": "http://127.0.0.1:8774/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac", "rel": "bookmark"}], "adminPass": "3F5iaFVxr8Pi"}}
2013-12-04 06:11:28.377 | 2013-12-04 05:54:55,594 Request: GET http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac
2013-12-04 06:11:28.377 | 2013-12-04 05:54:55,594 Request Headers: {'X-Auth-Token': 'Token omitted'}
2013-12-04 06:11:28.377 | 2013-12-04 05:54:55,706 Response Status: 200
2013-12-04 06:11:28.378 | 2013-12-04 05:54:55,706 Nova request id: req-85e0828c-baad-432a-b28f-9433b293eb5c
2013-12-04 06:11:28.378 | 2013-12-04 05:54:55,707 Response Headers: {'content-length': '1347', 'content-location': u'http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac', 'date': 'Wed, 04 Dec 2013 05:54:55 GMT', 'content-type': 'application/json', 'connection': 'close'}
2013-12-04 06:11:28.378 | 2013-12-04 05:54:55,707 Response Body: {"server": {"status": "BUILD", "updated": "2013-12-04T05:54:55Z", "hostId": "", "addresses": {}, "links": [{"href": "http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac", "rel": "self"}, {"href": "http://127.0.0.1:8774/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/3abd4be9-b3dc-4b87-9cf5-f5b597173cac", "rel": "bookmark"}], "key_name": null, "image": {"id": "465be8b0-dd45-47c5-91d4-efb628aa375e", "links": [{"href": "http://127.0.0.1:8774/6ddc4dd2e1bc4683bfb199e225f8c9e9/images/465be8b0-dd45-47c5-91d4-efb628aa375e", "rel": "bookmark"}]}, "OS-EXT-STS:task_state": "scheduling", "OS-EXT-STS:vm_state": "building", "OS-SRV-USG:launched_at": null, "flavor": {"id": "42", "links": [{"href": "http://127.0.0.1:8774/6ddc4dd2e1bc4683bfb199e225f8c9e9/flavors/42", "rel": "bookmark"}]}, "id": "3abd4be9-b3dc-4b87-9cf5-f5b597173cac", "security_groups": [{"name": "default"}], "OS-SRV-USG:terminated_at": null, "OS-EXT-AZ:availability_zone": "nova", "user_id": "21689027acab4b11ae0885e5cbd26a4b", "name": "server-tempest-2040253626", "created": "2013-12-04T05:54:55Z", "tenant_id": "6ddc4dd2e1bc4683bfb199e225f8c9e9", "OS-DCF:diskConfig": "MANUAL", "os-extended-volumes:volumes_attached": [], "accessIPv4": "", "accessIPv6": "", "progress": 0, "OS-EXT-STS:power_state": 0, "config_drive": "", "metadata": {}}}
. . .
2013-12-04 06:11:28.407 | 2013-12-04 05:55:03,128 Request: GET http://127.0.0.1:8774/v2/6ddc4dd2e1bc4683bfb199e225f8c9e9/servers/c74d9399-03a4-4b01-8d5d-b3e4d8a85738
2013-12-04 06:11:28.408 | 2013-12-04 05:55:03,128 Request Headers: {'X-Auth-Token': 'Token omitted'}
2013-12-04 06:11:28.408 | 2013-12-04 05:55:03,165 Response Status: 404
2013-12-04 06:11:28.409 | 2013-12-04 05:55:03,165 Nova request id: req-ca12c9f4-eb0e-435c-8b87-2d233c45e50d
2013-12-04 06:11:28.409 | 2013-12-04 05:55:03,165 Response Headers: {'content-length': '73',
[Yahoo-eng-team] [Bug 1257829] [NEW] Misspelled encryption field in QemuImgInfo
Public bug reported:

Location: openstack.common.imageutils.QemuImgInfo
Method: __init__
Error: line 45, self.encryption = details.get('encryption')

The parsing of the encryption field for qemu-img commands does not work. The key used to index the details dictionary for encryption information, 'encryption', does not match the key generated by qemu-img, 'encrypted'. As a result, the encryption field is always None, regardless of the image's encryption status.

Example call to 'qemu-img info':

$ qemu-img info encrypted_disk.qcow2
Disk image 'encrypted_disk.qcow2' is encrypted.
password:
image: encrypted_disk.qcow2
file format: qcow2
virtual size: 16G (17179869184 bytes)
disk size: 136K
encrypted: yes
cluster_size: 65536
backing file: debian_squeeze_i386_standard.qcow2 (actual path: debian_squeeze_i386_standard.qcow2)

Proposed fix: simply change the key used to index the encryption information:

self.encrypted = details.get('encrypted')

Since the fields in __init__ seem to be named to match the keys used to index the corresponding information, I would also propose renaming the attribute from self.encryption to self.encrypted, and updating any references to it wherever appropriate.

** Affects: cinder
   Importance: Undecided
   Status: New

** Affects: nova
   Importance: Undecided
   Status: New

** Affects: oslo
   Importance: Undecided
   Status: New

** Tags: cinder encryption nova oslo qemu

** Also affects: cinder
** Also affects: oslo

https://bugs.launchpad.net/bugs/1257829
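To make the mismatch concrete, here is a minimal sketch of 'key: value' parsing over output like the example above; it is illustrative only, not the actual openstack.common.imageutils code.

```
# A minimal sketch of the parsing problem, assuming 'qemu-img info'
# output like the example above; illustrative, not the actual
# openstack.common.imageutils code.

SAMPLE = """\
image: encrypted_disk.qcow2
file format: qcow2
virtual size: 16G (17179869184 bytes)
disk size: 136K
encrypted: yes
cluster_size: 65536
"""


def parse_qemu_img_info(output):
    """Split each 'key: value' line into a details dict."""
    details = {}
    for line in output.splitlines():
        key, sep, value = line.partition(':')
        if sep:
            details[key.strip()] = value.strip()
    return details


details = parse_qemu_img_info(SAMPLE)
print(details.get('encryption'))  # None: qemu-img never emits this key
print(details.get('encrypted'))   # 'yes': the key the proposed fix looks up
```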
[Yahoo-eng-team] [Bug 1257390] [NEW] races in assignment manager can cause spurious 404 when removing user from project
Public bug reported: This is a similar kind of bug to the one described in bug #1246489.

When removing a user from a project, the assignment manager retrieves a list of all roles the user has on the project, then removes each role. Each (user, role, project) tuple is removed with a separate call into the driver. If, before a particular role has been removed, that role is deleted by another request calling into the manager (i.e., via delete_role), the call into the driver by the user-removal request will raise a RoleNotFound exception and the request will return an HTTP 404 error. Furthermore, any roles in the list after the exceptional role will not be deleted; another call to Manager.remove_user_from_project will remove the remaining roles.

The 404 can easily be avoided by putting a try/except RoleNotFound: pass around the driver.remove_role_from_user_and_project calls. Alternatively, a begin/end transaction interface could be added to the driver. In its simplest form, this interface could be implemented by serializing all transactions with a mutex; the SQL driver could implement the interface with database transactions.

** Affects: keystone
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1257390
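Below is a minimal sketch of the try/except fix described above. The driver method names are assumptions based on the description; the Manager class shown here is hypothetical, not keystone's actual code.

```
# A minimal sketch of the suggested try/except fix; the driver methods
# and RoleNotFound exception are assumptions based on the description,
# and this Manager class is hypothetical, not keystone's actual code.


class RoleNotFound(Exception):
    pass


class Manager(object):
    def __init__(self, driver):
        self.driver = driver

    def remove_user_from_project(self, tenant_id, user_id):
        roles = self.driver.list_roles_for_user_and_project(
            user_id, tenant_id)
        for role_id in roles:
            try:
                self.driver.remove_role_from_user_and_project(
                    user_id, tenant_id, role_id)
            except RoleNotFound:
                # A concurrent delete_role already removed this role.
                # The desired end state is reached either way, so keep
                # going rather than surfacing a spurious 404.
                pass
```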
[Yahoo-eng-team] [Bug 1257396] [NEW] iso8601 DEBUG messages spam log
Public bug reported: Useless DEBUG messages are printed every time iso8601 parses a date:

(iso8601.iso8601): 2013-12-03 12:47:12,924 DEBUG iso8601 parse_date Parsed 2013-12-03T17:47:12Z into {'tz_sign': None, 'second_fraction': None, 'hour': '17', 'tz_hour': None, 'month': '12', 'timezone': 'Z', 'second': '12', 'tz_minute': None, 'year': '2013', 'separator': 'T', 'day': '03', 'minute': '47'} with default timezone <iso8601.iso8601.Utc object at 0x1571cd0>
(iso8601.iso8601): 2013-12-03 12:47:12,924 DEBUG iso8601 to_int Got '2013' for 'year' with default None
(iso8601.iso8601): 2013-12-03 12:47:12,925 DEBUG iso8601 to_int Got '12' for 'month' with default None
(iso8601.iso8601): 2013-12-03 12:47:12,925 DEBUG iso8601 to_int Got '03' for 'day' with default None
(iso8601.iso8601): 2013-12-03 12:47:12,925 DEBUG iso8601 to_int Got '17' for 'hour' with default None
(iso8601.iso8601): 2013-12-03 12:47:12,925 DEBUG iso8601 to_int Got '47' for 'minute' with default None
(iso8601.iso8601): 2013-12-03 12:47:12,925 DEBUG iso8601 to_int Got '12' for 'second' with default None

The log level for iso8601 has been set to WARN in oslo-incubator: https://github.com/openstack/oslo-incubator/commit/cbfded9c. This change should be merged into keystone.

** Affects: keystone
   Importance: Undecided
   Assignee: Peter Feiner (pete5)
   Status: In Progress

** Changed in: keystone
   Assignee: (unassigned) => Peter Feiner (pete5)

https://bugs.launchpad.net/bugs/1257396
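For reference, the underlying idea is simply to raise the third-party logger's level. A minimal sketch with plain stdlib logging follows; the actual oslo-incubator change uses its own logging setup, so this is illustrative rather than the merged code.

```
# A minimal sketch of silencing a chatty third-party logger, the same
# idea as the oslo-incubator change referenced above (the exact oslo
# helper differs; this uses plain stdlib logging).
import logging

# Raise the iso8601 logger above DEBUG so its parse_date/to_int chatter
# is dropped while the application's own DEBUG logging keeps working.
logging.getLogger('iso8601').setLevel(logging.WARNING)
logging.basicConfig(level=logging.DEBUG)

import iso8601  # noqa: E402

# This parse no longer spams DEBUG lines into the log.
print(iso8601.parse_date('2013-12-03T17:47:12Z'))
```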
[Yahoo-eng-team] [Bug 1253193] Re: instances occasionally do not exist in `nova list` immediately after boot
Nevermind, this was my client-side bug: I was coalescing 'nova list' requests from different threads without regard to other requests those threads may have made to nova-api (e.g., boot).

** Changed in: nova
   Status: New => Invalid

https://bugs.launchpad.net/bugs/1253193
Status in OpenStack Compute (Nova): Invalid

Bug description: The boot and list commands in nova don't seem to be sequentially consistent. Doing performance testing on the latest OpenStack code, I occasionally observe that the instance is not in the output of 'nova list'. If I re-request the list a moment later, the instance is almost always there (i.e., show and list are eventually consistent).

Note that above, when I say 'nova ...', I'm not using the command-line nova client tool. Instead, I'm issuing the requests from the same process with the same novaclient.client.Client instance. The delay between invocations of the nova command-line tool would hide the race I'm observing.

I observe this problem fairly rarely, roughly once per 100 instances booted. My suspicion is that the inconsistency arises from how nova-api interacts with the database. Thus it's pertinent to note that I'm running with 20 osapi workers. I'm using MySQL.
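To illustrate the client-side pitfall described in the close-out note, here is a hypothetical sketch of request coalescing: a thread that joins an in-flight list() that started before its own boot request gets a snapshot that can legitimately predate the boot. The CoalescingLister class is invented for illustration; only client.servers.list() is a real novaclient call.

```
# A hypothetical sketch of the coalescing bug: joining an in-flight
# 'list' that started before this thread's 'boot' returns a snapshot
# that predates the boot. CoalescingLister is invented for illustration.
import threading


class CoalescingLister(object):
    """Share one in-flight list() result across threads (buggy)."""

    def __init__(self, client):
        self.client = client
        self._lock = threading.Lock()
        self._pending = None  # (event, holder) of the in-flight call

    def list(self):
        with self._lock:
            if self._pending is None:
                event, holder = threading.Event(), []
                self._pending = (event, holder)
                leader = True
            else:
                event, holder = self._pending
                leader = False
        if leader:
            holder.append(self.client.servers.list())  # actual API call
            with self._lock:
                self._pending = None
            event.set()
        else:
            # BUG: this shared list() may have started before our own
            # boot request, so our new instance can be missing from it.
            event.wait()
        return holder[0]
```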