[Yahoo-eng-team] [Bug 1739323] Re: KeyError in host_manager for _get_host_states
Reviewed:  https://review.openstack.org/529352
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d72b33b986525a9b2c7aa08b609ae386de1d0e89
Submitter: Zuul
Branch:    master

commit d72b33b986525a9b2c7aa08b609ae386de1d0e89
Author: Matthew Booth
Date:   Sat Dec 16 20:27:08 2017 +

    Fix an error in _get_host_states when deleting a compute node

    _get_host_states returns a generator which closes over seen_nodes,
    which is local, and self.host_state_map, which is global. It also
    modifies self.host_state_map, and will remove entries whose compute
    nodes are no longer present.

    If a compute node is deleted while a filter is still evaluating the
    generator returned by _get_host_states, the entry in
    self.host_state_map will be deleted if _get_host_states is called
    again. This will cause a KeyError when the first generator comes to
    evaluate the entry for the deleted compute node.

    We fix this by modifying the returned generator expression to check
    that a host_state_map entry still exists before returning it. An
    existing unit test is modified to exhibit the bug.

    Change-Id: Ibb7c43a0abc433f93fc3de71146263e6d5923666
    Closes-Bug: #1739323

** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1739323

Title:
  KeyError in host_manager for _get_host_states

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  In Progress
Status in OpenStack Compute (nova) pike series:
  In Progress

Bug description:
  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L674-L718

  In _get_host_states, a list of all compute nodes is retrieved, each
  with a `state_key` of `(host, node)`.

  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L692
  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L708

  The small piece of code here removes all of the dead compute nodes
  from host_state_map.

  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L708

  However, the result is returned by iterating over all seen nodes and
  using each one as an index into host_state_map. Some of those entries
  have been deleted by the code above, resulting in a KeyError.

  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L718

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1739323/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
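
For readers following the thread, the failure mode in the commit message
can be reproduced outside nova. What follows is only a minimal sketch
under stated assumptions: HostManagerSketch, the check_exists flag, the
consume_with_deletion helper and the string "states" are all hypothetical
stand-ins invented for illustration, not nova's actual code. It relies on
the fact that a Python generator expression looks up
self.host_state_map[key] lazily, at iteration time, so a second call that
prunes the shared map can invalidate a generator returned by an earlier
call.

```python
class HostManagerSketch:
    """Hypothetical stand-in for the scheduler's host manager."""

    def __init__(self):
        # Shared between calls, like self.host_state_map in the report.
        self.host_state_map = {}

    def _get_host_states(self, compute_nodes, check_exists):
        seen_nodes = set()
        for host, node in compute_nodes:
            state_key = (host, node)
            # A plain string stands in for a real host state object.
            self.host_state_map.setdefault(
                state_key, "state of %s/%s" % (host, node))
            seen_nodes.add(state_key)

        # Remove entries whose compute nodes are no longer present.
        dead_nodes = set(self.host_state_map) - seen_nodes
        for state_key in dead_nodes:
            del self.host_state_map[state_key]

        if check_exists:
            # Variant with the membership check described in the commit
            # message: keys pruned by a later call are skipped.
            return (self.host_state_map[key] for key in seen_nodes
                    if key in self.host_state_map)
        # Variant without the check: the lookup happens lazily and can
        # hit a key that a later call has already deleted.
        return (self.host_state_map[key] for key in seen_nodes)


def consume_with_deletion(check_exists):
    """Consume a generator after a second call has pruned the map."""
    manager = HostManagerSketch()
    both = [("host1", "node1"), ("host2", "node2")]
    first = manager._get_host_states(both, check_exists)
    # host2/node2 is "deleted" and _get_host_states runs again before
    # the first generator has been evaluated.
    manager._get_host_states([("host1", "node1")], check_exists)
    return list(first)
```

Calling consume_with_deletion(False) raises KeyError for
("host2", "node2"), while consume_with_deletion(True) returns only the
state for host1/node1. The sketch triggers the second call sequentially;
in a running scheduler it would have to come from another request being
handled while a filter is still evaluating the first generator.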
[Yahoo-eng-team] [Bug 1739323] Re: KeyError in host_manager for _get_host_states
** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Changed in: nova
     Assignee: Matt Riedemann (mriedem) => Matthew Booth (mbooth-9)

** Changed in: nova/ocata
       Status: New => Confirmed

** Changed in: nova/pike
       Status: New => Confirmed

** Changed in: nova/pike
   Importance: Undecided => High

** Changed in: nova/ocata
   Importance: Undecided => High

Title:
  KeyError in host_manager for _get_host_states

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  Confirmed
[Yahoo-eng-team] [Bug 1739323] Re: KeyError in host_manager for _get_host_states
You aren't by chance running with some multiple scheduler thread workers patch are you?

** No longer affects: nova/ocata

** No longer affects: nova/pike

** Changed in: nova
       Status: Confirmed => Incomplete

Title:
  KeyError in host_manager for _get_host_states

Status in OpenStack Compute (nova):
  Incomplete
[Yahoo-eng-team] [Bug 1739323] Re: KeyError in host_manager for _get_host_states
https://github.com/openstack/nova/commit/4660333d0d97d8e00cf290ea1d4ed932f5edc1dc#diff-978b9f8734365934eaf8fbb01f11a7d7L624

** Changed in: nova
       Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => High

** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Changed in: nova/ocata
       Status: New => Confirmed

** Changed in: nova/pike
       Status: New => Confirmed

** Changed in: nova/pike
   Importance: Undecided => High

** Changed in: nova/ocata
   Importance: Undecided => High

Title:
  KeyError in host_manager for _get_host_states

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  Confirmed