Public bug reported:

If a child cell has compute nodes that are enabled but have stopped
sending heartbeat updates (shown with state XXX in "nova-manage service
list"), the child cell still counts the available resources of those
compute nodes when it updates the cell capacity.
This is a problem when running several cells and trying to fill them
completely. Requests are sent to the cell that can fit the most
instances of the requested type; however, when compute nodes in that
cell are "down", the requests fail there with "No valid host".
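
To illustrate the failure mode, here is a small Python sketch. It is a
simplification under assumptions, not the real cell scheduler:
"pick_cell" is a hypothetical helper that simply chooses the cell
reporting the most free slots for a flavor, and all numbers are made up.

```python
def pick_cell(reported_slots):
    """Return the name of the cell reporting the most free slots (hypothetical)."""
    return max(reported_slots, key=reported_slots.get)


# Slots each cell reports for the requested flavor. In this made-up case all
# 10 slots reported by cell A sit on compute nodes that stopped heartbeating.
reported = {"A": 10, "B": 5}
# Slots that are actually usable once the "down" nodes are ignored.
usable = {"A": 0, "B": 5}

chosen = pick_cell(reported)  # cell picked from the stale capacity
better = pick_cell(usable)    # cell picked if "down" nodes were excluded
```

With the stale numbers the request goes to cell A, which has no usable
host, while cell B could have served it.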

When the cell capacity is updated, "disabled" compute nodes are
excluded. Compute nodes that have not sent a heartbeat update within
"CONF.service_down_time" should be excluded as well.

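A minimal sketch of the proposed filtering, in Python. This is not the
actual nova-cells code: "ComputeNode", "is_up" and "cell_free_ram_mb" are
hypothetical names used for illustration, SERVICE_DOWN_TIME stands in
for "CONF.service_down_time" (60 seconds is assumed here), and the
timestamp is arbitrary sample data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Stand-in for CONF.service_down_time, in seconds; the value is an assumption.
SERVICE_DOWN_TIME = 60


@dataclass
class ComputeNode:
    """Hypothetical, simplified view of a compute node and its service."""
    host: str
    free_ram_mb: int
    disabled: bool
    last_heartbeat: datetime


def is_up(node, now):
    """A node counts as up if its last heartbeat is within SERVICE_DOWN_TIME."""
    return now - node.last_heartbeat <= timedelta(seconds=SERVICE_DOWN_TIME)


def cell_free_ram_mb(nodes, now):
    """Sum free RAM over nodes that are both enabled and up."""
    return sum(n.free_ram_mb for n in nodes if not n.disabled and is_up(n, now))


now = datetime(2016, 1, 10, 12, 0, 0)  # arbitrary reference time
nodes = [
    # healthy: enabled, recent heartbeat
    ComputeNode("node1", 4096, False, now - timedelta(seconds=10)),
    # disabled: already excluded today
    ComputeNode("node2", 8192, True, now - timedelta(seconds=10)),
    # enabled but "down": the case this bug is about
    ComputeNode("node3", 16384, False, now - timedelta(seconds=300)),
]
free_ram = cell_free_ram_mb(nodes, now)
```

Only node1 contributes to the reported capacity in this example; node3
is left out because its heartbeat is older than SERVICE_DOWN_TIME.
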
How to reproduce:
1) Have a cell environment with 2 child cells (A and B).
2) Run nova-cells in "debug". Confirm that the "Received capacities
from child cell" entries for A and B (in the top nova-cells log) match
the number of available resources.
3) Stop some compute nodes in cell A.
4) Confirm that "Received capacities from child cell A" does not change.
5) The cell scheduler can now send requests to cell A that fail with
"No valid host".

** Affects: nova
     Importance: Undecided
     Assignee: Belmiro Moreira (moreira-belmiro-email-lists)
         Status: New


** Tags: cells

** Changed in: nova
     Assignee: (unassigned) => Belmiro Moreira (moreira-belmiro-email-lists)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1532562

Title:
  Cell capacities updates include available resources of compute nodes
  "down"

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1532562/+subscriptions
