Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-11 Thread Alex Xu
2016-11-03 4:52 GMT+08:00 Jay Pipes :

> On 11/01/2016 10:14 AM, Alex Xu wrote:
>
>> Currently we only update the resource usage with the Placement API in
>> the instance claim and the available-resource-update periodic task.
>> But there is no claim for migration with the placement API yet. This
>> work is tracked by https://bugs.launchpad.net/nova/+bug/1621709. In
>> Newton, we only fixed one bit which makes the resource update periodic
>> task work correctly, so that it will auto-heal everything. The
>> migration claim part wasn't a goal for the Newton release.
>>
>> So the first question is: do we want to fix it in this release? If the
>> answer is yes, there is a concern that needs to be discussed.
>>
>
> Yes, I believe we should fix the underlying problem in Ocata. The
> underlying problem is what Sylvain brought up: live migrations do not
> currently use any sort of claim operation. The periodic resource audit is
> relied upon to essentially clean up the state of claimed resources over
> time, and as Chris points out in review comments on
> https://review.openstack.org/#/c/244489/, this leads to the scheduler
> operating on stale data and can lead to an increase in retry operations.
>
> This needs to be fixed before even attempting to address the issue you
> bring up with the placement API calls from the resource tracker.


OK, let me see if I can help here.


>
>
>> In order to implement dropping the migration claim, the RT needs to
>> remove allocation records on a specific RP (the source or destination
>> compute node). But there isn't any API that can do that. The API for
>> removing allocation records is 'DELETE /allocations/{consumer_uuid}',
>> but it deletes all the allocation records for the consumer. So the
>> initial fix (https://review.openstack.org/#/c/369172/) adds a new API,
>> 'DELETE /resource_providers/{rp_uuid}/allocations/{consumer_id}'. But
>> Chris Dent pointed out this goes against the original design: all the
>> allocations for a specific consumer can only be dropped together.
>>
>
> Yes, and this is by design. Consumption of resources -- or the freeing
> thereof -- must be an atomic, transactional operation.
>
>> There is also a suggestion from Andrew: we can update all the
>> allocation records for the consumer each time. That means the RT would
>> build the original allocation records and the new allocation records
>> for the claim together, and put them into one API call. That API
>> should be 'PUT /allocations/{consumer_uuid}'. Unfortunately that API
>> doesn't replace all the allocation records for the consumer; it only
>> adds the new allocation records on top of the existing ones.
>>
>
> I see no reason why we can't change the behaviour of the `PUT
> /allocations/{consumer_uuid}` call to allow changing either the amounts of
> the allocated resources (a resize operation) or the set of resource
> provider UUIDs referenced in the allocations list (a move operation).
>
> For instance, let's say we have an allocation for an instance "i1" that is
> consuming 2 VCPU and 2048 MEMORY_MB on compute node "rpA", 50 DISK_GB on a
> shared storage pool "rpC".
>
> The allocations table would have the following records in it:
>
> resource_provider  resource_class  consumer  used
> -----------------  --------------  --------  ----
> rpA                VCPU            i1           2
> rpA                MEMORY_MB       i1        2048
> rpC                DISK_GB         i1          50
>
> Now, we need to migrate instance "i1" to compute node "rpB". The instance
> disk uses shared storage so the only allocation records we actually need to
> modify are the VCPU and MEMORY_MB records.
>

Yeah, thinking about it with shared storage in mind, this makes a lot of
sense. Thanks for the detailed explanation here!


>
> We would create the following REST API call from the resource tracker on
> the destination node:
>
> PUT /allocations/i1
> {
>   "allocations": [
>   {
> "resource_provider": {
>       "uuid": "rpB"
> },
> "resources": {
>   "VCPU": 2,
>   "MEMORY_MB": 2048
> }
>   },
>   {
> "resource_provider": {
>       "uuid": "rpC"
> },
> "resources": {
>   "DISK_GB": 50
> }
>   }
>   ]
> }
>
> The placement service would receive that request payload and immediately
> grab any existing allocation records referencing consumer_uuid of "i1". It
> would notice that records referencing "rpA" (the source compute node) are
> no longer needed. It would notice that the DISK_GB allocation hasn't
> changed. And finally it would notice that there are new VCPU and MEMORY_MB
> records referring to a new resource provider "rpB" (the destination compute
> node).
>
> A single SQL transaction would be built that executes the following:
>
> BEGIN;
>
>   # Grab the source and destination compute node provider generations
>   # to protect against concurrent writes...
>   $RPA_GEN := SELECT generation FROM resource_providers
>   WHERE uuid = 'rpA';
>   $RPB_GEN := SELECT generation FROM resource_providers

Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-02 Thread Chris Friesen

On 11/02/2016 02:52 PM, Jay Pipes wrote:

On 11/01/2016 10:14 AM, Alex Xu wrote:

Currently we only update the resource usage with the Placement API in the
instance claim and the available-resource-update periodic task. But there
is no claim for migration with the placement API yet. This work is tracked
by https://bugs.launchpad.net/nova/+bug/1621709. In Newton, we only fixed
one bit which makes the resource update periodic task work correctly, so
that it will auto-heal everything. The migration claim part wasn't a goal
for the Newton release.

So the first question is: do we want to fix it in this release? If the
answer is yes, there is a concern that needs to be discussed.


Yes, I believe we should fix the underlying problem in Ocata. The underlying
problem is what Sylvain brought up: live migrations do not currently use any
sort of claim operation. The periodic resource audit is relied upon to
essentially clean up the state of claimed resources over time, and as Chris
points out in review comments on https://review.openstack.org/#/c/244489/, this
leads to the scheduler operating on stale data and can lead to an increase in
retry operations.


It's worse than that.  For pinned instances it can result in vCPUs from multiple 
instances running on the same host pCPUs (which defeats the whole point of 
pinning), and can result in outright live migration failures if the destination 
has fewer pCPUs or NUMA nodes than the source.



I see no reason why we can't change the behaviour of the `PUT
/allocations/{consumer_uuid}` call to allow changing either the amounts of the
allocated resources (a resize operation) or the set of resource provider UUIDs
referenced in the allocations list (a move operation).


Agreed, your example looks reasonable at first glance.

Chris



Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-02 Thread Jay Pipes

On 11/01/2016 10:14 AM, Alex Xu wrote:

Currently we only update the resource usage with the Placement API in the
instance claim and the available-resource-update periodic task. But there
is no claim for migration with the placement API yet. This work is tracked
by https://bugs.launchpad.net/nova/+bug/1621709. In Newton, we only fixed
one bit which makes the resource update periodic task work correctly, so
that it will auto-heal everything. The migration claim part wasn't a goal
for the Newton release.

So the first question is: do we want to fix it in this release? If the
answer is yes, there is a concern that needs to be discussed.


Yes, I believe we should fix the underlying problem in Ocata. The 
underlying problem is what Sylvain brought up: live migrations do not 
currently use any sort of claim operation. The periodic resource audit 
is relied upon to essentially clean up the state of claimed resources 
over time, and as Chris points out in review comments on 
https://review.openstack.org/#/c/244489/, this leads to the scheduler 
operating on stale data and can lead to an increase in retry operations.


This needs to be fixed before even attempting to address the issue you 
bring up with the placement API calls from the resource tracker.



In order to implement dropping the migration claim, the RT needs to
remove allocation records on a specific RP (the source or destination
compute node). But there isn't any API that can do that. The API for
removing allocation records is 'DELETE /allocations/{consumer_uuid}', but
it deletes all the allocation records for the consumer. So the initial
fix (https://review.openstack.org/#/c/369172/) adds a new API, 'DELETE
/resource_providers/{rp_uuid}/allocations/{consumer_id}'. But Chris Dent
pointed out this goes against the original design: all the allocations
for a specific consumer can only be dropped together.


Yes, and this is by design. Consumption of resources -- or the freeing 
thereof -- must be an atomic, transactional operation.



There is also a suggestion from Andrew: we can update all the allocation
records for the consumer each time. That means the RT would build the
original allocation records and the new allocation records for the claim
together, and put them into one API call. That API should be 'PUT
/allocations/{consumer_uuid}'. Unfortunately that API doesn't replace
all the allocation records for the consumer; it only adds the new
allocation records on top of the existing ones.


I see no reason why we can't change the behaviour of the `PUT 
/allocations/{consumer_uuid}` call to allow changing either the amounts 
of the allocated resources (a resize operation) or the set of resource 
provider UUIDs referenced in the allocations list (a move operation).


For instance, let's say we have an allocation for an instance "i1" that 
is consuming 2 VCPU and 2048 MEMORY_MB on compute node "rpA", 50 DISK_GB 
on a shared storage pool "rpC".


The allocations table would have the following records in it:

resource_provider  resource_class  consumer  used
-----------------  --------------  --------  ----
rpA                VCPU            i1           2
rpA                MEMORY_MB       i1        2048
rpC                DISK_GB         i1          50

Now, we need to migrate instance "i1" to compute node "rpB". The 
instance disk uses shared storage so the only allocation records we 
actually need to modify are the VCPU and MEMORY_MB records.


We would create the following REST API call from the resource tracker on 
the destination node:


PUT /allocations/i1
{
  "allocations": [
    {
      "resource_provider": {
        "uuid": "rpB"
      },
      "resources": {
        "VCPU": 2,
        "MEMORY_MB": 2048
      }
    },
    {
      "resource_provider": {
        "uuid": "rpC"
      },
      "resources": {
        "DISK_GB": 50
      }
    }
  ]
}
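
As a rough sketch, here is how a resource tracker could issue that call with
plain `requests`; the endpoint URL, token handling and hard-coded amounts are
placeholders rather than nova's actual placement client code:

import requests

PLACEMENT = "http://placement.example.com"  # placeholder endpoint
HEADERS = {"X-Auth-Token": "<token>",       # placeholder auth token
           "Content-Type": "application/json"}

def move_allocations(consumer_uuid, dest_rp_uuid, shared_rp_uuid):
    # Build the complete replacement set for the consumer: the compute
    # resources now point at the destination node, and the shared-storage
    # allocation is repeated unchanged so it survives the replacement.
    payload = {
        "allocations": [
            {"resource_provider": {"uuid": dest_rp_uuid},
             "resources": {"VCPU": 2, "MEMORY_MB": 2048}},
            {"resource_provider": {"uuid": shared_rp_uuid},
             "resources": {"DISK_GB": 50}},
        ]
    }
    resp = requests.put("%s/allocations/%s" % (PLACEMENT, consumer_uuid),
                        json=payload, headers=HEADERS)
    resp.raise_for_status()

move_allocations("i1", "rpB", "rpC")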

The placement service would receive that request payload and immediately 
grab any existing allocation records referencing consumer_uuid of "i1". 
It would notice that records referencing "rpA" (the source compute node) 
are no longer needed. It would notice that the DISK_GB allocation hasn't 
changed. And finally it would notice that there are new VCPU and 
MEMORY_MB records referring to a new resource provider "rpB" (the 
destination compute node).
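
That diffing step could look roughly like the following sketch, with
allocations modelled as {(provider_uuid, resource_class): used} mappings
(illustrative only, not the actual placement code):

def diff_allocations(existing, requested):
    # Work out which allocation records must be deleted and which must be
    # inserted so the consumer ends up with exactly the requested set.
    to_delete = {k: v for k, v in existing.items() if requested.get(k) != v}
    to_insert = {k: v for k, v in requested.items() if existing.get(k) != v}
    return to_delete, to_insert

existing = {("rpA", "VCPU"): 2, ("rpA", "MEMORY_MB"): 2048,
            ("rpC", "DISK_GB"): 50}
requested = {("rpB", "VCPU"): 2, ("rpB", "MEMORY_MB"): 2048,
             ("rpC", "DISK_GB"): 50}

to_delete, to_insert = diff_allocations(existing, requested)
# to_delete -> the two rpA records; to_insert -> the two rpB records;
# the unchanged rpC DISK_GB record is left alone.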


A single SQL transaction would be built that executes the following:

BEGIN;

  # Grab the source and destination compute node provider generations
  # to protect against concurrent writes...
  $RPA_GEN := SELECT generation FROM resource_providers
  WHERE uuid = 'rpA';
  $RPB_GEN := SELECT generation FROM resource_providers
  WHERE uuid = 'rpB';

  # Delete the allocation records referring to the source for the VCPU
  # and MEMORY_MB resources
  DELETE FROM allocations
  WHERE consumer = 'i1'
  AND resource_provider = 'rpA'
  AND resource_class IN ('VCPU', 'MEMORY_MB');

  # Add allocation records referring to the destination for VCPU and
  # MEMORY_MB
  INSERT INTO allocations
  (resource_provider, resource_class, 
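
The remaining steps -- inserting the destination records and bumping both
provider generations as a compare-and-swap before COMMIT -- could look
roughly like the following sketch, assuming a SQLAlchemy connection `conn`
inside the same transaction (a sketch only, not the placement
implementation):

from sqlalchemy import text

def finish_move(conn, consumer, src_rp, dst_rp, moved, src_gen, dst_gen):
    # Insert the allocation records that now point at the destination node.
    for rc, used in moved.items():
        conn.execute(
            text("INSERT INTO allocations "
                 "(resource_provider, resource_class, consumer, used) "
                 "VALUES (:rp, :rc, :consumer, :used)"),
            {"rp": dst_rp, "rc": rc, "consumer": consumer, "used": used})

    # Compare-and-swap on the provider generations read at the start of the
    # transaction; if either one changed underneath us, abort and retry.
    for rp, gen in ((src_rp, src_gen), (dst_rp, dst_gen)):
        result = conn.execute(
            text("UPDATE resource_providers "
                 "SET generation = generation + 1 "
                 "WHERE uuid = :uuid AND generation = :gen"),
            {"uuid": rp, "gen": gen})
        if result.rowcount != 1:
            raise RuntimeError("concurrent provider update; retry the claim")

The COMMIT only happens if both generation updates hit exactly one row, which
is what protects the claim against concurrent writers.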

Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-02 Thread Chris Friesen

On 11/02/2016 05:26 AM, Alex Xu wrote:



2016-11-02 16:26 GMT+08:00 Sylvain Bauza:



#2 all those claim operations don't trigger an allocation request to the
placement API, while the regular boot operation does (hence your bug 
report).


Yes, except for booting a new instance, the other claims don't trigger an
allocation request to the placement API.


We should normally go through the scheduler for 
resize/migration/live-migration/evacuate, so wouldn't it make sense to do some 
sort of allocation request?


Chris



Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-02 Thread Sylvain Bauza



On 02/11/2016 12:26, Alex Xu wrote:



2016-11-02 16:26 GMT+08:00 Sylvain Bauza:




On 01/11/2016 15:14, Alex Xu wrote:

Currently we only update the resource usage with the Placement API in
the instance claim and the available-resource-update periodic task.
But there is no claim for migration with the placement API yet. This
work is tracked by https://bugs.launchpad.net/nova/+bug/1621709. In
Newton, we only fixed one bit which makes the resource update periodic
task work correctly, so that it will auto-heal everything. The
migration claim part wasn't a goal for the Newton release.


To be clear, there are two distinct points:
#1 there are MoveClaim objects that are synchronously made on
resize (and cold-migrate) and rebuild (and evacuate), but there is
no claim done by the live-migration path.
There is a long-standing bugfix https://review.openstack.org/#/c/244489/
that's been tracked by https://bugs.launchpad.net/nova/+bug/1289064



Yeah, thanks for the info. When I say `migration claim`, I really mean the
move claim. Maybe I should say "the move claim" instead.





Np, just a clarification for all of us, not you in particular :-)



#2 all those claim operations don't trigger an allocation request
to the placement API, while the regular boot operation does (hence
your bug report).


Yes, except for booting a new instance, the other claims don't trigger an
allocation request to the placement API.


Oops, I wrote that badly. I meant your point, i.e. that we only write
allocation requests for boot operations, and not for move operations.







So the first question is: do we want to fix it in this release? If
the answer is yes, there is a concern that needs to be discussed.



I'd appreciate it if we could merge #1 first, before #2, because the
placement API decisions could be wrong if we decide to only
allocate for certain move operations.


Sorry, I didn't get you; what does 'the placement API decisions' refer to?


I personally think that, rather than writing allocation records for all
move operations except the live-migration case, we should first make the
move operations consistent by doing claim operations, and only once that
is done, consider writing those allocation records to the placement API.


-Sylvain





In order to implement dropping the migration claim, the RT needs
to remove allocation records on a specific RP (the source or
destination compute node). But there isn't any API that can do
that. The API for removing allocation records is 'DELETE
/allocations/{consumer_uuid}', but it deletes all the allocation
records for the consumer. So the initial fix
(https://review.openstack.org/#/c/369172/) adds a new API, 'DELETE
/resource_providers/{rp_uuid}/allocations/{consumer_id}'. But
Chris Dent pointed out this goes against the original design: all the
allocations for a specific consumer can only be dropped together.

There is also a suggestion from Andrew: we can update all the
allocation records for the consumer each time. That means the RT would
build the original allocation records and the new allocation records
for the claim together, and put them into one API call. That API should
be 'PUT /allocations/{consumer_uuid}'. Unfortunately that API doesn't
replace all the allocation records for the consumer; it only adds the
new allocation records on top of the existing ones.

So which direction should we go here?

Thanks
Alex







Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-02 Thread Alex Xu
2016-11-02 16:26 GMT+08:00 Sylvain Bauza :

>
>
> On 01/11/2016 15:14, Alex Xu wrote:
>
> Currently we only update the resource usage with the Placement API in the
> instance claim and the available-resource-update periodic task. But there
> is no claim for migration with the placement API yet. This work is tracked
> by https://bugs.launchpad.net/nova/+bug/1621709. In Newton, we only fixed
> one bit which makes the resource update periodic task work correctly, so
> that it will auto-heal everything. The migration claim part wasn't a goal
> for the Newton release.
>
>
> To be clear, there are two distinct points :
> #1 there are MoveClaim objects that are synchronously made on resize (and
> cold-migrate) and rebuild (and evacuate), but there is no claim done by the
> live-migration path.
> There is a long-standing bugfix https://review.openstack.org/#/c/244489/
> that's been tracked by https://bugs.launchpad.net/nova/+bug/1289064
>

Yeah, thanks for the info. When I say `migration claim`, I really mean the
move claim. Maybe I should say "the move claim" instead.

>
>
> #2 all those claim operations don't trigger an allocation request to the
> placement API, while the regular boot operation does (hence your bug
> report).
>

Yes, except for booting a new instance, the other claims don't trigger an
allocation request to the placement API.


>
>
>
>
> So the first question is: do we want to fix it in this release? If the
> answer is yes, there is a concern that needs to be discussed.
>
>
> I'd appreciate it if we could merge #1 first, before #2, because the
> placement API decisions could be wrong if we decide to only allocate for
> certain move operations.
>

Sorry, I didn't get you; what does 'the placement API decisions' refer to?


>
>
> In order to implement dropping the migration claim, the RT needs to remove
> allocation records on a specific RP (the source or destination compute
> node). But there isn't any API that can do that. The API for removing
> allocation records is 'DELETE /allocations/{consumer_uuid}', but it deletes
> all the allocation records for the consumer. So the initial fix
> (https://review.openstack.org/#/c/369172/) adds a new API, 'DELETE
> /resource_providers/{rp_uuid}/allocations/{consumer_id}'. But Chris Dent
> pointed out this goes against the original design: all the allocations for
> a specific consumer can only be dropped together.
>
> There is also a suggestion from Andrew: we can update all the allocation
> records for the consumer each time. That means the RT would build the
> original allocation records and the new allocation records for the claim
> together, and put them into one API call. That API should be 'PUT
> /allocations/{consumer_uuid}'. Unfortunately that API doesn't replace all
> the allocation records for the consumer; it only adds the new allocation
> records on top of the existing ones.
>
> So which direction should we go here?
>
> Thanks
> Alex
>
>
>
>


Re: [openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-02 Thread Sylvain Bauza



On 01/11/2016 15:14, Alex Xu wrote:
Currently we only update the resource usage with the Placement API in the
instance claim and the available-resource-update periodic task. But there
is no claim for migration with the placement API yet. This work is tracked
by https://bugs.launchpad.net/nova/+bug/1621709. In Newton, we only fixed
one bit which makes the resource update periodic task work correctly, so
that it will auto-heal everything. The migration claim part wasn't a goal
for the Newton release.


To be clear, there are two distinct points :
#1 there are MoveClaim objects that are synchronously made on resize 
(and cold-migrate) and rebuild (and evacuate), but there is no claim 
done by the live-migration path.
There is a long-standing bugfix https://review.openstack.org/#/c/244489/ 
that's been tracked by https://bugs.launchpad.net/nova/+bug/1289064


#2 all those claim operations don't trigger an allocation request to the 
placement API, while the regular boot operation does (hence your bug 
report).





So the first question is: do we want to fix it in this release? If the
answer is yes, there is a concern that needs to be discussed.




I'd appreciate it if we could merge #1 first, before #2, because the
placement API decisions could be wrong if we decide to only allocate for
certain move operations.


In order to implement dropping the migration claim, the RT needs to
remove allocation records on a specific RP (the source or destination
compute node). But there isn't any API that can do that. The API for
removing allocation records is 'DELETE /allocations/{consumer_uuid}',
but it deletes all the allocation records for the consumer. So the
initial fix (https://review.openstack.org/#/c/369172/) adds a new API,
'DELETE /resource_providers/{rp_uuid}/allocations/{consumer_id}'. But
Chris Dent pointed out this goes against the original design: all the
allocations for a specific consumer can only be dropped together.


There is also a suggestion from Andrew: we can update all the
allocation records for the consumer each time. That means the RT would
build the original allocation records and the new allocation records
for the claim together, and put them into one API call. That API should
be 'PUT /allocations/{consumer_uuid}'. Unfortunately that API doesn't
replace all the allocation records for the consumer; it only adds the
new allocation records on top of the existing ones.


So which direction should we go here?

Thanks
Alex






[openstack-dev] [nova] About doing the migration claim with Placement API

2016-11-01 Thread Alex Xu
Currently we only update the resource usage with the Placement API in the
instance claim and the available-resource-update periodic task. But there
is no claim for migration with the placement API yet. This work is tracked
by https://bugs.launchpad.net/nova/+bug/1621709. In Newton, we only fixed
one bit which makes the resource update periodic task work correctly, so
that it will auto-heal everything. The migration claim part wasn't a goal
for the Newton release.

So the first question is: do we want to fix it in this release? If the
answer is yes, there is a concern that needs to be discussed.

In order to implement dropping the migration claim, the RT needs to remove
allocation records on a specific RP (the source or destination compute
node). But there isn't any API that can do that. The API for removing
allocation records is 'DELETE /allocations/{consumer_uuid}', but it deletes
all the allocation records for the consumer. So the initial fix
(https://review.openstack.org/#/c/369172/) adds a new API, 'DELETE
/resource_providers/{rp_uuid}/allocations/{consumer_id}'. But Chris Dent
pointed out this goes against the original design: all the allocations for
a specific consumer can only be dropped together.

There is also a suggestion from Andrew: we can update all the allocation
records for the consumer each time. That means the RT would build the
original allocation records and the new allocation records for the claim
together, and put them into one API call. That API should be 'PUT
/allocations/{consumer_uuid}'. Unfortunately that API doesn't replace all
the allocation records for the consumer; it only adds the new allocation
records on top of the existing ones.
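
As a rough illustration of the difference between the current "amend"
behaviour and the "replace" behaviour a move claim would need, with
allocations modelled as simple {(provider, resource_class): used} dicts
(illustrative only, not placement code):

def amend(existing, new):
    # Current behaviour: new records are merged on top of the existing ones.
    merged = dict(existing)
    merged.update(new)
    return merged

def replace(existing, new):
    # Behaviour a move claim needs: the new set fully replaces the old one.
    return dict(new)

existing = {("rpA", "VCPU"): 2, ("rpA", "MEMORY_MB"): 2048}
new = {("rpB", "VCPU"): 2, ("rpB", "MEMORY_MB"): 2048}

# With "amend" the stale rpA records survive and usage is double-counted;
# with "replace" only the rpB records remain.
assert ("rpA", "VCPU") in amend(existing, new)
assert ("rpA", "VCPU") not in replace(existing, new)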

So which direction should we go here?

Thanks
Alex