Re: [openstack-dev] [nova][scheduler] Anyone relying on the host_subset_size config option?

2017-05-26 Thread Rui Chen
Besides eliminating race conditions, we use host_subset_size in a special
case: deployments that mix hardware of different capacities. Imagine a
simple scenario with two compute hosts (48G vs. 16G of free RAM) and only
the RAM weigher enabled in nova-scheduler. If we launch 10 instances (1G
RAM flavor) one by one, all 10 will be launched on the 48G host, which is
not what we want. host_subset_size helps distribute the load across
randomly chosen available hosts in that situation.
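
To make the example concrete, here is a tiny illustrative Python sketch --
not nova-scheduler code, just the shape of the behaviour, using the made-up
numbers from above:

    import random

    def pick_host(free_ram, subset_size):
        # Rank hosts by free RAM (the RAM weigher), then pick randomly
        # from the top `subset_size` entries, as host_subset_size does.
        ranked = sorted(free_ram, key=free_ram.get, reverse=True)
        return random.choice(ranked[:subset_size])

    for subset_size in (1, 2):
        free = {"host-48g": 48, "host-16g": 16}
        placements = []
        for _ in range(10):              # ten 1G-RAM instances, one by one
            host = pick_host(free, subset_size)
            free[host] -= 1
            placements.append(host)
        print(subset_size, placements)
    # With subset_size=1 all ten land on host-48g; with subset_size=2 the
    # load is spread randomly across both hosts.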

Thank you for sending the mail to the operators list; that lets us gather
more feedback before making any changes.

2017-05-27 4:46 GMT+08:00 Ben Nemec :

>
>
> On 05/26/2017 12:17 PM, Edward Leafe wrote:
>
>> [resending to include the operators list]
>>
>> The host_subset_size configuration option was added to the scheduler to
>> help eliminate race conditions when two requests for a similar VM would be
>> processed close together, since the scheduler’s algorithm would select the
>> same host in both cases, leading to a race and a likely failure to build
>> for the second request. By randomly choosing from the top N hosts, the
>> likelihood of a race would be reduced, leading to fewer failed builds.
>>
>> Current changes in the scheduling process now have the scheduler claiming
>> the resources as soon as it selects a host. So in the case above with 2
>> similar requests close together, the first request will claim successfully,
>> but the second will fail *while still in the scheduler*. Upon failing the
>> claim, the scheduler will simply pick the next host in its weighed list
>> until it finds one that it can claim the resources from. So the
>> host_subset_size configuration option is no longer needed.
>>
>> However, we have heard that some operators are relying on this option to
>> help spread instances across their hosts, rather than using the RAM
>> weigher. My question is: will removing this randomness from the scheduling
>> process hurt any operators out there? Or can we safely remove that logic?
>>
>
> We used host_subset_size to schedule randomly in one of the TripleO CI
> clouds.  Essentially we had a heterogeneous set of hardware where the
> numerically larger (more RAM, more disk, equal CPU cores) systems were
> significantly slower.  This caused them to be preferred by the scheduler
> with a normal filter configuration, which is obviously not what we wanted.
> I'm not sure if there's a smarter way to handle it, but setting
> host_subset_size to the number of compute nodes and disabling basically all
> of the weighers allowed us to equally distribute load so at least the slow
> nodes weren't preferred.
>
> That said, we're migrating away from that frankencloud so I certainly
> wouldn't block any scheduler improvements on it.  I'm mostly chiming in to
> describe a possible use case.  And please feel free to point out if there's
> a better way to do this. :-)
>
>


Re: [openstack-dev] [nova][scheduler] Anyone relying on the host_subset_size config option?

2017-05-26 Thread Ben Nemec



On 05/26/2017 12:17 PM, Edward Leafe wrote:

[resending to include the operators list]

The host_subset_size configuration option was added to the scheduler to help 
eliminate race conditions when two requests for a similar VM would be processed 
close together, since the scheduler’s algorithm would select the same host in 
both cases, leading to a race and a likely failure to build for the second 
request. By randomly choosing from the top N hosts, the likelihood of a race 
would be reduced, leading to fewer failed builds.

Current changes in the scheduling process now have the scheduler claiming the 
resources as soon as it selects a host. So in the case above with 2 similar 
requests close together, the first request will claim successfully, but the 
second will fail *while still in the scheduler*. Upon failing the claim, the 
scheduler will simply pick the next host in its weighed list until it finds one 
that it can claim the resources from. So the host_subset_size configuration 
option is no longer needed.

However, we have heard that some operators are relying on this option to help 
spread instances across their hosts, rather than using the RAM weigher. My 
question is: will removing this randomness from the scheduling process hurt any 
operators out there? Or can we safely remove that logic?


We used host_subset_size to schedule randomly in one of the TripleO CI 
clouds.  Essentially we had a heterogeneous set of hardware where the 
numerically larger (more RAM, more disk, equal CPU cores) systems were 
significantly slower.  This caused them to be preferred by the scheduler 
with a normal filter configuration, which is obviously not what we 
wanted.  I'm not sure if there's a smarter way to handle it, but setting 
host_subset_size to the number of compute nodes and disabling basically 
all of the weighers allowed us to equally distribute load so at least 
the slow nodes weren't preferred.
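
(An illustrative sketch of why that works, with made-up host names: once
every weigher is disabled all hosts tie, so with host_subset_size equal to
the number of compute nodes the scheduler is effectively picking uniformly
at random.)

    import random
    from collections import Counter

    # Four big-but-slow nodes and four small-but-fast ones (names made up).
    hosts = (["big-slow-%d" % i for i in range(4)]
             + ["small-fast-%d" % i for i in range(4)])

    # All weighers disabled => every host has the same weight, so choosing
    # randomly from the top len(hosts) entries is just a uniform pick.
    placements = Counter(random.choice(hosts) for _ in range(800))
    print(placements)  # roughly 100 builds per host; slow nodes not preferred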


That said, we're migrating away from that frankencloud so I certainly 
wouldn't block any scheduler improvements on it.  I'm mostly chiming in 
to describe a possible use case.  And please feel free to point out if 
there's a better way to do this. :-)




Re: [openstack-dev] [nova][scheduler] Anyone relying on the host_subset_size config option?

2017-05-26 Thread Jay Pipes

On 05/26/2017 01:14 PM, Edward Leafe wrote:

The host_subset_size configuration option was added to the scheduler to help 
eliminate race conditions when two requests for a similar VM would be processed 
close together, since the scheduler’s algorithm would select the same host in 
both cases, leading to a race and a likely failure to build for the second 
request. By randomly choosing from the top N hosts, the likelihood of a race 
would be reduced, leading to fewer failed builds.

Current changes in the scheduling process now have the scheduler claiming the 
resources as soon as it selects a host. So in the case above with 2 similar 
requests close together, the first request will claim successfully, but the 
second will fail *while still in the scheduler*. Upon failing the claim, the 
scheduler will simply pick the next host in its weighed list until it finds one 
that it can claim the resources from. So the host_subset_size configuration 
option is no longer needed.

However, we have heard that some operators are relying on this option to help 
spread instances across their hosts, rather than using the RAM weigher. My 
question is: will removing this randomness from the scheduling process hurt any 
operators out there? Or can we safely remove that logic?


Actually, I don't believe this should be removed. The randomness that is 
injected into the placement decision using this configuration setting is 
useful for reducing contention even in the scheduler claim process.


When benchmarking claims in the scheduler here:

https://github.com/jaypipes/placement-bench

I found that the use of a "partitioning strategy" resulted in a dramatic 
reduction in lock contention in the claim process. The modulo and random 
partitioning strategies both seemed to work pretty well for reducing 
lock retries.
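
(For anyone who hasn't looked at the benchmark, a rough, paraphrased sketch
of what a partitioning strategy means here -- not code taken from
placement-bench: each request only considers a slice of the candidate
hosts, so concurrent claims rarely target the same rows.)

    import random

    def modulo_partition(hosts, worker_id, num_workers):
        # Deterministic slice: worker k only sees every num_workers-th host.
        return [h for i, h in enumerate(hosts) if i % num_workers == worker_id]

    def random_partition(hosts, partition_size):
        # Non-deterministic slice: each request samples a random subset.
        return random.sample(hosts, min(partition_size, len(hosts)))

    hosts = ["compute-%02d" % i for i in range(16)]
    print(modulo_partition(hosts, worker_id=1, num_workers=4))
    print(random_partition(hosts, partition_size=4))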


So, in short, I'd say keep it.

Best,
-jay



[openstack-dev] [nova][scheduler] Anyone relying on the host_subset_size config option?

2017-05-26 Thread Edward Leafe
[resending to include the operators list]

The host_subset_size configuration option was added to the scheduler to help 
eliminate race conditions when two requests for a similar VM would be processed 
close together, since the scheduler’s algorithm would select the same host in 
both cases, leading to a race and a likely failure to build for the second 
request. By randomly choosing from the top N hosts, the likelihood of a race 
would be reduced, leading to fewer failed builds.
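
(As a back-of-the-envelope illustration -- an assumption-laden sketch, not
scheduler code: two near-simultaneous requests that each pick uniformly
from the same top-N list land on the same host with probability 1/N.)

    import random

    def collision_rate(subset_size, trials=100000):
        # Chance that two concurrent requests pick the same host when each
        # chooses uniformly at random from the same top-N weighed hosts.
        same = sum(random.randrange(subset_size) == random.randrange(subset_size)
                   for _ in range(trials))
        return same / trials

    for n in (1, 2, 5, 10):
        print(n, round(collision_rate(n), 3))  # ~1.0, ~0.5, ~0.2, ~0.1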

Current changes in the scheduling process now have the scheduler claiming the 
resources as soon as it selects a host. So in the case above with 2 similar 
requests close together, the first request will claim successfully, but the 
second will fail *while still in the scheduler*. Upon failing the claim, the 
scheduler will simply pick the next host in its weighed list until it finds one 
that it can claim the resources from. So the host_subset_size configuration 
option is no longer needed.
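
(Roughly, the behaviour described above has the following shape -- a
simplified sketch, where claim_resources stands in for the real claim call
and is not an actual nova function.)

    def select_host(weighed_hosts, claim_resources, request):
        # weighed_hosts is already sorted best-first by the weighers.
        for host in weighed_hosts:
            if claim_resources(host, request):  # claim while still in the scheduler
                return host
            # Claim failed (another request got there first): try the next host.
        raise Exception("No valid host was found")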

However, we have heard that some operators are relying on this option to help 
spread instances across their hosts, rather than using the RAM weigher. My 
question is: will removing this randomness from the scheduling process hurt any 
operators out there? Or can we safely remove that logic?


-- Ed Leafe



