Re: [Openstack-operators] nova-placement-api tuning

2018-04-03 Thread Alex Schultz
On Tue, Apr 3, 2018 at 4:48 AM, Chris Dent  wrote:
> On Mon, 2 Apr 2018, Alex Schultz wrote:
>
>> So this is/was valid. A few years back there were some perf tests done
>> with various combinations of processes/threads, and for Keystone it was
>> determined that threads should be 1 while you adjust the process
>> count (hence the bug). Now I guess the question is, for every
>> service, what the optimal configuration is, but I'm not sure
>> anyone is looking at this upstream for all the services.  In
>> the puppet modules, for consistency, we applied a similar concept to
>> all the services when they are deployed under apache.  It can be tuned
>> as needed for each service, but I don't think we have any great
>> examples of perf numbers. It's really a YMMV thing. We ship a basic
>> default that isn't crazy, but it's probably not optimal either.
>
>
> Do you happen to recall if the trouble with keystone and threaded
> web servers had anything to do with eventlet? Support for the
> eventlet-based server was removed from keystone in Newton.
>

It was running under httpd I believe.

> I've been doing some experiments with placement using multiple uwsgi
> processes, each with multiple threads and it appears to be working
> very well. Ideally all the OpenStack HTTP-based services would be
> able to run effectively in that kind of setup. If they can't I'd
> like to help make it possible.
>
> In any case: processes 3, threads 1 for WSGIDaemonProcess for the
> placement service for a deployment of any real size errs on the
> side of too conservative and I hope we can make some adjustments
> there.
>

You'd say that until you realize that the deployment may also be
sharing the box with every other service API. Imagine keystone,
glance, nova, cinder, gnocchi, etc. all running on the same
machine. Then 3 isn't so conservative. They start adding up and
exhausting resources (cpu cores/memory) really quickly.  In a perfect
world, yes, each API service would get its own system with processes
== processor count, but in most cases the cores end up getting split
between the services running on the box.  In puppet we did a sliding
scale and have several facts[0] that can be used if a person doesn't
want to switch to $::processorcount.  If you're rolling your own you
can tune it more easily, but when you have to come up with something
that might be colocated with a bunch of other services you have to
hedge your bets to make sure it works most of the time.
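
To make that concrete, here's a purely illustrative sketch (not the actual
puppet defaults, and not the output of the os_workers fact linked below) of
how a 16-core controller shared by several WSGI services might be carved up:

 # each colocated API gets a modest slice of the 16 cores
 WSGIDaemonProcess nova-placement-api processes=4 threads=1 user=nova group=nova
 WSGIDaemonProcess keystone-public processes=4 threads=1 user=keystone group=keystone
 # ...and similar for glance, cinder, gnocchi, etc., rather than
 # processes=16 (processes == processor count) for every single service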

Thanks,
-Alex

[0] 
http://git.openstack.org/cgit/openstack/puppet-openstacklib/tree/lib/facter/os_workers.rb

>
> --
> Chris Dent   ٩◔̯◔۶   https://anticdent.org/
> freenode: cdent tw: @anticdent
>



Re: [Openstack-operators] nova-placement-api tuning

2018-04-03 Thread Jay Pipes

On 04/03/2018 06:48 AM, Chris Dent wrote:

On Mon, 2 Apr 2018, Alex Schultz wrote:


So this is/was valid. A few years back there were some perf tests done
with various combinations of processes/threads, and for Keystone it was
determined that threads should be 1 while you adjust the process
count (hence the bug). Now I guess the question is, for every
service, what the optimal configuration is, but I'm not sure
anyone is looking at this upstream for all the services.  In
the puppet modules, for consistency, we applied a similar concept to
all the services when they are deployed under apache.  It can be tuned
as needed for each service, but I don't think we have any great
examples of perf numbers. It's really a YMMV thing. We ship a basic
default that isn't crazy, but it's probably not optimal either.


Do you happen to recall if the trouble with keystone and threaded
web servers had anything to do with eventlet? Support for the
eventlet-based server was removed from keystone in Newton.


IIRC, it had something to do with the way the keystoneauth middleware 
interacted with memcache... not sure if this is still valid any more 
though. Probably worth re-checking the performance.


-jay



Re: [Openstack-operators] nova-placement-api tuning

2018-04-03 Thread Chris Dent

On Mon, 2 Apr 2018, Alex Schultz wrote:


So this is/was valid. A few years back there were some perf tests done
with various combinations of processes/threads, and for Keystone it was
determined that threads should be 1 while you adjust the process
count (hence the bug). Now I guess the question is, for every
service, what the optimal configuration is, but I'm not sure
anyone is looking at this upstream for all the services.  In
the puppet modules, for consistency, we applied a similar concept to
all the services when they are deployed under apache.  It can be tuned
as needed for each service, but I don't think we have any great
examples of perf numbers. It's really a YMMV thing. We ship a basic
default that isn't crazy, but it's probably not optimal either.


Do you happen to recall if the trouble with keystone and threaded
web servers had anything to do with eventlet? Support for the
eventlet-based server was removed from keystone in Newton.

I've been doing some experiments with placement using multiple uwsgi
processes, each with multiple threads, and it appears to be working
very well. Ideally all the OpenStack HTTP-based services would be
able to run effectively in that kind of setup. If they can't I'd
like to help make it possible.
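
For reference, a minimal sketch of that kind of uwsgi setup (the paths and
numbers here are assumptions for illustration, not the exact settings from
my experiments):

 [uwsgi]
 wsgi-file = /usr/bin/nova-placement-api
 master = true
 processes = 4
 threads = 10
 http-socket = :8778

If apache or nginx sits in front, point socket at a unix socket instead of
http-socket.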

In any case: processes 3, threads 1 for WSGIDaemonProcess for the
placement service for a deployment of any real size errs on the
side of too conservative and I hope we can make some adjustments
there.

--
Chris Dent   ٩◔̯◔۶   https://anticdent.org/
freenode: cdent tw: @anticdent


Re: [Openstack-operators] nova-placement-api tuning

2018-04-02 Thread Alex Schultz
On Fri, Mar 30, 2018 at 11:11 AM, iain MacDonnell wrote:
>
>
> On 03/29/2018 02:13 AM, Belmiro Moreira wrote:
>>
>> Some lessons so far...
>> - Scale keystone accordingly when enabling placement.
>
>
> Speaking of which, I suppose I have the same question for keystone
> (currently running under httpd also). I'm currently using threads=1, based
> on this (IIRC):
>
> https://bugs.launchpad.net/puppet-keystone/+bug/1602530
>
> but I'm not sure if that's valid?
>
> Between placement and ceilometer feeding gnocchi, keystone is kept very
> busy.
>
> Recommendations for processes/threads for keystone? And any other tuning
> hints... ?
>

So this is/was valid. A few years back there were some perf tests done
with various combinations of processes/threads, and for Keystone it was
determined that threads should be 1 while you adjust the process
count (hence the bug). Now I guess the question is, for every
service, what the optimal configuration is, but I'm not sure
anyone is looking at this upstream for all the services.  In
the puppet modules, for consistency, we applied a similar concept to
all the services when they are deployed under apache.  It can be tuned
as needed for each service, but I don't think we have any great
examples of perf numbers. It's really a YMMV thing. We ship a basic
default that isn't crazy, but it's probably not optimal either.

Thanks,
-Alex

> Thanks!
>
> ~iain
>
>
>



Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Chris Dent

On Thu, 29 Mar 2018, iain MacDonnell wrote:


If I'm reading

http://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html

right, it seems that the MPM is not pertinent when using WSGIDaemonProcess.


It doesn't impact the number of wsgi processes that will exist or how
they are configured, but it does control the flexibility with which
apache itself will scale to accept initial connections. That's not a
problem you're seeing yet at your scale, but it is an issue when the
number of compute nodes gets much bigger.
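
For anyone who does want to switch, a hedged example of what enabling the
event MPM and giving it some headroom might look like on Apache 2.4 (file
locations and numbers are assumptions; distros package this differently):

 # with mod_mpm_prefork disabled
 LoadModule mpm_event_module modules/mod_mpm_event.so

 <IfModule mpm_event_module>
     ServerLimit         8
     ThreadsPerChild     64
     MaxRequestWorkers   512
 </IfModule>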

--
Chris Dent   ٩◔̯◔۶   https://anticdent.org/
freenode: cdent tw: @anticdent


Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread iain MacDonnell



On 03/29/2018 04:24 AM, Chris Dent wrote:

On Thu, 29 Mar 2018, Belmiro Moreira wrote:

[lots of great advice snipped]


- Change apache mpm default from prefork to event/worker.
- Increase the WSGI number of processes/threads considering where placement
is running.


If I'm reading

http://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html

right, it seems that the MPM is not pertinent when using WSGIDaemonProcess.



Another option is to switch to nginx and uwsgi. In situations where
the web server is essentially operating as a proxy to another
process which is acting as the WSGI server, nginx has a history of being
very effective.


Evaluating adoption of uwsgi is on my to-do list ... not least because 
it'd enable restarting of services individually...


~iain




Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Matt Riedemann

On 3/29/2018 12:05 PM, Chris Dent wrote:
Other suggestions? I'm looking at things like turning off 
scheduler_tracks_instance_changes, since affinity scheduling is not 
needed (at least so far), but not sure that that will help with 
placement load (seems like it might, though?)


This won't impact the placement service itself.


It seemed like it might be causing the compute nodes to make calls to 
update allocations, so I was thinking it might reduce the load a bit, 
but I didn't confirm that. This was "clutching at straws" - hopefully 
I won't need to now.


There's duplication of instance state going to both placement and
the nova-scheduler. The number of calls from nova-compute to
placement reduces a bit as you upgrade to newer releases. It's
still more than we'd prefer.


As Chris said, scheduler_tracks_instance_changes doesn't have anything 
to do with Placement, and it will add more RPC load to your system 
because all computes are RPC casting to the scheduler for every instance 
create/delete/move operation, along with a periodic task that runs, by 
default, every minute on each compute service to sync things up.


The primary need for scheduler_tracks_instance_changes is the 
(anti-)affinity filters in the scheduler (and maybe if you're using the 
CachingScheduler). If you don't enable the (anti-)affinity filters (they 
are enabled by default), then you can disable 
scheduler_tracks_instance_changes.
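
For example, a hedged sketch of the relevant nova.conf bit (the option name
and section have moved around between releases, so check the config
reference for your release):

 [DEFAULT]
 # only safe if ServerGroupAffinityFilter / ServerGroupAntiAffinityFilter
 # are not in your enabled scheduler filters
 scheduler_tracks_instance_changes = False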


Note that you can still disable scheduler_tracks_instance_changes and 
run the affinity filters, but the scheduler will likely make poor 
decisions in a busy cloud which can result in reschedules, which are 
also expensive.


Long-term, we hope to remove the need for 
scheduler_tracks_instance_changes altogether because we should have all of 
the information we need about the instances in the Placement service, 
which is generally considered global to the deployment. However, we 
don't yet have a way to model affinity/distance in Placement, and that's 
what's holding us back from removing scheduler_tracks_instance_changes 
and the existing affinity filters.


--

Thanks,

Matt



Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Chris Dent

On Thu, 29 Mar 2018, iain MacDonnell wrote:


placement python stack and kicks out the 401. So this mostly
indicates that socket accept is taking forever.


Well, this test connects and gets a 400 immediately:

echo | nc -v apihost 8778

so I don't think it's at the socket level, but, I assume, the actual WSGI 
app, once the socket connection is established. I did try to choose a test 
that tickles the app, but doesn't "get too deep", as you say.


Sorry, I was being terribly non-specific. I meant, generically,
somewhere along the way from either the TCP socket that accepts
the initial http connection to 8778 or the unix domain socket
that is between apache2 and the wsgi daemon process. As you've
discerned, the TCP socket and apache2 are fine.

Good question. I could have sworn it was in the installation guide, but I 
can't find it now. It must have come from RDO, i.e.:


https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-placement-api.conf


Ooph. I'll see if I can find someone to talk to about that.

Right, that was my basic assessment too so now I'm trying to figure out 
how it should be tuned, but had not been able to find any guidelines, so 
thought of asking here. You've confirmed that I'm on the right track (or at 
least "a" right track).


The mod wsgi docs have a fair bit of stuff about tuning in them,
though it is mixed in amongst various things;
http://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html
might be a good starting point.

Other suggestions? I'm looking at things like turning off 
scheduler_tracks_instance_changes, since affinity scheduling is not needed 
(at least so far), but not sure that that will help with placement load 
(seems like it might, though?)


This won't impact the placement service itself.


It seemed like it might be causing the compute nodes to make calls to update 
allocations, so I was thinking it might reduce the load a bit, but I didn't 
confirm that. This was "clutching at straws" - hopefully I won't need to now.


There's duplication of instance state going to both placement and
the nova-scheduler. The number of calls from nova-compute to
placement reduces a bit as you upgrade to newer releases. It's
still more than we'd prefer.


--
Chris Dent   ٩◔̯◔۶   https://anticdent.org/
freenode: cdent tw: @anticdent


Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread iain MacDonnell



On 03/29/2018 01:19 AM, Chris Dent wrote:

On Wed, 28 Mar 2018, iain MacDonnell wrote:

Looking for recommendations on tuning of nova-placement-api. I have a 
few moderately-sized deployments (~200 nodes, ~4k instances), 
currently on Ocata, and instance creation is getting very slow as they 
fill up.


This should be well within the capabilities of an appropriately
installed placement service, so I reckon something is weird about
your installation. More within.


$ time curl http://apihost:8778/

{"error": {"message": "The request you have made requires 
authentication.", "code": 401, "title": "Unauthorized"}}

real    0m20.656s
user    0m0.003s
sys    0m0.001s


This is a good choice for trying to determine what's up because it
avoids any interaction with the database and most of the stack of
code: the web server answers, runs a very small percentage of the
placement python stack and kicks out the 401. So this mostly
indicates that socket accept is taking forever.


Well, this test connects and gets a 400 immediately:

echo | nc -v apihost 8778

so I don't think it's at the socket level, but, I assume, the actual 
WSGI app, once the socket connection is established. I did try to choose 
a test that tickles the app, but doesn't "get too deep", as you say.
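
FWIW, curl can also separate the connect time from the total time in one
shot, e.g. (apihost is the same placeholder as above):

 curl -o /dev/null -s -w 'connect: %{time_connect}s total: %{time_total}s\n' http://apihost:8778/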



nova-placement-api is running under mod_wsgi with the "standard"(?) 
config, i.e.:


Do you recall where this configuration comes from? The settings for
WSGIDaemonProcess are not very good and if there is some packaging
or documentation that is setting things this way it would be good to find 
it and fix it.


Good question. I could have sworn it was in the installation guide, but 
I can't find it now. It must have come from RDO, i.e.:


https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-placement-api.conf



Depending on what else is on the host running placement I'd boost
processes to number of cores divided by 2, 3 or 4 and boost threads to
around 25. Or you can leave 'threads' off and it will default to 15
(at least in recent versions of mod wsgi).

With the settings as below, you're basically saying that you want to
handle 3 connections at a time, which isn't great, since each of
your compute-nodes wants to talk to placement multiple times a
minute (even when nothing is happening).


Right, that was my basic assessment too, so now I'm trying to figure 
out how it should be tuned, but I hadn't been able to find any 
guidelines, so I thought of asking here. You've confirmed that I'm on the 
right track (or at least "a" right track).




Tweaking the number of processes versus the number of threads
depends on whether it appears that the processes are cpu or I/O
bound. More threads helps when things are I/O bound.


Interesting. Will keep that in mind. Thanks!


...
 WSGIProcessGroup nova-placement-api
 WSGIApplicationGroup %{GLOBAL}
 WSGIPassAuthorization On
 WSGIDaemonProcess nova-placement-api processes=3 threads=1 user=nova group=nova

 WSGIScriptAlias / /usr/bin/nova-placement-api
...


[snip]

Other suggestions? I'm looking at things like turning off 
scheduler_tracks_instance_changes, since affinity scheduling is not 
needed (at least so far), but not sure that that will help with 
placement load (seems like it might, though?)


This won't impact the placement service itself.


It seemed like it might be causing the compute nodes to make calls to 
update allocations, so I was thinking it might reduce the load a bit, 
but I didn't confirm that. This was "clutching at straws" - hopefully I 
won't need to now.




A while back I did some experiments with trying to overload
placement by using the fake virt driver in devstack and wrote it up
at  https://anticdent.org/placement-scale-fun.html


The gist was that with a properly tuned placement service it was
other parts of the system that suffered first.


Interesting. Thanks for sharing that!

~iain




Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Chris Dent

On Thu, 29 Mar 2018, Belmiro Moreira wrote:

[lots of great advice snipped]


- Change apache mpm default from prefork to event/worker.
- Increase the WSGI number of processes/threads considering where placement
is running.


Another option is to switch to nginx and uwsgi. In situations where
the web server is essentially operating as a proxy to another
process which is acting as the WSGI server, nginx has a history of being
very effective.
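
A hedged sketch of that shape of deployment (the port and socket path are
assumptions; adjust for however uwsgi is run):

 server {
     listen 8778;
     location / {
         include uwsgi_params;
         uwsgi_pass unix:/var/run/uwsgi/nova-placement-api.sock;
     }
 }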

--
Chris Dent   ٩◔̯◔۶   https://anticdent.org/
freenode: cdent tw: @anticdent


Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Belmiro Moreira
Hi,
with the Ocata upgrade we decided to run local placements (one service per
cellV1) because we were nervous about possible scalability issues, but
especially about the increase in scheduling time. Fortunately, this is now
being addressed with the placement-req-filter work.

We slowly started to aggregate our local placements into the central one
(required for cellsV2).
Currently we have >7000 compute nodes (>40k requests per minute) in this
central placement. Still ~2000 compute nodes to go.

Some lessons so far...
- Scale keystone accordingly when enabling placement.
- Don't forget to configure memcache for keystone_authtoken (see the sketch
after this list).
- Change apache mpm default from prefork to event/worker.
- Increase the WSGI number of processes/threads considering where placement
is running.
- Have enough placement nodes considering your number of requests.
- Monitor the request time. This impacts VM scheduling. Also, depending on how
it's configured, the LB can also start removing placement nodes.
- DB could be a bottleneck.
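
For the memcache point, a minimal sketch of what goes into each service's
[keystone_authtoken] section (the addresses are placeholders for your
memcached servers):

 [keystone_authtoken]
 memcached_servers = controller1:11211,controller2:11211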

We are still learning how to have a stable placement at scale.
It would be great if others could share their experiences.


Belmiro
CERN

On Thu, Mar 29, 2018 at 10:19 AM, Chris Dent  wrote:

> On Wed, 28 Mar 2018, iain MacDonnell wrote:
>
> Looking for recommendations on tuning of nova-placement-api. I have a few
>> moderately-sized deployments (~200 nodes, ~4k instances), currently on
>> Ocata, and instance creation is getting very slow as they fill up.
>>
>
> This should be well within the capabilities of an appropriately
> installed placement service, so I reckon something is weird about
> your installation. More within.
>
> $ time curl http://apihost:8778/
>> {"error": {"message": "The request you have made requires
>> authentication.", "code": 401, "title": "Unauthorized"}}
>> real    0m20.656s
>> user    0m0.003s
>> sys 0m0.001s
>>
>
> This is a good choice for trying to determine what's up because it
> avoids any interaction with the database and most of the stack of
> code: the web server answers, runs a very small percentage of the
> placement python stack and kicks out the 401. So this mostly
> indicates that socket accept is taking forever.
>
> nova-placement-api is running under mod_wsgi with the "standard"(?)
>> config, i.e.:
>>
>
> Do you recall where this configuration comes from? The settings for
> WSGIDaemonProcess are not very good and if there is some packaging
> or documentation that is setting things this way it would be good to find
> it and fix it.
>
> Depending on what else is on the host running placement I'd boost
> processes to number of cores divided by 2, 3 or 4 and boost threads to
> around 25. Or you can leave 'threads' off and it will default to 15
> (at least in recent versions of mod wsgi).
>
> With the settings as below, you're basically saying that you want to
> handle 3 connections at a time, which isn't great, since each of
> your compute-nodes wants to talk to placement multiple times a
> minute (even when nothing is happening).
>
> Tweaking the number of processes versus the number of threads
> depends on whether it appears that the processes are cpu or I/O
> bound. More threads helps when things are I/O bound.
>
> ...
>>  WSGIProcessGroup nova-placement-api
>>  WSGIApplicationGroup %{GLOBAL}
>>  WSGIPassAuthorization On
>>  WSGIDaemonProcess nova-placement-api processes=3 threads=1 user=nova
>> group=nova
>>  WSGIScriptAlias / /usr/bin/nova-placement-api
>> ...
>>
>
> [snip]
>
> Other suggestions? I'm looking at things like turning off
>> scheduler_tracks_instance_changes, since affinity scheduling is not
>> needed (at least so far), but not sure that that will help with placement
>> load (seems like it might, though?)
>>
>
> This won't impact the placement service itself.
>
> A while back I did some experiments with trying to overload
> placement by using the fake virt driver in devstack and wrote it up
> at https://anticdent.org/placement-scale-fun.html
>
> The gist was that with a properly tuned placement service it was
> other parts of the system that suffered first.
>
> --
> Chris Dent   ٩◔̯◔۶   https://anticdent.org/
> freenode: cdent tw: @anticdent


Re: [Openstack-operators] nova-placement-api tuning

2018-03-29 Thread Chris Dent

On Wed, 28 Mar 2018, iain MacDonnell wrote:

Looking for recommendations on tuning of nova-placement-api. I have a few 
moderately-sized deployments (~200 nodes, ~4k instances), currently on Ocata, 
and instance creation is getting very slow as they fill up.


This should be well within the capabilities of an appropriately
installed placement service, so I reckon something is weird about
your installation. More within.


$ time curl http://apihost:8778/
{"error": {"message": "The request you have made requires authentication.", 
"code": 401, "title": "Unauthorized"}}

real    0m20.656s
user    0m0.003s
sys 0m0.001s


This is a good choice for trying to determine what's up because it
avoids any interaction with the database and most of the stack of
code: the web server answers, runs a very small percentage of the
placement python stack and kicks out the 401. So this mostly
indicates that socket accept is taking forever.

nova-placement-api is running under mod_wsgi with the "standard"(?) config, 
i.e.:


Do you recall where this configuration comes from? The settings for
WSGIDaemonProcess are not very good and if there is some packaging
or documentation that is setting things this way it would be good to find
it and fix it.

Depending on what else is on the host running placement I'd boost
processes to number of cores divided by 2, 3 or 4 and boost threads to
around 25. Or you can leave 'threads' off and it will default to 15
(at least in recent versions of mod wsgi).

With the settings as below, you're basically saying that you want to
handle 3 connections at a time, which isn't great, since each of
your compute-nodes wants to talk to placement multiple times a
minute (even when nothing is happening).

Tweaking the number of processes versus the number of threads
depends on whether it appears that the processes are cpu or I/O
bound. More threads helps when things are I/O bound.
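
For example, on a 24-core controller that also hosts other services, that
works out to something in this neighbourhood (the numbers are assumptions to
adjust, not a one-size-fits-all recommendation):

 WSGIDaemonProcess nova-placement-api processes=8 threads=25 user=nova group=nova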


...
 WSGIProcessGroup nova-placement-api
 WSGIApplicationGroup %{GLOBAL}
 WSGIPassAuthorization On
 WSGIDaemonProcess nova-placement-api processes=3 threads=1 user=nova group=nova

 WSGIScriptAlias / /usr/bin/nova-placement-api
...


[snip]

Other suggestions? I'm looking at things like turning off 
scheduler_tracks_instance_changes, since affinity scheduling is not needed 
(at least so far), but not sure that that will help with placement load 
(seems like it might, though?)


This won't impact the placement service itself.

A while back I did some experiments with trying to overload
placement by using the fake virt driver in devstack and wrote it up
at https://anticdent.org/placement-scale-fun.html

The gist was that with a properly tuned placement service it was
other parts of the system that suffered first.

--
Chris Dent   ٩◔̯◔۶   https://anticdent.org/
freenode: cdent tw: @anticdent