Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-18 Thread Sean Dague
On 09/17/2014 11:50 PM, Clark Boylan wrote:
 On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:
 On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:


 On 9/17/2014 7:59 PM, Ian Wienand wrote:
 On 09/18/2014 09:49 AM, Clark Boylan wrote:
 Recent sampling of test run times shows that our tempest jobs run
 against clouds using PostgreSQL are significantly slower than jobs run
 against clouds using MySQL.

 FYI There is a possibly relevant review out for max_connections limits
 [1], although it seems to have some issues with shmem usage

 -i

 [1] https://review.openstack.org/#/c/121952/

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


 That's a backport of a fix from master where we were hitting fatal
 errors due to too many DB connections, which was brought on by the
 changes to cinder and glance to run as many workers as there were CPUs
 available.  So I don't think it plays a role here...

 The errors pointed out in another part of the thread have been around
 for a while. I think they come from negative tests that deliberately
 hit unique constraints, so they are expected.

 We should also note that the postgresql jobs run with the nova metadata
 API service; I'm not sure how much of a factor that would have here.

 Is there anything else unique about those jobs from the MySQL ones?

 Good question. There are apparently other differences. The postgres job
 runs Keystone under eventlet instead of via apache mod_wsgi. It also
 sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
 I can find is the one you point out, nova api metadata service is run as
 an independent thing.

 Could these things be related? It would be relatively simple to push a
 change or two to devstack-gate to test this but there are enough options
 here that I probably won't do that until we think at least one of these
 options is at fault.
 I am starting to feel bad that I picked on PostgreSQL and completely
 forgot that there were other items in play here. I went ahead and
 uploaded [0] to run all devstack jobs without keystone wsgi services
 (eventlet) and [1] to run all devstack jobs with keystone wsgi services
 and the initial results are pretty telling.
 
 It appears that keystone eventlet is the source of the slowness in this
 job. With keystone eventlet all of the devstack jobs are slower and with
 keystone wsgi all of the jobs are quicker. Probably need to collect a
 bit more data but this doesn't look good for keystone eventlet.
 
 Thank you Matt for pointing me in this direction.
 
 [0] https://review.openstack.org/#/c/122299/
 [1] https://review.openstack.org/#/c/122300/

Don't feel bad. :)

The point that Clark highlights here is a good one. There is an
assumption that once someone creates a job in infra, the magic elves are
responsible for it.

But there are no magic elves. So jobs like this need sponsors.

Maybe the right thing to do is not conflate this configuration and put
an eventlet version of the keystone job only on keystone (because the
keystone team was the one that proposed having a config like that, but
it's so far away from their project they aren't ever noticing when it's
regressing).

Same issue with the metadata server split. That's really only a thing
Nova cares about. It shouldn't impact anyone else.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-18 Thread Matt Riedemann



On 9/18/2014 5:49 AM, Sean Dague wrote:

On 09/17/2014 11:50 PM, Clark Boylan wrote:

On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:

On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:



On 9/17/2014 7:59 PM, Ian Wienand wrote:

On 09/18/2014 09:49 AM, Clark Boylan wrote:

Recent sampling of test run times shows that our tempest jobs run
against clouds using PostgreSQL are significantly slower than jobs run
against clouds using MySQL.


FYI There is a possibly relevant review out for max_connections limits
[1], although it seems to have some issues with shmem usage

-i

[1] https://review.openstack.org/#/c/121952/




That's a backport of a fix from master where we were hitting fatal
errors due to too many DB connections, which was brought on by the
changes to cinder and glance to run as many workers as there were CPUs
available.  So I don't think it plays a role here...

The errors pointed out in another part of the thread have been around
for a while. I think they come from negative tests that deliberately
hit unique constraints, so they are expected.

We should also note that the postgresql jobs run with the nova metadata
API service; I'm not sure how much of a factor that would have here.

Is there anything else unique about those jobs from the MySQL ones?


Good question. There are apparently other differences. The postgres job
runs Keystone under eventlet instead of via apache mod_wsgi. It also
sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
I can find is the one you point out, nova api metadata service is run as
an independent thing.

Could these things be related? It would be relatively simple to push a
change or two to devstack-gate to test this but there are enough options
here that I probably won't do that until we think at least one of these
options is at fault.

I am starting to feel bad that I picked on PostgreSQL and completely
forgot that there were other items in play here. I went ahead and
uploaded [0] to run all devstack jobs without keystone wsgi services
(eventlet) and [1] to run all devstack jobs with keystone wsgi services
and the initial results are pretty telling.

It appears that keystone eventlet is the source of the slowness in this
job. With keystone eventlet all of the devstack jobs are slower and with
keystone wsgi all of the jobs are quicker. Probably need to collect a
bit more data but this doesn't look good for keystone eventlet.

Thank you Matt for pointing me in this direction.

[0] https://review.openstack.org/#/c/122299/
[1] https://review.openstack.org/#/c/122300/


Don't feel bad. :)

The point that Clark highlights here is a good one. There is an
assumption that once someone creates a job in infra, the magic elves are
responsible for it.

But there are no magic elves. So jobs like this need sponsors.

Maybe the right thing to do is not conflate this configuration and put
an eventlet version of the keystone job only on keystone (because the
keystone team was the one that proposed having a config like that, but
it's so far away from their project they aren't ever noticing when it's
regressing).

Same issue with the metadata server split. That's really only a thing
Nova cares about. It shouldn't impact anyone else.

-Sean



Neutron cares about the nova metadata API service too, right?

--

Thanks,

Matt Riedemann




Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-18 Thread Matt Riedemann



On 9/18/2014 12:35 AM, Morgan Fainberg wrote:

-Original Message-
From: Dean Troyer dtro...@gmail.com
Reply: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Date: September 17, 2014 at 21:21:47
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Subject:  Re: [openstack-dev] PostgreSQL jobs slow in the gate



Clark Boylan wrote:


It appears that keystone eventlet is the source of the slowness in this
job. With keystone eventlet all of the devstack jobs are slower and with
keystone wsgi all of the jobs are quicker. Probably need to collect a
bit more data but this doesn't look good for keystone eventlet.




On Wed, Sep 17, 2014 at 11:02 PM, Morgan Fainberg 
morgan.fainb...@gmail.com wrote:


I've kicked off a test[1] as well to check into some tunable options
(eventlet workers) for improving keystone eventlet performance. I'll circle
back with the infra team once we have a little more data on both fronts.
The Keystone team will use this data to figure out the best way to approach
this issue.



Brant submitted https://review.openstack.org/#/c/121384/ to up the Keystone
workers when API_WORKERS is set.

I submitted https://review.openstack.org/#/c/122013/ to set a scaling
default for API_WORKERS based on the available CPUs ((nproc+1)/2). There
is a summary in that commit message of the current reviews addressing the
workers in various projects.

I think it has become clear that DevStack needs to set a default for most
services that are currently either too big (nproc) or too small (Keystone
at 1). Of course, moving things to mod_wsgi moots all of that, but it'll
be a while before everything moves.

dt

--

Dean Troyer
dtro...@gmail.com


Dean,

We should probably look at increasing the default number of workers as well in 
the Keystone configuration (rather than just devstack). It looks like, with 
limited datasets, we are seeing a real improvement with Keystone and 4 workers 
(from my previously mentioned quick test). Thanks for the extra data points. 
This helps to confirm the issue is Keystone under eventlet.

Cheers,
Morgan




I pointed this out in Brant's devstack patch, but we had a product team
internally bring up this same point in Icehouse. They were really
limited by the eventlet workers issue in Keystone, and once we
provided the option (backported) it increased their throughput by 20%.
We've been running with that in our internal Tempest runs (setting
workers equal to number of CPUs / 2) and so far so good.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Joe Gordon
Postgres is also logging a lot of errors:
http://logs.openstack.org/63/122263/1/check/check-tempest-dsvm-postgres-full/2f27252/logs/postgres.txt.gz

On Wed, Sep 17, 2014 at 4:49 PM, Clark Boylan cboy...@sapwetik.org wrote:

 Hello,

 Recent sampling of test run times shows that our tempest jobs run
 against clouds using PostgreSQL are significantly slower than jobs run
 against clouds using MySQL.

 (check|gate)-tempest-dsvm-full has an average run time of 52.9 minutes
 (stddev 5.92 minutes) over 516 runs.
 (check|gate)-tempest-dsvm-postgres-full has an average run time of 73.78
 minutes (stddev 11.01 minutes) over 493 runs.
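
For a sense of scale, the two summaries above can be compared directly. This is a minimal Python sketch, not part of the original measurements: the function name `compare_runs` is hypothetical, and the Welch-style standard error assumes the two sets of runs are independent samples, which the thread does not state.

```python
import math

def compare_runs(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Compare two job run-time summaries (all times in minutes)."""
    diff = mean_b - mean_a
    relative = diff / mean_a
    # Welch-style standard error; assumes the two samples are independent.
    std_err = math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return diff, relative, diff / std_err

# Figures quoted above: dsvm-full (MySQL) vs dsvm-postgres-full.
diff, relative, t_stat = compare_runs(52.9, 5.92, 516, 73.78, 11.01, 493)
print(f"difference: {diff:.2f} min ({relative:.1%} slower), t = {t_stat:.1f}")
```

The gap works out to roughly 21 minutes, or about 39% of the MySQL job's average, which is far larger than either standard deviation would explain by chance.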

 I think this is a bug, and an important one to solve prior to release
 if we want to continue to care for and feed PostgreSQL support. I
 haven't filed a bug in LP because I am not sure where the slowness is
 and creating a bug against all the projects is painful. (If there are
 suggestions for how to do this in a non-painful way I will happily go
 file a proper bug).

 Is there interest in fixing this? If not, we should probably consider
 removing these PostgreSQL jobs from the gate.


++ to getting someone to own and fix this or drop it from the gate.


 Note, a quick spot check indicates the increase in job time is not
 related to job setup. Total time before running tempest appears to be
 just over 18 minutes in the jobs I checked.

 Thank you,
 Clark



Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Joshua Harlow
Same seen @ 
http://logs.openstack.org/80/121280/32/check/check-tempest-dsvm-postgres-full/395a05d/logs/postgres.txt.gz

Although I'm not sure if those are abnormal or not (seems likely they wouldn't 
be).

Was there a postgres release?

On Sep 17, 2014, at 5:10 PM, Joe Gordon joe.gord...@gmail.com wrote:

 Postgres is also logging a lot of errors: 
 http://logs.openstack.org/63/122263/1/check/check-tempest-dsvm-postgres-full/2f27252/logs/postgres.txt.gz
 
 On Wed, Sep 17, 2014 at 4:49 PM, Clark Boylan cboy...@sapwetik.org wrote:
 Hello,
 
 Recent sampling of test run times shows that our tempest jobs run
 against clouds using PostgreSQL are significantly slower than jobs run
 against clouds using MySQL.
 
 (check|gate)-tempest-dsvm-full has an average run time of 52.9 minutes
 (stddev 5.92 minutes) over 516 runs.
 (check|gate)-tempest-dsvm-postgres-full has an average run time of 73.78
 minutes (stddev 11.01 minutes) over 493 runs.
 
 I think this is a bug, and an important one to solve prior to release
 if we want to continue to care for and feed PostgreSQL support. I
 haven't filed a bug in LP because I am not sure where the slowness is
 and creating a bug against all the projects is painful. (If there are
 suggestions for how to do this in a non-painful way I will happily go
 file a proper bug).

 Is there interest in fixing this? If not, we should probably consider
 removing these PostgreSQL jobs from the gate.
 
 
 ++ to getting someone to own and fix this or drop it from the gate.
  
 Note, a quick spot check indicates the increase in job time is not
 related to job setup. Total time before running tempest appears to be
 just over 18 minutes in the jobs I checked.
 
 Thank you,
 Clark
 


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Ian Wienand

On 09/18/2014 09:49 AM, Clark Boylan wrote:

Recent sampling of test run times shows that our tempest jobs run
against clouds using PostgreSQL are significantly slower than jobs run
against clouds using MySQL.


FYI There is a possibly relevant review out for max_connections limits
[1], although it seems to have some issues with shmem usage

-i

[1] https://review.openstack.org/#/c/121952/



Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Matt Riedemann



On 9/17/2014 7:59 PM, Ian Wienand wrote:

On 09/18/2014 09:49 AM, Clark Boylan wrote:

Recent sampling of test run times shows that our tempest jobs run
against clouds using PostgreSQL are significantly slower than jobs run
against clouds using MySQL.


FYI There is a possibly relevant review out for max_connections limits
[1], although it seems to have some issues with shmem usage

-i

[1] https://review.openstack.org/#/c/121952/




That's a backport of a fix from master where we were hitting fatal
errors due to too many DB connections, which was brought on by the
changes to cinder and glance to run as many workers as there were CPUs
available.  So I don't think it plays a role here...

The errors pointed out in another part of the thread have been around
for a while. I think they come from negative tests that deliberately
hit unique constraints, so they are expected.

We should also note that the postgresql jobs run with the nova metadata
API service; I'm not sure how much of a factor that would have here.


Is there anything else unique about those jobs from the MySQL ones?

--

Thanks,

Matt Riedemann




Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Clark Boylan
On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:
 
 
 On 9/17/2014 7:59 PM, Ian Wienand wrote:
  On 09/18/2014 09:49 AM, Clark Boylan wrote:
  Recent sampling of test run times shows that our tempest jobs run
  against clouds using PostgreSQL are significantly slower than jobs run
  against clouds using MySQL.
 
  FYI There is a possibly relevant review out for max_connections limits
  [1], although it seems to have some issues with shmem usage
 
  -i
 
  [1] https://review.openstack.org/#/c/121952/
 
 
 
 That's a backport of a fix from master where we were hitting fatal
 errors due to too many DB connections, which was brought on by the
 changes to cinder and glance to run as many workers as there were CPUs
 available.  So I don't think it plays a role here...

 The errors pointed out in another part of the thread have been around
 for a while. I think they come from negative tests that deliberately
 hit unique constraints, so they are expected.

 We should also note that the postgresql jobs run with the nova metadata
 API service; I'm not sure how much of a factor that would have here.
 
 Is there anything else unique about those jobs from the MySQL ones?

Good question. There are apparently other differences. The postgres job
runs Keystone under eventlet instead of via apache mod_wsgi. It also
sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
I can find is the one you point out, nova api metadata service is run as
an independent thing.

Could these things be related? It would be relatively simple to push a
change or two to devstack-gate to test this but there are enough options
here that I probably won't do that until we think at least one of these
options is at fault.
 
 -- 
 
 Thanks,
 
 Matt Riedemann
 
 


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Clark Boylan
On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:
 On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:
  
  
  On 9/17/2014 7:59 PM, Ian Wienand wrote:
   On 09/18/2014 09:49 AM, Clark Boylan wrote:
   Recent sampling of test run times shows that our tempest jobs run
   against clouds using PostgreSQL are significantly slower than jobs run
   against clouds using MySQL.
  
   FYI There is a possibly relevant review out for max_connections limits
   [1], although it seems to have some issues with shmem usage
  
   -i
  
   [1] https://review.openstack.org/#/c/121952/
  
  
  
  That's a backport of a fix from master where we were hitting fatal
  errors due to too many DB connections, which was brought on by the
  changes to cinder and glance to run as many workers as there were CPUs
  available.  So I don't think it plays a role here...

  The errors pointed out in another part of the thread have been around
  for a while. I think they come from negative tests that deliberately
  hit unique constraints, so they are expected.

  We should also note that the postgresql jobs run with the nova metadata
  API service; I'm not sure how much of a factor that would have here.
  
  Is there anything else unique about those jobs from the MySQL ones?
 
 Good question. There are apparently other differences. The postgres job
 runs Keystone under eventlet instead of via apache mod_wsgi. It also
 sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
 I can find is the one you point out, nova api metadata service is run as
 an independent thing.
 
 Could these things be related? It would be relatively simple to push a
 change or two to devstack-gate to test this but there are enough options
 here that I probably won't do that until we think at least one of these
 options is at fault.
I am starting to feel bad that I picked on PostgreSQL and completely
forgot that there were other items in play here. I went ahead and
uploaded [0] to run all devstack jobs without keystone wsgi services
(eventlet) and [1] to run all devstack jobs with keystone wsgi services
and the initial results are pretty telling.

It appears that keystone eventlet is the source of the slowness in this
job. With keystone eventlet all of the devstack jobs are slower and with
keystone wsgi all of the jobs are quicker. Probably need to collect a
bit more data but this doesn't look good for keystone eventlet.

Thank you Matt for pointing me in this direction.

[0] https://review.openstack.org/#/c/122299/
[1] https://review.openstack.org/#/c/122300/
  
  -- 
  
  Thanks,
  
  Matt Riedemann
  
  


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Morgan Fainberg
Clark Boylan cboy...@sapwetik.org wrote:

 On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:
  On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:
  
  
   On 9/17/2014 7:59 PM, Ian Wienand wrote:
    On 09/18/2014 09:49 AM, Clark Boylan wrote:
    Recent sampling of test run times shows that our tempest jobs run
    against clouds using PostgreSQL are significantly slower than jobs run
    against clouds using MySQL.

    FYI There is a possibly relevant review out for max_connections limits
    [1], although it seems to have some issues with shmem usage

    -i

    [1] https://review.openstack.org/#/c/121952/
   
   
  
   That's a backport of a fix from master where we were hitting fatal
   errors due to too many DB connections, which was brought on by the
   changes to cinder and glance to run as many workers as there were CPUs
   available.  So I don't think it plays a role here...

   The errors pointed out in another part of the thread have been around
   for a while. I think they come from negative tests that deliberately
   hit unique constraints, so they are expected.

   We should also note that the postgresql jobs run with the nova metadata
   API service; I'm not sure how much of a factor that would have here.
  
   Is there anything else unique about those jobs from the MySQL ones?
  
  Good question. There are apparently other differences. The postgres job
  runs Keystone under eventlet instead of via apache mod_wsgi. It also
  sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
  I can find is the one you point out, nova api metadata service is run as
  an independent thing.
 
  Could these things be related? It would be relatively simple to push a
  change or two to devstack-gate to test this but there are enough options
  here that I probably won't do that until we think at least one of these
  options is at fault.
 I am starting to feel bad that I picked on PostgreSQL and completely
 forgot that there were other items in play here. I went ahead and
 uploaded [0] to run all devstack jobs without keystone wsgi services
 (eventlet) and [1] to run all devstack jobs with keystone wsgi services
 and the initial results are pretty telling.

 It appears that keystone eventlet is the source of the slowness in this
 job. With keystone eventlet all of the devstack jobs are slower and with
 keystone wsgi all of the jobs are quicker. Probably need to collect a
 bit more data but this doesn't look good for keystone eventlet.

 Thank you Matt for pointing me in this direction.

 [0] https://review.openstack.org/#/c/122299/
 [1] https://review.openstack.org/#/c/122300/
  
   --
  
   Thanks,
  
   Matt Riedemann
  


I've kicked off a test[1] as well to check into some tunable options
(eventlet workers) for improving keystone eventlet performance. I'll circle
back with the infra team once we have a little more data on both fronts.
The Keystone team will use this data to figure out the best way to approach
this issue.

--Morgan

[1] https://review.openstack.org/#/c/122308/


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Dean Troyer

 Clark Boylan cboy...@sapwetik.org wrote:

 It appears that keystone eventlet is the source of the slowness in this
 job. With keystone eventlet all of the devstack jobs are slower and with
 keystone wsgi all of the jobs are quicker. Probably need to collect a
 bit more data but this doesn't look good for keystone eventlet.


 On Wed, Sep 17, 2014 at 11:02 PM, Morgan Fainberg 
morgan.fainb...@gmail.com wrote:

 I've kicked off a test[1] as well to check into some tunable options
 (eventlet workers) for improving keystone eventlet performance. I'll circle
 back with the infra team once we have a little more data on both fronts.
 The Keystone team will use this data to figure out the best way to approach
 this issue.


Brant submitted https://review.openstack.org/#/c/121384/ to up the Keystone
workers when API_WORKERS is set.

I submitted https://review.openstack.org/#/c/122013/ to set a scaling
default for API_WORKERS based on the available CPUs ((nproc+1)/2).  There
is a summary in that commit message of the current reviews addressing the
workers in various projects.

I think it has become clear that DevStack needs to set a default for most
services that are currently either too big (nproc) or too small (Keystone
at 1).  Of course, moving things to mod_wsgi moots all of that, but it'll
be a while before everything moves.
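
As a rough illustration of the (nproc+1)/2 default mentioned above, here is a minimal Python sketch. It is not DevStack's implementation (DevStack is written in shell); the function name `default_api_workers` is hypothetical, and the use of integer division is an assumption about how the expression is evaluated.

```python
import os

def default_api_workers(nproc):
    """Scaling default from the thread: (nproc + 1) / 2, with integer division."""
    return (nproc + 1) // 2

# os.cpu_count() may return None when the CPU count cannot be determined.
cpus = os.cpu_count() or 1
print(f"{cpus} CPUs -> {default_api_workers(cpus)} API workers")

# Sample machine sizes, for comparison with the "too big" (nproc) and
# "too small" (1) defaults described above.
for n in (1, 2, 4, 8):
    print(n, default_api_workers(n))
```

Under that assumption a single-CPU node still gets one worker, while an 8-CPU node gets 4 rather than 8.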

dt

-- 

Dean Troyer
dtro...@gmail.com


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Morgan Fainberg
-Original Message-
From: Dean Troyer dtro...@gmail.com
Reply: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Date: September 17, 2014 at 21:21:47
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Subject:  Re: [openstack-dev] PostgreSQL jobs slow in the gate

 
  Clark Boylan wrote:
 
  It appears that keystone eventlet is the source of the slowness in this
  job. With keystone eventlet all of the devstack jobs are slower and with
  keystone wsgi all of the jobs are quicker. Probably need to collect a
  bit more data but this doesn't look good for keystone eventlet.
 
 
 On Wed, Sep 17, 2014 at 11:02 PM, Morgan Fainberg 
 morgan.fainb...@gmail.com wrote:
  
  I've kicked off a test[1] as well to check into some tunable options
  (eventlet workers) for improving keystone eventlet performance. I'll circle
  back with the infra team once we have a little more data on both fronts.
  The Keystone team will use this data to figure out the best way to approach
  this issue.
 
  
 Brant submitted https://review.openstack.org/#/c/121384/ to up the Keystone
 workers when API_WORKERS is set.
  
 I submitted https://review.openstack.org/#/c/122013/ to set a scaling
 default for API_WORKERS based on the available CPUs ((nproc+1)/2). There
 is a summary in that commit message of the current reviews addressing the
 workers in various projects.
  
 I think it has become clear that DevStack needs to set a default for most
 services that are currently either too big (nproc) or too small (Keystone
 at 1). Of course, moving things to mod_wsgi moots all of that, but it'll
 be a while before everything moves.
  
 dt
  
 --
  
 Dean Troyer
 dtro...@gmail.com

Dean,

We should probably look at increasing the default number of workers as well in 
the Keystone configuration (rather than just devstack). It looks like, with 
limited datasets, we are seeing a real improvement with Keystone and 4 workers 
(from my previously mentioned quick test). Thanks for the extra data points. 
This helps to confirm the issue is Keystone under eventlet.

Cheers,
Morgan 
