So an update on this. We dropped the neutron API/RPC workers from 60 each on our 3 neutron api servers to 10 workers each. Since that change the neutron timeouts have dropped to 0. Under icehouse we were able to run with 60 workers each without any issues.
____________________________________________

Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.
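
For anyone wanting to try the same change: the 60-to-10 drop maps to the standard worker options in neutron.conf. A minimal sketch, assuming the usual Juno-era option names (api_workers/rpc_workers) and default file path; verify against your release:

    # /etc/neutron/neutron.conf on each of the 3 neutron api servers
    [DEFAULT]
    # Was 60 each; 10 each made the read timeouts stop in our case.
    api_workers = 10
    rpc_workers = 10

neutron-server has to be restarted for the new worker counts to take effect.
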
On 2/20/15, 9:29 AM, "Kris G. Lindgren" <[email protected]> wrote:

>We have memcache enabled on the metadata servers. Part of our load is
>because we have a cron job that pulls the metadata and does some stuff
>on the server every ~10 minutes. We staggered the start times so that
>the requests are spread out over a period of time, and in general
>concurrent requests are relatively low, under 10 per second max.
>
>We have looked at file descriptor usage for the neutron processes as
>well as the database. We are not hitting a db deadlock, and per the
>slow query log the slowest query is ~1 second.
>
>We are using the following versions of the oslo components: messaging
>1.4.1, db 1.3.0, config 1.4.0, serialization 1.1.0, utils 1.1.0.
>
>In general we do not have any process using a high amount of CPU
>(aside from rabbit). We see load on multiple neutron processes, but it
>is usually under 7% per process and only 7-8 processes show up in the
>process list at a time.
>
>We increased the neutron timeout value to 120 seconds and we still get
>read timeouts. I am about to stand up a neutron api server locally on
>the metadata server, so that it talks specifically to that neutron
>instance, and see if that makes things better.
>____________________________________________
>
>Kris Lindgren
>Senior Linux Systems Engineer
>GoDaddy, LLC.
>
>
>
>On 2/20/15, 12:27 AM, "Robert van Leeuwen"
><[email protected]> wrote:
>
>>> After our icehouse -> juno upgrade we are noticing sporadic but
>>> frequent errors from nova-metadata when trying to serve metadata
>>> requests. The error is the following:
>>
>>> Is anyone else noticing this or frequent read timeouts when talking
>>> to neutron? Have you found a solution? What have you tried?
>>
>>If it is metadata agent load related:
>>Are you caching the metadata info? (configure memcached_servers in
>>nova.conf)
>>We noticed a huge benefit in metadata agent performance.
>>Within our cloud the biggest load is created by facter (puppet),
>>which queries the metadata agent.
>>This quickly adds up to quite a lot of requests to the metadata agent.
>>
>>Adding caching significantly decreases the load on all systems.
>>(Without caching we had a VERY high nova-conductor load.)
>>
>>Cheers,
>>Robert van Leeuwen
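
Robert's memcached_servers suggestion is a one-line change on the nova-metadata hosts. A minimal sketch, assuming Juno-era nova, a default config path, and memcached already running; the addresses below are placeholders:

    # /etc/nova/nova.conf on the nova-metadata hosts
    [DEFAULT]
    # Cache metadata lookups in memcached so repeated requests
    # (e.g. facter or cron jobs on every instance) do not hit
    # nova-conductor/neutron each time. Placeholder addresses.
    memcached_servers = 192.0.2.11:11211,192.0.2.12:11211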

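On the 120-second timeout Kris mentions: in Juno the knob for how long nova waits on neutron calls lives in nova.conf's [neutron] group. A sketch, assuming the Juno-era option name (url_timeout, 30-second default); check the option name against your release:

    # /etc/nova/nova.conf
    [neutron]
    # Bumped from the 30-second default; timeouts persisting even at
    # 120 seconds point at the neutron server side, not the client.
    url_timeout = 120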