Re: [openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?
On Wed, 2014-04-23 at 07:25 +0100, Mark McLoughlin wrote:
> On Tue, 2014-04-22 at 15:54 -0700, Devananda van der Veen wrote:
> > Hi! When a project is using oslo.messaging, how can we change our
> > default rpc_thread_pool_size?
> > [...]
>
> It may have been possible for Ironic to set its own default before
> oslo.messaging, but it wouldn't have been recommended because there's
> no explicit API for doing so. With oslo.messaging, we have
> set_transport_defaults(), which shows how we'd approach adding this
> capability.
>
> The question comes down to whether this really is a situation where
> we need per-application defaults or just that the current defaults
> are screwed up. If the latter, I'd much rather just change the
> defaults.

History is always useful :)

Soren added the threadpool with a default size of 1024:

  https://code.launchpad.net/~soren/nova/rpc-threadpool/+merge/49896

Johannes changed it back to 64:

  https://review.openstack.org/6792

Mark.
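For concreteness, here is a minimal sketch of the one explicit per-application defaults hook oslo.messaging exposes today, set_transport_defaults(), together with the rough shape a hypothetical executor-side analogue could take. The set_executor_defaults() function and the _executor_opts list below do not exist in oslo.messaging; they are illustrative only.

    from oslo.config import cfg
    from oslo import messaging

    # Real API: lets an application pick its own default control
    # exchange before any transport is created.
    messaging.set_transport_defaults(control_exchange='ironic')

    # Hypothetical analogue (NOT in oslo.messaging), sketched by
    # analogy. This opt list stands in for the one the eventlet
    # executor currently registers only on instantiation.
    _executor_opts = [
        cfg.IntOpt('rpc_thread_pool_size',
                   default=64,
                   help='Size of the RPC greenthread pool'),
    ]

    def set_executor_defaults(rpc_thread_pool_size=64):
        """Override executor option defaults for this application."""
        cfg.set_defaults(_executor_opts,
                         rpc_thread_pool_size=rpc_thread_pool_size)

    # Ironic could then declare its preferred default at import time:
    set_executor_defaults(rpc_thread_pool_size=4)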
Re: [openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?
Fwiw, we've seen this with nova-scheduler as well. I think the default
pool size is too large in general.

The problem that I've seen stems from the fact that DB calls all
block, so you can easily get a stack of 64 workers all waiting to do
DB calls. And it happens to work out such that none of the rpc pool
threads return until all of them have run their DB calls. This is
compounded by the explicit yield we have for every DB call in nova.
Anyway, this means that all of the workers are tied up for quite a
while. Since nova casts to the scheduler, it doesn't impact the API
much. But if you were waiting on an RPC response, you could be
waiting a while.

Ironic does a lot of RPC calls. I don't think we know the exact
behavior in Ironic, but I'm assuming it's something similar. If all
rpc pool threads are essentially stuck until roughly the same time,
you end up with API hangs. But we're also seeing delays in periodic
task runs: they must be getting stuck behind a lot of the rpc worker
threads, such that lowering the number of threads helps considerably.

Given that DB calls all block the process right now, there's really
not much advantage to a larger pool size. 64 is too much, IMO. It
would make more sense if there were more I/O that could be
parallelized.

That didn't answer your question. I've been meaning to ask the same
one since we discovered this. :)

- Chris

On Apr 22, 2014, at 3:54 PM, Devananda van der Veen
<devananda@gmail.com> wrote:

> Hi! When a project is using oslo.messaging, how can we change our
> default rpc_thread_pool_size?
> [...]
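To make the blocking-DB-call point concrete: a minimal sketch (standalone, not Nova code) showing that an eventlet green-thread pool gains nothing from its size when the work blocks without yielding to the hub. An unpatched time.sleep() stands in for a blocking C database driver.

    import time
    import eventlet

    pool = eventlet.GreenPool(size=64)

    def blocking_db_call(n):
        # Unpatched time.sleep never yields to the eventlet hub, just
        # like a blocking C database driver: it stalls the whole
        # process, not just this green thread.
        time.sleep(0.05)
        return n

    start = time.time()
    list(pool.imap(blocking_db_call, range(64)))
    # Expect roughly 64 * 0.05s = ~3.2s: despite the pool of 64, the
    # green threads ran strictly one after another.
    print('elapsed: %.1fs' % (time.time() - start))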
Re: [openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?
On Tue, 2014-04-22 at 15:54 -0700, Devananda van der Veen wrote:
> Hi! When a project is using oslo.messaging, how can we change our
> default rpc_thread_pool_size?
> [...]
> We're also about to switch from the RPC module in oslo-incubator to
> using the oslo.messaging library. Why are these related? Because it
> looks impossible for us to change the default for this option from
> within Ironic: the option is registered when EventletExecutor is
> instantiated (rather than when the module is loaded).
>
> https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_executors/impl_eventlet.py#L76

It may have been possible for Ironic to set its own default before
oslo.messaging, but it wouldn't have been recommended because there's
no explicit API for doing so. With oslo.messaging, we have
set_transport_defaults(), which shows how we'd approach adding this
capability.

The question comes down to whether this really is a situation where we
need per-application defaults or just that the current defaults are
screwed up. If the latter, I'd much rather just change the defaults.

Mark.
[openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?
Hi!

When a project is using oslo.messaging, how can we change our default
rpc_thread_pool_size?

---

Background

Ironic has hit a bug where a flood of API requests can deplete the RPC
worker pool on the other end and cause things to break in very bad
ways. Apparently, nova-conductor hit something similar a while back
too. There've been a few long discussions on IRC about it, tracked
partially here:

  https://bugs.launchpad.net/ironic/+bug/1308680

tldr; a way we can fix this is to set the rpc_thread_pool_size very
small (e.g., 4) and keep our conductor.worker_pool size near its
current value (e.g., 64). I'd like these to be the default option
values, rather than require every user to change the
rpc_thread_pool_size in their local ironic.conf file.

We're also about to switch from the RPC module in oslo-incubator to
using the oslo.messaging library. Why are these related? Because it
looks impossible for us to change the default for this option from
within Ironic, because the option is registered when EventletExecutor
is instantiated (rather than when the module is loaded).

https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_executors/impl_eventlet.py#L76

Thanks,
Devananda
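A minimal sketch of the chicken-and-egg problem described above, under the assumption that Ironic tries to override the default at startup: oslo.config refuses to set a default for an option that has not been registered yet, and the eventlet executor registers rpc_thread_pool_size only when it is instantiated.

    from oslo.config import cfg

    CONF = cfg.CONF

    try:
        # At application startup the EventletExecutor has not been
        # created yet, so 'rpc_thread_pool_size' is not registered...
        CONF.set_default('rpc_thread_pool_size', 4)
    except cfg.NoSuchOptError:
        # ...and oslo.config raises rather than accept a default for
        # an option it does not know about.
        print('too early: rpc_thread_pool_size is not registered yet')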