CERN has upgraded to Cells v2 and is performance-testing the scheduler. Some of the things they reported today brought us back to this bug [1], so I've started pushing some patches related to it, and also to an older blueprint I created [2]. In summary, we do quite a bit of DB work just to load a list of instance objects per host that the in-tree filters don't even use.

The first change [3] is a simple optimization to avoid the default joins on the instance_info_caches and security_groups tables. If you have out-of-tree filters that, for whatever reason, rely on the HostState.instances objects having info_cache or security_groups set, they'll continue to work, but they will have to round-trip to the DB to lazy-load those fields, which will be a performance penalty for that filter. See the change for details.
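
To make the lazy-load penalty concrete, here is a rough, self-contained sketch. It does not use nova at all: FakeInstance, its db_round_trips counter and count_round_trips are made-up stand-ins, and the "DB query" is just a counter increment. The only point is that a filter touching security_groups on every instance goes from zero extra queries (fields joined up front) to one per instance (fields lazy-loaded).

```python
class FakeInstance:
    """Toy stand-in for an instance record; NOT nova's Instance object."""

    db_round_trips = 0  # class-wide counter of simulated DB queries

    def __init__(self, uuid, preloaded=()):
        self.uuid = uuid
        # Fields listed in `preloaded` behave as if they were joined up front.
        self._loaded = {name: [] for name in preloaded}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name in ("info_cache", "security_groups"):
            loaded = self.__dict__["_loaded"]
            if name not in loaded:
                # Simulated lazy-load: one extra query per missing field.
                FakeInstance.db_round_trips += 1
                loaded[name] = []
            return loaded[name]
        raise AttributeError(name)


def count_round_trips(instances):
    """Mimic a filter that reads security_groups on every instance."""
    FakeInstance.db_round_trips = 0
    for inst in instances:
        _ = inst.security_groups
    return FakeInstance.db_round_trips


joined = [FakeInstance(str(i), preloaded=("info_cache", "security_groups"))
          for i in range(100)]
unjoined = [FakeInstance(str(i)) for i in range(100)]

print(count_round_trips(joined))    # 0
print(count_round_trips(unjoined))  # 100
```

With 100 instances on a host that is 100 extra simulated queries for one filter on one host, which is why an out-of-tree filter in this position would want to know about the change.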

The second change in the series [4] is more drastic: it does away with pulling the full Instance object per host, which means only a select set of optional fields can be lazy-loaded [5], and accessing any other field will result in an exception. The patch currently has a workaround config option to continue doing things the old way if you have out-of-tree filters that rely on this, but good citizens with only in-tree filters will get a performance improvement during scheduling.
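
As a rough illustration of what "only a select set of fields can be lazy-loaded" means for a filter, here is another toy sketch. Again, none of this is nova code: SlimInstance, FieldNotLoadable and the field names are invented for the example, and the real list of lazy-loadable fields is the one referenced in [5], not the placeholder set below.

```python
class FieldNotLoadable(Exception):
    """Hypothetical error type; the real exception may differ."""


class SlimInstance:
    """Toy stand-in for a trimmed-down instance record; NOT nova code."""

    # Placeholder name only; see [5] for the fields that really qualify.
    LAZY_LOADABLE = frozenset({"optional_field"})

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        fields = self.__dict__["_fields"]
        if name in fields:
            return fields[name]
        if name in self.LAZY_LOADABLE:
            # Simulated lazy-load from the DB.
            fields[name] = f"<{name} loaded from DB>"
            return fields[name]
        raise FieldNotLoadable(f"cannot lazy-load {name!r}")


inst = SlimInstance(uuid="abc")
print(inst.uuid)            # abc
print(inst.optional_field)  # <optional_field loaded from DB>
try:
    inst.other_field
except FieldNotLoadable as exc:
    print("filter would fail here:", exc)
```

An out-of-tree filter that reads something like other_field is the case the workaround config option is meant for.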

There are some other things we can do to optimize more of this flow, but this email is just about the ones that have patches up right now.





OpenStack-operators mailing list
