Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Mike Bayer
On 8/11/15 7:14 PM, Sachin Manpathak wrote: I am struggling with Python code profiling in general. It has its own caveats, like 100%-plus overhead. However, on a host with only Nova services (DB on a different host), I see CPU utilization spike up quickly with scale. The DB server is

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Dan Smith
If OTOH we are referring to the width of the columns and the join is such that you're going to get the same A identity over and over again, if you join A and B you get a wide row with all of A and B with a very large amount of redundant data sent over the wire again and again (note that the

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Dan Smith
In the past I've taken a different approach to problematic one-to-many relationships and have made the metadata a binary JSON blob. Is there some reason that won't work? We have done that for various pieces of data that were previously in system_metadata. Where this breaks down is if you need

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2015-08-13 11:03:32 +0800: On 8/12/15 10:29 PM, Clint Byrum wrote: Excerpts from Dan Smith's message of 2015-08-12 23:12:23 +0800: If OTOH we are referring to the width of the columns and the join is such that you're going to get the same A identity

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Clint Byrum
Excerpts from Dan Smith's message of 2015-08-12 23:12:23 +0800: If OTOH we are referring to the width of the columns and the join is such that you're going to get the same A identity over and over again, if you join A and B you get a wide row with all of A and B with a very large amount

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Mike Bayer
On 8/12/15 10:29 PM, Clint Byrum wrote: Excerpts from Dan Smith's message of 2015-08-12 23:12:23 +0800: If OTOH we are referring to the width of the columns and the join is such that you're going to get the same A identity over and over again, if you join A and B you get a wide row with all

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Mike Bayer
On 8/12/15 1:49 PM, Sachin Manpathak wrote: Thanks, this feedback was helpful. Perhaps my paraphrasing was misleading. I am not running OpenStack at scale in order to see how much the DB can sustain. My observation was that the host running Nova services saturates on CPU much earlier than

Re: [openstack-dev] In memory joins in Nova

2015-08-12 Thread Sachin Manpathak
Thanks, this feedback was helpful. Perhaps my paraphrasing was misleading. I am not running OpenStack at scale in order to see how much the DB can sustain. My observation was that the host running Nova services saturates on CPU much earlier than the DB does. Joins could be one of the reasons. I

Re: [openstack-dev] In memory joins in Nova

2015-08-11 Thread Sachin Manpathak
Here are a few -- instance_get_all_by_filters joins manually with instances_fill_metadata -- https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L1890 https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L1782 Almost all instance query functions manually

Re: [openstack-dev] In memory joins in Nova

2015-08-11 Thread Chris Friesen
Just curious... have you measured this consuming a significant amount of CPU time? Or is it more a gut feeling that "this looks like it might be expensive"? Chris On 08/11/2015 04:51 PM, Sachin Manpathak wrote: Here are a few -- instance_get_all_by_filters joins manually with

Re: [openstack-dev] In memory joins in Nova

2015-08-11 Thread Clint Byrum
Excerpts from Sachin Manpathak's message of 2015-08-12 05:40:36 +0800: Hi folks, The Nova codebase seems to follow a manual-joins model, where all data required by an API is fetched from multiple tables and then joined manually using (in most cases) Python dictionary lookups. I was wondering
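
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a manual (in-memory) join next to a database-side join. It is not Nova's code: the two-table schema, the sample rows, and the helper names manual_join and sql_join are all assumptions made up for illustration, and it uses the standard-library sqlite3 module rather than SQLAlchemy.

```python
import sqlite3

# Hypothetical two-table schema and data for illustration only;
# this is not Nova's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (uuid TEXT PRIMARY KEY, hostname TEXT);
    CREATE TABLE instance_metadata (instance_uuid TEXT, key TEXT, value TEXT);
    INSERT INTO instances VALUES ('a1', 'web-1'), ('b2', 'db-1');
    INSERT INTO instance_metadata VALUES
        ('a1', 'role', 'frontend'),
        ('a1', 'tier', 'gold'),
        ('b2', 'role', 'database');
""")


def manual_join(conn):
    """Query each table once, then attach metadata via dictionary lookups."""
    instances = {
        uuid: {"uuid": uuid, "hostname": hostname, "metadata": {}}
        for uuid, hostname in conn.execute("SELECT uuid, hostname FROM instances")
    }
    query = "SELECT instance_uuid, key, value FROM instance_metadata"
    for instance_uuid, key, value in conn.execute(query):
        # The "join" happens here, in Python, keyed on the instance uuid.
        if instance_uuid in instances:
            instances[instance_uuid]["metadata"][key] = value
    return instances


def sql_join(conn):
    """Let the database join; instance columns repeat once per metadata row."""
    rows = conn.execute(
        "SELECT i.uuid, i.hostname, m.key, m.value "
        "FROM instances i "
        "LEFT JOIN instance_metadata m ON m.instance_uuid = i.uuid"
    ).fetchall()
    instances = {}
    for uuid, hostname, key, value in rows:
        inst = instances.setdefault(
            uuid, {"uuid": uuid, "hostname": hostname, "metadata": {}}
        )
        if key is not None:  # LEFT JOIN yields NULLs for instances without metadata
            inst["metadata"][key] = value
    return instances, len(rows)
```

Both helpers build the same nested structure. The difference is where the work happens: the manual version issues two narrow queries and spends CPU in Python, while the SQL version returns one row per metadata entry and repeats the instance columns on every row.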

Re: [openstack-dev] In memory joins in Nova

2015-08-11 Thread Dan Smith
Here are a few -- instance_get_all_by_filters joins manually with instances_fill_metadata -- https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L1890 https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L1782 Almost all instance query functions

Re: [openstack-dev] In memory joins in Nova

2015-08-11 Thread Sachin Manpathak
I am struggling with Python code profiling in general. It has its own caveats, like 100%-plus overhead. However, on a host with only Nova services (DB on a different host), I see CPU utilization spike up quickly with scale. The DB server is relatively calm and never goes over 20%. On a system which