Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
I agree that this will have a benefit, but how much benefit there is may depend heavily on the type of instance created. If most instances are normal instances without any special requirements, we will see no benefit at all.

Thanks
--jyh

-----Original Message-----
From: Russell Bryant [mailto:rbry...@redhat.com]
Sent: Sunday, November 03, 2013 12:12 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

On 11/01/2013 06:39 AM, Jiang, Yunhong wrote:
> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
> host_passes(). Some will even access the DB on each invocation. Just
> curious: is this considered a performance issue? With a cloud of 10k
> nodes, 60 VMs per node, and a 3-hour VM life cycle, it will generate
> more than 1 million DB accesses per second. Not a small number, IMHO.

On a somewhat related note, here's an idea that would be pretty easy to implement. What if we added some optional metadata to scheduler filters to let them indicate where in the order of filters they should run? The filters you're talking about here we would probably want to run last. Other filters that could potentially efficiently eliminate a large number of hosts should be run first.

--
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
On 11/01/2013 06:39 AM, Jiang, Yunhong wrote:
> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
> host_passes(). Some will even access the DB on each invocation. Just
> curious: is this considered a performance issue? With a cloud of 10k
> nodes, 60 VMs per node, and a 3-hour VM life cycle, it will generate
> more than 1 million DB accesses per second. Not a small number, IMHO.

On a somewhat related note, here's an idea that would be pretty easy to implement. What if we added some optional metadata to scheduler filters to let them indicate where in the order of filters they should run? The filters you're talking about here we would probably want to run last. Other filters that could potentially efficiently eliminate a large number of hosts should be run first.

--
Russell Bryant
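The ordering-metadata idea could be sketched roughly as follows. This is a hypothetical illustration, not the actual Nova filter API: the `run_filter_order` attribute, the class names, and the dict-based host state are all made up for the example.

```python
# Hypothetical sketch: filters declare an optional ordering hint, and the
# filter handler sorts by it, so cheap checks that eliminate many hosts run
# first and DB-touching filters run last over a smaller host set.

class BaseHostFilter:
    # Lower values run earlier. Filters that hit the DB should pick a high
    # value so they only see hosts that survived the cheap checks.
    run_filter_order = 50

    def host_passes(self, host_state, filter_properties):
        raise NotImplementedError


class RamFilter(BaseHostFilter):
    run_filter_order = 10  # cheap in-memory check, run early

    def host_passes(self, host_state, filter_properties):
        requested = filter_properties.get('ram_mb', 0)
        return host_state['free_ram_mb'] >= requested


class AggregateInstanceExtraSpecsFilter(BaseHostFilter):
    run_filter_order = 90  # would touch the DB, so run last

    def host_passes(self, host_state, filter_properties):
        # A real filter would look up aggregate metadata here; stubbed out.
        return True


def get_filtered_hosts(hosts, filters, filter_properties):
    # Sort once by the ordering hint, then apply filters in sequence.
    for f in sorted(filters, key=lambda f: f.run_filter_order):
        hosts = [h for h in hosts if f.host_passes(h, filter_properties)]
    return hosts
```

With this shape, a DB-backed filter configured after a RAM filter would be invoked once per surviving host rather than once per host in the cloud.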
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
On 11/03/2013 03:12 PM, Russell Bryant wrote:
> On 11/01/2013 06:39 AM, Jiang, Yunhong wrote:
>> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
>> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
>> host_passes(). [snip]
>
> On a somewhat related note, here's an idea that would be pretty easy to
> implement. What if we added some optional metadata to scheduler filters
> to let them indicate where in the order of filters they should run?
> [snip]

https://review.openstack.org/55072

--
Russell Bryant
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
It's intentional. Cells is there to split up your nodes into more manageable chunks.

There are quite a few design summit sessions looking into alternative approaches to our current scheduler. While I would love a single scheduler that makes everyone happy, I am thinking we might end up with several schedulers, each with slightly different properties, and you pick one depending on what you want to do with your cloud.

John

On 31 October 2013 22:39, Jiang, Yunhong yunhong.ji...@intel.com wrote:
> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
> host_passes(). Some will even access the DB on each invocation. Just
> curious: is this considered a performance issue? With a cloud of 10k
> nodes, 60 VMs per node, and a 3-hour VM life cycle, it will generate
> more than 1 million DB accesses per second. Not a small number, IMHO.
>
> Thanks
> --jyh
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
On Nov 1, 2013 10:20 AM, John Garbutt j...@johngarbutt.com wrote:
> It's intentional. Cells is there to split up your nodes into more
> manageable chunks.
>
> There are quite a few design summit sessions looking into alternative
> approaches to our current scheduler. While I would love a single
> scheduler that makes everyone happy, I am thinking we might end up with
> several schedulers, each with slightly different properties, and you
> pick one depending on what you want to do with your cloud.

Agreed.

> On 31 October 2013 22:39, Jiang, Yunhong yunhong.ji...@intel.com wrote:
>> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
>> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
>> host_passes(). Some will even access the DB on each invocation.

As you noticed, not all filters make sense for a large system.

>> Just curious: is this considered a performance issue? With a cloud of
>> 10k nodes, 60 VMs per node, and a 3-hour VM life cycle, it will
>> generate more than 1 million DB accesses per second. Not a small
>> number, IMHO.
>>
>> Thanks
>> --jyh
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
On 11/01/13 at 10:16am, John Garbutt wrote:
> It's intentional. Cells is there to split up your nodes into more
> manageable chunks.

I don't think you mean to say that there's intentionally a performance issue. But yes, there are performance issues with the filter scheduler. Because I work on a deployment that uses cells to partition the workload, I haven't seen them myself, but there are plenty of reports from others who have encountered them. And it's easy to run some back-of-the-napkin calculations like the one below and see that scheduling will require a lot of resources if there's no partitioning.

> There are quite a few design summit sessions looking into alternative
> approaches to our current scheduler. While I would love a single
> scheduler that makes everyone happy, I am thinking we might end up with
> several schedulers, each with slightly different properties, and you
> pick one depending on what you want to do with your cloud.

+1. We have the ability to drop in different schedulers right now, but there's only one really useful scheduler in the tree.

There has been talk of making a more performant scheduler which schedules in a 'good enough' fashion through some approximation algorithm. I would love to see that get introduced as another scheduler and not as a rework of the filter scheduler. I suppose the chance scheduler could technically count for that, but I'm under the impression that it isn't used beyond testing.

> John
>
> On 31 October 2013 22:39, Jiang, Yunhong yunhong.ji...@intel.com wrote:
>> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
>> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
>> host_passes(). Some will even access the DB on each invocation. Just
>> curious: is this considered a performance issue? With a cloud of 10k
>> nodes, 60 VMs per node, and a 3-hour VM life cycle, it will generate
>> more than 1 million DB accesses per second. Not a small number, IMHO.
>>
>> Thanks
>> --jyh
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
----- Original Message -----
From: Yunhong Jiang yunhong.ji...@intel.com
To: openstack-dev@lists.openstack.org
Sent: Thursday, October 31, 2013 6:39:29 PM
Subject: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

> I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
> type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
> host_passes(). Some will even access the DB on each invocation. Just
> curious: is this considered a performance issue? With a cloud of 10k
> nodes, 60 VMs per node, and a 3-hour VM life cycle, it will generate
> more than 1 million DB accesses per second. Not a small number, IMHO.
>
> Thanks
> --jyh

Sorry if I'm dumb, but please try to explain things to me. I don't think I follow...

10k nodes at 60 VMs per node is 600k VMs in the whole cloud. A 3-hour life cycle for a VM means every hour 1/3 of the VMs turn over, so 200k VMs are created/deleted per hour ... divide by 60 for ... 3,333.333 per minute ... divide by 60 again for ... 55.5 VM creations/deletions per second ...

... did I do that math right? So where does the million DB accesses per second come from? Are the rules fired for every VM on every access, so that 600k VMs + 1 new VM means the rules fire 600k + 1 times? What? Sorry... really confused.

# Shawn Hartsock
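The back-of-the-napkin math above, restated as a quick check. All inputs are the thread's assumptions (10k nodes, 60 VMs/node, 3-hour VM life cycle), not measurements:

```python
# Restating Shawn's arithmetic step by step.
nodes = 10_000
vms_per_node = 60
lifecycle_hours = 3

total_vms = nodes * vms_per_node        # 600,000 VMs in the cloud
per_hour = total_vms / lifecycle_hours  # 200,000 creates/deletes per hour
per_minute = per_hour / 60              # ~3,333.3 per minute
per_second = per_minute / 60            # ~55.5 scheduling requests per second

print(round(per_second, 1))
```

So the request rate really is around 55 per second; the larger figure debated in the thread comes from multiplying by per-host filter invocations, not from the request rate itself.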
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
On 11/01/2013 09:09 AM, Andrew Laski wrote:
> On 11/01/13 at 10:16am, John Garbutt wrote:
>> It's intentional. Cells is there to split up your nodes into more
>> manageable chunks.
>
> I don't think you mean to say that there's intentionally a performance
> issue. But yes, there are performance issues with the filter scheduler.
> [snip]
>
> +1. We have the ability to drop in different schedulers right now, but
> there's only one really useful scheduler in the tree.
>
> There has been talk of making a more performant scheduler which
> schedules in a 'good enough' fashion through some approximation
> algorithm. I would love to see that get introduced as another scheduler
> and not as a rework of the filter scheduler. I suppose the chance
> scheduler could technically count for that, but I'm under the
> impression that it isn't used beyond testing.

Agreed. There's a lot of discussion happening in two different directions, it seems. One group is very interested in improving the scheduler's ability to make the best decision possible using various policies. Another group is concerned with massive scale and is willing to accept "good enough" scheduling to get there.

I think the filter scheduler is pretty reasonable for the "best possible decision" approach today. There's some stuff that could perform better. There are more policy knobs that could be added. There's the cross-service issue to figure out ... but it's not bad.

I'm very interested in a new "good enough" scheduler. I liked the idea of running a bunch of schedulers that each only look at a subset of your infrastructure and pick something that's good enough. I'm interested to hear other ideas in the session we have on this topic (rethinking scheduler design).

Of course, you get a lot of the massive-scale benefits by going to cells, too. If cells is our answer here, I really want to see more people stepping up to help with the cells code. There are still some feature gaps to fill. We should also be looking at the road to getting back to only having one way to deploy nova (cells). Having both cells and non-cells options really isn't ideal long term.

--
Russell Bryant
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
As Shawn Hartsock pointed out in his reply, I made a silly error in the calculation. It's in fact about 55 accesses per second, not the big number I calculated. I thought I graduated from elementary school, but it seems I was wrong. Really sorry for the silly error.

--jyh

-----Original Message-----
From: Russell Bryant [mailto:rbry...@redhat.com]
Sent: Friday, November 01, 2013 9:18 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

> Agreed. There's a lot of discussion happening in two different
> directions, it seems. One group is very interested in improving the
> scheduler's ability to make the best decision possible using various
> policies. Another group is concerned with massive scale and is willing
> to accept "good enough" scheduling to get there.
> [snip]
>
> --
> Russell Bryant
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
Yes, you are right .. :(

-----Original Message-----
From: Shawn Hartsock [mailto:hartso...@vmware.com]
Sent: Friday, November 01, 2013 8:20 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

> 10k nodes at 60 VMs per node is 600k VMs in the whole cloud. A 3-hour
> life cycle for a VM means every hour 1/3 of the VMs turn over, so 200k
> VMs are created/deleted per hour ... divide by 60 for ... 3,333.333 per
> minute ... divide by 60 again for ... 55.5 VM creations/deletions per
> second ...
>
> ... did I do that math right? So where does the million DB accesses per
> second come from?
> [snip]
>
> # Shawn Hartsock
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
On 11/01/2013 11:42 AM, Jiang, Yunhong wrote:
> Shawn, yes, there are 56 VM accesses every second, and for each VM
> access the scheduler will invoke the filter for each host. That means
> for each VM access the filter function will be invoked 10k times. So
> 56 * 10k = 560k, yes, half of 1M, but still a big number.

I'm fairly new to OpenStack so I may have missed earlier discussions, but has anyone looked at building a scheduler filter that would use database queries over sets of hosts rather than looping over each host and doing the logic in Python? Seems like that would be a lot more efficient...

Chris
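Chris's suggestion, sketched with an in-memory SQLite database. The `compute_nodes` table and its columns here are made up for illustration; this is not the real Nova schema, just a demonstration of letting the database do the set intersection instead of a per-host Python loop:

```python
# Hypothetical sketch: filter the host set in the database with one query
# instead of invoking host_passes() once per host in Python.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE compute_nodes (
        host TEXT PRIMARY KEY,
        free_ram_mb INTEGER,
        vcpus_free INTEGER
    )
""")
conn.executemany(
    "INSERT INTO compute_nodes VALUES (?, ?, ?)",
    [("node1", 512, 4), ("node2", 8192, 16), ("node3", 4096, 0)],
)

# One query expresses two filters at once (RAM AND CPU): the DB computes
# the set intersection in a single pass over (indexed) data.
rows = conn.execute(
    "SELECT host FROM compute_nodes WHERE free_ram_mb >= ? AND vcpus_free >= ?",
    (1024, 2),
).fetchall()
candidates = [r[0] for r in rows]
print(candidates)  # ['node2']
```

The trade-off is that filters whose logic doesn't map cleanly to SQL would need stored procedures or would stay in Python, as noted later in the thread.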
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
I think there has been, and I think there will be a good design summit session for this:

http://icehousedesignsummit.sched.org/event/cde73dadfd67eaae5bf98b90ba7de073#.UnPwKiRQ3mw

I think what you have suggested could be a way to do it, as all databases do is set intersections and unions in the end anyway ;) And set unions and intersections are nearly synonymous with filtering ;) Some of the filters, though, would likely fall into stored-procedure land.

On 11/1/13 11:10 AM, Chris Friesen chris.frie...@windriver.com wrote:
> On 11/01/2013 11:42 AM, Jiang, Yunhong wrote:
>> Shawn, yes, there are 56 VM accesses every second, and for each VM
>> access the scheduler will invoke the filter for each host. That means
>> for each VM access the filter function will be invoked 10k times. So
>> 56 * 10k = 560k, yes, half of 1M, but still a big number.
>
> I'm fairly new to OpenStack so I may have missed earlier discussions,
> but has anyone looked at building a scheduler filter that would use
> database queries over sets of hosts rather than looping over each host
> and doing the logic in Python? Seems like that would be a lot more
> efficient...
>
> Chris
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
Aha, right after I replied to Hartsock's mail, I realized I was correct after all. Glad that I did graduate from school :)

--jyh

-----Original Message-----
From: Jiang, Yunhong [mailto:yunhong.ji...@intel.com]
Sent: Friday, November 01, 2013 10:32 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

> As Shawn Hartsock pointed out in his reply, I made a silly error in the
> calculation. It's in fact about 55 accesses per second, not the big
> number I calculated. I thought I graduated from elementary school, but
> it seems I was wrong. Really sorry for the silly error.
>
> --jyh
> [snip]
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
Thanks: TIL. The filter invocation per host is the bit I was forgetting.

I'm assuming that the facts about the hosts don't change several times a second, so if you held the facts in RAM and then asserted the rules against those facts, allowing for age-out/invalidation based on incoming updates, the whole system would run faster. I remember a thread on using dogpile/memoization for this kind of thing.

# Shawn Hartsock

----- Original Message -----
From: Yunhong Jiang yunhong.ji...@intel.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Sent: Friday, November 1, 2013 1:42:03 PM
Subject: Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

> Shawn, yes, there are 56 VM accesses every second, and for each VM
> access the scheduler will invoke the filter for each host. That means
> for each VM access the filter function will be invoked 10k times. So
> 56 * 10k = 560k, yes, half of 1M, but still a big number.
>
> --jyh
> [snip]
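The cached-facts-with-age-out idea could look roughly like this. A minimal sketch only: the `FactCache` class, `fetch_from_db` callback, and TTL numbers are all illustrative assumptions, not the dogpile library or any Nova code:

```python
import time

# Sketch: cache host "facts" in RAM with a TTL so rule evaluation does not
# hit the database on every host_passes() call. Incoming updates can call
# invalidate() to age out stale entries early.

class FactCache:
    def __init__(self, fetch_from_db, ttl_seconds=10.0, clock=time.monotonic):
        self._fetch = fetch_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (fetched_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh: no DB access
        value = self._fetch(key)     # stale or missing: one DB access
        self._store[key] = (now, value)
        return value

    def invalidate(self, key):
        # Called when an incoming update makes the cached facts stale.
        self._store.pop(key, None)

# Usage sketch: count how often the "DB" is actually touched.
calls = []
def fake_db_fetch(key):
    calls.append(key)
    return {"free_ram_mb": 4096}

cache = FactCache(fake_db_fetch, ttl_seconds=60.0)
for _ in range(10_000):              # 10k host_passes() invocations
    cache.get("aggregate:node1")
print(len(calls))  # 1 -- one fetch instead of 10k
```

The correctness question this raises is exactly the one in the message above: how stale can facts be before the scheduler makes bad placements, and which updates must invalidate immediately.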
Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters
Something should probably change. The fundamental design issue is that we've got a 1:1 relationship between rule execution and database fetch. The rules may fire at rates several orders of magnitude different from the rate at which the data refreshes in the database. So I'd think you would want to decouple the database fetch from the rule assertion.

# Shawn Hartsock

----- Original Message -----
From: Chris Friesen chris.frie...@windriver.com
To: openstack-dev@lists.openstack.org
Sent: Friday, November 1, 2013 2:10:52 PM
Subject: Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

> On 11/01/2013 11:42 AM, Jiang, Yunhong wrote:
>> Shawn, yes, there are 56 VM accesses every second, and for each VM
>> access the scheduler will invoke the filter for each host. That means
>> for each VM access the filter function will be invoked 10k times. So
>> 56 * 10k = 560k, yes, half of 1M, but still a big number.
>
> I'm fairly new to OpenStack so I may have missed earlier discussions,
> but has anyone looked at building a scheduler filter that would use
> database queries over sets of hosts rather than looping over each host
> and doing the logic in Python? Seems like that would be a lot more
> efficient...
>
> Chris
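One way to read the decoupling point above: rules assert against an in-memory snapshot, and a separate refresh path updates that snapshot at the database's own pace. A minimal sketch with made-up names (`SnapshotStore`, `fetch_all_host_facts`); real code would need locking and refresh scheduling appropriate to the scheduler's threading model:

```python
import threading

# Sketch: decouple DB-fetch frequency from rule-execution frequency.
# Rules read an in-memory snapshot; refresh() does the (rare) DB round-trip.

class SnapshotStore:
    def __init__(self, fetch_all_host_facts):
        self._fetch = fetch_all_host_facts
        self._lock = threading.Lock()
        self._snapshot = self._fetch()   # initial fetch

    def refresh(self):
        fresh = self._fetch()            # one DB round-trip for all hosts
        with self._lock:
            self._snapshot = fresh

    def rule_passes(self, host, min_ram_mb):
        # Rule execution: pure in-memory, no DB access at all.
        with self._lock:
            facts = self._snapshot.get(host)
        return facts is not None and facts["free_ram_mb"] >= min_ram_mb

# Usage sketch: 100k rule firings cost zero extra fetches.
fetches = []
def fake_fetch():
    fetches.append(1)
    return {"node1": {"free_ram_mb": 2048}, "node2": {"free_ram_mb": 512}}

store = SnapshotStore(fake_fetch)
results = [store.rule_passes("node1", 1024) for _ in range(100_000)]
store.refresh()                          # e.g. driven by a periodic timer
print(all(results), len(fetches))  # True 2
```

Here the fetch rate is set by how often `refresh()` runs, not by how often rules fire, which is the decoupling being argued for.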