Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-04 Thread Jiang, Yunhong
I agree that this will have a benefit, but how much may depend heavily on the
type of instance created. If most of the instances are normal instances without
any special requirements, we will see no benefit at all.

Thanks
--jyh

 -Original Message-
 From: Russell Bryant [mailto:rbry...@redhat.com]
 Sent: Sunday, November 03, 2013 12:12 AM
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [nova][scheduler]The database access in the
 scheduler filters
 
 On 11/01/2013 06:39 AM, Jiang, Yunhong wrote:
  I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
 type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
 host_passes(). Some even access the DB on every invocation.
 
  Just curious whether this is considered a performance issue. With 10k nodes,
 60 VMs per node, and a 3-hour VM life cycle, the cloud will have more than 1
 million DB accesses per second. Not a small number IMHO.
 
 On a somewhat related note, here's an idea that would be pretty easy to
 implement.
 
 What if we added some optional metadata to scheduler filters to let them
 indicate where in the order of filters they should run?
 
 The filters you're talking about here we would probably want to run
 last.  Other filters that could potentially efficiently eliminate a
 large number of hosts should be run first.
 
 --
 Russell Bryant
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-03 Thread Russell Bryant
On 11/01/2013 06:39 AM, Jiang, Yunhong wrote:
 I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
 type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
 host_passes(). Some even access the DB on every invocation.
 
 Just curious whether this is considered a performance issue. With 10k nodes,
 60 VMs per node, and a 3-hour VM life cycle, the cloud will have more than 1
 million DB accesses per second. Not a small number IMHO.

On a somewhat related note, here's an idea that would be pretty easy to
implement.

What if we added some optional metadata to scheduler filters to let them
indicate where in the order of filters they should run?

The filters you're talking about here we would probably want to run
last.  Other filters that could potentially efficiently eliminate a
large number of hosts should be run first.
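
A minimal sketch of the ordering idea, in plain Python rather than real nova
code. The run_order attribute, the two filter classes, and the helper
functions are hypothetical names made up for illustration; they are not taken
from nova or from any patch. The point is only that a cheap filter running
first shrinks the host list before an expensive one sees it.

```python
class BaseFilter:
    # Hypothetical optional metadata: lower values run earlier.
    run_order = 50

    def host_passes(self, host, request):
        raise NotImplementedError


class CheapRamFilter(BaseFilter):
    run_order = 10  # in-memory check that can eliminate many hosts

    def host_passes(self, host, request):
        return host["free_ram_mb"] >= request["ram_mb"]


class ExpensiveDbFilter(BaseFilter):
    run_order = 90  # stands in for a filter that needs a DB lookup

    def __init__(self):
        self.calls = 0  # counts how often the "expensive" path runs

    def host_passes(self, host, request):
        self.calls += 1
        return host["aggregate"] == request["aggregate"]


def order_filters(filters):
    # sorted() is stable, so filters with equal run_order keep their
    # configured relative order.
    return sorted(filters, key=lambda f: f.run_order)


def filter_hosts(filters, hosts, request):
    for f in order_filters(filters):
        hosts = [h for h in hosts if f.host_passes(h, request)]
        if not hosts:
            break  # nothing left to filter
    return hosts
```

With four hosts of which two have enough RAM, the expensive filter is called
twice instead of four times, whatever order the filters were configured in.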

-- 
Russell Bryant



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-03 Thread Russell Bryant
On 11/03/2013 03:12 PM, Russell Bryant wrote:
 On a somewhat related note, here's an idea that would be pretty easy to
 implement.
 
 What if we added some optional metadata to scheduler filters to let them
 indicate where in the order of filters they should run?
 
 The filters you're talking about here we would probably want to run
 last.  Other filters that could potentially efficiently eliminate a
 large number of hosts should be run first.
 

https://review.openstack.org/55072

-- 
Russell Bryant



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread John Garbutt
It's intentional. Cells is there to split up your nodes into more
manageable chunks.

There are quite a few design summit sessions on looking into
alternative approaches to our current scheduler.

While I would love a single scheduler to make everyone happy, I am
thinking we might end up with several schedulers, each with slightly
different properties, and you pick one depending on what you want to
do with your cloud.

John

On 31 October 2013 22:39, Jiang, Yunhong yunhong.ji...@intel.com wrote:
 I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
 type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
 host_passes(). Some even access the DB on every invocation.

 Just curious whether this is considered a performance issue. With 10k nodes,
 60 VMs per node, and a 3-hour VM life cycle, the cloud will have more than 1
 million DB accesses per second. Not a small number IMHO.

 Thanks
 --jyh



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Joe Gordon
On Nov 1, 2013 10:20 AM, John Garbutt j...@johngarbutt.com wrote:

 It's intentional. Cells is there to split up your nodes into more
 manageable chunks.

 There are quite a few design summit sessions on looking into
 alternative approaches to our current scheduler.

 While I would love a single scheduler to make everyone happy, I am
 thinking we might end up with several schedulers, each with slightly
 different properties, and you pick one depending on what you want to
 do with your cloud.

Agreed.


 John

 On 31 October 2013 22:39, Jiang, Yunhong yunhong.ji...@intel.com wrote:
  I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
  type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
  host_passes(). Some even access the DB on every invocation.

As you noticed, not all filters make sense for a large system.

 
  Just curious whether this is considered a performance issue. With 10k
  nodes, 60 VMs per node, and a 3-hour VM life cycle, the cloud will have more
  than 1 million DB accesses per second. Not a small number IMHO.
 
  Thanks
  --jyh
 


Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Andrew Laski

On 11/01/13 at 10:16am, John Garbutt wrote:

 It's intentional. Cells is there to split up your nodes into more
 manageable chunks.


I don't think you mean to say that there's intentionally a performance 
issue.  But yes, there are performance issues with the filter scheduler.  

Because I work on a deployment that uses cells to partition the workload, 
I haven't seen them myself, but there are plenty of reports from others 
who have encountered them.  And it's easy to run a back-of-the-napkin 
calculation like the one below and see that scheduling will require a 
lot of resources if there's no partitioning.




 There are quite a few design summit sessions on looking into
 alternative approaches to our current scheduler.

 While I would love a single scheduler to make everyone happy, I am
 thinking we might end up with several schedulers, each with slightly
 different properties, and you pick one depending on what you want to
 do with your cloud.


+1.  We have the ability to drop in different schedulers right now, but 
there's only one really useful scheduler in the tree.  There has been 
talk of making a more performant scheduler which schedules in a 'good 
enough' fashion through some approximation algorithm.  I would love to 
see that get introduced as another scheduler and not as a rework of the 
filter scheduler.  I suppose the chance scheduler could technically 
count for that, but I'm under the impression that it isn't used beyond 
testing.




 John

 On 31 October 2013 22:39, Jiang, Yunhong yunhong.ji...@intel.com wrote:

  I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
  type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
  host_passes(). Some even access the DB on every invocation.

  Just curious whether this is considered a performance issue. With 10k nodes,
  60 VMs per node, and a 3-hour VM life cycle, the cloud will have more than 1
  million DB accesses per second. Not a small number IMHO.

  Thanks
  --jyh



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Shawn Hartsock


- Original Message -
 From: Yunhong Jiang yunhong.ji...@intel.com
 To: openstack-dev@lists.openstack.org
 Sent: Thursday, October 31, 2013 6:39:29 PM
 Subject: [openstack-dev] [nova][scheduler]The database access in the  
 scheduler filters
 
 I noticed several filters (AggregateMultiTenancyIsolation, ram_filter,
 type_filter, AggregateInstanceExtraSpecsFilter) have DB access in
 host_passes(). Some even access the DB on every invocation.
 
 Just curious whether this is considered a performance issue. With 10k nodes,
 60 VMs per node, and a 3-hour VM life cycle, the cloud will have more than 1
 million DB accesses per second. Not a small number IMHO.
 
 Thanks
 --jyh
 
 

Sorry if I'm dumb, but please try to explain things to me. I don't think I 
follow...

10k nodes, 60 VMs per node... is 600k VMs in the whole cloud. A 3-hour life cycle 
for a VM means every hour 1/3 of the VMs turn over, so 200k VMs are 
created/deleted per hour ... divide by 60 for ... 3,333.333 per minute or ... 
divide by 60 for ... 55.6 VM creations/deletions per second ...

... did I do that math right? So where do the million DB accesses per second 
come from? Are the rules fired for every VM on every access so that 600k VMs + 1 
new VM means the rules fire 600k + 1 times? What? Sorry... really confused.
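
The arithmetic above can be checked in a few lines of Python. This assumes a
steady state in which every VM is replaced exactly once per 3-hour life cycle;
the variable names are only for illustration.

```python
NODES = 10_000
VMS_PER_NODE = 60
VM_LIFETIME_HOURS = 3

# Total VMs running in the cloud at any moment.
total_vms = NODES * VMS_PER_NODE

# In steady state, one third of the VMs are replaced every hour.
vms_replaced_per_hour = total_vms / VM_LIFETIME_HOURS
vms_replaced_per_minute = vms_replaced_per_hour / 60
vms_replaced_per_second = vms_replaced_per_minute / 60

print(f"total VMs:       {total_vms}")
print(f"per hour:        {vms_replaced_per_hour:.0f}")
print(f"per minute:      {vms_replaced_per_minute:.3f}")
print(f"per second:      {vms_replaced_per_second:.1f}")
```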

# Shawn Hartsock



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Russell Bryant
On 11/01/2013 09:09 AM, Andrew Laski wrote:
 On 11/01/13 at 10:16am, John Garbutt wrote:
 It's intentional. Cells is there to split up your nodes into more
 manageable chunks.
 
 I don't think you mean to say that there's intentionally a performance
 issue.  But yes there are performance issues with the filter scheduler. 
 Because I work on a deployment that uses cells to partition the workload
 I haven't seen them myself, but there are plenty of reports from others
 who have encountered them.  And it's easy to run some back of the napkin
 calculations like was done below and see that scheduling will require a
 lot of resources if there's no partitioning.
 

 There are quite a few design summit sessions on looking into
 alternative approaches to our current scheduler.

 While I would love a single scheduler to make everyone happy, I am
 thinking we might end up with several schedulers, each with slightly
 different properties, and you pick one depending on what you want to
 do with your cloud.
 
 +1.  We have the ability to drop in different schedulers right now, but
 there's only one really useful scheduler in the tree.  There has been
 talk of making a more performant scheduler which schedules in a 'good
 enough' fashion through some approximation algorithm.  I would love to
 see that get introduced as another scheduler and not as a rework of the
 filter scheduler.  I suppose the chance scheduler could technically
 count for that, but I'm under the impression that it isn't used beyond
 testing.

Agreed.

There's a lot of discussion happening in two different directions, it
seems.  One group is very interested in improving the scheduler's
ability to make the best decision possible using various policies.
Another group is concerned with massive scale and is willing to accept
good enough scheduling to get there.

I think the filter scheduler is pretty reasonable for the best possible
decision approach today.  There's some stuff that could perform better.
There are more policy knobs that could be added.  There's the cross
service issue to figure out ... but it's not bad.

I'm very interested in a new good enough scheduler.  I liked the idea
of running a bunch of schedulers that each only look at a subset of your
infrastructure and pick something that's good enough.  I'm interested to
hear other ideas in the session we have on this topic (rethinking
scheduler design).
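
A rough sketch of the "several schedulers, each looking at a subset" idea,
assuming the simplest possible split (round-robin) and the simplest possible
placement rule (first host with enough free RAM). The partition function and
the SubsetScheduler class are hypothetical and are not part of nova.

```python
def partition(hosts, n_schedulers):
    # Round-robin split: scheduler i sees hosts i, i + n, i + 2n, ...
    return [hosts[i::n_schedulers] for i in range(n_schedulers)]


class SubsetScheduler:
    """Looks only at its own slice of hosts and takes the first fit."""

    def __init__(self, hosts):
        self.hosts = hosts
        self.examined = 0  # how many hosts this scheduler had to look at

    def schedule(self, ram_mb):
        for host in self.hosts:
            self.examined += 1
            if host["free_ram_mb"] >= ram_mb:
                host["free_ram_mb"] -= ram_mb  # claim the capacity
                return host["name"]
        return None  # no fit in this subset; the caller could try another
```

The trade-off shows up directly: each scheduler examines at most its own
slice, but it can return None even though a suitable host exists in a slice
it never looks at.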

Of course, you get a lot of the massive scale benefits by going to
cells, too.  If cells is our answer here, I really want to see more
people stepping up to help with the cells code.  There are still some
feature gaps to fill.  We should also be looking at the road to getting
back to only having one way to deploy nova (cells).  Having both cells
and non-cells options really isn't ideal long term.

-- 
Russell Bryant



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Jiang, Yunhong
As Shawn Hartsock pointed out in his reply, I made a stupid error in the 
calculation. It's in fact 55 accesses per second, not the big number I 
calculated. 
I thought I had graduated from elementary school, but it seems I was wrong. 
Really sorry for the stupid error.

--jyh



Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Jiang, Yunhong
Yes, you are right .. :(

 -Original Message-
 From: Shawn Hartsock [mailto:hartso...@vmware.com]
 Sent: Friday, November 01, 2013 8:20 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [nova][scheduler]The database access in the
 scheduler filters
 
 
 
 
 Sorry if I'm dumb, but please try to explain things to me. I don't think I
 follow...
 
 10k nodes, 60 VMs per node... is 600k VMs in the whole cloud. A 3-hour life
 cycle for a VM means every hour 1/3 of the VMs turn over, so 200k VMs
 are created/deleted per hour ... divide by 60 for ... 3,333.333 per minute
 or ... divide by 60 for ... 55.6 VM creations/deletions per second ...
 
 ... did I do that math right? So where do the million DB accesses per second
 come from? Are the rules fired for every VM on every access so that 600k
 VMs + 1 new VM means the rules fire 600k + 1 times? What? Sorry... really
 confused.
 
 # Shawn Hartsock
 


Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Chris Friesen

On 11/01/2013 11:42 AM, Jiang, Yunhong wrote:

 Shawn, yes, there are 56 VM accesses every second, and for each VM
 access the scheduler invokes the filter for each host. That means that
 for each VM access the filter function is invoked 10k times. So
 56 * 10k = 560k: yes, half of 1M, but still a big number.



I'm fairly new to OpenStack so I may have missed earlier discussions, 
but has anyone looked at building a scheduler filter that would use 
database queries over sets of hosts rather than looping over each 
host and doing the logic in Python?  Seems like that would be a lot more 
efficient...
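
To make the contrast concrete, here is a small sketch using the standard
library sqlite3 module. The compute_nodes table and its columns are a made-up
schema for illustration, not nova's actual one. The first function does one
lookup per host and applies the logic in Python; the second pushes the same
predicate into a single query over the whole set.

```python
import sqlite3


def make_db(rows):
    # rows: iterable of (host, free_ram_mb, free_disk_gb) tuples
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compute_nodes ("
        "host TEXT PRIMARY KEY, free_ram_mb INTEGER, free_disk_gb INTEGER)"
    )
    conn.executemany("INSERT INTO compute_nodes VALUES (?, ?, ?)", rows)
    return conn


def hosts_via_loop(conn, ram_mb, disk_gb):
    # One query per host, with the comparison done in Python.
    names = conn.execute(
        "SELECT host FROM compute_nodes ORDER BY host").fetchall()
    passing = []
    for (host,) in names:
        free_ram, free_disk = conn.execute(
            "SELECT free_ram_mb, free_disk_gb FROM compute_nodes "
            "WHERE host = ?", (host,)).fetchone()
        if free_ram >= ram_mb and free_disk >= disk_gb:
            passing.append(host)
    return passing


def hosts_via_query(conn, ram_mb, disk_gb):
    # One query for the whole set; the database does the filtering.
    rows = conn.execute(
        "SELECT host FROM compute_nodes "
        "WHERE free_ram_mb >= ? AND free_disk_gb >= ? ORDER BY host",
        (ram_mb, disk_gb))
    return [host for (host,) in rows]
```

Both functions return the same hosts; the difference is N + 1 queries versus
one.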


Chris




Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Joshua Harlow
I think there has been, and I think there will be a good design summit
session for this.

http://icehousedesignsummit.sched.org/event/cde73dadfd67eaae5bf98b90ba7de073#.UnPwKiRQ3mw

I think what you have suggested could be a way to do it, as all databases do
is set intersections and unions in the end anyway ;)

And set unions and intersections are nearly synonymous with filtering ;)
Some of the filters, though, would likely fall into stored procedure land.

On 11/1/13 11:10 AM, Chris Friesen chris.frie...@windriver.com wrote:



 I'm fairly new to OpenStack so I may have missed earlier discussions,
 but has anyone looked at building a scheduler filter that would use
 database queries over sets of hosts rather than looping over each
 host and doing the logic in Python?  Seems like that would be a lot more
 efficient...

 Chris




Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Jiang, Yunhong
Aha, right after replying to Hartsock's mail, I realized I was still correct. 
Glad that I did graduate from school :)

--jyh

 -Original Message-
 From: Jiang, Yunhong [mailto:yunhong.ji...@intel.com]
 Sent: Friday, November 01, 2013 10:32 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [nova][scheduler]The database access in the
 scheduler filters
 
 As Shawn Hartsock pointed out in his reply, I made a stupid error in the
 calculation. It's in fact 55 accesses per second, not the big number I
 calculated.
 I thought I had graduated from elementary school, but it seems I was wrong.
 Really sorry for the stupid error.
 
 --jyh
 


Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Shawn Hartsock
Thanks: TIL.

The filter invocation per host is the bit I was forgetting. 

I'm assuming that the facts about the hosts don't change several times a second, 
so if you held the facts in RAM and asserted the rules against those facts, 
allowing for age-out/invalidation based on incoming updates, then the whole 
system would run faster. I remember a thread on using dogpile/memoization for 
this kind of thing.
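
A minimal sketch of that idea in plain Python, deliberately not using dogpile
so that no library API is assumed. HostFactCache, its parameters, and the
fetch callable (a stand-in for the database read) are hypothetical names.
Entries age out after max_age_seconds and can be dropped early when an update
for a host arrives.

```python
import time


class HostFactCache:
    def __init__(self, fetch, max_age_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch          # callable: host -> facts (e.g. a DB read)
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # host -> (time stored, facts)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]          # still fresh: no fetch needed
        facts = self._fetch(host)    # missing or aged out: fetch again
        self._entries[host] = (now, facts)
        return facts

    def invalidate(self, host):
        # Call this when an incoming update says the host's facts changed.
        self._entries.pop(host, None)
```

A filter would then call cache.get(host) inside host_passes() instead of
reading the database directly, so repeated rule evaluations within the
max-age window cost a dictionary lookup.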

# Shawn Hartsock

- Original Message -
 From: Yunhong Jiang yunhong.ji...@intel.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Sent: Friday, November 1, 2013 1:42:03 PM
 Subject: Re: [openstack-dev] [nova][scheduler]The database access in  
 the scheduler filters
 
 Shawn, yes, there are 56 VM accesses every second, and for each VM access the
 scheduler invokes the filter for each host. That means that for each VM access
 the filter function is invoked 10k times.
 So 56 * 10k = 560k: yes, half of 1M, but still a big number.
 
 --jyh
 


Re: [openstack-dev] [nova][scheduler]The database access in the scheduler filters

2013-11-01 Thread Shawn Hartsock

Something should probably change.

The fundamental design issue is that we've got a 1:1 relationship between rule 
execution and database fetch. The rules may fire at a rate several orders of 
magnitude different from the rate of data refreshes in the database. So I'd 
think you would want to decouple the database fetch from the rule assertion.
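
One way to picture the decoupling, as a sketch only: fetch once per scheduling
request, then assert the rule against that snapshot for every host. FakeDb,
its methods, and the "ssd" metadata key are invented for this example and say
nothing about how nova's database API looks.

```python
class FakeDb:
    """Stand-in for the database that counts how many reads it serves."""

    def __init__(self, metadata_by_host):
        self._metadata = metadata_by_host
        self.reads = 0

    def get_metadata(self, host):
        self.reads += 1
        return self._metadata[host]

    def get_all_metadata(self):
        self.reads += 1
        return dict(self._metadata)


def coupled_filter(db, hosts, wanted):
    # 1:1 coupling: every rule execution triggers its own fetch.
    return [h for h in hosts if db.get_metadata(h).get("ssd") == wanted]


def decoupled_filter(db, hosts, wanted):
    # One fetch up front, then the rule runs against data held in memory.
    snapshot = db.get_all_metadata()
    return [h for h in hosts if snapshot[h].get("ssd") == wanted]
```

Both functions select the same hosts; the coupled version performs one read
per host, the decoupled version one read per request.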

# Shawn Hartsock

- Original Message -
 From: Chris Friesen chris.frie...@windriver.com
 To: openstack-dev@lists.openstack.org
 Sent: Friday, November 1, 2013 2:10:52 PM
 Subject: Re: [openstack-dev] [nova][scheduler]The database access in  the 
 scheduler filters
 
 
 I'm fairly new to OpenStack so I may have missed earlier discussions,
 but has anyone looked at building a scheduler filter that would use
 database queries over sets of hosts rather than looping over each
 host and doing the logic in Python?  Seems like that would be a lot more
 efficient...
 
 Chris
 
 