Re: [openstack-dev] Scheduler proposal

2015-10-16 Thread Julien Danjou
On Fri, Oct 16 2015, Joshua Harlow wrote: > Another idea is to use numpy and start representing filters as linear > equations, then use something like > https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.solve.html#numpy.linalg.solve > to solve linear equations given some data. > >

Re: [openstack-dev] Scheduler proposal

2015-10-16 Thread Clint Byrum
Excerpts from Ed Leafe's message of 2015-10-15 11:56:24 -0700: > Wow, I seem to have unleashed a bunch of pent-up frustration in the > community! It's great to see everyone coming forward with their ideas and > insights for improving the way Nova (and, by extension, all of OpenStack) can >

Re: [openstack-dev] Scheduler proposal

2015-10-16 Thread Alec Hothan (ahothan)
On 10/15/15, 11:11 PM, "Clint Byrum" wrote: >Excerpts from Ed Leafe's message of 2015-10-15 11:56:24 -0700: >> Wow, I seem to have unleashed a bunch of pent-up frustration in the >> community! It's great to see everyone coming forward with their ideas and >> insights for

Re: [openstack-dev] Scheduler proposal

2015-10-16 Thread Joshua Harlow
Clint Byrum wrote: Excerpts from Ed Leafe's message of 2015-10-15 11:56:24 -0700: Wow, I seem to have unleashed a bunch of pent-up frustration in the community! It's great to see everyone coming forward with their ideas and insights for improving the way Nova (and, by extension, all of

Re: [openstack-dev] Scheduler proposal

2015-10-15 Thread Joshua Harlow
Ed Leafe wrote: Wow, I seem to have unleashed a bunch of pent-up frustration in the community! It's great to see everyone coming forward with their ideas and insights for improving the way Nova (and, by extension, all of OpenStack) can potentially scale. I do have a few comments on the

Re: [openstack-dev] Scheduler proposal

2015-10-15 Thread Ed Leafe
Wow, I seem to have unleashed a bunch of pent-up frustration in the community! It's great to see everyone coming forward with their ideas and insights for improving the way Nova (and, by extension, all of OpenStack) can potentially scale. I do have a few comments on the discussion: 1) This

Re: [openstack-dev] Scheduler proposal

2015-10-14 Thread Dulko, Michal
On Tue, 2015-10-13 at 08:47 -0700, Joshua Harlow wrote: > Well great! > > When is that going to be accessible :-P > > Dulko, Michal wrote: > > On Mon, 2015-10-12 at 10:58 -0700, Joshua Harlow wrote: > >> Just a related thought/question. It really seems we (as a community) > >> need some kind of

Re: [openstack-dev] Scheduler proposal

2015-10-14 Thread Thomas Goirand
On 10/12/2015 07:10 PM, Monty Taylor wrote: > On 10/12/2015 12:43 PM, Clint Byrum wrote: >> Excerpts from Thomas Goirand's message of 2015-10-12 05:57:26 -0700: >>> On 10/11/2015 02:53 AM, Davanum Srinivas wrote: Thomas, i am curious as well. AFAIK, cassandra works well with

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Dulko, Michal
On Mon, 2015-10-12 at 10:58 -0700, Joshua Harlow wrote: > Just a related thought/question. It really seems we (as a community) > need some kind of scale testing ground. Internally at yahoo we were/are > going to use a 200 hypervisor cluster for some of this and then expand > that into 200 * X

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Dulko, Michal
On Mon, 2015-10-12 at 10:13 -0700, Clint Byrum wrote: > Zookeeper sits in a very different space from Cassandra. I have had good > success with it on OpenJDK as well. > > That said, we need to maybe go through some feature/risk matrices and > compare to etcd and Consul (this might be good to do

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Jeremy Stanley
On 2015-10-12 20:49:44 -0700 (-0700), Joshua Harlow wrote: > Does the openstack foundation have access to a scaling area that > can be used by the community for this kind of experimental work? The OpenStack Foundation has a staff of fewer than 20 full-time employees, with a primary focus on event

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Alec Hothan (ahothan)
On 10/12/15, 12:05 PM, "Monty Taylor" wrote: >On 10/12/2015 02:45 PM, Joshua Harlow wrote: >> Alec Hothan (ahothan) wrote: >>> >>> >>> >>> > >I want to do 100k hypervisors. No, that's not hyperbole. > >Also, I do not think that ZK/consul/etcd are very costly for small

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Joshua Harlow
Jeremy Stanley wrote: On 2015-10-12 20:49:44 -0700 (-0700), Joshua Harlow wrote: Does the openstack foundation have access to a scaling area that can be used by the community for this kind of experimental work? The OpenStack Foundation has a staff of fewer than 20 full-time employees, with a

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Joshua Harlow
Well great! When is that going to be accessible :-P Dulko, Michal wrote: On Mon, 2015-10-12 at 10:58 -0700, Joshua Harlow wrote: Just a related thought/question. It really seems we (as a community) need some kind of scale testing ground. Internally at yahoo we were/are going to use a 200

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Ian Wells
On 12 October 2015 at 21:18, Clint Byrum wrote: > We _would_ keep a local cache of the information in the schedulers. The > centralized copy of it is to free the schedulers from the complexity of > having to keep track of it as state, rather than as a cache. We also don't >

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Joshua Harlow
Clint Byrum wrote: Excerpts from Ian Wells's message of 2015-10-13 09:24:42 -0700: On 12 October 2015 at 21:18, Clint Byrum wrote: We _would_ keep a local cache of the information in the schedulers. The centralized copy of it is to free the schedulers from the complexity of

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Jeremy Stanley
On 2015-10-13 09:15:02 -0700 (-0700), Clint Byrum wrote: > Excerpts from Jeremy Stanley's message of 2015-10-13 06:13:32 -0700: [...] > > it's not even within an order of magnitude of being 1k host > > scale (and at that, it's still a multi-cycle plan just to reach > > viability). > > Infra-cloud

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Jeremy Stanley
On 2015-10-13 10:17:26 -0700 (-0700), Joshua Harlow wrote: [...] > Interesting, doesn't the foundation have money? I was under the > assumption it does (but I'm not a finance person); seeing that the > membership fee to become a member afaik is not cheap, and there > seems to be quite a lot of

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Clint Byrum
Excerpts from Jeremy Stanley's message of 2015-10-13 06:13:32 -0700: > On 2015-10-12 20:49:44 -0700 (-0700), Joshua Harlow wrote: > > Does the openstack foundation have access to a scaling area that > > can be used by the community for this kind of experimental work? > > The OpenStack Foundation

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Clint Byrum
Excerpts from Ian Wells's message of 2015-10-13 09:24:42 -0700: > On 12 October 2015 at 21:18, Clint Byrum wrote: > > > We _would_ keep a local cache of the information in the schedulers. The > > centralized copy of it is to free the schedulers from the complexity of > > having

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Joshua Harlow
Clint Byrum wrote: Excerpts from Jeremy Stanley's message of 2015-10-13 06:13:32 -0700: On 2015-10-12 20:49:44 -0700 (-0700), Joshua Harlow wrote: Does the openstack foundation have access to a scaling area that can be used by the community for this kind of experimental work? The OpenStack

Re: [openstack-dev] Scheduler proposal

2015-10-13 Thread Clint Byrum
Excerpts from Dulko, Michal's message of 2015-10-13 03:49:44 -0700: > On Mon, 2015-10-12 at 10:13 -0700, Clint Byrum wrote: > > Zookeeper sits in a very different space from Cassandra. I have had good > > success with it on OpenJDK as well. > > > > That said, we need to maybe go through some

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Thierry Carrez
Adam Lawson wrote: > I have a quick question: how is Amazon doing this? When choosing a next > path forward that reliably scales, would be interesting to know how this > is already being done. Well, those who know probably would be sued if they told. Since they have a limited set of instance

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Thierry Carrez
Clint Byrum wrote: > Excerpts from Joshua Harlow's message of 2015-10-10 17:43:40 -0700: >> I'm curious is there any more detail about #1 below anywhere online? >> >> Does cassandra use some features of the JVM that the openJDK version >> doesn't support? Something else? > > This about sums it

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Joshua Harlow
Thierry Carrez wrote: Clint Byrum wrote: Excerpts from Joshua Harlow's message of 2015-10-10 17:43:40 -0700: I'm curious is there any more detail about #1 below anywhere online? Does cassandra use some features of the JVM that the openJDK version doesn't support? Something else? This about

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Clint Byrum
Excerpts from Thomas Goirand's message of 2015-10-12 05:57:26 -0700: > On 10/11/2015 02:53 AM, Davanum Srinivas wrote: > > Thomas, > > > > I am curious as well. AFAIK, cassandra works well with OpenJDK. Can you > > please elaborate what your concerns are for #1? > > > > Thanks, > > Dims > >

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Thomas Goirand
On 10/11/2015 02:53 AM, Davanum Srinivas wrote: > Thomas, > > I am curious as well. AFAIK, cassandra works well with OpenJDK. Can you > please elaborate what your concerns are for #1? > > Thanks, > Dims s/works well/works/ Upstream doesn't test against OpenJDK, and they close bugs without

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Jean-Daniel Bonnetot
Hi everyone, What do you think about this proposal? http://www.slideshare.net/viggates/openstack-india-meetupscheduler It seems they found a real solution for scaling the scheduler. The good idea is to move the intelligence onto the compute nodes. Synchronisation is only needed for anti-affinity and stuff like

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Monty Taylor
On 10/12/2015 12:43 PM, Clint Byrum wrote: Excerpts from Thomas Goirand's message of 2015-10-12 05:57:26 -0700: On 10/11/2015 02:53 AM, Davanum Srinivas wrote: Thomas, I am curious as well. AFAIK, cassandra works well with OpenJDK. Can you please elaborate what your concerns are for #1?

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2015-10-12 08:35:20 -0700: > Thierry Carrez wrote: > > Clint Byrum wrote: > >> Excerpts from Joshua Harlow's message of 2015-10-10 17:43:40 -0700: > >>> I'm curious is there any more detail about #1 below anywhere online? > >>> > >>> Does cassandra use some

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Clint Byrum
Excerpts from Boris Pavlovic's message of 2015-10-11 01:14:08 -0700: > Clint, > > There are many pros and cons to both approaches. > > Reinventing the wheel (in this case it's quite a simple task) gives more > flexibility and doesn't require > usage of ZK/Consul (which will simplify

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Joshua Harlow
Clint Byrum wrote: Excerpts from Boris Pavlovic's message of 2015-10-11 01:14:08 -0700: Clint, There are many pros and cons to both approaches. Reinventing the wheel (in this case it's quite a simple task) gives more flexibility and doesn't require usage of ZK/Consul (which will simplify

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Alec Hothan (ahothan)
On 10/10/15, 11:35 PM, "Clint Byrum" wrote: >Excerpts from Alec Hothan (ahothan)'s message of 2015-10-09 21:19:14 -0700: >> >> On 10/9/15, 6:29 PM, "Clint Byrum" wrote: >> >> >Excerpts from Chris Friesen's message of 2015-10-09 17:33:38 -0700: >> >> On

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Joshua Harlow
Alec Hothan (ahothan) wrote: On 10/10/15, 11:35 PM, "Clint Byrum" wrote: Excerpts from Alec Hothan (ahothan)'s message of 2015-10-09 21:19:14 -0700: On 10/9/15, 6:29 PM, "Clint Byrum" wrote: Excerpts from Chris Friesen's message of 2015-10-09

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Monty Taylor
On 10/12/2015 02:45 PM, Joshua Harlow wrote: Alec Hothan (ahothan) wrote: On 10/10/15, 11:35 PM, "Clint Byrum" wrote: Excerpts from Alec Hothan (ahothan)'s message of 2015-10-09 21:19:14 -0700: On 10/9/15, 6:29 PM, "Clint Byrum" wrote: Excerpts

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Alec Hothan (ahothan)
On 10/12/15, 11:45 AM, "Joshua Harlow" wrote: >Alec Hothan (ahothan) wrote: >> >> >> >> >> On 10/10/15, 11:35 PM, "Clint Byrum" wrote: >> >>> Excerpts from Alec Hothan (ahothan)'s message of 2015-10-09 21:19:14 -0700: On 10/9/15, 6:29 PM,

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Joshua Harlow
Alec Hothan (ahothan) wrote: On 10/12/15, 11:45 AM, "Joshua Harlow" wrote: Alec Hothan (ahothan) wrote: On 10/10/15, 11:35 PM, "Clint Byrum" wrote: Excerpts from Alec Hothan (ahothan)'s message of 2015-10-09 21:19:14 -0700: On 10/9/15,

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Joshua Harlow
Ian Wells wrote: On 10 October 2015 at 23:47, Clint Byrum > wrote: > Per before, my suggestion was that every scheduler tries to maintain a copy > of the cloud's state in memory (in much the same way, per the previous > example, as

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Ian Wells
On 11 October 2015 at 00:23, Clint Byrum wrote: > I'm in, except I think this gets simpler with an intermediary service > like ZK/Consul to keep track of this 1GB of data and replace the need > for 6, and changes the implementation of 5 to "updates its record and > signals its

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Ian Wells
On 10 October 2015 at 23:47, Clint Byrum wrote: > > Per before, my suggestion was that every scheduler tries to maintain a > copy > > of the cloud's state in memory (in much the same way, per the previous > > example, as every router on the internet tries to make a route table

Re: [openstack-dev] Scheduler proposal

2015-10-12 Thread Clint Byrum
Excerpts from Ian Wells's message of 2015-10-12 19:43:48 -0700: > On 11 October 2015 at 00:23, Clint Byrum wrote: > > > I'm in, except I think this gets simpler with an intermediary service > > like ZK/Consul to keep track of this 1GB of data and replace the need > > for 6, and

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Clint Byrum
Excerpts from Alec Hothan (ahothan)'s message of 2015-10-09 21:19:14 -0700: > > On 10/9/15, 6:29 PM, "Clint Byrum" wrote: > > >Excerpts from Chris Friesen's message of 2015-10-09 17:33:38 -0700: > >> On 10/09/2015 03:36 PM, Ian Wells wrote: > >> > On 9 October 2015 at 12:50,

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Boris Pavlovic
2Everybody, Just curious why we need such complexity. Let's take a look from the other side: 1) Information about all hosts (even in the case of 100k hosts) will be less than 1 GB 2) Usually servers that run the scheduler service have at least 64GB RAM or more on board 3) math.log(10) < 12

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Geoff O'Callaghan
On 11/10/2015 6:25 PM, "Clint Byrum" wrote: > > Excerpts from Boris Pavlovic's message of 2015-10-11 00:02:39 -0700: > > 2Everybody, > > > > Just curious why we need such complexity. > > > > > > Let's take a look from the other side: > > 1) Information about all hosts (even in the case

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Boris Pavlovic
Clint, There are many pros and cons to both approaches. Reinventing the wheel (in this case it's quite a simple task) gives more flexibility and doesn't require usage of ZK/Consul (which will simplify integrating it with the current system). Using ZK/Consul for a POC may save a lot of time and

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Clint Byrum
Excerpts from Ian Wells's message of 2015-10-09 19:14:17 -0700: > On 9 October 2015 at 18:29, Clint Byrum wrote: > > > Instead of having the scheduler do all of the compute node inspection > > and querying though, you have the nodes push their stats into something > > like

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2015-10-10 17:43:40 -0700: > I'm curious is there any more detail about #1 below anywhere online? > > Does cassandra use some features of the JVM that the openJDK version > doesn't support? Something else? > This about sums it up:

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Clint Byrum
Excerpts from Boris Pavlovic's message of 2015-10-11 00:02:39 -0700: > 2Everybody, > > Just curious why we need such complexity. > > > Let's take a look from the other side: > 1) Information about all hosts (even in the case of 100k hosts) will be less > than 1 GB > 2) Usually servers that run

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2015-10-09 23:16:43 -0700: > On 10/09/2015 07:29 PM, Clint Byrum wrote: > > > Even if you figured out how to make the in-memory scheduler crazy fast, > > there's still value in concurrency for other reasons. No matter how > > fast you make the scheduler,

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Adam Lawson
I have a quick question: how is Amazon doing this? When choosing a next path forward that reliably scales, would be interesting to know how this is already being done. On Oct 9, 2015 10:12 AM, "Zane Bitter" wrote: > On 08/10/15 21:32, Ian Wells wrote: > >> >> > 2. if many

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Davanum Srinivas
Thanks Clint! On Sat, Oct 10, 2015 at 11:53 PM, Clint Byrum wrote: > Excerpts from Joshua Harlow's message of 2015-10-10 17:43:40 -0700: > > I'm curious is there any more detail about #1 below anywhere online? > > > > Does cassandra use some features of the JVM that the

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Joshua Harlow
Clint Byrum wrote: Excerpts from Boris Pavlovic's message of 2015-10-11 00:02:39 -0700: 2Everybody, Just curious why we need such complexity. Let's take a look from the other side: 1) Information about all hosts (even in the case of 100k hosts) will be less than 1 GB 2) Usually servers that run

Re: [openstack-dev] Scheduler proposal

2015-10-11 Thread Amrith Kumar
-cassandra-support-openjdk From: Davanum Srinivas [mailto:dava...@gmail.com] Sent: Saturday, October 10, 2015 8:54 PM To: OpenStack Development Mailing List (not for usage questions) <openstack-dev@lists.openstack.org> Subject: Re: [openstack-dev] Scheduler proposal Not implying cas

Re: [openstack-dev] Scheduler proposal

2015-10-10 Thread Joshua Harlow
I'm curious is there any more detail about #1 below anywhere online? Does cassandra use some features of the JVM that the openJDK version doesn't support? Something else? -Josh Thomas Goirand wrote: On 10/07/2015 07:36 PM, Ed Leafe wrote: Several months ago I proposed an experiment [0] to

Re: [openstack-dev] Scheduler proposal

2015-10-10 Thread Thomas Goirand
On 10/07/2015 07:36 PM, Ed Leafe wrote: > Several months ago I proposed an experiment [0] to see if switching > the data model for the Nova scheduler to use Cassandra as the backend > would be a significant improvement as opposed to the current design This is probably right. I don't know, I'm not

Re: [openstack-dev] Scheduler proposal

2015-10-10 Thread Davanum Srinivas
Thomas, I am curious as well. AFAIK, cassandra works well with OpenJDK. Can you please elaborate what your concerns are for #1? Thanks, Dims On Sat, Oct 10, 2015 at 5:43 PM, Joshua Harlow wrote: > I'm curious is there any more detail about #1 below anywhere online? > >

Re: [openstack-dev] Scheduler proposal

2015-10-10 Thread Davanum Srinivas
Not implying cassandra is the right option. Just curious about the assertion. -- Dims On Sat, Oct 10, 2015 at 5:53 PM, Davanum Srinivas wrote: > Thomas, > > I am curious as well. AFAIK, cassandra works well with OpenJDK. Can you > please elaborate what your concerns are for

Re: [openstack-dev] Scheduler proposal

2015-10-10 Thread Chris Friesen
On 10/09/2015 07:29 PM, Clint Byrum wrote: Even if you figured out how to make the in-memory scheduler crazy fast, there's still value in concurrency for other reasons. No matter how fast you make the scheduler, you'll be slave to the response time of a single scheduling request. If you take

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Chris Friesen
On 10/09/2015 03:36 PM, Ian Wells wrote: On 9 October 2015 at 12:50, Chris Friesen > wrote: Has anybody looked at why 1 instance is too slow and what it would take to make 1 scheduler instance work fast enough? This

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Ian Wells
On 9 October 2015 at 18:29, Clint Byrum wrote: > Instead of having the scheduler do all of the compute node inspection > and querying though, you have the nodes push their stats into something > like Zookeeper or consul, and then have schedulers watch those stats > for changes

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Joshua Harlow
Gregory Haynes wrote: Excerpts from Joshua Harlow's message of 2015-10-08 15:24:18 +: On this point, and just thinking out loud. If we consider saving compute_node information into say a node in said DLM backend (for example a znode in zookeeper[1]); this information would be updated

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Joshua Harlow
And one last reply with more code: http://paste.openstack.org/show/475941/ (a creator of services that dynamically creates services, and destroys them after a set amount of time is included in here, along with the prior resource watcher). Works locally, should work for you as well. Output

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Neil Jerram
or imprecise language.) Regards, Neil Original Message From: Clint Byrum Sent: Friday, 9 October 2015 19:08 To: openstack-dev Reply To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] Scheduler proposal Excerpts from Chris Friesen's message

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Joshua Harlow
Further example stuff, Get kazoo installed (http://kazoo.readthedocs.org/) Output from my local run (with no data) $ python test.py Kazoo client has changed to state: CONNECTED Got data: '' for new resource /node/compute_nodes/h1.hypervisor.yahoo.com Idling (ran for 0.00s). Known

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Gregory Haynes
Excerpts from Chris Friesen's message of 2015-10-09 19:36:03 +: > On 10/09/2015 12:55 PM, Gregory Haynes wrote: > > > There is a more generalized version of this algorithm for concurrent > > scheduling I've seen a few times - Pick N options at random, apply > > heuristic over that N to pick

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Ian Wells
On 9 October 2015 at 12:50, Chris Friesen wrote: > Has anybody looked at why 1 instance is too slow and what it would take to > >> make 1 scheduler instance work fast enough? This does not preclude the >> use of >> concurrency for finer grain tasks in the background.

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Alec Hothan (ahothan)
On 10/9/15, 6:29 PM, "Clint Byrum" wrote: >Excerpts from Chris Friesen's message of 2015-10-09 17:33:38 -0700: >> On 10/09/2015 03:36 PM, Ian Wells wrote: >> > On 9 October 2015 at 12:50, Chris Friesen > > >

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Gregory Haynes
Excerpts from Joshua Harlow's message of 2015-10-08 15:24:18 +: > On this point, and just thinking out loud. If we consider saving > compute_node information into say a node in said DLM backend (for > example a znode in zookeeper[1]); this information would be updated > periodically by that

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Alec Hothan (ahothan)
There are several ways to make python code that deals with a lot of data faster, especially when it comes to operating on DB fields from SQL tables (and that is not limited to the nova scheduler). Pulling data from large SQL tables and operating on them through regular python code (using

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Clint Byrum
Excerpts from Chris Friesen's message of 2015-10-09 10:54:36 -0700: > On 10/09/2015 11:09 AM, Zane Bitter wrote: > > > The optimal way to do this would be a weighted random selection, where the > > probability of any given host being selected is proportional to its > > weighting. > > (Obviously

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Alec Hothan (ahothan)
Still the point from Chris is valid. I guess the main reason openstack is going with multiple concurrent schedulers is to scale out by distributing the load between multiple instances of schedulers because 1 instance is too slow. This discussion is about coordinating the many instances of

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Chris Friesen
On 10/09/2015 11:09 AM, Zane Bitter wrote: The optimal way to do this would be a weighted random selection, where the probability of any given host being selected is proportional to its weighting. (Obviously this is limited by the accuracy of the weighting function in expressing your actual
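The weighted random selection quoted above can be sketched roughly as follows. This is only an illustration, not code from Nova: the function name `pick_host` and the shape of the `weights` mapping (host name to a non-negative score) are assumptions made for the example.

```python
import random


def pick_host(weights, rng=random):
    """Pick one host with probability proportional to its weight.

    `weights` is a hypothetical mapping of host name -> non-negative score,
    standing in for whatever the scheduler's weighers would produce.
    """
    # Hosts with a zero weight are never eligible.
    hosts = [h for h, w in weights.items() if w > 0]
    if not hosts:
        raise ValueError("no host has a positive weight")
    # random.choices draws with probability proportional to `weights`.
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]
```

Because two schedulers running this independently rarely land on the same host, the chance of collisions drops compared with always taking the single top-weighted host, at the cost of sometimes picking a slightly worse one.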

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Gregory Haynes
Excerpts from Zane Bitter's message of 2015-10-09 17:09:46 +: > On 08/10/15 21:32, Ian Wells wrote: > > > > > 2. if many hosts suit the 5 VMs then this is *very* unlucky,because > > we should be choosing a host at random from the set of > > suitable hosts and that's a huge coincidence

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Chris Friesen
On 10/09/2015 12:55 PM, Gregory Haynes wrote: There is a more generalized version of this algorithm for concurrent scheduling I've seen a few times - Pick N options at random, apply heuristic over that N to pick the best, attempt to schedule at your choice, retry on failure. As long as you have
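A minimal sketch of the "pick N at random, apply a heuristic, attempt, retry on failure" loop described above. Everything here is assumed for illustration: `view` is the scheduler's possibly stale picture of free RAM per host, `claim` is a hypothetical callable that performs the authoritative check on the chosen host, and "most free RAM" stands in for the real heuristic.

```python
import random


def schedule(hosts, view, needed, claim, sample_size=3, max_attempts=5, rng=random):
    """Return the host that accepted the request, or None if every attempt failed."""
    failed = set()  # hosts that already rejected this request
    for _ in range(max_attempts):
        pool = [h for h in hosts if h not in failed]
        if not pool:
            break
        # Pick N options at random ...
        candidates = rng.sample(pool, min(sample_size, len(pool)))
        # ... apply the heuristic over those N to pick the best ...
        best = max(candidates, key=lambda h: view[h])
        # ... attempt to schedule there, and retry elsewhere on failure.
        if claim(best, needed):
            return best
        failed.add(best)
    return None


def make_claim(actual_free):
    """Build a toy `claim` that checks and updates the real free RAM per host."""
    def claim(host, needed):
        if actual_free[host] >= needed:
            actual_free[host] -= needed
            return True
        return False
    return claim
```

The retry is what makes a stale view tolerable: the scheduler may be wrong about a host, but the host itself has the final say.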

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Chris Friesen
On 10/09/2015 12:25 PM, Alec Hothan (ahothan) wrote: Still the point from Chris is valid. I guess the main reason openstack is going with multiple concurrent schedulers is to scale out by distributing the load between multiple instances of schedulers because 1 instance is too slow. This

Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Joshua Harlow
And also we should probably deprecate/not recommend: http://docs.openstack.org/developer/nova/api/nova.scheduler.filters.json_filter.html#nova.scheduler.filters.json_filter.JsonFilter That filter IMHO basically disallows optimizations like forming SQL statements for each filter (and then

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Maish Saidel-Keesing
Forgive the top-post. Cross-posting to openstack-operators for their feedback as well. Ed, the work seems very promising, and I am interested to see how this evolves. With my operator hat on I have one piece of feedback. By adding in a new Database solution (Cassandra) we are now up to three

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Thierry Carrez
Maish Saidel-Keesing wrote: > Operational overhead has a cost - maintaining 3 different database > tools, backing them up, providing HA, etc. has operational cost. > > This is not to say that this cannot be overseen, but it should be taken > into consideration. > > And *if* they can be

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Joshua Harlow
On Thu, 8 Oct 2015 10:43:01 -0400 Monty Taylor wrote: > On 10/08/2015 09:01 AM, Thierry Carrez wrote: > > Maish Saidel-Keesing wrote: > >> Operational overhead has a cost - maintaining 3 different database > >> tools, backing them up, providing HA, etc. has operational

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Joshua Harlow
Joshua Harlow wrote: On Thu, 8 Oct 2015 10:43:01 -0400 Monty Taylor wrote: On 10/08/2015 09:01 AM, Thierry Carrez wrote: Maish Saidel-Keesing wrote: Operational overhead has a cost - maintaining 3 different database tools, backing them up, providing HA, etc. has

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2015-10-08 08:38:57 -0700: > Joshua Harlow wrote: > > On Thu, 8 Oct 2015 10:43:01 -0400 > > Monty Taylor wrote: > > > >> On 10/08/2015 09:01 AM, Thierry Carrez wrote: > >>> Maish Saidel-Keesing wrote: > Operational overhead has a

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Monty Taylor
On 10/08/2015 09:01 AM, Thierry Carrez wrote: Maish Saidel-Keesing wrote: Operational overhead has a cost - maintaining 3 different database tools, backing them up, providing HA, etc. has operational cost. This is not to say that this cannot be overseen, but it should be taken into

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Kevin L. Mitchell
On Wed, 2015-10-07 at 23:17 -0600, Chris Friesen wrote: > Why is it inevitable? Well, I would say that this is probably a consequence of the CAP[1] theorem. > Theoretically if the DB knew about what resources were originally available > and > what resources have been consumed, then it should

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ed Leafe
On Oct 8, 2015, at 8:01 AM, Thierry Carrez wrote: >> Operational overhead has a cost - maintaining 3 different database >> tools, backing them up, providing HA, etc. has operational cost. >> >> This is not to say that this cannot be overseen, but it should be taken >>

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ed Leafe
On Oct 8, 2015, at 10:24 AM, Joshua Harlow wrote: > Now if we imagine each/all schedulers having watches > on /nova/compute_nodes/ ([2] consul and etcd have equivalent concepts > afaik) then when a compute_node updates that information a push > notification (the watch
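The watch idea quoted above can be illustrated without a running ZooKeeper. The sketch below is an in-memory stand-in only: `FakeStore` and `SchedulerCache` are hypothetical names, and a real deployment would presumably rely on the watch facility of its ZooKeeper, etcd, or Consul client instead.

```python
class FakeStore:
    """In-memory stand-in for a watchable key/value store (not a real client)."""

    def __init__(self):
        self._data = {}
        self._watchers = []  # list of (prefix, callback) pairs

    def watch(self, prefix, callback):
        """Register `callback(path, value)` for every write under `prefix`."""
        self._watchers.append((prefix, callback))

    def put(self, path, value):
        """Store a value and push a notification to every matching watcher."""
        self._data[path] = value
        for prefix, callback in self._watchers:
            if path.startswith(prefix):
                callback(path, value)


class SchedulerCache:
    """A scheduler's local view, kept current by pushed notifications."""

    def __init__(self, store, prefix="/nova/compute_nodes/"):
        self.nodes = {}
        self._prefix = prefix
        store.watch(prefix, self._on_change)

    def _on_change(self, path, value):
        # Key the cache by node name, i.e. the path with the prefix removed.
        self.nodes[path[len(self._prefix):]] = value
```

With this shape, a compute node writing its own record is enough to refresh every scheduler's cache; no scheduler has to poll.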

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ed Leafe
On Oct 8, 2015, at 11:03 AM, Kevin L. Mitchell wrote: >> Theoretically if the DB knew about what resources were originally available >> and >> what resources have been consumed, then it should be able to allocate >> resources >> race-free (possibly with some

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ed Leafe
On Oct 8, 2015, at 10:54 AM, Clint Byrum wrote: > ^^ THIS is the kind of architectural thinking I'd like to see us do more > of. Agreed. If nothing else, I'm glad that I was able to get people thinking about new approaches. > This isn't "hey I have a better database" it is

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Joshua Harlow
Clint Byrum wrote: Excerpts from Joshua Harlow's message of 2015-10-08 08:38:57 -0700: Joshua Harlow wrote: On Thu, 8 Oct 2015 10:43:01 -0400 Monty Taylor wrote: On 10/08/2015 09:01 AM, Thierry Carrez wrote: Maish Saidel-Keesing wrote: Operational overhead has a

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ed Leafe
On Oct 8, 2015, at 1:38 PM, Ian Wells wrote: >> You've hit upon the problem with the current design: multiple, and >> potentially out-of-sync copies of the data. > > Arguably, this is the *intent* of the current design, not a problem with it. It may have been the

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ian Wells
On 8 October 2015 at 13:28, Ed Leafe wrote: > On Oct 8, 2015, at 1:38 PM, Ian Wells wrote: > > Truth be told, storing that data in MySQL is secondary to the correct > functioning of the scheduler. > > I have no problem with MySQL (well, I do, but that's

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ian Wells
On 7 October 2015 at 22:17, Chris Friesen wrote: > On 10/07/2015 07:23 PM, Ian Wells wrote: > >> >> The whole process is inherently racy (and this is inevitable, and >> correct), >> >> > Why is it inevitable? > It's inevitable because everything takes time, and some

Re: [openstack-dev] Scheduler proposal

2015-10-08 Thread Ian Wells
On 8 October 2015 at 09:10, Ed Leafe wrote: > You've hit upon the problem with the current design: multiple, and > potentially out-of-sync copies of the data. Arguably, this is the *intent* of the current design, not a problem with it. The data can never be perfect (ever) so

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2015-10-07 12:28:36 -0700: > On 07/10/15 13:36, Ed Leafe wrote: > > Several months ago I proposed an experiment [0] to see if switching the > > data model for the Nova scheduler to use Cassandra as the backend would be > > a significant improvement as

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Chris Friesen
On 10/07/2015 11:36 AM, Ed Leafe wrote: I've finally gotten around to finishing writing up that proposal [1], and I'd like to hope that it would be the basis for future discussions about addressing some of the underlying issues that exist in OpenStack for historical reasons, and how we might

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Ed Leafe
On Oct 7, 2015, at 6:00 PM, Chris Friesen wrote: > I've wondered for a while (ever since I looked at the scheduler code, really) > why we couldn't implement more of the scheduler as database transactions. > > I haven't used Cassandra, so maybe you can clarify

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Chris Friesen
On 10/07/2015 07:23 PM, Ian Wells wrote: On 7 October 2015 at 16:00, Chris Friesen > wrote: 1) Some resources (RAM) only require tracking amounts. Other resources (CPUs, PCI devices) require tracking allocation of

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Ian Wells
On 7 October 2015 at 16:00, Chris Friesen wrote: > 1) Some resources (RAM) only require tracking amounts. Other resources > (CPUs, PCI devices) require tracking allocation of specific individual host > resources (for CPU pinning, PCI device allocation, etc.).

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Ed Leafe
On Oct 7, 2015, at 2:28 PM, Zane Bitter wrote: > It seems to me (disclaimer: not a Nova dev) that which database to use is > completely irrelevant to your proposal, Well, not entirely. The difference is that what Cassandra offers that separates it from other DBs is exactly

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Fox, Kevin M
PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] Scheduler proposal On Oct 7, 2015, at 2:28 PM, Zane Bitter <zbit...@redhat.com> wrote: > It seems to me (disclaimer: not a Nova dev) that which database to use is > completely irrele

Re: [openstack-dev] Scheduler proposal

2015-10-07 Thread Zane Bitter
On 07/10/15 13:36, Ed Leafe wrote: Several months ago I proposed an experiment [0] to see if switching the data model for the Nova scheduler to use Cassandra as the backend would be a significant improvement as opposed to the current design using multiple copies of the same data (compute_node
