Re: [openstack-dev] [nova] blueprint about multiple workers supported in nova-scheduler

Rui Chen Wed, 04 Mar 2015 23:58:21 -0800

We will face the same issue in the multiple nova-scheduler process case, as
Sylvain said, right?


Two processes/workers can actually consume two distinct resources on the
same HostState.
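
One way to picture the concern is the minimal sketch below. It is not Nova code: `HostState` here is a hypothetical, simplified stand-in with a single `free_ram_mb` field, and `pick_host` is an invented helper. It assumes each worker keeps its own in-memory snapshot of host state, so neither worker sees what the other has just consumed.

```python
from dataclasses import dataclass, replace


@dataclass
class HostState:
    """Hypothetical, simplified stand-in for a scheduler's per-host view."""
    name: str
    free_ram_mb: int


def pick_host(view, ram_mb):
    """Return the first host in this worker's view with enough free RAM."""
    for host in view:
        if host.free_ram_mb >= ram_mb:
            host.free_ram_mb -= ram_mb  # only this worker's copy is updated
            return host.name
    return None


# What the compute node really has available.
actual_free_ram_mb = 4096

# Each worker holds its own snapshot, taken before either one schedules.
worker_a_view = [HostState("node1", actual_free_ram_mb)]
worker_b_view = [replace(h) for h in worker_a_view]  # independent copies

# Both workers place a 3072 MB instance using their private snapshots.
choice_a = pick_host(worker_a_view, 3072)
choice_b = pick_host(worker_b_view, 3072)

requested = 2 * 3072
print(choice_a, choice_b, "overcommitted:", requested > actual_free_ram_mb)
```

Both workers pick the same host because each snapshot still shows the full 4096 MB, which is the kind of inconsistency the thread is asking about.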




2015-03-05 13:26 GMT+08:00 Alex Xu <[email protected]>:

> Rui, you can still run multiple nova-scheduler processes now.
>
>
> 2015-03-05 10:55 GMT+08:00 Rui Chen <[email protected]>:
>
>> Looks like it's a complicated problem, and nova-scheduler can't scale out
>> horizontally in active/active mode.
>>
>> Maybe we should describe the problem in the HA docs.
>>
>> http://docs.openstack.org/high-availability-guide/content/_schedulers.html
>>
>> Thanks for everybody's attention.
>>
>> 2015-03-05 5:38 GMT+08:00 Mike Bayer <[email protected]>:
>>
>>>
>>>
>>> Attila Fazekas <[email protected]> wrote:
>>>
>>> > Hi,
>>> >
>>> > I wonder what the planned future of scheduling is.
>>> >
>>> > The scheduler runs a lot of queries with a high number of fields,
>>> > which is CPU expensive when you are using sqlalchemy-orm.
>>> > Has anyone tried to switch those operations to sqlalchemy-core?
>>>
>>> An upcoming feature in SQLAlchemy 1.0 will remove the vast majority of
>>> CPU overhead from the query side of SQLAlchemy ORM by caching all the
>>> work done up until the SQL is emitted, including all the function
>>> overhead of building up the Query object, producing a core select()
>>> object internally from the Query, working out a large part of the
>>> object fetch strategies, and finally the string compilation of the
>>> select() into a string as well as organizing the typing information
>>> for result columns. With a query that is constructed using the “Baked”
>>> feature, all of these steps are cached in memory and held
>>> persistently; the same query can then be re-used at which point all of
>>> these steps are skipped. The system produces the cache key based on
>>> the in-place construction of the Query using lambdas so no major
>>> changes to code structure are needed; just the way the Query
>>> modifications are performed needs to be preceded with “lambda q:”,
>>> essentially.
>>>
>>> With this approach, the traditional session.query(Model) approach can go
>>> from start to SQL being emitted with an order of magnitude fewer function
>>> calls. On the fetch side, fetching individual columns instead of full
>>> entities has always been an option with ORM and is about the same speed
>>> as a Core fetch of rows. So using ORM with minimal changes to existing
>>> ORM code you can get performance even better than you’d get using Core
>>> directly, since caching of the string compilation is also added.
>>>
>>> On the persist side, the new bulk insert / update features provide a
>>> bridge from ORM-mapped objects to bulk inserts/updates without any unit
>>> of work sorting going on. ORM mapped objects are still more expensive to
>>> use in that instantiation and state change is still more expensive, but
>>> bulk insert/update accepts dictionaries as well, which again is
>>> competitive with a straight Core insert.
>>>
>>> Both of these features are completed in the master branch, the “baked
>>> query” feature just needs documentation, and I’m basically two or three
>>> tickets away from beta releases of 1.0. The “Baked” feature itself lives
>>> as an extension and if we really wanted, I could backport it into oslo.db
>>> as well so that it works against 0.9.
>>>
>>> So I’d ask that folks please hold off on any kind of migration from ORM
>>> to Core for performance reasons. I’ve spent the past several months
>>> adding features directly to SQLAlchemy that allow an ORM-based app to
>>> have routes to operations that perform just as fast as that of Core
>>> without a rewrite of code.
>>>
>>> > The scheduler does a lot of things in the application, like filtering,
>>> > that could be done more efficiently at the DB level. Why is it not done
>>> > on the DB side?
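
The contrast being asked about can be sketched with the standard-library `sqlite3` module. This is only an illustration under assumptions: the three-column `compute_nodes` table and the RAM/disk thresholds are hypothetical, and nothing here reflects how nova-scheduler filters actually work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compute_nodes "
    "(host TEXT, free_ram_mb INTEGER, free_disk_gb INTEGER)"
)
conn.executemany(
    "INSERT INTO compute_nodes VALUES (?, ?, ?)",
    [("node1", 2048, 100), ("node2", 8192, 20), ("node3", 16384, 500)],
)

ram_mb, disk_gb = 4096, 50  # hypothetical request size

# Application side: fetch every row, then filter in Python.
all_rows = conn.execute(
    "SELECT host, free_ram_mb, free_disk_gb FROM compute_nodes"
).fetchall()
app_side = sorted(
    host for host, ram, disk in all_rows
    if ram >= ram_mb and disk >= disk_gb
)

# Database side: let the WHERE clause do the same filtering.
db_side = [
    row[0]
    for row in conn.execute(
        "SELECT host FROM compute_nodes "
        "WHERE free_ram_mb >= ? AND free_disk_gb >= ? ORDER BY host",
        (ram_mb, disk_gb),
    )
]

print(app_side, db_side)
```

Both approaches select the same hosts; the difference is how many rows cross the database boundary before the filtering happens.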
>>> >
>>> > There are use cases where the scheduler would need to know even more
>>> > data. Is there a plan for keeping `everything` in every scheduler
>>> > process's memory up-to-date?
>>> > (Maybe zookeeper)
>>> >
>>> > The opposite way would be to move most operations to the DB side,
>>> > since the DB already knows everything.
>>> > (stored procedures?)
>>> >
>>> > Best Regards,
>>> > Attila
>>> >
>>> >
>>> > ----- Original Message -----
>>> >> From: "Rui Chen" <[email protected]>
>>> >> To: "OpenStack Development Mailing List (not for usage questions)" <[email protected]>
>>> >> Sent: Wednesday, March 4, 2015 4:51:07 AM
>>> >> Subject: [openstack-dev] [nova] blueprint about multiple workers supported in nova-scheduler
>>> >>
>>> >> Hi all,
>>> >>
>>> >> I want to make it easy to launch a bunch of scheduler processes on a
>>> >> host. Multiple scheduler workers will make use of the host's multiple
>>> >> processors and enhance the performance of nova-scheduler.
>>> >>
>>> >> I have registered a blueprint and committed a patch to implement it.
>>> >>
>>> >> https://blueprints.launchpad.net/nova/+spec/scheduler-multiple-workers-support
>>> >>
>>> >> This patch has been applied in our performance environment and passed
>>> >> some test cases, such as concurrently booting multiple instances; so
>>> >> far we have not found any inconsistency issues.
>>> >>
>>> >> IMO, nova-scheduler should be easy to scale horizontally, and multiple
>>> >> workers should be supported as an out-of-the-box feature.
>>> >>
>>> >> Please feel free to discuss this feature, thanks.
>>> >>
>>> >> Best Regards
>>> >>
>>> >>
>>> >>
>>> __________________________________________________________________________
>>> >> OpenStack Development Mailing List (not for usage questions)
>>> >> Unsubscribe:
>>> [email protected]?subject:unsubscribe
>>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
