On Fri, Jul 19, 2013 at 10:15 AM, Dan Smith <[email protected]> wrote:
>
> > So rather than asking "what doesn't work / might not work in the
> > future" I think the question should be "aside from them both being
> > things that could be described as a conductor - what's the
> > architectural reason for wanting to have these two separate groups of
> > functionality in the same service ?"
>
> IMHO, the architectural reason is "lack of proliferation of services and
> the added complexity that comes with it." If one expects the
> proxy workload to always overshadow the task workload, then making
> these two things a single service makes things a lot simpler.

I'd like to point out a low-level detail that makes scaling
nova-conductor at the process level extremely compelling: the database
driver blocking the eventlet thread serializes nova's database access.

Since the database connection driver is typically implemented in a
library beyond the purview of eventlet's monkeypatching (e.g., a native
python extension like _mysql.so), blocking database calls will block all
eventlet coroutines. Since most of what nova-conductor does is access
the database, a nova-conductor process's handling of requests is
effectively serial.

Nova-conductor is the gateway to the database for nova-compute
processes. So permitting only a single nova-conductor process would
effectively serialize all database queries during instance creation,
deletion, periodic instance refreshes, etc. Since these queries are made
frequently (easily 100 times during instance creation) and while other
global locks are held (e.g., in the case of nova-compute's
ResourceTracker), most of what nova-compute does becomes serialized.

In parallel performance experiments I've done, I have found that running
multiple nova-conductor processes is the best way to mitigate the
serialization of blocking database calls. Say I am booting N instances
in parallel (usually up to N=40). If I have a single nova-conductor
process, the duration of each nova-conductor RPC increases linearly with
N, which can add _minutes_ to instance creation time (i.e., dozens of
RPCs, some taking several seconds). However, if I run N nova-conductor
processes in parallel, then the duration of the nova-conductor RPCs does
not increase with N; since each RPC is most likely handled by a
different nova-conductor, the serial execution within each process is
moot.

Note that there are alternative methods for preventing the eventlet
thread from blocking during database calls.
However, none of these alternatives performed as well as multiple
nova-conductor processes:

* Instead of using a native database driver like _mysql.so, you can use
  a pure-python driver, like pymysql, by setting
  sql_connection=mysql+pymysql://... in the [DEFAULT] section of
  /etc/nova/nova.conf. Eventlet can monkeypatch the pure-python driver
  so that it does not block. The problem with this approach is the
  vastly greater CPU demand of the pure-python driver compared to the
  native driver. Since the pure-python driver is so much more CPU
  intensive, the eventlet thread spends most of its time talking to the
  database, which is effectively the problem we had before!

* Instead of making database calls from eventlet's thread, you can
  submit them to eventlet's pool of worker threads and wait for the
  results. Try this by setting dbapi_use_tpool=True in the [DEFAULT]
  section of /etc/nova/nova.conf. The problem I found with this approach
  was the overhead of synchronizing with the worker threads. In
  particular, the time elapsed between the worker thread finishing and
  the waiting coroutine being resumed was typically several times
  greater than the duration of the database call itself.

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
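
The linear growth in RPC duration described above, and its disappearance
with N conductor processes, can be reproduced with a toy queueing model.
This is only a rough sketch: the function names, the round-robin
dispatch, and the 0.5 second database time per RPC are assumptions made
for illustration, not measurements or code from nova.

```python
def rpc_latencies(n_requests, n_conductors, db_seconds):
    """Return the completion latency of each RPC in a toy model.

    Assumptions (hypothetical, for illustration only):
    - all requests arrive at t=0;
    - each request needs `db_seconds` of blocking database time;
    - requests are spread round-robin over the conductor processes;
    - a process handles its requests strictly one at a time, because a
      blocking native driver stalls every coroutine in that process.
    """
    busy_until = [0.0] * n_conductors  # time at which each process is free
    latencies = []
    for i in range(n_requests):
        worker = i % n_conductors
        busy_until[worker] += db_seconds
        latencies.append(busy_until[worker])
    return latencies


def worst_case(n_requests, n_conductors, db_seconds=0.5):
    """Latency of the slowest RPC under the same assumptions."""
    return max(rpc_latencies(n_requests, n_conductors, db_seconds))


if __name__ == "__main__":
    for n in (1, 10, 20, 40):
        # columns: N, one conductor process, N conductor processes
        print(n, worst_case(n, 1), worst_case(n, n))
```

Under these assumptions the slowest of 40 concurrent RPCs waits 20
seconds with one conductor process and 0.5 seconds with 40 processes.
Real dispatch over the message queue is not perfectly round-robin, so
treat the second number as a best case.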
