Thanks for all the detailed analysis, Mike W, Mike B, and Roman.
 
I think replication is a must for a production-ready database system. So the
questions are which replication mode suits OpenStack, and how OpenStack can
use it to improve the performance and scalability of DB access.

In the current implementation of the database API in OpenStack, a master/slave
connection pair is defined to optimize performance. The developers of each
OpenStack component are responsible for making use of it in their application
context, while other people are responsible for architecting the database
system to meet the requirements of various production environments. There is
no general guideline for this. In practice, it is not easy to determine which
transactions can safely run on a slave, given the data consistency
requirements and business logic of the different OpenStack components.

The current status is that the master/slave configuration is not widely used.
Only Nova uses the slave connection, and only in periodic tasks that are not
sensitive to the state of replication. Because replication is asynchronous, a
query against a slave may return stale data, so the risks of using slaves are
apparent.

How about a Galera multi-master cluster? As Mike Bayer said, it is virtually
synchronous by default. Even so, a query can still return outdated rows, so
the results are not stable.

When using such eventual-consistency methods, you have to design carefully
which transactions can tolerate old data. AFAIK, whichever component it is
(Nova, Cinder or Neutron), most of the transactions are not that 'tolerant'.
As Mike Bayer said, a consistent relational dataset is very important, and
that holds for the OpenStack components in particular. This is why only
non-sensitive periodic tasks use slaves in Nova.

Let's move on to synchronous replication, like Galera with causal-reads on.
The main advantage is that it supports a consistent relational dataset. The
disadvantages are that it uses optimistic locking and that its performance
sucks (also said by Mike Bayer :-). As for the optimistic locking problem, I
think it can be dealt with by retry-on-deadlock, but that is not the topic
here.
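
For reference, retry-on-deadlock can be sketched roughly as below. This is
only a minimal illustration in plain Python, not an existing OpenStack API:
DeadlockError and retry_on_deadlock are hypothetical names. DeadlockError
stands in for whatever exception the DB driver raises when Galera rejects a
conflicting transaction and reports it as a deadlock. The retry count and
delay are arbitrary assumptions.

```python
import functools
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock exception."""


def retry_on_deadlock(max_retries=5, base_delay=0.05):
    """Re-run the wrapped function when it raises DeadlockError.

    The whole function is retried, so it must contain the complete
    transaction (begin, work, commit) and be safe to run again.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    if attempt == max_retries:
                        raise  # give up and let the caller see the error
                    # Randomized exponential backoff, so that competing
                    # writers do not collide again in lock-step.
                    time.sleep(base_delay * (2 ** attempt) * random.random())
        return wrapper
    return decorator
```

The important design point is that the retry wraps the whole transaction
rather than a single statement, because a failed commit discards everything
the transaction did.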

If we set the performance problem aside for a moment, a multi-master cluster
with synchronous replication is a perfect fit for OpenStack with any
combination of masters and slaves enabled, and it can truly scale out.

So transparent read/write separation depends on such an environment. The
SQLAlchemy documentation provides a code sample for it [1], and Mike Bayer
also has a blog post about it [2].

What I did was re-implement it in the OpenStack DB API modules in my
development environment, using a Galera cluster (causal-reads on). It has been
running perfectly for more than a week. The routing session manager works well
while maintaining data consistency.
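
To give a rough idea of the routing decision such a session manager has to
make, here is a library-independent sketch. It is not the SQLAlchemy recipe
from [1] or [2] and not the code running in my environment. Router, get_bind
(named after the hook the recipe overrides) and the node names are all
hypothetical. It assumes that once a transaction has written, its later reads
must stay on the same node so that it can see its own uncommitted changes.

```python
import itertools

# Statement verbs treated as writes. This is a simplification: for example,
# "SELECT ... FOR UPDATE" is not detected and would need extra handling.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}


class Router:
    """Send writes to one node and spread plain reads over the others."""

    def __init__(self, writer, readers):
        self.writer = writer
        # Fall back to the writer if no read nodes are configured.
        self._readers = itertools.cycle(readers or [writer])
        self._wrote = False

    def get_bind(self, statement):
        """Return the node that should execute this SQL statement."""
        words = statement.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb in WRITE_VERBS:
            self._wrote = True
        if self._wrote:
            # Stay on the writer until the transaction ends.
            return self.writer
        return next(self._readers)

    def end_transaction(self):
        """Call after commit or rollback to allow reads on any node again."""
        self._wrote = False


router = Router("node1", ["node2", "node3"])
for sql in ("SELECT * FROM instances",
            "UPDATE instances SET host = 'a'",
            "SELECT * FROM instances"):
    print(sql, "->", router.get_bind(sql))
```

With causal reads on, a read routed to another node after commit should see
the committed data, which is what makes this kind of routing safe to do
transparently.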

Back to the performance problem: in theory, turning causal reads on will
affect the overall performance of concurrent DB reads, but I cannot find any
report (official or unofficial) on the performance degradation it causes. In
my company's production system, Galera performance is tuned via network
round-trip time, network throughput, the number of slave threads, keep-alive
and the wsrep flow control parameters.

All in all: first, transparent read/write separation is feasible using a
synchronous replication method. Second, it may help large deployments scale
out without any code modification. Third, it needs fine-tuning (of course,
every production system does :-). Finally, I think that if we can integrate
it into oslo.db, it will be a real plus for those who would like to deploy
Galera (or a similar technology) as the DB backend.

[1] http://docs.sqlalchemy.org/en/rel_0_9/orm/session.html#custom-vertical-partitioning
[2] http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/
[3] Galera replication method: http://galeracluster.com/products/technology/


_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
