Re: [Sequoia] Sequoia with raidb1 distribution and mysql backend - consistency in high load environment

Emmanuel Cecchet Wed, 27 Aug 2008 09:17:29 -0700

Hi Tomasz,

Actually, the scheduler is now just a big on/off button that either lets a query go further into the controller or not; it does not really schedule queries in the sense of a database scheduler. The group communication just ensures that all queries are delivered in the same total order at each controller.

It is the role of the BackendTaskQueues (basically a database state machine) to serialize the writes to the same table and only allow writes to different tables to go in parallel (unless there are referential integrity constraints). This state machine ensures that, for a set of incoming queries in a total order, it will deterministically play the queries on the backend. As each backend runs exactly the same state machine, each backend is going to play the requests in the same order.

This means that if you have a workload with a high degree of writes executing in parallel, it will look really slow on Sequoia, which will serialize all the writes to make sure they are executed one by one in a deterministic order. This is the price to pay for consistency.
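To make the idea concrete, here is a minimal toy sketch in Python. It is not the actual Sequoia code: the class `BackendTaskQueue`, the function `replay`, and the table names are all hypothetical, and referential integrity constraints are ignored. It only illustrates that keeping one FIFO queue per table gives every backend the same per-table write order, even when the backends interleave work on different tables differently.

```python
from collections import defaultdict, deque


class BackendTaskQueue:
    """Toy model (hypothetical, not Sequoia's class): one FIFO of writes per table."""

    def __init__(self):
        self.queues = defaultdict(deque)   # table name -> pending writes, in arrival order
        self.executed = defaultdict(list)  # table name -> writes already played

    def enqueue(self, table, query):
        # Queries arrive in the total order delivered by group communication.
        self.queues[table].append(query)

    def run_next(self, table):
        # Play the oldest pending write for this table, if there is one.
        if self.queues[table]:
            self.executed[table].append(self.queues[table].popleft())


def replay(total_order, table_schedule):
    """Enqueue queries in total order, then play them following table_schedule.

    table_schedule models which table a given backend happens to work on next;
    it may differ from one backend to another.
    """
    backend = BackendTaskQueue()
    for table, query in total_order:
        backend.enqueue(table, query)
    for table in table_schedule:
        backend.run_next(table)
    return {table: list(queries) for table, queries in backend.executed.items()}


total_order = [
    ("users", "INSERT u1"),
    ("orders", "INSERT o1"),
    ("users", "DELETE u1"),
    ("orders", "UPDATE o1"),
]

# Two backends interleave the tables differently...
backend_a = replay(total_order, ["users", "orders", "users", "orders"])
backend_b = replay(total_order, ["orders", "orders", "users", "users"])

# ...but the order of writes within each table is identical.
print(backend_a == backend_b)
```

In this model, writes to `users` and `orders` are free to proceed independently, while two writes to the same table can never overtake each other.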


I hope this clarifies the way some of the things work internally.

Thanks for your interest in Sequoia,
Emmanuel

I would like to ask you about the mechanism responsible for data consistency in raidb1 distribution mode in Sequoia (2.10.10). There isn't much information about this in the documentation.
Assumptions:
Two Sequoia controllers in raidb1 distribution mode (scheduler set to PassThrough). Each controller has one MySQL backend.
I know that the communication protocol between controllers, and between controller and backend, provides data consistency (in particular, order is preserved).
As far as I know, the database engine does not guarantee that, in a high-load environment (many read/write operations with table locks on MyISAM), the first query that arrives at the engine is executed earlier than a query that arrives later. E.g., there are lots of INSERT and DELETE operations. If an INSERT operation was sent earlier than a DELETE operation, there is no guarantee that the engine executes the queries in the same order, and after these operations there may be a record that should have been deleted.
In a single-database environment this is an acceptable problem, but when we use Sequoia to replicate data to several backends, there could be a situation where the databases on two different backends hold different data. And that is not acceptable.
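The scenario above can be sketched with a tiny Python model (purely illustrative: the function `apply_statements` and the key `42` are made up, and a table is reduced to a set of keys). It shows two backends receiving the same two statements but executing them in different orders.

```python
def apply_statements(statements):
    """Apply INSERT/DELETE statements, in the given order, to an empty table."""
    rows = set()
    for operation, key in statements:
        if operation == "INSERT":
            rows.add(key)
        elif operation == "DELETE":
            rows.discard(key)  # deleting a missing row is a no-op
    return rows


sent = [("INSERT", 42), ("DELETE", 42)]

backend_1 = apply_statements(sent)                  # executed in the order sent
backend_2 = apply_statements(list(reversed(sent)))  # DELETE executed before INSERT

print(backend_1)  # set()
print(backend_2)  # {42}
```

The second backend keeps a row that the first one has already removed, which is exactly the kind of divergence the question is about.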
As far as I understand the documentation, the pessimisticTransaction scheduler could help (because it allows only one write operation at a time and puts no limit on read queries). But there is no such option in the raidb1 configuration.
At the end, my question is:
Is there a way in Sequoia to be sure that every database backend in a raidb1 scenario holds the same data?
Regards,

--
Tomasz Lemański

------------------------------------------------------------------------

_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia



--
Emmanuel Cecchet

FTO @ Frog Thinker
Open Source Development & Consulting

--
Web: http://www.frogthinker.org
email: [EMAIL PROTECTED]
Skype: emmanuel_cecchet
