[Dbix-class] Replicated Storage branch ready for review/discussion

John Napiorkowski Mon, 12 May 2008 16:24:09 -0700

Hey,

The 0.08 replication_redux branch has stabilized and is ready for review and 
comments.  This branch is basically a non-compatible rewrite of the original 
DBIC::Storage::DBI::Replication storage class, so I want to document the major 
changes and the reasons for them.  Also, while working on this branch I fixed a 
bunch of non-replication-related issues, and wanted to let people know about 
that.


DBIC::Storage::DBI::Replicated is an alternative storage engine that:

  - Splits read/write queries over two different storages while delegating to 
both storages when necessary (primarily during instantiation, so that all 
storages connect properly).

  - Defines a pool mechanism to hold a list of storages, set individual 
storages as active or not, and to validate the status and lag time of storages 
that are slaves in a replicated environment.

  - Defines a Balancer storage that can use various strategies to spread query 
load across a Pool.  It also defines a mechanism to automatically validate the 
Pool at a set interval (in seconds).

  - Defines a Replicant storage type that adds some functionality that is 
specific to replicant storages, such as an attribute for tracking whether the 
storage is active, and some additional debug output so that you can see which 
storage is handling the request.
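The read/write split described in the list above can be pictured with a small, self-contained sketch.  It does not use the branch's classes; the package name Sketch::Router, the hash-based storages, and the fallback to the master when no replicant is active are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical names throughout; this is not the branch's API.
package Sketch::Router;

sub new {
    my ($class, %args) = @_;
    return bless {
        master     => $args{master},
        replicants => $args{replicants} || [],
    }, $class;
}

# Reads go to the first active replicant.  Falling back to the
# master when no replicant is active is an assumption of this sketch.
sub read_handler {
    my ($self) = @_;
    for my $replicant (@{ $self->{replicants} }) {
        return $replicant if $replicant->{active};
    }
    return $self->{master};
}

# All data-changing queries go to the master.
sub write_handler {
    my ($self) = @_;
    return $self->{master};
}

package main;

my $router = Sketch::Router->new(
    master     => { name => 'master' },
    replicants => [
        { name => 'slave1', active => 0 },
        { name => 'slave2', active => 1 },
    ],
);

print 'read:  ', $router->read_handler->{name},  "\n";
print 'write: ', $router->write_handler->{name}, "\n";
```

Here the inactive slave1 is skipped, so reads land on slave2 while writes always land on the master.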

The basic purpose of these classes is to support the common 'master/slaves' 
replication environment, where all data-changing queries should be routed to 
the master, while all read queries are balanced over a pool of slaves.  This is 
a very common style of database scaling, so having good support for this in 
DBIC would be very valuable, particularly to companies that don't have the 
capital for hardware-based balancers and the attendant monitoring software.

I chose to break this out from blackbox style balancers (like DBD::Multi) 
because I was having some driver-specific issues that we've already solved with 
our list of database-specific storages, such as DBIC::Storage::DBI::mysql, and 
because this gives us more fine-tuned control over the storages.  For example, 
this system makes it easy to query information about a particular storage in 
the pool.  Also it should be easy to write a custom balancer, such as a round 
robin style balancer, or even a least connected balancer.  In general I think 
it makes sense to integrate this into DBIC.
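As an illustration of what a round robin style balancer does, here is a minimal standalone sketch.  The package name Sketch::Balancer::RoundRobin, the next_storage method, and the hash-based storages are hypothetical and are not taken from the branch.

```perl
use strict;
use warnings;

# Hypothetical balancer; names are for illustration only.
package Sketch::Balancer::RoundRobin;

sub new {
    my ($class, @pool) = @_;
    return bless { pool => [@pool], next => 0 }, $class;
}

# Return the next active storage, cycling through the pool in order.
# Returns nothing if the pool is empty or no storage is active.
sub next_storage {
    my ($self) = @_;
    my $size = @{ $self->{pool} };
    for (1 .. $size) {
        my $candidate = $self->{pool}[ $self->{next} ];
        $self->{next} = ($self->{next} + 1) % $size;
        return $candidate if $candidate->{active};
    }
    return;
}

package main;

my $balancer = Sketch::Balancer::RoundRobin->new(
    { name => 'slave1', active => 1 },
    { name => 'slave2', active => 0 },
    { name => 'slave3', active => 1 },
);

# Four consecutive reads alternate between the two active storages.
my @picked = map { $balancer->next_storage->{name} } 1 .. 4;
print "@picked\n";
```

A least connected balancer would keep the same interface and only change how next_storage chooses its candidate.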

The test for this is t/93storage_replication.t, which defines a sqlite 
compatible test (using copy to fake replication) but allows you to override the 
master and replicant connect info so that you can test it on your own 
replicated environment.  I tested it against mysql native replication.

Places that probably could stand more abstraction would be the system for 
splitting read/write queries, which is currently integrated into Replicated.pm, 
and the timer that the balancer uses to track when to validate the pool of 
slaves.  I actually did work on some separate query counter/timer event code, 
and will likely cut a branch shortly for it, so if anyone else is interested or 
could use something like that, please speak up.

Changes made to core DBIC classes include:

DBIC::Schema:

- changed the storage_type class accessor so that it can accept a hashref in 
addition to a string, in order to support storages that require args.

 Example:

$schema->storage_type({'Replicated' => \%options});

DBIC::Storage::DBI:

- added two virtual methods, 'is_replicating', 'lag_behind_master', to support 
the replication pool validation feature.  Added support for these methods in 
the mysql specific storage.

I realize in a lot of ways it's not ideal for these methods to be part of the 
base DBI storage, since not all storages will be replicating.  Suggestions on a 
better way to abstract this functionality would be welcomed.
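To show how pool validation might use the two methods named above, here is a rough standalone sketch.  Only the method names is_replicating and lag_behind_master come from this post; Sketch::Replicant, validate_pool, the active accessor, and the assumption that lag is a number of seconds compared against a maximum are all hypothetical.

```perl
use strict;
use warnings;

# Stand-in replicant that returns canned values; a real storage
# would ask the database for its replication status.
package Sketch::Replicant;

sub new {
    my ($class, %args) = @_;
    return bless { %args, active => 1 }, $class;
}

sub is_replicating    { $_[0]{replicating} }
sub lag_behind_master { $_[0]{lag} }

# Combined getter/setter for the active flag.
sub active {
    my $self = shift;
    $self->{active} = shift if @_;
    return $self->{active};
}

package main;

# Mark each replicant active only if it is replicating and its lag
# does not exceed $max_lag (assumed here to be seconds).
sub validate_pool {
    my ($max_lag, @replicants) = @_;
    for my $r (@replicants) {
        my $ok = $r->is_replicating && $r->lag_behind_master <= $max_lag;
        $r->active($ok ? 1 : 0);
    }
    return grep { $_->active } @replicants;
}

my @pool = (
    Sketch::Replicant->new(name => 'slave1', replicating => 1, lag => 2),
    Sketch::Replicant->new(name => 'slave2', replicating => 1, lag => 90),
    Sketch::Replicant->new(name => 'slave3', replicating => 0, lag => 0),
);

my @usable = validate_pool(30, @pool);
print join(',', map { $_->{name} } @usable), "\n";
```

With a maximum lag of 30, only slave1 stays active: slave2 is too far behind and slave3 is not replicating.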

Additionally, I made several small changes to the test suite to fix bugs I 
discovered when trying to run the core tests against my mysql replicating 
setup.  There were a couple of bad FK constraints that died on mysql, and I 
changed a test for a self-referential table so that it worked when the db 
actually enforced the constraints.  I wasn't able to get the entire test suite 
to run against mysql because there are a large number of tests that assume 
sqlite, but now if you write a new test you could target other databases and 
check that by overriding the DBICTEST_DSN/DBUSER/DBPASS environment variables.  
I really recommend authors do this in the future, since it can only make our 
test coverage even better.

Along with this is a change I made to SQLT::Parser::DBIx::Class to better 
support mysql FK constraints that involve several columns in the constraint.  
However, this will require a patch to SQLT::Producer::Mysql as well to be fully 
fixed.  If anyone cares, talk to me about it.

I think the code is pretty clean and well documented (I even added POD to the 
test case) but please point out any trouble or areas of confusion.

Thanks for people's advice and thoughts on this so far.
John Napiorkowski


      

