On Jan 8, 2007, at 10:10 PM, Mark Stosberg wrote:

Henri Asseily wrote:

I don't know if I uploaded the latest version of DBIx::HA to CPAN, I'll
do it asap.

The docs for DBIx::HA say

"It is very simple to load-balance across read-only database servers.
Simply randomize or reorder the 'db_stack' entry in your database
configuration on a per-process basis."

I know it could be dead simple, but could an example of what this looks
like be provided?

Take the example of the pod:

$DATABASE::conf{'test'} = {
 max_retries => 2,
 db_stack => [
[ 'dbi:Sybase:server=prod1;database=test', 'user1', 'pass1', $attrib ], [ 'dbi:Sybase:server=prod2;database=test', 'user2', 'pass2', $attrib ], [ 'dbi:Sybase:server=prod3;database=test', 'user3', 'pass3', $attrib ],
 ], ...

Assume his 'test' handle is a read-only handle.
The stack definition shows that you have 3 read-only servers: prod1, prod2 and prod3. The particular process that uses this stack will start with prod1, and if it fails max_retries times, it will switch to the next one in the stack, prod2, and so on. If you want to have your application to be load-balanced across all the read-only databases, you can do it in many ways, but ultimately it boils down to having your children processes hitting different databases, and having stacks that are as unique as possible (so that if a db server fails, all its dependents don't fail over to the same secondary).

Here's a complex example:
In the case above, say you have 20 children for those 3 databases.
The different stack permutations are:

1: prod1,prod2,prod3
2: prod1,prod3,prod2
3: prod2,prod1,prod3
4: prod2,prod3,prod1
5: prod3,prod1,prod2
6: prod3,prod2,prod1

Either dynamically or statically create these 6 permutations and put them in @all_db_stacks: Then when one of your 20 children starts up, it will take a stack. Either have each take the next stack down, or randomly take a stack. One example under Apache if you don't want to have a shared global is to grab the child's process id and get its modulus by the size of @all_db_stack:
$stack_number = $$ % scalar(@all_db_stacks)

And so in the example config above:
db_stack => $all_db_stacks[$stack_number]

Assuming that Apache recycles its children pretty uniformly, you should have a uniform distribution across the database servers. If hypothetically the child process are 1 to 20, then stacks 1 and 2 will be allocated to 4 children each, while stacks 3,4,5 and 6 will be allocated to 3 children each for a total of 2x4+4x3=20 children. So when all the db servers are functioning properly, prod1 should have 8 clients, prod2 and 3 should have 6 each. If prod1 dies, its children get split evenly between prod2 and prod3.


How well has this worked in practice? So, if an Apache/mod_perl
application has 24 children, about 12 would end up talking to each DB as
a primary?

In the case where you have 24 children and 2 database servers, sure. You'll have 12 children hitting one db, and 12 hitting the other. And if one dies, all 24 will hit the remaining live one. There's no other way.


Also, DBD::Multi /does/ have this feature-- automatically randomly
selecting between to dbs with the same priority. Perhaps this feature
could be merged in.

DBIx::HA focuses on high availability. This means not only load balancing, but failover, especially in a mod_perl environment where the children are always connected. You can't just randomly select a db for each call for a number of reasons, the biggest one being that you'd need a live connection to all the dbs. In the case of 3 database servers and 100 children, each db server would need to have 100 connects active. Now that may not be a problem for you, but when you have 100 web servers with 50 children each, few database servers are happy with 5,000 concurrent open connections.

In such large setups, you also want good control over which servers/ children connect to which possible databases. In the case when you have 100 web servers, you'd probably want each server to have its own database failover stack, with all its children following the same pattern and using the same resources. This helps immensely the operations people who want to know machine and resource dependencies.

I found that using a simple low-tech stack system allowed for many possibilities, even though it's not very glamorous.


I do like the "time out" feature that DBIx::HA has.

That's another critical issue for high availability. There are too many ways a database server can be unavailable, and the standard timeouts of the database drivers are completely unreliable because they're 'safe'. DBIx::HA uses unsafe timeouts, so technically the Perl interpreter could be in a bad state after a timeout, but practically I've never seen this happen and being able to time out a query or a connect is an absolute necessity.


As to how well DBIx::HA has worked in practice, we're using it heavily at Shopzilla. One of the systems that's using it has hundreds of modperl servers hitting a couple dozen database servers. Database queries run in the tens of millions a day, etc...

That system is actually so large that we are considering going to the next step beyond DBIx::HA because the system is so big that even with DBIx::HA and the web server children connecting only to the first database server in the stack, there are too many connections on a single database server. At some point, when you optimize the database calls you end up being able to throw many more clients at a db, ultimately overloading its capacity to manage thousands of constantly open connections. So now we need to pool the connections, but that's another story for another time.

H


Reply via email to