On 03/23/2010 10:30 AM, MARK CALLAGHAN wrote:
On Tue, Mar 23, 2010 at 9:23 AM, Robert Hodges
<robert.hod...@continuent.com>  wrote:

Just a follow-on about some problems that aren't solved by prefetching,
because the issue is not fundamentally a problem of speeding up
single-thread I/O.  Also, education does not help.  I'm curious how you,
Jeremy, and others have addressed these in the past.

There are great solutions and good solutions we can use right now
(statement based replication + mk-slave-prefetch). With statement
based replication, one stream per shard won't help much if you allow
complex/slow joins in your binlog.

Education is required. People don't want to hear this. But from my
experience a lot of problems are caused by SQL app developers. We have
a special mailing list at work devoted to such problems and make sure
they are highly visible.

I wholeheartedly agree with this. The #1 performance fix for most of the clients I visited was rewriting queries.


1.) Slave catch up.  Loading a backup and then synchronizing is painfully
slow for large data sets.  This is a great place to look at parallelization
if you know the data are sharded.

Running one stream per shard would help but the biggest improvement I
can get right now is to eliminate IO wait in the slave. That is a much
bigger win for me than running the CPU bound parts in parallel. For my
binlog content -- one stream without IO waits is much faster than one
stream per shard with IO waits. Of course one stream per shard without
IO waits is even better.
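
A back-of-the-envelope model of the comparison above, as a rough sketch only. It assumes every binlog event costs a fixed amount of CPU time plus a fixed IO wait, and that events split evenly across shards with no cross-shard dependencies. The event count, per-event costs, and shard count are made-up numbers for illustration, not measurements from any real slave.

```python
def apply_time_ms(events, cpu_ms, io_ms, streams=1):
    """Idealized wall-clock time to apply `events`, split evenly over `streams`."""
    return events * (cpu_ms + io_ms) / streams

# Hypothetical workload: IO wait dominates the CPU cost of each event.
events, cpu_ms, io_ms, shards = 1_000_000, 0.1, 5.0, 8

one_stream_io = apply_time_ms(events, cpu_ms, io_ms)               # today
per_shard_io = apply_time_ms(events, cpu_ms, io_ms, shards)        # parallel, still waiting on IO
one_stream_no_io = apply_time_ms(events, cpu_ms, 0.0)              # IO wait eliminated (e.g. prefetch)
per_shard_no_io = apply_time_ms(events, cpu_ms, 0.0, shards)       # both

for name, ms in [("one stream, IO waits", one_stream_io),
                 ("one stream per shard, IO waits", per_shard_io),
                 ("one stream, no IO waits", one_stream_no_io),
                 ("one stream per shard, no IO waits", per_shard_no_io)]:
    print(f"{name:36s} {ms / 1000:10.1f} s")
```

Under this model, one stream without IO waits beats one stream per shard with IO waits whenever io_ms / cpu_ms exceeds shards - 1, which is the IO-bound regime described above.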


2.) Shared resource computing.  Keeping tenants in multi-tenant systems
(including both SaaS as well as ISPs) from hogging resources is a big
problem.  For example, I'm working on an engagement now where single tenants
can do operations that induce a couple of hours of slave lag, thereby
blocking replication for all other tenants as well.  This problem is
sufficiently widespread that you want replication to handle it through
configuration rather than app changes, which are often either very costly or
difficult/impossible to do.

Ouch. You certainly want one stream per shard/tenant in that case. You
will have many other problems independent of replication if many
tenants run in the same MySQL instance. Are any of these hosting
providers willing to fund the solutions for these problems?
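
To illustrate why one stream per tenant helps here, a small sketch comparing when each event finishes applying under a single stream versus independent per-tenant streams. It assumes all events are already queued at time zero, tenants never touch each other's data, and the slave has enough capacity to run the streams concurrently. The tenant names and costs are hypothetical.

```python
from collections import defaultdict

def finish_times(events, stream_of):
    """Finish time of each event when events in the same stream are applied
    serially in binlog order and different streams run independently."""
    clock = defaultdict(float)  # elapsed apply time per stream
    result = []
    for tenant, cost_s in events:
        stream = stream_of(tenant)
        clock[stream] += cost_s
        result.append((tenant, clock[stream]))
    return result

# Hypothetical binlog: tenant_a runs a two-hour operation ahead of two small writes.
events = [("tenant_a", 7200.0), ("tenant_b", 0.5), ("tenant_c", 0.5)]

single = finish_times(events, stream_of=lambda tenant: "all")      # one shared stream
per_tenant = finish_times(events, stream_of=lambda tenant: tenant)  # one stream per tenant

print("single stream:", single)
print("per tenant:   ", per_tenant)
```

In the single-stream case the small writes for tenant_b and tenant_c finish only after the two-hour operation; with per-tenant streams they finish in half a second each, while ordering within each tenant is unchanged.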


Funny - I just happen to work for such a hosting provider...

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : drizzle-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
