Brian Aker <[EMAIL PROTECTED]> writes:

> It crosses my mind several times a day that if we required primary
> keys on tables that replication with be far less of a hack (and work
> quite a bit better). Why? Because we could then target update/delete

Yes, I also like the idea of requiring a primary key, for several reasons,
including replication.

And with 'require a primary key', I mean that the server should automatically
add a hidden primary key on the general layer (not in the storage engine) if
none is specified by the user. Or maybe if there is a non-unique index, add a
hidden extra column on that index to make it unique.

Row-based replication really does not work well without a primary key. In
effect, if there is no primary key, row-based replication uses the combination
of all fields as the key. If there is no index on the slave table,
_every_single_ replicated update or delete row event will do a full table
scan!

I don't think using an engine internal ROWID is a solution for
replication. Such an internal id is just that - internal. And the strong point
about replication is that it is external; it gives freedom to replicate data
without being restricted by engine internals. The rowid you typically have in
non-clustered engines such as MyISAM, Falcon, Oracle, ... is little more than
a pointer to an offset into a datafile/tablespace. Now you cannot replicate
between different engines, probably also not replicate between different
version of the same engine (row size change?), cannot replicate between
different setups of table space/storage on master and slave ... most of the
benefits of MySQL-style replication seem lost.

Both InnoDB and NDB adds their own hidden primary key if none is specified by
the user. Unfortunately, it causes quite a lot of complications to the storage
engine interface code that the upper MySQL layers do not know about this
hidden key. It adds _lots_ of special cases, due to no space in the record for
the hidden primary key, user-defined partitioning, and others. Much simpler if
the upper layer could add it once for all engines that need it.

About the only use for replicating without a primary key is an insert-only
table (eg. logs). Adding an auto_increment primary key to that really does not
cause much overhead, and solves a lot of problems with replication and
otherwise.

 - Kristian.

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Reply via email to