Antony, those sound like interesting ideas, and I hope you can get them
working. But a one-way replicable db with full-consistency guarantees
is not what CouchDB was ever intended to be (to be clear, this is a
statement of fact). I don't disapprove of it, but it doesn't fit with
CouchDB's design. I'm also not convinced of its utility in the
general case; to me, the one-way replication limitation doesn't solve
interesting problems, but that's just my opinion.
Because what you are proposing isn't really what CouchDB was designed
for, it would take a lot of reworking of the designs of the
replicator, the storage engine and the exposed APIs to get it working
without what I consider to be onerous limitations. That's a lot of
work. Currently I'm not interested in doing it; I've got an existing
to-do list a mile long. That means you or someone else in the
community will have to lead that effort. But if you succeed and your
changes meet the community's approval and coding standards, it will be
rolled back into the core project.
I don't want to discourage you, but I want you to understand this is a
big undertaking and a change of direction for CouchDB that will
require a lot of effort.
-Damien
On Feb 9, 2009, at 6:03 AM, Antony Blakey wrote:
On 09/02/2009, at 1:07 PM, Damien Katz wrote:
would be a big problem of replicating huge databases. Everything
must come over in one transaction.
It doesn't *have* to come in one transaction, but I understand the
problem you are talking about - the commit point may delimit the
entire database, and always will for the first replication.
Particularly an issue for me because I'm approaching 4G of data to
replicate.
And the problem then with what I've previously suggested is that you
have to decide beforehand whether to fail on a given replication or
not, and if you don't then you have to maintain a write lock
potentially forever, subject to upstream failure. There seem to be
two solutions to this:
1. Allow a replication rollback commit point (e.g. header) to
persist so that rollback can occur at any time. This will allow a
target to abandon a long, incremental replication after several
attempts. A replication stream would not necessarily end with a
commit marker.
2. Record commit boundaries, possibly by recording a transaction-seq
with each document rev. I'm aware that this *is* about reifying
underlying transactions, but at least the reification is implicit.
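As a rough illustration of option 2, here is a minimal Python sketch. It is not CouchDB code, and the names RevRecord, txn_seq, Target and apply_stream are hypothetical. It assumes each replicated rev carries the sequence number of the source transaction that wrote it, and that records arrive ordered by that number. The target only applies whole transaction groups, so an interrupted stream never leaves it between commit boundaries.

```python
from dataclasses import dataclass, field
from itertools import groupby


@dataclass(frozen=True)
class RevRecord:
    doc_id: str
    rev: str
    txn_seq: int  # assumption: id of the source transaction that wrote this rev


@dataclass
class Target:
    docs: dict = field(default_factory=dict)  # doc_id -> latest applied rev
    last_txn_seq: int = 0  # last source transaction applied in full

    def apply_stream(self, records, stream_complete):
        """Apply records one whole transaction at a time.

        `records` must be ordered by txn_seq. If the stream was cut off
        (stream_complete=False), the final group may be partial, so it is
        skipped and picked up on the next attempt.
        """
        groups = [(seq, list(g))
                  for seq, g in groupby(records, key=lambda r: r.txn_seq)]
        for i, (seq, group) in enumerate(groups):
            is_last = i == len(groups) - 1
            if is_last and not stream_complete:
                break  # cannot tell whether this transaction is complete
            if seq <= self.last_txn_seq:
                continue  # already applied by an earlier attempt
            for record in group:
                self.docs[record.doc_id] = record.rev
            self.last_txn_seq = seq
        return self.last_txn_seq


stream = [
    RevRecord("a", "1-x", 1),
    RevRecord("b", "1-y", 1),
    RevRecord("a", "2-z", 2),
]
target = Target()
after_interrupted = target.apply_stream(stream, stream_complete=False)
docs_after_interrupted = dict(target.docs)
after_retry = target.apply_stream(stream, stream_complete=True)
```

In this sketch an interrupted attempt still makes progress up to the last boundary it can prove complete, and a retry resumes from last_txn_seq, so no long-lived write lock is needed on the target.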
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
A priest, a minister and a rabbi walk into a bar. The bartender says
"What is this, a joke?"