On 19/01/2009, at 3:51 PM, Paul Davis wrote:
There can be many _external processes for a single definition. So, not
only are requests not serialized, they can be concurrent etc.
OK. I'll be patching my deployment to ensure a single external process
per external definition.
IMO the _external system is considerably less useful in this form,
especially for external indexing. Concurrency and consistency should
be a matter for the external system to control, because it's the
external system that understands/imposes/relaxes the concurrency and
serialization requirements.
Maybe an external that acts more like a real server, even if the
single command channel needs a request multiplexing protocol.
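
To make the multiplexing idea concrete, here is a minimal sketch, not how
_external works today. It assumes a hypothetical envelope in which every
request line is a JSON object with an "id" and a "req" field, and every
response line echoes that "id", so responses can come back in any order
over the one channel. The names serve, handler, "id", "req" and "resp" are
all invented for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed


def serve(lines, handler, workers=4):
    """Answer a batch of request lines concurrently over one channel.

    Each input line is assumed to be {"id": ..., "req": ...} (hypothetical
    envelope). Each yielded line is {"id": ..., "resp": ...}; responses are
    yielded as they complete, so their order may differ from request order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {}  # future -> request id
        for line in lines:
            msg = json.loads(line)
            pending[pool.submit(handler, msg["req"])] = msg["id"]
        for fut in as_completed(pending):
            yield json.dumps({"id": pending[fut], "resp": fut.result()})
```

The caller matches responses to requests by "id" rather than by position.
Note that this sketch reads the whole batch before yielding anything; a
long-running server would interleave reading and writing instead.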
A single _external process should only see monotonically increasing
update_seq's. I think it's technically possible to have a smaller
update_seq processed later in time in a different os process though
(later in time <= few ms).
Possible => broken.
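
As a rough sketch of the check an external process would need in order to
notice this, assuming each request somehow exposes the database name and
the update_seq it was issued against (the SeqGuard class and its check
method are hypothetical, not part of couch):

```python
class SeqGuard:
    """Remembers the highest update_seq seen per database."""

    def __init__(self):
        self._last = {}  # db name -> highest update_seq seen so far

    def check(self, db, update_seq):
        """Return True if update_seq does not go backwards for this db."""
        last = self._last.get(db)
        if last is not None and update_seq < last:
            return False  # an older snapshot arrived after a newer one
        self._last[db] = update_seq
        return True
```

This only works within one process. With several os processes per
definition, each one sees only part of the request stream, so none of them
can detect the reordering on its own.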
The ideas from the other thread about having a UUID per db and
compaction are interesting. Are either of those included in the fs layout
stuff you were working on?
No. UUIDs aren't useful for the fs because you need a strictly functional
mapping from name -> file, and using a UUID is begging the question.
The compaction issue isn't real. My first thought is that the purge
issue could be dealt with by a) having a notification of the purge and
b) having the purge_seq be set to the update_seq of the snapshot seen
by the purge. Maybe it works that way already.
I definitely prefer state transitions to be reified rather than
notified, and IMO it's more consistent with the overall couch model.
Personally I think an _external system with a richer protocol is
required, rolling in notifications with the requests, so that an
external system can maintain accurate state-correspondence with the
canonical couch data, without exceptions e.g. without needing some
sideband for database life-cycle events. It should also be possible to
query a given snapshot, using either the request channel or an
additional parameter via HTTP. The request channel is by far the better
idea because the snapshot can be implicitly scoped by the request.
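
A minimal sketch of what "rolling notifications in with the requests"
could look like from the external's side. Everything about the message
shape here is an assumption for illustration: the "type" values
("db_created", "db_deleted", "request") and the "db" and "update_seq"
fields are hypothetical, and ExternalState is an invented name. The point
is only that life-cycle events and requests arrive on one ordered channel,
so the external's state never needs a sideband.

```python
import json


class ExternalState:
    """Mirrors couch state from a single ordered stream of messages."""

    def __init__(self):
        self.dbs = {}  # db name -> update_seq of the last request seen

    def handle(self, line):
        """Process one message line; return a response line or None."""
        msg = json.loads(line)
        kind, db = msg["type"], msg["db"]
        if kind == "db_created":
            self.dbs[db] = 0
            return None  # notifications get no response in this sketch
        if kind == "db_deleted":
            self.dbs.pop(db, None)
            return None
        if kind == "request":
            if db not in self.dbs:
                return json.dumps({"code": 404, "body": "unknown db"})
            # The request implicitly scopes the snapshot it refers to.
            self.dbs[db] = msg["update_seq"]
            return json.dumps({"code": 200, "body": "seq %d" % msg["update_seq"]})
        raise ValueError("unknown message type: %r" % kind)
```

Because a "db_deleted" message is ordered relative to the requests around
it, the external can never answer a request against state it should
already have discarded.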
I am adding the db UUID, view function UUID and revs in view results
for my own purposes, but there wasn't much interest on the list, and I
haven't the time to convince/shepherd/clean/publish etc that proposal
or implementation.
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd