Sorry, I planned to post this about two weeks ago :-) but here it goes...

I've been chatting with Peter about persistence for s5, and the version-control-like functionality we discussed some time ago on the list. He said both things should go hand in hand, and I tend to agree. One feature I've always wanted for version control is a "horizon": being able to configure how many revisions to store. By setting that to 1, you effectively get persistence without version control.

The basic idea
==============

Peter is building a replication mechanism into s5. There are two separate concepts, "site" and "host": a site is a tree of vobjects (same concept as in s4), and a host is a "realisation" of a site, the implication being that a site may be "clustered" on a bunch of hosts. So the replication system is used to pass update data between hosts that share a given object, but also to handle remote "mirror" objects (same concept as the s4 remote vobject). Both are done by some variation of the subscriber pattern.

Now, the first thing that floats up from this is that you could have cluster setups where only one host does persistence; on cluster "bootup", this host would read the objects and initialise the whole cluster.

What we're thinking is to take this one level further and implement persistence itself as a "host". So even in a single-host setup, persistence would be a discrete thing that sits in a corner and communicates with the host via the inter-host replication "protocol". That would allow it to be off-process, if you want.

Now, in version control terms, you can think of each host as a branch. So synchronisation between hosts is equivalent to a merge, and updating a mirror object is like updating a working copy (the "update" command in bzr/svn/cvs/git/etc).

In both cases, the "client" host will send a reference (the id of the revision it already has, which may be "none"). For an update, the "sending" host may calculate which is more efficient to send, a full copy or a delta. For a merge, we want instead to send all revisions the "client" host doesn't have. (Maybe we could tie the horizon setting in here: if the "client" sends its horizon preference with the request, and the number of "new" revisions exceeds the horizon, just send the last N.)
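
To make the update/merge distinction concrete, here is a rough sketch in Python. Everything in it is hypothetical: the names (`Revision`, `answer_merge`, `answer_update`), the assumption that an object's history is a simple linear list, and the cost model (a per-revision `delta_size` compared against the length of the full state) are mine, not part of Peter's replication design.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Revision:
    rev_id: str      # a GUID in the real design; short strings here
    state: str       # full object state at this revision (illustrative)
    delta_size: int  # assumed cost of sending this revision as a delta


def missing_revisions(history: List[Revision], have: Optional[str]) -> List[Revision]:
    """Revisions after `have`, oldest first; the whole history if `have` is unknown."""
    ids = [r.rev_id for r in history]
    if have is None or have not in ids:
        return list(history)
    return history[ids.index(have) + 1:]


def answer_merge(history: List[Revision], have: Optional[str],
                 horizon: Optional[int] = None) -> List[Revision]:
    """Merge: send every revision the client lacks, capped to the last `horizon`."""
    missing = missing_revisions(history, have)
    if horizon is not None and len(missing) > horizon:
        missing = missing[-horizon:]
    return missing


def answer_update(history: List[Revision], have: Optional[str]) -> Tuple[str, object]:
    """Update: send whichever is cheaper, the deltas or a full copy of the latest state."""
    latest = history[-1]
    known = have is not None and have in [r.rev_id for r in history]
    if not known:
        return ("full", latest.state)
    missing = missing_revisions(history, have)
    if sum(r.delta_size for r in missing) < len(latest.state):
        return ("delta", missing)
    return ("full", latest.state)


history = [
    Revision("r1", "x" * 10, 10),
    Revision("r2", "x" * 12, 2),
    Revision("r3", "x" * 12, 3),
    Revision("r4", "x" * 20, 8),
]
merged = answer_merge(history, "r1", horizon=2)   # only the last two revisions
kind, payload = answer_update(history, "r3")      # a single small delta
```

A real implementation would have to cope with non-linear histories, which this sketch ignores.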

(Although according to Peter, "host == branch" is not entirely correct; a host could have more than one site, so it's probably more accurate to say a site is a repository, in the bzr sense.)

Version control: what's stored
==============================

A version control branch corresponds to a "concrete site", by which I mean the information about one site as seen by one host (as opposed to the "full site", which is the most up-to-date version as seen by the whole network of hosts that hold that site).

Only one piece of information is held at this level: a full list of all object ids in the site (so we can track things like the revision in which an object was created or deleted).

The real bulk of it is at vobject level. You could say a branch is made of a soup of vobject histories. (By "soup" I mean they aren't stored in any kind of hierarchy or order.) A given vobject history is "tagged" by the (immutable) object id. Each revision of a vobject in history holds:

- type list
- child list
- payload, if any (eg properties)
- security capabilities

Version control: how it's stored
================================

Atomicity is of course important, and a "transparent" version control system is only useful if merging is smart, because there will be no human element to resolve conflicts. This all ties into how historic information should be stored internally.

Most of the brainpower in version control projects like bzr over the last few years has gone into line-based approaches. Although bzr has evolved past the weave format, that format is very easy to explain: a weave is a sequence of line groups, where each group is marked with the id of the revision in which it was added or removed. So a weave of numbers instead of lines, using {} to represent an addition, [] for a removal, and letters for revision ids, could look like this:

  {a537[b[c1]b{b94}]c{c20}68}

This would represent revision "a" being 537168, "b" being 5379468, and "c" being 5372068.
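
The toy weave above can be read back with a few lines of Python. This is only a sketch of the notation used in this mail, not of bzr's actual weave code. It assumes revision ids are single letters, elements are single digits, and that "b" and "c" were each derived independently from "a" (which is what the example implies); `extract` and `ANCESTRY` are hypothetical names.

```python
WEAVE = "{a537[b[c1]b{b94}]c{c20}68}"

# Assumed ancestry: "b" and "c" both descend from "a", but not from each other.
ANCESTRY = {
    "a": {"a"},
    "b": {"a", "b"},
    "c": {"a", "c"},
}


def extract(weave: str, ancestry: set) -> str:
    """Rebuild the sequence seen by a revision with the given ancestry."""
    inserts = []     # stack of revision ids whose {...} groups are open
    deletes = set()  # revision ids whose [...] groups are open
    out = []
    i = 0
    while i < len(weave):
        ch = weave[i]
        if ch == "{":            # "{x": start of elements added in x
            inserts.append(weave[i + 1])
            i += 2
        elif ch == "}":          # end of the innermost addition
            inserts.pop()
            i += 1
        elif ch == "[":          # "[x": start of elements removed in x
            deletes.add(weave[i + 1])
            i += 2
        elif ch == "]":          # "]x": end of the removal made in x
            deletes.discard(weave[i + 1])
            i += 2
        else:
            # An element is visible if every enclosing addition is in our
            # ancestry and no open removal is.
            if all(r in ancestry for r in inserts) and not (deletes & ancestry):
                out.append(ch)
            i += 1
    return "".join(out)


for rev in "abc":
    print(rev, extract(WEAVE, ANCESTRY[rev]))
```

The same loop works for any element type once the markers are stored as separate records rather than characters in a string.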

For a proper explanation, see http://bazaar-vcs.org/BzrWeaveFormat?highlight=%28weave%29 :-)

The important point here is that weaves, and more recent formats in the same vein, operate on sequences. We often think of them as operating on lines, but essentially they're about sequences. Just as my example used a sequence of numbers, we can easily use them for sequences of pretty much anything.

Now, look at the 5 things we're storing in our version control:

- global list of ids
- vobject list of types
- vobject list of children
- vobject payload
- vobject list of capabilities

See a pattern? ;-)

The only one that "doesn't belong" is the payload. Most payloads will be a small discrete value (like a number), but occasionally we'll have long chunks of text, and those are the ones it's important to be smart about. So I think we can treat payload as bzr treats text files, and store it as a sequence of lines.

This gives us "smart merge" for pretty much everything; with this kind of format, the situations where you get a conflict are MUCH rarer than with the usual one delta per revision.

I'll probably start this project by writing a "libmerge" or something like that in C++, implementing a version of whatever is the latest tech in bzr, for handling arbitrary (STL) sequences.

Revisions and transactions
==========================

Revisions are identified by GUIDs, rather than numbers, because numbers change during merges.

Based on the actor model of s5, we came to this revision model: by default, when an object (actor) finishes a "request", a revision is committed. By "request" I mean that it can call other methods **in the same object** to help out, and these methods won't trigger commits when they return. A revision corresponds to all changes that were made in response to a message from another object (local or remote).
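
As a rough illustration of that default, here is a minimal Python sketch. The class and method names (`VObject`, `receive`, `Counter`) are made up for the example and are not s5 API; state is a plain dict, and a revision is just a GUID paired with a snapshot.

```python
import uuid


class VObject:
    """Toy actor: commits a revision when a request from outside finishes."""

    def __init__(self, **state):
        self.state = dict(state)
        self._last_committed = dict(state)
        self.revisions = []  # list of (guid, snapshot) pairs

    def receive(self, method, *args):
        """Handle a message from another object, then commit if anything changed."""
        result = getattr(self, method)(*args)
        self._commit_if_changed()
        return result

    def _commit_if_changed(self):
        # No changes, no new revision.
        if self.state != self._last_committed:
            self._last_committed = dict(self.state)
            self.revisions.append((str(uuid.uuid4()), dict(self.state)))


class Counter(VObject):
    def bump_twice(self):
        # Helper calls inside the same object are plain method calls,
        # so they do not trigger commits of their own.
        self._bump()
        self._bump()

    def _bump(self):
        self.state["n"] += 1

    def peek(self):
        return self.state["n"]


c = Counter(n=0)
c.receive("bump_twice")  # two internal changes, one revision
c.receive("peek")        # nothing changed, so no revision
```

Only messages that arrive through `receive` end with a commit, which is the distinction between a "request" and an internal helper call.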

Tying into the actor model, this revision will *only* commit the changes to that vobject; that's especially important, since at that point a different thread may be running something else, for a different vobject. Also, of course, if there were no changes, there's no point introducing a new revision.

It's important to note that during a "request", the object may (and probably will, in many cases) send messages to other local objects. These will trigger commits when they return! And if those other objects call back to the "first" object, then that will cause a commit too... which may end up committing some changes that were made by the original method. We think that in normal usage this shouldn't be a problem, so it's a reasonable default behaviour.

You can, of course, escape the default. I imagine having a method at the host level which unconditionally commits a new revision, taking a set of objects as an argument. (Well, not quite unconditionally -- rather, as long as there have actually been any changes.)

The other thing you can do is a larger revision, essentially stopping auto-commit for some time. The way this happens internally is more similar to a bzr "microbranch" than to an SQL transaction, but we can still call it a transaction.

So one host method "branches" the current execution line from the latest host revision. From that point, all new revisions from methods in "child requests" (or the same call stack) will be on that "microbranch", whether explicit or automatic. It's important to note that, since this is a branch, those methods won't see any concurrent changes to the objects made by calls outside the branch. This is intentional and important. (I think.)

Then, of course, there would be a method to explicitly reconcile the "microbranch", or it would happen automatically at the end of the request where it was created.

What makes this branch "micro" is that, at the merge point, all "internal" revisions are discarded; what gets committed to the main branch is one single revision, accumulating all changes.

Peter, shoot me down if that sounds too hard to do :-) It would imply being able to have more than one "version" of the same object in memory on the same host, and knowing which one is the "right" one for a given call...

Horizons
========

Off the top of my head, I think we'd like to be able to set a horizon per host, per type, and per vobject, in increasing order of precedence (vobject overrides type, type overrides host).

What if a vobject has two different types that specify a horizon? Respect the first? The last?

What else
=========

The "protocol" for replication is a whole other can of worms; I'll let Peter talk about that when he wants. One point he asked me to remember is that "cluster" replication propagates capability lists, while "mirror" replication doesn't. Probably.

best,
                                               Lalo Martins
-- 
      So many of our dreams at first seem impossible,
       then they seem improbable, and then, when we
       summon the will, they soon become inevitable.
-----
personal:                    http://lalo.hystericalraisins.net/
technical:                    http://www.hystericalraisins.net/
GNU: never give up freedom                 http://www.gnu.org/