> So, what do people think should go into CACHED-HASH-INFO?

In my opinion, we should use it only once we really need it. It is an escape hatch, should the need to store more global data arise, not an extension to be used for no good reason.

Concerning distributed use of rcs, I think we can go without. My vision would be that everyone works locally and, upon receiving an rcs file, just "attaches" the missing commits to the local rcs file. "Attaching" means coming up with a file that looks as if the other party had locally checked out the parent version, modified it, and committed (but with commit time and author set as in the received file). That is, commits extending the head of a branch just continue that branch, and commits extending a version that is not a head start a new branch. Merging in the branches would then be done with rcsmerge, as usual.

So the task is to recognise a version in the local rcs file even though it has a different version number in the received file (and, of course, to recognise two versions as different even though their version numbers coincide). To do so, compute for every revision a unique identifier that captures the "semantics" of a version. My suggestion is to take a hash of (a string from which you can reconstruct)

 - the content
 - the log message
 - the identifier of the parent commit

Then two versions would be considered equal if their semantic identifiers coincide, and all new commits would be attached as new children of the parent commit (i.e., the commit in the old file that has the same identifier as the parent in the received file). Traversing from the initial revision onwards ensures that we always attach the parent first.

To get a feeling for the semantics, you can use my old prototype script[1], which I'm still actively using today (despite its quadratic complexity, which is a bit annoying). However, I would suggest a different implementation based directly on the rcs functions. Essentially, we would have to traverse every version, starting from the initial one, to compute the recursive hashes. Unfortunately, the diffs are oriented the other way on the path HEAD ... 1.1. My suggestion therefore would be to traverse that path as you would do for co -r1.1, but additionally compute the reverse diffs.

Once the reverse diffs along the path HEAD ... 1.1 are available, all versions can be traversed, and the hashes computed, without ever having more than one full version of the file in memory at a time. The effort would be linear in the number of revisions in the file, as is the effort for checking out the initial version in a typical case.

Therefore my suggestion is to implement attaching without using the extension. This keeps us flexible for the future.

Klaus

[1] http://www.linta.de/~aehlig/university/rcsjoin.py
