[Trac-dev] Re: Version control refactoring suggestions

Christian Boos Mon, 29 Jan 2007 14:43:59 -0800

Hello Peter,

Peter Dimov wrote:
> A main assumption in current implementation is that a revision is global
> to the whole repository. This is however not the case with bazaar and
> other distributed version control systems.
>


Well, on one end of the spectrum, some dvcs use SHA1 hashes as revision
numbers, so that they're globally unique for practical purposes. At the
other end of the spectrum, traditional systems like CVS have a version
number which is only meaningful at the level of a given file. So bazaar
can be seen somewhere in between, IIUC.

Even if CVS is of no immediate interest to us (there's a Request-a-Hack
for it on TracHacks, though!), there's the multi-repository support that
would bring similar requirements. A revision 100 in repository A has
nothing in common with the revision 100 in repository B, of course. One
way to implement multi-repository support would be to have a virtual
top-level folder containing a virtual folder for each repository. In
this scenario, it also makes sense to require the path and the revision
number at the same time in order to identify a specific revision.
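
To illustrate the virtual top-level folder idea, here is a minimal
sketch, not existing Trac code. The class name MultiRepository and its
methods resolve() and revision_key() are made up for this example; the
only assumption is that the first path segment names the repository.

```python
class MultiRepository:
    """Hypothetical virtual root with one virtual folder per repository."""

    def __init__(self, repositories):
        # Maps a virtual folder name to any backend repository object.
        self.repositories = dict(repositories)

    def resolve(self, path):
        """Split '/reponame/sub/path' into (reponame, '/sub/path')."""
        parts = path.strip('/').split('/', 1)
        name = parts[0]
        if name not in self.repositories:
            raise KeyError('no repository mounted at /%s' % name)
        subpath = '/' + (parts[1] if len(parts) > 1 else '')
        return name, subpath

    def revision_key(self, path, rev):
        """Identify a revision by the repository owning `path` plus `rev`."""
        name, _subpath = self.resolve(path)
        return (name, str(rev))
```

With such a lookup, revision 100 under /A and revision 100 under /B get
different keys, which is why the path has to accompany the revision
number.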

(I've just seen Thomas' mail pointing out the same similarity)

Another scenario would be getting some convergence with Trac's own
versioned resources. Similar to CVS, we also need to combine path
information (the resource realm + id) and version number in order to
identify a specific revision of a resource (e.g. version 15 of
wiki:WikiStart). Getting some convergence between the resource
manipulation API and the versioncontrol API would be interesting, as it
would let some algorithms (e.g. annotate) work either on vc nodes or on
versioned resources.

> Additionally there is some unnecessary duplication of functionality in
> the current api. For example, there are two ways to create a changeset:
> either from a Changeset.get_changes() or from a Repository.get_changes().
> In the first case path restriction is implemented by Trac, whereas in
> the second it is implemented by the version control backend. This could
> surely be avoided.
>   

Well, here despite the same name and similar output, the actions are
quite different, so I'm not sure it's a good idea to merge the two, as
in a merged implementation we would probably have to differentiate both
situations and have two distinct code paths anyway. But I'm open, and we
could try it out to see what it would look like.

> Let me do a summary of the functionality we have:
>
> changeset  displays all / some changes committed in a single revision
> diff  displays all / some changes committed over multiple revisions,
> possibly between different paths
> log  lists all revisions that modify a given path
> source  lists the inventory at a given revision
>
> Suggestions:
>
> 1. Since a changeset is a special diff, I'd suggest that they share
> implementation in class Changeset.
>   

see reservation above for point 1.

> 2. Also, revision is currently just a string and that is insufficient
> (e.g. to better implement bzr support) and IMHO we'd better make a class
> out of it. A revision would be a snapshot of (part of) the repository at
> the time of a commit. Let that be the class Revision.
>   

ok ...

> 3. A Revision object could be extracted from a Repository from a
> revision number (currently hex allowed) and path (to support bzr like
> systems). The reason is that revno 100 at path1 would be different from
> revno 100 at path2.
>   

Right, I see this as the major improvement proposed here.

> 4. A Changeset would then be the difference between two nodes. A node is
> a file or dir at a specific revision (as it is now). A changeset
> (current implementation) would be the difference between '/' at revno 99
> and '/' at revno 100.
>   

See reservation above. Actually, Trac used to implement things that way
a long time ago, but - for Subversion at least - you lose important
information if you treat the changeset 100 as being simply "the
difference between '/' at revno 99 and '/' at revno 100". This is why we
switched to using a dedicated operation for retrieving the changeset
changes (svn_repos_replay) and only later reintroduced the use of the
general diff operation (svn_repos_dir_delta) for arbitrary diffs. So at
least for Subversion (and Mercurial as well), we would have to have a
dedicated implementation for the two different situations.

Also, the two situations differ fundamentally w.r.t. cache behavior:
the changeset corresponding to a revision could be cached (as there are
a finite number of them), the others would always have to be generated
dynamically (as there's a huge number of combinations). I'm actually
quite sure that in the end, the generalized diffs could be generated
*from the cache*. And by the way, that's also something I'd like us to
put some thought into, as I see great potential for doing most operations
through the cache. This would benefit _all_ vc backends, as they would
mainly have to provide revision changeset information, and little more
(retrieving the actual content of a node, getting the node properties...
such things).
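
A rough sketch of the caching idea, under strong assumptions: revisions
are linear integers, a backend only supplies a list of (path, kind)
pairs per revision, and copies/renames are ignored. ChangesetCache,
fetch_changes and changed_paths are hypothetical names, not part of the
current versioncontrol API.

```python
class ChangesetCache:
    """Hypothetical cache of per-revision changesets."""

    def __init__(self, fetch_changes):
        # fetch_changes(rev) -> iterable of (path, kind); stands in for
        # the backend-specific "changes of one revision" operation.
        self._fetch = fetch_changes
        self._cache = {}
        self.backend_calls = 0

    def get_changes(self, rev):
        """Return the changes of one revision, asking the backend once."""
        if rev not in self._cache:
            self.backend_calls += 1
            self._cache[rev] = list(self._fetch(rev))
        return self._cache[rev]

    def changed_paths(self, old_rev, new_rev, prefix='/'):
        """Paths at or under `prefix` touched in old_rev+1 .. new_rev."""
        base = prefix.rstrip('/')
        touched = {}
        for rev in range(old_rev + 1, new_rev + 1):
            for path, kind in self.get_changes(rev):
                if path == base or path.startswith(base + '/'):
                    # Last change wins; a simplification that ignores
                    # copies, renames and add-then-delete sequences.
                    touched[path] = kind
        return touched
```

The per-revision entries are finite and can be stored once, while the
arbitrary old/new combinations are computed on demand from those
entries rather than by going back to the backend.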

> To summarize, here are the classes with some of their methods:
>
> Repository
>       get_revision (path, rev)
>   

Yes, this seems to be the right thing to do...

>       get_revisions (start, end)
>
> Revision
>       properties: revno, message, time, author, etc.
>       get_node (path)
>   

Yes, plus there should be a Revision.get_changeset(path) method, in
order to get the full list of changes if path is empty or the
"restricted changeset" if not, which corresponds to this revision.

Then, following up on what Thomas wrote, there should be some revision
navigation methods for this class as well: perhaps children() /
parents() revisions, as a generalization of the linear next_revision /
previous_revision we have now (see #1492).
Likewise, the Repository would have to be extended in order to give the
revision "roots" (instead of oldest_revision) and the revision "heads"
(instead of the youngest_revision).
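
As an illustration of the graph-style navigation, a small sketch follows.
The method names parents(), children(), roots() and heads() follow the
suggestion above, but the signatures and the in-memory representation
are assumptions made only for this example.

```python
class Revision:
    """Hypothetical revision that records its parent revision ids."""

    def __init__(self, revno, parent_revs=()):
        self.revno = revno
        self.parent_revs = tuple(parent_revs)


class Repository:
    """Hypothetical repository holding a small revision graph in memory."""

    def __init__(self, revisions):
        revisions = list(revisions)
        self._revs = {r.revno: r for r in revisions}
        # Build the reverse (parent -> children) mapping once.
        self._children = {revno: [] for revno in self._revs}
        for r in revisions:
            for parent in r.parent_revs:
                self._children[parent].append(r.revno)

    def parents(self, revno):
        return list(self._revs[revno].parent_revs)

    def children(self, revno):
        return list(self._children[revno])

    def roots(self):
        # Revisions without parents; generalizes oldest_revision.
        return sorted(n for n, r in self._revs.items() if not r.parent_revs)

    def heads(self):
        # Revisions without children; generalizes youngest_revision.
        return sorted(n for n, c in self._children.items() if not c)
```

In a strictly linear history, roots() and heads() each return a single
revision, so the current oldest/youngest behavior is a special case.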

> Changeset
>       create (node1, node2)
>       get_changes()
>
> I plan to try this out with the bzr backend (Aaron Bentley's trac+bzr)
> and would really appreciate if somebody gives me feedback, or even
> better, works in parallel on the svn backend.
>   

Great! The more Trac hackers, the better Trac in the end ;-)
Thank you very much for your interest in making Trac better!

-- Christian

