[Trac-dev] Re: Version control refactoring suggestions

Peter Dimov Mon, 29 Jan 2007 18:16:28 -0800

Hi Christian,

Thanks for the detailed reply and interesting comments!

Christian Boos wrote:
> Well, on one end of the spectrum, some dvcs use SHA1 hashes as revision
> numbers, so that they're globally unique for practical purposes. At the
> other end of the spectrum, traditional systems like CVS have a version
> number which is only meaningful at the level of a given file. So bazaar
> can be seen somewhere in between, IIUC.

As I already wrote in my reply to Thomas, bzr does have GUIDs as
revision IDs. The same revision may be found under different revno in
different branches. I find revnos much more user friendly than SHA1
revids and the path gives some additional information too. I find
cahngeset:100/path/to/branch better much much better than
changeset:1ADC452...

If you browse revision 100 of branch, you could see some file.txt having
revision 44 because it was last changed back then. Of course if you
request file.txt from revision 100 you'll get the same as file.txt are
revision 44 ... I guess that's why you say it is somewhere in between
but... isn't that what svn is doing too?

> Another scenario would be for getting some convergence with Trac's own
> versioned resource. Similar to CVS, we also need to combine path
> information (the resource realm + id) and version number in order to
> identify a specific revision of a resource (e.g. version 15 of
> wiki:WikiStart). Getting some convergence between the resource
> manipulation API and the versioncontrol API would be interesting to get
> some algorithms (e.g. annotate) work either on vc nodes or on versioned
> resources.

That's an interesting idea and I see no reason this would not work.

>> Additionally there is some unnecessary duplication of functionality in
>> the current api. For example, there are two ways to create a changeset 
>> either from a Changeset.get_changes() from a Repository.get_changes().
>> In the first case path restriction is implemented by Trac, whereas in
>> the second it is implemented by the version control backend. This could
>> surely be avoided.
>>   
> 
> Well, here despite of the same name and similar output, the actions are
> quite different, so I'm not sure it's a good idea to merge the two - as
> in a merged implementation, we would probably have to differentiate both
> situations and have two distinct code paths anyway. But I'm open, and we
> could try out to see how it would look like.

For example it should be possible to reimplement the function that
restrict the path of a changeset. That could be done much more
efficiently by using the bzr api. I'm not really sure about that, but I
think it would be possible to generate the diff using the bzr api too.

I think I now see what is different between the two functions.
Repository.get_changes() is supposed to work even on totally unrelated
paths and not only on different revisions of the same path. I think that
if the paths are unrelated then trac should take care of creating the
diff. Trac would get the two nodes and run the diff on them. But the
changeset path restriction should be left to the version control
implementation. A default implementation could of course be provided in
the base Changeset.

> See reservation above. Actually, Trac used to implement things that way
> a long time ago, but - for Subversion at least, you loose important
> information if you treat the changeset 100 as being simply "the
> difference between '/' at revno 99 and '/' at revno 100". This is why we
> switched to using a dedicated operation for retrieving the changeset
> changes (svn_repos_replay) and only later we reintroduce the use of the
> general diff operation (svn_repos_dir_delta) for arbitrary diffs. So at
> least for Subversion (and Mercurial as well), we would have to have a
> dedicated implementation for the two different situations.

Maybe the above suggestion is applicable to svn and mercurial (i.e. trac
to do arbitrary diffs and vcs to retrieve changeset changes)?

> Also they offer a fundamental distinction w.r.t. to the cache behavior:
> the changeset corresponding to a revision could be cached (as there are
> a finite number of them), the others would always have to be generated
> dynamically (as there's a huge number of combinations). I'm actually
> quite sure that in the end, the generalized diffs could be generated
> *from the cache*. And by the way, that's also something I'd like that we
> put some thought in, as I see a great potential to do most operations
> through the cache. This would benefit _all_ vc backends, as they would
> mainly have to provide revision changeset information, and little more
> (retrieving the actual content of a node, getting the node properties...
> such things).

Caching provided by trac would be a really great thing! I've looked at
the svn cache repository and was thinking how I could reuse it for
trac+bzr. In the perfect scenario the vcs backend wouldn't care about
caching and have Trac handle all the details.

Restricting what should be cached shouldn't be a big deal. The Changeset
caching decorator would take the decision what is forwarded to the
backend and what is read from the cache. We would only cache changesets
where new is specified and all the others (new_path, old, old_path) are
None.

>> Revision
>>      properties: revno, message, time, author, etc.
>>      get_node (path)
>>   
> 
> Yes, plus there should be a Revision.get_changeset(path) method, in
> order to get the full list of changes if path is empty or the
> "restricted changeset" if not, which corresponds to this revision.

Yes, that's great!

> Then, following-up to what Thomas wrote, there should be some revision
> navigation methods for this class as well: perhaps children() /
> parents() revisions, as a generalization of the linear next_revision /
> previous_revision we have now (see #1492).

Getting the parent trees is done already. I'm not sure it would be
possible to get the children though. A branch wouldn't know about its
children unless some of them are merged back in. I'll have to check that
with Aaron.

> Likewise, the Repository would have to extended in order to give the
> revision "roots" (instead of oldest_revision) and the revision "heads"
> (instead of the youngest_revision).

You are generally right but... I don't see much sense of having this
youngest_rev altogether. I mean, it is used to find out whether the
revision you have is the oldest one

  if rev == youngest_rev:

We'd better use

  if rev.next_rev:

or something like that. I think we should refactor youngest_rev away.
What do others think?

Looking forward to your comments!

Regards,

Peter

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Trac 
Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/trac-dev?hl=en
-~----------~----~----~----~------~----~------~--~---

[Trac-dev] Re: Version control refactoring suggestions

Reply via email to