On Wed, 7 Nov 2007, Martin Hollmichel wrote:
Mathias Bauer wrote:
As it got silent regarding the SCM topic on this list, do you know whether all plans to replace the dreaded CVS by something better have been cancelled?

Nothing has been canceled; comparisons of different candidates are currently being carried out. See

http://wiki.services.openoffice.org/wiki/ESC_minutes#Face2Face_meeting_in_Hamburg_2007-10-29.2F30

There will be a follow-up meeting this Friday at 14.00 UTC on #oooscm on freenode for a more detailed evaluation; please feel free to join.

If you mean me, then thanks for the offer but as an employee I have to work at this time and my employer forbids and technically prevents the use of IRC.

I can, however, share some thoughts here. At work I'm in a large project with several developers in a small number of countries. Our development environment is server-based and we mainly use ClearCase MultiSite as our SCM, but because of its inflexibility, especially regarding quick experiments and tests, a small number of developers, including me, have started to play with DSCMs for prototyping. So I converted just the mainline of our project to both Mercurial and Git using some self-written Perl tools, and I keep both conversions up to date.

From the docs that Mathias Bauer referenced above, I noticed that you are currently looking at SVN, Git and Bazaar. I'm wondering a bit why you are looking at Bazaar at all. For example, according to <http://weblogs.mozillazine.org/preed/2007/04/version_control_system_shootou_1.html>, the Mozilla teams have clearly ruled out Bazaar due to bad performance; the article said it took over a month of constant runtime to complete a trunk-only import, which is totally unacceptable.

I don't know how large an OpenOffice.org repository is, but I guess it's a lot larger than Mozilla's.

I also wonder why you are not looking at Mercurial yet (<http://www.selenic.com/mercurial/wiki>). It has been selected by a number of projects, e.g. OpenSolaris and Mozilla.

In my case I ended up with a Mercurial repository of ~950 MB (including a ~500 MB working copy), which contained ~9200 files in ~1300 directories and had ~38000 changesets. A small number of the files are huge binaries (tarballs and executables), which are handled OK as well. The conversion took about one night. Roughly the same applies to our Git repository.

Regarding features I wish I could have a combination of Mercurial and Git:

IMHO the Mercurial user interface is much cleaner and its commands are less cryptic than those of Git. In addition to the unwieldy long hashes, you also get small integer revision numbers, valid only in the local repository, which make it easier to reference a specific changeset.
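
To illustrate what "valid only in the local repository" means, here is a rough Python sketch. It is a toy model, not Mercurial's real data structures: the `ToyRepo` class is made up, and hashing only the content is a simplification. The point is merely that the hash identifies a changeset everywhere, while the small number is just its position in one particular clone.

```python
import hashlib


class ToyRepo:
    """Toy model of a clone; NOT how Mercurial actually stores history."""

    def __init__(self):
        # Changeset ids in the order in which this clone received them.
        self.order = []

    def add(self, content):
        # The global id depends only on the content, so every clone agrees on it.
        cid = hashlib.sha1(content.encode("utf-8")).hexdigest()
        if cid not in self.order:
            self.order.append(cid)
        return cid

    def local_rev(self, cid):
        # The short number is simply the position in this clone's own list.
        return self.order.index(cid)


alice, bob = ToyRepo(), ToyRepo()
fix = alice.add("fix typo")
feature = alice.add("add feature")

# Bob receives the same two changesets, but in the opposite order.
bob.add("add feature")
bob.add("fix typo")

print(alice.local_rev(fix), bob.local_rev(fix))  # prints: 0 1
print(fix[:12])  # the same hash prefix names the changeset in both clones
```

So a small number is convenient at your own prompt, but when talking to other people you still have to quote the hash.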

Mercurial's efficient storage does not need any kind of packing or garbage collection, as Git's does.

It is also said that Mercurial has good support for Microsoft Windows operating systems, as it's based on Python, while Git is said to work well on Windows only with the help of Cygwin, and then slowly. I thought this would probably be a criterion for OOo development.

Mercurial is also well documented; a PDF book is available (a work in progress, but already in very good shape). Git's docs basically consist of a collection of manpages and some tutorials, which is definitely better than nothing, but not up to Mercurial's book yet.

Mercurial's 'bundle' feature packs a set of changesets into one file for quick offline download/upload. Mercurial also supports the management of patch sets in the form of MQ (Mercurial Queues).

On the downside, Mercurial is sometimes not as space-efficient as it could be. It makes excessive use of hardlinks, which works OK as long as you are cloning on the same filesystem. In this area Git is better, as it allows so-called 'shared' repositories, where just the location of another repository is stored in a single file and used to access any missing info, without the need to duplicate it using either hardlinks or a plain file copy. This issue is, however, only important if you expect many of your people to keep their repositories on the same server.
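
The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directory names, the pointer file name `alternates` and the helper `find_object` are invented for the example and do not reproduce either tool's real on-disk layout. A hardlink is a second directory entry for the same data and only works within one filesystem, whereas the pointer-file style merely records where to look.

```python
import os
import tempfile


def find_object(repo_dir, name):
    """Look for a file locally, then in the repositories listed in a pointer file."""
    local = os.path.join(repo_dir, name)
    if os.path.exists(local):
        return local
    pointer = os.path.join(repo_dir, "alternates")  # hypothetical file name
    if os.path.exists(pointer):
        with open(pointer) as fh:
            for other in fh.read().splitlines():
                candidate = os.path.join(other, name)
                if os.path.exists(candidate):
                    return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    origin = os.path.join(root, "origin")
    linked = os.path.join(root, "linked_clone")
    shared = os.path.join(root, "shared_clone")
    for d in (origin, linked, shared):
        os.mkdir(d)

    with open(os.path.join(origin, "data"), "w") as fh:
        fh.write("history")

    # Hardlink style: both names refer to the same file on disk.
    os.link(os.path.join(origin, "data"), os.path.join(linked, "data"))
    same_file = os.path.samefile(
        os.path.join(origin, "data"), os.path.join(linked, "data")
    )

    # Pointer style: the clone stores only the location of the other repository.
    with open(os.path.join(shared, "alternates"), "w") as fh:
        fh.write(origin + "\n")
    found_in_origin = find_object(shared, "data") == os.path.join(origin, "data")

print(same_file, found_in_origin)
```

The pointer style keeps working across filesystems, but the clone then depends on the other repository staying where it is.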

Regarding branches, Mercurial supports them either as separate clones or as in-repo branches, but the latter cannot be completely removed from the repository later. Again, in this area I've got the impression that Git is currently a bit better, as it also supports cheap, removable in-repo branches.

Git also allows you to cherry-pick single changesets somehow.

In contrast to SVN, neither Git nor Mercurial can handle partial repositories, so you always have to download all of it and cannot restrict the download to a certain subdirectory. Also, if you download just a partial history (I think Git can do this), you will no longer be able to push from or pull into this repository.

The only way to overcome this would be to subdivide the project into independent sub-repositories, which is not handled very well. Mercurial has the forest extension (not part of the core), and I've learned that Git now has a comparable feature with submodules as well. But AFAIK you can't have atomic changesets over all of the sub-repos at once.

So here the coin has two sides. If you don't have a local repository yet and you need to fix just a single file, you are faster with SVN and need less space. However, if your repository is already in place, I think you are better off with Git or Mercurial.

Regarding merging, I think that with Git or Mercurial you can find out the history of merges, while SVN currently does not support that, if I'm informed correctly; this will probably change in a future version.

Neither Git nor Mercurial supports versioning of directories, something that I truly regret, as I see it as a bit of a breach of concept given that changesets cover a whole tree. So when a directory is renamed, all files within it are moved file by file; I'm also not sure what happens with file permissions. In this area SVN is better.
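
What "moved file by file" amounts to can be shown with a small Python sketch. The helper `rename_directory` and the file names are made up for the illustration; it models only the bookkeeping, not what either tool does internally. A tool that tracks files only has to turn one directory rename into one move per tracked file below it.

```python
import posixpath


def rename_directory(tracked_files, old_dir, new_dir):
    """Return the per-file (old, new) moves implied by renaming old_dir to new_dir."""
    prefix = old_dir.rstrip("/") + "/"
    moves = []
    for path in tracked_files:
        if path.startswith(prefix):
            # Keep the part below the directory and attach it to the new name.
            moves.append((path, posixpath.join(new_dir, path[len(prefix):])))
    return moves


tracked = ["src/main.c", "src/util/log.c", "docs/readme.txt"]
moves = rename_directory(tracked, "src", "source")
for old, new in moves:
    print(old, "->", new)
```

Note that the directory itself never shows up in the result, so nothing in this model could record an empty directory or its permissions.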

Overall I think DSCMs in general are great for getting new developers on board. With SVN they would not be able to check in changes locally or create their own branches, as they will not have check-in access rights to the central OOo SVN servers. Developers would also not be able to easily pull from each other. None of this is a problem with DSCMs. But you need to establish a number of people who maintain intermediate 'crew' repository clones, which accept pushes or into which changes are pulled.

I would like to encourage you to choose a DSCM, Mercurial or Git; but in the end it should definitely replace the current CVS and not just work as a second choice to be converted back into CVS.

Regards

Guido
