On Wed, 7 Nov 2007, Martin Hollmichel wrote:
Mathias Bauer wrote:
As it got silent regarding the SCM topic on this list, do you know
whether all plans to replace the dreaded CVS by something better have
been cancelled?
Nothing has been canceled, currently comparisons of different
candidates are carried out. See
http://wiki.services.openoffice.org/wiki/ESC_minutes#Face2Face_meeting_in_Hamburg_2007-10-29.2F30
There will be a follow-up meeting this Friday at 14.00 UTC on #oooscm on
freenode for a more detailed evaluation; please feel free to join.
If you mean me, then thanks for the offer but as an employee I have to
work at this time and my employer forbids and technically prevents the use
of IRC.
I can however share some thoughts here. At work I'm in a large project
with several developers in a small number of countries. Our development
environment is server-based and we mainly use ClearCase Multisite as
SCM, but because of its inflexibility, especially regarding quick
experiments and tests, a small number of developers including me have
started to play with DSCMs for prototyping. So I converted just the
mainline of our project to both Mercurial and Git using some
self-written Perl tools, and I keep both up to date.
From the docs which Mathias Bauer referenced above, I noticed that you are
currently looking at SVN, Git and Bazaar. I'm wondering a bit why you are
looking at Bazaar at all, because for example in
<http://weblogs.mozillazine.org/preed/2007/04/version_control_system_shootou_1.html>
Mozilla teams have clearly ruled out Bazaar due to bad performance; the
article said it took over a month of constant runtime to complete a
trunk-only import, which is totally unacceptable.
I don't know how large an OpenOffice.org repository is, but I guess it's a
lot larger than Mozilla's.
I also wonder why you are not looking at Mercurial yet
(<http://www.selenic.com/mercurial/wiki>). It has been selected by a
number of projects, e.g. OpenSolaris and Mozilla.
In my case I ended up with a Mercurial repository of ~950 MB
(including a ~500 MB working copy), which contained ~9200 files in ~1300
directories and had ~38000 changesets. A small number of files are
huge binaries (tarballs and executables), which are handled OK as
well. The conversion took about one night. Roughly the same applies to
our Git repository.
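
As a rough back-of-the-envelope check on the figures above, here is the
arithmetic as a small Python sketch. All inputs are the approximate values
I quoted, and the variable names are just for illustration, so the result
is only an order of magnitude.

```python
# Approximate figures from the conversion described above.
repo_total_mb = 950      # repository including working copy
working_copy_mb = 500    # checked-out files only
changesets = 38_000

# What remains is the stored history.
history_mb = repo_total_mb - working_copy_mb

# Average history cost per changeset, in KB (1 MB taken as 1024 KB).
kb_per_changeset = history_mb * 1024 / changesets

print(f"history: ~{history_mb} MB, ~{kb_per_changeset:.0f} KB per changeset")
```

So the complete history costs less than the working copy itself, at
roughly a dozen KB per changeset on average.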
Regarding features I wish I could have a combination of Mercurial and Git:
IMHO the Mercurial user interface is much cleaner and its commands are less
cryptic than those of Git. In addition to the unwieldy long hash IDs, you
also get small integer revision numbers, valid only in the local
repository, which make it easier to reference a specific changeset.
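
To illustrate why those small numbers are valid only locally, here is a
toy model in Python. It is not how Mercurial is implemented: the ToyRepo
class and its methods are made up for this mail, and the ID is simply a
hash of the content. The only assumption is that each clone numbers
changesets in the order it received them.

```python
import hashlib


class ToyRepo:
    """Toy model: a changeset has a global hash ID and a local integer number."""

    def __init__(self):
        # Changeset IDs in the order this clone received them.
        self.order = []

    def add(self, content: str) -> str:
        """Record a changeset and return its global ID (hash of the content)."""
        cid = hashlib.sha1(content.encode("utf-8")).hexdigest()
        if cid not in self.order:
            self.order.append(cid)
        return cid

    def local_rev(self, cid: str) -> int:
        """Short integer that is only meaningful inside this clone."""
        return self.order.index(cid)


alice, bob = ToyRepo(), ToyRepo()
fix = alice.add("fix typo")          # created in alice's clone first
feature = bob.add("add feature")     # created in bob's clone first

# Each clone then receives the other's changeset, so arrival order differs.
alice.add("add feature")
bob.add("fix typo")

print(alice.local_rev(fix), bob.local_rev(fix))
```

The hash identifies the same changeset everywhere, while the integer
depends on the clone you ask. That is why the numbers are handy at your
own prompt but should not be used when talking to other people.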
Mercurial's efficient storage does not need any kind of packing or garbage
collection as Git does.
It is also said that Mercurial has good support for Microsoft Windows
operating systems as it's based on Python, while Git is said to work well
on Windows only with the help of Cygwin, and then slowly. I thought
this would probably be a criterion for OOo development.
Mercurial is also well documented; a PDF book is available (a work in
progress, but already in very good shape). Git's docs basically consist of
a collection of manpages and some tutorials: definitely better than
nothing, but not up to Mercurial's book yet.
Mercurial's 'bundle' feature lets you pack a set of changesets into one
file for quick offline download or upload. Mercurial also supports
management of patch sets in the form of MQ (Mercurial Queues).
On the downside, Mercurial is sometimes not as space efficient as it could
be. It relies heavily on hardlinks, which works OK as long as you are
cloning on the same filesystem. In this area Git is better as it allows
so-called 'shared' repositories, where just the location of another
repository is stored in a single file and used to access any missing info
without the need to duplicate it using either hardlinks or a plain file
copy. This issue is however only important if you expect many of your
people to keep their repositories on the same server.
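
The difference between the two approaches can be sketched in Python at
the filesystem level. This is only an illustration under assumptions: the
file names store.dat and pointer.txt are invented, and neither tool lays
out its repository like this.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    origin = os.path.join(root, "origin")
    clone = os.path.join(root, "clone")
    os.mkdir(origin)
    os.mkdir(clone)

    # Some history data in the original repository.
    src = os.path.join(origin, "store.dat")
    with open(src, "wb") as f:
        f.write(b"history data" * 1000)

    # Hardlink-style clone: a second name for the same data on disk.
    dst = os.path.join(clone, "store.dat")
    os.link(src, dst)
    same_inode = os.stat(src).st_ino == os.stat(dst).st_ino
    link_count = os.stat(src).st_nlink

    # Pointer-style clone: only record where the other store lives.
    pointer = os.path.join(clone, "pointer.txt")
    with open(pointer, "w") as f:
        f.write(origin + "\n")
    pointer_size = os.path.getsize(pointer)

print(same_inode, link_count, pointer_size)
```

A hardlink costs no extra data space, but os.link() raises OSError when
source and destination are on different filesystems, so a clone onto
another disk has to fall back to a full copy. The pointer file has no such
restriction, but the clone is unusable if the original location goes away.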
Regarding branches, Mercurial supports them either as separate clones or
as in-repo branches, but the latter cannot be completely removed from the
repository later. Again in this area I've got the impression that Git is
currently a bit better as it also supports cheap, removable in-repo
branches.
Git also lets you cherry-pick single changesets.
In contrast to SVN, neither Git nor Mercurial can handle partial
repositories, so you always have to download all of it and cannot restrict
the download to a certain subdirectory. Also, if you download just partial
history (I think Git can do this) you will no longer be able to push from
or pull into this repository.
The only way to overcome this would be to subdivide the project into
independent sub-repositories, which is not handled very well. Mercurial
has the forest extension (not part of the core) and I've learned that Git
now has a comparable feature with submodules as well. But AFAIK you can't
have atomic changesets over all of the sub-repos at once.
So here the coin has two sides. If you don't have a local repository yet
and you need to fix just a single file, you are faster with SVN and need
less space. However, if your repository is already in place, I think you
are better off with Git or Mercurial.
Regarding merging, I think with Git or Mercurial you can trace the
history of merges, while SVN currently does not support that if I'm
informed correctly; this will probably change in a future version.
Neither Git nor Mercurial supports versioning of directories, something
that I truly regret, as I see it as a bit of a breach of concept given
that changesets cover a whole tree. So when a directory is renamed, all
files within it are moved file by file, and I'm also not sure what happens
to file permissions. In this area SVN is better.
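
What "moved file by file" means can be shown with a toy model in Python,
assuming the tool tracks nothing but a set of file paths. The rename_dir
function is hypothetical and not taken from either tool.

```python
def rename_dir(tracked, old, new):
    """Rename directory `old` to `new` in a set of tracked file paths.

    Returns the new set of paths and the list of per-file moves.
    """
    old_prefix = old.rstrip("/") + "/"
    new_prefix = new.rstrip("/") + "/"
    moves = []
    result = set()
    for path in sorted(tracked):
        if path.startswith(old_prefix):
            # The directory itself has no entry; each file is moved on its own.
            target = new_prefix + path[len(old_prefix):]
            moves.append((path, target))
            result.add(target)
        else:
            result.add(path)
    return result, moves


tracked = {"src/a.c", "src/util/b.c", "README"}
after, moves = rename_dir(tracked, "src", "source")
for old_path, new_path in moves:
    print(old_path, "->", new_path)
```

In such a model the directory is only implied by the paths of its files,
so there is no place to record anything about the directory itself, and an
empty directory cannot be represented at all.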
Overall I think DSCMs in general are great to get new developers on board.
With SVN they would not be able to check in changes locally or create
their own branches, as they will not have check-in access rights to OOo's
central SVN servers. Developers would also not be able to easily pull from
each other. All of this is no problem with DSCMs. But you need to
establish a number of people who maintain intermediate 'crew' repository
clones that accept pushes or into which pulls are done.
I would like to encourage you to choose a DSCM, Mercurial or Git; but in
the end it should definitely replace the current CVS and not just serve as
a secondary system whose changes are converted back into CVS.
Regards
Guido