On Tue, 1 Sep 2015, Eric S. Raymond wrote:

> Joseph Myers <jos...@codesourcery.com>:
> > Indeed.  Ideally the tree objects in the git conversion should have
> > exactly the same contents as SVN commits, and so be shared with the
> > git-svn history to reduce the eventual repository size (except where there
> > are defects in the git-svn history, or the git conversion fixes up cvs2svn
> > artifacts and so some old revisions end up more accurately reflecting old
> > history than the SVN repository does).
>
> I don't think sharing with the git-svn history will be possible.  git-svn
> is a terrible whole-history converter; the odds of getting the same
> topology out of reposurgeon are basically nil, and the problem of matching
> different topologies is quite hard.

I'm not proposing sharing topology (commit objects), only blob and tree
objects.  If two files have the same hash they will share the same blob
object, and if two trees have files with the same hashes at the same paths
then the tree objects will also have the same hash, and will be shared.

Now, git-svn may well have made mistakes, meaning some trees in the git-svn
repository do not accurately correspond to any SVN revision of any branch
(and so those objects aren't shared), but I'd expect most to be shared.
Even without disabling smart ignore handling, lots of tree objects for
subdirectories would be shared, if those subdirectories don't have any
ignore files or svn:ignore properties.

The point is that since the git-svn repository has been in use for years,
and there are many git-only branches there with lots of development on
them, there are also many git commit references in list archives etc. which
need to remain meaningful.  While it would be possible to move the existing
repository to a different URI (or put the new repository at a less-obvious
URI), it seems simpler to put both sets of objects (with many objects in
common) in the same repository, with appropriately renamed refs from the
git-svn repository so that the objects aren't garbage-collected.

This isn't something for reposurgeon to do.  It's something that should be
easy to do at the pure git level.  At a minimum, I think it might be just
one command to add the git-svn objects to a repository converted with
reposurgeon.  Untested, but it should give an idea of what I'm thinking of:

    git fetch git://gcc.gnu.org/git/gcc.git \
        'refs/heads/*:refs/heads/git-old/*' \
        'refs/remotes/*:refs/heads/git-svn-old/*' \
        'refs/tags/*:refs/tags/git-old/*'

(OK, you want to git gc afterwards to repack the whole repository.)

-- 
Joseph S. Myers
jos...@codesourcery.com
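
To illustrate why blobs and trees can be shared without any agreement on commit topology, here is a minimal Python sketch of how git derives object ids from content alone. It is only a sketch under stated assumptions: SHA-1 object ids, flat trees containing only regular files (mode 100644), and made-up file names and contents. The helper names (git_object_id, blob_id, tree_id) are hypothetical and are not part of git or reposurgeon.

```python
import hashlib


def git_object_id(kind, body):
    """Hash '<kind> <size>\\0' followed by the body, as git does for SHA-1 objects."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content):
    """Object id of a blob; depends only on the file contents."""
    return git_object_id("blob", content)


def tree_id(entries):
    """Object id of a flat tree.

    entries maps file name -> blob id (hex).  Assumption: every entry is a
    regular file (mode 100644); subdirectories are not handled here.
    """
    body = b""
    for name in sorted(entries, key=lambda n: n.encode()):
        body += b"100644 " + name.encode() + b"\0" + bytes.fromhex(entries[name])
    return git_object_id("tree", body)


# Two independent conversions of the same (made-up) directory contents.
files = {"ChangeLog": b"initial import\n", "Makefile.in": b"all:\n"}
git_svn_tree = tree_id({n: blob_id(c) for n, c in files.items()})
reposurgeon_tree = tree_id({n: blob_id(c) for n, c in files.items()})

# A conversion where one file differs (for example, a fixed-up cvs2svn artifact).
fixed = dict(files, ChangeLog=b"initial import (fixed)\n")
fixed_tree = tree_id({n: blob_id(c) for n, c in fixed.items()})

print("same tree id:", git_svn_tree == reposurgeon_tree)
print("tree differs after fix-up:", fixed_tree != git_svn_tree)
print("unchanged blob still shared:",
      blob_id(files["Makefile.in"]) == blob_id(fixed["Makefile.in"]))
```

Nothing about authors, dates, or parent commits enters these hashes, which is why the two histories can have entirely different commit objects and still store each identical file or directory only once.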