I converted my git archives (kernel and git itself) to do the SHA1 hash _before_ the compression phase.
So I'll just have to publicly admit that everybody who complained about that particular design decision was right. Oh, well.

On Wed, 20 Apr 2005, H. Peter Anvin wrote:
> Linus Torvalds wrote:
> >
> > So I'll see if I can turn the current fsck into a "convert into
> > uncompressed format", and do a nice clean format conversion.
> >
>
> Just let me know what you want to do, and I can trivially change the
> conversion scripts I've already written to do what you want.

I actually wrote a trivial converter myself, and while I have to say that this object database conversion is a bit painful, the nice thing is that I tried very hard to make it so that the "git" programs will work with both a pre-conversion and a post-conversion database.

The only program where that isn't true is "fsck-cache", since fsck-cache for obvious reasons is very very unhappy if the sha1 of a file doesn't match what it should be. But even there, a post-conversion fsck will eat old objects, it will just warn about a sha1 mismatch (and eventually it will refuse to touch them).

Anyway, what this means is that you should actually be able to get my already-converted git database even using an older version of git: fsck will complain mightily, so don't run it.

What I've done is to just switch the SHA1 calculation and the compression around, but I've left all other data structures in their original format, including the low-level object details like the fact that all objects are tagged with their type and length.

As a result, the _only_ thing that breaks is that a new object will not have a SHA1 that matches the expectations of an old git, but since _checking_ the SHA1 is only done by fsck, not normal operations, all normal ops should work fine.

So to convert your old git setup to a new git setup, do the following:

 - save your old setup. Just in case.

   I've converted my whole kernel tree this way, so it's actually tested and I felt comfortable enough with it to blow the old one away, but never take risks.

 - do _not_ update to my new version first. Instead, while you still have an fsck that is happy with your old archive, make sure to fsck everything you have with

	fsck-cache --unreachable $(cat .git/HEAD)

   and it shouldn't complain about anything. Use "git-prune-script" to remove dangling objects if you want.

   (If you read this after you already updated, no worries - everything should still work. It's just a good idea to verify your old repo first)

 - update to my new git tools.

	checkout, build, install

 - convert your git object database with

	convert-cache $(cat .git/HEAD)

   which will give you a new head object. Just for fun, you can double-check that "re-converting" that head object should always result in the same head object. If it doesn't, something is wrong.

 - take the new head object, and make it your new head:

	echo xxxxxx > .git/HEAD

 - run the new "fsck-cache". It should complain about "sha1 mismatch" for all your old objects, and they should all be unreachable (and you should have two root objects: your old root and your new root)

 - run "git-prune-script" to remove all the unreachable objects (which are all old).

 - run "fsck-cache --unreachable $(cat .git/HEAD)" with the new fsck again, just to check that it is now quiet.

 - blow your old index file away by re-reading your HEAD tree:

	cat-file commit $(cat .git/HEAD)
	read-tree .....

 - "update-cache --refresh"

Doing this on the git repository is nearly instantaneous. Doing it on the kernel takes maybe a minute or so, depending on how fast your machine is.

Sorry about this, but it's a hell of a lot simpler to do it now than it will be after we have lots of users, and I've really tried to make the conversion be as simple and painless as possible.

And while it doesn't matter right now (since git still does exactly the same - I did the minimal changes necessary to get the new hashes, and that's it), this _will_ allow us to notice existing objects before we compress them, and we can now play with different compression levels without it being horribly painful.

		Linus
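
For readers who want to see why the order matters, here is a minimal Python sketch of the two schemes. It is not git's actual C code. The function names, the dict standing in for the object database, and the exact header layout ("type", a space, the length, a NUL byte) are assumptions made for illustration; the mail itself only says that objects are tagged with their type and length.

```python
import hashlib
import zlib


def object_bytes(obj_type: str, payload: bytes) -> bytes:
    """Tag the payload with its type and length (assumed header layout)."""
    return f"{obj_type} {len(payload)}".encode() + b"\0" + payload


def old_style_id(obj_type: str, payload: bytes, level: int = 6) -> str:
    """Pre-conversion order: compress first, then hash the compressed bytes."""
    compressed = zlib.compress(object_bytes(obj_type, payload), level)
    return hashlib.sha1(compressed).hexdigest()


def new_style_id(obj_type: str, payload: bytes) -> str:
    """Post-conversion order: hash the uncompressed tagged object."""
    return hashlib.sha1(object_bytes(obj_type, payload)).hexdigest()


def store(db: dict, obj_type: str, payload: bytes, level: int = 6) -> str:
    """Hypothetical store: look the name up before paying for compression."""
    name = new_style_id(obj_type, payload)
    if name not in db:  # existing object noticed before any compression
        db[name] = zlib.compress(object_bytes(obj_type, payload), level)
    return name


if __name__ == "__main__":
    data = b"hello\n" * 100
    # Old scheme: the name depends on the compressor settings.
    print(old_style_id("blob", data, 1) == old_style_id("blob", data, 9))
    # New scheme: the name depends only on type, length and content.
    db = {}
    print(store(db, "blob", data, 1) == store(db, "blob", data, 9), len(db))
```

With the old order, changing the compression level changes the object's name, so every stored object is tied to one compressor setting. With the new order the name is fixed by the content alone, which is what makes both the early existence check and experiments with compression levels cheap.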