On Wed, Mar 11, 2009 at 11:01 PM, Daniel Cheng <[email protected]> wrote:
>> I had assumed there would be an SSK-referenced manifest which lists the
>> objects in the repo by hash along with an identifier for the CHK of
>> the pack that contains them. If the clients save this information and
>
> That is, keep all objects loose?
> If you mean an ordinary SSK manifest -- it does not support so many files
> in a single directory.

I would have expected the SSK to point to 1-N CHK 'git' manifest files.
But not loose, just pack definitions, i.e. "Pack CHK1234 contains these
objects."

>> don't purge any of the original objects, any of them should be able to
>> reconstruct the CHK packs and reinsert the complete repo if it happens
>> to fall out of the network.
>
> <side-talk>
> Just to mess up the matter even more:
> All object files are compressed using the deflate algorithm, and
> different zlib versions are known to generate different compressed bytes.
>
> Unless, of course, you want to keep them decompressed....

Well, that is indeed an issue. Though git will preserve packs unaltered
unless you tell it to repack.

>> Also, reinserting just the CHK packs does have a big advantage: When
>> something is actively developed people will often pull the SSK index
>
> <fact-check>
> If the repository was seeded with jSite, the current code always pulls
> the original SSK -- to get the .git/HEAD file.
> </fact-check>

Sure, I'm not talking about the SSK falling out there, I'm talking about
older packs falling out and new users not being able to pull the tree
anymore.

>> and any new CHK packs, but old packs may fall out of the network
>> because no one requests them anymore, yet a new user needs them for
>> their initial pull.
>
> The plan is to have this merged upstream.
> Code that is too inventive is not welcome.
>
> I have investigated the possibility of inserting loose objects, redirects, etc.,
> yet could not find a way to do this without changing (or duplicating) too much
> code in upstream.

[snip]

I certainly understand that, but at the same time, why merge anything
upstream at all if you don't have something that works? I think if old
packs can't be reinserted, active development can easily fall into a
situation where new users can't pull the tree even though it works for
old users. Also, if these issues are not addressed you have no hope of
avoiding multiplying the storage by the number of developers.

With distributed version control each developer will publish their own
repository. If pack handling is done intelligently, many developers can
share the same back-end objects; otherwise you are multiplying the
storage and network activity many-fold.

> However, I have just spent one weekend in the egit/jgit code... I may have
> missed something.
> If you know of any way to do this, patches are always welcome.

I've not touched jgit yet, but I'll give it a look to see if it inspires
me. Please don't put too much significance in my comments. We know talk
is cheap. I only spoke up because it sounded like you were saying that
something wasn't possible which should be, but the counter-argument about
complexity is not without merit.

_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
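
A rough sketch of the pack-manifest idea from this message, in Python rather than jgit. Everything here is an assumption made for illustration: `chk_for` uses a SHA-256 digest as a stand-in for a real Freenet CHK, `ToyStore` stands in for the network, and the manifest is just a dict of "pack key -> object ids". It only tries to show two points: identical packs published by different developers end up stored once, and anyone who still holds a pack locally can reinsert it after it falls out.

```python
import hashlib


def chk_for(data: bytes) -> str:
    # Assumption: a content hash stands in for a real Freenet CHK.
    return "CHK-" + hashlib.sha256(data).hexdigest()[:16]


class ToyStore:
    """Toy content-addressed store; a stand-in for the network."""

    def __init__(self):
        self.blobs = {}

    def insert(self, data: bytes) -> str:
        key = chk_for(data)
        self.blobs[key] = data  # same bytes -> same key -> stored once
        return key


def publish(store, packs):
    """packs is a list of (pack_bytes, [object ids]); returns the manifest."""
    return {store.insert(pack): sorted(ids) for pack, ids in packs}


def missing_packs(store, manifest):
    """Pack keys the manifest names that can no longer be fetched."""
    return sorted(key for key in manifest if key not in store.blobs)


def reinsert(store, local_packs, manifest):
    """Reinsert any locally held pack that the manifest names but the store lost."""
    restored = []
    for pack in local_packs:
        key = chk_for(pack)
        if key in manifest and key not in store.blobs:
            store.insert(pack)
            restored.append(key)
    return restored


if __name__ == "__main__":
    old_pack = b"pack with the project's early history"
    store = ToyStore()
    alice = publish(store, [(old_pack, ["o1", "o2"]), (b"alice's new pack", ["a1"])])
    bob = publish(store, [(old_pack, ["o1", "o2"]), (b"bob's new pack", ["b1"])])
    print(len(store.blobs), "packs stored for two repositories")

    del store.blobs[chk_for(old_pack)]  # the old pack falls out of the network
    print("missing for a new user:", missing_packs(store, bob))
    print("restored from a local copy:", reinsert(store, [old_pack], bob))
```

The reinsert step only works if the locally held pack is byte-identical to the one originally inserted, which is why the point about git leaving packs unaltered unless asked to repack matters.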

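
The zlib concern raised in the thread can be illustrated with a small Python sketch. As an assumption for the demo, two different compression levels stand in for "two different zlib versions", and a SHA-256 digest stands in for a content-derived key; `compressed_key` and `canonical_key` are hypothetical helper names. The same object compresses to different bytes, so a key computed over the compressed form changes, while a key computed over the decompressed content does not.

```python
import hashlib
import zlib


def content_key(data: bytes) -> str:
    # Assumption: a plain SHA-256 digest stands in for a content-derived key.
    return hashlib.sha256(data).hexdigest()


def compressed_key(raw: bytes, level: int) -> str:
    """Key computed over the deflated bytes; depends on how they were compressed."""
    return content_key(zlib.compress(raw, level))


def canonical_key(compressed: bytes) -> str:
    """Key computed over the decompressed content; independent of the compressor."""
    return content_key(zlib.decompress(compressed))


if __name__ == "__main__":
    raw = b"some object content that compresses reasonably well\n" * 50

    fast = zlib.compress(raw, 1)
    best = zlib.compress(raw, 9)

    # Both streams decode to the same object...
    print(zlib.decompress(fast) == zlib.decompress(best))
    # ...but the compressed bytes, and any key derived from them, differ.
    print(fast == best)
    print(compressed_key(raw, 1) == compressed_key(raw, 9))
    # Keys over the decompressed content agree.
    print(canonical_key(fast) == canonical_key(best))
```

This is the trade-off mentioned in the thread: keying on decompressed content is stable across clients, at the cost of inserting uncompressed data or recompressing in some agreed way.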