On Mon, Sep 1, 2008 at 5:00 PM, Petr Rockai <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Dan Pascu <[EMAIL PROTECTED]> writes:
>> If you want to go this path, why not use a database to store the data?
>> Berkeley DB or SQLite should do just fine and they are most likely much
>> faster in accessing the data than any archive, while giving you the same
>> single file storage advantage (well maybe 3: pristine, patches and
>> inventories). It won't give you the compression advantage (though you
>> could store compressed data in them if that is really desired), but that
>> is less of an issue I believe, as disk space is cheap and some stuff is
>> already compressed (patches). Such a solution will definitely avoid the
>> limited number of files per directory issue and can even offer the
>> benefits of a hashed repository (no direct access to pristine to
>> accidentally modify files), but without the need to hash the files, since
>> they can be stored verbatim in the database. This would also make it
>> slightly faster as the need to hash the files will disappear.
> It should also be noted that it would make http repository access next to
> impossible and other remote access pretty tricky. Also, hashing the files
> gives us much better consistency and robustness guarantees than any of the
> mentioned "database" engines ever pretended.
>
> (As a sidenote, I'd probably say that compressing in a biggish, indexed and
> compressed file is likely to give a radically better compression ratio than
> compressing individual patches. However, it would reduce any
> hardlink-sharing possibilities to virtually zero, so the actual space
> advantage is very dubious. A compromise solution, like the one GIT adopts,
> where "oldish" data is packed into bigger chunks and distributed as "packs"
> (of which there are often several) might actually bring the advantages of
> both. It is however not at all clear to me how to associate hashes with
> pack files -- probably through an external index that could be fetched
> independently by http, and used to determine if the pack is needed or
> useful; it would complicate the cache code a fair bit I guess, and make the
> lazy repositories a little more tricky overall, but could give impressive
> speedups for initial, non-cached "get" over http... Or might not, depending
> on how well pipelining actually works in practice.)
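
For what it's worth, the "external index" idea quoted above could look
roughly like the sketch below. This is only an illustration, not how darcs or
git actually do it: the function names, the in-memory dict standing in for an
index file fetched over http, the use of SHA-256, and the min_useful threshold
are all assumptions made up for the example.

```python
import hashlib


def object_hash(data: bytes) -> str:
    # Hypothetical content hash; SHA-256 is an assumption for this sketch.
    return hashlib.sha256(data).hexdigest()


def build_index(packs):
    """Build the external index: object hash -> name of the pack holding it.

    `packs` maps a pack name to the list of raw objects stored in it.
    In a real setup this index would be a small file fetched on its own.
    """
    index = {}
    for name, objects in packs.items():
        for data in objects:
            index[object_hash(data)] = name
    return index


def packs_to_fetch(index, wanted, cached, min_useful=1):
    """Decide which packs are worth downloading.

    A pack is "useful" if it contains at least `min_useful` of the hashes
    we want and do not already have in the local cache.
    """
    missing = set(wanted) - set(cached)
    useful = {}
    for h in missing:
        pack = index.get(h)
        if pack is not None:
            useful.setdefault(pack, set()).add(h)
    return {p: hs for p, hs in useful.items() if len(hs) >= min_useful}


if __name__ == "__main__":
    packs = {"pack-a": [b"one", b"two"], "pack-b": [b"three"]}
    index = build_index(packs)
    wanted = [object_hash(b) for b in (b"one", b"two", b"three")]
    cached = [object_hash(b"three")]
    # Only pack-a holds anything we still miss.
    print(sorted(packs_to_fetch(index, wanted, cached)))
```

Anything missing from the index would still have to be fetched as an
individual hashed file, which is where the extra complication in the cache
code would come from.
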
Darcs tags might correspond rather well to packs.

Jason
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
