On Mon, Nov 20, 2017 at 01:33:11PM -0700, Warren Young wrote:
> I see a new wiki article:
>
> https://www.fossil-scm.org/index.html/wiki?name=Fossil-NG

There are two central design flaws in Fossil that affect larger
repositories, and those are the repos that primarily benefit from
narrow/shallow clones. Properly addressing them is kind of a requirement
for either.

(1) The need to parse all artifacts on clone.

Artifacts should be strongly typed, i.e. the system should at the very
least distinguish fully between "content" blobs and "metadata" blobs.
Only the latter need to be parsed, and only they should be. This has a
number of important implications, but the most immediate one is that the
number of artifacts a rebuild or even just a sync has to look at goes
down by a factor of 2 at the very least. For something like NetBSD src
or pkgsrc, it is more like a factor of 10 (number of blobs in total /
number of commits, i.e. the average commit touches 10 files).

(2) Store true differential manifests.

The current baseline approach is a somewhat crude approximation. It has
the advantage that only two manifests have to be parsed, but it makes
the average manifest size much larger for larger file trees. The same
benefit could be obtained by caching the file list, either every so
often like the current baseline or on demand. The difference is that the
cached manifests are not persistent metadata and don't have to be
transferred.

I would also add a point (3), which is kind of related to (1):

(3) Make cluster manifests non-permanent artifacts.

They can also consume a good amount of space, and their purpose could be
served by a Merkle tree as well. This is even more important when doing
single-branch sync.

Joerg