It appears that the compression ratio shown by the 'fossil dbstat --db-check' command is computed from the repository's actual total file size versus the would-be total size of all stored versions fully expanded (based on the description here: https://www.fossil-scm.org/xfer/doc/trunk/www/stats.wiki).
There are two cases, however, where IMO the command gives a false impression of the compression actually achieved in the repo:

* The first is the inclusion of unversioned files, which inflate the total file size even though they play no part in the versioning, which is what I believe the compression ratio was meant to highlight.

* The second is the presence of free pages not yet vacuumed. This is unused space that IMO 'unfairly' lowers the ratio.

Quoting the wiki page above: "... hence the SQLite project gets excellent 73:1 compression". If we were to add several big unversioned files (say, an assortment of pre-built binaries for various configurations and platforms), the repo size would obviously increase, 'unfairly' (IMO) dropping that 'excellent' ratio, when in fact nothing has changed with respect to the versioned history.

So, in practice, the compression ratio is not meaningful in a useful way when the repo contains either big unversioned files or too many free pages, and the computation could be improved by ignoring those two contributions. Your thoughts?
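To illustrate the point with a minimal sketch (Python, made-up numbers; the function name and all inputs are hypothetical and not part of Fossil), the ratio could be adjusted by subtracting the unversioned-content bytes and the free-page bytes from the repository size before dividing:

```python
def compression_ratio(expanded_bytes, repo_bytes,
                      unversioned_bytes=0, free_pages=0, page_size=4096):
    """Hypothetical adjusted ratio: (total size of all versions, expanded)
    divided by only the repository bytes attributable to versioned content,
    i.e. excluding unversioned files and not-yet-vacuumed free pages."""
    versioned_repo_bytes = repo_bytes - unversioned_bytes - free_pages * page_size
    return expanded_bytes / versioned_repo_bytes

# Made-up numbers: 7.3 GB of expanded versions in a 100 MB repo -> 73:1.
naive = compression_ratio(7_300_000_000, 100_000_000)

# Add 50 MB of unversioned binaries: the naive ratio drops to about 49:1 ...
diluted = 7_300_000_000 / (100_000_000 + 50_000_000)

# ... but the adjusted ratio is unchanged, since the history didn't change.
adjusted = compression_ratio(7_300_000_000, 150_000_000,
                             unversioned_bytes=50_000_000)
```

The same adjustment handles free pages: a vacuumed and an unvacuumed copy of the same repository would then report the same ratio.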
_______________________________________________ fossil-users mailing list fossil-users@lists.fossil-scm.org http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users