On Sat, Jun 26, 2010 at 12:05 PM, Owen Shepherd <owen.sheph...@e43.eu> wrote:

>
> > SCSU is not that useful for storage compression since fossil already
> > uses zlib and it has no other advantages I am aware of.
>
> Deflate compression is only applied to commits. Deflate has
> significant overhead, and is inapplicable to smaller pieces of text
> (such as commit strings) which can nonetheless contribute
> significantly to size. On the other hand, SCSU performs better than
> UTF-8 for the vast majority of real world texts, as has already been
> enumerated.
>

The checkin-comments in Fossil are contained in the manifest artifacts,
which are both delta-compressed and deflated prior to storage in the current
implementation.

Copies of checkin-comments are stored uncompressed in a separate table (the
EVENT table) for ease of access during queries such as "timeline".  But the
amount of text stored there is small.  In Fossil's self-hosting repository
(with 3409 events) there is 337KB of comment text, or about 2.3% of the
total repository space.  In the 10-year history of SQLite there are 8664
events with 869KB of text, or 2.2% of the total repository space.  In both
those examples, the comments are pure ASCII, so SCSU compression would make
no difference.  But notice that we could store the text as UTF-32 and it
would still be less than 10% of the total repository.
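
The UTF-32 estimate can be checked with a few lines of Python. This is only a
sketch of the arithmetic: it assumes the comments are pure ASCII (one byte per
character, so four bytes each in UTF-32) and that nothing else in the
repository changes size. The function name is made up for illustration.

```python
def utf32_share(ascii_share: float) -> float:
    """Fraction of the repository the comment text would occupy if
    every one-byte ASCII character were stored as four bytes (UTF-32)
    and the rest of the repository stayed the same size."""
    grown = 4 * ascii_share            # comment text after 4x expansion
    rest = 1 - ascii_share             # everything that is not comment text
    return grown / (rest + grown)      # share of the (now larger) repository

# Shares of comment text quoted above: 2.3% (Fossil), 2.2% (SQLite).
for name, share in [("Fossil", 0.023), ("SQLite", 0.022)]:
    naive = 4 * share                  # upper bound: ignores repository growth
    print(f"{name}: naive {naive:.1%}, adjusted {utf32_share(share):.1%}")
```

Even the naive figure, which ignores the fact that the repository itself
grows, stays below 10% for both repositories.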

In contrast, the delta- and deflate-compressed artifacts comprise about 70%
and 80% of the repository space for Fossil and SQLite, respectively.  The
artifact compression is very effective, achieving compression ratios of
19:1 for Fossil and 39:1 for SQLite.
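
To get a feel for what those ratios imply, the figures in this message can be
combined into rough absolute sizes. This is a back-of-the-envelope sketch, not
a measurement: the total repository size is inferred from the comment-text
percentages, every input is rounded, and the function name is hypothetical.

```python
def implied_sizes(comment_kb, comment_share, artifact_share, ratio):
    """Return (total_kb, stored_kb, raw_kb) inferred from rounded figures.

    total_kb  - repository size implied by the comment text and its share
    stored_kb - space taken by the compressed artifacts
    raw_kb    - size of those artifacts before delta + deflate compression
    """
    total_kb = comment_kb / comment_share
    stored_kb = total_kb * artifact_share
    raw_kb = stored_kb * ratio
    return total_kb, stored_kb, raw_kb

# Inputs are the approximate numbers quoted in this message.
repos = {
    "Fossil": implied_sizes(337, 0.023, 0.70, 19),
    "SQLite": implied_sizes(869, 0.022, 0.80, 39),
}
for name, (total, stored, raw) in repos.items():
    print(f"{name}: repo ~{total / 1024:.0f} MB, "
          f"artifacts ~{stored / 1024:.0f} MB stored, "
          f"~{raw / 1024:.0f} MB uncompressed")
```

Because the inputs are rounded, the results are order-of-magnitude estimates
only. They still show that the uncompressed artifacts would be many times
larger than the whole repository is today.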

-- 
---------------------
D. Richard Hipp
d...@sqlite.org
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
