Re: [fossil-users] -db-check compression ratio

2017-12-13 Thread Tony Papadimitriou
-----Original Message-----
From: Warren Young


>> * The second is the presence of free pages not yet vacuumed.  This is
>> unused space that IMO ‘unfairly’ lowers the ratio.


> I disagree.  The unused free pages *should* be charged against you, because
> that is space Fossil is taking on your disk, and thus should be compared to
> the size of all the versions checked out.


Well, differences of opinion!
Free pages are generated 'behind the scenes', usually without the user's 
direct control or consent.
Hypothetically, Fossil could decide to unnecessarily use half my disk for 
its own convenience.  I *should not* be charged for it because I did not 
choose this behavior.

What I should be charged for is my own content in the repo.
Besides, one has to keep vacuuming on a very regular basis just to get
accurate reporting.  A bit inconvenient, no?


>> the compression ratio is not meaningful in a useful way when the repo
>> includes either big un-versioned files


> If you have a .zip file at, let us say, 2.1:1 compression ratio because it
> mostly contains text files, then you add an MP3 to it, the compression
> ratio will drop.  Is that also incorrect?


Ah, but a repo is not a zip file, semantically.  A zip file keeps files on 
'equal' terms, so to speak, and therefore they should all be accounted for 
just the same.

It's not a question of file type, file extension, or file size.

A repo, on the other hand, has the (primary) job of keeping versioned history.
Un-versioned files are there for convenience and they normally do not affect 
the progress of the versioned project.
A project should be functional even if we remove the un-versioned files. 
So, not same-class citizens, in my book.


Obviously, in the end, it's all a matter of perspective.  To me, at least, 
compression ratio should relate only to versioned history to be a useful 
metric.


Thank you. 


___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] -db-check compression ratio

2017-12-13 Thread Warren Young
On Dec 13, 2017, at 8:31 AM, Tony Papadimitriou wrote:
> 
> * The first is the inclusion of un-versioned files which, although they inflate
> the total file size, play no part in the versioning, which is what I believe
> the compression ratio was meant to highlight.

If unversioned file data isn’t on both sides of the division sign, then yeah, 
that’s a bug.

If you’re just saying that unversioned files cannot have delta versions by 
their very nature and thus always count 1:1, then the only way I can see to 
satisfy you is to have Fossil report the ratios for all artifacts as it does 
now plus a separate line for versioned artifacts only.  That’s not a matter of 
correctness, but instead just a matter of more detailed reporting.

Any other option feels like cooking the books to me.
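
To make the two-line report idea concrete, here is a rough sketch of the arithmetic.  The byte counts are invented for illustration, the variable names are hypothetical, and this is not how Fossil computes or prints its ratio; it only shows how a block of 1:1 unversioned data pulls the overall number down while the versioned-only number stays put.

```shell
# Hypothetical byte counts, for illustration only (not from any real repo).
versioned_full=430000000     # all versioned artifacts, fully expanded
versioned_stored=10000000    # the same artifacts as stored in the repo
unversioned_full=50000000    # unversioned files, expanded
unversioned_stored=50000000  # unversioned files as stored (assumed 1:1)

# Print "N:1", rounded to the nearest whole number, using integer math.
ratio() {
    echo "$(( ($1 + $2 / 2) / $2 )):1"
}

all=$(ratio $((versioned_full + unversioned_full)) \
            $((versioned_stored + unversioned_stored)))
versioned=$(ratio "$versioned_full" "$versioned_stored")

echo "all artifacts:       $all"
echo "versioned artifacts: $versioned"
```

With these made-up numbers the first line reports 8:1 and the second 43:1.  Both are honest; they just answer different questions.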

> * The second is the presence of free pages not yet vacuumed.  This is unused 
> space that IMO ‘unfairly’ lowers the ratio.

I disagree.  The unused free pages *should* be charged against you, because 
that is space Fossil is taking on your disk, and thus should be compared to the 
size of all the versions checked out.

If you want to restore balance to the Force, run this occasionally:

# Rebuild, recompress and vacuum every repository under /museum
for f in /museum/*.fossil
do
    fossil rebuild -R "$f" --compress --vacuum --cluster
done

A few months ago, I had a repo go from 39:1 to 43:1 as a result of running 
that.  That same repo was back up to 42:1 when I started composing this email, 
and is now back to 43:1 after running the above again.  All four numbers were 
correct at the time Fossil reported them.
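
For anyone curious how much free pages alone can move the number, here is a minimal sketch.  The figures are hypothetical, and it assumes that vacuuming does nothing except drop the free pages, which understates what a real rebuild with --compress does.  On an actual repository the three page values could be read with the sqlite3 shell via PRAGMA page_size, PRAGMA page_count and PRAGMA freelist_count.

```shell
# Share of a SQLite file taken up by free (unvacuumed) pages, in percent.
# Arguments: total page count, free-list page count.
free_percent() {
    awk -v total="$1" -v free="$2" \
        'BEGIN { printf "%.1f\n", 100 * free / total }'
}

# Ratio of expanded content size to on-disk size, printed as "N.N:1".
# Arguments: expanded bytes, page size in bytes, number of pages counted.
ratio() {
    awk -v full="$1" -v psize="$2" -v pages="$3" \
        'BEGIN { printf "%.1f:1\n", full / (psize * pages) }'
}

# Hypothetical figures, chosen only to keep the arithmetic easy to follow.
full=81920000; psize=4096; pages=1250; free=250

echo "free pages:          $(free_percent "$pages" "$free")%"
echo "ratio as on disk:    $(ratio "$full" "$psize" "$pages")"
echo "ratio without free:  $(ratio "$full" "$psize" $((pages - free)))"
```

Here a file that is 20% free pages reports 16.0:1 on disk and 20.0:1 once the free pages are excluded.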

> the compression ratio is not meaningful in a useful way when the repo 
> includes either big un-versioned files

If you have a .zip file at, let us say, 2.1:1 compression ratio because it 
mostly contains text files, then you add an MP3 to it, the compression ratio 
will drop.  Is that also incorrect?