On Jan 8, 2015, at 11:16 AM, sky5w...@gmail.com wrote:

>    "Fossil does not perform well with very large repo's or histories > 15 
> years."
>    How is the performance hit quantified? 1day or 1hour / 1GB repo / commit?

Fossil does an O(N) scan over the entire DB as an extra integrity check, on the 
assumption that the filesystem may not be reliable.

(It’s a good assumption unless you’ve taken some uncommon steps to ensure that 
it *is* reliable.  See “Disks from the Perspective of a File System,” by 
Marshall Kirk McKusick in ACM Queue: http://goo.gl/hHvdQ8)

>    What is the limiting factor? 

The balance between your patience and your disk’s I/O throughput.

If your disk can feed Fossil’s MD5 scan at a rate of 100 MByte/sec, a 300 
MByte repository file adds at least 3 seconds to many types of operations 
unless you turn the repo-cksum setting off.

This problem has historically received little attention, because SQLite’s own 
repository, long considered “large,” is only ~50 MiB.

>    Is there a path to improve this performance similar to the SQLite speed 
> gains in the last 2 years?

The SQLite improvements speed up Fossil, too, since Fossil uses SQLite as its 
storage layer.

I wouldn’t recommend turning off repo-cksum unless you are storing your fossils 
on uncommonly durable storage:

1. Battery-backed hardware RAID; or

2. A filesystem that does data checksumming itself, like ZFS, so that Fossil’s 
data checksumming is redundant.
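If your storage does fall into one of those categories, the setting can be 
toggled per-repository with Fossil’s settings command — a sketch; check 
“fossil help settings” for the exact behavior in your version:

```shell
# Disable the extra repository checksum verification (trades safety for speed):
fossil settings repo-cksum off

# Re-enable it later:
fossil settings repo-cksum on
```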
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
