Hello,

First of all, I know that Fossil was written with the idea of serving
the SQLite project and projects of similar size well, and that it does
a great job at this task.

I'm just curious whether there are people here tinkering with the idea
of making it more scalable, so that it becomes really usable for
bigger projects as well.

Now that the git -> fossil (incremental) mirror functionality seems to
be working, this may be even more interesting or tempting, IMHO.

Let's talk about some real numbers to illustrate the situation. Let's
clone the NetBSD src tree, kindly provided by Jörg Sonnenberger, with
the following command:

$ time /opt/fossil-head/bin/fossil clone http://netbsd.sonnenberger.org/ netbsd-src.fossil

It takes:

real    323m2.323s
user    42m0.262s
sys     13m18.003s

on my E5-2620 Sandy Bridge workstation. Of course, part of this time
is probably spent on not-so-efficient network send/receive, but the
majority of the time, at least as observed from the output of the
command, is spent on the DB rebuild. I know that from the example of
the OpenBSD src tree, which is comparable in size to NetBSD and where
the rebuild alone takes around 250 minutes on the same hardware and
with the same fossil.

So much for the time spent on the rebuild. What may be even more
important is how much data the rebuild writes. Here I do not have
exact numbers, but this is my workstation and I was keeping an eye on
the drive meters, so let's assume I'm not far off in claiming that the
rebuild was writing data at ~40 MB/s for 2 hours or even more. In
total this may be around 300 GB of data written during this rebuild
(rounded up). This is for a repository whose final file size is:

$ ls -lha netbsd-src.fossil
-rw-r--r--   1 karel    karel       2.6G Oct 28 01:11 netbsd-src.fossil

and which results in a source tree of 2.7 GB.
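
The write-volume estimate is simple arithmetic and can be sketched as
follows. This is only a back-of-the-envelope check: the 40 MB/s rate
and the 2 hour duration are the eyeballed figures from above, not
measurements, and the 2.6G reported by ls is treated as decimal GB for
simplicity.

```python
# Eyeballed assumptions from the message: ~40 MB/s sustained for ~2 hours.
WRITE_RATE_MB_S = 40
DURATION_HOURS = 2
# Final repository size from the ls output, treated as decimal GB.
REPO_SIZE_GB = 2.6

# MB/s * seconds = MB, then convert to GB.
written_gb = WRITE_RATE_MB_S * DURATION_HOURS * 3600 / 1000
# How many times the final repository size was written during rebuild.
write_ratio = written_gb / REPO_SIZE_GB

print(f"estimated data written: {written_gb:.0f} GB")
print(f"ratio to final repository size: about {write_ratio:.0f}x")
```

With exactly 2 hours this gives 288 GB, which is where the "around 300
GB (rounded up)" figure comes from, and it is on the order of a
hundred times the size of the final repository file.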

Now, just to show that this rebuild may be the biggest scalability
obstacle, I'd like to compare it with the open/status/diff/commit
operations:

- open: results in 2.7GB of data written to disk in the resulting
NetBSD source tree. It takes:
real    4m38.843s
user    1m44.221s
sys     1m58.553s

IMHO a very nice result for a source tree of this size.

- status/diff -- one random file modified: both run for 4-5 seconds.
Also very nice results for a source tree of this size.

- commit: this is a little bit harder. With one file modified, commit takes:
real    4m0.765s
user    1m55.442s
sys     1m11.892s

IMHO not so nice, but still kind of acceptable even for development on
a source tree of this size. But commit may certainly be another target
for speedup hacking.

So that's it. The Fossil used in these tests is:

This is fossil version 1.37 [0fa60142eb] 2016-10-26 21:45:52 UTC

and the tests were performed on a ZFS mirror of two SSDs (1TB Crucial
MX200 and 1TB Samsung 850 Evo) on Solaris 11.2 running on an E5-2620
with 32GB RAM -- in case anybody wants this info to verify the
numbers.

Cheers,
Karel
