[fossil-users] Scalability limits

2014-02-07 Thread Rich Neswold
Hello,

I first want to say what a terrific version control manager Fossil is!
I took my first serious look at Fossil last week and have already
converted a few of my personal projects away from 'git'. The built-in
bug tracker and wiki are genius touches! Thank you, Fossil community,
for your efforts.

I would like to mention, however, that Fossil hits a scalability wall
at some point, making it unsuitable for large projects.

I have been trying to pull the NetBSD source repository for a week and
have had nothing but problems. As of this moment, I haven't succeeded.
I first tried cloning the repository[1], but it would exit with an error
after ~2GB of data was transferred. I then downloaded the
repository[2] from the NetBSD FTP site (10GB!). Doing a 'rebuild'
starts out fine but, after 24 hours, I get to 60% complete and then it
takes hours to advance another 0.1%. I tried to rebuild using various
options (--wal and setting the pagesize), but it all ends up slowing
down at the same place. The last time I tried it, the .fossil file was
10GB and the journal file reached 11GB!
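
For readers unfamiliar with the two options mentioned above, here is a minimal Python sketch of the SQLite settings they presumably correspond to: the page size and the write-ahead-log journal mode. That mapping is an assumption, though the dbstat output quoted later in the thread does report "65536 bytes/pg" and "wal mode". The table name blob_demo and the helper create_db are hypothetical and are not part of Fossil.

```python
import os
import sqlite3
import tempfile


def create_db(path, page_size=65536):
    """Create a SQLite file with the given page size, then switch it to WAL."""
    conn = sqlite3.connect(path)
    # The page size only takes effect if it is set before the file is first written.
    conn.execute(f"PRAGMA page_size = {int(page_size)}")
    # Hypothetical table, only here so that the database file gets created.
    conn.execute("CREATE TABLE blob_demo(rid INTEGER PRIMARY KEY, content BLOB)")
    conn.commit()
    # Switch to write-ahead logging; the pragma returns the resulting mode.
    mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    size = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()
    return mode, size


with tempfile.TemporaryDirectory() as tmp:
    mode, size = create_db(os.path.join(tmp, "demo.db"))
    print(mode, size)
```

In WAL mode, changes accumulate in a separate -wal file until they are checkpointed into the main database, which is one possible reason a side file can grow very large during a long-running rebuild.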

I was able to download and rebuild the pkgsrc repository[3] in a
reasonable time -- it's only 2.7GB. So there's some point between the
two projects at which fossil's rebuild algorithm becomes so expensive
that the repository can't be cloned.

I don't have any question; I just thought I'd document my experiences.

-- 
Rich

[1] http://netbsd.sonnenberger.org/
[2] http://ftp.netbsd.org/pub/NetBSD/misc/repositories/fossil/src.fossil
[3] http://ftp.netbsd.org/pub/NetBSD/misc/repositories/fossil/pkgsrc.fossil
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] Scalability limits

2014-02-07 Thread Stephan Beal
On Fri, Feb 7, 2014 at 4:33 PM, Rich Neswold rich.nesw...@gmail.com wrote:

> I don't have any question; I just thought I'd document my experiences.


Thanks for your feedback! IMO (possibly a minority opinion), Fossil has
never aspired to host repos quite as large as those. i remember the pkgsrc
repo being mentioned before (but thought it was bigger than 2.7GB), and
IIRC the delta manifest format was introduced to help support huge repos
like that one and the core TCL repo. Fossil's original purpose was to host
sqlite, and it works wonders for projects at that scale.

i'd be interested in seeing the output of 'dbstat' on your repo, except
that it could take some time for it to finish generating its output (so
don't feel obligated to try it). Here's the info for the current fossil
core repo:

[stephan@host:~/cvs/fossil/fossil]$ f dbstat
repository-size:   53739520 bytes (53.7MB)
artifact-count:    24813 (stored as 5784 full text and 19029 delta blobs)
artifact-sizes:    67440 average, 5153124 max, 1673191067 bytes (1.7GB) total
compression-ratio: 31:1
checkins:          6615
files:             821 across all branches
wikipages:         26 (294 changes)
tickets:           1056 (3355 changes)
events:            5
tagchanges:        737
project-age:       2394 days or approximately 6.55 years.
project-id:        CE59BB9F186226D80E49D1FA2DB29F935CCA0333
fossil-version:    2014-02-07 08:58:55 [90bd20308b] [1.28] (gcc-4.8.1)
sqlite-version:    2014-01-27 15:02:07 [be1acb610f] (3.8.3)
database-stats:    52480 pages, 1024 bytes/pg, 109 free pages, UTF-8, delete mode



-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do. -- Bigby Wolf


Re: [fossil-users] Scalability limits

2014-02-07 Thread Ron Wilson
On Fri, Feb 7, 2014 at 11:17 AM, Stephan Beal sgb...@googlemail.com wrote:


> On Fri, Feb 7, 2014 at 4:33 PM, Rich Neswold rich.nesw...@gmail.com wrote:
>
> > I don't have any question; I just thought I'd document my experiences.
>
> Thanks for your feedback! IMO (possibly a minority opinion), Fossil has
> never aspired to host repos quite as large as those. i remember the pkgsrc
> repo being mentioned before (but thought it was bigger than 2.7GB), and
> IIRC the delta manifest format was introduced to help support huge repos
> like that one and the core TCL repo. Fossil's original purpose was to host
> sqlite, and it works wonders for projects at that scale.


I am guessing this is a limitation of SQLite, which is designed to be
light. It would be interesting to see how Fossil would perform when
plugged into, for example, PostgreSQL, MariaDB or another heavy-duty
SQL server. Of course, that could require rewriting a lot of SQL queries.


Re: [fossil-users] Scalability limits

2014-02-07 Thread Gour
On Fri, 7 Feb 2014 18:40:32 +0100
Stephan Beal sgb...@googlemail.com wrote:

> It would be really cool to see someone implement their own SCM based
> on fossil's core artifact model and their own db back-end, though.

What about Monotone? Linus was looking at it, but it was too slow at
that time.


Sincerely,
Gour

-- 
Everyone is forced to act helplessly according to the qualities 
he has acquired from the modes of material nature; therefore no 
one can refrain from doing something, not even for a moment.

http://www.atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810




Re: [fossil-users] Scalability limits

2014-02-07 Thread Lluís Batlle i Rossell
On Fri, Feb 07, 2014 at 07:39:37PM +0100, Gour wrote:
> On Fri, 7 Feb 2014 18:40:32 +0100
> Stephan Beal sgb...@googlemail.com wrote:
>
> > It would be really cool to see someone implement their own SCM based
> > on fossil's core artifact model and their own db back-end, though.
>
> What about Monotone? Linus was looking at it, but it was too slow at
> that time.

It was a bug of monotone, that slowness. Fixed, from what I remember.

But monotone works on sqlite, if the deal is sqlite.


Re: [fossil-users] Scalability limits

2014-02-07 Thread Gour
On Fri, 7 Feb 2014 20:32:56 +0100
Lluís Batlle i Rossell vi...@viric.name wrote:

> It was a bug of monotone, that slowness. Fixed, from what I remember.

Yeah, too bad. Otherwise we wouldn't see git. :-)

> But monotone works on sqlite, if the deal is sqlite.

Right, but I see Monotone's influence in Fossil.


Sincerely,
Gour

-- 
He who is satisfied with gain which comes of its own accord, who 
is free from duality and does not envy, who is steady in both 
success and failure, is never entangled, although performing actions.

http://www.atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810




Re: [fossil-users] Scalability limits

2014-02-07 Thread Joerg Sonnenberger
On Fri, Feb 07, 2014 at 05:17:23PM +0100, Stephan Beal wrote:
> i'd be interested in seeing the output of 'dbstat' on your repo, except
> that it could take some time for it to finish generating its output (so
> don't feel obligated to try it). Here's the info for the current fossil
> core repo:

Attached for pkgsrc and src.

Joerg

pkgsrc:
repository-size:   2852068352 bytes (2.9GB)
artifact-count:    1096185 (stored as 190518 full text and 905667 delta blobs)
artifact-sizes:    23053 average, 5763035 max, 25270457712 bytes (25.3GB) total
compression-ratio: 8:1
checkins:          384960
files:             129343 across all branches
wikipages:         0 (0 changes)
tickets:           0 (0 changes)
events:            0
tagchanges:        83
project-age:       6016 days or approximately 16.47 years.
project-id:        a93518a42fa8e06695943fd79049ad4fcf8b9d00
fossil-version:    2013-02-16 00:04:35 [d2e07756d9] [1.25] (gcc-4.5.3)
sqlite-version:    2013-02-13 14:04:28 [7e10a62d0e] (3.7.16)
database-stats:    2785223 pages, 1024 bytes/pg, 11 free pages, UTF-8, wal mode

src:
repository-size:   2380333056 bytes (2.4GB)
artifact-count:    1751692 (stored as 246938 full text and 1504754 delta blobs)
artifact-sizes:    24080 average, 17336826 max, 42181390896 bytes (42.2GB) total
compression-ratio: 17:1
checkins:          278062
files:             284615 across all branches
wikipages:         0 (0 changes)
tickets:           0 (0 changes)
events:            0
tagchanges:        0
project-age:       7880 days or approximately 21.57 years.
project-id:        f147779665278afdf4d91757d941046def2b6e5a
fossil-version:    2013-02-16 00:04:35 [d2e07756d9] [1.25] (gcc-4.5.3)
sqlite-version:    2013-02-13 14:04:28 [7e10a62d0e] (3.7.16)
database-stats:    36321 pages, 65536 bytes/pg, 0 free pages, UTF-8, wal mode
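
The dbstat figures quoted in this thread can be cross-checked with a little arithmetic. The Python sketch below rests on two assumptions that are inferred from the numbers rather than taken from Fossil's documentation: that the repository size equals the page count times the page size, and that the compression ratio is the total artifact bytes divided by the repository size, truncated to a whole number. It also assumes the first attached block is pkgsrc and the second is src. All figures are copied from the outputs above.

```python
# Figures copied from the dbstat outputs quoted in this thread.
REPOS = {
    "fossil": {"pages": 52480, "page_size": 1024,
               "repo_bytes": 53739520, "artifact_bytes": 1673191067},
    "pkgsrc": {"pages": 2785223, "page_size": 1024,
               "repo_bytes": 2852068352, "artifact_bytes": 25270457712},
    "src": {"pages": 36321, "page_size": 65536,
            "repo_bytes": 2380333056, "artifact_bytes": 42181390896},
}


def file_size_from_pages(stats):
    """Database file size implied by page count and page size."""
    return stats["pages"] * stats["page_size"]


def compression_ratio(stats):
    """Assumed formula: uncompressed artifact bytes over file size, truncated."""
    return stats["artifact_bytes"] // stats["repo_bytes"]


for name, stats in REPOS.items():
    print(name, file_size_from_pages(stats), f"{compression_ratio(stats)}:1")
```

Under those assumptions the three repositories come out at 31:1, 8:1 and 17:1, matching the reported values, and the page arithmetic reproduces each repository-size line exactly.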


Re: [fossil-users] Scalability limits

2014-02-07 Thread Stephan Beal
On Fri, Feb 7, 2014 at 6:15 PM, Ron Wilson ronw.m...@gmail.com wrote:

> I am guessing this is a limitation of SQLite, which is designed to be
> light. It would be interesting to see how Fossil would perform when
> plugged into, for example, PostgreSQL, MariaDB or another heavy-duty
> SQL server. Of course, that could require rewriting a lot of SQL queries.


When starting on libfossil i actually looked into that and decided against
it primarily because so much of the heavy lifting (and a lot of the lighter
work) in fossil is done by sqlite, and it would be a tremendous effort to
port that SQL logic either to C code or another SQL dialect. The fossil
core model supports arbitrary storage (not necessarily a db), but having
sql-based storage greatly simplifies many parts of the functionality and
fossil as an application (or library) is very tightly married to sqlite.
Then of course: the primary author of sqlite is the one writing most of the
SQL in fossil, which means that the SQL is very fine indeed :).

It would be possible to do on top of another db, but i don't think
anyone's going to volunteer to do it any time soon! It would be really cool
to see someone implement their own SCM based on fossil's core artifact
model and their own db back-end, though. It would likely require a complete
re-implementation, not just rewriting most of the SQL. libfossil (as
opposed to fossil) goes out of its way to abstract the sqlite3 API out of
the client's view, and could reasonably be ported to work with another db
with relatively little work, but the queries themselves are often very
sqlite-specific. That's where most of the work would be.

Anyway...

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do. -- Bigby Wolf


Re: [fossil-users] Scalability limits

2014-02-07 Thread Rich Neswold
On Fri, Feb 7, 2014 at 10:17 AM, Stephan Beal sgb...@googlemail.com wrote:
> i'd be interested in seeing the output of 'dbstat' on your repo, except that
> it could take some time for it to finish generating its output (so don't
> feel obligated to try it). Here's the info for the current fossil core repo:

I have another attempt in progress. This time, I'm running it on a
quad-core system with 12GB RAM. It's been close to 48 hours and it
reports being only 81.5% completed. The fossil file is 10GB, the -shm
file is 2.3GB and the -wal file is 30.2GB. When it's done, I'll report
the dbstats.

Thanks,

-- 
Rich


Re: [fossil-users] Scalability limits

2014-02-07 Thread Stephan Beal
On Fri, Feb 7, 2014 at 9:11 PM, Joerg Sonnenberger jo...@britannica.bec.de wrote:

> On Fri, Feb 07, 2014 at 05:17:23PM +0100, Stephan Beal wrote:
> > i'd be interested in seeing the output of 'dbstat' on your repo, except
> > that it could take some time for it to finish generating its output (so
> > don't feel obligated to try it). Here's the info for the current fossil
> > core repo:
>
> Attached for pkgsrc and src.


Holy cow, that's a lot of checkins. Does 21.5 years make src the
oldest-history fossil repo?

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do. -- Bigby Wolf


[fossil-users] http_ssl paren issue

2014-02-07 Thread James Turner
I think maybe http_ssl needs an extra set of parentheses to deal with
this:

./src/http_ssl.c: In function 'ssl_open':
./src/http_ssl.c:288: warning: cast to pointer from integer of different size

-- 
James Turner

Index: src/http_ssl.c
==================================================================
--- src/http_ssl.c
+++ src/http_ssl.c
@@ -283,11 +283,11 @@
     return 1;
   }
   BIO_get_ssl(iBio, &ssl);
 
 #if (SSLEAY_VERSION_NUMBER >= 0x00908070) && !defined(OPENSSL_NO_TLSEXT)
-  if( !SSL_set_tlsext_host_name(ssl, pUrlData->useProxy?pUrlData->hostname:pUrlData->name) ){
+  if( !SSL_set_tlsext_host_name(ssl, (pUrlData->useProxy?pUrlData->hostname:pUrlData->name)) ){
     fossil_warning("WARNING: failed to set server name indication (SNI), "
                   "continuing without it.\n");
   }
 #endif
 



Re: [fossil-users] Scalability limits

2014-02-07 Thread Arnel Legaspi

On 2/8/2014 5:19 AM, Stephan Beal wrote:

> It would be really cool
> to see someone implement their own SCM based on fossil's core artifact
> model and their own db back-end, though. It would likely require a complete
> re-implementation, not just rewriting most of the SQL.


Wasn't Veracity (http://veracity-scm.com/) inspired by most of the
concepts in Fossil? They also use SQLite as their DB back-end, and IIRC,
they were planning to make/sell add-ons that allow using other SQL DBs
like PostgreSQL, etc. as the repo back-end.


Too bad it's on hold at the moment, though.


Re: [fossil-users] Scalability limits

2014-02-07 Thread Nico Williams
On Fri, Feb 7, 2014 at 11:40 AM, Stephan Beal sgb...@googlemail.com wrote:
> On Fri, Feb 7, 2014 at 6:15 PM, Ron Wilson ronw.m...@gmail.com wrote:
>
> > I am guessing this is a limitation of SQLite, which is designed to be
> > light. It would be interesting to see how Fossil would perform when
> > plugged into, for example, PostgreSQL, MariaDB or another heavy-duty SQL
> > server. Of course, that could require rewriting a lot of SQL queries.
>
> When starting on libfossil i actually looked into that and decided against
> it primarily because so much of the heavy lifting (and a lot of the lighter
> work) in fossil is done by sqlite, and it would be a tremendous effort to
> port that SQL logic either to C code or another SQL dialect. The fossil core
> [...]

One of the nice things about all the features that SQLite3 has been
growing lately (CTEs, recursive queries, before that recursive
triggers, foreign keys, ...) is that the more business logic that can
be expressed declaratively and therefore pushed into SQL, the less
complexity one has to have in C, Python, ...  That makes it much
easier to cope with future schema changes, or dataset changes that
require re-planning queries (the RDBMS can do it!).  And all those new
features are great for expressing complex business logic (particularly
CTEs).
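
As a rough illustration of pushing graph logic into SQL, the Python sketch below uses a recursive CTE to collect every ancestor of a check-in. The link(parent, child) table and the ancestors helper are hypothetical stand-ins for a parent/child relation and are not Fossil's actual schema. Recursive CTEs need SQLite 3.8.3 or later.

```python
import sqlite3


def ancestors(conn, checkin):
    """Return every ancestor of `checkin`, found with a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestor(id) AS (
            SELECT parent FROM link WHERE child = ?
            UNION
            SELECT link.parent
              FROM link JOIN ancestor ON link.child = ancestor.id
        )
        SELECT id FROM ancestor ORDER BY id
        """,
        (checkin,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical parent/child table; check-in 5 merges two branches (3 and 4).
conn.execute("CREATE TABLE link(parent INTEGER, child INTEGER)")
conn.executemany(
    "INSERT INTO link VALUES (?, ?)",
    [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)],
)
print(ancestors(conn, 5))  # prints [1, 2, 3, 4]
```

The UNION (rather than UNION ALL) removes duplicates, so check-in 2 is visited only once even though it is reachable through both branches.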

Sticking to a portable subset of SQL, on the other hand, makes it
easier to scale up and down the device stack, dataset sizes, and
across the network.  Which makes improvements in the lowest common
denominator very welcome!

Now, if only PostgreSQL (and others) had a duck-type option to match
(roughly) SQLite3's duck typing...  That would bring the lowest common
denominator up to a very useful level.
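
To make the "duck typing" remark concrete, here is a small Python sketch of how SQLite treats a column's declared type as a preference rather than a constraint. The table t and its column n are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column is declared INTEGER, but SQLite only treats that as an affinity.
conn.execute("CREATE TABLE t(n INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [(42,), ("42",), ("forty-two",), (4.5,)],
)

# typeof() reports the storage class actually used for each value.
types = [row[0] for row in conn.execute("SELECT typeof(n) FROM t ORDER BY rowid")]
print(types)  # prints ['integer', 'integer', 'text', 'real']
```

The string "42" is converted to an integer because it looks like one, while "forty-two" is stored as text and 4.5 as a real; a strictly typed server would typically reject or coerce those rows instead.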

Nico
--