Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Richard Hipp
On Sat, Jul 27, 2013 at 3:16 PM, Eric Rubin-Smith eas@gmail.com wrote:

 I have a largish repo I ingested from CVS (via git, as I previously
 described on this list).  I'm using fossil 1.26.

 A tiny commit to a single file takes 63 seconds:

 [monk:code] $ time fossil commit -m "Test check-in"
 New_Version: c46175729e936137f58ef302308d1e95b62e6a61

 real    1m2.767s
 user    0m15.090s
 sys     0m7.227s

 I.e. ~22 seconds of CPU usage, and presumably the rest is on the disk.

 The box is pretty old (see below for /proc/cpuinfo), and I know that
 fossil is not written to be a speed demon -- but this still seems pretty
 ridiculous.


That is ridiculous.  Most commits take less than a second, even on archaic
machines, such as my 15-year-old PPC iBook clocked at 400MHz.

How many files are in your check-out?  What's the total size of all those
files (how big is the checkout)?  Is the repository or the check-out on a
network filesystem?

-- 
D. Richard Hipp
d...@sqlite.org
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Eric Rubin-Smith
On Sat, Jul 27, 2013 at 3:23 PM, Richard Hipp d...@sqlite.org wrote:


 That is ridiculous.  Most commits take less than a second, even on archaic
 machines, such as my 15-year-old PPC iBook clocked at 400MHz.

 How many files are in your check-out?


[monk:repo.fossil] $ find .|wc -l
8095

What's the total size of all those files (how big is the checkout)?


[monk:repo.fossil] $ du -sch .
392M    .
392M    total


 Is the repository or the check-out on a network filesystem?


No and no.

Eric


Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread sky5walk
If Windows, add fossil.exe to the excluded process list of your antivirus
app.




Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Richard Hipp
On Sat, Jul 27, 2013 at 3:41 PM, Eric Rubin-Smith eas@gmail.com wrote:

 On Sat, Jul 27, 2013 at 3:23 PM, Richard Hipp d...@sqlite.org wrote:

 What's the total size of all those files (how big is the checkout)?


 [monk:repo.fossil] $ du -sch .
 392M    .
 392M    total



That would be the culprit.  As one of several self-checks (see
http://www.fossil-scm.org/fossil/doc/trunk/www/selfcheck.wiki), Fossil
always computes an MD5 checksum over the entire check-out and compares that
to the content being checked in, to make sure they are identical.  With a
392MB checkout on an older machine, that might easily take a minute.
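
As a rough illustration of why this self-check scales with the size of the check-out, here is a minimal Python sketch that folds every file under a directory into one MD5 digest. It is not Fossil's actual code: the function name `checkout_md5`, the sorted traversal order, and the choice to mix relative path names into the digest are all assumptions made for the example.

```python
import hashlib
import os


def checkout_md5(root):
    """Return one MD5 hex digest covering every file under root.

    Hypothetical helper for illustration only; Fossil's real
    self-check may order and frame the data differently.
    """
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(rel.encode("utf-8"))
            # Every byte of every file is read and hashed, so the cost
            # grows with the total size and the number of files.
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
    return digest.hexdigest()
```

Calling `checkout_md5(".")` from the top of a check-out reads the whole tree once, which is the work the paragraph above attributes to the commit.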

The Fossil repositories for Fossil itself, and for SQLite are just 14MB and
22MB, respectively.  And I do most of my work on a fast machine, so I never
notice the extra commit-time needed for this self-check.

I think you can turn off this safety-check using:

 fossil setting repo-cksum off

Please try that, and let us know whether or not it solves your problem.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Eric Rubin-Smith
On Sat, Jul 27, 2013 at 4:15 PM, Richard Hipp d...@sqlite.org wrote:



 On Sat, Jul 27, 2013 at 3:41 PM, Eric Rubin-Smith eas@gmail.com wrote:

 On Sat, Jul 27, 2013 at 3:23 PM, Richard Hipp d...@sqlite.org wrote:

 What's the total size of all those files (how big is the checkout)?


 [monk:repo.fossil] $ du -sch .
 392M    .
 392M    total



 That would be the culprit.  As one of several self-checks (see
 http://www.fossil-scm.org/fossil/doc/trunk/www/selfcheck.wiki), Fossil
 always computes an MD5 checksum over the entire check-out and compares that
 to the content being checked in, to make sure they are identical.  With a
 392MB checkout on an older machine, that might easily take a minute.



I tested this basic claim and do not believe it holds:

[monk:~] $ head -c $(echo 392*1024*1024|bc) /dev/zero > foo
[monk:~] $ du -sch foo
392M    foo
392M    total
[monk:~] $ time md5sum foo
c6d8f8fc5c75fd6ecceb4edf42f3ac4d  foo

real    0m1.324s
user    0m0.998s
sys     0m0.247s

So just over a second to calculate that hash on the same box.  I retried
this after dropping kernel caches to test whether it's the disk, and it
still only took 3.6 seconds to calculate the hash.

Of course, that's just the time it takes to calculate the hash.  Obviously
it does not include the time spent concatenating the world together to send
to your MD5 function.  Perhaps there's a super-linear algorithm in that
concatenation stuff?

Turning off repo-cksum *did* address the issue, at least by an order of
magnitude:

[monk:code] $ fossil setting repo-cksum off
[monk:code] $ time fossil commit -m "test commit"
New_Version: 4d3b92dca8a617d6004bbe4e9c158fc11882720d

real    0m7.365s
user    0m0.627s
sys     0m0.398s

Does this leave any serious gaps in fault-tolerance?

The new performance is acceptable, though I'm still happy to keep digging
around if you're still curious (either about what was taking so long, or
about what is still taking 7 seconds, or both).

Thanks Richard.

Eric


Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Stephan Beal
On Sat, Jul 27, 2013 at 10:31 PM, Eric Rubin-Smith eas@gmail.com wrote:

 [monk:code] $ fossil setting repo-cksum off


FYI: if you want that setting used globally by default for your repos, add
the -global flag. Otherwise it will apply only to that repo.


-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal


Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Andy Bradford
Thus said Eric Rubin-Smith on Sat, 27 Jul 2013 16:31:46 -0400:

 I tested this basic claim and do not believe it holds:
 
 [monk:~] $ head -c $(echo 392*1024*1024|bc) /dev/zero > foo
 [monk:~] $ du -sch foo
 392M    foo
 392M    total
 [monk:~] $ time md5sum foo
 c6d8f8fc5c75fd6ecceb4edf42f3ac4d  foo
 
 real    0m1.324s
 user    0m0.998s
 sys     0m0.247s

I believe this test is slightly flawed. You have 8095 files and
directories for a total of 392M. This is not at all the same as 1 file
that totals 392M. So your test doesn't account for the distribution of
the data on the disk and the file system slowness that could result
therefrom.

A better comparison would be:

time find . -type f -exec md5sum {} \;
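
For anyone who wants to repeat this comparison without forking one process per file, the same measurement can be sketched in Python. This is only an illustrative sketch: the function name `hash_each_file` and the 64 KiB read size are arbitrary choices for the example, and the numbers it reports depend heavily on whether the files are already in the kernel's page cache.

```python
import hashlib
import os
import time


def hash_each_file(root):
    """MD5 every regular file under root, one digest per file.

    Returns (file_count, total_bytes, elapsed_seconds). Hypothetical
    helper, roughly analogous to running md5sum on each file via find.
    """
    start = time.perf_counter()
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks, sockets, and the like
            digest = hashlib.md5()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
                    total += len(chunk)
            count += 1
    return count, total, time.perf_counter() - start


if __name__ == "__main__":
    files, size, seconds = hash_each_file(".")
    print(f"{files} files, {size} bytes, {seconds:.3f}s")
```

Comparing its elapsed time against a single md5sum over one file of the same total size separates the per-file open and seek cost from the raw hashing cost.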

Andy
-- 
TAI64 timestamp: 400051f43494




Re: [fossil-users] fossil commit is extremely slow

2013-07-27 Thread Eric Rubin-Smith
On Sat, Jul 27, 2013 at 4:58 PM, Andy Bradford 
amb-sendok-1377550706.oeilkncbciakkppah...@bradfords.org wrote:

 Thus said Eric Rubin-Smith on Sat, 27 Jul 2013 16:31:46 -0400:

  I tested this basic claim and do not believe it holds:
 
  [monk:~] $ head -c $(echo 392*1024*1024|bc) /dev/zero > foo
  [monk:~] $ du -sch foo
  392M    foo
  392M    total
  [monk:~] $ time md5sum foo
  c6d8f8fc5c75fd6ecceb4edf42f3ac4d  foo
 
  real    0m1.324s
  user    0m0.998s
  sys     0m0.247s

 I believe this test is slightly flawed. You have 8095 files and
 directories for a total of 392M. This is not at all the same as 1 file
 that totals 392M. So your test doesn't account for the distribution of
 the data on the disk and the file system slowness that could result
 therefrom.


Good point!  Not to mention duplicated syscall overhead etc.  I ran a riff
on your idea and got a very different result:

[monk:repo.fossil] $ time find . -type f -exec cat {} \; | md5sum -
3abe8f411181a328c7b64946ff6a9c7a  -

real    0m37.631s
user    0m2.973s
sys     0m11.543s

As you predicted, most of that time is spent on disk I/O, not e.g. in
forking 'cat'.  So that explains over half of the run-time for my fossil
command.

For the other half, I ran fossil under callgrind and found that roughly
43% of its instruction reads were inside zlib, and roughly 33% were spent
updating the MD5 sum:


            Ir
41,797,779,918  PROGRAM TOTALS

            Ir  file:function
18,101,410,264    /usr/src/debug/zlib-1.2.5/inflate.c:inflate (55531x) [/lib64/libz.so.1.2.5]
18,101,410,264  * /usr/src/debug/zlib-1.2.5/inffast.c:inflate_fast [/lib64/libz.so.1.2.5]

13,824,797,833    /home/eas/Fossil-c9cb6e72932fefbe/./src/md5.c:MD5Update (24296657x) [/usr/local/bin/fossil-1.26-eas-built]
         3,983    /home/eas/Fossil-c9cb6e72932fefbe/./src/md5.c:MD5Final (7x) [/usr/local/bin/fossil-1.26-eas-built]
13,824,801,816  * /home/eas/Fossil-c9cb6e72932fefbe/./src/md5.c:MD5Transform [/usr/local/bin/fossil-1.26-eas-built]

(and those are just the top two functions).

All that uncompressing seems to come from blob_uncompress.  So I guess the
only remaining question is whether all those blob uncompresses are really
necessary.  I assume yes -- and in any case I have my answers. :-)
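
To make the shape of that profile concrete, here is a small Python sketch of a verification pass that inflates each stored blob and feeds the result to MD5. It is a guess at the general pattern, not a description of Fossil's internals: plain zlib streams stand in for stored artifacts, and the name `md5_of_compressed_blobs` is made up for the example. Fossil's real storage format and delta handling are not modeled.

```python
import hashlib
import zlib


def md5_of_compressed_blobs(blobs):
    """Inflate each zlib-compressed blob and fold it into one MD5.

    Hypothetical helper: each blob costs one inflate call plus one
    MD5 update, the two hot spots seen in the callgrind listing.
    """
    digest = hashlib.md5()
    for blob in blobs:
        digest.update(zlib.decompress(blob))
    return digest.hexdigest()


if __name__ == "__main__":
    stored = [zlib.compress(b"first file\n"), zlib.compress(b"second file\n")]
    print(md5_of_compressed_blobs(stored))
```

In a loop like this, both costs grow with the uncompressed size of everything being verified, no matter how small the commit itself is.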

Thanks again.

Eric