Thus said Matt Welland on Wed, 16 Apr 2014 09:01:28 -0700:

> fossil commit cfgdat tests -m "Added another drc test"
> Autosync: ssh://host/path/project.fossil
> Round-trips: 1 Artifacts sent: 0 received: 0
> Error: Database error: database is locked: {UPDATE event SET mtime=(SELECT
> m1 FROM time_fudge WHERE mid=objid) WHERE objid IN (SELECT mid FROM
> time_fudge);}
> Round-trips: 1 Artifacts sent: 0 received: 0
> Pull finished with 360 bytes sent, 280 bytes received
> Autosync failed
> continue in spite of sync failure (y/N)? n

I've done a fair bit of profiling with this, and this seems to happen primarily with the test-http command (the default sync method for SSH clients). I don't know what the history is behind the test-http command, but my guess is that it was really not intended to be a heavily used sync method for shared repositories. I'm not really sure why this particular database locking error happens so frequently with test-http, but not at all with http. This is happening in manifest_crosslink_end() when it's trying to fudge times.

If I force my SSH command to use http instead of test-http, this error disappears entirely and I only ever see an occasional locking error due to multiple committers when I try to commit large change sets (like a 10,000 line, 840K change set). That is the same behavior I see with the standard HTTP/HTTPS transports in my environment (slow disk/cpu/network).

Are all your users using SSH to access shared repositories? Or do you just have a few users using SSH? Perhaps it would be better to switch to using SSH keys and forced commands to cause fossil to use http instead of test-http?

This does require a bit more setup. For example, each .fossil has to have the remote_user_ok configuration enabled so you can set up the REMOTE_USER environment variable for them. This is because there is currently no mechanism to use Fossil authentication while using SSH as the transport, and fossil http requires it if you want to commit.

I suppose an alternative configuration would be to give nobody/anonymous users the ability to write, which may be acceptable if SSH authentication is the only allowed sync method. The only drawback that I see there is that the rcvfrom information would show up as having come from nobody, e.g.:

    User: amb
    Received From: nobody @ 192.168.1.9 on 2014-04-20 04:33:35

I think one thing I've learned from all this is that forks and database locking errors occur much more frequently on slow hardware and large change sets.
Also, I seem to be able to cause forking that goes undetected (without a warning). All of this probably explains why it is difficult to reproduce except on older hardware.

As for making sync try harder, we could certainly just loop X number of times if we think it is worth it (not sure how feasible it will be to make it silent, or if there will be other side effects). Here I have it loop up to 10 times before bailing. As you can see, it failed once, but then succeeded the second time and received updates that indicate it is out of sync:

$ fossil ci -m synctest2
Autosync: ssh://fossil/tmp/test.fossil
Round-trips: 1 Artifacts sent: 0 received: 0
Error: Database error: database is locked: {UPDATE event SET mtime=(SELECT m1 FROM time_fudge WHERE mid=objid) WHERE objid IN (SELECT mid FROM time_fudge);}
Round-trips: 1 Artifacts sent: 0 received: 0
Pull finished with 314 bytes sent, 280 bytes received
Autosync failed
Autosync: ssh://fossil/tmp/test.fossil
Round-trips: 3 Artifacts sent: 0 received: 102
Pull finished with 3451 bytes sent, 170661 bytes received
would fork. "update" first or use --allow-fork.

There was also a sync failure on the first committer after it successfully committed the artifacts:

$ fossil ci -m synctest1
Autosync: ssh://fossil/tmp/test.fossil
Round-trips: 1 Artifacts sent: 0 received: 0
Pull finished with 316 bytes sent, 229 bytes received
New_Version: 04e7debfa4f29ee3c1635007e3f380f0a0630366
Autosync: ssh://fossil/tmp/test.fossil
Round-trips: 3 Artifacts sent: 101 received: 0
Error: Database error: database is locked: {UPDATE event SET mtime=(SELECT m1 FROM time_fudge WHERE mid=objid) WHERE objid IN (SELECT mid FROM time_fudge);}
Round-trips: 3 Artifacts sent: 101 received: 0
Sync finished with 179617 bytes sent, 3234 bytes received
Autosync failed
Autosync: ssh://fossil/tmp/test.fossil
Round-trips: 1 Artifacts sent: 0 received: 1
Sync finished with 4916 bytes sent, 2724 bytes received

Thoughts?
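
For what it's worth, the "loop X number of times before bailing" idea above can be sketched roughly as follows. This is Python rather than Fossil's C, and it is only an illustration of the retry logic, not the change I tested. The names sync_with_retry, SyncLocked, and make_flaky_sync are hypothetical, and the optional delay between attempts is my own assumption.

```python
import time


class SyncLocked(Exception):
    """Hypothetical stand-in for a 'database is locked' sync failure."""


def sync_with_retry(sync_once, max_attempts=10, delay=0.0):
    """Call sync_once() until it succeeds or max_attempts is reached.

    Returns the number of attempts used. If every attempt raises
    SyncLocked, the last exception is re-raised to the caller.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            sync_once()
            return attempt
        except SyncLocked:
            if attempt == max_attempts:
                raise  # out of attempts: bail and report the failure
            time.sleep(delay)  # assumed pause before the next attempt


def make_flaky_sync(failures):
    """Build a fake sync that fails `failures` times, then succeeds."""
    remaining = [failures]

    def sync_once():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise SyncLocked("database is locked")

    return sync_once


if __name__ == "__main__":
    # Fails once, succeeds on the second attempt, as in the first transcript.
    print(sync_with_retry(make_flaky_sync(1)))
```

Note that retrying only hides the locking error. It does nothing about the "would fork" condition, which still has to be reported once the retried pull brings in new artifacts.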

Andy
-- 
TAI64 timestamp: 40000000535358db