Ben Franksen wrote:
> Manoj Gudi wrote:
>> I am trying to commit a new log file (size 20Mb), `darcs record` works
>> fine, however `darcs push` hangs after asking for credentials..
>>
>> Any ideas?
>
> You hit a long-standing weak spot of Darcs: transmitting large files can
> take a long time.

tl;dr What we need to do in Darcs is to compress the bundle before sending
it over the line. That's what git seems to do and it cuts down quite a bit
on the transfer time.

Here is how I arrived at the conclusion.

Adding a few debug messages, I quickly found out that the culprit is the way
we transfer the bundle. This is done with a pipe: the remote process
(ssh user@host:path apply) reads the data from stdin. You can try to emulate
this method with the following one-liner:

ben@sarun[1]: /tmp/large > time cat american-english-times21 | (ssh frank...@tiber.acc.bessy.de /tmp/large/readit /tmp/large/american-english-times21)
cat american-english-times21  0,00s user 0,05s system 0% cpu 2:25,97 total
( ssh frank...@tiber.acc.bessy.de /tmp/large/readit ; )  1,06s user 0,09s system 0% cpu 2:42,86 total

Here, "american-english-times21" is a file that is about 20MB large, and the
script "readit" on the remote side is just

franksen@tiber: /tmp/large > cat readit
cat - > $1

Ok, so the raw transfer costs about 2:40 minutes. Darcs does a bit more, so
a constant overhead factor of 2..3 is not surprising.

Note that even rsync over ssh isn't much faster, at least when the file does
not yet exist on the remote side:

ben@sarun[1]: /tmp/large > time rsync american-english-times21 frank...@tiber.acc.bessy.de:/tmp/large/american-english-times21
rsync american-english-times21  1,24s user 0,14s system 0% cpu 2:43,35 total

On the other hand, git seems to be specially optimized for this purpose:

ben@sarun[1]: /tmp/large > time git push --all frank...@tiber.acc.bessy.de:/tmp/large
[...]
git push --all frank...@tiber.acc.bessy.de:/tmp/large  1,96s user 0,04s system 4% cpu 46,982 total

(That was after I studied the failure message and subsequently configured
the remote git repo accordingly.)

That made me think "well, how do they do that?"
and the next thought was "compression?", so I tried it:

ben@sarun[1]: /tmp/large > time cat american-english-times21 | gzip - | (ssh frank...@tiber.acc.bessy.de /tmp/large/readit /tmp/large/american-english-times21)
cat american-english-times21  0,00s user 0,04s system 0% cpu 27,593 total
gzip -  1,52s user 0,02s system 5% cpu 27,598 total
( ssh frank...@tiber.acc.bessy.de /tmp/large/readit ; )  0,28s user 0,03s system 0% cpu 45,558 total

Bingo! That's almost exactly the time 'git push' needs.

It looks as if we already have the infrastructure for this (zip/unzip) in
place, so the next thing I will try is to patch Darcs to use compression and
see where that gets us.

Cheers
Ben
-- 
"Make it so they have to reboot after every typo." -- Scott Adams
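
For anyone who wants to try the idea without a remote host, here is a rough sketch of the "compress before sending" pipeline, run entirely on the local machine. It is not what Darcs does or will do. The generated sample file is only a stand-in for the word list, the shell function readit mimics the remote script of the same name, and all other names (workdir, plain_bytes, gzip_bytes) are made up for illustration. It counts the bytes that would go over the pipe with and without gzip and checks that the data survives the round trip.

```shell
#!/bin/sh
# Sketch only: emulates the pipe transfer locally, no ssh involved.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Generate a repetitive text file as a stand-in for the ~20MB word list.
src="$workdir/sample.txt"
i=0
while [ "$i" -lt 20000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$src"

# Local counterpart of the remote "readit" script: write stdin to $1.
readit() { cat - > "$1"; }

# Plain transfer: the bytes on the wire are the file itself.
plain_bytes=$(($(wc -c < "$src")))
cat "$src" | readit "$workdir/plain.out"

# Compressed transfer: gzip on the sending side, gunzip on the receiving
# side (assumption: the receiver has to decompress before storing).
gzip_bytes=$(($(gzip -c "$src" | wc -c)))
gzip -c "$src" | gunzip -c | readit "$workdir/gzip.out"

echo "plain bytes on the wire: $plain_bytes"
echo "gzip bytes on the wire:  $gzip_bytes"
cmp -s "$src" "$workdir/gzip.out" && echo "round trip OK"
```

Note that in the gzip one-liner above, readit stores whatever arrives on stdin, so the file written on the remote side is the gzip stream itself. That is fine for a timing test, but a real implementation would need a decompression step on the receiving end, which is what the gunzip stage in this sketch stands for.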