On 2008.04.25 00:54:32 +0200, Alexander Staubo <[EMAIL PROTECTED]> scribbled 1.2K characters:
> On Fri, Apr 25, 2008 at 12:35 AM, zooko <[EMAIL PROTECTED]> wrote:
> > A new user started to convert his company to darcs, but then had to
> > back out and go back to using SVN when it turned out that "darcs
> > whatsnew" took 17 seconds and his co-workers couldn't stand that.
> > (The equivalent call, "svn diff" takes around 1.7 seconds -- about
> > 10x as fast.)
>
> My experience is that Darcs performs rather poorly in the presence of
> large, untracked files in the working directory. It probably reads
> each file into memory, perhaps in order to determine whether it's
> binary or not?
>
> On OS X and probably other operating systems this forces the OS to
> swap out pretty much everything to disk, which makes the system
> virtually unusable while it's running and for a while afterwards, when
> the OS needs to swap everything back in again.
>
> I am intimately familiar with this issue because I tend to put 500MB
> database dumps in my working directory before accidentally running
> "darcs whatsnew".
>
> > It looks like there is probably quite a bit of room for optimization
> > in darcs-2's use of the filesystem.
>
> Probably. Git and Mercurial do not suffer from this problem, either.
>
> Alexander.

It's hard to tell what's causing the slowdown. If you look at one of my
profiling runs for 'whatsnew'
<http://lists.osuosl.org/pipermail/darcs-devel/attachments/20080412/7854901d/attachment-0026.obj>,
you see that

    COST CENTRE        MODULE                  %time %alloc
    filetype_function  Darcs.Repository.Prefs   73.6   12.8

But when you go down the actual trace for filetype_function, or you look
at the definition, none of the called functions seems to be a real
time-waster (except maybe 'normalize'). It does call a locally defined
'isbin', but notice that darcs decides what counts as a binary file via
regexes:

    filetype_function :: IO (FilePath -> FileType)
    filetype_function = do
        binsfile <- def_prefval "binariesfile" "_darcs/prefs/binaries"
        bins <- get_lines binsfile `catch`
                (\e -> if isDoesNotExistError e then return [] else ioError e)
        gbs <- get_global "binaries"
        regexes <- return (map (\r -> mkRegex r) (bins ++ gbs))
        let isbin f = or $ map (\r -> isJust $ matchRegex r f) regexes
            ftf f = if isbin $ normalize f then BinaryFile else TextFile
            in return ftf

So filetype_function defines a new function, ftf, that matches on the
file name (in practice, on extensions), AFAICT. There doesn't seem to be
any loading of file contents involved, which leaves the slowdown a
mystery.

--
gwern
C3I Uzi unix B-1B ies joe security MI5 Dateline SC
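
To make the point about ftf concrete, here is a minimal sketch of a
name-only classifier. It is not the darcs code: 'classify' and 'Pattern'
are hypothetical names, and case-insensitive suffix matching stands in
for the compiled regexes so the example needs nothing beyond base. What
it shows is the shape of the argument: the classifier is a pure function
of the path string, so it cannot be reading file contents.

```haskell
import Data.Char (toLower)
import Data.List (isSuffixOf)

data FileType = BinaryFile | TextFile
  deriving (Eq, Show)

-- Hypothetical stand-in for a compiled regex: a lowercase suffix such as ".png".
type Pattern = String

-- Classify a path by name alone. There is no IO in the type, so no file
-- is opened or read; only the path string is inspected.
classify :: [Pattern] -> FilePath -> FileType
classify pats path
  | any (`isSuffixOf` lowered) pats = BinaryFile
  | otherwise                       = TextFile
  where
    -- Lowercase the path so that "Photo.PNG" matches ".png".
    lowered = map toLower path
```

In GHCi, classify [".png", ".gz"] "dumps/db.sql.GZ" gives BinaryFile and
classify [".png", ".gz"] "src/Main.hs" gives TextFile. A 500MB dump costs
the same to classify as an empty file, so if the real ftf behaves like
this, the time attributed to filetype_function has to come from somewhere
else.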
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
