E.Y.Kow wrote:
> Sat Mar 21 04:22:37 GMT 2009  David Roundy <droundy at darcs.net>
>   * fix memory leak in check/repair

I think I understand what this patch does now, and its relation to
Petr's and my patches,

Tue Dec 23 13:22:27 CET 2008  Petr Rockai <[email protected]>
  * Fix a memory leak in Darcs.Repository.Repair.

Tue Mar 10 13:03:57 CET 2009  Bertram Felgenhauer <[email protected]>
  * Improve memory usage of darcs check and repair.

The real problem that Petr's patch solved is that 'makePatchLazy'
doesn't actually release the original patch as (not!) advertised; it
still hangs on to the patch with an 'info <patch>' thunk.

To fix it, Petr changed 'applyAndFix' in Darcs.Repository.Repair to
reread all unmodified patches again from the repository instead of
using the result of 'makePatchLazy'.

My patch further modified Petr's code to not hang onto the original
patches over the recursive calls of the 'applyAndFix' worker function,
'aaf'.

Back to the original problem.

'makePatchLazy' calls 'write_and_read_patch' in HashedRepo or DarcsRepo.

David's patch mainly concerns 'write_and_read_patch' in HashedRepo.

> hunk ./src/Darcs/Repository/HashedRepo.hs 211
>  write_and_read_patch :: RepoPatch p => Cache -> Compression -> PatchInfoAnd 
> p C(x y)
>                       -> IO (PatchInfoAnd p C(x y))
>  write_and_read_patch c compr p = do (i,h) <- write_patch_if_necesary c compr 
> p
> -                                    Sealed x <- createHashed h (parse i)
> -                                    return $ patchInfoAndPatch i $ 
> unsafeCoerceP x
> -    where parse i h = do debugMessage ("Reading patch file: "++ show 
> (human_friendly i))
> +                                    unsafeInterleaveIO $ readp h i
> +    where parse i h = do debugMessage ("Rereading patch file: "++ show 
> (human_friendly i))
>                           (fn,ps) <- fetchFileUsingCache c HashedPatchesDir h
>                           case readPatch ps of
>                             Just (x,_) -> return x
> hunk ./src/Darcs/Repository/HashedRepo.hs 219
>                             Nothing -> fail $ unlines ["Couldn't parse patch 
> file "++fn,
>                                                        "which is",
>                                                        renderString $ 
> human_friendly i]
> +          readp h i = do Sealed x <- createHashed h (parse i)
> +                         return $ patchInfoAndPatch i $ unsafeCoerceP x

This refactors 'write_and_read_patch' and adds an 'unsafeInterLeaveIO'
around the call to 'createHashed'.

As far as I understand it, this change has little effect - the main work
of 'createHashed' is also delayed using 'unsafeInterLeaveIO'.

Note that this function calls 'write_patch_if_necesary'.

> hunk ./src/Darcs/Repository/HashedRepo.hs 266
>  write_patch_if_necesary :: RepoPatch p => Cache -> Compression
>                          -> PatchInfoAnd p C(x y) -> IO (PatchInfo, String)
>  write_patch_if_necesary c compr hp =
> -    case extractHash hp of
> -    Right h -> return (info hp, h)
> -    Left p -> fmap (\h -> (info hp, h)) $ writeHashFile c compr 
> HashedPatchesDir $ showPatch p
> +    seq infohp $ case extractHash hp of
> +                   Right h -> return (infohp, h)
> +                   Left p -> (\h -> (infohp, h)) `fmap`
> +                             writeHashFile c compr HashedPatchesDir 
> (showPatch p)
> +    where infohp = info hp

And here's the actual fix: In the old version, 'write_patch_if_necesary'
hung onto the old patch ('hp') due to the 'info hp' call. The fix is to
force the patch's info before returning it.

(This is essentially the same idea as in my patch above, but on a
different code path.)

As far as I can see, DarcsRepo.write_and_read_patch is not affected
by this, because it parses the patch info on demand.

To summarize: David's patch solves the problem of hanging onto the whole
contents of patches in memory during repair; only the patch infos are
kept around the whole time.

Petr's and my changes should still be a minor improvement, because it
causes darcs to keep fewer patch infos around.

Numbers:

=== zoo/darcs ===

          ||      HEAD |  +droundy |  +dr-petr |      -petr     
==========++===========+===========+===========+===========
    check || 25.6s 19M | 25.8s 19M | 28.0s 38M | 31.1s 163M
 get full ||  8.0s 10M |  8.0s 10M |  8.0s 10M |  8.0s  10M
 get lazy ||  0.7s  3M |  0.7s  3M |  0.7s  3M |  0.7s   3M
 pull 100 ||  5.5s 15M |  5.5s 15M |  5.4s 15M |  5.4s  15M
pull 1000 || 40.6s 32M | 41.7s 32M | 41.6s 32M | 41.2s  31M
   repair || 30.3s 19M | 26.6s 19M | 29.0s 38M | 31.1s 164M

HEAD:     darcs head of earlier today
+droundy: darcs head plus David's patch
+dr-petr: darcs head plus David's patch minus Petr's and my patches
-petr:    darcs head without Petr's and my patches

(Petr's patch on top of David's has a bigger effect than I expected,
almost halving memory usage for 'darcs repair')

So David's patch has little effect now. It can coexist peacefully with
Petr's and my changes though, and improving makePatchLazy seems to be
a good idea - I think it should be applied.

Bertram
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users

Reply via email to