On Wed, May 18, 2005 at 11:36:44AM -0700, Tupshin Harper wrote:
> David Roundy wrote:
> 
> >
> >Indeed, that should help.  But even as things are now, a darcs-unstable
> >initial record of the linux kernel requires only 10 times the CPU time that
> >tar czf does, and only 7.5 times the wallclock time.  So if we assume that
> >tar is pretty much optimal, we only have one order of magnitude improvement
> >left to be made.  I expect that changing the hunk format (as we've
> >discussed) should pretty much get us that order of magnitude in
> >improvement in CPU time.
> >
> >The memory usage is way worse than that of tar, but I'm optimistic that
> >we can improve things a bit in that realm.  Perhaps (for example) by
> >storing PackedString file paths, or by making the directory-reading portion
> >of slurp lazy.  In any case, 450M isn't such bad maximum memory consumption
> >for a project the size of the kernel.
> >  
> >
> It would seem that if an addfile primitive included the hunk patch of
> the file's initial contents instead of treating them as separate
> primitives, then certain implementation optimizations would be much more
> feasible.  If you are parsing an addfile patch, then you can assume:
> 1) the embedded hunk patch (or binary patch) only contains add lines and
> never delete lines
> 2) the hunk is not offset; it always starts at the beginning of the file
> 3) if the addfile itself does not conflict, then it is impossible for
> its embedded hunk to conflict
> 
> Given those assumptions, you can choose not to preparse and load the
> entire patch into memory, and instead just stream the patch through hex
> decoding and write it directly to file.
> 
> Presumably something similar would be possible for rmfile patches as well.
> 
> Thoughts?

One could certainly add file contents to addfile and rmfile.  This would
also have the advantage of eliminating the irritating issue of getting
prompted twice for addfile and rmfile patches.  After we've added support
for a repository format file we could make this change relatively
painlessly.  We could even avoid introducing a new patch type by simply
parsing the "old addfile" as a "new addfile" with no contents (i.e. an
empty file).  So we could have something like

addfile ./foo

parses to AddFile "./foo" "", while something like

addfile ./foo with 10 bytes and 1 line
1234567890

parses to AddFile "./foo" "1234567890"
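
As a rough illustration of the backward-compatible parse described above,
here is a minimal Haskell sketch. It is not the darcs parser: the Patch
type, the parseAddFile function, and the exact header grammar are
assumptions made up for this example. It also assumes that paths contain
no spaces, that one Char is one byte, and that contents are stored raw
rather than hex-encoded.

```haskell
-- Hypothetical sketch only; these names are not taken from the darcs source.
data Patch = AddFile FilePath String
  deriving (Eq, Show)

-- Parse one addfile patch from text that begins at its header line.
-- Returns the patch together with the unconsumed remainder of the input.
parseAddFile :: String -> Maybe (Patch, String)
parseAddFile input =
  case words header of
    -- Old format: no contents follow, so read it as an empty file.
    ["addfile", path] -> Just (AddFile path "", rest)
    -- New format: the header announces how many bytes of contents follow.
    ["addfile", path, "with", n, _, "and", _, _] ->
      case reads n :: [(Int, String)] of
        [(count, "")] | count >= 0 ->
          let (body, rest') = splitAt count rest
          in if length body == count
               then Just (AddFile path body, drop 1 rest')  -- skip trailing newline
               else Nothing                                 -- input was truncated
        _ -> Nothing
    _ -> Nothing
  where
    (header, afterHeader) = break (== '\n') input
    rest = drop 1 afterHeader  -- everything after the header's newline
```

Because the byte count is known up front, a reader does not need to scan
the contents for a terminator, and the old header form maps onto the same
constructor without a new patch type.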

I don't think this'll make a big difference efficiency-wise (once we
introduce the new hunk formatting plan), but it does look perhaps a bit
cleaner, and it fixes a UI issue.  It also might simplify conflict marking
a bit.
-- 
David Roundy
