Re: [darcs-users] Memory usage of Record

Jason Dagit Mon, 31 Aug 2009 09:03:24 -0700

Thanks Trent,

I'm adding Ganesh and Ian to the CC list to make sure they look at
this thread.  Their input would be appreciated.

On Sun, Aug 30, 2009 at 2:24 AM, Trent W. Buck<[email protected]> wrote:
> Jason Dagit <[email protected]> writes:
>
>> I'm not sure where to go next with this information.  One idea I had,
>> was to not calculate/create hunk patches until they are absolutely
>> needed.
>
> I edit two files "foo.c" and "bar.c", then run "darcs record".  Looking
> at the diff it presents for foo.c, I see a different (unrelated) bug,
> and fix it in both foo.c and bar.c.  I then press "y" to accept the
> first hunk.
>
> Should the second hunk include the second bugfix?  Eager won't include
> it, lazy will.  This is particularly critical if the second bugfix
> involves re-editing the same lines of bar.c as were modified by the
> original edit.
>
> I think making record get hunks lazily is a Good Thing, but bear in mind
> it might break existing workflows (as above) that (ab?)use the existing
> behaviour.

I had thought of it the same way: there are lazy and eager options, and
the question is which one is right.  In an ideal world, the choice of
eager/lazy could be made by the user in the UI.  For example, when
you're looking at the hunk there should be a rediff option, and by
default you get the eager diff.  Personally, I think if we have to
choose one it should be the lazy choice, because it gives you a chance
to look at the diff and decide you want to fix something.  Going the
lazy route lets us defer the creation of the hunk until we have
somewhere to send it.  But there is a distinction between the lazy
version and a super lazy version.  A lazy version gets the diff when it
needs to be displayed but hangs on to that diff once it has been shown.
A super lazy version never stores that diff in memory and only ever
creates it when it needs to be sent somewhere (such as the screen or a
file).
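
To make the three strategies concrete, here is a minimal Python sketch.
It is not darcs code: the class names, the read_working callback and the
use of difflib are all assumptions made for illustration.  It only shows
when each strategy looks at the working copy.

```python
import difflib


def compute_hunk(old_lines, read_working):
    """Diff the recorded lines against the working copy as it is right now."""
    return list(difflib.unified_diff(old_lines, read_working(), lineterm=""))


class EagerHunk:
    """Hypothetical: computes the diff up front, when record starts."""

    def __init__(self, old_lines, read_working):
        self._diff = compute_hunk(old_lines, read_working)

    def get(self):
        return self._diff


class LazyHunk:
    """Hypothetical: computes the diff on first display, then keeps it."""

    def __init__(self, old_lines, read_working):
        self._old = old_lines
        self._read = read_working
        self._diff = None

    def get(self):
        if self._diff is None:
            self._diff = compute_hunk(self._old, self._read)
        return self._diff


class SuperLazyHunk:
    """Hypothetical: never stores the diff, recomputes it on every request."""

    def __init__(self, old_lines, read_working):
        self._old = old_lines
        self._read = read_working

    def get(self):
        return compute_hunk(self._old, self._read)
```

If the working copy is edited after record starts but before the hunk is
first shown, EagerHunk still reports the old edit and LazyHunk picks up
the new one.  If it is edited again after the hunk was shown, LazyHunk
keeps what was displayed while SuperLazyHunk silently changes.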

Given the current UI I don't think the super lazy version is possible.
 I'm also not sure it's what we want.  I think I'm going to suggest
that the super lazy version is a bug and we should avoid it.  One
reason is that you could pick your hunks in one file, go on to a
different file, change the first file and then record things.  In the
super lazy version you'd end up with different patches than what you
agreed to in the UI.

Now, it's also possible to make the lazy and eager versions use less
memory by storing the hunks in files and just reloading them whenever
we need to look at them or show them to the user.  In fact, I think I
know what we should/could do to help a lot of the memory pressure.  I
think we should make it so that patch sequences are a zipper-ish data
structure that is stored on the disk.  For example, each patch would
be stored as a separate file on the disk in some special directory.
You could then request the next or previous patch in that sequence and
in the background there would be a read of that file.  The contract
for the caller would be a promise to not hold on to the patches longer
than necessary.  We would be accepting lots of "on demand" disk reads
in exchange for lower in-process memory usage.  One could imagine
optimizing for small patches by grouping them into files of about X MB.
It's entirely possible that we could implement this file-backed
sequence as a single file.
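
As a rough sketch of the one-file-per-patch variant, something like the
following Python would do.  This is not an existing darcs or camp
interface: DiskPatchSequence, its methods and the file naming scheme are
hypothetical, and patches are treated as plain strings.

```python
import os
import tempfile


class DiskPatchSequence:
    """Hypothetical zipper-like cursor over patches stored one per file."""

    def __init__(self, directory):
        self._dir = directory
        self._count = 0  # number of patches written so far
        self._pos = 0    # index of the patch under the cursor

    def _path(self, index):
        return os.path.join(self._dir, f"{index:08d}.patch")

    def append(self, patch_text):
        """Write a patch to its own file; nothing is kept in memory."""
        with open(self._path(self._count), "w", encoding="utf-8") as f:
            f.write(patch_text)
        self._count += 1

    def current(self):
        """Read the patch under the cursor from disk."""
        if self._count == 0:
            raise IndexError("empty patch sequence")
        with open(self._path(self._pos), encoding="utf-8") as f:
            return f.read()

    def next(self):
        """Move forward and read that patch, or return None at the end."""
        if self._pos + 1 >= self._count:
            return None
        self._pos += 1
        return self.current()

    def previous(self):
        """Move backward and read that patch, or return None at the start."""
        if self._pos == 0:
            return None
        self._pos -= 1
        return self.current()


def demo():
    """Walk forward past the end, then step back once."""
    with tempfile.TemporaryDirectory() as d:
        seq = DiskPatchSequence(d)
        for text in ("hunk A", "hunk B", "hunk C"):
            seq.append(text)
        return [seq.current(), seq.next(), seq.next(), seq.next(),
                seq.previous()]
```

Every call to current, next or previous goes back to the disk, so the
memory cost is whatever the caller chooses to hold on to.  A single-file
variant would presumably replace _path with an index of byte offsets.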

Thoughts?  Anyone know if Petr's work can be applied here?  Does camp
have an optimization for this?

Jason