Thanks Trent, I'm adding Ganesh and Ian to the CC list to make sure they look at this thread. Their input would be appreciated.

On Sun, Aug 30, 2009 at 2:24 AM, Trent W. Buck<[email protected]> wrote:
> Jason Dagit <[email protected]> writes:
>
>> I'm not sure where to go next with this information. One idea I had,
>> was to not calculate/create hunk patches until they are absolutely
>> needed.
>
> I edit two files "foo.c" and "bar.c", then run "darcs record". Looking
> at the diff it presents for foo.c, I see a different (unrelated) bug,
> and fix it in both foo.c and bar.c. I then press "y" to accept the
> first hunk.
>
> Should the second hunk include the second bugfix? Eager won't include
> it, lazy will. This is particularly critical if the second bugfix
> involves re-editing the same lines of bar.c as were modified by the
> original edit.
>
> I think making record get hunks lazily is a Good Thing, but bear in mind
> it might break existing workflows (as above) that (ab?)use the existing
> behaviour.

I had thought of it the same way: there are lazy and eager options, so which one is right?

In an ideal world, the choice of eager/lazy could be made by the user in the UI. For example, when you're looking at a hunk there should be a rediff option, and by default you get the eager diff. Personally, I think if we have to choose one it should be the lazy choice, because it gives you a chance to look at the diff and decide you want to fix something. Going the lazy route lets us defer the creation of the hunk until we have somewhere to send it.

But there is a distinction between the lazy version and a super lazy version. A lazy version gets the diff when it needs to be displayed, but hangs on to that diff once it has been shown. A super lazy version never stores that diff in memory and only ever creates it when it needs to be sent somewhere (such as the screen or a file).

Given the current UI I don't think the super lazy version is possible. I'm also not sure it's what we want. I think I'm going to suggest that the super lazy version is a bug and we should avoid it.
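
To pin down what I mean by eager, lazy and super lazy, here is a rough sketch. It is in Python rather than Haskell just to keep it short, it is not darcs code, and every name in it (EagerHunks, LazyHunks, SuperLazyHunks, read_working) is made up for illustration. It assumes a "hunk" is simply the unified diff between the recorded lines and whatever the working copy contains when we read it.

```python
import difflib


def diff(old, new):
    """Unified diff of two lists of lines, as a list of strings."""
    return list(difflib.unified_diff(old, new, lineterm=""))


class EagerHunks:
    """Diff is computed once, up front, before anything is displayed."""

    def __init__(self, old, read_working):
        self._hunks = diff(old, read_working())

    def get(self):
        return self._hunks


class LazyHunks:
    """Diff is computed on first display, then kept in memory."""

    def __init__(self, old, read_working):
        self._old, self._read, self._hunks = old, read_working, None

    def get(self):
        if self._hunks is None:
            self._hunks = diff(self._old, self._read())
        return self._hunks


class SuperLazyHunks:
    """Diff is recomputed on every use and never stored."""

    def __init__(self, old, read_working):
        self._old, self._read = old, read_working

    def get(self):
        return diff(self._old, self._read())


# Hypothetical working copy of bar.c, re-read each time a diff is taken.
working = {"bar.c": ["int x = 1;"]}
old = ["int x;"]
read = lambda: working["bar.c"]

eager = EagerHunks(old, read)
lazy = LazyHunks(old, read)
super_lazy = SuperLazyHunks(old, read)

working["bar.c"] = ["int x = 2;"]  # edit made before the hunk is displayed
shown = lazy.get()                 # the hunk is displayed (and accepted) here
working["bar.c"] = ["int x = 3;"]  # edit made after the hunk was accepted
```

After this runs, the eager object still holds the "x = 1" change, the lazy one holds the "x = 2" change that was on disk when the hunk was displayed, and the super lazy one reports "x = 3", which nobody ever looked at in the UI.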

One reason to avoid the super lazy version is that you could pick your hunks in one file, go on to a different file, change the first file and then record things. In the super lazy version you'd end up with different patches than what you agreed to in the UI.

Now, it's also possible to make the lazy and eager versions use less memory by storing the hunks in files and just reloading them whenever we need to look at them or show them to the user.

In fact, I think I know what we could do to relieve a lot of the memory pressure. I think we should make patch sequences a zipper-ish data structure that is stored on disk. For example, each patch would be stored as a separate file in some special directory. You could then request the next or previous patch in that sequence, and in the background there would be a read of that file. The contract for the caller would be a promise not to hold on to the patches longer than necessary. We would be trading in-process memory usage for lots of "on demand" disk reads. One could imagine optimizing for small patches by grouping them into files of about X MB. It's entirely possible that we could implement this file-backed sequence as a single file.

Thoughts? Anyone know if Petr's work can be applied here? Does camp have an optimization for this?

Jason
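
P.S. To make the file-backed sequence idea a bit more concrete, here is a minimal sketch of the one-file-per-patch variant. Again this is Python purely for brevity and not a proposal for the actual darcs types. The class name PatchCursor, the file naming scheme and the string representation of a patch are all assumptions of mine. Grouping small patches into larger files, or using a single backing file, is left out.

```python
import os
import tempfile


class PatchCursor:
    """Zipper-like cursor over patches stored one per file in a directory.

    Only the position is kept in memory; the focused patch is read from
    disk each time it is requested.
    """

    def __init__(self, directory, count):
        self._dir, self._count, self._pos = directory, count, 0

    @classmethod
    def store(cls, directory, patches):
        """Write each patch to its own numbered file and return a cursor."""
        for i, patch in enumerate(patches):
            with open(cls._path(directory, i), "w", encoding="utf-8") as f:
                f.write(patch)
        return cls(directory, len(patches))

    @staticmethod
    def _path(directory, index):
        return os.path.join(directory, f"{index:08d}.patch")

    def current(self):
        """Read the focused patch from disk; nothing is cached here."""
        if self._count == 0:
            raise IndexError("empty patch sequence")
        with open(self._path(self._dir, self._pos), encoding="utf-8") as f:
            return f.read()

    def next(self):
        """Move focus forward; return False if already at the last patch."""
        if self._pos + 1 >= self._count:
            return False
        self._pos += 1
        return True

    def prev(self):
        """Move focus backward; return False if already at the first patch."""
        if self._pos == 0:
            return False
        self._pos -= 1
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        cursor = PatchCursor.store(d, ["hunk A", "hunk B", "hunk C"])
        print(cursor.current())
        while cursor.next():
            print(cursor.current())
```

The caller's side of the contract is simply not to keep the strings returned by current() around; memory use then stays at roughly one patch at a time, at the cost of a file read per step.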
