On Feb 6, 2008 8:48 AM, David Roundy <[EMAIL PROTECTED]> wrote:
> On Wed, Feb 06, 2008 at 11:44:42AM -0500, David Roundy wrote:
> > > Maybe we can egg on the bytestring people and get them to submit
> > > patches taking this further down (for example, they could improve our
> > > between newlines and nth newline stuff).
> >
> > That'd be nice.
>
> I should note, though, that we can also make probably faster progress by
> simply reducing the dependency on the counting of lines. In particular,
> I'd like to change Hunk to store a new and old PackedString + line number
> counts, rather than storing new and old [PackedString]. We'd still need to
> find the nth newline, etc, but we'd use much less memory and probably be
> faster to boot. Even better when we introduce (maybe before darcs 2 is
> released?) a new hunk file format that doesn't have all those +'s and -'s.
> Then parsing is lightning fast and we won't need duplicate copies of the
> hunk data in memory.
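For concreteness, the representation proposed above might look roughly like
the sketch below. This is only an illustration under stated assumptions: it
uses Data.ByteString.Char8 as a stand-in for PackedString, and the names
Hunk, mkHunk, nthNewline and nthLine are hypothetical, not the actual darcs
definitions. It makes no claim about speed either way.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical hunk type: one contiguous buffer per side plus a cached
-- line count, instead of a list holding one string per line.
-- ByteString stands in for darcs's PackedString (an assumption).
data Hunk = Hunk
  { hunkLine :: !Int           -- line in the file where the hunk starts
  , oldText  :: !B.ByteString  -- removed lines, each newline-terminated
  , oldCount :: !Int           -- number of lines in oldText
  , newText  :: !B.ByteString  -- added lines, each newline-terminated
  , newCount :: !Int           -- number of lines in newText
  } deriving (Eq, Show)

-- Build a hunk, counting lines once so later code need not rescan.
mkHunk :: Int -> B.ByteString -> B.ByteString -> Hunk
mkHunk l o n = Hunk l o (B.count '\n' o) n (B.count '\n' n)

-- Byte offset of the nth newline (1-based), if there are that many.
nthNewline :: Int -> B.ByteString -> Maybe Int
nthNewline n s
  | n <= 0    = Nothing
  | otherwise = case drop (n - 1) (B.elemIndices '\n' s) of
      (i:_) -> Just i
      []    -> Nothing

-- The nth line (1-based) without its trailing newline. The result is a
-- slice of the original buffer, so no line data is copied.
nthLine :: Int -> B.ByteString -> Maybe B.ByteString
nthLine n s
  | n <= 0    = Nothing
  | otherwise = do
      start <- if n == 1 then Just 0 else (+ 1) <$> nthNewline (n - 1) s
      end   <- nthNewline n s
      Just (B.take (end - start) (B.drop start s))
```

With this layout, a lookup such as nthLine 2 (oldText h) walks the newline
offsets of a single buffer rather than indexing into a list of lines, which
is the "find the nth newline" work that would remain.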
I don't know if you're aware of this, but some time ago (probably
measurable in years now) I did this very refactor of the patch format for
hunks and experimented. My implementation with chunks was measurably
slower than the existing implementation. Not by a lot, but you could
measure a small decrease in performance. I checked and I don't seem to
have my performance notes about it anymore.

I would say that this type of refactor is surprisingly low priority. At
least, that's what I felt after spending time on it. The current parsing
code is amazingly fast.

Just my $0.02. I know others here are more experienced with both darcs
internals and optimizing Haskell, so maybe collectively you'll have better
luck than I did.

Jason
_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-devel