Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-19 Thread Johan Corveleyn
On Tue, Oct 12, 2010 at 12:10 PM, Julian Foad julian.f...@wandisco.com wrote: On Tue, 2010-10-12 at 00:31 +0200, Johan Corveleyn wrote: On Mon, Oct 11, 2010 at 11:53 AM, Julian Foad julian.f...@wandisco.com wrote: On Sat, 2010-10-09, Johan Corveleyn wrote: On Sat, Oct 9, 2010 at 2:57 AM,

Diff optimization: implement prefix/suffix-skipping in token-handling code (was: Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster)

2010-10-19 Thread Johan Corveleyn
On Tue, Oct 12, 2010 at 12:35 PM, Julian Foad julian.f...@wandisco.com wrote: On Sun, 2010-10-10 at 23:43 +0200, Johan Corveleyn wrote: On Sat, Oct 9, 2010 at 2:21 PM, Johan Corveleyn jcor...@gmail.com wrote: On Sat, Oct 9, 2010 at 2:57 AM, Julian Foad julian.f...@wandisco.com wrote: But

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-11 Thread Julian Foad
On Sat, 2010-10-09, Johan Corveleyn wrote: On Sat, Oct 9, 2010 at 2:57 AM, Julian Foad julian.f...@wandisco.com wrote: On Sat, 2010-10-09, Johan Corveleyn wrote: Ok, third iteration of the patch in attachment. It passes make check. As discussed in [1], this version keeps 50 lines of the

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-10 Thread Johan Corveleyn
On Sat, Oct 9, 2010 at 2:21 PM, Johan Corveleyn jcor...@gmail.com wrote: On Sat, Oct 9, 2010 at 2:57 AM, Julian Foad julian.f...@wandisco.com wrote: But this makes me think, it looks to me like this whole prefix-suffix-skipping functionality would fit better inside the lower-level diff

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-09 Thread Johan Corveleyn
On Sat, Oct 9, 2010 at 2:57 AM, Julian Foad julian.f...@wandisco.com wrote: On Sat, 2010-10-09, Johan Corveleyn wrote: Ok, third iteration of the patch in attachment. It passes make check. As discussed in [1], this version keeps 50 lines of the identical suffix around, to give the algorithm a

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-09 Thread Daniel Shahaf
Johan Corveleyn wrote on Sat, Oct 09, 2010 at 14:21:09 +0200: (side-note: I considered first doing suffix scanning, then prefix scanning, so I could reuse the buffers/pointers from diff_baton all the time, and still have everything pointing correctly after eliminating prefix/suffix. But that

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-09 Thread Johan Corveleyn
On Sat, Oct 9, 2010 at 5:19 PM, Daniel Shahaf d...@daniel.shahaf.name wrote: Johan Corveleyn wrote on Sat, Oct 09, 2010 at 14:21:09 +0200: (side-note: I considered first doing suffix scanning, then prefix scanning, so I could reuse the buffers/pointers from diff_baton all the time, and still

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-08 Thread Johan Corveleyn
Ok, third iteration of the patch in attachment. It passes make check. As discussed in [1], this version keeps 50 lines of the identical suffix around, to give the algorithm a good chance to generate a diff output of good quality (in all but the most extreme cases, this will be the same as with
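The suffix handling described in this message can be sketched as follows. This is only an illustrative Python sketch, not the attached patch (which is C and is not reproduced in this listing). The function name, the constant name, and the list-of-lines representation are assumptions; only the figure of 50 lines comes from the message.

```python
# Hypothetical constant name; the value 50 is the one quoted in the message.
SUFFIX_LINES_TO_KEEP = 50


def trimmed_suffix_length(original, modified, prefix_len=0,
                          keep=SUFFIX_LINES_TO_KEEP):
    """Return how many trailing identical lines may be skipped.

    The full identical suffix is measured first, then `keep` lines of it
    are left in place so the diff algorithm still sees them.
    """
    # Never let the suffix overlap lines already counted as prefix.
    limit = min(len(original), len(modified)) - prefix_len
    suffix = 0
    while suffix < limit and original[-1 - suffix] == modified[-1 - suffix]:
        suffix += 1
    # Skip everything except the last `keep` identical lines.
    return max(0, suffix - keep)
```

With an identical suffix of 120 lines, this sketch would skip 70 of them and hand the remaining 50 to the diff algorithm along with the differing part. A suffix shorter than `keep` is not skipped at all.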

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-08 Thread Julian Foad
On Sat, 2010-10-09, Johan Corveleyn wrote: Ok, third iteration of the patch in attachment. It passes make check. As discussed in [1], this version keeps 50 lines of the identical suffix around, to give the algorithm a good chance to generate a diff output of good quality (in all but the most

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-03 Thread Johan Corveleyn
On Sun, Oct 3, 2010 at 1:46 AM, Johan Corveleyn jcor...@gmail.com wrote: Hi, Here is a second iteration of the patch. It now passes make check. Differences from the previous version are: - Support for \r eol-style (\n and \r\n was already ok). - The number of prefix_lines is now passed to

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-10-02 Thread Johan Corveleyn
Hi, Here is a second iteration of the patch. It now passes make check. Differences from the previous version are: - Support for \r eol-style (\n and \r\n was already ok). - The number of prefix_lines is now passed to svn_diff__lcs, so it can use that value to set the position offset of the EOF

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-28 Thread Daniel Shahaf
Index: subversion/include/svn_diff.h === --- subversion/include/svn_diff.h (revision 1001548) +++ subversion/include/svn_diff.h (working copy) @@ -112,6 +112,11 @@ (personally I prefer 'svn diff -x-p' to show the function

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-28 Thread Johan Corveleyn
Hi Daniel, Thanks for the feedback. On Tue, Sep 28, 2010 at 4:11 PM, Daniel Shahaf d...@daniel.shahaf.name wrote: Index: subversion/include/svn_diff.h === --- subversion/include/svn_diff.h (revision 1001548) +++

Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-28 Thread Daniel Shahaf
Johan Corveleyn wrote on Tue, Sep 28, 2010 at 23:37:23 +0200: On Tue, Sep 28, 2010 at 4:11 PM, Daniel Shahaf d...@daniel.shahaf.name wrote: Index: subversion/include/svn_diff.h === --- subversion/include/svn_diff.h

[WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster

2010-09-26 Thread Johan Corveleyn
Hi devs, As discussed in [1], here is a patch that makes svn_diff_diff (libsvn_diff/diff.c) skip the identical prefix and suffix of the original and modified files, before starting the LCS (longest common subsequence) algorithm on the non-matching part. This makes diff a lot faster (especially
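The idea in this opening message, skipping the identical prefix and suffix before running the expensive comparison on the non-matching middle, can be sketched roughly as below. This is a hedged Python illustration, not the patch to libsvn_diff/diff.c. The function names are hypothetical, files are assumed to be lists of lines already in memory, and `difflib.SequenceMatcher` merely stands in for the LCS step (it is not the algorithm Subversion uses).

```python
from difflib import SequenceMatcher


def split_common_affixes(original, modified):
    """Return (prefix_len, suffix_len): lines identical at both ends."""
    limit = min(len(original), len(modified))
    prefix = 0
    while prefix < limit and original[prefix] == modified[prefix]:
        prefix += 1
    suffix = 0
    # The suffix may not reach back into lines already counted as prefix.
    while (suffix < limit - prefix
           and original[-1 - suffix] == modified[-1 - suffix]):
        suffix += 1
    return prefix, suffix


def diff_with_skipping(original, modified):
    """Diff only the middle part, then shift offsets back by the prefix."""
    prefix, suffix = split_common_affixes(original, modified)
    mid_a = original[prefix:len(original) - suffix]
    mid_b = modified[prefix:len(modified) - suffix]
    # Stand-in for the LCS step; it only sees the non-matching part.
    matcher = SequenceMatcher(None, mid_a, mid_b, autojunk=False)
    changes = []
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, a1 + prefix, a2 + prefix,
                            b1 + prefix, b2 + prefix))
    return changes
```

The speed-up comes from the two linear scans being far cheaper than the comparison they replace. When a large file has only a small changed region, the stand-in matcher receives a handful of lines instead of the whole file. The reported offsets are shifted by the prefix length so they still refer to positions in the full files.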