Stefan Beller <sbel...@google.com> writes:

>>  * On 2/2, doing it at xdiff.c level may be limiting this good idea
>>    to flourish to its full potential, as the interface is fed only
>>    one diff_filepair at a time.
>
> I realized that after I implemented it. I agree we would want to have
> it function cross file.
>
> So from my current understanding of the code,
> * diffcore_std would call a new function diffcore_detect_moved(void)
>    just before diffcore_apply_filter is called.
> * The new function diffcore_detect_moved would then check if the
>    diff is a valid textual diff (i.e. real files, not submodules, but
>    deletion/creation of one file is allowed).
>    If so, we generate the diff internally and, as in 2/2, would
>    hash all added/removed lines with context and store them.

I do not think you should step outside diff_flush().  Only when
producing a textual diff would you have to run the textual diff
twice, by going over the q twice:

 * The first pass would run diff_flush_patch(), which would call
   into xdiff the usual way, but the callback from xdiff would
   capture the removed lines and the added lines without making any
   output.

 * The second pass would run diff_flush_patch(), but the callback
   from xdiff would be called with additional information, namely,
   the removed and the added lines captured in the first pass.

 * I suspect that the fn_out_consume() function used in the normal
   case (i.e. when we are not doing this more expensive "moved
   to/moved from" coloring) can also serve the second pass above;
   the "priv" aka "ecbdata" may need to be extended so that it can
   tell which mode of operation it is asked to perform.  If there
   is not enough similarity between the second pass of this "moved
   from/moved to" mode and the normal mode of output, it is also OK
   to have two different callback functions, i.e. the original one
   for the normal mode and a second one that knows the "these are
   moved without modification" coloring.  The callback for the
   first pass is sufficiently different that I think it is better
   to invent a new callback function for it, instead of reusing
   fn_out_consume().

   The fn_out_consume() function working in the "second pass of
   moved from/moved to mode" would inspect line[] and see if it is
   an added or a removed line, and then:

   - if it is an added line, and it appears as a removed line
     elsewhere in the patchset (you obtained the information in the
     first pass), you show it as "this was moved from elsewhere".

   - if it is a removed line, and it appears as an added line
     elsewhere in the patchset (you obtained the information in the
     first pass), you show it as "this was moved to elsewhere".

Or something like that.
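
To make the two passes a bit more concrete, here is a minimal sketch
in C.  Every name in it (moved_state, collect_line, classify_line and
so on) is made up for illustration; none of them exists in diff.c or
xdiff, and the real xdiff callbacks take different arguments.  It
assumes the callbacks only ever see hunk body lines whose first byte
is '+', '-' or ' ', and it uses a linear scan where a real
implementation would presumably want a hash table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; nothing here is taken from git's sources. */
enum line_kind {
	LINE_CONTEXT,
	LINE_ADDED,
	LINE_REMOVED,
	LINE_MOVED_FROM_ELSEWHERE,	/* added here, removed elsewhere */
	LINE_MOVED_TO_ELSEWHERE		/* removed here, added elsewhere */
};

struct line_set {
	char **lines;
	size_t nr, alloc;
};

/* State shared by both passes, playing the role of an extended "priv". */
struct moved_state {
	struct line_set removed;
	struct line_set added;
};

static void line_set_add(struct line_set *set, const char *text)
{
	char *copy;

	if (set->nr == set->alloc) {
		size_t new_alloc = set->alloc ? set->alloc * 2 : 16;
		char **grown = realloc(set->lines, new_alloc * sizeof(*grown));
		if (!grown)
			abort();
		set->lines = grown;
		set->alloc = new_alloc;
	}
	copy = malloc(strlen(text) + 1);
	if (!copy)
		abort();
	strcpy(copy, text);
	set->lines[set->nr++] = copy;
}

static int line_set_has(const struct line_set *set, const char *text)
{
	size_t i;

	for (i = 0; i < set->nr; i++)
		if (!strcmp(set->lines[i], text))
			return 1;
	return 0;
}

/* First pass: remember removed and added lines, produce no output. */
void collect_line(struct moved_state *st, const char *line)
{
	assert(st && line);
	if (line[0] == '-')
		line_set_add(&st->removed, line + 1);
	else if (line[0] == '+')
		line_set_add(&st->added, line + 1);
}

/* Second pass: decide how a line should be shown. */
enum line_kind classify_line(const struct moved_state *st, const char *line)
{
	assert(st && line);
	if (line[0] == '+')
		return line_set_has(&st->removed, line + 1)
			? LINE_MOVED_FROM_ELSEWHERE : LINE_ADDED;
	if (line[0] == '-')
		return line_set_has(&st->added, line + 1)
			? LINE_MOVED_TO_ELSEWHERE : LINE_REMOVED;
	return LINE_CONTEXT;
}

static void line_set_clear(struct line_set *set)
{
	size_t i;

	for (i = 0; i < set->nr; i++)
		free(set->lines[i]);
	free(set->lines);
	set->lines = NULL;
	set->nr = set->alloc = 0;
}

void moved_state_clear(struct moved_state *st)
{
	line_set_clear(&st->removed);
	line_set_clear(&st->added);
}
```

The caller would feed every line of every filepair to collect_line()
first, and only then run the output pass, asking classify_line() which
color to use for each line.  Because the sets are filled from the
whole patchset before any output is made, a line removed from one
file and added to another is recognized in both places.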
