On 19/09/2018 22:02, Roger Riggs wrote:
This came up in off-line discussions. It seems unlikely that two files
will differ only in the last of 100MB,
and it will require a separate code path that will very infrequently
be exercised. So I'd stick to a single
technique even if it is slightly slower for very large files to keep
the size of the code in check.
If it shows up later as a performance problem it can be added.
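
To illustrate what a single technique could look like, here is a minimal
sketch of one buffered comparison loop. It is not the code under review:
the class name BufferedMismatch, the 8K buffer size, and the convention of
returning -1 for identical contents are all assumptions made for the
example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration only, not the proposed implementation.
public class BufferedMismatch {
    // Assumed buffer size; any reasonable value works the same way.
    private static final int BUFFER_SIZE = 8192;

    /**
     * Returns the position of the first byte that differs, or -1 if both
     * files have identical contents. If one file is a prefix of the other,
     * the length of the shorter file is returned.
     */
    public static long mismatch(Path a, Path b) throws IOException {
        byte[] bufA = new byte[BUFFER_SIZE];
        byte[] bufB = new byte[BUFFER_SIZE];
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            long position = 0;
            while (true) {
                // readNBytes only returns fewer bytes than requested at end of stream.
                int nA = inA.readNBytes(bufA, 0, BUFFER_SIZE);
                int nB = inB.readNBytes(bufB, 0, BUFFER_SIZE);
                int i = Arrays.mismatch(bufA, 0, nA, bufB, 0, nB);
                if (i >= 0) {
                    return position + i;
                }
                if (nA < BUFFER_SIZE) {
                    // Both streams ended at the same point with no difference.
                    return -1;
                }
                position += nA;
            }
        }
    }
}
```

The same loop handles small and very large files, so there is only one
code path to test, at the cost of reading both files sequentially to the
end when they are identical.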
I think this will eventually have a different implementation for the
default file system where it can use memory mapping of the files. If
the first 1MB of large files are identical then it might switch
over to mapping by chunk to compare the rest of the file. Starting out
with a basic implementation is okay of course.
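
A rough sketch of the idea, assuming a hypothetical class MappedMismatch,
a 1MB chunk size, and -1 as the result for identical files: the first
chunk is compared with ordinary reads, and only if it matches is the rest
compared by mapping one chunk at a time. This is only an illustration of
the approach, not a proposal for the actual implementation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; names and sizes are assumptions.
public class MappedMismatch {
    // Assumed threshold and chunk size (1MB).
    private static final long CHUNK = 1L << 20;

    /** Returns the position of the first differing byte, or -1 if identical. */
    public static long mismatch(Path a, Path b) throws IOException {
        try (FileChannel chA = FileChannel.open(a, StandardOpenOption.READ);
             FileChannel chB = FileChannel.open(b, StandardOpenOption.READ)) {
            long sizeA = chA.size();
            long sizeB = chB.size();
            long common = Math.min(sizeA, sizeB);

            // Phase 1: compare the first chunk using plain reads.
            int head = (int) Math.min(common, CHUNK);
            int i = readFully(chA, head).mismatch(readFully(chB, head));
            if (i >= 0) {
                return i;
            }

            // Phase 2: the heads match, so map the remainder chunk by chunk.
            for (long pos = head; pos < common; pos += CHUNK) {
                long len = Math.min(CHUNK, common - pos);
                ByteBuffer mA = chA.map(MapMode.READ_ONLY, pos, len);
                ByteBuffer mB = chB.map(MapMode.READ_ONLY, pos, len);
                int j = mA.mismatch(mB);
                if (j >= 0) {
                    return pos + j;
                }
            }
            // No difference in the common part: a shorter file is a prefix.
            return sizeA == sizeB ? -1 : common;
        }
    }

    // Reads up to len bytes from the start of the channel into a heap buffer.
    private static ByteBuffer readFully(FileChannel ch, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        long pos = 0;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos);
            if (n < 0) {
                break; // file was truncated while reading
            }
            pos += n;
        }
        return buf.flip();
    }
}
```

Files that differ early never pay the cost of setting up a mapping, which
is the main reason for not mapping from the start.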
-Alan.