Hi Alan,
On 9/19/18 4:14 PM, Alan Bateman wrote:
On 19/09/2018 19:48, Joe Wang wrote:
Hi,
After much discussion and 10 iterations of reviews, this proposal has
evolved from what was the original isSameContent method to a mismatch
method. API-wise, a compare method was also considered as it looked
like just a short step forward from mismatch, however, it was
eventually dropped since there is no convincing use case comparing
files lexicographically by contents. Impl-wise, extensive performance
benchmarking has been done to compare a buffered reading vs memory
mapping, the result was that a simple buffered reading performed
better among small files, and those with the mismatched byte closer
to the beginning of files. Since the proposed method's targeted files
are small ones, the impl currently does a buffered reading only.
Please review.
JBS: https://bugs.openjdk.java.net/browse/JDK-8202285
specdiff:
http://cr.openjdk.java.net/~joehw/jdk12/8202285/specdiff/java/nio/file/Files.html
webrev: http://cr.openjdk.java.net/~joehw/jdk12/8202285/webrev/
Starting out using buffered I/O is probably okay but I assume we will
want to change this in the future to having it use memory mapped I/O
beyond a certain threshold.
This came up in off-line discussions, it seems unlikely that two files
will differ only in the last of 100Mb
and it will require a separate code path that will very infrequently be
exercised. So I'd still to a single
technique even if it is slightly slower for very large files to keep the
size of the code in check.
If it shows up later as a performance problem it can be added.
$.02, Roger
Can you explain the use of toRealPath and comparing the names? That
shouldn't be needed. Also the catching of ProviderMismatchExcepiton
seems a bit strange too. Can you replace mismatchByAttrs with
isSameFile? You could call this from Files.mismatch and then use the
supporting implementation for the case that the files are not the same.
-Alan