Steven Barker added the comment:

Here's a patch against the default branch that fixes filecmp.cmp's behavior 
when "shallow" is True, including an update to the module's docs (correcting 
the previous ambiguity discussed in the 2011 python-dev thread mentioned by 
Sandro Tosi) and a couple of new tests of files where shallow comparisons 
should be expected get the answer wrong.

The changed behavior is very simple. The lines from Lib/filecmp.py in the 
original code:

    if shallow and s1 == s2:
        return True

are changed to:

    if shallow:
        return s1 == s2

This presumes answers to the questions asked by Georg Brandl back in 2005 of 
"No, shallow comparisons never read the file contents" and "No, non-shallow 
comparisons don't care about mtimes (except for avoiding out of date cache 
entries)".

If we only applied the test changes (not the behavior change) from my patch, 
one of the new tests would fail, as the current filecmp.cmp code never gives 
false negatives on comparisons in shallow mode (only false positives).

We probably don't want to commit the patch exactly as it stands, because the 
changed behavior causes several of the previously existing tests to fail. 
Almost all the dircmp tests fail for me, and the "test_matching" filecmp.cmp 
test does so intermittently (due to my system sometimes being fast enough at 
creating the files to beat my filesystem's mtime resolution).

The test failures are all because of shallow comparisons between files with the 
same contents, but (slightly) different mtimes. The new shallow comparison 
behavior is to say those files are unequal while the old code would fall back 
on testing the file contents despite the shallow parameter, and so it would 
return True. The failing tests can probably be updated to either explicitly set 
the file mtimes with sys.utime (as I do with the files used in the new tests), 
or we could rewrite some of the setup code to use something like shutil.copy2 
to make copies of files while preserving their mtimes.

I'm not very familiar with the best practices when it comes to writing unit 
tests, so I pretty much copied the style of the existing tests in 
Lib/test/test_filecmp.py (though I've split mine up a bit more).

I understand from reading other filecmp bug reports that that test style is 
considered to be pretty poor, so there's probably room to improve my code as 
well. I thought I'd post this patch before making an effort at fixing the older 
tests, in order to get some feedback. I'll be happy to incorporate any 
suggested improvements!

This issue has been open for almost 9 years. Hopefully my patch can help get it 
moving again towards being fixed!

----------
keywords: +patch
versions: +Python 3.5
Added file: http://bugs.python.org/file35159/filecmp_real_shallow.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1234674>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to