[issue1234674] filecmp.cmp's "shallow" option

2014-08-13 Thread Berker Peksag

Changes by Berker Peksag :


--
stage: test needed -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2014-08-03 Thread Steven Barker

Changes by Steven Barker :


Added file: http://bugs.python.org/file36239/filecmp_behavior_and_doc_fix.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2014-08-03 Thread Steven Barker

Steven Barker added the comment:

I've worked on this filecmp issue some more, and I have some new patches.

First up is a patch that only modifies the tests. It has one test that fails 
without the behavior patch. The test patch also modifies some other tests so 
that they will work after the behavior patch is applied. Notably, test_keyword 
had a some tests that would fail due to false negatives on shallow comparisons.

The second patch is the behavior and documentation changes required to actually 
fix this issue. This should be very simple to understand (the behavior change 
is only two lines of code, quoted in the discussion above). If you apply only 
this patch, you'll get several test failures, all due to false negative shallow 
comparisons (where two files have the same contents, but their stat signatures 
differ).

With these new patches, I think this issue is ready for a review, and 
eventually to be committed. The behavior change is simple and I think, 
obviously correct (assuming we want to risk breaking backwards compatibility). 
Perhaps my test code can be improved, but I don't think it's too bad.

So, the main question is "will too much outside code break if we make this 
behavior change?"

I don't think filecmp is used very widely, but as was demonstrated by the 
standard library's only use of filecmp (in Lib/test/test_keyword.py), it's 
likely that a lot of users perform shallow comparisons (by default) where they 
really don't want to get false-negatives. If we decide that the changed 
behavior is too big of a break of backwards compatibility, we just need to 
document the current behavior better (at a minimum, the docstring for 
filecmp.cmp must be corrected).

--
Added file: http://bugs.python.org/file36238/filecmp_test_patch.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2014-07-27 Thread Steven Barker

Steven Barker added the comment:

I think that your test patch misses the confusing/possibly wrong case. That 
case is when two files have the same contents, but different mtimes. If you 
attempt a shallow comparison, you'll actually get a deep comparison (reading 
the whole files) and a result of True rather than the expected (though 
incorrect) False.

Try the test part of my patch (without the behavior change), and you'll see the 
failure of "test_shallow_false_negative". In your first assertion (that 
"filecmp.cmp(self.name, self.name_uppercase)" is False), you get the expected 
result, but for the wrong reason (you get it because the file contents differ, 
not because they have different mtimes).

Now, it might be that the "bad" case is actually working as we want it to (or 
at least, the behavior is established enough that we don't want to change it, 
for risk of breaking running code). If so, we should instead change the 
documentation (and especially the docstring) to explicit state that even if you 
request a shallow comparison, you might get a deep comparison instead.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2014-07-26 Thread Andrew Kubera

Andrew Kubera added the comment:

Attached is a couple extra tests which run filecmp on two files with different 
content but the same length and relevant stat() info. This appears to 
successfully check if the shallow options works correctly. 

It uses time.sleep(1) to ensure the files start out with different creation 
times, checks that filecmp returns false, then sets the creation time with 
os.utime to ensure filecmp is true.

These tests currently run successfully on 3.5, so if there was an issue, it has 
been resolved.

--
nosy: +Andrew.Kubera
versions:  -Python 2.7, Python 3.2, Python 3.3, Python 3.4
Added file: 
http://bugs.python.org/file36123/filecmp_extra_shallow_file_check.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2014-05-05 Thread Steven Barker

Steven Barker added the comment:

Here's a patch against the default branch that fixes filecmp.cmp's behavior 
when "shallow" is True, including an update to the module's docs (correcting 
the previous ambiguity discussed in the 2011 python-dev thread mentioned by 
Sandro Tosi) and a couple of new tests of files where shallow comparisons 
should be expected get the answer wrong.

The changed behavior is very simple. The lines from Lib/filecmp.py in the 
original code:

if shallow and s1 == s2:
return True

are changed to:

if shallow:
return s1 == s2

This presumes answers to the questions asked by Georg Brandl back in 2005 of 
"No, shallow comparisons never read the file contents" and "No, non-shallow 
comparisons don't care about mtimes (except for avoiding out of date cache 
entries)".

If we only applied the test changes (not the behavior change) from my patch, 
one of the new tests would fail, as the current filecmp.cmp code never gives 
false negatives on comparisons in shallow mode (only false positives).

We probably don't want to commit the patch exactly as it stands, because the 
changed behavior causes several of the previously existing tests to fail. 
Almost all the dircmp tests fail for me, and the "test_matching" filecmp.cmp 
test does so intermittently (due to my system sometimes being fast enough at 
creating the files to beat my filesystem's mtime resolution).

The test failures are all because of shallow comparisons between files with the 
same contents, but (slightly) different mtimes. The new shallow comparison 
behavior is to say those files are unequal while the old code would fall back 
on testing the file contents despite the shallow parameter, and so it would 
return True. The failing tests can probably be updated to either explicitly set 
the file mtimes with sys.utime (as I do with the files used in the new tests), 
or we could rewrite some of the setup code to use something like shutil.copy2 
to make copies of files while preserving their mtimes.

I'm not very familiar with the best practices when it comes to writing unit 
tests, so I pretty much copied the style of the existing tests in 
Lib/test/test_filecmp.py (though I've split mine up a bit more).

I understand from reading other filecmp bug reports that that test style is 
considered to be pretty poor, so there's probably room to improve my code as 
well. I thought I'd post this patch before making an effort at fixing the older 
tests, in order to get some feedback. I'll be happy to incorporate any 
suggested improvements!

This issue has been open for almost 9 years. Hopefully my patch can help get it 
moving again towards being fixed!

--
keywords: +patch
versions: +Python 3.5
Added file: http://bugs.python.org/file35159/filecmp_real_shallow.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2014-04-21 Thread Steven Barker

Steven Barker added the comment:

A recent Stack Overflow question (http://stackoverflow.com/q/23192359/1405065) 
relates to this bug. The questioner was surprised that filecmp.cmp is much 
slower than usual for certain large files, despite the "shallow" parameter 
being True.

It is pretty clear to me that the behavior of filecmp.cmp does not match its 
docstring with respect to "shallow". Either the docstring should be updated to 
match the existing behavior, or (more likely?) the behavior should change to 
match the docs.

--
components: +Library (Lib) -Extension Modules
nosy: +Steven.Barker
versions: +Python 3.4

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2012-07-27 Thread Chris Jerdonek

Changes by Chris Jerdonek :


--
nosy: +cjerdonek

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2011-10-31 Thread Sandro Tosi

Sandro Tosi  added the comment:

Hello,
we recently received a this message on docs@ : 
http://mail.python.org/pipermail/docs/2011-October/006121.html that's actually 
related to this issue: how can we move it forward?

--
nosy: +sandro.tosi
versions: +Python 3.3 -Python 3.1

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2010-08-21 Thread Mark Lawrence

Changes by Mark Lawrence :


--
versions: +Python 2.7, Python 3.1, Python 3.2 -Python 2.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1234674] filecmp.cmp's "shallow" option

2009-02-15 Thread Daniel Diniz

Changes by Daniel Diniz :


--
stage:  -> test needed
type:  -> behavior
versions: +Python 2.6 -Python 2.4

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com