[issue41354] filecmp.cmp documentation does not match actual code

2021-12-15 Thread Roundup Robot


Change by Roundup Robot :


--
nosy: +python-dev
nosy_count: 4.0 -> 5.0
pull_requests: +28340
pull_request: https://github.com/python/cpython/pull/30120

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41354] filecmp.cmp documentation does not match actual code

2021-08-04 Thread Łukasz Langa

Łukasz Langa  added the comment:

Closed in favor of BPO-42958. Thanks for your report, Christof!

--
nosy: +lukasz.langa
resolution:  -> out of date
stage: patch review -> resolved
status: open -> closed
superseder:  -> filecmp.cmp(shallow=True) isn't actually shallow when only 
mtime differs




[issue41354] filecmp.cmp documentation does not match actual code

2021-07-12 Thread Christof Hanke


Christof Hanke  added the comment:

Andrei,

See https://bugs.python.org/issue42958

Someone else stumbled over this topic.

Maybe you can merge these two requests?

Otherwise, I'm fine with a new arg.

Christof

--




[issue41354] filecmp.cmp documentation does not match actual code

2021-07-11 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

Christof: I've left one comment on the PR.

Generally, your reasoning (on why an unexpected deep cmp is not ideal) makes sense 
to me.

However, this is a backwards-incompatible change that will probably affect a 
large number of scripts that quietly run in the background and might 
silently break (i.e. just by affecting different sets of files, with no 
reported errors or warnings) with this change.

Can you find other instances where people complained about this behavior e.g. 
here on BPO or SO?

What do you think about making this change while preserving backwards 
compatibility e.g. via a new arg?
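
A minimal sketch of what such an opt-in argument could look like, written as a
wrapper around the existing function. The names cmp_compat, stat_signature and
strict_shallow are hypothetical and are not part of filecmp. The signature tuple
(file type, size, mtime) mirrors what the private helper _sig returns.

```python
import filecmp
import os
import stat


def stat_signature(path):
    """Return (file type, size, mtime), the fields filecmp compares."""
    st = os.stat(path)
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


def cmp_compat(f1, f2, shallow=True, *, strict_shallow=False):
    """Hypothetical wrapper; strict_shallow is NOT a real filecmp argument."""
    if shallow and strict_shallow:
        s1, s2 = stat_signature(f1), stat_signature(f2)
        # Only regular files can compare equal, as in filecmp.cmp.
        if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
            return False
        # Never open the files: the signatures alone decide the result.
        return s1 == s2
    # Default path: unchanged standard-library behaviour.
    return filecmp.cmp(f1, f2, shallow=shallow)
```

With the default strict_shallow=False, existing callers see no change. Only
callers that pass strict_shallow=True get a comparison that never reads file
contents.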

--
nosy: +andrei.avk




[issue41354] filecmp.cmp documentation does not match actual code

2020-12-27 Thread Christof Hanke


Christof Hanke  added the comment:

I understand that you are reluctant to change existing code.
But for me as a sysadmin, the current behavior doesn't make sense for two 
reasons:

* st.st_size is part of _sig. Why would you do a deep compare if the two 
files have a different length?

* When comparing thousands of files, a proper shallow-only compare is required, 
since it takes a long time to compare large files (especially when they are 
migrated to a tape backend), so a silent fallback to a deep compare is not good.

--




[issue41354] filecmp.cmp documentation does not match actual code

2020-12-16 Thread Scott


Scott  added the comment:

I suggest changing the documentation rather than the code.  The mix-up is in 
the wording.

Documentation below
"If shallow is true, files with identical os.stat() signatures are taken to be 
equal. Otherwise, the contents of the files are compared."

The "Otherwise" appears to be referencing the "If shallow is true" clause, when 
it should be referring to the equality of the _sig()s.

Proposed Change 
"If shallow is true, files with identical os.stat() signatures are taken to be 
equal. If they are not equal, the contents of the files are compared."
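
A small sketch that illustrates the behaviour this wording describes, assuming
two temporary files with identical contents whose modification times are forced
apart with os.utime. The file names and timestamps are arbitrary choices for
the example.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")
    b = os.path.join(tmp, "b.txt")
    for path in (a, b):
        with open(path, "w") as fh:
            fh.write("same content\n")

    # Force different modification times so the os.stat() signatures differ.
    os.utime(a, (1_000_000_000, 1_000_000_000))
    os.utime(b, (1_000_000_100, 1_000_000_100))

    signatures_differ = os.stat(a).st_mtime != os.stat(b).st_mtime
    # shallow=True, but the signatures differ, so the contents get compared.
    shallow_result = filecmp.cmp(a, b, shallow=True)

print(signatures_differ, shallow_result)
```

The signatures differ, yet cmp still reports the files as equal, because it
falls back to comparing the contents when the sizes match.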

--
nosy: +FreeSandwiches




[issue41354] filecmp.cmp documentation does not match actual code

2020-07-21 Thread Christof Hanke


Change by Christof Hanke :


--
keywords: +patch
pull_requests: +20722
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/21580




[issue41354] filecmp.cmp documentation does not match actual code

2020-07-21 Thread Christof Hanke


New submission from Christof Hanke :

help(filecmp.cmp) says:

"""
cmp(f1, f2, shallow=True)
    Compare two files.

    Arguments:

    f1 -- First file name

    f2 -- Second file name

    shallow -- Just check stat signature (do not read the files).
               defaults to True.

    Return value:

    True if the files are the same, False otherwise.

    This function uses a cache for past comparisons and the results,
    with cache entries invalidated if their stat information
    changes.  The cache may be cleared by calling clear_cache().
"""

However, looking at the code, the shallow argument is only taken into account 
if the signatures are the same:
"""
    s1 = _sig(os.stat(f1))
    s2 = _sig(os.stat(f2))
    if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
        return False
    if shallow and s1 == s2:
        return True
    if s1[1] != s2[1]:
        return False

    outcome = _cache.get((f1, f2, s1, s2))
    if outcome is None:
        outcome = _do_cmp(f1, f2)
        if len(_cache) > 100:  # limit the maximum size of the cache
            clear_cache()
        _cache[f1, f2, s1, s2] = outcome
    return outcome
"""

Therefore, if I call cmp with shallow=True and the stat-signatures differ, 
cmp actually does a "deep" compare.
This "deep" compare however does not check the stat-signatures.

Thus I propose the following patch:
cmp always checks the "full" signature and returns False if it differs;
it returns True if shallow is set and the above test passed.
It does not make sense to me that, when doing a "deep" compare, only the 
size is compared, but not the mtime.


--- filecmp.py.orig 2020-07-16 12:00:57.0 +0200
+++ filecmp.py  2020-07-16 12:00:30.0 +0200
@@ -52,10 +52,10 @@
     s2 = _sig(os.stat(f2))
     if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
         return False
-    if shallow and s1 == s2:
-        return True
-    if s1[1] != s2[1]:
+    if s1 != s2:
         return False
+    if shallow:
+        return True
 
     outcome = _cache.get((f1, f2, s1, s2))
     if outcome is None:

--
components: Library (Lib)
messages: 374054
nosy: chanke
priority: normal
severity: normal
status: open
title: filecmp.cmp documentation does not match actual code
type: behavior
