[issue46086] Add ratio_min() function to the difflib library

2022-01-20 Thread Tal Einat


Tal Einat  added the comment:

I'm closing this for now since nobody has followed up and to the best of my 
understanding this wouldn't be an appropriate addition to the stdlib. 

This can be re-opened in the future if needed, of course.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46086] Add ratio_min() function to the difflib library

2022-01-08 Thread Tal Einat


Tal Einat  added the comment:

Thanks for the suggestion and the PR, Giacomo!

However, in my opinion, this is better suited to be something like a cookbook 
recipe.  The number of use cases for this will be low, and there would be 
little advantage to having this in the stdlib rather than elsewhere.

--
nosy: +taleinat




[issue46086] Add ratio_min() function to the difflib library

2021-12-15 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
assignee:  -> tim.peters




[issue46086] Add ratio_min() function to the difflib library

2021-12-15 Thread Alex Waygood


Change by Alex Waygood :


--
nosy:  -AlexWaygood




[issue46086] Add ratio_min() function to the difflib library

2021-12-15 Thread Alex Waygood


Alex Waygood  added the comment:

I am removing 3.10 from the "versions" field, since additions to the standard 
library are only considered for unreleased versions of Python.

--
nosy: +AlexWaygood, tim.peters
versions:  -Python 3.10




[issue46086] Add ratio_min() function to the difflib library

2021-12-15 Thread Roundup Robot


Change by Roundup Robot :


--
keywords: +patch
nosy: +python-dev
nosy_count: 1.0 -> 2.0
pull_requests: +28344
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/30125




[issue46086] Add ratio_min() function to the difflib library

2021-12-15 Thread Giacomo


New submission from Giacomo :

I propose a new SequenceMatcher method, .ratio_min(self, m).

.ratio_min(self, m) extends difflib's .ratio(self). Like .ratio(), it
returns a measure of two sequences' similarity (a float in [0,1]). Unlike
.ratio(), it ignores matching blocks shorter than a given threshold m, which
is the method's second parameter.

This helps avoid spuriously high similarity scores that come from many
short, coincidental matches.

# NEW FUNCTION: 

def ratio_min(self, m):
    """Return a measure of the sequences' similarity (float in [0,1]).

    Where T is the total number of elements in both sequences, and
    M_min is the number of matched elements counting only matching
    blocks of length at least m, this is 2.0*M_min / T.
    Note that this is 1 if the sequences are identical (and m does not
    exceed their length), and 0 if they have no substring of length m
    or more in common.
    .ratio_min() is similar to .ratio().
    .ratio_min(1) is equivalent to .ratio().

    >>> s = SequenceMatcher(None, "abcd", "bcde")
    >>> s.ratio_min(1)
    0.75
    >>> s.ratio_min(2)
    0.75
    >>> s.ratio_min(3)
    0.75
    >>> s.ratio_min(4)
    0.0
    """

    # Count only the elements in matching blocks of at least m elements.
    matches = sum(triple[-1] for triple in self.get_matching_blocks()
                  if triple[-1] >= m)
    return _calculate_ratio(matches, len(self.a) + len(self.b))
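
The method above calls difflib's private _calculate_ratio helper, so it only
works inside the module. As a rough sketch of the same idea outside the
stdlib, the standalone function below uses only the public SequenceMatcher
API. Its name, signature, and the isjunk parameter are assumptions made for
illustration and are not part of the proposed patch.

```python
from difflib import SequenceMatcher


def ratio_min(a, b, m, isjunk=None):
    """Similarity in [0, 1], counting only matching blocks of size >= m.

    Hypothetical standalone variant of the proposed method.
    """
    matcher = SequenceMatcher(isjunk, a, b)
    # get_matching_blocks() ends with a zero-size sentinel block, which
    # contributes nothing to the sum whether or not the filter keeps it.
    matched = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= m
    )
    total = len(a) + len(b)
    # Follow difflib's convention: two empty sequences count as identical.
    return 2.0 * matched / total if total else 1.0


if __name__ == "__main__":
    print(ratio_min("abcd", "bcde", 3))    # 0.75 (one block, "bcd")
    print(ratio_min("abcd", "bcde", 4))    # 0.0
    # Two short blocks ("ab" and "cd") give a high plain ratio ...
    print(ratio_min("abxcd", "abycd", 1))  # 0.8
    # ... which drops to zero once blocks shorter than 3 are ignored.
    print(ratio_min("abxcd", "abycd", 3))  # 0.0
```

The last two calls show the kind of score the threshold is meant to suppress:
the strings share only two-character fragments, yet plain .ratio() reports
0.8.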

--
components: Library (Lib)
messages: 408622
nosy: gibu
priority: normal
severity: normal
status: open
title: Add ratio_min() function to the difflib library
type: enhancement
versions: Python 3.10, Python 3.11
