jenkins-bot has submitted this change. ( https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1062691?usp=email )

Change subject: diff: Add get_close_matches_ratio() function
......................................................................

diff: Add get_close_matches_ratio() function

get_close_matches_ratio is similar to difflib.get_close_matches but
has an ignorecase parameter and also returns the ratio.

Change-Id: I6c6e2357f286667a96c6cc3e24d49a8b8bbe21c6
---
M docs/licenses.rst
M pywikibot/diff.py
2 files changed, 81 insertions(+), 6 deletions(-)

Approvals:
  jenkins-bot: Verified
  Xqt: Looks good to me, approved




diff --git a/docs/licenses.rst b/docs/licenses.rst
index 7196e1f..ad8f9c5 100644
--- a/docs/licenses.rst
+++ b/docs/licenses.rst
@@ -8,10 +8,11 @@
 :ref:`MIT license`; translations by translators and manual pages on
 mediawiki.org are available under the `CC-BY-SA 3.0`_ license. The
 Pywikibot logo is Public domain but it includes material that may be
-protected as a trademark. Parts of :mod:`memento<data.memento>`
-module is licenced under the `BSD`_ open source software license. You
-may obtain a copy of the License at
-http://mementoweb.github.io/SiteStory/license.html.
+protected as a trademark. Parts of :mod:`memento<data.memento>` module
+is licenced under the `BSD`_ open source software license. The
+:func:`get_close_matches_ratio()<diff.get_close_matches_ratio>` function
+of the :mod:`diff` module incorporates code from Python software which
+is licenced under the `PSF`_ license.


 MIT License
@@ -22,4 +23,5 @@


 .. _CC-BY-SA 3.0: https://creativecommons.org/licenses/by-sa/3.0/
-.. _BSD: https://github.com/mementoweb/py-memento-client/blob/master/LICENSE.txt
+.. _BSD: http://mementoweb.github.io/SiteStory/license.html
+.. _PSF: https://docs.python.org/3/license.html#psf-license-agreement-for-python-release
diff --git a/pywikibot/diff.py b/pywikibot/diff.py
index fe9091c..3963171 100644
--- a/pywikibot/diff.py
+++ b/pywikibot/diff.py
@@ -9,7 +9,8 @@
 import difflib
 import math
 from collections import abc
-from difflib import _format_range_unified  # type: ignore[attr-defined]
+from difflib import _format_range_unified, SequenceMatcher
+from heapq import nlargest
 from itertools import zip_longest

 import pywikibot
@@ -17,6 +18,14 @@
 from pywikibot.tools import chars


+__all__ = [
+    'Hunk',
+    'PatchManager', 'cherry_pick',
+    'get_close_matches_ratio',
+    'html_comparator',
+]
+
+
 class Hunk:

     """One change hunk between a and b.
@@ -613,3 +622,67 @@
             cruton_string = ''.join(cruton.strings)
             comparands[change_type].append(cruton_string)
     return comparands
+
+
+def get_close_matches_ratio(word: Sequence,
+                            possibilities: list[Sequence],
+                            *,
+                            n: int = 3,
+                            cutoff: float = 0.6,
+                            ignorecase: bool = False) -> list[tuple[float, Sequence]]:
+    """Return a list of the best “good enough” matches and their ratios.
+
+    This function is similar to Python's :pylib:`difflib.get_close_matches()
+    <difflib#difflib.get_close_matches>` but also returns the ratio and
+    has an *ignorecase* parameter for case-insensitive comparison.
+
+    SequenceMatcher is used to return a list of the best "good enough"
+    matches together with their ratio. The ratio is computed by the
+    :wiki:`Gestalt pattern matching` algorithm. The best (no more than
+    *n*) matches among the *possibilities* with their ratio are returned
+    in a list, sorted by similarity score, most similar first.
+
+    >>> get_close_matches_ratio('appel', ['ape', 'apple', 'peach', 'puppy'])
+    [(0.8, 'apple'), (0.75, 'ape')]
+    >>> p = possibilities = ['Python', 'Wikipedia', 'Robot', 'Framework']
+    >>> get_close_matches_ratio('Pywikibot', possibilities, n=2, cutoff=0)
+    [(0.42857142857142855, 'Robot'), (0.4, 'Python')]
+    >>> get_close_matches_ratio('Pywikibot', p, n=2, cutoff=0, ignorecase=True)
+    [(0.4444444444444444, 'Wikipedia'), (0.42857142857142855, 'Robot')]
+
+    .. versionadded:: 9.4
+    .. note:: Most code is incorporated from Python software under the
+       `PSF`_ license.
+
+    :param word: a sequence for which close matches are desired
+        (typically a string)
+    :param possibilities: a list of sequences against which to match
+        *word* (typically a list of strings)
+    :param n: optional arg (default 3) which is the maximum number of
+        close matches to return. *n* must be :code:`> 0`.
+    :param cutoff: optional arg (default 0.6) is a float in :code:`[0, 1]`.
+        *possibilities* that don't score at least that similar to *word*
+        are ignored.
+    :param ignorecase: if false (default), compare case-sensitively
+    :raises ValueError: invalid value for *n* or *cutoff*
+
+    .. _PSF:
+       https://docs.python.org/3/license.html#psf-license-agreement-for-python-release
+    """
+    if n <= 0:
+        raise ValueError(f'n must be > 0: {n!r}')
+    if not 0.0 <= cutoff <= 1.0:
+        raise ValueError(f'cutoff must be in [0.0, 1.0]: {cutoff!r}')
+
+    result = []
+    s = SequenceMatcher()
+    s.set_seq2(word.lower() if ignorecase else word)
+    for x in possibilities:
+        s.set_seq1(x.lower() if ignorecase else x)
+        if s.real_quick_ratio() >= cutoff and \
+           s.quick_ratio() >= cutoff and \
+           s.ratio() >= cutoff:
+            result.append((s.ratio(), x))
+
+    # Move the best scorers to head of list
+    return nlargest(n, result)
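
For readers who want to try the matching logic outside of Pywikibot, the
following is a minimal standalone sketch of the approach described in the
docstring above. The name close_matches_with_ratio is hypothetical and is
not part of Pywikibot; the sketch only uses difflib.SequenceMatcher and
heapq.nlargest from the standard library, and it rejects n <= 0, in line
with the documented requirement that n be > 0.

```python
from difflib import SequenceMatcher
from heapq import nlargest


def close_matches_with_ratio(word, possibilities, *, n=3, cutoff=0.6,
                             ignorecase=False):
    """Hypothetical stand-in illustrating the technique in this change."""
    if n <= 0:
        raise ValueError(f'n must be > 0: {n!r}')
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError(f'cutoff must be in [0.0, 1.0]: {cutoff!r}')

    matcher = SequenceMatcher()
    # seq2 is the fixed side; SequenceMatcher caches its analysis of seq2
    matcher.set_seq2(word.lower() if ignorecase else word)

    scored = []
    for candidate in possibilities:
        matcher.set_seq1(candidate.lower() if ignorecase else candidate)
        # The two quick ratios are cheap upper bounds on ratio(), so they
        # can rule out poor candidates before the full comparison runs.
        if (matcher.real_quick_ratio() >= cutoff
                and matcher.quick_ratio() >= cutoff):
            ratio = matcher.ratio()
            if ratio >= cutoff:
                # Keep the original (not lowercased) candidate in the result
                scored.append((ratio, candidate))

    # Highest ratio first, at most n entries
    return nlargest(n, scored)


print(close_matches_with_ratio('appel', ['ape', 'apple', 'peach', 'puppy']))
# prints [(0.8, 'apple'), (0.75, 'ape')]
```

Because the result entries are (ratio, candidate) tuples, nlargest orders
them by ratio first and breaks ties by comparing the candidates themselves,
so this sketch assumes the candidates are mutually comparable (for example,
all strings).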

--
To view, visit 
https://gerrit.wikimedia.org/r/c/pywikibot/core/+/1062691?usp=email
To unsubscribe, or for help writing mail filters, visit 
https://gerrit.wikimedia.org/r/settings?usp=email

Gerrit-MessageType: merged
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Change-Id: I6c6e2357f286667a96c6cc3e24d49a8b8bbe21c6
Gerrit-Change-Number: 1062691
Gerrit-PatchSet: 3
Gerrit-Owner: Xqt <i...@gno.de>
Gerrit-Reviewer: Xqt <i...@gno.de>
Gerrit-Reviewer: jenkins-bot