[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-06-23 Thread Raymond Hettinger
Raymond Hettinger added the comment: I concur with the other respondents that this is best left to the application code. Thank you for the suggestion, but I'll mark this as closed. Don't be deterred from making other suggestions :-) -- resolution: -> rejected stage: patch review

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-06-21 Thread Rémi Lapeyre
Rémi Lapeyre added the comment: I feel like it's a bit weird to have a new function just for ignoring case. Will a new function be required for every possible normalization, like removing accents? One possible way to make the API handle those use cases would be to have a keyword argument for this:

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-04-08 Thread brian.gallagher
brian.gallagher added the comment: Just giving this a bump, in case it has been forgotten about. I've posted a patch at https://github.com/python/cpython/pull/18983. It adds a new parameter "ignorecase" to get_close_matches() that, if set to True, will result in the SequenceMatcher treating

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-13 Thread Raymond Hettinger
Change by Raymond Hettinger: -- nosy: +rhettinger

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-13 Thread Roundup Robot
Change by Roundup Robot: -- keywords: +patch nosy: +python-dev nosy_count: 3.0 -> 4.0 pull_requests: +18331 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/18983

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-08 Thread brian.gallagher
brian.gallagher added the comment: I agree that there is an appeal to leaving any normalization to the application and that trying to guess what people want is a tough hole -- I hadn't even considered what casing would mean in a general sense for Unicode. I'm not entirely convinced that this

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: It looks like Brian is expecting some kind of normalization of the strings before they enter the function, e.g. convert to lowercase, remove extra whitespace, convert diacritics to regular letters, combinations of such normalizations, etc. Since both
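
For illustration, the kind of application-side normalization described above could look roughly like the following. This is only a sketch: the helper names normalize() and close_matches_normalized() are hypothetical and not part of difflib, and the particular normalization steps (casefolding, stripping combining marks, collapsing whitespace) are assumptions about what an application might want.

```python
import difflib
import unicodedata


def normalize(text):
    """Hypothetical helper: casefold, drop diacritics, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


def close_matches_normalized(word, possibilities, n=3, cutoff=0.6):
    """Hypothetical wrapper: match on normalized text, return original strings."""
    # Several originals may share one normalized form, so keep them all.
    originals = {}
    for candidate in possibilities:
        originals.setdefault(normalize(candidate), []).append(candidate)

    hits = difflib.get_close_matches(
        normalize(word), list(originals), n=n, cutoff=cutoff
    )

    # Expand each normalized hit back to the original spelling(s).
    result = []
    for hit in hits:
        result.extend(originals[hit])
    return result[:n]


print(close_matches_normalized("apple", ["APPLE", "Banana"]))
```

Because the normalization lives in the caller, get_close_matches() itself stays unchanged, and a different application can swap in whatever notion of "equivalent" it needs.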

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-07 Thread Tim Peters
Tim Peters added the comment: If you pursue this, please introduce a new function for it. You immediately have an idea about how to change the current function precisely because it _doesn't_ try to guess what you really wanted. That lack of magic is valuable - you're not actually confused

[issue39891] [difflib] Improve get_close_matches() to better match when casing of words are different

2020-03-07 Thread brian.gallagher
New submission from brian.gallagher: Currently difflib's get_close_matches() doesn't match similar words that differ in their casing very well. Example: user@host:~$ python3 Python 3.6.9 (default, Nov 7 2019, 10:44:02) [GCC 8.3.0] on linux Type "help", "copyright", "credits" or "license"
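
The submitter's own example is cut off in this listing. As a stand-in, here is a small sketch of the case-sensitive behaviour being described; the words "apple" and "APPLE" are illustrative choices, not taken from the original report.

```python
import difflib

# SequenceMatcher compares characters exactly, so "a" and "A" never match.
ratio = difflib.SequenceMatcher(None, "apple", "APPLE").ratio()

# With the default cutoff of 0.6, only the same-case candidate is returned.
same_case = difflib.get_close_matches("apple", ["apple", "APPLE"])
other_case = difflib.get_close_matches("apple", ["APPLE"])

print(ratio)       # 0.0
print(same_case)   # ['apple']
print(other_case)  # []
```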