Hello,

I am having an odd problem with difflib.SequenceMatcher. Sample code is below.

The strings "src" and "trg" differ only slightly, yet SequenceMatcher.ratio() for them is 0.0. Many other similar strings work fine (see below) and give non-zero ratios that depend on how much the strings differ, as expected.

Tested on Python 2.7 on Ubuntu 14.04.

Program follows:

---
from difflib import SequenceMatcher as SM

src = u"N KPT T HS KMNST KNFKXNS AS H KLT FR 0 ALMNXN AF PRFT PRPRT AN RRL ARS T P RPLST P KMNS H ASTPLXT HS ANTSTRL KR0 PRKRM NN AS 0 KRT LP FRRT 0S PRKRM KLT FR 0 RPT TRNSFRMXN AF XN FRM AN AKRRN AKNM T A SSLST ANTSTRL SST"
trg = u"M KPT T HS KMNST KNFKXNS AS H KLT FR 0 ALMNXN AF PRFT PRPRT AN RRL ARS T P RPLST P KMNS H ASTPLXT HS ANTSTRL KR0 PRKRM NN AS 0 KRT LP FRRT 0S PRKRM KLT FR 0 RPT TRNSFRMXN AF XN FRM AN AKRRN AKNM T SSLST ANTSTRL SST"

print src, '\n', trg, '\n', SM(None, trg, src).ratio()
---

The following pair prints a ratio() of 0.989795918367, which seems about right.

---
src = u"M STNK AS AN AF 0 MST AMPRTNT LTRS TRNK 0 TNT0 SNTR HS MST PRMNNT AKMPLXMNT AS 0 ASTPLXMNT AF 0 PPLS RPPLK AF XN HS A0R AXFMNTS ANKLTT LTNK HS PPL AN 0 LNK MRX AFR FR 0SNT MLS T KP 0 KMNST MFMNT ALF"
trg = u"M STNK AS AN AF 0 MST AMPRTNT LTRS TRNK 0 0 SNTR HS MST PRMNNT AKMPLXMNT AS 0 ASTPLXMNT AF 0 PPLS RPPLK AF XN HS A0R AXFMNTS ANKLT LTNK HS PPL AN 0 LNK MRX AFR FR 0SNT MLS T KP 0 KMNST MFMNT ALF"

print src, '\n', trg, '\n', SM(None, trg, src).ratio()
---

What could be the cause? Is there something I am doing wrong?

Thanks in advance

--
Regards
Jay
--
https://mail.python.org/mailman/listinfo/python-list
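
A possible diagnostic, offered as a guess rather than a confirmed answer: SequenceMatcher has an automatic junk heuristic that applies only when the second sequence has 200 or more items, and then ignores any item that makes up more than roughly 1% of it. The failing pair is a little over 200 characters long and uses a very small alphabet, while the working pair is just under 200, so the heuristic may be discarding every character in the first case. The sketch below compares the default behaviour with autojunk=False, a constructor argument documented as new in Python 2.7.1. The names default_ratio and no_autojunk_ratio are illustrative only.

```python
from __future__ import print_function  # lets the sketch run on 2.7 and 3.x

from difflib import SequenceMatcher

# The failing pair from the post.
src = u"N KPT T HS KMNST KNFKXNS AS H KLT FR 0 ALMNXN AF PRFT PRPRT AN RRL ARS T P RPLST P KMNS H ASTPLXT HS ANTSTRL KR0 PRKRM NN AS 0 KRT LP FRRT 0S PRKRM KLT FR 0 RPT TRNSFRMXN AF XN FRM AN AKRRN AKNM T A SSLST ANTSTRL SST"
trg = u"M KPT T HS KMNST KNFKXNS AS H KLT FR 0 ALMNXN AF PRFT PRPRT AN RRL ARS T P RPLST P KMNS H ASTPLXT HS ANTSTRL KR0 PRKRM NN AS 0 KRT LP FRRT 0S PRKRM KLT FR 0 RPT TRNSFRMXN AF XN FRM AN AKRRN AKNM T SSLST ANTSTRL SST"

# The heuristic only applies when the second sequence has >= 200 items.
print(len(src), len(trg))

# Default settings (autojunk=True), same argument order as in the post.
default_ratio = SequenceMatcher(None, trg, src).ratio()

# Same comparison with the automatic junk heuristic switched off.
no_autojunk_ratio = SequenceMatcher(None, trg, src, autojunk=False).ratio()

print(default_ratio, no_autojunk_ratio)
```

If the second number comes out close to 1 while the first stays at 0.0, that would point to the heuristic as the cause, and passing autojunk=False would be a reasonable workaround for long strings built from few distinct characters.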