Hello

I am having an odd problem with difflib.SequenceMatcher. Sample code below:

The strings "src" and "trg" differ only slightly, yet
SequenceMatcher.ratio() for them is 0.0. Many other similar strings
work fine (see below), giving non-zero ratios that depend on how much
the strings differ, as expected.

Tested on Python 2.7 on Ubuntu 14.04

Program follows:
---
from difflib import SequenceMatcher as SM

# Long literals are split into adjacent pieces joined by a single space.
src = (u"N KPT T HS KMNST KNFKXNS AS H KLT FR 0 ALMNXN AF PRFT PRPRT AN "
       u"RRL ARS T P RPLST P KMNS H ASTPLXT HS ANTSTRL KR0 PRKRM NN AS 0 KRT LP "
       u"FRRT 0S PRKRM KLT FR 0 RPT TRNSFRMXN AF XN FRM AN AKRRN AKNM T A SSLST "
       u"ANTSTRL SST")
trg = (u"M KPT T HS KMNST KNFKXNS AS H KLT FR 0 ALMNXN AF PRFT PRPRT AN "
       u"RRL ARS T P RPLST P KMNS H ASTPLXT HS ANTSTRL KR0 PRKRM NN AS 0 KRT LP "
       u"FRRT 0S PRKRM KLT FR 0 RPT TRNSFRMXN AF XN FRM AN AKRRN AKNM T SSLST "
       u"ANTSTRL SST")
print src, '\n', trg, '\n', SM(None, trg, src).ratio()
---


The following pair of strings prints a ratio() of 0.989795918367,
which seems about right.
---
# Long literals are split into adjacent pieces joined by a single space.
src = (u"M STNK AS AN AF 0 MST AMPRTNT LTRS TRNK 0 TNT0 SNTR HS MST "
       u"PRMNNT AKMPLXMNT AS 0 ASTPLXMNT AF 0 PPLS RPPLK AF XN HS A0R AXFMNTS "
       u"ANKLTT LTNK HS PPL AN 0 LNK MRX AFR FR 0SNT MLS T KP 0 KMNST MFMNT "
       u"ALF")
trg = (u"M STNK AS AN AF 0 MST AMPRTNT LTRS TRNK 0 0 SNTR HS MST PRMNNT "
       u"AKMPLXMNT AS 0 ASTPLXMNT AF 0 PPLS RPPLK AF XN HS A0R AXFMNTS ANKLT "
       u"LTNK HS PPL AN 0 LNK MRX AFR FR 0SNT MLS T KP 0 KMNST MFMNT ALF")
print src, '\n', trg, '\n', SM(None, trg, src).ratio()
---

What could be the cause? Is there something I am doing wrong?
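
One possible cause worth ruling out (an assumption, not a confirmed
diagnosis): the difflib documentation describes an automatic junk
heuristic. When the second sequence has at least 200 items, any item
whose duplicates make up more than 1% of that sequence is treated as
junk. With a small alphabet that could cover every character. The
autojunk argument (documented as added in Python 2.7.1) turns the
heuristic off. The sketch below uses made-up strings (base, long_a and
long_b are hypothetical names, not the data above) to show the effect:

```python
from difflib import SequenceMatcher

# Hypothetical test data, not the strings from the post: a small alphabet
# repeated until the second sequence passes the 200-item threshold.
base = "AB CD EF GH "        # 12 characters
long_b = base * 20           # 240 characters, every character repeats often
long_a = "X" + long_b[1:]    # differs from long_b in the first character only

# Default behaviour: the automatic junk heuristic is active.
ratio_default = SequenceMatcher(None, long_a, long_b).ratio()

# Same comparison with the heuristic switched off.
ratio_no_autojunk = SequenceMatcher(None, long_a, long_b, autojunk=False).ratio()

print("default:        {0}".format(ratio_default))       # 0.0
print("autojunk=False: {0}".format(ratio_no_autojunk))   # close to 1.0
```

If the same thing is happening with the real data, then
SM(None, trg, src, autojunk=False).ratio() should give a non-zero
value for the first pair as well.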

Thanks in advance
-- 
Regards
Jay
-- 
https://mail.python.org/mailman/listinfo/python-list
