Thank you, Gabriel,
On Sun, Jan 25, 2009 at 7:12 AM, Gabriel Genellina
gagsl-...@yahoo.com.ar wrote:
On Sat, 24 Jan 2009 15:08:08 -0200, S.Selvam Siva s.selvams...@gmail.com
wrote:
I am developing a spell checker for my local language (Tamil) using Python.
I need to generate a list of alternative words for a misspelled word from the
dictionary of words. The alternatives must be as close as possible to the
misspelled word. As we know, ordinary string comparison won't work here.
Any suggestion for this problem is welcome.
I think it would be better to add Tamil support to some existing library like
GNU aspell: http://aspell.net/
That was my plan earlier, but I am not sure how aspell integrates with other
editors. I had better ask on the aspell mailing list.
You are looking for fuzzy matching:
http://en.wikipedia.org/wiki/Fuzzy_string_searching
In particular, the Levenshtein distance is widely used; I think there is a
Python extension providing those calculations.
--
Gabriel Genellina
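For a quick baseline before writing a custom distance function, the standard library's difflib module can already return a list of close matches. This is only a sketch: difflib ranks candidates with its own SequenceMatcher similarity ratio, not with the Levenshtein distance, and the helper name close_words and the English placeholder word list are assumptions made for illustration (a real run would load the Tamil dictionary instead).

```python
import difflib

def close_words(word, dictionary, n=5, cutoff=0.6):
    """Return up to n dictionary words similar to word, best match first.

    close_words is a hypothetical helper; it only wraps
    difflib.get_close_matches from the standard library.
    """
    return difflib.get_close_matches(word, dictionary, n=n, cutoff=cutoff)

# Placeholder word list, assumed for illustration only.
words = ["ape", "apple", "peach", "puppy"]
print(close_words("appel", words))
```

Lowering cutoff returns more (and weaker) candidates; raising it keeps only near matches.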
The following code served my purpose (thanks to some unknown contributors):
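import sys

def distance(a, b):
    # Levenshtein edit distance by dynamic programming:
    # c[i, j] is the distance between a[:i] and b[:j].
    c = {}
    n = len(a); m = len(b)
    for i in range(0, n+1):
        c[i, 0] = i
    for j in range(0, m+1):
        c[0, j] = j
    for i in range(1, n+1):
        for j in range(1, m+1):
            x = c[i-1, j] + 1        # deletion
            y = c[i, j-1] + 1        # insertion
            if a[i-1] == b[j-1]:
                z = c[i-1, j-1]      # characters match
            else:
                z = c[i-1, j-1] + 1  # substitution
            c[i, j] = min(x, y, z)
    return c[n, m]

a = sys.argv[1]
b = sys.argv[2]
d = distance(a, b)
print("d =", d)
# Note: longer is 0 if both arguments are empty strings.
longer = float(max(len(a), len(b)))
shorter = float(min(len(a), len(b)))
r = ((longer - d) / longer) * (shorter / longer)
# r ranges between 0 and 1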
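To get from a pairwise distance to the list of alternatives asked about at the start of the thread, one option is to rank every dictionary word by its distance to the misspelled word. The sketch below is one possible way to do that, not the code I actually ran: the names levenshtein and suggest and the English placeholder words are assumptions for illustration. Note also that Python 3 compares strings code point by code point, and a single Tamil syllable can span several code points, so the distance counts code points rather than visible characters.

```python
import heapq

def levenshtein(a, b):
    """Edit distance between a and b, keeping only two rows of the table."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]

def suggest(word, dictionary, limit=5):
    """Return up to limit dictionary words with the smallest distance to word.

    Ties are broken alphabetically so the result is deterministic.
    """
    return heapq.nsmallest(limit, dictionary,
                           key=lambda w: (levenshtein(word, w), w))

# Placeholder word list, assumed for illustration only.
words = ["apple", "ample", "maple", "angle", "banana"]
print(suggest("appel", words, limit=3))
```

This scans the whole dictionary for every query, which is fine for a few thousand words; a larger dictionary would probably need pruning (for example by word length) before computing distances.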
--
Yours,
S.Selvam