Re: [HACKERS] String Similarity

2006-09-27 Thread Pang Zaihu
Hello! Would you like to give me a simple introduction of Levenshtein distance function? Thank you! On 2006-05-19 19:54, Martijn van Oosterhout wrote: On Fri, May 19, 2006 at 04:00:48PM -0400, Mark Woodward wrote: (3) Is there also a desire for a Levenshtein distance function for text

Re: [HACKERS] String Similarity

2006-09-27 Thread tomas
On Tue, Sep 26, 2006 at 09:09:33AM +0800, Pang Zaihu wrote: Hello! Would you like to give me a simple introduction of Levenshtein distance function? Better than I could explain: http://en.wikipedia.org/wiki/Levenshtein_distance Thank you!
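
As a short complement to the Wikipedia link, here is a minimal sketch of the Levenshtein distance: the smallest number of single-character insertions, deletions, or substitutions needed to turn one string into the other. It is an illustration only, not the code from contrib/fuzzystrmatch; the function name `levenshtein` and the two-row layout are choices made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory

    # previous[j] is the distance between the empty string and b[:j].
    previous = list(range(len(b) + 1))

    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb, or match
            ))
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

The running time is proportional to len(a) * len(b), and only two rows of the table are kept in memory at a time.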

Re: [HACKERS] String Similarity

2006-05-22 Thread Mark Woodward
Try contrib/pg_trgm... Tri-graphs are interesting, and I'll try to reconsider whether they fit or not, but I suspect that they do not. (You are the second to recommend it) Anything based on a word parser is probably not appropriate, the example I first gave is a little misleading in that it is not

Re: [HACKERS] String Similarity

2006-05-21 Thread Christopher Kings-Lynne
Try contrib/pg_trgm... Chris Mark Woodward wrote: I have a side project that needs to intelligently know if two strings are contextually similar. Think about how CDDB information is collected and sorted. It isn't perfect, but there should be enough information to be usable. Think about this:

Re: [HACKERS] String Similarity

2006-05-20 Thread Mark Woodward
What I was hoping someone had was a function that could find the substring runs in something less than a strlen1*strlen2 number of operations and a numerically sane way of representing the similarity or difference. Actually, it is more like strlen1*strlen2*N, where N is the number of valid
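
To make the strlen1*strlen2 figure concrete, here is a hedged sketch of the usual quadratic baseline: a dynamic-programming search for the longest run of characters shared by two strings. This is not the faster method being asked for, and it is not taken from the thread; the name `longest_common_run` is invented for this illustration.

```python
def longest_common_run(s1: str, s2: str) -> str:
    """Return the longest contiguous substring present in both s1 and s2."""
    best_len = 0
    best_end = 0  # exclusive end index of the best run within s1

    # previous[j] is the length of the common run ending at s1[i-2], s2[j-1].
    previous = [0] * (len(s2) + 1)

    for i in range(1, len(s1) + 1):
        current = [0] * (len(s2) + 1)
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                # Extend the run that ended one character earlier in both strings.
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len = current[j]
                    best_end = i
        previous = current

    return s1[best_end - best_len:best_end]
```

Every pair of positions is visited once, which is where the strlen1*strlen2 cost comes from. Turning the length of the run into a similarity score, for example by dividing by the longer string's length, is a separate design decision that this sketch leaves open.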

Re: [HACKERS] String Similarity

2006-05-19 Thread Martijn van Oosterhout
On Fri, May 19, 2006 at 04:00:48PM -0400, Mark Woodward wrote: (3) Is there also a desire for a Levenshtein distance function for text and varchars? I experimented with it, and was forced to write the function in item #1. Postgres already has a Levenshtein distance function, see fuzzystrmatch

Re: [HACKERS] String Similarity

2006-05-19 Thread Andrew Dunstan
Mark Woodward wrote: (3) Is there also a desire for a Levenshtein distance function for text and varchars? I experimented with it, and was forced to write the function in item #1. fuzzystrmatch in contrib already has a Levenshtein function. cheers andrew

Re: [HACKERS] String Similarity

2006-05-19 Thread Mark Dilger
Mark Woodward wrote: I have a side project that needs to intelligently know if two strings are contextually similar. Think about how CDDB information is collected and sorted. It isn't perfect, but there should be enough information to be usable. Think about this: pink floyd - dark side

Re: [HACKERS] String Similarity

2006-05-19 Thread Mark Woodward
Mark Woodward wrote: I have a side project that needs to intelligently know if two strings are contextually similar. Think about how CDDB information is collected and sorted. It isn't perfect, but there should be enough information to be usable. Think about this: pink floyd - dark side

Re: [HACKERS] String Similarity

2006-05-19 Thread Greg Sabino Mullane
I have a side project that needs to intelligently know if two strings are contextually similar. The examples you gave seem heavy on word order and whitespace consideration, before applying any algorithms. Here's a quick perl version that does the

Re: [HACKERS] String Similarity

2006-05-19 Thread Josh Berkus
I have a side project that needs to intelligently know if two strings are contextually similar. Also check out the fuzzystrmatch module in /contrib, which offers soundex, metaphone and levenshtein functions. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco

Re: [HACKERS] String Similarity

2006-05-19 Thread Mark Woodward
I have a side project that needs to intelligently know if two strings are contextually similar. The examples you gave seem heavy on word order and whitespace consideration, before applying any algorithms. Here's a quick perl version that

Re: [HACKERS] String Similarity

2006-05-19 Thread Oleg Bartunov
Get pg_trgm http://www.sai.msu.su/~megera/oddmuse/index.cgi/ReadmeTrgm It doesn't depend on language. Oleg On Fri, 19 May 2006, Mark Woodward wrote: I have a side project that needs to intelligently know if two strings are contextually similar. Think about how CDDB information is collected
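
For readers unfamiliar with the trigram idea behind the pg_trgm suggestion, here is a rough sketch of it: break each string into overlapping three-character groups and compare the resulting sets. The lowercasing, the two-space prefix and one-space suffix on each word, and the shared-over-union score are assumptions modelled on how pg_trgm is commonly described; this is not the module's code, and the names `trigrams` and `trigram_similarity` are invented for this example.

```python
import re


def trigrams(text: str) -> set:
    """Collect the set of three-character groups from each word in text."""
    grams = set()
    # Assumption: words are runs of letters and digits, compared case-insensitively.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # assumed padding: two spaces before, one after
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(trigram_similarity("pink floyd - dark side", "Dark Side, Pink Floyd"))
```

Because the comparison works on sets of character groups rather than on a dictionary or stemmer, it does not depend on the language of the text, and in this sketch it ignores word order, case, and punctuation.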