Re: Searching "teh" or "tihs"
1. The algorithm I implemented is for "fuzzy search" of written/typed words, not for "similar-sounding words" (soundex); the two are mostly quite different. My demo is scripted to look up a (mistyped) search string in the 3233 keywords of LCScript.

2. For me, that's the true value of LiveCode: don't talk about possible development, just do it. Then within a few hours you have a solution that works on Mac/Win/Linux, using LC 6/7/8/9, and is often fast enough even for a Raspberry Pi 2/3, independent of current OS flavours. If that solution is not good enough or not fast enough for you, then you can write C or Java extensions. We already have a Java FFI available in LC 9-dp6!

I'm really looking forward to your solution. In the meantime you can use my approach; it was updated today. I removed a small bug in the percentage search, which wasn't sloppy enough ;-)

> Bob S. wrote:
> There is always the soundex() sql function. SELECT soundex('the') =
> soundex('teh') returns true. Not sure what the tolerance is though. Because
> of the arbitrary nature of languages, this really requires a lookup table for
> commonly mistyped words, with the ability to "learn" as corrections are made.
> Then you would need to be able to "uncorrect" or delete entries. Eventually
> you end up with something that is likely built into the OS already, so at
> that point it would be better to write an extension in C or Java.
>
> Bob S
> ___ use-livecode mailing list use-livecode@lists.runrev.com Please visit this url to subscribe, unsubscribe and manage your subscription preferences: http://lists.runrev.com/mailman/listinfo/use-livecode
Re: Searching "teh" or "tihs"
There is always the soundex() SQL function. SELECT soundex('the') = soundex('teh') returns true. Not sure what the tolerance is, though. Because of the arbitrary nature of languages, this really requires a lookup table for commonly mistyped words, with the ability to "learn" as corrections are made. Then you would need to be able to "uncorrect" or delete entries. Eventually you end up with something that is likely built into the OS already, so at that point it would be better to write an extension in C or Java.

Bob S

> On Mar 9, 2017, at 16:26, hh via use-livecode <use-livecode@lists.runrev.com> wrote:
>
> Searching is important for your project?
> Would you like to ask "Did you mean the?" if user searches "teh"?
>
> I've implemented a fuzzySearch algorithm in LiveCode script:
> http://forums.livecode.com/viewtopic.php?p=152202#p152202
>
> Now if you wish to look up "the" or "this" then fuzzySearch will find
> it (among others) by searching "teh" or "tihs", with a penalty score of
> one only for swapping the chars.
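To make the soundex comparison above concrete, here is a minimal Python sketch of classic American Soundex. It is only an illustration, not the code of any particular SQL engine (databases differ in details such as output length), and the names soundex and CODES are made up for this example.

```python
# Letter groups of classic American Soundex; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a 4-character Soundex key, or "" if the word has no letters."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:    # skip uncoded letters and repeated codes
            key += code
        if ch not in "HW":           # H and W do not separate equal codes
            prev = code
    return (key + "000")[:4]         # pad or truncate to four characters


if __name__ == "__main__":
    for w in ("the", "teh", "this", "tihs"):
        print(w, soundex(w))
```

With this variant "the" and "teh" get the same key because only consonant classes are encoded and the order of the uncoded letters does not matter. That is a coarser notion of similarity than counting typing errors.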
Re: Searching "teh" or "tihs"
Congratulations on the fuzzySearch. I don't know how you did it, but for the English language I remember both soundex and its refinement, metaphone; both algorithms are made for this kind of situation. I think that Levenshtein-distance-based algorithms are the way to go for this stuff these days, but they are a bit beyond what I am used to developing...

On Thu, Mar 9, 2017 at 9:26 PM, hh via use-livecode <use-livecode@lists.runrev.com> wrote:

> Searching is important for your project?
> Would you like to ask "Did you mean the?" if user searches "teh"?
>
> I've implemented a fuzzySearch algorithm in LiveCode script:
> http://forums.livecode.com/viewtopic.php?p=152202#p152202
>
> Now if you wish to look up "the" or "this" then fuzzySearch will find
> it (among others) by searching "teh" or "tihs", with a penalty score of
> one only for swapping the chars.

--
http://www.andregarzia.com -- All We Do Is Code.
http://fon.nu -- minimalist url shortening service.
Re: Searching "teh" or "tihs"
> Peter Bogdanoff wrote:
> This looks intriguing! I’m working on a commercial project that
> could use this. What is your license?

The code is based on pseudocode from
https://en.wikipedia.org/wiki/Damerau–Levenshtein_distance

From my side it's free for non-commercial use; I only wish to have a citation.

For commercial use of my published scripts I would like to have
1. a citation
2. an "at-least donation", one time, for the

+ CFFL = Community Fund for LiveCoders:
+ For those LiveCoders who help the community so much in the forums or here on the list and who really _need_ some money (I know some).

The donation for this script should be _at least_ $10 (one time).

The fund is a new idea. Certainly Richard Gaskin is willing to manage such a fund (assuming he doesn't need such funding). OK, Richard?
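For readers who have not seen the Wikipedia pseudocode mentioned above, here is a minimal Python sketch of the simpler of the two variants described there, the "optimal string alignment" distance. It is not hh's LiveCode script, and the function name osa_distance is made up for this example. It counts insertions, deletions, substitutions and swaps of two adjacent characters, each with a penalty of one.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and the first j of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                  # delete all i characters
    for j in range(cols):
        d[0][j] = j                  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # swap of two adjacent characters, e.g. "eh" <-> "he"
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


if __name__ == "__main__":
    print(osa_distance("teh", "the"))    # one adjacent swap
    print(osa_distance("tihs", "this"))  # one adjacent swap
```

A plain Levenshtein distance would charge two edits for a swapped pair of characters. The extra transposition case is what brings "teh" to within one step of "the".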
Re: Searching "teh" or "tihs"
hh,

This looks intriguing! I’m working on a commercial project that could use this. What is your license?

Peter Bogdanoff

On Mar 9, 2017, at 4:26 PM, hh via use-livecode <use-livecode@lists.runrev.com> wrote:

> Searching is important for your project?
> Would you like to ask "Did you mean the?" if user searches "teh"?
>
> I've implemented a fuzzySearch algorithm in LiveCode script:
> http://forums.livecode.com/viewtopic.php?p=152202#p152202
>
> Now if you wish to look up "the" or "this" then fuzzySearch will find
> it (among others) by searching "teh" or "tihs", with a penalty score of
> one only for swapping the chars.
Searching "teh" or "tihs"
Is searching important for your project? Would you like to ask "Did you mean the?" if a user searches for "teh"?

I've implemented a fuzzySearch algorithm in LiveCode script:
http://forums.livecode.com/viewtopic.php?p=152202#p152202

Now if you wish to look up "the" or "this", fuzzySearch will find it (among others) when you search for "teh" or "tihs", with a penalty score of only one for swapping the chars.