I don't think that you're missing something. It's a hard problem, and I'm not even convinced that doing several iterations will be as useful as it sounds.
I would guess, and a little experiment I just did seems to confirm it, that Google collects stats on searches rather than on the documents in order to ask you "did you mean xxx." They of course have a few more searches to build on than we do :-), which is what makes it a useful tool.

By the way, as long as you just fetch the number of hits per search, and don't actually fetch any data, the searches should be very fast, so doing the search three times is not a big deal.

Regards,

Dror

On Fri, Nov 21, 2003 at 01:14:53AM +0000, sam s wrote:
> Levenshtein is again word based, like spell check. I found Jazzy quite handy
> to some level for spell check. I am not worrying much about the spell check
> part. What I want to do is show the user the right spell check suggestion
> (when spell check returns multiple suggestions) based on the other words
> he/she entered for the search.
> Once again going to the same example:
> User enters: inted motherboard
> spell check returns 3 suggestions for inted
> inter
> intel
> intek
>
> In this situation, since I have both words intel and motherboard in one
> document of my search collection, I should be able to show the user
> something like
> Did you mean: intel motherboard?
>
> The simplest way to achieve this is to do the search 3 times, once for each
> suggestion combined with the word motherboard, and show the user the
> suggestion whose search got the most hits. The problem with this is the
> number of iterations involved. If there are suggestions on two words the
> user entered, there will be all kinds of combinations and that many
> iterations. So I don't want to go this way.
>
> I haven't studied in detail how lucene does indexing and search on it. I
> don't know whether that will help.
>
> Has anybody come across this problem? Or I must be missing something..
>
> Again, I apologize if you guys think this is not the right post for the
> lucene user list.
>
> Thanks,
> Abhay
>
>
> >From: "sam s" <[EMAIL PROTECTED]>
> >Reply-To: "Lucene Users List" <[EMAIL PROTECTED]>
> >To: [EMAIL PROTECTED]
> >Subject: RE: Context-based suggestions with spell check
> >Date: Fri, 21 Nov 2003 00:35:13 +0000
> >
> >I actually thought of using search for the right combination of
> >suggestions but I feared performance degradation. I'll look at levenshtein.
> >
> >Thanks
> >
> >>From: Dan Quaroni <[EMAIL PROTECTED]>
> >>Reply-To: Lucene Users List <[EMAIL PROTECTED]>
> >>To: 'sam s ' <[EMAIL PROTECTED]>
> >>Subject: RE: Context-based suggestions with spell check
> >>Date: Thu, 20 Nov 2003 19:22:51 -0500
> >>
> >> I would also suggest 'intend' as a possible correction.
> >>
> >>There are a decent number of algorithms out there for distance between
> >>two words. Check out levenshtein for that.
> >>
> >>In terms of context based corrections, you could do a search for the word
> >>combined with the word in front of it and the word behind it.
> >>
> >>"I just bought an inted motherboard"
> >>
> >>Then you do a search for "an inter", "an intel", etc and "inter
> >>motherboard", "intel motherboard", etc and count the number of hits you
> >>get for each one and rank your suggestions accordingly.
> >>
> >>
> >>-----Original Message-----
> >>From: sam s
> >>To: [EMAIL PROTECTED]
> >>Sent: 11/20/03 7:07 PM
> >>Subject: Context-based suggestions with spell check
> >>
> >>Hi,
> >>
> >>I am thinking of adding spell check functionality to the search. I am
> >>trying to achieve two things to complement search.
> >>
> >>1. Spell check where the dictionary will be composed of all the text I am
> >>creating the search index from. This looks simple with some spell check
> >>implementation.
> >>
> >>2. The problem I am facing is how to pick the right suggestion for a
> >>wrong word accompanied by another word. For example, when the user enters
> >>the search term 'inted', spell check returns suggestions inter, intel and
> >>intek.
> >>Now the problem is: when the user searches 'inted motherboard', how do I
> >>decide that the user is searching for 'intel motherboard', given that
> >>some items contain the text 'intel motherboard'? How do I make
> >>context-based suggestions? Does anybody have a simple algorithm for this?
> >>I know this is not related to lucene but thought I may get some help from
> >>the community. Suggestions are appreciated.
> >>
> >>Thanks in advance,
> >>Sam
> >>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>For additional commands, e-mail: [EMAIL PROTECTED]

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com
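
For anyone who wants to try the hit-count ranking discussed in this thread, here is a minimal sketch in Python. It is not Lucene code: the tiny in-memory index, the function names (build_index, hit_count, rank_suggestions) and the sample documents are all made up for illustration. The only idea taken from the thread is to count hits for each combination of spelling candidates plus the other query words, without fetching any documents, and to suggest the combination with the most hits.

```python
from itertools import product

def build_index(docs):
    """Hypothetical stand-in for a search index: token -> set of doc ids."""
    index = {}
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index.setdefault(token, set()).add(doc_id)
    return index

def hit_count(index, terms):
    """Count documents containing every term (an AND query, counts only)."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return 0
    return len(set.intersection(*postings))

def rank_suggestions(index, candidates_per_word):
    """Take one list of candidate spellings per query word and return
    (hits, phrase) pairs that have at least one hit, best first."""
    ranked = []
    for combo in product(*candidates_per_word):
        hits = hit_count(index, combo)
        if hits > 0:
            ranked.append((hits, " ".join(combo)))
    # Most hits first; ties broken alphabetically so the order is stable.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked

# Invented sample collection, for illustration only.
docs = [
    "intel motherboard with onboard audio",
    "intel motherboard for desktop",
    "inter milan football shirt",
    "intek power supply",
]
index = build_index(docs)

# The spell checker's candidates for 'inted', plus the correctly spelled word.
candidates = [["inter", "intel", "intek"], ["motherboard"]]
ranked = rank_suggestions(index, candidates)
if ranked:
    print("Did you mean: " + ranked[0][1] + "?")
```

The number of count queries is the product of the candidate list sizes, which is the combinatorial growth the original poster was worried about. Capping the number of candidates per word keeps it bounded.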
