I can now see a lot of improvement in my "related" help search results. I 
have a non-token list that removes all irrelevant tokens from the selected 
help topic. After filtering the non-tokens out of the selected topic, I search 
the help system and show all results. However, I am not confident in the 
non-token list. I feel it can be improved, or perhaps what I really need is 
some sort of human-readable tokenizer that can generate an equivalent query. 
I should also mention that I am using Sandbox/wordnet to generate all synonym 
queries for the user-selected help topic. So my system works something like 
this:
   
  - Remove all non-tokens from the user-selected help topic using the non-token list
  - Generate synonym queries using the Lucene Sandbox/wordnet API
  - Search the help system using FuzzyLikeThisQuery
  - Combine all results using Lucene ranking
  - Show only the first 10 results to users
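
  The first step above can be sketched in plain Java without any Lucene 
dependency. This is only an illustration: the class name NonTokenFilter and 
the words in NON_TOKENS are made up for the example, and the real non-token 
list is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class NonTokenFilter {
    // Hypothetical non-token list; the actual list used in the help system is not shown.
    static final Set<String> NON_TOKENS = new HashSet<>(
            Arrays.asList("a", "an", "and", "for", "how", "of", "the", "to"));

    // Lower-cases the topic, splits on anything that is not a letter or digit,
    // and keeps only the words that are not in the non-token list.
    static List<String> filter(String topic) {
        List<String> kept = new ArrayList<>();
        for (String word : topic.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty() && !NON_TOKENS.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints the tokens that would be passed on to the synonym and search steps.
        System.out.println(String.join(" ", filter("How to update the product blog")));
    }
}
```

  Keeping the list in one set like this makes it easy to review and extend as 
new generic terms turn up in the results.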
   
  As I mentioned, this works fine in general but does not work as expected for 
generic topics.
   
  Somebody suggested that I use JAMA (http://math.nist.gov/javanumerics/jama/) 
with Lucene, but I am not sure whether I can afford the resources for R&D on 
JAMA and its use with Lucene. I would rather stay with your suggested query and 
keep improving the non-token list filter.
   
  Also I would appreciate your suggestion.
   
  - RB
markharw00d <[EMAIL PROTECTED]> wrote:
  Cool Coder wrote:
>> Is there any way I can specify which terms are "MUST", I mean they 
>> have to appear in the result and some terms are optional,

One "hands off" approach you could try with this is to rewrite the 
fuzzyQuery and then set the minimum number of terms you want a match on. 
e.g.

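// r is assumed to be an already-open IndexReader over the help index
FuzzyLikeThisQuery flt = new FuzzyLikeThisQuery(50, new StandardAnalyzer());
flt.addTerms("product critical update", "title", 0.75f,
        FuzzyQuery.defaultPrefixLength);
// rewrite() expands the fuzzy terms into a BooleanQuery of concrete clauses
BooleanQuery q = (BooleanQuery) flt.rewrite(r);
// require a match on at least half of the rewritten clauses
int minNumClauseMatches = Math.round(q.clauses().size() * 0.5f);
q.setMinimumNumberShouldMatch(minNumClauseMatches);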

In the above code I'm specifying at least half of the input terms must 
have a match.
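
To make the rounding concrete, here is a small stand-alone sketch of just the 
arithmetic, with no Lucene classes involved. The class and method names are 
made up for illustration. The clause count is whatever the rewritten query 
ends up containing, which may differ from the number of words typed in.

```java
public class MinShouldMatch {
    // Same computation as above: the fraction of clauses that must match,
    // rounded to the nearest whole clause (Math.round rounds ties upwards).
    static int minClauses(int clauseCount, float fraction) {
        return Math.round(clauseCount * fraction);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " clauses -> at least " + minClauses(n, 0.5f));
        }
    }
}
```

With a fraction of 0.5f, three clauses require two matches and five require 
three, so a three-word query can no longer match on one generic word alone.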

If a user wants more control then they really need to be more "hands on" 
and specify precisely which of these words are important to them in the 
actual query syntax.
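
As a rough sketch of the "hands on" route: in the standard Lucene query parser 
syntax a leading + marks a term as required, and unmarked terms are optional. 
The helper below only builds such a query string in plain Java. Its class and 
method names are hypothetical, and it does not escape special query 
characters, so it assumes the terms are plain words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RequiredTermQuery {
    // Prefixes every required term with '+'; all other terms stay optional.
    static String build(List<String> terms, Set<String> required) {
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (required.contains(term)) {
                sb.append('+');
            }
            sb.append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("product", "blog", "update");
        Set<String> required = new HashSet<>(Arrays.asList("product", "blog"));
        // The resulting string would then be handed to a query parser.
        System.out.println(build(terms, required));
    }
}
```

For the example in the original question this yields a query in which 
"product" and "blog" must appear and "update" only affects ranking.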

Cheers
Mark

> Hello,
> I am trying to use FuzzyLikeThisQuery to search my help system and show a 
> set of help entries for a user-selected help topic. For any selected help 
> topic, the system needs to display all related topics. This works to some 
> extent, but if the query contains generic terms then the results returned by 
> FuzzyLikeThisQuery include irrelevant topics. E.g. 
> if the query is "product blog update" then I am getting results like
> 
> fuzzyLikeQuery.addTerms("product blog update", "title", 0.75f, 
> FuzzyQuery.defaultPrefixLength);
> 
> --Slide Show Update - Full Control Panel
> --Product manager: sent a mail to [EMAIL PROTECTED]
> 
> I would expect at least terms like "product" and "blog" to appear in the 
> result. 
> Is there any way I can specify which terms are "MUST", I mean they have to 
> appear in the result, and which terms are optional, I mean they need not 
> appear in the result? 
> 
> Previously, I was using PhraseQuery, but it looks for an exact match. 
> I would appreciate your suggestions.
> 
> - BR
> 
>
> 
> 


