GoogleQueryParser

2002-09-11 Thread Eric Jain
Has anyone written something like a GoogleQueryParser? Such a parser would differ from the default parser in the following points: - Default AND rather than OR. - Treat a-b as a-b rather than a -b. - Perhaps disallow ~. I guess I could write my own QueryParser, just wanted to be
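To make the three requested behaviors concrete, here is a minimal sketch in plain Java. It is not a Lucene QueryParser and does not touch QueryParser.jj; the class name GoogleStyleQuery and the method toClauses are hypothetical, and the choice to reject ~ with an exception (rather than silently drop it) is an assumption. It only rewrites a whitespace-separated query into +/- clause notation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: Google-style query to +/- clauses. */
final class GoogleStyleQuery {

    /** Returns one clause per whitespace-separated token, e.g. "+a-b" or "-c". */
    static List<String> toClauses(String query) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty() || token.equals("-")) {
                continue; // blank input or a stray minus carries no term
            }
            if (token.indexOf('~') >= 0) {
                // "Perhaps disallow ~": rejected outright in this sketch.
                throw new IllegalArgumentException("'~' is not allowed: " + token);
            }
            if (token.startsWith("-")) {
                clauses.add(token);       // a leading minus excludes the term
            } else {
                clauses.add("+" + token); // default AND: every other term is required
            }
        }
        return clauses;
    }
}
```

With this sketch, the input a-b -c d becomes the clauses +a-b, -c and +d: the hyphen inside a-b stays part of the term, and only a leading minus is read as an exclusion.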

Re: GoogleQueryParser

2002-09-11 Thread gcasper
- Treat a-b as a-b rather than a -b. I came across the same. Quite an essential issue for some European sites (as you surely know :-) I'm not very familiar with JavaCC, but I changed QueryParser.jj in the following way: I changed | MINUS: - to | MINUS: - and removed - from the

Lucene's Ranking Function

2002-09-11 Thread Clemens Marschner
In the FAQ it reads score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t) * coord_q_d 1. I think the new document boost is missing, isn't it? With that it should be something like score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t) * coord_q_d *
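As a worked reading of the FAQ formula quoted above, here is a small plain-Java sketch that evaluates it term by term. The class FaqScoreSketch and its field names are hypothetical and simply mirror the symbols in the formula; no Lucene classes are used, and the document boost under discussion is deliberately left out because the formula as quoted does not contain it.

```java
/** Hypothetical sketch: evaluates the FAQ formula as quoted, without document boost. */
final class FaqScoreSketch {

    /** Per-term inputs; field names mirror the symbols in the formula. */
    static final class TermStats {
        final double tfQ, idfT, tfD, normDT, boostT;

        TermStats(double tfQ, double idfT, double tfD, double normDT, double boostT) {
            this.tfQ = tfQ;
            this.idfT = idfT;
            this.tfD = tfD;
            this.normDT = normDT;
            this.boostT = boostT;
        }
    }

    /** score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t) * coord_q_d */
    static double score(TermStats[] terms, double normQ, double coordQD) {
        double sum = 0.0;
        for (TermStats t : terms) {
            // Same left-to-right order as the plain-text formula.
            sum += t.tfQ * t.idfT / normQ * t.tfD * t.idfT / t.normDT * t.boostT;
        }
        return sum * coordQD;
    }
}
```

Written this way it is easy to see that idf_t enters each summand twice, once on the query side and once on the document side.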

Re: GoogleQueryParser

2002-09-11 Thread Clemens Marschner
- Default AND rather than OR. As for this part: This can be accomplished with queryParser = new QueryParser(defaultField, new MyAnalyzer()); queryParser.setOperator(QueryParser.DEFAULT_OPERATOR_AND); - Treat a-b as a-b rather than a -b. That would be interesting for

Re: GoogleQueryParser

2002-09-11 Thread Eric Jain
This actually changes the behaviour to that of Google and I didn't experience any negative side effects (yet). Thanks. I hope there will eventually be some standard way to accomplish this... -- Eric Jain -- To unsubscribe, e-mail: mailto:[EMAIL PROTECTED] For additional commands, e-mail:

RE: GoogleQueryParser

2002-09-11 Thread Halácsy Péter
-Original Message- From: Eric Jain [mailto:[EMAIL PROTECTED]] Sent: Wednesday, September 11, 2002 1:44 PM To: Clemens Marschner Cc: Lucene Users List Subject: Re: GoogleQueryParser queryParser.setOperator(QueryParser.DEFAULT_OPERATOR_AND); Thanks, that would be exactly what I need.

Re: Lucene's Ranking Function

2002-09-11 Thread Clemens Marschner
I have seen that a norm factor between 0 and 255 is read with IndexReader.norms() in TermScorer.score(). I've seen now that this is an 8-bit float.
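To illustrate what an 8-bit float means here, below is a sketch of one possible single-byte encoding in plain Java. The class ByteFloatSketch, its method names and its constants are assumptions chosen for illustration and are not taken from this thread or verified against the Lucene source of the time. The idea: keep only the leading bits of the IEEE 754 pattern and shift the exponent range so the result fits in one byte.

```java
/** Hypothetical sketch of a one-byte float; constants are illustrative assumptions. */
final class ByteFloatSketch {

    // Offset that moves the kept bit pattern into the range 1..255.
    private static final int ZERO_POINT = (63 - 15) << 3;

    /** Truncates a float to one byte (0..255 when read as unsigned). */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        // Keep the top 11 bits: sign, 8 exponent bits, 2 leading fraction bits.
        int small = bits >> 21;
        if (small <= ZERO_POINT) {
            // Zero and negatives map to 0; tiny positive values map to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_POINT + 0x100) {
            return (byte) -1; // too large: saturate at unsigned 255
        }
        return (byte) (small - ZERO_POINT);
    }

    /** Expands a byte produced by encode back into a float. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24; // undo the offset applied in encode
        return Float.intBitsToFloat(bits);
    }
}
```

The precision is coarse: in this sketch 0.5f and 1.0f survive a round trip unchanged, while 0.3f comes back as 0.25f, so small differences between norm factors are lost.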

Re: Lucene's Ranking Function

2002-09-11 Thread Clemens Marschner
score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t) * coord_q_d One last thing I wondered about: Is idf_t really going into that equation twice? From what I see, idf_t/norm_q is completely left out, isn't it? tf_q is applied although it is never calculated - if a term

RE: GoogleQueryParser

2002-09-11 Thread Philip Chan
I think there's a bug. If I set the default operator to OR and run java org.apache.lucene.queryParser.QueryParser a AND b OR c, it gives me +a +b c. If I set the default operator to AND and run it with the terms a b OR c, it gives me +a b c, which is different
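One way to see how such output can arise is a flat clause list in which AND and OR have no precedence and only flip the required flag of the neighbouring clauses. The sketch below is a simplified model in plain Java, not the QueryParser source; the class FlatClauseModel and its rules are assumptions chosen to reproduce the two results reported above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: conjunctions only adjust flags of adjacent clauses. */
final class FlatClauseModel {

    /** Renders the query in +term notation; defaultAnd selects the default operator. */
    static String parse(String query, boolean defaultAnd) {
        List<String> terms = new ArrayList<>();
        List<Boolean> required = new ArrayList<>();
        String conj = ""; // conjunction seen just before the current term, if any
        for (String tok : query.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue;
            }
            if (tok.equals("AND") || tok.equals("OR")) {
                conj = tok;
                continue;
            }
            int last = terms.size() - 1;
            if (last >= 0) {
                if (conj.equals("AND")) {
                    required.set(last, true);   // AND makes the preceding clause required
                } else if (conj.equals("OR") && defaultAnd) {
                    required.set(last, false);  // OR makes the preceding clause optional
                }
            }
            // The new clause is required under default AND unless introduced by OR,
            // and under default OR only when introduced by AND.
            boolean req = defaultAnd ? !conj.equals("OR") : conj.equals("AND");
            terms.add(tok);
            required.add(req);
            conj = "";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < terms.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(required.get(i) ? "+" : "").append(terms.get(i));
        }
        return sb.toString();
    }
}
```

In this model, parse("a AND b OR c", false) yields +a +b c and parse("a b OR c", true) yields +a b c, matching the report. Because the clauses are never grouped, a is left required on its own in the second case rather than being combined with b before the OR is applied.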

Re: Lucene's Ranking Function

2002-09-11 Thread Doug Cutting
Clemens Marschner wrote: 1. I think the new document boost is missing, isn't it? With that it should be something like score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t) * coord_q_d * boost_d Is that correct? Almost. This should actually be boost_d * boost_d_t,