Rajiv,
Have a look at the details provided by IndexSearcher.explain() for
those documents, and you'll get some insight into the factors used to
rank them. Since both scores are 1.0, you'll probably want to
implement your own custom Similarity and override lengthNorm() to
adjust the field-length factor.
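To make the lengthNorm() idea concrete, here is a minimal sketch in plain Java with no Lucene dependency. The first method uses the same formula as DefaultSimilarity.lengthNorm(), 1/sqrt(numTerms). The second, steeperLengthNorm(), is a hypothetical alternative chosen only for illustration; it is not something suggested in this thread. In a real application the formula would go into a Similarity subclass that overrides lengthNorm(), and the index would need to be rebuilt, since norms are computed at indexing time. Lucene also stores each norm in a single byte, so nearby values may round to the same stored norm, which is one plausible reason a 3-term and a 4-term field can end up tied.

```java
class LengthNormSketch {
    // Same formula as Lucene's DefaultSimilarity.lengthNorm(): 1 / sqrt(numTerms).
    static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical steeper alternative (an assumption for illustration only):
    // 1 / numTerms penalizes each extra term more strongly.
    static float steeperLengthNorm(int numTerms) {
        return 1.0f / numTerms;
    }

    public static void main(String[] args) {
        int shortField = 3; // tokens in "new york ny"
        int longField = 4;  // tokens in "new york mills ny"

        // Print both formulas side by side for the two field lengths.
        System.out.println("default: " + defaultLengthNorm(shortField)
                + " vs " + defaultLengthNorm(longField));
        System.out.println("steeper: " + steeperLengthNorm(shortField)
                + " vs " + steeperLengthNorm(longField));
    }
}
```

The steeper formula widens the relative gap between the two fields, which makes it less likely that the difference is lost when the norm is stored.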
Another technique you can use is to expand a user's query into a more
sophisticated boolean query, such that a user's query for "new york
ny" would become (in Query.toString format): +new +york +ny "new york
ny". The required terms keep the match strict, while the optional
phrase clause boosts exact matches.
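The expansion can be sketched as a small string-building helper. This is plain Java with no Lucene dependency; QueryExpander and expand() are hypothetical names, and the tokenization (lowercase, split on non-letters) only approximates what SimpleAnalyzer does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class QueryExpander {
    // Turns "New York, NY" into: +new +york +ny "new york ny"
    // Each term is required (+), and the whole input is appended as an
    // optional phrase clause so exact matches score higher.
    static String expand(String userQuery) {
        List<String> terms = new ArrayList<>();
        // Lowercase and split on runs of non-letters (approximation of SimpleAnalyzer).
        for (String t : userQuery.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        if (terms.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (String t : terms) {
            sb.append('+').append(t).append(' ');
        }
        // Optional phrase clause: no '+' prefix, so it only adds to the score.
        sb.append('"').append(String.join(" ", terms)).append('"');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("New York, NY"));
    }
}
```

One caveat: if the resulting string is fed to a QueryParser whose default operator is AND, the phrase clause would be treated as required too. Building the BooleanQuery programmatically, with the phrase added as a non-required clause, avoids that.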
Erik
On Aug 1, 2006, at 1:19 PM, Rajiv Roopan wrote:
Ok, this is how I'm indexing. For both indexing and searching I'm using
SimpleAnalyzer().
String loc = "New York, NY";
doc.add(new Field("location", loc, Field.Store.NO, Field.Index.TOKENIZED));
String loc2 = "New York Mills, NY";
doc.add(new Field("location", loc2, Field.Store.NO, Field.Index.TOKENIZED));
and this is how I'm searching...
String searchStr = "New York, NY";
Analyzer analyzer = new SimpleAnalyzer();
QueryParser parser = new QueryParser("location", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query query = parser.parse( searchStr );
Hits hits = searcher.search( query );
I've tried all query types and every time "new york mills, ny" is in
hits(0).
Both results have a score of 1.0. I know I can add some kind of sort to
always put the shorter field first. But shouldn't the first hit by
default, due to the scoring algorithm, be "new york, ny" because it's
a shorter field?
Let me know if I'm missing something. Thanks!
rajiv
On 8/1/06, Simon Willnauer <[EMAIL PROTECTED]> wrote:
I guess so, but without any information about your code nobody can
tell what.
If you provide more information you will get help!
regards simon
On 8/1/06, Rajiv Roopan <[EMAIL PROTECTED]> wrote:
> Hello, I have an index of locations for example. I'm indexing one field
> using SimpleAnalyzer.
>
> doc1: albany ny
> doc2: hudson ny
> doc3: new york ny
> doc4: new york mills ny
>
> when I search for "new york ny", the first result returned is always
> "new york mills ny". Am I doing something incorrect?
>
> thanks in advance,
> rajiv
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]