Dion Almaer wrote:
Interesting.  I implemented an approach which boosted based on the number of months in 
the past, and
after tweaking the boost amounts, it seems to do the job. I do a fresh reindex every 
night (since
the indexing process takes no time at all... unlike our old search solution!)

If you're reindexing every night, then document boosting should work well. The other approaches I mentioned would only be required if you can't afford to re-index frequently.


I read content for the index from different sources. Sometimes the source gives me 
documents loosely
in date order, but not all of them. So, it seems that one of the other approaches 
should be taken
(adding a month/week field etc).  I should look more into the HitCollector and see how 
it can help
me.

I wouldn't bother to pursue these approaches. Document boosting should work well for you, since you're reindexing.


The other issue I have is that I would like to prioritize the title field.  At the 
moment I am lazy
and add the title to the body (contents = title + body) which seems to be OK... 
however sometimes
something that mentions the search term in the title should appear higher up in the 
pecking order.

I am using the QueryParser (subclassed to disallow wildcards etc) to do the dirty work 
for me.
Should I get away from this and manage the queries myself (and run a Multi against the 
title field
as well as the contents?

A separate title field will solve this for you. You can, as you suggest, boost title clauses at query time. Alternately, you could boost title fields at index time, although that's less flexible.


Note that if you put titles in a separate field and search both the title and the body field then title matches will tend to be naturally boosted, since titles tend to be shorter than bodies and hence are less normalized.

Doug


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to