Hi Eric,
On May 9, 2007, at 2:39 AM, supereric wrote:
How I can get the tag word score in lucene. suppose that you have
searched a
tag word and 3 hit documents
are now found.
1 -How someone could find number of occurrences in any document so
it could
sort the results.
Span Queries tell you where the matches occur in the document by
offset, but I am not sure what your sorting criteria would be. The
explain method also can give you information about why a particular
document scored a particular way.
Also I wan to have some other policies for ranking the results.
What should
I do to handle that. for example
I want to score boldfaced tag words in an html document twice
normal texts.
Although totally experimental at this stage, the new Payload stuff in
the trunk version of Lucene (or nightly builds) is designed for such
a scenario. Check out the BoostingTermQuery which can boost term
scores based on the contents of a payload located at a particular
term. Feedback on the APIs is very much appreciated.
2- How I can omit some tag words from the index?! for example
common words
in another language?
See the StopFilter token filter and/or the StopwordAnalyzer
HTH,
Grant
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/
LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]