Hello,

I want to use Lucene to find similar documents, based on a Boolean query
(similar metadata combined with OR clauses) and on the ratings the user has
given to documents returned by earlier searches.

I intend to implement a Naive Bayes classifier that sorts documents into
liked/disliked classes, and I would apply it from a HitCollector subclass:

class ClassifyingHitCollector extends HitCollector {

  public void collect(int doc, float score) {
    // classify document

    // if document is liked -> add to hit collection
  }

}

...

ClassifyingHitCollector c = new ClassifyingHitCollector ();
searcher.search(query, c);
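
For illustration, the "classify document" step could look roughly like the sketch below. It is only an assumption about how the classifier might be built: LikedClassifier, train(..) and isLiked(..) are hypothetical names, not Lucene API, and each document is assumed to be available as a plain list of metadata terms. It is a two-class multinomial Naive Bayes with add-one smoothing.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: a two-class multinomial
// Naive Bayes model over metadata terms.
class LikedClassifier {
  private final Map<String, Integer> likedCounts = new HashMap<>();
  private final Map<String, Integer> dislikedCounts = new HashMap<>();
  private final Set<String> vocabulary = new HashSet<>();
  private int likedDocs, dislikedDocs, likedTerms, dislikedTerms;

  // Record one rated document, given as its list of terms.
  void train(List<String> terms, boolean liked) {
    if (liked) likedDocs++; else dislikedDocs++;
    Map<String, Integer> counts = liked ? likedCounts : dislikedCounts;
    for (String t : terms) {
      counts.merge(t, 1, Integer::sum);
      vocabulary.add(t);
      if (liked) likedTerms++; else dislikedTerms++;
    }
  }

  // Log of the unnormalized posterior: log P(class) + sum of log P(term | class).
  // Add-one smoothing; the extra +1 in the denominator reserves a slot for
  // terms never seen in training and keeps the denominator above zero.
  private double logScore(List<String> terms, Map<String, Integer> counts,
                          int classDocs, int classTerms) {
    double score = Math.log((classDocs + 1.0) / (likedDocs + dislikedDocs + 2.0));
    for (String t : terms) {
      int c = counts.getOrDefault(t, 0);
      score += Math.log((c + 1.0) / (classTerms + vocabulary.size() + 1.0));
    }
    return score;
  }

  // True if the "liked" class has the higher posterior for these terms.
  boolean isLiked(List<String> terms) {
    return logScore(terms, likedCounts, likedDocs, likedTerms)
         > logScore(terms, dislikedCounts, dislikedDocs, dislikedTerms);
  }
}
```

Working in log space avoids underflow when a document has many terms. How the terms of a hit are obtained from the index is left open here.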


This means that the Bayes classification has to be computed for every
matching document. Is it possible to do this (during the search) for only
the top n matching documents, or do I have to use the searcher.search(..)
overload that returns Hits and run the classification on the top n
documents after the Lucene search has finished?
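
One way the "top n only" idea might be sketched: keep collect(..) cheap by remembering just the n best-scoring hits in a bounded min-heap, and run the classifier on those afterwards. The class below is a plain-Java stand-in that mirrors the collect(int, float) signature without depending on Lucene; TopNClassifyingCollector, Hit and likedDocs(..) are hypothetical names, and the classifier is passed in as a predicate on the document number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntPredicate;

// Hypothetical stand-in for a HitCollector subclass: it only remembers the
// n best-scoring documents and postpones classification until the end.
class TopNClassifyingCollector {
  static final class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
  }

  private final int n;
  // Min-heap on score, so the weakest kept hit sits at the head.
  private final PriorityQueue<Hit> heap =
      new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

  TopNClassifyingCollector(int n) { this.n = n; }

  // Called once per matching document; no classification happens here.
  public void collect(int doc, float score) {
    if (n <= 0) return;
    if (heap.size() < n) {
      heap.add(new Hit(doc, score));
    } else if (score > heap.peek().score) {
      heap.poll();                      // drop the current weakest hit
      heap.add(new Hit(doc, score));
    }
  }

  // Run the (expensive) classifier on the kept hits only, best score first.
  public List<Integer> likedDocs(IntPredicate isLiked) {
    List<Hit> kept = new ArrayList<>(heap);
    kept.sort((a, b) -> Float.compare(b.score, a.score));
    List<Integer> liked = new ArrayList<>();
    for (Hit h : kept) {
      if (isLiked.test(h.doc)) liked.add(h.doc);
    }
    return liked;
  }
}
```

With this shape the classifier runs at most n times per search, however many documents match. The trade-off is that fewer than n liked documents may come back, because disliked hits among the top n are not replaced by lower-ranked ones.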

Alternatively, is there a more efficient way to change the scoring done by
the search(..) method?

TIA,
Rainer
