Hello,
I am trying to implement my own Jaccard similarity for Lucene.
So far I have the following code:
import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.DefaultSimilarity;

public class JaccardSimilarity extends DefaultSimilarity {
    // Holds the length of the field most recently passed to computeNorm();
    // it is not a per-document value at search time.
    int numberOfDocumentTerms;
    // String field = "contents"; // Should the Jaccard similarity be based only on the contents field?

    // Ignore inverse document frequency.
    @Override
    public float idf(int docFreq, int numDocs) {
        return 1;
    }

    // Ignore term frequency: only the presence of a term matters.
    @Override
    public float tf(int freq) {
        return 1;
    }

    public int getNumberOfDocumentTerms() {
        return numberOfDocumentTerms;
    }

    public void setNumberOfDocumentTerms(int numberOfDocumentTerms) {
        this.numberOfDocumentTerms = numberOfDocumentTerms;
    }

    // Disable query normalization.
    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0f;
    }

    @Override
    public float computeNorm(String field, FieldInvertState state) {
        // For each field we get the number of terms.
        setNumberOfDocumentTerms(state.getLength());
        System.out.println("numberOfDocumentTerms from compute : " + numberOfDocumentTerms);
        return 1.0f;
    }

    @Override
    public float coord(int overlap, int maxOverlap) {
        System.out.println("numberOfDocumentTerms : " + getNumberOfDocumentTerms());
        // Cast to float: plain int division would truncate the ratio.
        return (float) overlap / (numberOfDocumentTerms + (maxOverlap - overlap));
    }
}

The problem is that the coord() method does not seem to be called (at least as far as I can tell), either during searching or during indexing.
What am I doing wrong? I need the

   |overlap| - the number of query terms matched in the document
   |maxOverlap| - the total number of terms in the query
to implement my scoring.
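
To make the target quantity concrete, here is a minimal sketch of the score I am after, written in plain Java with no Lucene dependency. The class name JaccardSketch, the method jaccard and the docTerms parameter are hypothetical names used only for illustration, and the sketch assumes that the number of terms in the document is already known.

```java
public class JaccardSketch {

    /**
     * Jaccard coefficient computed from counts only.
     *
     * @param overlap    number of query terms matched in the document
     * @param maxOverlap total number of terms in the query
     * @param docTerms   number of terms in the document (assumed to be known)
     */
    public static float jaccard(int overlap, int maxOverlap, int docTerms) {
        // Size of the union: document terms plus the query terms that did not match.
        int union = docTerms + (maxOverlap - overlap);
        if (union == 0) {
            return 0f; // empty query and empty document: define the score as 0
        }
        // Cast before dividing, otherwise int division truncates the ratio.
        return (float) overlap / union;
    }

    public static void main(String[] args) {
        // 2 of 3 query terms match a document with 3 terms: 2 / (3 + 1)
        System.out.println(jaccard(2, 3, 3)); // prints 0.5
    }
}
```

With int arithmetic the same example would evaluate to 2 / 4 = 0, which is why the cast matters.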
Any help would be highly appreciated.
Thank you in advance!
