Hi,
I am new with Lucene.
I dont understand how Lucene works in some cases. For example:
If I have an index with the following three entries:
- ATUAÇÃO FALHA DE DISJUNTOR
- RESET DE FALHA DE DISJUNTOR
- FALHA DE COMANDO
When I try to look for something limilar with "FALHA DE DISJUNTOR", I've
got the following results:
Result | score
FALHA DE COMANDO | 0.9277342
ATUAÇÃO FALHA DE DISJUNTOR | 0.8880876
RESET DE FALHA DE DISJUNTOR | 0.5709133
If you pay attention, "FALHA DE COMANDO" is comming before "ATUAÇÃO FALHA
DE DISJUNTOR", but it is not what I would like. For my client, the first
result should be "ATUAÇÃO FALHA DE DISJUNTOR".
Other example happens when I try to look for "FALHA DISJUNTOR". For my
surprise, the unique result is "FALHA DE COMANDO". Like the other example, I
was expecting "ATUAÇÃO FALHA DE DISJUNTOR" as the first or unique result.
What should I do in order to have the "ATUAÇÃO FALHA DE DISJUNTOR" as the
first result (or unique) in the explained cases.
I am attaching the code at the end of this mail.
Thanks in advance,
Eloi
public class TestLucene {
private static final String FILENAME = "test.idx";
private static final String FIELD = "FIELD";
private static List<Document> getDocs() {
List<Document> docs = new ArrayList<Document>();
docs.add(toDoc("Atuação Falha de Disjuntor"));
docs.add(toDoc("Reset de Falha de Disjuntor"));
docs.add(toDoc("Falha de Comando"));
return docs;
}
private static Document toDoc(String text) {
Document doc = new Document();
doc.add(new Field(FIELD, text.toUpperCase(), Field.Store.YES,
Field.Index.UN_TOKENIZED));
return doc;
}
private static void createIndex(Collection docs) throws IOException {
IndexWriter writer = new IndexWriter(FILENAME, new
StandardAnalyzer(),
true);
Iterator itDocs = docs.iterator();
while (itDocs.hasNext()) {
Document doc = (Document) itDocs.next();
writer.addDocument(doc);
}
writer.optimize();
writer.close();
}
public static void main(String[] args) throws IOException {
createIndex(getDocs());
IndexSearcher indexSearcher = new IndexSearcher(FILENAME);
Hits hits = indexSearcher.search(new FuzzyQuery(new
Term(FIELD,"Falha de Disjuntor".toUpperCase()), 0.4f));
System.out.println(hits.length());
Iterator it = hits.iterator();
while (it.hasNext()) {
Hit hit = (Hit) it.next();
System.out.println(hit.get(FIELD) + " " + hit.getScore());
}
}
}