Hi everyone,
I have a problem with Lucene's DuplicateFilter. I have some PDF files, and
each indexed document has 3 fields (id, title and content). I index the PDF
files page by page, so different pages of the same PDF store the same id and
title; only the content differs. I want to search for a string and eliminate
hits that share the same id. For some documents DuplicateFilter works
perfectly, but for others it returns 0 results. Also, if I search for the
string in the title field it returns correct results, but if I search in the
content field 0 results are returned. I have added my code below. I could
not find the problem. Please help me with this issue. Thank you...
String directory = "C:/indexes/";
IndexReader reader = IndexReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer sanalyzer = new StopAnalyzer();
// Search the "content" field for the term "point".
QueryParser parser = new QueryParser("content", sanalyzer);
Query queryd = parser.parse("point");
// keepMode 1 = KM_USE_FIRST_OCCURRENCE, processingMode 1 = PM_FULL_VALIDATION
DuplicateFilter df = new DuplicateFilter("id", 1, 1);
Hits ehits = searcher.search(queryd, df);
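
To make the expected result concrete, here is a minimal sketch in plain Java,
without Lucene, of the outcome I am hoping for: one hit per id among the pages
whose content matches. The class name ExpectedDedup, the method firstHitPerId
and the sample pages are made up for illustration only; this is not how
DuplicateFilter works internally, just the behaviour I expect to see.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExpectedDedup {
    // Each page is a hypothetical stand-in for one indexed document:
    // page[0] = id, page[1] = title, page[2] = content.
    // Returns the first matching page for every id, in index order.
    static List<String[]> firstHitPerId(List<String[]> pages, String term) {
        Map<String, String[]> firstById = new LinkedHashMap<>();
        String needle = term.toLowerCase();
        for (String[] page : pages) {
            boolean matches = page[2].toLowerCase().contains(needle);
            if (matches) {
                // Keep only the first matching page seen for this id.
                firstById.putIfAbsent(page[0], page);
            }
        }
        return new ArrayList<>(firstById.values());
    }

    public static void main(String[] args) {
        // Made-up sample data: two PDFs, two pages each.
        List<String[]> pages = new ArrayList<>();
        pages.add(new String[] {"1", "Doc A", "no match here"});
        pages.add(new String[] {"1", "Doc A", "the point is on page two"});
        pages.add(new String[] {"2", "Doc B", "a point on page one"});
        pages.add(new String[] {"2", "Doc B", "another point"});

        for (String[] hit : firstHitPerId(pages, "point")) {
            System.out.println(hit[0] + " | " + hit[1] + " | " + hit[2]);
        }
    }
}
```

With this sample data I would expect two hits, one for id 1 and one for id 2,
even though the first page of id 1 does not contain the search term.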
--
View this message in context:
http://www.nabble.com/DuplicateFilter-Problem-tp21330217p21330217.html
Sent from the Lucene - General mailing list archive at Nabble.com.