Areek Zillur created LUCENE-6339:
------------------------------------
Summary: [suggest] Near real time Document Suggester
Key: LUCENE-6339
URL: https://issues.apache.org/jira/browse/LUCENE-6339
Project: Lucene - Core
Issue Type: New Feature
Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
Fix For: 5.0
The idea is to index documents with one or more *SuggestField*(s) and to
suggest documents whose *SuggestField* value matches a given key.
Each *SuggestField* can be assigned a numeric weight, which is used to score
the suggestion at query time.
Document suggestion can be done on any indexed *SuggestField*. The document
suggester can filter out deleted documents in near real time. It can also
filter out documents at query time based on a Filter (note: this may change to
a non-scoring query?).
A custom postings format (CompletionPostingsFormat) is used to index
*SuggestField*s and to perform document suggestions.
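To make the weighting and filtering behaviour described above concrete, here is a minimal plain-Java toy model. It is not the patch's implementation and uses no Lucene classes: the class `ToyDocumentSuggester`, its `Entry` type, and the use of a simple prefix match for "matches a given key" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.stream.Collectors;

/** Toy in-memory model of weighted, filtered document suggestion (hypothetical, not Lucene code). */
final class ToyDocumentSuggester {

    /** One indexed suggest value: the owning document id, the value, and its weight. */
    static final class Entry {
        final int docId;
        final String value;
        final long weight;

        Entry(int docId, String value, long weight) {
            this.docId = docId;
            this.value = value;
            this.weight = weight;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final BitSet deleted = new BitSet();

    void add(int docId, String value, long weight) {
        entries.add(new Entry(docId, value, weight));
    }

    /** Marks a document as deleted; later suggest calls skip it immediately. */
    void delete(int docId) {
        deleted.set(docId);
    }

    /** Returns up to num live, filter-accepted entries whose value starts with key, highest weight first. */
    List<Entry> suggest(String key, int num, IntPredicate filter) {
        return entries.stream()
                .filter(e -> e.value.startsWith(key))      // assumption: "matches" means prefix match
                .filter(e -> !deleted.get(e.docId))        // drop deleted documents
                .filter(e -> filter.test(e.docId))         // query-time filter
                .sorted(Comparator.comparingLong((Entry e) -> e.weight).reversed())
                .limit(num)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ToyDocumentSuggester suggester = new ToyDocumentSuggester();
        suggester.add(0, "title1", 2);
        suggester.add(1, "title2", 5);
        suggester.add(2, "name1", 9);
        suggester.add(3, "title3", 7);
        suggester.delete(3);
        for (Entry e : suggester.suggest("titl", 10, docId -> true)) {
            System.out.println(e.docId + " " + e.value + " " + e.weight);
        }
    }
}
```

In this toy, document 3 has the highest weight among the "titl" matches but is skipped because it was deleted, so the remaining matches come back ordered by weight.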
h4. Usage
{code:java}
// hook up custom postings format
// indexAnalyzer for SuggestField
Analyzer analyzer = ...
IndexWriterConfig config = new IndexWriterConfig(analyzer);
Codec codec = new Lucene50Codec() {
  @Override
  public PostingsFormat getPostingsFormatForField(String field) {
    if (isSuggestField(field)) {
      return new CompletionPostingsFormat(super.getPostingsFormatForField(field));
    }
    return super.getPostingsFormatForField(field);
  }
};
config.setCodec(codec);
IndexWriter writer = new IndexWriter(dir, config);
// index some documents with suggestions
Document doc = new Document();
doc.add(new SuggestField("suggest_title", "title1", 2));
doc.add(new SuggestField("suggest_name", "name1", 3));
writer.addDocument(doc);
...
// open an nrt reader for the directory
DirectoryReader reader = DirectoryReader.open(writer, false);
// SuggestIndexSearcher is a thin wrapper over IndexSearcher
// queryAnalyzer will be used to analyze the query string
SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader,
    queryAnalyzer);
// suggest 10 documents for "titl" on "suggest_title" field
TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
{code}
h4. Indexing
The index analyzer is set through *IndexWriterConfig*.
{code:java}
new SuggestField(name, suggestion, weight)
{code}
h4. Query
The query analyzer is set through *SuggestIndexSearcher*.
{code:java}
// full options for TopSuggestDocs (TopDocs)
TopSuggestDocs suggest = suggestIndexSearcher.suggest(String field,
    CharSequence key, int num, Filter filter)
// full options for Collector
// note: only collects, does not score
suggestIndexSearcher.suggest(String field, CharSequence key, int maxNumPerLeaf,
    Filter filter, Collector collector)
{code}
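The `maxNumPerLeaf` parameter implies that each index segment (leaf) contributes a bounded number of suggestions. One plausible way to combine such per-leaf results into a global top-N is a bounded min-heap. The sketch below is a plain-Java toy under that assumption; `PerLeafMergeSketch`, `Hit`, and `mergeTopN` are hypothetical names and are not part of the proposed API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy sketch of merging per-leaf suggestion hits into a global top-N (hypothetical, not Lucene code). */
final class PerLeafMergeSketch {

    /** A suggested document and the weight it was indexed with. */
    static final class Hit {
        final int doc;
        final long weight;

        Hit(int doc, long weight) {
            this.doc = doc;
            this.weight = weight;
        }
    }

    /** Keeps the num highest-weighted hits across all leaves, returned highest weight first. */
    static List<Hit> mergeTopN(List<List<Hit>> perLeafHits, int num) {
        // Min-heap on weight: the head is always the weakest hit kept so far.
        PriorityQueue<Hit> heap =
                new PriorityQueue<>(Comparator.comparingLong((Hit h) -> h.weight));
        for (List<Hit> leafHits : perLeafHits) {
            for (Hit hit : leafHits) {
                heap.offer(hit);
                if (heap.size() > num) {
                    heap.poll(); // evict the current lowest-weighted hit
                }
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingLong((Hit h) -> h.weight).reversed());
        return result;
    }

    public static void main(String[] args) {
        List<List<Hit>> leaves = new ArrayList<>();
        leaves.add(List.of(new Hit(0, 5), new Hit(1, 1)));
        leaves.add(List.of(new Hit(10, 7), new Hit(11, 3)));
        for (Hit hit : mergeTopN(leaves, 2)) {
            System.out.println("doc=" + hit.doc + " weight=" + hit.weight);
        }
    }
}
```

The heap never holds more than `num + 1` hits at a time, so memory stays bounded regardless of how many leaves contribute.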
h4. Analyzer
*CompletionAnalyzer* can be used instead to wrap another analyzer and tune
parameters that apply only to suggest fields.
{code:java}
CompletionAnalyzer completionAnalyzer = new CompletionAnalyzer(analyzer);
completionAnalyzer.setPreserveSep(..)
completionAnalyzer.setPreservePositionsIncrements(..)
completionAnalyzer.setMaxGraphExpansions(..)
{code}
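The issue does not spell out what these options do. As a rough illustration of what a separator-preserving option could mean, the toy below joins analyzed tokens into a single matching key, with or without a separator character between them. The semantics, the class `PreserveSepSketch`, and the choice of separator character are assumptions inferred from the setter name, not taken from the patch.

```java
import java.util.List;

/** Toy illustration of a "preserve separator" option (hypothetical semantics, not Lucene code). */
final class PreserveSepSketch {

    // Hypothetical separator placed between tokens when separators are preserved.
    static final char SEP = '\u001F';

    /** Joins analyzed tokens into one key, optionally keeping a separator between tokens. */
    static String toKey(List<String> tokens, boolean preserveSep) {
        return String.join(preserveSep ? String.valueOf(SEP) : "", tokens);
    }

    /** True if the query's key is a prefix of the indexed value's key. */
    static boolean matches(List<String> indexedTokens, List<String> queryTokens, boolean preserveSep) {
        return toKey(indexedTokens, preserveSep).startsWith(toKey(queryTokens, preserveSep));
    }

    public static void main(String[] args) {
        List<String> indexed = List.of("wi", "fi", "network");
        List<String> query = List.of("wifi");
        System.out.println(matches(indexed, query, false)); // separators dropped
        System.out.println(matches(indexed, query, true));  // separators kept
    }
}
```

Under these assumptions, the query "wifi" matches the indexed tokens "wi fi network" only when separators are dropped, because the preserved separator breaks the prefix after "wi".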