[
https://issues.apache.org/jira/browse/LUCENE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Areek Zillur updated LUCENE-6459:
---------------------------------
Description:
This patch factors out common indexing/search API used by the recently
introduced [NRTSuggester|https://issues.apache.org/jira/browse/LUCENE-6339].
The motivation is to provide a query interface for FST-based fields
(*SuggestField* and *ContextSuggestField*)
to enable suggestion scoring and more powerful automaton queries.
Previously, only prefix ‘queries’ with index-time weights were supported, but we
can now also support:
* Prefix queries expressed as regular expressions: get suggestions that match
multiple prefixes
** *Example:* _star\[wa\|tr\]_ matches _starwars_ and _startrek_
* Fuzzy Prefix queries supporting scoring: get typo-tolerant suggestions scored
by how close they are to the query prefix
** *Example:* querying for _seper_ will score _separate_ higher than
_superstitious_
* Context Queries: get suggestions boosted and/or filtered based on their
indexed contexts (metadata)
** *Boost example:* get typo-tolerant suggestions on song names with prefix
_like a roling_, boosting songs with
genre _rock_ and _indie_
** *Filter example:* get suggestions for all file names starting with _finan_
only for _user1_ and _user2_
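As a rough illustration of the regular-expression prefix example above, the sketch below checks candidate values against the pattern with {{java.util.regex}} rather than the FST/automaton intersection this patch uses, so it only mirrors the matching behaviour, not the implementation. The class and method names are hypothetical. Note that in {{java.util.regex}} the bracket expression is a character class, so the pattern accepts any value that continues with one of those characters after _star_.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; names are hypothetical and not part of the patch.
final class RegexPrefixSketch {

    /** True if some prefix of the candidate matches the regular expression. */
    static boolean prefixMatches(String regex, String candidate) {
        // lookingAt() anchors at the start but does not require the whole input to match.
        return Pattern.compile(regex).matcher(candidate).lookingAt();
    }

    public static void main(String[] args) {
        String regex = "star[wa|tr]";
        for (String value : new String[] {"starwars", "startrek", "stardust"}) {
            System.out.println(value + " -> " + prefixMatches(regex, value));
        }
    }
}
```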
h3. Suggest API
{code}
SuggestIndexSearcher searcher = new SuggestIndexSearcher(reader);
CompletionQuery query = ...
TopSuggestDocs suggest = searcher.suggest(query, num);
{code}
h3. CompletionQuery
*CompletionQuery* is used to query *SuggestField* and *ContextSuggestField*. A
*CompletionQuery* produces a *CompletionWeight*,
which allows *CompletionQuery* implementations to pass in an automaton that
will be intersected with an FST and allows boosting and
metadata extraction from the intersected partial paths. A *CompletionWeight*
produces a *CompletionScorer*. A *CompletionScorer*
executes a Top N search against the FST with the provided automaton, scoring
and filtering all matched paths.
h4. PrefixCompletionQuery
Return documents with values that match the prefix of an analyzed term text.
Documents are sorted according to their suggest field weight.
{code}
PrefixCompletionQuery(Analyzer analyzer, Term term)
{code}
h4. RegexCompletionQuery
Return documents with values whose prefix matches a regular expression.
Documents are sorted according to their suggest field weight.
{code}
RegexCompletionQuery(Term term)
{code}
h4. FuzzyCompletionQuery
Return documents with values that have prefixes within a specified edit distance
of an analyzed term text.
Documents are ‘boosted’ by the number of matching prefix letters of the
suggestion with respect to the original term text.
{code}
FuzzyCompletionQuery(Analyzer analyzer, Term term)
{code}
h5. Scoring
{{suggestion_weight + (global_maximum_weight * boost)}}
where {{suggestion_weight}}, {{global_maximum_weight}} and {{boost}} are all
integers.
{{boost = # of prefix characters matched}}
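The fuzzy scoring formula can be worked through with a small standalone sketch. It assumes that {{global_maximum_weight}} is the largest weight among all suggestions and that the boost is the length of the common leading run of characters between the query prefix and the suggestion text; both are assumptions for illustration, and the class and method names are hypothetical rather than part of the patch.

```java
// Illustrative sketch only; class and method names are hypothetical, not part of the patch.
final class FuzzyScoreSketch {

    /** Counts the leading characters shared by the query prefix and a suggestion. */
    static int matchedPrefixLength(String queryPrefix, String suggestion) {
        int limit = Math.min(queryPrefix.length(), suggestion.length());
        int matched = 0;
        while (matched < limit && queryPrefix.charAt(matched) == suggestion.charAt(matched)) {
            matched++;
        }
        return matched;
    }

    /** suggestion_weight + (global_maximum_weight * boost), widened to long to avoid int overflow. */
    static long score(int suggestionWeight, int globalMaximumWeight, int boost) {
        return suggestionWeight + (long) globalMaximumWeight * boost;
    }

    public static void main(String[] args) {
        int globalMaximumWeight = 10; // assumed: largest weight among all suggestions
        long separate = score(2, globalMaximumWeight, matchedPrefixLength("seper", "separate"));
        long superstitious = score(9, globalMaximumWeight, matchedPrefixLength("seper", "superstitious"));
        // separate = 2 + 10 * 3 = 32, superstitious = 9 + 10 * 1 = 19
        System.out.println("separate=" + separate + " superstitious=" + superstitious);
    }
}
```

Under these assumptions, a suggestion that matches one more prefix character never scores below one that matches fewer, whatever their index-time weights, because each additional matched character adds a full {{global_maximum_weight}} to the score.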
h4. ContextQuery
Return documents that match a {{CompletionQuery}} filtered and/or boosted by
provided context(s).
{code}
ContextQuery(CompletionQuery query)
contextQuery.addContext(CharSequence context, int boost, boolean exact)
{code}
*NOTE:* {{ContextQuery}} should be used with {{ContextSuggestField}} to query
suggestions boosted and/or filtered by contexts.
Running {{ContextQuery}} against a {{SuggestField}} will error out.
h5. Scoring
{{suggestion_weight + (global_maximum_weight * context_boost)}}
where {{suggestion_weight}}, {{global_maximum_weight}} and {{context_boost}}
are all integers.
When used with {{FuzzyCompletionQuery}},
{{suggestion_weight + (global_maximum_weight * (context_boost + fuzzy_boost))}}
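The two context scoring formulas above can be sketched as a single function, treating the plain {{ContextQuery}} case as a fuzzy boost of zero. The class and method names are hypothetical, and the numbers are made up purely to show the arithmetic.

```java
// Illustrative sketch only; class and method names are hypothetical, not part of the patch.
final class ContextScoreSketch {

    /**
     * suggestion_weight + (global_maximum_weight * (context_boost + fuzzy_boost)).
     * Pass fuzzyBoost = 0 when the wrapped query is not a fuzzy query.
     */
    static long score(int suggestionWeight, int globalMaximumWeight, int contextBoost, int fuzzyBoost) {
        return suggestionWeight + (long) globalMaximumWeight * (contextBoost + fuzzyBoost);
    }

    public static void main(String[] args) {
        // Made-up inputs: weight 4, global maximum weight 10, context boost 2.
        System.out.println(score(4, 10, 2, 0)); // context boost only: 4 + 10 * 2 = 24
        System.out.println(score(4, 10, 2, 3)); // with three matched prefix characters: 4 + 10 * 5 = 54
    }
}
```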
h3. Context Suggest Field
To use {{ContextQuery}}, use {{ContextSuggestField}} instead of
{{SuggestField}}. Any {{CompletionQuery}} can be used with
{{ContextSuggestField}}; the default behaviour is to return suggestions from
*all* contexts. The context for every completion hit
can be accessed through {{SuggestScoreDoc#context}}.
{code}
ContextSuggestField(String name, Collection<CharSequence> contexts, String
value, int weight)
{code}
> [suggest] Query Interface for suggest API
> -----------------------------------------
>
> Key: LUCENE-6459
> URL: https://issues.apache.org/jira/browse/LUCENE-6459
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/search
> Affects Versions: 5.1
> Reporter: Areek Zillur
> Assignee: Areek Zillur
> Fix For: Trunk, 5.x, 5.1
>
> Attachments: LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch,
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch,
> LUCENE-6459.patch