[
https://issues.apache.org/jira/browse/SOLR-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275058#comment-16275058
]
ASF GitHub Bot commented on SOLR-11662:
---------------------------------------
Github user dsmiley commented on a diff in the pull request:
https://github.com/apache/lucene-solr/pull/275#discussion_r154457666
--- Diff: solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java ---
@@ -539,6 +591,27 @@ protected Query newRegexpQuery(Term regexp) {
return query;
}
+ @Override
+ protected Query newSynonymQuery(Term terms[]) {
+ switch (synonymQueryStyle) {
+ case PICK_BEST:
+ List<Query> currPosnClauses = new ArrayList<Query>(terms.length);
+ for (Term term : terms) {
+ currPosnClauses.add(newTermQuery(term));
+ }
+ DisjunctionMaxQuery dm = new DisjunctionMaxQuery(currPosnClauses, 0.0f);
+ return dm;
+ case AS_DISTINCT_TERMS:
+ BooleanQuery.Builder builder = new BooleanQuery.Builder();
+ for (Term term : terms) {
+ builder.add(newTermQuery(term), BooleanClause.Occur.SHOULD);
+ }
+ return builder.build();
+ default:
--- End diff --
What I meant to say in my previous review here is that you would have a
case statement for AS_SAME_TERM and then, to satisfy Java, add a default that
throws an assertion error. This way we see all 3 enum values with their own
case, which I think is easier to understand/maintain. Oh, are you doing
this to handle "null"? Hmm. Maybe put the case immediately before your current
"default"? Or prevent null in the first place? Either, I guess... nulls are
unfortunate; I like to avoid them. Notice TextField has primitives for some of
its other settings; it'd be nice if we likewise had a non-null value for
TextField.synonymQueryStyle.
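
To make the suggestion concrete, here is a minimal, self-contained sketch of the switch shape described above. It is not the patch itself: the class name, the nested enum, and the returned strings are hypothetical stand-ins so the example runs without Lucene on the classpath; only the three constant names come from the diff and this review. Note that a classic Java switch on a null enum reference throws a NullPointerException before any case is considered, so a "default" branch alone would not catch null; the sketch rejects null up front instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class SynonymSwitchSketch {
    // Hypothetical stand-in for the patch's enum; constant names taken from the diff and review.
    enum SynonymQueryStyle { AS_SAME_TERM, PICK_BEST, AS_DISTINCT_TERMS }

    // Returns an illustrative description of the query shape, not a real Lucene Query.
    static String describe(SynonymQueryStyle style, List<String> terms) {
        // Fail early with a clear message rather than relying on the switch's implicit NPE.
        Objects.requireNonNull(style, "synonymQueryStyle must not be null");
        switch (style) {
            case AS_SAME_TERM:
                return "Synonym(" + String.join(" ", terms) + ")";
            case PICK_BEST:
                return "(" + String.join(" | ", terms) + ")";
            case AS_DISTINCT_TERMS:
                return "(" + String.join(" ", terms) + ")";
            default:
                // Unreachable today; trips if a new enum value is added without its own case.
                throw new AssertionError("unrecognized synonymQueryStyle: " + style);
        }
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("text:tabby", "text:cat", "text:animal");
        for (SynonymQueryStyle style : SynonymQueryStyle.values()) {
            System.out.println(style + " -> " + describe(style, terms));
        }
    }
}
```

With every enum value in its own case, the default only exists to satisfy the compiler and to fail loudly if a fourth value is ever added without a matching case.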
> Make overlapping query term scoring configurable per field type
> ---------------------------------------------------------------
>
> Key: SOLR-11662
> URL: https://issues.apache.org/jira/browse/SOLR-11662
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Doug Turnbull
> Fix For: 7.2, master (8.0)
>
>
> This patch customizes the query-time behavior when query terms overlap
> positions. Right now the only option is SynonymQuery. This is a fantastic
> default & improvement on past versions. However, there are use cases where
> terms overlap positions but don't carry exact synonymy relationships. Often
> synonym filters (or other analyzers) are actually used to model
> hypernym/hyponym relationships. In those cases the individual term scores
> matter, with terms of higher specificity (hyponyms) scoring higher than terms
> of lower specificity (hypernyms).
> This patch adds the fieldType setting scoreOverlaps, as in:
> {code:xml}
> <fieldType name="text_general" scoreOverlaps="pick_best"
> class="solr.TextField" positionIncrementGap="100" multiValued="true">
> {code}
> Valid values for scoreOverlaps are:
> *as_one_term*
> Default; suits most synonym use cases. Uses SynonymQuery.
> Treats all terms as if they're exactly equivalent, with document frequency
> from underlying terms blended.
> *pick_best*
> For a given document, score using the best-scoring synonym (i.e. dismax over
> generated terms).
> Useful when synonyms are not exactly equivalent. Instead they are used to model
> hypernym/hyponym relationships, so that each term's score reflects how
> specific it is.
> E.g., with this query-time expansion
> tabby => tabby, cat, animal
> Searching "text" generates the dismax (text:tabby | text:cat | text:animal)
> *as_distinct_terms*
> (The pre-6.0 behavior.)
> Compromise between pick_best and as_one_term.
> Appropriate when synonyms reflect a hypernym/hyponym relationship, but lets
> scores stack, so the more tabby, cat, or animal a document contains the
> better, with a bias towards the term with the highest specificity.
> Terms are turned into a boolean OR query, with document frequencies not blended.
> E.g., with this query-time expansion
> tabby => tabby, cat, animal
> Searching "text" generates the boolean query (text:tabby text:cat
> text:animal)
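
As a rough illustration of how pick_best and as_distinct_terms differ for a single document, the sketch below combines per-term scores the way a dismax with a tie-breaker of 0.0 (maximum) and a boolean OR of SHOULD clauses (sum) would. The class name, method names, and the per-term scores are made up for the example; real scores come from the similarity, and as_one_term is left out because SynonymQuery scores the blended terms in a single similarity call rather than combining per-term scores.

```java
import java.util.Arrays;
import java.util.List;

class OverlapScoringSketch {
    // pick_best: a dismax with tie-breaker 0.0 keeps only the highest matching term score.
    static double pickBest(List<Double> termScores) {
        double best = 0.0;
        for (double s : termScores) {
            best = Math.max(best, s);
        }
        return best;
    }

    // as_distinct_terms: a boolean OR of SHOULD clauses adds up the matching term scores.
    static double asDistinctTerms(List<Double> termScores) {
        double sum = 0.0;
        for (double s : termScores) {
            sum += s;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical per-term scores for one document: tabby (rarest), cat, animal (most common).
        List<Double> scores = Arrays.asList(3.0, 2.0, 0.5);
        System.out.println("pick_best         = " + pickBest(scores));
        System.out.println("as_distinct_terms = " + asDistinctTerms(scores));
    }
}
```

Under these assumed scores, pick_best ranks the document by its most specific matching term alone, while as_distinct_terms also rewards documents that match several of the expanded terms.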