[
https://issues.apache.org/jira/browse/SOLR-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549535#comment-14549535
]
Ted Sullivan commented on SOLR-7539:
------------------------------------
Added logic that determines whether a field is single-valued or multi-valued,
enabling the filter to "do the right thing" when users put logical terms ("and"
and "or") in queries. It turns out that knowing whether a field is
single-valued or multi-valued lets the query autofiltering component
disambiguate user-entered boolean phrases. For example, the query "blue or red
cars" means the same as "blue and red cars" since the color field is
single-valued - a single car can only be blue or red, not both. Given that a
property can only have one value for a given record, for a set of cars "or"
means "either" and "and" means "both", and both translate to a set UNION
operation. In other words, for single-valued fields, when talking about a group
of things, "and" and "or" are synonyms. The QueryAutofilteringComponent will
detect that "red" and "blue" are both values of the "color" field, determine
that "color" is not a multi-valued field, and then translate this into an fq
color:(blue OR red) no matter which logical term was entered in the query
phrase. Using AND in this fq would of course yield 0 results because, by
definition, no item can have a value of both "red" and "blue".
However, for multi-valued fields, "and" and "or" are antonyms in common
parlance, as in "cars with GPS, voice-activated bluetooth and heated seats" -
which means that I only want cars that have all three of these options (if I
say "or" in this phrase instead of "and", it means something different - I want
to see cars with any of these options). If "GPS", "voice-activated bluetooth"
and "heated seats" are all indexed as values in an "options" field, the query
autofiltering component will recognize that these are values of the same
multi-valued field and will create the fq options:(GPS AND "voice-activated
bluetooth" AND "heated seats") for the first case and options:(GPS OR
"voice-activated bluetooth" OR "heated seats") for the second. "And" is the
default when no logical term is given, as in "show me inexpensive,
fuel-efficient, safe cars", which implies that I want to see cars with all of
these attributes.
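The rule described above can be sketched in a few lines. This is only an
illustration of the decision logic, not the code in the attached patch: the
function name build_filter_query and its parameters are hypothetical, and the
values are assumed to have already been matched to a single field.

```python
def build_filter_query(field, values, operator, multi_valued):
    """Build an fq string for values that all belong to the same field.

    Hypothetical helper for illustration; not part of the SOLR-7539 patch.
    `operator` is the logical term the user typed ("and", "or", or None).
    """
    if not multi_valued:
        # Single-valued field: "and" and "or" both mean a union of records,
        # so OR is used no matter what the user typed.
        joiner = "OR"
    elif operator == "or":
        # Multi-valued field with an explicit "or": any of the values.
        joiner = "OR"
    else:
        # Multi-valued field with "and" or no logical term: all of the values.
        joiner = "AND"

    # Quote values that contain a space so they are treated as phrases.
    quoted = ['"' + v + '"' if " " in v else v for v in values]
    separator = " " + joiner + " "
    return field + ":(" + separator.join(quoted) + ")"


print(build_filter_query("color", ["blue", "red"], "and", False))
print(build_filter_query(
    "options", ["GPS", "voice-activated bluetooth", "heated seats"], "and", True))
```

With the examples from this comment, the first call produces
color:(blue OR red) and the second produces
options:(GPS AND "voice-activated bluetooth" AND "heated seats").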
So, the vernacular usage of "and" and "or" depends on the context. For
single-valued fields they are synonyms, but for multi-valued fields they are
antonyms. The context also depends on whether we are talking about one thing or
a group of things - "red and blue cars" has a different meaning from "a red and
blue car". The latter must have a custom paint job so that it has both red and
blue. This problem is not solved currently because it requires a more
semantically intelligent stemming operation. For now, I assume that the user is
looking for a set of things, as is most often the case with search.
> Add a QueryAutofilteringComponent for query introspection using indexed
> metadata
> --------------------------------------------------------------------------------
>
> Key: SOLR-7539
> URL: https://issues.apache.org/jira/browse/SOLR-7539
> Project: Solr
> Issue Type: New Feature
> Reporter: Ted Sullivan
> Priority: Minor
> Fix For: Trunk
>
> Attachments: SOLR-7539.patch
>
>
> The Query Autofiltering Component provides a method of inferring user intent
> by matching noun phrases that are typically used for faceted-navigation into
> Solr filter or boost queries (depending on configuration settings) so that
> more precise user queries can be met with more precise results.
> The algorithm uses a "longest contiguous phrase match" strategy which allows
> it to disambiguate queries where single terms are ambiguous but phrases are
> not. It will work when there is structured information in the form of String
> fields that are normally used for faceted navigation. It works across fields
> by building a map of search term to index field using the Lucene FieldCache
> (UninvertingReader). This enables users to create free text, multi-term
> queries that combine attributes across facet fields - as if they had searched
> and then navigated through several facet layers. To address the problem of
> exact-match only semantics of String fields, support for synonyms (including
> multi-term synonyms) and stemming was added.
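
As a rough illustration of the "longest contiguous phrase match" strategy
mentioned in the issue description, the sketch below scans the query tokens
and, at each position, prefers the longest phrase that appears in a
term-to-field map. The function name, the map contents and the field names are
made up for the example; the actual component builds its map from the index
and also handles synonyms and stemming, which are left out here.

```python
def longest_phrase_matches(tokens, term_to_field):
    """Return (phrase, field) pairs, preferring the longest matching phrase.

    Hypothetical sketch; `term_to_field` stands in for the map of indexed
    values to field names that the component would build from the index.
    """
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate starting at position i first.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in term_to_field:
                matches.append((phrase, term_to_field[phrase]))
                i = j  # continue after the matched phrase
                break
        else:
            i += 1  # no phrase starts here; skip this token
    return matches


# Made-up map: "red" alone is a color, but "red hat" is a brand.
term_to_field = {"red": "color", "red hat": "brand", "linux": "os"}
print(longest_phrase_matches("red hat linux".split(), term_to_field))
```

Here "red hat" is matched as a whole phrase, so the single term "red" is not
misread as a color.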
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)