[ 
https://issues.apache.org/jira/browse/SOLR-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011006#comment-15011006
 ] 

Ted Sullivan commented on SOLR-7539:
------------------------------------

I have just uploaded a new patch that adds what I call "verb support" to the 
autofilter. This lets you specify terms that constrain the autofilter's field 
choices. The example I have been using is a Music Ontology that demonstrates 
the features of the query autofilter. Suppose I have records of musicians, 
songwriters, songs, etc., with fields like performer_ss, composer_ss, 
composition_type_s and so on, and I search for

         songs written by Johnny Cash
vs
         songs performed by Johnny Cash

Without verb support, the autofilter would pick up composition_type_s:Song 
and (performer_ss:"Johnny Cash" OR composer_ss:"Johnny Cash") for both 
queries, because this artist has documents in which he is a performer, a 
songwriter, or both. That is, both queries would return the same results 
because neither 'written' nor 'performed' is a value in any document field.

Adding a configuration like this, along with some supporting code, fixes that:

    <searchComponent name="autofilter"
                     class="org.apache.solr.handler.component.QueryAutoFilteringComponent">

      <arr name="verbModifiers">
        <str>written,wrote,composed:composer_ss</str>
        <str>performed,played,sang,recorded:performer_ss</str>
      </arr>
    </searchComponent>

The above queries work as expected. The code detects the presence of a 
modifier in proximity to a term that occurs in the field mapped to that 
modifier (for 'written' that would be composer_ss) and then collapses the 
choices to that field alone, so that

(composer_ss:"Johnny Cash" OR performer_ss:"Johnny Cash")

becomes just composer_ss:"Johnny Cash" when the verb is 'written' and 
performer_ss:"Johnny Cash" when the verb is 'performed'.
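
As a rough illustration of the collapsing step, here is a minimal Python 
sketch. It is not the patch's Java code: the function names are hypothetical, 
the only thing taken from the patch is the "verb1,verb2:field" entry format 
shown in the verbModifiers configuration, and the sketch ignores the 
proximity check and simply looks for a configured verb anywhere in the query.

```python
# Illustrative sketch only; not the patch's Java implementation.
# Entry format "verb1,verb2:field" comes from the verbModifiers config;
# every function name below is a hypothetical helper.

def parse_verb_modifiers(entries):
    """Map each configured verb to the field it selects."""
    verb_to_field = {}
    for entry in entries:
        verbs, field = entry.rsplit(":", 1)
        for verb in verbs.split(","):
            verb_to_field[verb.strip().lower()] = field.strip()
    return verb_to_field


def collapse_fields(query, candidate_fields, verb_to_field):
    """Keep only the candidate field selected by a verb in the query.

    If no configured verb is present, or the verb's field is not one of
    the candidates, the candidates are returned unchanged.
    Assumption: proximity between verb and entity is not checked here.
    """
    for token in query.lower().split():
        field = verb_to_field.get(token)
        if field in candidate_fields:
            return [field]
    return list(candidate_fields)


def to_filter(fields, value):
    """Render the remaining fields as an OR of phrase clauses."""
    clauses = ['%s:"%s"' % (f, value) for f in fields]
    if len(clauses) == 1:
        return clauses[0]
    return "(" + " OR ".join(clauses) + ")"


modifiers = parse_verb_modifiers([
    "written,wrote,composed:composer_ss",
    "performed,played,sang,recorded:performer_ss",
])
candidates = ["composer_ss", "performer_ss"]

written = to_filter(
    collapse_fields("songs written by Johnny Cash", candidates, modifiers),
    "Johnny Cash")
performed = to_filter(
    collapse_fields("songs performed by Johnny Cash", candidates, modifiers),
    "Johnny Cash")
no_verb = to_filter(
    collapse_fields("songs by Johnny Cash", candidates, modifiers),
    "Johnny Cash")
```

With no configured verb in the query, the sketch leaves both fields in place, 
which corresponds to the pre-patch behavior described above.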

In addition, noun phrases composed of two different nouns, in which one acts 
as a qualifier of the other (as in "Beatles Songs"), are handled with this 
configuration:

<str>covered,covers:performer_ss|version_s:Cover|original_performer_s:_ENTITY_,recording_type_ss:Song=>original_performer_s:_ENTITY_</str>

In this case, "Beatles Songs" is a single noun phrase that refers to songs 
written by one or more of the Beatles. With this configuration and supporting 
code, we can now distinguish between queries like:

    "Beatles Songs covered" - covers of Beatles songs by other artists
    "songs Beatles covered" - songs performed by the Beatles that were 
                              written by other songwriters

Two test cases have been added to the patch to demonstrate these new features.


> Add a QueryAutofilteringComponent for query introspection using indexed 
> metadata
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-7539
>                 URL: https://issues.apache.org/jira/browse/SOLR-7539
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ted Sullivan
>            Priority: Minor
>             Fix For: Trunk
>
>         Attachments: SOLR-7539.patch, SOLR-7539.patch, SOLR-7539.patch, 
> SOLR-7539.patch
>
>
> The Query Autofiltering Component provides a method of inferring user intent 
> by mapping noun phrases that are typically used for faceted navigation into 
> Solr filter or boost queries (depending on configuration settings) so that 
> more precise user queries can be met with more precise results.
> The algorithm uses a "longest contiguous phrase match" strategy which allows 
> it to disambiguate queries where single terms are ambiguous but phrases are 
> not. It will work when there is structured information in the form of String 
> fields that are normally used for faceted navigation. It works across fields 
> by building a map of search term to index field using the Lucene FieldCache 
> (UninvertingReader). This enables users to create free text, multi-term 
> queries that combine attributes across facet fields - as if they had searched 
> and then navigated through several facet layers. To address the problem of 
> exact-match only semantics of String fields, support for synonyms (including 
> multi-term synonyms) and stemming was added. 
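
The "longest contiguous phrase match" strategy described above can be 
sketched roughly as follows. This is only an assumption-laden illustration in 
Python, not the component's code: the value-to-field map is hypothetical 
sample data standing in for the map built from the index, and synonyms and 
stemming are left out.

```python
# Illustrative sketch only; value_to_fields is hypothetical sample data
# standing in for the term-to-field map built from the index.

def autofilter(query, value_to_fields):
    """Greedy longest-contiguous-phrase match over the query tokens.

    Returns (filters, leftover): filters is a list of (phrase, fields)
    pairs, leftover holds the tokens that matched no field value.
    """
    tokens = query.lower().split()
    filters, leftover = [], []
    i = 0
    while i < len(tokens):
        # Try the longest phrase starting at position i first.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in value_to_fields:
                filters.append((phrase, value_to_fields[phrase]))
                i = j
                break
        else:
            # No phrase starting here matched; keep the token as free text.
            leftover.append(tokens[i])
            i += 1
    return filters, leftover


value_to_fields = {
    "red": ["color_s"],
    "red hot chili peppers": ["performer_ss"],
    "song": ["composition_type_s"],
    "johnny cash": ["performer_ss", "composer_ss"],
}

band_filters, band_leftover = autofilter(
    "Red Hot Chili Peppers song", value_to_fields)
cash_filters, cash_leftover = autofilter(
    "song by Johnny Cash", value_to_fields)
```

Because the longest phrase is tried first, "red" is never matched on its own 
in the first query, even though it is a valid value of another field in this 
sample map.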



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

