[ 
https://issues.apache.org/jira/browse/SOLR-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252185#comment-13252185
 ] 

Mike commented on SOLR-3099:
----------------------------

From my perspective as the OP, my goals for this feature are simple:
 - I want my users to have an exact match operator
 - I want my users to have it work in an identical fashion to regular terms
 - I want highlighting to work without a bunch of extra coding effort

I just sketched out various ways to lay out the index structures, and 
performance-wise I think I agree with Robert that it makes the most sense not 
to change the way we index things, and instead to ask people to keep a 
stemmed and an unstemmed index that they can query against. The arguments 
against that are: (1) if we introduce a query operator for exact match (which 
I think is vital for this request), it'd be awkward to have an operator 
determine which index is queried; and (2) I can't imagine how highlighting 
would work with such a configuration.

Also, I *don't* think we can ask users to do queries like [ unstemmedBody:foo 
]. Two reasons:
 1. That sucks, and users probably won't use it.
 2. Highlighting will break. 
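
To make the alternative concrete, here is a minimal sketch of a query operator 
that picks the field on the user's behalf, so nobody has to type a field name. 
It is an illustration only, not edismax code: the "=" prefix follows the Sphinx 
convention from the issue description, "unstemmedBody" is the field name used 
above, and "body" is an assumed name for the stemmed field.

```python
# Assumed field names: "unstemmedBody" is taken from the example above,
# "body" is a hypothetical name for the stemmed field.
STEMMED_FIELD = "body"
UNSTEMMED_FIELD = "unstemmedBody"


def rewrite_query(user_query):
    """Turn each whitespace-separated term into a fielded clause.

    Terms prefixed with "=" go to the unstemmed field; all other terms
    go to the stemmed field.
    """
    clauses = []
    for term in user_query.split():
        if term.startswith("=") and len(term) > 1:
            clauses.append(f"{UNSTEMMED_FIELD}:{term[1:]}")
        else:
            clauses.append(f"{STEMMED_FIELD}:{term}")
    return " ".join(clauses)


print(rewrite_query("=runs fast"))  # unstemmedBody:runs body:fast
```

This hides the field name from users, but the exact-match clause still hits a 
different field than the one being displayed, so the highlighting concern is 
untouched.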

Anyway, hopefully this is helpful...maybe you guys have already thought through 
this. But from my perspective, we have two options for implementing this:
 1. Don't change index structures at the risk of having a query operator change 
which index a query goes against, probably breaking highlighting; or
 2. Change the index structures so they can have the unstemmed tokens at the 
risk of added complexity in the index and a possible performance impact.
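
For what it's worth, option 2 can be sketched as a toy inverted index that 
stores the original token next to the stemmed one at the same position. This is 
only an illustration of the idea, not how Solr or Lucene would implement it: 
the "=" marker on exact tokens, the toy_stem function, and the dictionary 
layout are all assumptions made up for the example.

```python
from collections import defaultdict

EXACT_PREFIX = "="  # hypothetical marker that tags unstemmed tokens


def toy_stem(word):
    # Stand-in stemmer for illustration only: strips one trailing "s".
    return word[:-1] if word.endswith("s") and len(word) > 2 else word


def build_index(docs):
    """Map token -> {doc_id: [positions]} for a {doc_id: text} dict.

    Each word is indexed twice at the same position: once stemmed and
    once in its original form behind EXACT_PREFIX.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[toy_stem(word)][doc_id].append(pos)
            index[EXACT_PREFIX + word][doc_id].append(pos)
    return index


def search(index, term):
    """Return {doc_id: [positions]}; "=term" asks for the exact form."""
    term = term.lower()
    key = term if term.startswith(EXACT_PREFIX) else toy_stem(term)
    postings = index.get(key, {})
    return {doc_id: list(positions) for doc_id, positions in postings.items()}


docs = {1: "she runs daily", 2: "a long run"}
index = build_index(docs)
print(search(index, "run"))    # {1: [1], 2: [2]}
print(search(index, "=runs"))  # {1: [1]}
```

Because both forms share one field and one set of positions, a highlighter 
could in principle use the same offsets for either kind of match. The cost is 
roughly double the postings, which is the complexity and performance risk 
mentioned above.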

As far as the analyzer goes, I don't have a horse in the race, provided it 
works. Seems like adding an exact match operator to it shouldn't be terribly 
hard, but I haven't delved into it myself.


                
> Add query operator, index structure, and analyzer for "exact match" searching
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-3099
>                 URL: https://issues.apache.org/jira/browse/SOLR-3099
>             Project: Solr
>          Issue Type: Sub-task
>          Components: Schema and Analysis
>            Reporter: Mike
>             Fix For: 4.0
>
>
> A project I'm working on requires *exact match* searching with stemming 
> turned off. The users are accustomed to Sphinx search, and thus expect a 
> query like [ =runs ] to return only documents that contain the exact term, 
> "runs", and not the stemmed word "run".
> In SOLR-2866, there is similar work, but I believe it is different because it 
> uses a huge synonym file rather than storing the original terms directly in 
> the index. 
> What I'd like instead is two things:
> 1. An analyzer that says, "store the original form of all words in the index 
> along with the stemmed variations." If necessary, it's fine if this is simply 
> an unstemmed field, but that seems cumbersome schema-wise and 
> performance-wise.
> 2. An operator in edismax that allows users to query the exact form of the 
> word. Sphinx uses the equals sign (=), and that makes sense logically to me.
> This issue is part of a meta issue, SOLR-3028, that is requesting two other 
> operators in edismax (quorum search and word order).
