[
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869323#comment-13869323
]
Naomi Dushay commented on SOLR-2649:
------------------------------------
I believe the changes Andrew is suggesting sound good. I recently made careful
improvements to our CJK Resource discovery (I'm in the midst of blogging about
it), and in combing through our logs of the last few days, I pulled out a few
actual use cases where we have CJK characters and "OR":
鈴木重雄 OR 日本精神生成史論
毛澤東 OR 基礎戰
日報 OR 濟南
飄 OR 上海
There are others. Note that we have no actual cases of CJK + non-CJK
characters and 'OR'.
In my relevancy tests for CJK (supplied by East Asian language librarians), I
didn't find many useful examples to exercise the case above. I could try to
apply a patch locally and check how it affects our ~1000 relevancy tests, but
we are currently running Solr 4.4. It would be much more tractable if a
Solr 4.x patch were available for testing.
Here is the only realistic example I could find in our test code:
スポーツ OR supotsu
Both clauses translate to "sports" (from Japanese).
So from my perspective, the cjk test is a corner case, and I think Andrew's
approach sounds great. Tom Burton-West and I are partly behind Robert Muir's
fix, so getting Tom BW to weigh in would be great.
> MM ignored in edismax queries with operators
> --------------------------------------------
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
> Issue Type: Bug
> Components: query parsers
> Reporter: Magnus Bergmark
> Priority: Minor
> Fix For: 4.7
>
>
> Hypothetical scenario:
> 1. User searches for "stocks oil gold" with MM set to "50%"
> 2. User adds "-stockings" to the query: "stocks oil gold -stockings"
> 3. User gets no hits since MM was ignored and all terms were AND-ed
> together
> The behavior seems to be intentional, although the reason why is never
> explained:
> // For correct lucene queries, turn off mm processing if there
> // were explicit operators (except for AND).
> boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0;
> (lines 232-234 taken from
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the
> primary features of dismax.
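
To make the quoted rule concrete, here is a minimal sketch of how that check behaves on the queries mentioned in this thread. It is not the parser's actual code: the class name `MmOperatorCheck` and the whitespace-based token counting are assumptions made for illustration, and the real ExtendedDismaxQParserPlugin counts operators while parsing clauses. Only the final boolean expression is taken from the quoted source.

```java
public class MmOperatorCheck {

    /**
     * Hypothetical helper: returns true when mm would still be applied
     * under the quoted rule, i.e. when the query contains no explicit
     * OR, NOT, '+' or '-' operators. AND is deliberately not counted.
     * Assumption: operators are detected by naive whitespace splitting.
     */
    static boolean doMinMatched(String query) {
        int numOR = 0, numNOT = 0, numPluses = 0, numMinuses = 0;
        for (String token : query.trim().split("\\s+")) {
            if (token.equals("OR")) {
                numOR++;
            } else if (token.equals("NOT")) {
                numNOT++;
            } else if (token.length() > 1 && token.startsWith("+")) {
                numPluses++;
            } else if (token.length() > 1 && token.startsWith("-")) {
                numMinuses++;
            }
        }
        // Same expression as in the quoted source lines.
        return (numOR + numNOT + numPluses + numMinuses) == 0;
    }

    public static void main(String[] args) {
        System.out.println(doMinMatched("stocks oil gold"));            // true
        System.out.println(doMinMatched("stocks oil gold -stockings")); // false
        System.out.println(doMinMatched("毛澤東 OR 基礎戰"));              // false
    }
}
```

Under this sketch, adding "-stockings" or an explicit "OR" flips the result to false, which corresponds to the situation in the report where mm is no longer applied.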
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)