[
https://issues.apache.org/jira/browse/SOLR-10363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943097#comment-15943097
]
Eyyub ÇİL commented on SOLR-10363:
----------------------------------
I ran some tests across the master branch history and found the commit that
fixed the problem.
*2172f3e0081e3e59ce8b02c2bb5654a592f79f74* (Shalin Shekhar Mangar, 27/10/2016)
still has the problem, but my test *passed* on
*af60048097a83220aae135b09d209a0f2d4ba3c6* (Adrien Grand, 27/10/2016 at 17:27,
LUCENE-7429).
The fix versions of [LUCENE-7429|https://issues.apache.org/jira/browse/LUCENE-7429]
are master (7.0) and 6.3,
but I use 6.4.1 in production and still see the problem. I'll dig deeper to
find the reason.
> ComplexPhrase WildCard Case Sensitivy problem
> ---------------------------------------------
>
> Key: SOLR-10363
> URL: https://issues.apache.org/jira/browse/SOLR-10363
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 6.4.1, 6.5, 6.4.2
> Reporter: Eyyub ÇİL
> Labels: CPQP, analyzers, complexqueryparser, filterfactory,
> wildcard
> Attachments: complexPhraseWildCardBug.zip,
> TestComplexPhraseTurkishECIL.java, TestComplexPhraseTurkish.java
>
>
> I ran into a problem with ComplexPhrase and TurkishLowerCaseFilterFactory.
> When I search with {!complexphrase}SContent_tinx:"6* YAŞINDA" or
> {!complexphrase}SContent_tinx:"6* yaşında", the results are correct.
> {code}
> {
> "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":104,
> "params":{
> "q":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
> "debug":"query",
> "indent":"on",
> "rows":"0",
> "wt":"json",
> "_":"1490456571184"}},
> "response":{"numFound":15,"start":0,"maxScore":5972.9,"docs":[]
> },
> "debug":{
> "rawquerystring":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
> "querystring":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
> "parsedquery":"ComplexPhraseQuery(\"6* YAŞINDa\")",
> "parsedquery_toString":"\"6* YAŞINDa\"",
> "QParser":"ComplexPhraseQParser"}}
> {code}
> But if I search with {!complexphrase}SContent_tinx:"60 YAŞIND*" or
> {!complexphrase}SContent_tinx:"60 yaşınd*", numFound is 0 or the result
> changes depending on the letter case of the query.
> {code}
> {
> "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":10,
> "params":{
> "q":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
> "debug":"query",
> "indent":"on",
> "rows":"0",
> "wt":"json",
> "_":"1490456571184"}},
> "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
> },
> "debug":{
> "rawquerystring":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
> "querystring":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
> "parsedquery":"ComplexPhraseQuery(\"60 YAŞIND*\")",
> "parsedquery_toString":"\"60 YAŞIND*\"",
> "QParser":"ComplexPhraseQParser"}}
> {code}
> When I search without ComplexPhraseQueryParser, the number of results is the
> same for +60 YAŞIND*+ and +60 yaşınd*+.
> {code}
> {
> "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":56,
> "params":{
> "q":"60 yaŞınd*",
> "indent":"on",
> "rows":"0",
> "wt":"json",
> "debugQuery":"on",
> "_":"1490456571184"}},
> "response":{"numFound":776,"start":0,"maxScore":7.633286,"docs":[]
> },
> "debug":{
> "rawquerystring":"60 yaŞınd*",
> "querystring":"60 yaŞınd*",
> "parsedquery":"SContent_tinx:60 SContent_tinx:yaşınd*",
> "parsedquery_toString":"SContent_tinx:60 SContent_tinx:yaşınd*",
> "QParser":"LuceneQParser",
> "explain":{}}}
> {code}
> {code:xml}
> <fieldType name="text_general" class="solr.TextField"
> positionIncrementGap="100">
> <analyzer type="index">
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.ApostropheFilterFactory"/>
> <filter class="solr.PatternReplaceFilterFactory"
> pattern="[^a-zA-Z0-9üğşçıiöâÜĞŞÇIİÖÂ@# ]" replacement=" "
> replace="all"/>
> <filter class="solr.TrimFilterFactory"/>
> <filter class="solr.TurkishLowerCaseFilterFactory"/>
> <filter class="solr.ReversedWildcardFilterFactory" />
> <!-- <filter class="solr.ReversedWildcardFilterFactory"
> withOriginal="true"
> maxPosAsterisk="3" maxPosQuestion="2"
> maxFractionAsterisk="0.33"/>-->
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.ApostropheFilterFactory"/>
> <filter class="solr.PatternReplaceFilterFactory"
> pattern="[^a-zA-Z0-9üğşçıiöâÜĞŞÇIİÖÂ@# ]" replacement=" "
> replace="all"/>
> <filter class="solr.TrimFilterFactory"/>
> <filter class="solr.TurkishLowerCaseFilterFactory"/>
> </analyzer>
> </fieldType>
> {code}
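
The case-dependent results above would be consistent with the wildcard term reaching the index without Turkish lowercasing, although the thread does not confirm this as the cause. As a rough illustration only, the sketch below uses plain JDK calls to show how Turkish lowercasing differs from locale-neutral lowercasing for the term in the report, and how a client could lowercase the phrase text before building the query. The class name and the lowerTurkish and buildQuery helpers are hypothetical, and JDK lowercasing is assumed (not verified) to agree with TurkishLowerCaseFilterFactory for this input.

```java
import java.util.Locale;

public class TurkishLowerCaseDemo {
    // Locale with language "tr"; the JDK applies Turkish casing rules for it.
    private static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Hypothetical helper: lowercase only the phrase text with Turkish rules.
    static String lowerTurkish(String text) {
        return text.toLowerCase(TURKISH);
    }

    // Hypothetical helper: build a complexphrase query for the field from the
    // report. The field name is left untouched; only the phrase is lowercased.
    static String buildQuery(String field, String phrase) {
        return "{!complexphrase}" + field + ":\"" + lowerTurkish(phrase) + "\"";
    }

    public static void main(String[] args) {
        // "YA" + S-cedilla (U+015E) + "IND", written with escapes so the
        // source stays ASCII-only.
        String term = "YA\u015EIND";

        // Locale-neutral rules map capital I to dotted i.
        String neutral = term.toLowerCase(Locale.ROOT);
        // Turkish rules map capital I to dotless i (U+0131).
        String turkish = lowerTurkish(term);

        System.out.println("same result: " + neutral.equals(turkish));
        System.out.println(buildQuery("SContent_tinx", "60 " + term + "*"));
    }
}
```

With this approach the query sent to Solr already contains the lowercase form that the index analyzer produced, so the outcome no longer depends on whether the parser analyzes wildcard terms. It is a client-side workaround sketch, not a fix for the parser itself.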