Eyyub ÇİL created SOLR-10363:
--------------------------------
Summary: ComplexPhrase WildCard Case Sensitivity problem
Key: SOLR-10363
URL: https://issues.apache.org/jira/browse/SOLR-10363
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: query parsers
Affects Versions: 6.4.1
Reporter: Eyyub ÇİL
I have encountered a problem with ComplexPhrase and TurkishLowerCaseFilterFactory.
When I search for +{!complexphrase}SContent_tinx:"6* YAŞINDA"+ or
+{!complexphrase}SContent_tinx:"6* yaşında"+, the results are correct.
{code}
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":104,
"params":{
"q":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
"debug":"query",
"indent":"on",
"rows":"0",
"wt":"json",
"_":"1490456571184"}},
"response":{"numFound":15,"start":0,"maxScore":5972.9,"docs":[]
},
"debug":{
"rawquerystring":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
"querystring":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
"parsedquery":"ComplexPhraseQuery(\"6* YAŞINDa\")",
"parsedquery_toString":"\"6* YAŞINDa\"",
"QParser":"ComplexPhraseQParser"}}
{code}
But if I search for +{!complexphrase}SContent_tinx:"60 YAŞIND*"+ or
+{!complexphrase}SContent_tinx:"60 yaşınd*"+, numFound is 0 or the result changes
depending on the letter case of the query.
{code}
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":10,
"params":{
"q":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
"debug":"query",
"indent":"on",
"rows":"0",
"wt":"json",
"_":"1490456571184"}},
"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
},
"debug":{
"rawquerystring":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
"querystring":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
"parsedquery":"ComplexPhraseQuery(\"60 YAŞIND*\")",
"parsedquery_toString":"\"60 YAŞIND*\"",
"QParser":"ComplexPhraseQParser"}}
{code}
When I search without ComplexPhraseQueryParser, the number of results is the same for
+60 YAŞIND*+ and +60 yaşınd*+.
{code}
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":56,
"params":{
"q":"60 yaŞınd*",
"indent":"on",
"rows":"0",
"wt":"json",
"debugQuery":"on",
"_":"1490456571184"}},
"response":{"numFound":776,"start":0,"maxScore":7.633286,"docs":[]
},
"debug":{
"rawquerystring":"60 yaŞınd*",
"querystring":"60 yaŞınd*",
"parsedquery":"SContent_tinx:60 SContent_tinx:yaşınd*",
"parsedquery_toString":"SContent_tinx:60 SContent_tinx:yaşınd*",
"QParser":"LuceneQParser",
"explain":{}}}
{code}
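To make the comparison above easier to repeat, the request URLs for the four complexphrase variants can be generated with a short script. This is only a sketch: the host, port and collection name in BASE_URL are hypothetical placeholders (they are not part of this report), and the script only builds the URLs, it does not send them.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are assumptions.
BASE_URL = "http://localhost:8983/solr/mycollection/select"

# The four phrase variants compared in this report.
PHRASES = ["6* YAŞINDA", "6* yaşında", "60 YAŞIND*", "60 yaşınd*"]


def build_select_url(phrase):
    """Return a select URL that runs the phrase through {!complexphrase}."""
    params = {
        "q": '{!complexphrase}SContent_tinx:"%s"' % phrase,
        "rows": 0,          # only numFound is of interest
        "wt": "json",
        "debug": "query",   # include the parsed query in the response
    }
    # urlencode percent-encodes the query as UTF-8.
    return BASE_URL + "?" + urlencode(params)


urls = [build_select_url(p) for p in PHRASES]
for phrase, url in zip(PHRASES, urls):
    print(phrase, "->", url)
```

Comparing numFound across the four responses shows whether the position of the wildcard and the case of the term change the result.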
{code:xml}
<fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ApostropheFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory"
pattern="[^a-zA-Z0-9üğşçıiöâÜĞŞÇIİÖÂ@# ]" replacement=" "
replace="all"/>
<filter class="solr.TrimFilterFactory"/>
<filter class="solr.TurkishLowerCaseFilterFactory"/>
<filter class="solr.ReversedWildcardFilterFactory" />
<!-- <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>-->
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ApostropheFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory"
pattern="[^a-zA-Z0-9üğşçıiöâÜĞŞÇIİÖÂ@# ]" replacement=" "
replace="all"/>
<filter class="solr.TrimFilterFactory"/>
<filter class="solr.TurkishLowerCaseFilterFactory"/>
</analyzer>
</fieldType>
{code}
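The sketch below illustrates why the case of the wildcard term matters for this field. The index analyzer above lowercases with Turkish rules, so the indexed terms contain the dotless "ı". A prefix that is left in upper case, or lowercased with the default rules, does not match those terms. Whether this is what happens inside the ComplexPhrase parser is an assumption, not something the debug output confirms. The function turkish_lower is a simplified, hypothetical stand-in for the filter, not its real implementation.

```python
def turkish_lower(text):
    """Simplified stand-in (assumption) for Turkish lowercasing.

    Maps dotless capital I to dotless ı and dotted capital İ to i
    before applying the generic lowercase to the remaining letters.
    """
    return text.replace("I", "ı").replace("İ", "i").lower()


def default_lower(text):
    """Locale-independent lowercasing: capital I becomes dotted i."""
    return text.lower()


prefix = "YAŞIND"  # wildcard prefix from the failing query
print(turkish_lower(prefix))   # Turkish rules keep the dotless ı
print(default_lower(prefix))   # default rules produce a dotted i
print(turkish_lower(prefix) == default_lower(prefix))
```

The two results differ in a single character (ı versus i), which is enough for a prefix query to return 0 hits.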