[ 
https://issues.apache.org/jira/browse/SOLR-10363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eyyub ÇİL updated SOLR-10363:
-----------------------------
    Attachment: complexPhraseWildCardBug.zip
                TestComplexPhraseTurkishECIL.java

Hi Mikhail, thanks for your comment. Following it, I ran my test against 
several releases: 6.4.1, 6.4.2, 6.5.0, and master (commit 
013601f05396523ad900a409e67cdbea19571447). The test passes only on master.

I attached TestComplexPhraseTurkishECIL.java and the test results from the 
different releases.

Does anyone know which commit fixed this problem?




> ComplexPhrase WildCard Case Sensitivity problem
> -----------------------------------------------
>
>                 Key: SOLR-10363
>                 URL: https://issues.apache.org/jira/browse/SOLR-10363
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.4.1
>            Reporter: Eyyub ÇİL
>              Labels: CPQP, analyzers, complexqueryparser, filterfactory, 
> wildcard
>         Attachments: complexPhraseWildCardBug.zip, 
> TestComplexPhraseTurkishECIL.java, TestComplexPhraseTurkish.java
>
>
> I encountered a problem with ComplexPhrase and TurkishLowerCaseFilterFactory.
> When I search for {!complexphrase}SContent_tinx:"6* YAŞINDA" or 
> {!complexphrase}SContent_tinx:"6* yaşında", the results are correct.
> {code}
> {
>   "responseHeader":{
>     "zkConnected":true,
>     "status":0,
>     "QTime":104,
>     "params":{
>       "q":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
>       "debug":"query",
>       "indent":"on",
>       "rows":"0",
>       "wt":"json",
>       "_":"1490456571184"}},
>   "response":{"numFound":15,"start":0,"maxScore":5972.9,"docs":[]
>   },
>   "debug":{
>     "rawquerystring":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
>     "querystring":"{!complexphrase}SContent_tinx:\"6* YAŞINDa\"",
>     "parsedquery":"ComplexPhraseQuery(\"6* YAŞINDa\")",
>     "parsedquery_toString":"\"6* YAŞINDa\"",
>     "QParser":"ComplexPhraseQParser"}}
> {code}
> But if I search for {!complexphrase}SContent_tinx:"60 YAŞIND*" or 
> {!complexphrase}SContent_tinx:"60 yaşınd*", numFound is 0 or the result 
> changes depending on the letter case of the query.
> {code}
> {
>   "responseHeader":{
>     "zkConnected":true,
>     "status":0,
>     "QTime":10,
>     "params":{
>       "q":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
>       "debug":"query",
>       "indent":"on",
>       "rows":"0",
>       "wt":"json",
>       "_":"1490456571184"}},
>   "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
>   },
>   "debug":{
>     "rawquerystring":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
>     "querystring":"{!complexphrase}SContent_tinx:\"60 YAŞIND*\"",
>     "parsedquery":"ComplexPhraseQuery(\"60 YAŞIND*\")",
>     "parsedquery_toString":"\"60 YAŞIND*\"",
>     "QParser":"ComplexPhraseQParser"}}
> {code}
> When I search without the ComplexPhraseQueryParser, the number of results is 
> the same for +60 YAŞIND*+ and +60 yaşınd*+.
> {code}
> {
>   "responseHeader":{
>     "zkConnected":true,
>     "status":0,
>     "QTime":56,
>     "params":{
>       "q":"60 yaŞınd*",
>       "indent":"on",
>       "rows":"0",
>       "wt":"json",
>       "debugQuery":"on",
>       "_":"1490456571184"}},
>   "response":{"numFound":776,"start":0,"maxScore":7.633286,"docs":[]
>   },
>   "debug":{
>     "rawquerystring":"60 yaŞınd*",
>     "querystring":"60 yaŞınd*",
>     "parsedquery":"SContent_tinx:60 SContent_tinx:yaşınd*",
>     "parsedquery_toString":"SContent_tinx:60 SContent_tinx:yaşınd*",
>     "QParser":"LuceneQParser",
>     "explain":{}}}
> {code}
> {code:xml}
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.ApostropheFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="[^a-zA-Z0-9üğşçıiöâÜĞŞÇIİÖÂ@# ]" replacement=" " replace="all"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.TurkishLowerCaseFilterFactory"/>
>     <filter class="solr.ReversedWildcardFilterFactory"/>
>     <!-- <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
>                  maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/> -->
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.ApostropheFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="[^a-zA-Z0-9üğşçıiöâÜĞŞÇIİÖÂ@# ]" replacement=" " replace="all"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.TurkishLowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
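
For readers unfamiliar with Turkish casing, the following is a minimal, 
self-contained sketch (plain JDK, no Solr or Lucene classes) of why the case 
of a wildcard term can matter with the field type quoted above. It assumes 
that index-time terms are lowercased with Turkish rules, as 
TurkishLowerCaseFilterFactory does, and compares three ways a trailing-wildcard 
prefix such as YAŞIND* might be handled at query time. The class and method 
names (TurkishWildcardSketch, prefixMatches, and so on) are hypothetical, and 
the sketch does not claim to reproduce what ComplexPhraseQueryParser actually 
does in the affected releases.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class TurkishWildcardSketch {
    private static final Locale TURKISH = Locale.forLanguageTag("tr");

    /** Lowercases with Turkish rules: 'I' becomes dotless 'ı'. */
    static String turkishLower(String s) {
        return s.toLowerCase(TURKISH);
    }

    /** Lowercases with locale-independent rules: 'I' becomes dotted 'i'. */
    static String rootLower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    /** Returns the indexed terms that start with the prefix (a stand-in for "prefix*"). */
    static List<String> prefixMatches(List<String> indexedTerms, String prefix) {
        return indexedTerms.stream()
                .filter(term -> term.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumption: terms in the index were lowercased with Turkish rules.
        List<String> indexed = Arrays.asList(turkishLower("60"), turkishLower("YAŞINDA"));

        String prefix = "YAŞIND"; // taken from the query "60 YAŞIND*"

        // Case 1: the wildcard prefix is not normalized at all.
        System.out.println("raw:     " + prefixMatches(indexed, prefix));
        // Case 2: the prefix is lowercased without Turkish rules.
        System.out.println("root:    " + prefixMatches(indexed, rootLower(prefix)));
        // Case 3: the prefix is lowercased with the same rules as the index.
        System.out.println("turkish: " + prefixMatches(indexed, turkishLower(prefix)));
    }
}
```

In this sketch only the third case finds the indexed term, because the 
uppercase prefix and its locale-independent lowercase form both differ from 
the Turkish-lowercased term in the letter I. Whether the releases listed above 
behave like case 1 or case 2 for wildcard terms is not established here.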



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
