[
https://issues.apache.org/jira/browse/SOLR-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dalius updated SOLR-3106:
-------------------------
Description:
Sorry for the inaccurate title.
I have three fields (dc_title, dc_title_unicode, dc_unicode_full) containing the
same value:
{code}
<title xmlns="http://www.tei-c.org/ns/1.0">cal•lígraf</title>
{code}
These fields are configured as follows:
{code}
<fieldType name="xml" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="xml_unicode" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<fieldType name="xml_unicode_full" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}
And finally my search configuration:
{code}
<requestHandler name="dictionary" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">all</str>
    <str name="defType">edismax</str>
    <str name="mm">2&lt;-25%</str>
    <str name="qf">dc_title_unicode_full^2 dc_title_unicode^2 dc_title</str>
    <int name="rows">10</int>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
{code}
I am trying to match the field with several search phrases, all of which are
valid. Here are the results:
|| # || search phrase || match? ||
| 1 | cal•lígra? | (/) |
| 2 | cal•ligra? | (x) |
| 3 | cal•ligraf | (/) |
| 4 | calligra? | (/) |
The problem is attempt #2, which fails to match the data. Attempt #3, which
replaces ? with f, works.
One more thing: if * is used instead of ?, other data such as cal•lígrafia is
matched, but not cal•lígraf...
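For anyone trying to reproduce the table above, here is a minimal sketch that builds the four request URLs with the search phrase percent-encoded as UTF-8, matching the Tomcat request encoding listed in the environment. The base URL is an assumption (host, port and path are not given in this report), as is selecting the handler with the qt parameter and adding debugQuery to see the parsed query.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the report names only the "dictionary" handler,
# not the host, port or webapp path.
BASE_URL = "http://localhost:8080/solr/select"

# The four search phrases from the results table, in order.
PHRASES = ["cal•lígra?", "cal•ligra?", "cal•ligraf", "calligra?"]


def build_query_url(phrase, base_url=BASE_URL):
    """Return a request URL whose q parameter is percent-encoded as UTF-8."""
    params = {
        "qt": "dictionary",    # assumed way of selecting the handler
        "q": phrase,
        "debugQuery": "true",  # assumed; used to inspect the parsed query
    }
    return base_url + "?" + urlencode(params, encoding="utf-8")


if __name__ == "__main__":
    for number, phrase in enumerate(PHRASES, start=1):
        print(number, build_query_url(phrase))
```

Comparing the parsed query in the debug output for #1 and #2 should show how each phrase is rewritten per field before matching.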
> Wildcard ? issue
> ----------------
>
> Key: SOLR-3106
> URL: https://issues.apache.org/jira/browse/SOLR-3106
> Project: Solr
> Issue Type: Bug
> Affects Versions: 3.5
> Environment: Tomcat 7.0.25 (request encoding UTF-8)
> Solr 3.5.0
> Java 7 Oracle
> Ubuntu 11.10
> Reporter: Dalius
>