Hi,

I'm having issues getting an edismax query to match a certain document via
a particular field ("name_c"). I believe this issue is related to
whitespace removal and field/edismax configuration.

*Search term:* "viet nam"
*Document name:* "Vietnam"

*Field Type:*
  <!-- Exact match, whitespace ignored (e.g. "$Fish %Sticks"=="fishsticks") -->
  <fieldType class="solr.TextField" name="text_exact_concat" omitNorms="true"
             positionIncrementGap="0" omitTermFreqAndPositions="true">
    <analyzer>
      <charFilter class="solr.PatternReplaceCharFilterFactory"
                  pattern="([^a-z0-9])" replacement=""/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)"
              replacement="" replace="all"/>
      <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
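
To reason about what this chain emits for a given input, here is a rough
Python approximation of the analyzer. It is only a sketch, not Solr code: it
assumes the charFilter runs on the raw text before the tokenizer and that its
pattern is case-sensitive, and it stands in for ASCIIFoldingFilter with Unicode
decomposition. The function name is made up for illustration.

```python
import re
import unicodedata


def analyze_text_exact_concat(text: str) -> str:
    """Approximate the text_exact_concat analyzer chain (illustrative only)."""
    # charFilter: applied to the raw characters before tokenizing.
    # Assumption: the pattern is case-sensitive, as Java regexes are by default.
    text = re.sub(r"[^a-z0-9]", "", text)
    # KeywordTokenizer: whatever is left becomes a single token.
    token = text
    # PatternReplaceFilter: strip any remaining whitespace.
    token = re.sub(r"\s+", "", token)
    # ASCIIFoldingFilter, approximated by decomposing and dropping non-ASCII.
    token = unicodedata.normalize("NFKD", token).encode("ascii", "ignore").decode("ascii")
    # LowerCaseFilter: runs last in this chain.
    return token.lower()


for sample in ("viet nam", "Vietnam", "$Fish %Sticks"):
    print(repr(sample), "->", repr(analyze_text_exact_concat(sample)))
```

Under these assumptions the search term "viet nam" becomes "vietnam", but the
document name "Vietnam" becomes "ietnam", because the capital "V" is dropped
before lowercasing happens. If the sketch is faithful to Solr's ordering, the
Analysis screen in the admin UI should show the same tokens for the index side.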

*Field:*
<field name="name_c" type="text_exact_concat" multiValued="false"
       indexed="true" required="false" stored="false"/>

*Raw Query (from Solr Admin Console):*
q=viet nam&
defType=edismax&
sow=false&
qf=name^1.0 name_c^10.0 ancestor_name^1.25&
sort=score desc, name_c asc&
wt=json&indent=true
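
For anyone who wants to reproduce this outside the admin console, the same
parameters can be URL-encoded with a short Python sketch. The host and core
name below are placeholders, not my real setup, and debugQuery=true is added
because that is how I ran the query.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; substitute your own.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"

# Same parameters as the raw query above, plus debugQuery.
params = {
    "q": "viet nam",
    "defType": "edismax",
    "sow": "false",
    "qf": "name^1.0 name_c^10.0 ancestor_name^1.25",
    "sort": "score desc, name_c asc",
    "wt": "json",
    "indent": "true",
    "debugQuery": "true",
}


def build_query_url(base_url: str, query_params: dict) -> str:
    """Return the select URL with the parameters percent-encoded."""
    return f"{base_url}?{urlencode(query_params)}"


url = build_query_url(SOLR_SELECT_URL, params)
print(url)
```

Encoding the parameters this way keeps the space in "viet nam" and the "^"
boosts intact when the request is sent from a script or from curl.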

*Issue Explanation:*
When I execute the query in my local admin console (with debugQuery
enabled) I don't see a match or score for "Vietnam" for the field "name_c".

   - I have this field boosted extra high so that any match on it takes
   precedence.
   - I'm confident this isn't caused by any of my other fields. There are
   more than are listed here, but I removed them for clarity.
   - I believe this is caused by how whitespace in the search term is
   handled.
   - Interestingly, the space is removed for the "name_c" field in the
   parsedquery:

########################################################################
"parsedquery":"(+DisjunctionMaxQuery(((name_c:vietnam)^10.0 |
                                      (ancestor_name:viet nam)^1.25 |
                                      (name:viet name_ps:nam)^1.0)"

"parsedquery_toString":"+((name_c:vietnam)^10.0 |
                          (ancestor_name:viet nam)^1.25 |
                          (name:viet nam)^1.0)
########################################################################

I would really appreciate any support or debugging advice in this matter!
-Simon Bloch
