On 2/14/2013 8:26 AM, Adrien Grand wrote:
>> This suggests that adding docvalues to the uniqueKey field would be a good
>> idea for distributed searching in general, since the first phase of a
>> distributed search only retrieves that field and score.  That assumes of
>> course that the docvalues are fully utilized for retrieving fields during
>> that initial phase.
>
> Right, this would likely improve performance given that doc values
> (even if disk-based) are more likely to be in memory than stored
> fields. Another (better?) approach would be to use the internal Lucene
> doc IDs for distributed search (I assumed there was an open JIRA issue
> to do that but I can't find it).

Related to this ... I have been watching SOLR-3855. I notice that TextField is not listed among the supported types. Is that likely to change in the future, or is there a fundamental issue there?

My uniqueKey field uses the following fieldType definition:

    <!-- lowercases the entire field value -->
    <fieldType name="lowercase" class="solr.TextField" sortMissingLast="true" positionIncrementGap="0" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.ICUFoldingFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
      </analyzer>
    </fieldType>

I'm about 95% sure that the source value from MySQL will never contain lowercase characters and probably does not actually need to be trimmed, but we want to be able to search when an uppercase value is entered. Would I have to give up that capability to get docvalues on this field? Does the current SOLR-3855 patch take advantage of docvalues for the first phase of a distributed search when they are present, as we discussed earlier?
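
To make the trade-off concrete, here is a minimal sketch of the kind of definition I would presumably have to switch to. It rests on two assumptions that this thread does not confirm: that the patch exposes docvalues through a docValues="true" attribute, and that StrField is among the supported types. The names "string_dv" and "id" are hypothetical.

```xml
    <!-- Sketch only. Assumes docValues="true" is the attribute the patch
         adds and that StrField supports it; neither is confirmed here. -->
    <fieldType name="string_dv" class="solr.StrField" sortMissingLast="true" docValues="true"/>

    <!-- "id" is a hypothetical field name. StrField applies no analyzer,
         so no folding or trimming happens at index or query time. -->
    <field name="id" type="string_dv" indexed="true" stored="true" required="true"/>
```

With a definition like this the value is indexed exactly as it arrives from MySQL, so a query would only match if it used the same case as the stored value.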

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
