Re: Querying case-sensitive fields

Jeroen van Vianen Tue, 17 Aug 2010 04:02:41 -0700

On 17-8-2010 11:41, Markus Jelsma wrote:

Well, we would need to see the declaration of the field type of your title field.
However, if you use the shipped schema.xml it doesn't make any sense because
it declares a field type that lowercases on both query time and index time.
You should use the analysis.jsp in your Solr admin.


Then, if you really need to change your schema.xml for some reason, you need
to reindex. If you still have the original CrawlDB, LinkDB and segments then
using the solrindex command is sufficient.

Thanks for your reply. I have a stock schema.xml in my Solr config, and see the following definitions:


        <fieldType name="text" class="solr.TextField"
            positionIncrementGap="100">
            <analyzer>
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.StopFilterFactory"
                    ignoreCase="true" words="stopwords.txt"/>
                <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="0"
                    splitOnCaseChange="1"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.EnglishPorterFilterFactory"
                    protected="protwords.txt"/>
                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
        </fieldType>

and

<field name="title" type="text" stored="true" indexed="true"/>

Which supports your statement above.
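
To make the effect of that analyzer concrete, here is a minimal Python sketch. It is only a rough stand-in for the chain above, not Solr's actual implementation: it splits on whitespace and lowercases, and it leaves out the stop word, word delimiter and stemming filters. The `analyze` function and the sample title are hypothetical, made up for illustration.

```python
def analyze(text):
    """Rough stand-in for the analyzer chain: whitespace split, then lowercase.

    Assumption: the stop word, word delimiter and stemming filters from the
    schema are deliberately left out to keep the sketch small.
    """
    return [token.lower() for token in text.split()]


# Index time: the title is analyzed before its terms are written to the index.
# The sample title is invented for this example.
indexed_terms = set(analyze("Steve Jobs Keynote"))

# Query time through the same analyzer: "Jobs" becomes "jobs" and matches.
analyzed_query_term = analyze("Jobs")[0]
print(analyzed_query_term in indexed_terms)

# A raw term that skips analysis keeps its capital letter and does not match.
print("Jobs" in indexed_terms)
```

If this simplified picture holds, a tool that looks up the literal term "Jobs" without lowercasing it first would find nothing, even though the same query sent through the field's own analyzer would match.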

If I look in the index with Luke, however, I see upper-cased words in the title field and am unable to find them if I enter a query there. Are there differences in the way Luke handles queries like "title:Jobs" compared to Solr? These seem to be the most basic queries that should perform the same regardless of which tool is used to query the index.


Regards,


Jeroen
