On 17-8-2010 11:41, Markus Jelsma wrote:
Well, we would need to see the declaration of the field type of your title field.
However, if you use the shipped schema.xml it doesn't make any sense because
it declares a field type that lowercases on both query time and index time.
You should use the analysis.jsp in your Solr admin.

Then, if you really need to change your schema.xml for some reason, you need
to reindex. If you still have the original CrawlDB, LinkDB and segments then
using the solrindex command is sufficient.

Thanks for your reply. I have a stock schema.xml in my Solr config, and see the following definitions:

        <fieldType name="text" class="solr.TextField"
            positionIncrementGap="100">
            <analyzer>
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.StopFilterFactory"
                    ignoreCase="true" words="stopwords.txt"/>
                <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="0"
                    splitOnCaseChange="1"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.EnglishPorterFilterFactory"
                    protected="protwords.txt"/>
                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
        </fieldType>

and

<field name="title" type="text" stored="true" indexed="true"/>

Which supports your statement above.

If I look in the index with Luke, however, I see upper-cased words in the title field and am unable to find them if I enter a query there. Are there differences in the way Luke handles queries like "title:Jobs" compared to Solr? These seem to be the most basic queries, which should behave the same regardless of which tool is used to query the index.
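
To make the question concrete, here is a rough sketch of how the analysis chain above might treat a title value and a query term. It is a simplification in Python, not Solr's or Lucene's actual code: it only mimics the whitespace tokenizer and the lowercase filter, it skips the stop-word, word-delimiter, stemming and duplicate-removal filters, and all function names and sample titles are made up for illustration.

```python
def analyze(text):
    """Simplified stand-in for the "text" field type:
    split on whitespace, then lowercase each token."""
    return [token.lower() for token in text.split()]


def build_index(titles):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, title in enumerate(titles):
        for term in analyze(title):
            index.setdefault(term, set()).add(doc_id)
    return index


def search_analyzed(index, query_term):
    """Run the query term through the same analyzer before the lookup."""
    hits = set()
    for term in analyze(query_term):
        hits |= index.get(term, set())
    return hits


def search_raw(index, query_term):
    """Look the term up exactly as typed, with no analysis."""
    return index.get(query_term, set())


# Hypothetical sample titles, not taken from the real index.
index = build_index(["Jobs in Amsterdam", "Open source search"])

print(search_analyzed(index, "Jobs"))  # {0}
print(search_raw(index, "Jobs"))       # set()
```

In this sketch the indexed term is "jobs", so a lookup for "Jobs" only matches when the query goes through the same analyzer. If a tool looks up the raw term instead, that could explain a miss for "title:Jobs"; whether that is what happens in Luke presumably depends on the analyzer it applies to the query.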

Regards,


Jeroen
