I was playing around with the example Solr app, and I get different
results when I specify something like fq=manu_exact:"ASUS Computer Inc."
and fq=manu_exact:ASUS Computer Inc.

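The latter gives many more matches, which looks kinda familiar: without
the quotes, only ASUS is matched against manu_exact, and the remaining
words are searched in the default field.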

Silly season stuff, I know, but thought I'd mention it...
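
For what it's worth, here is a minimal sketch of building the quoted form
from client code, so a multi-word value stays a single phrase on the named
field. The helper name fq_exact is made up for illustration, and the q=*:*
part is just an assumed match-all query, not something from the original
request.

```python
from urllib.parse import urlencode


def fq_exact(field, value):
    """Return an fq clause that treats `value` as one quoted phrase on `field`.

    Hypothetical helper, not part of Solr or any client library.
    """
    # Escape backslashes and double quotes so the value cannot end the phrase early.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Unquoted: only the first word is tied to manu_exact.
unquoted = "manu_exact:ASUS Computer Inc."

# Quoted: the whole value is one phrase on manu_exact.
quoted = fq_exact("manu_exact", "ASUS Computer Inc.")

# urlencode takes care of the spaces, colon and quotes in the request URL.
print(urlencode({"q": "*:*", "fq": unquoted}))
print(urlencode({"q": "*:*", "fq": quoted}))
```

The same idea should apply to the themes_raw case further down:
fq_exact("themes_raw", "Hotel en Restaurant") keeps all three words on
themes_raw.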

Best
Erick

On Wed, Nov 10, 2010 at 5:32 PM, Geert-Jan Brits <gbr...@gmail.com> wrote:

> Another option: assuming themes_raw is of type 'string' (I couldn't
> confirm that 100%), the difference between the 110 results for
> fq:themes_raw and the 321 from your db could be down to case: fieldtype
> string (and thus themes_raw) is case-sensitive, while querying your db
> may be case-insensitive (depending on your db setup), which would also
> explain the larger number of hits from your db.
>
> Cheers,
> Geert-Jan
>
>
> 2010/11/10 Jonathan Rochkind <rochk...@jhu.edu>
>
> > I've had that sort of thing happen from 'corrupting' my index, by
> > changing my schema.xml without re-indexing.
> >
> > If you change field types or other things in schema.xml, you need to
> > re-index all your data. (You can add brand-new fields or types without
> > having to re-index, but most other changes will require a re-index.)
> >
> > Could that be it?
> >
> >
> > PeterKerk wrote:
> >
> >> LOL, very clever indeed ;)
> >>
> >> The thing is: when I count the records matching the theme 'Hotel en
> >> Restaurant' in my db, I end up with 321 records. So that is correct. I
> >> don't know where the 370 is coming from.
> >>
> >> Now when I change the query to this: &fq=themes_raw:Hotel en Restaurant
> >> I end up with 110 records... (yet another number :s)
> >>
> >> What I did notice is that this only happens on multi-word facets, "Hotel
> >> en Restaurant" being a 3-word facet. The facets work correctly on a facet
> >> named "Cafe", so I suspect it has something to do with the tokenization.
> >>
> >> As you can see, I'm using "text" and "string".
> >> For completeness I'm posting the definitions of those in my schema.xml
> >> as well:
> >>
> >>    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >>      <analyzer type="index">
> >>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>
> >>        <!-- in this example, we will only use synonyms at query time
> >>        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> >>        -->
> >>        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
> >>        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >>        <filter class="solr.LowerCaseFilterFactory"/>
> >>        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> >>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >>      </analyzer>
> >>      <analyzer type="query">
> >>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> >>        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
> >>        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >>        <filter class="solr.LowerCaseFilterFactory"/>
> >>        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> >>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >>      </analyzer>
> >>    </fieldType>
> >>
> >>
> >>    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" />
> >>
> >>
> >
>
