I was playing around with the example Solr app, and I get different results depending on whether I specify fq=manu_exact:"ASUS Computer Inc." or fq=manu_exact:ASUS Computer Inc. (without the quotes).
The latter gives many more matches, which looks kinda familiar...

Silly season stuff, I know, but thought I'd mention it...

Best
Erick

On Wed, Nov 10, 2010 at 5:32 PM, Geert-Jan Brits <gbr...@gmail.com> wrote:

> Another option: assuming themes_raw is type 'string' (couldn't get that
> nugget of info for 100%) it could be that you're seeing a difference in nr
> of results between the 110 for fq:themes_raw and 321 from your db, because
> fieldtype:string (thus themes_raw) is case-sensitive while (depending on
> your db-setup) querying your db is case-insensitive, which could explain
> the larger nr of hits for your db as well.
>
> Cheers,
> Geert-Jan
>
>
> 2010/11/10 Jonathan Rochkind <rochk...@jhu.edu>
>
> > I've had that sort of thing happen from 'corrupting' my index, by
> > changing my schema.xml without re-indexing.
> >
> > If you change field types or other things in schema.xml, you need to
> > reindex all your data. (You can add brand new fields or types without
> > having to re-index, but most other changes will require a re-index).
> >
> > Could that be it?
> >
> >
> > PeterKerk wrote:
> >
> >> LOL, very clever indeed ;)
> >>
> >> The thing is: when I select the amount of records matching the theme
> >> 'Hotel en Restaurant' in my db, I end up with 321 records. So that is
> >> correct. I don't know where the 370 is coming from.
> >>
> >> Now when I change the query to this: &fq=themes_raw:Hotel en Restaurant
> >> I end up with 110 records...(another number even :s)
> >>
> >> What I did notice is that this only happens on multi-word facets,
> >> "Hotel en Restaurant" being a 3-word facet. The facets work correctly
> >> on a facet named "Cafe", so I suspect it has something to do with the
> >> tokenization.
> >>
> >> As you can see, I'm using "text" and "string".
> >> For completeness I'm posting the definitions of those in my schema.xml
> >> as well:
> >>
> >> <fieldType name="text" class="solr.TextField"
> >>            positionIncrementGap="100">
> >>   <analyzer type="index">
> >>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>
> >>     <!-- in this example, we will only use synonyms at query time
> >>     <filter class="solr.SynonymFilterFactory"
> >>             synonyms="index_synonyms.txt" ignoreCase="true"
> >>             expand="false"/>
> >>     -->
> >>     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >>             words="stopwords_dutch.txt"/>
> >>     <filter class="solr.WordDelimiterFilterFactory"
> >>             generateWordParts="1" generateNumberParts="1"
> >>             catenateWords="1" catenateNumbers="1" catenateAll="0"
> >>             splitOnCaseChange="1"/>
> >>     <filter class="solr.LowerCaseFilterFactory"/>
> >>     <filter class="solr.EnglishPorterFilterFactory"
> >>             protected="protwords.txt"/>
> >>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >>   </analyzer>
> >>   <analyzer type="query">
> >>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >>             ignoreCase="true" expand="true"/>
> >>     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >>             words="stopwords_dutch.txt"/>
> >>     <filter class="solr.WordDelimiterFilterFactory"
> >>             generateWordParts="1" generateNumberParts="1"
> >>             catenateWords="0" catenateNumbers="0" catenateAll="0"
> >>             splitOnCaseChange="1"/>
> >>     <filter class="solr.LowerCaseFilterFactory"/>
> >>     <filter class="solr.EnglishPorterFilterFactory"
> >>             protected="protwords.txt"/>
> >>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >>   </analyzer>
> >> </fieldType>
> >>
> >>
> >> <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> >>            omitNorms="true" />
> >>
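
For what it's worth, here is a minimal Python sketch of the quoting fix hinted at in the reply at the top. As far as I know, with the standard Lucene query parser only the first word after `field:` is matched against that field when the value is unquoted, and the remaining words go to the default search field, which would explain the differing counts. The helper names `phrase_filter` and `select_url` are hypothetical, and the base URL is an assumption (the stock example server on localhost:8983); adjust both to your setup.

```python
from urllib.parse import urlencode


def phrase_filter(field: str, value: str) -> str:
    """Return field:"value" so the whole multi-word value stays bound to field.

    Backslashes and double quotes inside the value are escaped so they
    cannot terminate the phrase early.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def select_url(base: str, q: str, fq: str) -> str:
    """Build a select URL with q and fq percent-encoded as query parameters."""
    return f"{base}?{urlencode({'q': q, 'fq': fq})}"


if __name__ == "__main__":
    # Field name and value taken from the thread; the base URL is assumed.
    fq = phrase_filter("themes_raw", "Hotel en Restaurant")
    print(fq)
    print(select_url("http://localhost:8983/solr/select", "*:*", fq))
```

With a `string` field such as themes_raw the quoted phrase has to match the indexed value exactly, including case, so the case-sensitivity point raised by Geert-Jan still applies after the quoting is fixed.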