The problem is in your analysis chain. At query time, the EnglishPorterFilterFactory stems your terms (that is how "landgoed" becomes "landgo"), but there is no stemmer in the index-time chain, so the indexed term is still "landgoed" and the phrase query can't match it. I think that removing that filter from the query analyzer will solve your problem.
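For reference, this is roughly what the query analyzer from your schema.xml would look like with the stemmer taken out. It's only a sketch based on the config you posted; everything except the removed filter is copied as-is:

```xml
<!-- Query-time analyzer for fieldType "text", copied from the posted schema.xml.
     Only change: the solr.EnglishPorterFilterFactory line is removed, so query
     terms are no longer stemmed and "landgoed" stays "landgoed". -->
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="stopwords_dutch.txt"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
          generateNumberParts="1" catenateWords="0" catenateNumbers="0"
          catenateAll="0" splitOnCaseChange="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```

Since only the query analyzer changes, you shouldn't need to reindex; restarting Solr so the schema is re-read should be enough. If you do want stemming later, the same stemmer would have to go into both chains, and that would need a reindex.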
Remember that there are two different analysis chains, one for index time and one for query time. I think you didn't see the shortened word in analysis.jsp because you entered the text in the "Field Value (Index)" text box, so it used the index-time analysis chain. To see the output of the query-time analysis chain, enter the text in the "Field Value (Query)" text box.

Good luck,

Juan Grande

On Thu, Jan 6, 2011 at 10:58 AM, PeterKerk <vettepa...@hotmail.com> wrote:

>
> @iorixxx:
> I ran: http://localhost:8983/solr/db/update/?optimize=true
> This is the response:
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">58</int>
>   </lst>
> </response>
>
> Then I ran:
>
> http://localhost:8983/solr/db/select/?indent=on&facet=on&q=*:*&facet.field=themes_raw
>
> This is the response:
> <lst name="facet_fields">
>   <lst name="themes_raw">
>     <int name="Hotel en Restaurant">366</int>
>     <int name="Kasteel en Landgoed">153</int>
>     <int name="Strand en Zee">16</int>
>   </lst>
> </lst>
>
> So, it seems that nothing has changed there, and it looks like the results
> were already shown correctly before the optimize operation?
>
> When you say http caching, do you mean the caching by the browser? Or does
> Solr have some caching by default? If the latter, how can I clear that cache?
>
>
> @Erick: I added debugquery
>
> For "Strand en Zee" I see this:
> <arr name="parsed_filter_queries">
>   <str>PhraseQuery(themes:"strand en zee")</str>
> </arr>
>
> Looks correct.
>
>
> For "Kasteel en Landgoed" I see this:
> <arr name="parsed_filter_queries">
>   <str>PhraseQuery(themes:"kasteel en landgo")</str>
> </arr>
>
> Which isn't correct! So it seems herein lies the problem.
>
> Now I'm wondering why the value is cut off... this is my schema.xml:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords_dutch.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords_dutch.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="0" catenateNumbers="0"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <field name="themes" type="text" indexed="true" stored="true"
>        multiValued="true"/>
> <field name="themes_raw" type="string" indexed="true" stored="true"
>        multiValued="true"/>
>
>
> I checked analysis.jsp:
> filled in Field: "themes"
> and Field value: "Kasteel en Landgoed"
>
> and schema.jsp, but I didn't see any weird results
>
> Now, I'm wondering what else it could be..
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Searching-similar-values-for-same-field-results-in-different-results-tp2199269p2205706.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>