Facet Results Strange - Help
Hello,

I'm running into some strange results for some facets of mine. Below you'll see the XML returned from Solr. I did a query using the standard request handler. Notice the duplicated values returned (american standard, delta, etc.). There are actually quite a few of them. At first I thought it might be because of case sensitivity, but I lowercase everything going to Solr.

Hopefully someone can chime in with some tips, thanks!

Dan

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
  <result name="response" numFound="2328" start="0"/>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="manufacturer_facet">
        <int name="kohler">1560</int>
        <int name="american standard">197</int>
        <int name="toto">181</int>
        <int name="bemis">83</int>
        <int name="porcher">56</int>
        <int name="ginger">45</int>
        <int name="elements of design">40</int>
        <int name="brasstech">18</int>
        <int name="st thomas">18</int>
        <int name="hansgrohe">15</int>
        <int name="sterling">14</int>
        <int name="whitehaus">13</int>
        <int name="delta">12</int>
        <int name="jacuzzi">10</int>
        <int name="cifial">8</int>
        <int name="kwc">8</int>
        <int name="herbeau">7</int>
        <int name="jado">7</int>
        <int name="elizabethan classics">6</int>
        <int name="showhouse by moen">5</int>
        <int name="grohe">4</int>
        <int name="creative specialties">3</int>
        <int name="latoscana">3</int>
        <int name="american standard">2</int>
        <int name="danze">2</int>
        <int name="ronbow">2</int>
        <int name="belle foret">1</int>
        <int name="dornbracht">1</int>
        <int name="kohler">1</int>
        <int name="myson">1</int>
        <int name="newport brass">1</int>
        <int name="price pfister">1</int>
        <int name="quayside publishing">1</int>
        <int name="st. thomas">1</int>
        <int name="adagio">0</int>
        <int name="alno">0</int>
        <int name="alsons">0</int>
        <int name="bates and bates">0</int>
        <int name="blanco">0</int>
        <int name="cec">0</int>
        <int name="cole and co">0</int>
        <int name="competitive">0</int>
        <int name="corstone">0</int>
        <int name="creative specialties">0</int>
        <int name="danze">0</int>
        <int name="decolav">0</int>
        <int name="dolan designs">0</int>
        <int name="doralfe">0</int>
        <int name="dornbracht">0</int>
        <int name="dreamline">0</int>
        <int name="elkay">0</int>
        <int name="fontaine">0</int>
        <int name="franke">0</int>
        <int name="grohe">0</int>
        <int name="hamat">0</int>
        <int name="hydrosystems">0</int>
        <int name="improvement direct">0</int>
        <int name="insinkerator">0</int>
        <int name="kenroy international">0</int>
        <int name="kichler">0</int>
        <int name="kindred">0</int>
        <int name="maxim">0</int>
        <int name="mico">0</int>
        <int name="moen">0</int>
        <int name="moen">0</int>
        <int name="mr sauna">0</int>
        <int name="mr steam">0</int>
        <int name="neo elements">0</int>
        <int name="newport brass">0</int>
        <int name="ondine">0</int>
        <int name="pegasus">0</int>
        <int name="price pfister">0</int>
        <int name="progress lighting">0</int>
        <int name="pulse">0</int>
        <int name="quoizel">0</int>
        <int name="robern">0</int>
        <int name="rohl">0</int>
        <int name="sagehill designs">0</int>
        <int name="sea gull lighting">0</int>
        <int name="show house">0</int>
        <int name="sloan">0</int>
        <int name="st%2e thomas">0</int>
        <int name="st%2e thomas creations">0</int>
        <int name="steamist">0</int>
        <int name="swanstone">0</int>
        <int name="thomas lighting">0</int>
        <int name="warmatowel">0</int>
        <int name="waste king">0</int>
        <int name="waterstone">0</int>
      </lst>
    </lst>
  </lst>
</response>

--
View this message in context: http://www.nabble.com/Facet-Results-Strange---Help-tf3658597.html#a1084
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Facet Results Strange - Help
I have a dynamic field set up for facets. It looks like this:

  <dynamicField name="*_facet" type="string" indexed="true" stored="false" multiValued="true" />

I do this because we add facets quite often, so having to modify the schema every time would be infeasible. I'm currently reindexing from scratch, so I cannot try wt=python for a little bit longer. Once it's done indexing, I'll give that a go and see if I notice anything.

Dan

Yonik Seeley wrote:
> On 4/27/07, realw5 [EMAIL PROTECTED] wrote:
>> Hello, I'm running into some strange results for some facets of mine. Below you'll see the XML returned from Solr. I did a query using the standard request handler. Notice the duplicated values returned (american standard, delta, etc.). There are actually quite a few of them. At first I thought it might be because of case sensitivity, but I lowercase everything going to Solr. Hopefully someone can chime in with some tips, thanks!
>
> What's the field definition for manufacturer_facet in your schema? Is it multi-valued or not?
>
> Also, can you try the python response format (wt=python), as it outputs only ASCII and escapes everything else... there is an off chance the strings look the same but aren't.
>
> -Yonik
Re: Facet Results Strange - Help
On 4/27/07, realw5 [EMAIL PROTECTED] wrote:
> I have a dynamic field set up for facets. It looks like this:
>
>   <dynamicField name="*_facet" type="string" indexed="true" stored="false" multiValued="true" />
>
> I do this because we add facets quite often, so having to modify the schema every time would be infeasible. I'm currently reindexing from scratch, so I cannot try wt=python for a little bit longer. Once it's done indexing, I'll give that a go and see if I notice anything.

If it's really the same field value repeated, you've hit a bug. If so, it would be helpful if you could open a JIRA bug, and anything you can do to help us reproduce the problem would be appreciated.

-Yonik
Re: Facet Results Strange - Help
Ok, I just finished indexing about 20k documents. I took a look, and so far the problem has not appeared again. What I think caused it is that I was not setting overwritePending and overwriteCommitted in the add process. Therefore, over time, as data was being cleaned up, it was just appending to the existing data.

I did have one case of repeated values, but after looking at the python writer output, I noticed a space at the end. I can fix this issue by trimming all my values before sending them to Solr :-)

I'm going to continue indexing, and if the problem pops up once fully indexed, I'll post back again. Otherwise, thanks for the quick replies!

Dan

Yonik Seeley wrote:
> On 4/27/07, realw5 [EMAIL PROTECTED] wrote:
>> I have a dynamic field set up for facets. It looks like this:
>>
>>   <dynamicField name="*_facet" type="string" indexed="true" stored="false" multiValued="true" />
>>
>> I do this because we add facets quite often, so having to modify the schema every time would be infeasible. I'm currently reindexing from scratch, so I cannot try wt=python for a little bit longer. Once it's done indexing, I'll give that a go and see if I notice anything.
>
> If it's really the same field value repeated, you've hit a bug. If so, it would be helpful if you could open a JIRA bug, and anything you can do to help us reproduce the problem would be appreciated.
>
> -Yonik
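
The client-side cleanup Dan describes (lowercasing and trimming values before they are sent to Solr) could look roughly like the sketch below. It is only an illustration: `clean_facet_values` is a hypothetical helper, not part of Solr or any client library, and collapsing internal whitespace and dropping repeated values are extra assumptions that go beyond what the message mentions.

```python
import re


def clean_facet_values(values):
    """Normalize the values of one multi-valued facet field before indexing.

    Hypothetical helper: lowercases and trims each value (as described in
    the thread), and additionally collapses internal whitespace and drops
    repeats (assumptions added for this sketch).
    """
    seen = set()
    cleaned = []
    for value in values:
        # Collapse runs of whitespace, then strip both ends and lowercase.
        normalized = re.sub(r"\s+", " ", value).strip().lower()
        # Skip empty strings and values already emitted for this document.
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


print(clean_facet_values(["American Standard ", "american standard", " Delta"]))
```

With the sample input above, the trailing-space variant and the mixed-case variant collapse into a single "american standard" entry, so only one value per manufacturer reaches the index.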
Re: Facet Results Strange - Help
On 4/27/07, realw5 [EMAIL PROTECTED] wrote:
> Ok, I just finished indexing about 20k documents. I took a look, and so far the problem has not appeared again. What I think caused it is that I was not setting overwritePending and overwriteCommitted in the add process. Therefore, over time, as data was being cleaned up, it was just appending to the existing data.

That is the default anyway. Even if duplicate documents were somehow added, that should not cause duplicates in facet results. It should be impossible to get duplicate values from facet.field, regardless of what the index looks like.

> I did have one case of repeated values, but after looking at the python writer output, I noticed a space at the end. I can fix this issue by trimming all my values before sending them to Solr :-)

Hopefully you should have also seen the space in the XML response... if it's not there, that would be a bug.

-Yonik
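
One way to check whether "duplicate" facet values are really identical strings is to parse the XML response and print each value with `repr()`, which makes trailing spaces visible. The sketch below is a rough illustration only: `find_lookalikes` is a hypothetical helper, and `SAMPLE` is made-up data in the shape of the response from this thread, with the trailing space assumed for demonstration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up fragment shaped like the facet response in this thread.
# The trailing space in the second entry is an assumption for illustration.
SAMPLE = """<lst name="manufacturer_facet">
  <int name="american standard">197</int>
  <int name="american standard ">2</int>
  <int name="delta">12</int>
</lst>"""


def find_lookalikes(facet_xml):
    """Group facet values that differ only by case or surrounding whitespace."""
    groups = defaultdict(list)
    for node in ET.fromstring(facet_xml).iter("int"):
        raw = node.get("name")
        # Key on the trimmed, lowercased form; keep the raw string and count.
        groups[raw.strip().lower()].append((raw, int(node.text)))
    # Only groups with more than one raw spelling are suspicious.
    return {key: hits for key, hits in groups.items() if len(hits) > 1}


for key, hits in find_lookalikes(SAMPLE).items():
    for raw, count in hits:
        # repr() shows the quotes, so a trailing space is easy to spot.
        print(f"{key}: {raw!r} -> {count}")
```

Values that are truly byte-for-byte identical would still land in the same group here, so a group whose raw strings print identically under `repr()` would point toward the bug case Yonik describes rather than a data problem.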
Re: Facet Results Strange - Help
: It's likely you have the facet category added more than once for one
: or more docs. Like this;
:
: <field name="manufacturer_facet">american standard</field>
: <field name="manufacturer_facet">american standard</field>
:
: Are you adding the facet values on-the-fly? This happened to me and I
: solved it by removing the duplicate facet fields.

that's really odd ... i can't think of any way that exactly duplicate field values would be counted twice in the current facet.field code.

I just tested this using the exampledocs by adding "electronics" to the "cat" field of some docs multiple times, and i couldn't reproduce this behavior. can you elaborate more on how to trigger it?

-Hoss
Re: Facet Results Strange - Help
: writer, I notice a space at the end. I can fix this issue by trimming all my
: values before sending them to solr :-)

The built-in Field Faceting works on the indexed values, so Solr can solve this for you if you use something like this for your facet field type...

  <fieldType name="facetString" class="solr.TextField" omitNorms="true">
    <analyzer>
      <!-- KeywordTokenizer does no actual tokenizing, so the entire
           input string is preserved as a single token -->
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <!-- The LowerCase TokenFilter does what you expect, which can be
           useful when you want your sorting to be case insensitive -->
      <filter class="solr.LowerCaseFilterFactory" />
      <!-- The TrimFilter removes any leading or trailing whitespace -->
      <filter class="solr.TrimFilterFactory" />
    </analyzer>
  </fieldType>

-Hoss
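
To see what such an analyzer chain would do to the values in this thread, its effect can be approximated in a few lines of Python. This is only a sketch of the intended behavior (whole string as one token, then lowercase, then trim), not Solr's code; `facet_string_term` is a hypothetical name, and Python's `lower()` and `strip()` may differ from the Java filters in edge cases.

```python
def facet_string_term(value):
    """Approximate the indexed term produced by the facetString type."""
    token = value          # KeywordTokenizer: the whole input is one token
    token = token.lower()  # LowerCaseFilter
    token = token.strip()  # TrimFilter: drop leading/trailing whitespace
    return token


samples = ["American Standard ", "american standard", "st. thomas", "st thomas"]
for raw in samples:
    # repr() keeps the quotes visible so whitespace differences stand out.
    print(repr(raw), "->", repr(facet_string_term(raw)))
```

Under this approximation the two "american standard" spellings map to the same term, while "st. thomas" and "st thomas" stay distinct: differences in punctuation would still need to be cleaned up in the source data or handled by an additional filter.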