RE: Facet sorting seems weird
This is indeed an interesting idea, but I think it's a bit too manual for our use case. I do see that it would solve the problem though, so thank you for sharing it with the community! :)

-----Original Message-----
From: James Thomas [mailto:jtho...@camstar.com]
Sent: 15 July 2013 17:08
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

We did something related to this that I'll share. I'm rather new to Solr, so take this idea cautiously :-)

Our requirement was to show exact values but have case-insensitive sorting and facet filtering (prefix filtering). We created an index field (type=string) for creating facets so that the values are indexed as-is. The values we indexed were given the format

lowercase value|exact value

So for example, given the value bObles, we would index the string bobles|bObles. When displaying the facet, we split the facet value from Solr in half and display the second half to the user.

Of course the caveat is that you could have 2 facets that differ only in case, but to me that's a data cleansing issue.

James

-----Original Message-----
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com]
Sent: Monday, July 15, 2013 10:57 AM
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hello, thank you for the quick reply!

But given that facet.sort=index just sorts by the faceted index (and I don't want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff

-----Original Message-----
From: David Quarterman [mailto:da...@corexe.com]
Sent: 15 July 2013 16:46
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyfield.

Regards,
DQ

-----Original Message-----
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com]
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search core from Sphinx to Solr with great results. In general we've been very happy with the switch, and everything seems to work just as we want it to.

Today, however, we've run into a bit of an issue regarding faceted sort. For example, we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation).

Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted before bObles. I assume this is because of casing differences.

My question then is, how do we define a decent datatype in our schema, where the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff
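
The ordering described in the original question is consistent with facet.sort=index sorting terms lexicographically by their indexed form, where uppercase ASCII letters come before lowercase ones. A minimal Python sketch of the same comparison follows; plain Python strings stand in for the indexed terms, and nothing here talks to Solr:

```python
# Two brand values from the thread, as they would be stored in a string field.
brands = ["bObles", "LEGO"]

# Code-point order: "L" (76) sorts before "b" (98), so LEGO comes first.
index_order = sorted(brands)

# Case-insensitive order: compare on a case-folded key, keep the original text.
insensitive_order = sorted(brands, key=str.casefold)

print(index_order)        # ['LEGO', 'bObles']
print(insensitive_order)  # ['bObles', 'LEGO']
```

The second ordering is what the question asks for: the comparison ignores case while the displayed value keeps its exact casing.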
RE: Facet sorting seems weird
Hi Alex,

Yes, this makes sense. My Java is a bit dusty, but depending on how much we come to need this feature, it's definitely something we will look into creating, and if successful, we will definitely submit a patch.

Thank you for your time and detailed answer!

Best regards,
Henrik Ossipoff
RE: Facet sorting seems weird
Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyfield.

Regards,
DQ
RE: Facet sorting seems weird
Hello, thank you for the quick reply!

But given that facet.sort=index just sorts by the faceted index (and I don't want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff
RE: Facet sorting seems weird
Hi Henrik,

We did something related to this that I'll share. I'm rather new to Solr, so take this idea cautiously :-)

Our requirement was to show exact values but have case-insensitive sorting and facet filtering (prefix filtering). We created an index field (type=string) for creating facets so that the values are indexed as-is. The values we indexed were given the format

lowercase value|exact value

So for example, given the value bObles, we would index the string bobles|bObles. When displaying the facet, we split the facet value from Solr in half and display the second half to the user.

Of course the caveat is that you could have 2 facets that differ only in case, but to me that's a data cleansing issue.

James
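
The "lowercase value|exact value" scheme described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the poster's actual code: the function names are hypothetical, the sketch splits at the first | rather than literally in half, and it assumes the | character never occurs inside a real facet value.

```python
SEPARATOR = "|"  # assumed never to occur inside a real facet value


def encode_facet_value(exact: str) -> str:
    """Build the string to index: lowercase sort key, separator, exact value."""
    return exact.lower() + SEPARATOR + exact


def decode_facet_value(indexed: str) -> str:
    """Recover the display value: everything after the first separator."""
    _sort_key, _sep, exact = indexed.partition(SEPARATOR)
    return exact


def facet_prefix(user_input: str) -> str:
    """Lowercase the typed prefix so it matches the lowercase sort key."""
    return user_input.lower()


indexed = [encode_facet_value(v) for v in ["LEGO", "bObles"]]

# Sorting the indexed strings mimics facet.sort=index on the string field.
labels = [decode_facet_value(v) for v in sorted(indexed)]
print(labels)  # ['bObles', 'LEGO']

# Prefix filtering: "BO" typed by the user matches the bobles|bObles entry.
matches = [v for v in indexed if v.startswith(facet_prefix("BO"))]
print(matches)  # ['bobles|bObles']
```

Because the lowercase half leads the indexed string, both the index-order sort and the prefix filter operate on it, while the exact half is carried along for display.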
Re: Facet sorting seems weird
Hi Henrik,

If I understand the question correctly (case-insensitive sorting of the facet values), then this is the limitation of the current Facet component. You can see the full implementation at:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

If you are comfortable with Java code, the easiest thing might be to copy/fix the component and use your own one for faceting. The components are defined in solrconfig.xml and FacetComponent is in a default chain. See:
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

If you do manage to do this (I would recommend doing it as an extra option), it would be nice to have it contributed back to Solr. I think you are not the only one with this requirement.

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
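
For anyone who would rather not touch FacetComponent, the facet values can also be re-sorted on the client after Solr returns them. Note that this is a different technique from the one suggested in this message, and it only reorders the values Solr actually returned, so it may not be suitable when facet.limit truncates the list. A minimal Python sketch, assuming the JSON response writer's default flat layout for facet_fields (alternating value and count) and the field name facet_brand from the original question; the sample response and the function name are made up for illustration:

```python
import json

# Made-up response fragment in Solr's default flat facet layout:
# [value, count, value, count, ...]
raw = '{"facet_counts": {"facet_fields": {"facet_brand": ["LEGO", 12, "bObles", 3]}}}'


def sorted_facets(response: dict, field: str) -> list[tuple[str, int]]:
    """Pair up values and counts, then sort by value ignoring case."""
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = list(zip(flat[0::2], flat[1::2]))
    return sorted(pairs, key=lambda pair: pair[0].casefold())


facets = sorted_facets(json.loads(raw), "facet_brand")
print(facets)  # [('bObles', 3), ('LEGO', 12)]
```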
Re: Facet sorting seems weird
Alex,

You could submit a JIRA ticket, and add an option like facet.sort=insensitive, along with the per-field f. syntax. Then we all get the benefit of the new feature.

--
Bill Bell
billnb...@gmail.com
cell 720-256-8076
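
Solr already lets facet parameters be overridden per field with the f.<fieldname>. prefix, which is presumably the syntax meant above. A minimal Python sketch of building such a query string follows. The field facet_brand comes from the original question, facet_category is a hypothetical second field, and facet.sort=insensitive is only the value proposed in this message, not an existing one, so it appears only in a comment:

```python
from urllib.parse import urlencode

params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "facet_brand"),
    ("facet.field", "facet_category"),      # hypothetical second facet field
    ("facet.sort", "count"),                # applies to all facet fields
    ("f.facet_brand.facet.sort", "index"),  # per-field override
    # Proposed in this thread, not an existing value:
    # ("f.facet_brand.facet.sort", "insensitive"),
]

# A list of pairs (rather than a dict) allows facet.field to repeat.
query_string = urlencode(params)
print(query_string)
```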