RE: Facet sorting seems weird

2013-07-16 Thread Henrik Ossipoff Hansen
This is indeed an interesting idea, but I think it's a bit too manual for 
our use case. I do see that it would solve the problem, though, so thank 
you for sharing it with the community! :)
 
-----Original Message-----
From: James Thomas [mailto:jtho...@camstar.com] 
Sent: 15 July 2013 17:08
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

We did something related to this that I'll share.  I'm rather new to Solr so 
take this idea cautiously :-) Our requirement was to show exact values but have 
case-insensitive sorting and facet filtering (prefix filtering).

We created an index field (type=string) for creating facets so that the 
values are indexed as-is.
The values we indexed were given the format "lowercase value|exact value".
So for example, given the value bObles, we would index the string 
bobles|bObles.
When displaying the facet we split the facet value from Solr in half and 
display the second half to the user.
Of course the caveat is that you could have 2 facets that differ only in case, 
but to me that's a data cleansing issue.

James


RE: Facet sorting seems weird

2013-07-16 Thread Henrik Ossipoff Hansen
Hi Alex,

Yes, this makes sense. My Java is a bit dusty, but depending on how much we 
end up needing this feature, it's definitely something we will look into 
creating, and if successful, we will definitely submit a patch. Thank 
you for your time and detailed answer!

Best regards,
Henrik Ossipoff

-----Original Message-----
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] 
Sent: 15 July 2013 17:16
To: solr-user@lucene.apache.org
Subject: Re: Facet sorting seems weird

Hi Henrik,

If I understand the question correctly (case-insensitive sorting of the facet 
values), then this is the limitation of the current Facet component.

You can see the full implementation at:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

If you are comfortable with Java code, the easiest thing might be to copy/fix 
the component and use your own one for faceting. The components are defined in 
solrconfig.xml and FacetComponent is in a default chain.
See:
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

If you do manage to do this (I would recommend doing it as an extra option), it 
would be nice to have it contributed back to Solr. I think you are not the only 
one with this requirement.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. 
Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)





RE: Facet sorting seems weird

2013-07-15 Thread David Quarterman
Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use 
something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on 
the copyfield.

Regards,

DQ

-----Original Message-----
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] 
Sent: 15 July 2013 15:08
To: solr-user@lucene.apache.org
Subject: Facet sorting seems weird

Hello, first time writing to the list. I am a developer for a company where we 
recently switched all of our search core from Sphinx to Solr with very great 
results. In general we've been very happy with the switch, and everything seems 
to work just as we want it to.

Today, however, we've run into a bit of an issue regarding faceted sort.

For example we have a field called brand in our core, defined as the text_en 
datatype from the example Solr core. This field is copied into facet_brand with 
the datatype string (since we don't really need to do much with it except show 
it for faceted navigation).

Now, given these two entries into the field on different documents, LEGO and 
bObles, and given facet.sort=index, it appears that LEGO is sorted as being 
before bObles. I assume this is because of casing differences.

My question then is, how do we define a decent datatype in our schema, where 
the casing is exact, but we are able to sort it without casing mattering?

Thank you :)

Best regards,
Henrik Ossipoff
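
The ordering described in the question can be reproduced outside Solr. This is a minimal Python sketch, not Solr code; it assumes that facet.sort=index compares the indexed terms by their raw character values, which would explain why an uppercase value lands before a lowercase one. The list `brands` is just the two sample values from the question.

```python
# Sample facet values from the question.
brands = ["bObles", "LEGO"]

# Plain lexicographic order compares code points: "L" (76) < "b" (98),
# so the uppercase value sorts first.
index_order = sorted(brands)

# Case-insensitive order compares a case-folded key but keeps the
# original spelling of each value.
insensitive_order = sorted(brands, key=str.casefold)

print(index_order)        # ['LEGO', 'bObles']
print(insensitive_order)  # ['bObles', 'LEGO']
```

The second ordering is what the question asks for: exact casing in the output, casing ignored in the comparison.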


RE: Facet sorting seems weird

2013-07-15 Thread Henrik Ossipoff Hansen
Hello, thank you for the quick reply!

But given that facet.sort=index just sorts by the faceted index (and I don't 
want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff
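
The concern raised here can be made concrete with a small simulation. This is a sketch under stated assumptions, not Solr code: it assumes faceting returns the indexed terms themselves, so a field that is lowercased at index time would hand back lowercased values. `facet_on` is a hypothetical helper written only for this illustration.

```python
from collections import Counter

def facet_on(indexed_terms):
    """Hypothetical stand-in for faceting: count terms, return them in index order."""
    counts = Counter(indexed_terms)
    return sorted(counts.items())

docs = ["LEGO", "bObles", "LEGO"]

# Faceting on the exact values: display is right, order is case-sensitive.
exact = facet_on(docs)

# Faceting on a lowercased copy: order is right, but the original casing is gone.
lowered = facet_on(value.lower() for value in docs)

print(exact)    # [('LEGO', 2), ('bObles', 1)]
print(lowered)  # [('bobles', 1), ('lego', 2)]
```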


-----Original Message-----
From: David Quarterman [mailto:da...@corexe.com] 
Sent: 15 July 2013 16:46
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hi Henrik,

Try setting up a copyfield in your schema and set the copied field to use 
something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on 
the copyfield.

Regards,

DQ



RE: Facet sorting seems weird

2013-07-15 Thread James Thomas
Hi Henrik,

We did something related to this that I'll share.  I'm rather new to Solr so 
take this idea cautiously :-)
Our requirement was to show exact values but have case-insensitive sorting and 
facet filtering (prefix filtering).

We created an index field (type=string) for creating facets so that the 
values are indexed as-is.
The values we indexed were given the format "lowercase value|exact value".
So for example, given the value bObles, we would index the string 
bobles|bObles.
When displaying the facet we split the facet value from Solr in half and 
display the second half to the user.
Of course the caveat is that you could have 2 facets that differ only in case, 
but to me that's a data cleansing issue.

James
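
The scheme described above could look roughly like this on the client side. It is an illustrative Python sketch, not James's actual code: the names `SEPARATOR`, `encode_facet` and `decode_facet` are hypothetical, "Brio" is an extra sample value added for the demonstration, and the sketch assumes that no value contains the "|" character itself.

```python
SEPARATOR = "|"

def encode_facet(value):
    """Build the indexed string: lowercase sort key, separator, exact value."""
    return value.lower() + SEPARATOR + value

def decode_facet(indexed):
    """Recover the display value: everything after the first separator."""
    return indexed.split(SEPARATOR, 1)[1]

# Sorting the encoded strings sorts by the lowercase key first.
indexed = sorted(encode_facet(v) for v in ["LEGO", "bObles", "Brio"])
display = [decode_facet(term) for term in indexed]

# Prefix filtering works on the lowercase key, so the user's input is
# lowercased before it is compared.
prefix = "BO".lower()
matches = [decode_facet(term) for term in indexed if term.startswith(prefix)]

print(indexed)  # ['bobles|bObles', 'brio|Brio', 'lego|LEGO']
print(display)  # ['bObles', 'Brio', 'LEGO']
print(matches)  # ['bObles']
```

Two values that differ only in case (for example LEGO and Lego) would still show up as two separate facets, which is the caveat mentioned above.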

-----Original Message-----
From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] 
Sent: Monday, July 15, 2013 10:57 AM
To: solr-user@lucene.apache.org
Subject: RE: Facet sorting seems weird

Hello, thank you for the quick reply!

But given that facet.sort=index just sorts by the faceted index (and I don't 
want the facet itself to be in lower-case), would that really work?

Regards,
Henrik Ossipoff




Re: Facet sorting seems weird

2013-07-15 Thread Alexandre Rafalovitch
Hi Henrik,

If I understand the question correctly (case-insensitive sorting of the
facet values), then this is the limitation of the current Facet component.

You can see the full implementation at:
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

If you are comfortable with Java code, the easiest thing might be to
copy/fix the component and use your own one for faceting. The components
are defined in solrconfig.xml and FacetComponent is in a default chain.
See:
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

If you do manage to do this (I would recommend doing it as an extra
option), it would be nice to have it contributed back to Solr. I think you
are not the only one with this requirement.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jul 15, 2013 at 10:08 AM, Henrik Ossipoff Hansen 
h...@entertainment-trading.com wrote:

 Hello, first time writing to the list. I am a developer for a company
 where we recently switched all of our search core from Sphinx to Solr with
 very great results. In general we've been very happy with the switch, and
 everything seems to work just as we want it to.

 Today, however, we've run into a bit of an issue regarding faceted sort.

 For example we have a field called brand in our core, defined as the
 text_en datatype from the example Solr core. This field is copied into
 facet_brand with the datatype string (since we don't really need to do much
 with it except show it for faceted navigation).

 Now, given these two entries into the field on different documents, LEGO
 and bObles, and given facet.sort=index, it appears that LEGO is sorted as
 being before bObles. I assume this is because of casing differences.

 My question then is, how do we define a decent datatype in our schema,
 where the casing is exact, but we are able to sort it without casing
 mattering?

 Thank you :)

 Best regards,
 Henrik Ossipoff



Re: Facet sorting seems weird

2013-07-15 Thread William Bell
Alex,

You could submit a JIRA ticket and add an option like
facet.sort=insensitive, along with the per-field f.<field>.facet.sort syntax.

Then we all get the benefit of the new feature.
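
To make the proposal concrete, here is a rough Python sketch of the behaviour such an option might have. It is purely illustrative: `sort_facet_values` is a hypothetical function, and "insensitive" is the value suggested in this message, not an existing Solr setting.

```python
def sort_facet_values(values, facet_sort="index"):
    """Sort facet values; "insensitive" is a hypothetical mode, not a real Solr value."""
    if facet_sort == "index":
        # Existing behaviour as described in the thread: plain lexicographic order.
        return sorted(values)
    if facet_sort == "insensitive":
        # Fold case for comparison; break ties on the exact value so the
        # result is deterministic when two values differ only in case.
        return sorted(values, key=lambda v: (v.casefold(), v))
    raise ValueError("unsupported facet_sort: " + facet_sort)

print(sort_facet_values(["bObles", "LEGO"]))                 # ['LEGO', 'bObles']
print(sort_facet_values(["bObles", "LEGO"], "insensitive"))  # ['bObles', 'LEGO']
```

Keeping the existing mode as the default matches the advice earlier in the thread to add this as an extra option rather than change current behaviour.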



On Mon, Jul 15, 2013 at 9:16 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:

 Hi Henrik,

 If I understand the question correctly (case-insensitive sorting of the
 facet values), then this is the limitation of the current Facet component.

 You can see the full implementation at:

 https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818

 If you are comfortable with Java code, the easiest thing might be to
 copy/fix the component and use your own one for faceting. The components
 are defined in solrconfig.xml and FacetComponent is in a default chain.
 See:

 https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194

 If you do manage to do this (I would recommend doing it as an extra
 option), it would be nice to have it contributed back to Solr. I think you
 are not the only one with this requirement.

 Regards,
Alex.

 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


 




-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076