Claudio Ranieri and I briefly discussed collator-based sorting for
facets in the thread "Problem with accented words sorting" on the
solr-user mailing list. Here's the idea:

Solr faceting supports sorting by either count or index order. Claudio
and I both need the order to be collator-based. As far as I can tell,
that is not currently possible.

Collator-based document sorting in Solr uses CollationKeys as field
values. This does not work for faceting on multi-valued fields, as
there is no mapping from the key back to the human-readable value.

ICU sort keys are always null (00) terminated and, if I read the ICU
documentation correctly, a comparison of two keys stops as soon as the
null is reached:
http://userguide.icu-project.org/collation/architecture

If we concatenate the keys with the original values:
<key><00><original value><offset of original value>
we get an entity where the ordering is still correct upon comparison and
where the original value can be extracted by using the offset from the
last int (or maybe short, to spare 2 bytes) in the BytesRef.
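To make the layout concrete, here is a rough sketch in plain Java. It is only an illustration of the idea, not tested against Solr: the class name CollatedFacetTerm and its methods are made up, the key bytes in main are invented rather than real ICU output, and I assume the key passed in contains no 00 byte and does not yet carry its terminator (the sketch appends the 00 itself). The offset is stored as a 2-byte big-endian short, and the comparison is unsigned byte order, which is what I would expect for terms in the index.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Lucene or Solr.
final class CollatedFacetTerm {

    /** Builds <key><00><original value as UTF-8><2-byte offset of original value>. */
    static byte[] encode(byte[] sortKey, String original) {
        for (byte b : sortKey) {
            if (b == 0) {
                throw new IllegalArgumentException("sort key must not contain a 00 byte");
            }
        }
        byte[] value = original.getBytes(StandardCharsets.UTF_8);
        int offset = sortKey.length + 1; // key plus its 00 terminator
        if (offset > 0xFFFF) {
            throw new IllegalArgumentException("key too long for a 2-byte offset");
        }
        ByteBuffer buf = ByteBuffer.allocate(offset + value.length + 2); // big-endian by default
        buf.put(sortKey).put((byte) 0).put(value).putShort((short) offset);
        return buf.array();
    }

    /** Reads the trailing offset and extracts the original value. */
    static String decode(byte[] term) {
        int n = term.length;
        int offset = ((term[n - 2] & 0xFF) << 8) | (term[n - 1] & 0xFF);
        return new String(term, offset, n - 2 - offset, StandardCharsets.UTF_8);
    }

    /** Unsigned byte-wise comparison of two encoded terms. */
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        // Invented key bytes standing in for real collation keys.
        byte[][] terms = {
            encode(new byte[] {0x2B}, "cotton"),
            encode(new byte[] {0x2A, 0x10, 0x05}, "c\u00f4te"),
            encode(new byte[] {0x2A, 0x10}, "cote"),
        };
        Arrays.sort(terms, CollatedFacetTerm::compare);
        for (byte[] term : terms) {
            System.out.println(decode(term));
        }
    }
}
```

The 00 separator is what keeps a key that is a prefix of another key sorting first, whatever the original values are. When two keys are identical, the comparison runs on into the original values, which then act as a tie-breaker.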

If the idea is sound, I'll open a JIRA issue. Unfortunately I do not
have time right now for hacking on it.

