Nik,

I added an external post-query sorting facility that works very well, 
especially for a scan query, which ES cannot sort. For bounded but 
accurate sorts, I implement a class that uses a TreeSort. As each search 
hit arrives, I build its sort keys and then add the document (which 
implements Comparable, of course) to the TreeSort, with a couple of 
optimizations:

1. If the TreeSort is already at its limit and the document is greater than 
the last entry, then it is discarded. One compare, and it's gone without 
touching anything.
2. Otherwise, I add it to the TreeSort and then, if that pushes it past the 
limit, remove the last element of the TreeSort.
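
The two steps above can be sketched roughly as follows. This is only a 
minimal illustration, not the actual class: it assumes the TreeSort behaves 
like java.util.TreeSet, and the names BoundedTopN, offer and results are 
made up for the example. Note that a TreeSet drops elements that compare as 
equal, so the sort key would need a tiebreaker (a document id, say) if two 
distinct documents can share the same key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: keeps only the `limit` smallest elements offered so far. */
class BoundedTopN<T extends Comparable<T>> {
    private final int limit;
    // Assumption: java.util.TreeSet stands in for the "TreeSort" in the text.
    private final TreeSet<T> sorted = new TreeSet<>();

    BoundedTopN(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
    }

    /** Offers one search hit; returns true if it was kept. */
    boolean offer(T doc) {
        if (sorted.size() < limit) {
            // Not full yet: keep everything (duplicates are ignored by TreeSet).
            return sorted.add(doc);
        }
        // Step 1: one compare against the current last entry, then discard.
        if (doc.compareTo(sorted.last()) >= 0) {
            return false;
        }
        // Step 2: add the document, then evict the entry that is now last.
        if (sorted.add(doc)) {
            sorted.pollLast();
            return true;
        }
        return false; // compared equal to an entry already held
    }

    /** Returns the retained elements in ascending order. */
    List<T> results() {
        return new ArrayList<>(sorted);
    }
}
```

With a limit of 10, each hit from the scan would be passed to offer(), and 
results() would return the sorted top 10 once the scan is finished. The 
set never holds more than limit + 1 documents at any moment.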

This is, I'm told, the basic strategy Oracle uses when sorting, for 
instance, several thousand matching records while limiting the actual 
response to only 200 records: the sort is still accurate, so the 200 
returned are the top 200 of all of the several thousand records.

For my ES-based solution, I implement a server in front of ES that handles 
all my business rules. It also handles my post-query sorting and 
combinatorial facet hierarchies, but it's close to the ES server (same 
machine, typically) so there is little network overhead. Then the true 
clients can be far-flung but they don't see the intermediate overhead.

Anyway, this took a bit to write, but it wasn't too bad. And the TreeSort 
is really fast: I can issue a scan query for all 56,000 city documents, 
sort them by name, and return the top 10 matches, accurately sorted 
across all 56,000. It's surprisingly quick and adds very little 
overhead: I'm not asking ES or Lucene to try to sort the entire mess 
at once, and the TreeSort keeps my working set in the JVM as small as my 
final response limit.

Hope this helps!

Brian
