If you are interested I can help, I used the FST on a Hadoop project
to implement a fast map side range join.
create JIRA item with patch attached, i will test it.
The Cassandra integration is probably beyond the time I have available. If
the locations in the code that need to be rewritten to use the FST are
known, and a patch simply 'plugs-in' the FST, that would be much easier.
Eg, I don't know how Cassandra stores the current key index for example...
Ok looks like the IndexSummary encapsulates everything, I can start with
hacking that.
On Fri, Jun 8, 2012 at 11:50 AM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
The Cassandra integration is probably beyond the time I have available.
If the locations in the code that need to be
Yeah that's fine, however if there isn't a Java implementation that's a lot
of extra work. The FST approach should be a clear quick and easy win. The
current system of in heap keys and a binary search is what the FST replaced
in Lucene. There are plenty of references to the improvement.
On
Yeah, that is why I wrote if possible :) Also, does FST provide a predecessor
lookup function, wasn't clear from the blog post?
On Friday 8 June 2012 at 22:53, Jason Rutherglen wrote:
Yeah that's fine, however if there isn't a Java implementation that's a lot
of extra work. The FST approach
Dne 8.6.2012 21:19, Jason Rutherglen napsal(a):
Ok looks like the IndexSummary encapsulates everything, I can start with
hacking that.
do memory part first. i want to test it on existing serialized index data.