The problem is that the tree built by the driver must be persistent, so that it can be reopened to add more instances and so that other applications can navigate it when extracting the cluster for a given instance using some strategy.

It takes less than a millisecond to extract a cluster once the distance between the nodes in the tree is calculated.

In my case the instances represent the documents in a Lucene index, and I can use the tree to instantly cluster search results, offer a "more like this" with a threshold knob on each search result for the user to play with, and whatnot.


This tree becomes rather large, and you do not want to keep the whole thing in memory, so it needs to be persistent. I need some sort of local object storage. I like BDB, but its license isn't really compatible with the foundation. BDB is just a persistent hashtable, and I think I can make my own ASLed variant rather easily using Harmony code.

This is what I just sent to their list:

So I'm thinking I should clone your HashMap, make all access to element data abstract and run it on ByteBuffers.

It would be used as a Map<K, DataFileEntry> pointing at where in an object data file the current value is located.

Updating a value would mean marking the old instance as deleted, appending the new one to the end of the object data file and updating its position in the index.

It would use Hadoop's Writable to serialize keys and values.

It would initially be transactionless.
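
To make the idea concrete, here is a minimal sketch of the update path, under several assumptions that are mine and not part of the proposal: the class name AppendOnlyStore, the record layout (one flag byte, a four-byte length, then the payload), String keys and byte[] values are all hypothetical, and the index is a plain in-memory java.util.HashMap that is not persisted, rather than the ByteBuffer-backed clone described above.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: an index of key -> record offset over an append-only data file. */
final class AppendOnlyStore implements AutoCloseable {
    private static final byte LIVE = 0;
    private static final byte DELETED = 1;

    // Stand-in for the proposed Map<K, DataFileEntry>; the "entry" here is just the offset.
    private final Map<String, Long> index = new HashMap<>();
    private final RandomAccessFile data;

    AppendOnlyStore(File file) {
        try {
            data = new RandomAccessFile(file, "rw");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends the value as a new record, then flags any previous record for the key as deleted. */
    void put(String key, byte[] value) {
        try {
            long offset = data.length();
            data.seek(offset);
            data.writeByte(LIVE);          // assumed layout: flag, length, payload
            data.writeInt(value.length);
            data.write(value);
            Long old = index.put(key, offset);
            if (old != null) {
                data.seek(old);
                data.writeByte(DELETED);   // overwrite only the flag byte of the old record
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the current value for the key, or null if the key is unknown. */
    byte[] get(String key) {
        Long offset = index.get(key);
        if (offset == null) {
            return null;
        }
        try {
            data.seek(offset);
            data.readByte();               // skip the flag byte
            byte[] value = new byte[data.readInt()];
            data.readFully(value);
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reports whether the record starting at the given offset is flagged as deleted. */
    boolean isDeleted(long offset) {
        try {
            data.seek(offset);
            return data.readByte() == DELETED;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long dataFileLength() {
        try {
            return data.length();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            data.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

RandomAccessFile implements both DataInput and DataOutput, so a Writable's write(DataOutput) and readFields(DataInput) could presumably replace the raw byte[] handling here. Since nothing is transactional, a crash between the append and the flag update would leave two live records for the same key.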


Any comments on this? Perhaps something similar that is ASLed already exists?



Ted Dunning wrote:
Can you say a bit more?

It looks to me like the hash map in the Apache Harmony project is a
completely ordinary hashmap implementation.

My confusion makes me think that I didn't completely understand what
you were referring to.

On Sun, Apr 20, 2008 at 9:37 AM, Karl Wettin <[EMAIL PROTECTED]> wrote:

Karl Wettin wrote:

We could implement our own transactionless variant that uses Writable for
serialization. Is it possible to seek on DFS?

I think it could be a trivial thing to implement such a thing based on the
Harmony HashMap.


   karl











