Hi Chris,

Have a look at Cassandra (from Facebook) [http://code.google.com/p/the-cassandra-project/]. It's a BigTable implementation based on Amazon Dynamo (it's completely decentralized/P2P with no single point of failure). You can import data into it very quickly (it has both asynchronous and synchronous write options). It's fast and robust. See the presentation at http://www.new.facebook.com/video/video.php?v=540974400803 to learn more about it.

However, the documentation is extremely sparse, and unlike HBase, it's supported by only a small team of developers (internal Facebook employees). [They are working on fixing these issues, and there's talk of making it an Apache project.]
>> Hi all-
>> One more question.
>> I'm looking for a lightweight way to serve data stored as key-value
>> pairs in a series of MapFiles or SequenceFiles. HBase/Hypertable
>> offer a very robust, powerful solution to this problem with a bunch of
>> extra features like updates and column types, etc., that I don't need
>> at all. But I'm wondering if there might be something
>> ultra-lightweight that someone has come up with for a very restricted
>> (but important!) set of use cases. Basically, I'd like to be able to
>> load the entire contents of a key-value map file in DFS into
>> memory across many machines in my cluster so that I can access any of
>> it with ultra-low latencies. I don't need updates; I just need
>> ultra-fast queries into a very large hash map (actually, just an array
>> would be sufficient). This would correspond, approximately, to the
>> "sstable" functionality that BigTable is implemented on top of, but
>> which is also useful for many, many things directly (refer to the
>> BigTable paper or
>> http://www.techworld.com/storage/features/index.cfm?featureid=3183).
>> This question may be better targeted to the HBase community; if so,
>> please let me know. Has anyone else tried to deal with this?
>> Thanks--
>> Chris
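For what it's worth, the read-only "sstable" pattern Chris describes (load all key-value pairs once, then serve lookups from memory with no further I/O) is simple to sketch. Here's a minimal, hypothetical Python illustration of the idea: a sorted array of keys plus binary search, which is roughly how sstable lookups work once the data is resident. The class name and API here are made up for illustration; it's not from Hadoop, HBase, or Cassandra:

```python
import bisect

class InMemoryKVStore:
    """Hypothetical sketch of an sstable-like read-only store:
    sort the key-value pairs once at load time, then answer
    lookups with binary search (O(log n), no disk I/O)."""

    def __init__(self, pairs):
        # Sort by key once; afterwards the structure is immutable,
        # which matches the "no updates needed" requirement.
        sorted_pairs = sorted(pairs)
        self._keys = [k for k, _ in sorted_pairs]
        self._values = [v for _, v in sorted_pairs]

    def get(self, key, default=None):
        # Binary search for the key in the sorted key array.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default

# Example: in a real deployment the pairs would be read from a
# MapFile/SequenceFile in DFS; here we just use literals.
store = InMemoryKVStore([("banana", 2), ("apple", 1), ("cherry", 3)])
print(store.get("banana"))
```

In practice you'd shard the file across machines and route each lookup to the shard owning that key range, but the per-machine structure can be this simple.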
