Steve Loughran wrote:
aakash shah wrote:
We can assume that this record has only one key->value mapping. The value will be updated every minute. Currently we have 1 million of these (key->value) pairs, but I have to make sure that we can scale up to 10 million of them.

Every 10 minutes I will be updating all of these values using their keys. This is the reason I cannot go for a database as a solution.

I wouldn't be so quick to dismiss a database. All your big telcos run their mobile phone systems on databases, where the main concern is having enough RAM for the whole DB to stay in memory. Some dedicated databases (e.g. TimesTen) are designed to have bounded latency on lookup, so you can predict how long operations will take.

That said, if you are only doing atomic updates of a single record, there's less need for the advanced features. Assuming more than one machine, some kind of distributed hash table may work.
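
To illustrate the distributed hash table idea, here is a minimal Python sketch that spreads keys over several nodes by hashing the key. Everything in it is hypothetical: the ShardedStore class, the node count of 4, and the use of plain dicts as stand-ins for machines are assumptions for illustration, not anything proposed in this thread.

```python
import hashlib


class ShardedStore:
    """Toy in-process stand-in for a pool of key->value nodes (hypothetical)."""

    def __init__(self, num_nodes):
        # Each dict stands in for one machine's in-memory table.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, key):
        # Use a stable hash so every client maps a key to the same node.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key, value):
        # A single-key overwrite: no cross-node coordination is needed.
        self.nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self._node_for(key)].get(key, default)


store = ShardedStore(num_nodes=4)  # node count is an assumption
for i in range(1000):
    store.put(f"key-{i}", i)
store.put("key-42", "updated")  # update an existing record by key
```

Note that plain modulo placement moves most keys when the number of nodes changes; consistent hashing is the usual way to limit that reshuffling.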


I was thinking about going with a memcache pool. In the meantime I heard about Hadoop and wanted to get advice from this mailing list regarding a memcache pool vs Hadoop for this specific problem.

It's not an area Hadoop deals with at all.

The record size sounds too small for HDFS, unless the records are in turn grouped to something optimal for the block size. For records that size, I would also consider a) writing them out again instead of doing updates, b) testing for physical (disk) bottlenecks.
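
As a rough sketch of the grouping idea, the Python below packs small records into batches of roughly a target size, so that each batch can be written out whole rather than updated in place. The pack_records helper, the 100-byte record size, and the 10,000-byte target are all assumptions for illustration; the thread does not give the actual record size, and a real HDFS block would be far larger, on the order of tens of megabytes.

```python
def pack_records(records, target_bytes):
    """Group (key, value) byte pairs into batches of roughly target_bytes."""
    batches, current, size = [], [], 0
    for key, value in records:
        rec_size = len(key) + len(value)
        # Start a new batch once adding this record would exceed the target.
        if current and size + rec_size > target_bytes:
            batches.append(current)
            current, size = [], 0
        current.append((key, value))
        size += rec_size
    if current:
        batches.append(current)
    return batches


# Hypothetical data: 1000 records of 100 bytes each (8-byte key, 92-byte value).
records = [(f"k{i:07d}".encode(), b"v" * 92) for i in range(1000)]
batches = pack_records(records, target_bytes=10_000)
```

On each update cycle, the batches would be regenerated and written as new files, which fits the "write them out again instead of doing updates" suggestion better than rewriting individual records.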

Also, there's memcachedb as an alternative for persistent hashing:

http://memcachedb.org/benchmark.html

Bill


