Sean Shanny wrote:
Delip,

So far we have had pretty good luck with memcached. We are building a hadoop based solution for data warehouse ETL on XML based log files that represent click stream data on steroids.

We process about 34 million records, or about 70 GB of data, a day. We have to process dimensional data in our warehouse and then load the surrogate <key><value> pairs into memcached so we can traverse the XML files once again to perform the substitutions. We are using the memcached solution because it scales out just like hadoop. We will have code that allows us to fall back to the DB if the memcached lookup fails, but that should not happen too often.
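The cache-aside lookup with a DB fallback described above can be sketched roughly like this (the names `lookup_surrogate_key` and `db_lookup` are hypothetical, and a plain dict stands in for the memcached client):

```python
cache = {}  # stands in for a memcached client's get/set interface

def db_lookup(natural_key):
    # Hypothetical warehouse query that resolves a natural key
    # to its surrogate key.
    return "sk-" + natural_key

def lookup_surrogate_key(natural_key):
    value = cache.get(natural_key)
    if value is None:
        # Cache miss: fall back to the database, then repopulate
        # the cache so subsequent lookups hit.
        value = db_lookup(natural_key)
        cache[natural_key] = value
    return value
```

The point of the fallback is correctness, not speed: a miss costs one DB round trip, while the common case stays in memcached.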


LinkedIn has just opened up something they run internally, Project Voldemort:

http://highscalability.com/product-project-voldemort-distributed-database
http://project-voldemort.com/

It's a DHT, Java based. I haven't played with it yet, but it looks like a good part of the portfolio.
