On Wed, Jul 8, 2009 at 7:46 PM, Christophe Bisciglia < [email protected]> wrote:
> Hey Usman, your second approach is on the right track. You don't want
> to have your end users interacting directly with HDFS. The latency is
> too high, and it wasn't designed for this.
>

This definitely used to be true, but look at the recent news:
http://www.docstoc.com/docs/7493304/HBase-Goes-Realtime

> OTOH, running a "script" (a mapreduce, streaming, pig or hive job) on
> a regular basis and populating a database table is common practice and
> a great way to provide interactive access to summary/stats data.
>

Voldemort has a very nifty atomic switch-over capability. It also has
utilities for exporting from HDFS.
http://project-voldemort.com/
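
To make the "populate a table, then switch over atomically" idea concrete, here is a rough sketch. It is not Voldemort's API. It uses Python's built-in sqlite3 module as a stand-in serving store, and the table names (page_stats, page_stats_new) and the publish_summary helper are made up for illustration. The batch job's output is loaded into a staging table and swapped in within a single transaction, so readers only ever see the old or the new summary in full.

```python
import sqlite3


def publish_summary(conn, rows):
    """Load fresh (page, hits) rows into a staging table, then swap it in.

    Hypothetical helper: `conn` must be opened with isolation_level=None so
    that the explicit BEGIN/COMMIT below control the transaction.
    """
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")
    try:
        # Build the new summary next to the live one.
        cur.execute("DROP TABLE IF EXISTS page_stats_new")
        cur.execute(
            "CREATE TABLE page_stats_new (page TEXT PRIMARY KEY, hits INTEGER)"
        )
        cur.executemany("INSERT INTO page_stats_new VALUES (?, ?)", rows)
        # Swap: SQLite DDL is transactional, so the drop and rename
        # become visible together at COMMIT.
        cur.execute("DROP TABLE IF EXISTS page_stats")
        cur.execute("ALTER TABLE page_stats_new RENAME TO page_stats")
        cur.execute("COMMIT")
    except Exception:
        # Any failure leaves the previously published table untouched.
        cur.execute("ROLLBACK")
        raise


# Example: two successive batch runs publishing into an in-memory database.
conn = sqlite3.connect(":memory:", isolation_level=None)
publish_summary(conn, [("/home", 10), ("/about", 3)])
publish_summary(conn, [("/home", 25)])
```

If a load fails partway through (a duplicate key, say), the rollback keeps the last good table in place, which is the property that makes this kind of switch-over attractive for periodically refreshed stats.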
