"There are also core design flaws. For example, they use threaded IO...This just won’t scale."

FYI, Kevin, hbase puts up non-blocking server sockets to field client and intra-server communications (it uses Hadoop RPC). Clients of Hadoop's DFS -- e.g. mapreduce jobs, hbase, etc. -- use a blocking thread-per-socket model for swapping big data blocks. Reportedly, the latter has been a sufficient substrate for clusters of thousands of computers.
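
For anyone unfamiliar with the two models, here is a minimal sketch of each as a toy echo server. This is not Hadoop or hbase code: the class EchoModels and its methods are made up for illustration, and both servers are simplified (loopback only, no write backpressure, no per-connection error handling).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration only; none of these names come from Hadoop or hbase.
class EchoModels {

    // Model 1: blocking I/O, one thread per accepted socket. Returns the bound port.
    static int startBlockingServer() throws IOException {
        ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        Thread acceptor = new Thread(() -> {
            while (true) {
                try {
                    Socket s = server.accept(); // blocks until a client connects
                    Thread worker = new Thread(() -> echoBlocking(s));
                    worker.setDaemon(true);
                    worker.start();
                } catch (IOException e) {
                    return; // server socket closed or failed
                }
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
        return server.getLocalPort();
    }

    // Each worker thread sits in read() for the lifetime of its connection.
    private static void echoBlocking(Socket s) {
        try (Socket sock = s) {
            InputStream in = sock.getInputStream();
            OutputStream out = sock.getOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException ignored) {
            // connection dropped; nothing to clean up in this sketch
        }
    }

    // Model 2: non-blocking I/O, one selector thread for all connections.
    static int startSelectorServer() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        Thread loop = new Thread(() -> {
            ByteBuffer buf = ByteBuffer.allocate(4096);
            try {
                while (true) {
                    selector.select(); // blocks until some channel is ready
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isAcceptable()) {
                            SocketChannel c = server.accept();
                            if (c != null) {
                                c.configureBlocking(false);
                                c.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            SocketChannel c = (SocketChannel) key.channel();
                            buf.clear();
                            int n = c.read(buf);
                            if (n < 0) { c.close(); continue; } // peer closed
                            buf.flip();
                            // Simplification: a real server would register
                            // OP_WRITE instead of looping on a slow peer.
                            while (buf.hasRemaining()) c.write(buf);
                        }
                    }
                }
            } catch (IOException e) {
                // sketch: any I/O error stops the loop
            }
        });
        loop.setDaemon(true);
        loop.start();
        return server.socket().getLocalPort();
    }

    // Blocking client used to exercise either server.
    static String roundTrip(int port, String msg) throws IOException {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            byte[] sent = msg.getBytes(StandardCharsets.UTF_8);
            s.getOutputStream().write(sent);
            s.getOutputStream().flush();
            byte[] got = new byte[sent.length];
            new DataInputStream(s.getInputStream()).readFully(got);
            return new String(got, StandardCharsets.UTF_8);
        }
    }
}
```

The trade-off is visible in the thread counts: the blocking server spends one thread per open connection, which is simple and works well for a modest number of long bulk transfers, while the selector server multiplexes every connection onto one thread at the cost of more involved state handling.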

My guess is that when synchronous socket I/O becomes a bottleneck, or when a good case -- rather than a "gut feeling" -- can be made that this model is overly consumptive, changing the HDFS servers to use async I/O will become a priority.

St.Ack



Kevin Burton wrote:
With all the activity over the holidays I forgot to post this to the list...


http://feedblog.org/2007/12/18/powerset-hadoop-rapleaf/

