"There are also core design flaws. For example, they use threaded
IO...This just won’t scale."
FYI, Kevin, hbase puts up non-blocking server sockets to field client
and intra-server communications (it uses Hadoop RPC). Clients of
Hadoop's DFS -- e.g. mapreduce jobs, hbase, etc. -- use blocking
thread-per-socket I/O for swapping big data blocks. Reportedly, the
latter has been a sufficient substrate for clusters of thousands of
computers.
My guess is that when synchronous socket I/O becomes a bottleneck, or a
good case -- rather than a "gut feeling" -- can be made that this model
consumes too many resources, changing the HDFS servers to use async I/O
will become a priority.
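
For anyone unfamiliar with the two models being compared, here is a toy
sketch in Java. It is not Hadoop or hbase code: the class IoModels and
its methods are made up for illustration, and each server handles a
single echo connection. The first method blocks a dedicated thread on
one socket; the second multiplexes accept and read events through one
selector thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration only; not taken from Hadoop or hbase.
class IoModels {

    // Blocking client used against both servers: send, half-close, read to EOF.
    public static String roundTrip(int port, String msg) throws IOException {
        try (Socket c = new Socket("127.0.0.1", port)) {
            c.getOutputStream().write(msg.getBytes(StandardCharsets.UTF_8));
            c.shutdownOutput();
            return new String(c.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Model 1: blocking I/O. The thread is parked in accept() and read()
    // for the life of the connection. A real server would loop on accept()
    // and start one such thread per socket.
    public static String blockingEcho(String msg) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread worker = new Thread(() -> {
                try (Socket s = server.accept()) {
                    s.getInputStream().transferTo(s.getOutputStream());
                } catch (IOException ignored) {
                }
            });
            worker.start();
            String reply = roundTrip(server.getLocalPort(), msg);
            worker.join();
            return reply;
        }
    }

    // Model 2: non-blocking I/O. One thread waits on a Selector and services
    // whichever channel is ready, so idle connections cost no thread.
    public static String nonBlockingEcho(String msg) throws Exception {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            Thread loop = new Thread(() -> {
                try {
                    boolean done = false;
                    while (!done) {
                        selector.select();
                        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                        while (it.hasNext()) {
                            SelectionKey key = it.next();
                            it.remove();
                            if (key.isAcceptable()) {
                                SocketChannel ch = server.accept();
                                if (ch != null) {
                                    ch.configureBlocking(false);
                                    ch.register(selector, SelectionKey.OP_READ);
                                }
                            } else if (key.isReadable()) {
                                SocketChannel ch = (SocketChannel) key.channel();
                                ByteBuffer buf = ByteBuffer.allocate(4096);
                                int n = ch.read(buf);
                                if (n < 0) {
                                    ch.close();   // client finished; stop the toy loop
                                    done = true;
                                } else {
                                    buf.flip();
                                    // Simplification: a real server would register
                                    // OP_WRITE instead of spinning on a full buffer.
                                    while (buf.hasRemaining()) {
                                        ch.write(buf);
                                    }
                                }
                            }
                        }
                    }
                } catch (IOException ignored) {
                }
            });
            loop.start();
            String reply = roundTrip(port, msg);
            loop.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(blockingEcho("ping"));
        System.out.println(nonBlockingEcho("ping"));
    }
}
```

The trade-off is the one argued above: the blocking version is simpler
and costs one thread per open socket, while the selector version keeps
the thread count flat at the price of more involved bookkeeping.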
St.Ack
Kevin Burton wrote:
With all the activity over the holidays I forgot to post this to the list...
http://feedblog.org/2007/12/18/powerset-hadoop-rapleaf/