Lately, I investigate a lot of traditional methods of parallel processing and experiment some high level processing (e.g. matrix algebra, graph algorithm) using Hadoop/Hbase/MapReduce.
The only way to increase speed linearly was locality (Do write all data even if there are duplicated efforts). Increased node numbers, there is a linear increase of IO channel. Network communication is not a solution. -- Best Regards, Edward J. Yoon @ NHN, corp. [email protected] http://blog.udanax.org
