Ryan & Eric, thanks for your input. Currently we're experimenting on a small cluster (5-8 nodes). I'm trying to gather statistics from a cluster of this size in order to estimate what impact adding more nodes will have.
This is proving to be a hard task: with so few nodes it is difficult to compare 2 nodes vs. 5, because the Hadoop DataNodes/RegionServers may be starving with only 2, so performance suffers. I've been playing with the HBase PerformanceEvaluation class, but “sequentialWrite 2” seems to take a long time to run (~20 minutes). That test writes 2 million rows with 2 clients. If I simply emulate 1 client in a simple program (non-MapReduce), I can do 1 million Puts in about 3 minutes. Please help me understand how to use the PerformanceEvaluation class.
--
View this message in context: http://www.nabble.com/HBase-in-a-real-world-application-tp24920888p24939515.html
Sent from the HBase User mailing list archive at Nabble.com.
