Your results seem very low, but your system specs are also quite modest.
On 04/02/2010 04:46 PM, Chen Bangzhong wrote:
> Hi, All
>
> I am benchmarking hbase. My HDFS clusters includes 4 servers (Dell 860, with
> 2 GB RAM). One NameNode, one JobTracker, 2 DataNodes.
>
> My HBase Cluster also comprise 4 servers too. One Master, 2 region and one
> ZooKeeper. (Dell 860, with 2 GB RAM)
>

While I'm far from being an authority on the matter, running datanodes and regionservers together should help performance. Try turning your 2 datanodes + 2 regionservers into 4 servers that each run both a datanode and a regionserver.

> I runned the org.apache.hadoop.PerformanceEvaluation on the ZooKeeper
> server. the ROW_LENGTH was changed from 1000 to ROW_LENGTH = 100*1024;
> So each value will be 100k in size.
>
> hadoop version is 0.20.2, hbase version is 0.20.3. dfs.replication set to 1.
>

Setting replication to 1 isn't going to give results that are very indicative of a "real" application, which makes it questionable as a benchmark. If you intend to run on a single replica at release, you'll be at high risk of data loss.

> The following is the command line:
>
> bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred
> --rows=10000 randomWrite 20.
>
> It tooks about one hour to complete the test(3468628 ms), about 60 writes
> per second. It seems the performance is disappointing.
>
> Is there anything I can do to make hbase perform better under 100k size ?I
> didn't try the method mentioned in the performance wiki yet, because I
> thought 60writes/sec is too low.
>

Do you mean *over* 100k size? 2 GB of RAM is pretty low and you'd likely get significantly better performance with more of it, though at this scale it probably isn't a significant problem.

> If the value size is 1k, hbase performs much better. 200000 sequencewrite
> tooks about 16 seconds, about 12500 writes/per second.
>

Comparing sequentialWrite performance with randomWrite isn't a helpful indicator. Do you have randomWrite results for 1k values?
The way your performance degrades with the size of the records suggests you may have a bottleneck at network transfer. What's rack locality like, and how much bandwidth do you have between the servers?

> Now I am trying to benchmark using two clients on 2 servers, no result yet.
>

You're already running 20 clients on your first server with the PerformanceEvaluation. Do you mean you intend to run 20 on each?

Hopefully someone with better knowledge can give a better answer, but my guess is that you have a network transfer bottleneck. Try doing further tests with randomWrite and decreasing value sizes, and see if the time correlates with the total amount of data written.
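
As a rough sanity check of that guess, here is a small Python sketch that turns the two runs quoted above into bytes per second. It assumes the 100k run wrote 20 clients x 10000 rows = 200,000 values in total (which matches the reported ~60 writes/sec over 3468628 ms) and that the 200000 figure for the 1k run is also a total. The helper name `throughput` is made up for this example and is not part of HBase.

```python
def throughput(total_writes, value_bytes, elapsed_seconds):
    """Return (writes per second, megabytes per second) for one benchmark run."""
    writes_per_sec = total_writes / elapsed_seconds
    mb_per_sec = total_writes * value_bytes / elapsed_seconds / 1e6
    return writes_per_sec, mb_per_sec


# Figures taken from the quoted message; the totals are assumptions (see above).
large = throughput(20 * 10000, 100 * 1024, 3468.628)  # 100k values, randomWrite
small = throughput(200000, 1000, 16.0)                # 1k values, sequential write

for label, (wps, mbps) in (("100k randomWrite", large), ("1k sequential", small)):
    print(f"{label}: {wps:.0f} writes/s, {mbps:.1f} MB/s")
```

If those assumptions hold, the write rate differs by a factor of about 200 between the two runs, but the data rate only by about 2 (roughly 5.9 MB/s versus 12.5 MB/s). That would fit a limit on bytes moved rather than on operations, though it can't tell network from disk, and the two runs use different write patterns.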