Hey stack

>> This gave me 32 regions across 2 of our 3 region servers (we have HDFS
>> across 17 nodes but only machines running 3 RS).
>>
>
> The balancer ran? I'd think it'd balance the regions across the three
> servers. Something stuck in transition stopping the balancer running
> (See master log).

The cluster is balanced on the whole, but the TestTable was only using 2 of the region servers:

12/01/26 09:43:41 INFO master.LoadBalancer: Skipping load balancing. servers=3 regions=1154 average=384.66666 mostloaded=385 leastloaded=384

>> And then the following to scan:
>> $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation scan 5
>>
>
> So, sounds like we're going against two of the three servers only.
>
>> The output of the scan is:
>> 12/01/25 15:11:02 INFO mapred.JobClient: ROWS=5242850
>> 12/01/25 15:11:02 INFO mapred.JobClient: ELAPSED_TIME=1624832
>> (job took 52 secs in reality)
>>
>> Can anyone elaborate on how I am meant to interpret these numbers
>> please? Looks like 3.2 rows per <timeunit>
>>
>
> Your MR job scanned 5M rows. It looks like you had 5 clients so you
> should have had 5 mappers running. The ELAPSED_TIME is supposed to be
> the sum of the elapsed time of all mappers. The above looks way wrong
> to me.

No, we got 50 mappers:

12/01/26 10:32:10 INFO hbase.PerformanceEvaluation: Total # of splits: 50

Does this indicate something suspicious perhaps?

>> [I am trying to benchmark because our real data of 340M rows (215G on
>> HDFS) takes 60 mins to scan which seems a lot]
>>
>
> Three servers? Scanning in sequence? What rate you seeing per server
> Tim? What kind of servers (I think you've posted your profile the
> list before but ... it was a while back (smile)). What size the rows
> being returned?

So we have a really unbalanced cluster with 3 classes of machine. 17 nodes are running HDFS and MR, and 3 of them are running HBase. The RS are running on:
- 2x Intel(R) Xeon(R) CPU E5630 @ 2.53GHz (quad)
- 6x250G SATA 5.4K
- 24GB

A full scan of the big tables (running through the API, or using Hive) runs in about 1hr, which basic maths suggests to me is about 20k rows / sec / region (approx. 4kb per row).

Because the HDFS balancer runs, these tables are likely to hold non-local data, so I want to sanity-check things using the PE before we go and buy new machines. For the PE test I've checked the TestTable is local to the RS. I think it is best to try and confirm the PE is performing as expected before looking at our data.

Thanks!
Tim
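
P.S. For anyone wanting to redo the arithmetic on the figures quoted above, here is a rough Python sketch. It only rearranges numbers already in this thread. Three things in it are assumptions rather than facts confirmed here: that ELAPSED_TIME is in milliseconds, that PE hands each client 1024 * 1024 rows, and that PE cuts each client's range into 10 map splits. The function names are made up for illustration.

```python
# Rough arithmetic on the figures quoted in this thread.
# Assumptions (not confirmed in the thread): ELAPSED_TIME is in milliseconds,
# PE gives each client 1024 * 1024 rows, and PE cuts each client's range
# into 10 map splits.

PE_ROWS = 5_242_850         # ROWS counter from the PE scan job
PE_ELAPSED_MS = 1_624_832   # ELAPSED_TIME counter, summed over all mappers
PE_SPLITS = 50              # "Total # of splits: 50"
PE_WALL_CLOCK_S = 52        # observed job duration in seconds
PE_SERVERS_USED = 2         # TestTable regions sat on 2 of the 3 RS

FULL_ROWS = 340_000_000     # rows in the real table
FULL_SCAN_S = 60 * 60       # the full scan takes about 60 minutes
FULL_SERVERS = 3            # region servers serving the real table


def expected_pe_rows(clients, rows_per_client=1024 * 1024, splits_per_client=10):
    """Rows scanned if every split covers rows_per_client // splits_per_client rows."""
    return (rows_per_client // splits_per_client) * splits_per_client * clients


def pe_summary(rows, elapsed_ms, splits, wall_clock_s, servers):
    """Derive per-mapper and per-server rates from the PE job counters."""
    return {
        "rows_per_summed_ms": rows / elapsed_ms,
        "avg_mapper_s": elapsed_ms / splits / 1000.0,
        "rows_per_s_wall": rows / wall_clock_s,
        "rows_per_s_per_server": rows / wall_clock_s / servers,
    }


def scan_rate(rows, seconds, servers):
    """Average rows per second per server for a scan of the whole table."""
    return rows / seconds / servers


if __name__ == "__main__":
    for name, value in pe_summary(
        PE_ROWS, PE_ELAPSED_MS, PE_SPLITS, PE_WALL_CLOCK_S, PE_SERVERS_USED
    ).items():
        print(f"{name}: {value:.2f}")
    print("rows expected for 5 clients:", expected_pe_rows(5))
    print(f"full scan rows/s/server: {scan_rate(FULL_ROWS, FULL_SCAN_S, FULL_SERVERS):.0f}")
```

Under those assumptions, 5 clients at 10 splits each give the 50 splits seen in the log and reproduce ROWS=5242850 exactly, and the summed ELAPSED_TIME averages out to roughly 32 s per mapper, which fits inside a 52 s job. If the assumptions are wrong, the numbers need a different reading.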
