[
https://issues.apache.org/jira/browse/HBASE-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150214#comment-14150214
]
stack commented on HBASE-12040:
-------------------------------
Here are [~jmspaggi]'s notes, from the mailing list, on how he ran his tests:
{code}
You were right, this is my "small" cluster. It's 4 nodes. One master, 3 RS.
The "big" cluster (8 nodes) is reserved for Lars (0.94) for now ;)
I run the tests using this command:
for i in {1..10}; do
  echo; echo -n $i
  /home/hadoop/bin/hadoop fs -rmr /hbase/*
  rm -rf /tmp/*
  echo rmr /hbase | /home/zookeeper/zookeeper-3.4.3/bin/zkCli.sh
  bin/start-hbase.sh; sleep 60
  bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
  echo balancer | bin/hbase shell; sleep 60
  bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=100 --nomapred filterScan 1
  bin/stop-hbase.sh
done &>> output.txt
Basically, it removes everything, starts HBase, puts some data in it, and
lets it settle down. Then it runs the test. I do that 10 times for each test
and discard the slowest and fastest runs.
Nodes are 16GB.
Extract from hbase-env.sh:
export JAVA_HOME=/usr/local/jdk1.7.0_45/
# Extra Java CLASSPATH elements. Optional.
# export HBASE_CLASSPATH=
# The maximum amount of heap to use, in MB. Default is 1000.
export HBASE_HEAPSIZE=10240
Configured properties:
<property>
<name>hbase.rootdir</name>
<value>hdfs://hbasetest1:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hbasetest1.distparser.com</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/zookeeper</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hbasetest1:9000/</value>
</property>
<property>
<name>hbase.regionserver.codecs</name>
<value>gz</value>
</property>
<property>
<name>ipc.server.tcpnodelay</name>
<value>true</value>
</property>
<property>
<name>ipc.client.tcpnodelay</name>
<value>true</value>
</property>
<property>
<name>hbase.regionserver.region.split.policy</name>
<value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>
<property>
<name>hbase.hregion.max.filesize</name>
<value>1073741824000</value>
</property>
3 disks only per node:
hbase@hbasetest2:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1,8T 0 disk
├─sda1 8:1 0 93,1G 0 part /
└─sda2 8:2 0 1,7T 0 part /data1
sdb 8:16 0 1,8T 0 disk
├─sdb1 8:17 0 9,3G 0 part [SWAP]
└─sdb2 8:18 0 1,8T 0 part /data2
sdc 8:32 0 1,8T 0 disk
└─sdc1 8:33 0 1,8T 0 part /data3
data1 to data3 are the datanode partitions.
Drives are pretty empty:
hbase@hbasetest2:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 92G 19G 69G 22% /
udev 10M 0 10M 0% /dev
tmpfs 1,6G 272K 1,6G 1% /run
tmpfs 5,0M 0 5,0M 0% /run/lock
tmpfs 5,0G 0 5,0G 0% /run/shm
/dev/sda2 1,8T 399G 1,4T 24% /data1
/dev/sdb2 1,9T 398G 1,4T 22% /data2
/dev/sdc1 1,9T 397G 1,4T 22% /data3
RegionServers are SATA, Master is SSD. Only one ZK server hosted on the
master too.
0.94.x tests run with hadoop 1.2.1.
0.98.x+ tests run with hadoop 2.2.0
I'm trying to build 0.99 from source so that I can run it, and also run
some specific revisions. But no success so far ;)
Just ask me whatever else you might want to know about the cluster. I can
even give you remote access.
JM
{code}
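The "run it 10 times and discard the extremes" step in the notes above can be sketched as follows. This is only a minimal illustration, assuming the per-run elapsed times have already been pulled out of output.txt by hand; the function name `trimmed_mean` and the sample numbers are hypothetical and are not JM's measurements.

```python
def trimmed_mean(timings_ms):
    """Average the timings after dropping the single fastest and slowest run."""
    if len(timings_ms) < 3:
        raise ValueError("need at least 3 runs to drop the two extremes")
    # Sorting puts the fastest run first and the slowest last; slice both off.
    kept = sorted(timings_ms)[1:-1]
    return sum(kept) / len(kept)


# Hypothetical elapsed times (ms) for 10 runs; NOT data from this issue.
sample = [1200, 1180, 1250, 1190, 3000, 1210, 900, 1205, 1195, 1220]
print(trimmed_mean(sample))
```

Dropping one value at each end keeps a single outlier run (for example, one hit by a compaction or a slow startup) from skewing the average.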
In standalone mode, master and the tip of 0.98 perform about the same; 0.98.6 runs
about 15% faster. Let me try on a cluster to see if the difference is more marked there.
> Performances issues with FilteredScanTest
> ------------------------------------------
>
> Key: HBASE-12040
> URL: https://issues.apache.org/jira/browse/HBASE-12040
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.99.1
> Reporter: Jean-Marc Spaggiari
> Assignee: stack
> Priority: Blocker
> Fix For: 0.98.7, 0.99.1
>
> Attachments: at-HBASE-11331.html, pre-HBASE-11331.html
>
>
> While testing the performance of the 0.99.0RC1 release compared to 0.98.6, I found
> that:
> - FilteredScanTest is 100 times slower;
> - RandomReadTest is 1.5 times slower;
> - RandomSeekScanTest is 3.2 times slower;
> - RandomScanWithRange10Test is 1.2 times slower;
> - RandomScanWithRange100Test is 1.3 times slower;
> - RandomScanWithRange1000Test is 4 times slower;
> - SequentialReadTest is 1.7 times slower;
> - SequentialWriteTest is just a bit faster;
> - RandomWriteTest is just a bit faster;
> - GaussianRandomReadBenchmark is just a bit slower;
> - SequentialReadBenchmark is 1.1 times slower;
> - SequentialWriteBenchmark is 1.1 times slower;
> - UniformRandomReadBenchmark crashed;
> - UniformRandomSmallScan is 1.3 times slower.
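
The "N times slower" figures in the list above read most naturally as a ratio of elapsed times between the two releases. A minimal sketch of that arithmetic, assuming this interpretation; the helper name `slowdown_factor` and the input numbers are hypothetical and are not the measurements behind this issue:

```python
def slowdown_factor(baseline_ms, candidate_ms):
    """Return how many times slower the candidate run is than the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline elapsed time must be positive")
    return candidate_ms / baseline_ms


# Hypothetical elapsed times (ms); NOT the measured results from this issue.
baseline = 2000   # e.g. a run on the older release
candidate = 3000  # e.g. the same test on the newer release
print(slowdown_factor(baseline, candidate))
```

A value above 1 means the candidate is slower; a value slightly below 1 corresponds to the "just a bit faster" entries.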
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)