Not sure why my performance is so slow. Here is my configuration:

box1:
10395 SecondaryNameNode
11628 Jps
10131 NameNode
10638 HQuorumPeer
10705 HMaster
boxes 2-5:
6741 HQuorumPeer
6841 HRegionServer
7881 Jps
6610 DataNode

hbase-site.xml:
=======================
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
 * Copyright 2007 The Apache Software Foundation
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://box1:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
    <description>The port that the HBase master runs at.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed ZooKeeper
      true: fully-distributed with unmanaged ZooKeeper quorum (see hbase-env.sh)
    </description>
  </property>
  <property>
    <name>hbase.regionserver.lease.period</name>
    <value>120000</value>
    <description>HRegion server lease period in milliseconds. Default is
      60 seconds. Clients must report in within this period else they are
      considered dead.</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
    <description>Property from ZooKeeper's config zoo.cfg. The port at
      which the clients will connect.</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/zookeeper</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.syncLimit</name>
    <value>5</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.tickTime</name>
    <value>2000</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.initLimit</name>
    <value>10</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>box1,box2,box3,box4</value>
    <description>Comma separated list of servers in the ZooKeeper quorum.
      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
      By default this is set to localhost for local and pseudo-distributed
      modes of operation. For a fully-distributed setup, this should be set
      to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set
      in hbase-env.sh this is the list of servers which we will start/stop
      ZooKeeper on.
    </description>
  </property>
  <property>
    <name>hfile.block.cache.size</name>
    <value>.5</value>
    <description>text</description>
  </property>
</configuration>

hbase-env.sh:
====================================================
export HBASE_CLASSPATH=${HADOOP_CONF_DIR}
export HBASE_HEAPSIZE=3000
export HBASE_OPTS="-XX:NewSize=6m -XX:MaxNewSize=6m -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+CMSIncrementalMode -Xloggc:/home/hadoop/hbase-0.20.0/logs/gc-hbase.log"
export HBASE_MANAGES_ZK=true

Hadoop core-site.xml:
===========================================================
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://box1:9000</value>
    <description>The name of the default file system. A URI whose scheme
      and authority determine the FileSystem implementation. The URI's
      scheme determines the config property (fs.SCHEME.impl) naming the
      FileSystem implementation class. The URI's authority is used to
      determine the host, port, etc. for a filesystem.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-0.20.0-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
==============
Replication is set to 6.

hadoop-env.sh:
=================
export HADOOP_HEAPSIZE=3000
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
==================

Very basic setup. Then I start the cluster and do simple random Get operations on a tall table (~60M rows):

{NAME => 'tallTable', FAMILIES => [{NAME => 'family1', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}

Are these fairly normal speeds? I'm unsure if this is just a result of having a small cluster. Please advise...


stack-3 wrote:
> 
> Yeah, seems slow. In old HBase it could do 5-10k writes a second, going
> by the performance evaluation page up on the wiki. SequentialWrite was
> about the same as RandomWrite. Check out the stats on the hardware up on
> that page and the description of how the test was set up. Can you figure
> out where it's slow?
> 
> St.Ack
> 
> On Wed, Aug 12, 2009 at 10:10 AM, llpind <[email protected]> wrote:
> 
>>
>> Thanks Stack.
>>
>> I will try mapred with more clients.
>> I tried it without mapred, using 3 clients doing randomWrite operations. Here was the output:
>>
>> 09/08/12 09:22:52 INFO hbase.PerformanceEvaluation: client-0 Start randomWrite at offset 0 for 1048576 rows
>> 09/08/12 09:22:52 INFO hbase.PerformanceEvaluation: client-1 Start randomWrite at offset 1048576 for 1048576 rows
>> 09/08/12 09:22:52 INFO hbase.PerformanceEvaluation: client-2 Start randomWrite at offset 2097152 for 1048576 rows
>> 09/08/12 09:24:23 INFO hbase.PerformanceEvaluation: client-1 1048576/1153427/2097152
>> 09/08/12 09:24:23 INFO hbase.PerformanceEvaluation: client-2 2097152/2201997/3145728
>> 09/08/12 09:24:25 INFO hbase.PerformanceEvaluation: client-0 0/104857/1048576
>> 09/08/12 09:27:42 INFO hbase.PerformanceEvaluation: client-0 0/209714/1048576
>> 09/08/12 09:27:46 INFO hbase.PerformanceEvaluation: client-1 1048576/1258284/2097152
>> 09/08/12 09:27:46 INFO hbase.PerformanceEvaluation: client-2 2097152/2306854/3145728
>> 09/08/12 09:32:32 INFO hbase.PerformanceEvaluation: client-1 1048576/1363141/2097152
>> 09/08/12 09:32:33 INFO hbase.PerformanceEvaluation: client-0 0/314571/1048576
>> 09/08/12 09:32:41 INFO hbase.PerformanceEvaluation: client-2 2097152/2411711/3145728
>> 09/08/12 09:35:31 INFO hbase.PerformanceEvaluation: client-0 0/419428/1048576
>> 09/08/12 09:35:34 INFO hbase.PerformanceEvaluation: client-1 1048576/1467998/2097152
>> 09/08/12 09:35:53 INFO hbase.PerformanceEvaluation: client-2 2097152/2516568/3145728
>> 09/08/12 09:39:02 INFO hbase.PerformanceEvaluation: client-0 0/524285/1048576
>> 09/08/12 09:39:03 INFO hbase.PerformanceEvaluation: client-2 2097152/2621425/3145728
>> 09/08/12 09:40:07 INFO hbase.PerformanceEvaluation: client-1 1048576/1572855/2097152
>> 09/08/12 09:42:53 INFO hbase.PerformanceEvaluation: client-0 0/629142/1048576
>> 09/08/12 09:44:25 INFO hbase.PerformanceEvaluation: client-2 2097152/2726282/3145728
>> 09/08/12 09:44:44 INFO hbase.PerformanceEvaluation: client-1 1048576/1677712/2097152
>> 09/08/12 09:46:43 INFO hbase.PerformanceEvaluation: client-0 0/733999/1048576
>> 09/08/12 09:48:11 INFO hbase.PerformanceEvaluation: client-2 2097152/2831139/3145728
>> 09/08/12 09:48:29 INFO hbase.PerformanceEvaluation: client-1 1048576/1782569/2097152
>> 09/08/12 09:50:12 INFO hbase.PerformanceEvaluation: client-0 0/838856/1048576
>> 09/08/12 09:52:47 INFO hbase.PerformanceEvaluation: client-2 2097152/2935996/3145728
>> 09/08/12 09:53:51 INFO hbase.PerformanceEvaluation: client-1 1048576/1887426/2097152
>> 09/08/12 09:56:32 INFO hbase.PerformanceEvaluation: client-0 0/943713/1048576
>> 09/08/12 09:58:32 INFO hbase.PerformanceEvaluation: client-2 2097152/3040853/3145728
>> 09/08/12 09:59:14 INFO hbase.PerformanceEvaluation: client-1 1048576/1992283/2097152
>> 09/08/12 10:02:28 INFO hbase.PerformanceEvaluation: client-0 0/1048570/1048576
>> 09/08/12 10:02:30 INFO hbase.PerformanceEvaluation: client-0 Finished randomWrite in 2376615ms at offset 0 for 1048576 rows
>> 09/08/12 10:02:30 INFO hbase.PerformanceEvaluation: Finished 0 in 2376615ms writing 1048576 rows
>> 09/08/12 10:06:35 INFO hbase.PerformanceEvaluation: client-2 2097152/3145710/3145728
>> 09/08/12 10:06:38 INFO hbase.PerformanceEvaluation: client-2 Finished randomWrite in 2623395ms at offset 2097152 for 1048576 rows
>> 09/08/12 10:06:38 INFO hbase.PerformanceEvaluation: Finished 2 in 2623395ms writing 1048576 rows
>> 09/08/12 10:06:42 INFO hbase.PerformanceEvaluation: client-1 1048576/2097140/2097152
>> 09/08/12 10:06:43 INFO hbase.PerformanceEvaluation: client-1 Finished randomWrite in 2630199ms at offset 1048576 for 1048576 rows
>> 09/08/12 10:06:43 INFO hbase.PerformanceEvaluation: Finished 1 in 2630199ms writing 1048576 rows
>>
>> Seems kind of slow for ~3M records. I have a 4 node cluster up at the
>> moment, with HMaster & NameNode running on the same box.
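
P.S. To put a number on the quoted run: each client wrote 1,048,576 rows, and the slowest client finished in 2,630,199 ms (about 44 minutes). That works out to roughly 400 writes/second per client, or about 1,200 writes/second aggregate across the three clients, an order of magnitude below the 5-10k writes/second figure cited above. For reference, a standalone (no MapReduce) three-client run like the one quoted is kicked off with something along the lines of "bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred randomWrite 3"; the exact flags may differ by version, so run the class with no arguments to see its usage string.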

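P.P.S. In case it helps to see the read side: the "simple random Get operations" mentioned above are just a single-threaded loop along the lines of the sketch below, written against the 0.20 client API. The zero-padded row-key format is an assumption of mine; match it to however your loader actually built the keys.

====================================================
import java.io.IOException;
import java.util.Random;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class RandomGetTest {
  public static void main(String[] args) throws IOException {
    // Picks up hbase-site.xml from the classpath (HBASE_CLASSPATH above).
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "tallTable");
    Random rand = new Random();
    int n = 10000;   // number of random gets to time
    int misses = 0;
    long start = System.currentTimeMillis();
    for (int i = 0; i < n; i++) {
      // ASSUMPTION: row keys are zero-padded row numbers over the
      // ~60M-row key space; adjust the format to match your data.
      byte[] row = Bytes.toBytes(String.format("%010d", rand.nextInt(60000000)));
      Get get = new Get(row);
      get.addFamily(Bytes.toBytes("family1"));
      Result r = table.get(get);
      if (r.isEmpty()) {
        misses++;    // row key did not exist
      }
    }
    long ms = System.currentTimeMillis() - start;
    System.out.println(n + " gets in " + ms + "ms = "
        + (n * 1000L / ms) + " gets/sec, " + misses + " misses");
  }
}
====================================================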