Yes Hyunsik, but that's all I know from the Tajo website. I guess there are more default configurations that are not shown on the Tajo wiki, right?
On Mon, Jan 19, 2015 at 4:52 PM, Hyunsik Choi <[email protected]> wrote:
> Thank you for sharing the machine information.
>
> In my opinion, we can boost Tajo performance considerably on this
> machine with proper configuration if the server is dedicated to Tajo.
> I think that the configuration that we mentioned above only uses some
> of the physical resources in the machine :)
>
> Warm regards,
> Hyunsik
>
> On Sun, Jan 18, 2015 at 8:10 PM, Azuryy Yu <[email protected]> wrote:
> > Thanks Hyunsik.
> >
> > I asked our infra team: my 6-node Tajo cluster was virtualized from one
> > host. That means I run a 6-node Tajo cluster on one physical host (24 CPUs,
> > 64G mem, 4T*12 HDD).
> >
> > So I think this was the real performance bottleneck.
> >
> > On Mon, Jan 19, 2015 at 11:12 AM, Hyunsik Choi <[email protected]> wrote:
> >
> >> Hi Azuryy,
> >>
> >> Tajo automatically rewrites distinct aggregation queries into
> >> multi-level aggregations. The query rewrite that Jinho suggested may
> >> already be involved.
> >>
> >> I think that your query response times (12 ~ 15 secs) for distinct
> >> count seem reasonable because the plain count aggregation takes 5
> >> secs. Usually, distinct aggregation queries are much slower than
> >> plain aggregation queries because distinct aggregation involves a sort,
> >> large intermediate data, and distinct-only value handling.
> >>
> >> In addition, I have a question so that I can give a better configuration
> >> guide. Could you share the available CPU, memory and disks for Tajo?
> >>
> >> Even though Jinho suggested one, there is still room to set more exact and
> >> better configurations. Since the resource configuration determines the
> >> number of concurrent tasks, it may be the main cause of your performance
> >> problem.
> >>
> >> Best regards,
> >> Hyunsik
> >>
> >> On Sun, Jan 18, 2015 at 6:54 PM, Jinho Kim <[email protected]> wrote:
> >> > Sorry for my mistaken example query.
> >> > Can you change to “select count(a.auid) from ( select auid from
> >> > test_pl_00_0 group by auid ) a;” ?
> >> >
> >> > -Jinho
> >> > Best regards
> >> >
> >> > 2015-01-19 11:44 GMT+09:00 Azuryy Yu <[email protected]>:
> >> >
> >> >> Sorry for not responding over the weekend.
> >> >> I changed hdfs-site.xml and restarted HDFS and Tajo, but it's slower
> >> >> than before.
> >> >>
> >> >> default> select count(a.auid) from ( select auid from test_pl_00_0 ) a;
> >> >> Progress: 0%, response time: 1.132 sec
> >> >> Progress: 0%, response time: 1.134 sec
> >> >> Progress: 0%, response time: 1.536 sec
> >> >> Progress: 0%, response time: 2.338 sec
> >> >> Progress: 0%, response time: 3.341 sec
> >> >> Progress: 3%, response time: 4.343 sec
> >> >> Progress: 4%, response time: 5.346 sec
> >> >> Progress: 9%, response time: 6.35 sec
> >> >> Progress: 11%, response time: 7.352 sec
> >> >> Progress: 16%, response time: 8.354 sec
> >> >> Progress: 18%, response time: 9.362 sec
> >> >> Progress: 24%, response time: 10.364 sec
> >> >> Progress: 27%, response time: 11.366 sec
> >> >> Progress: 29%, response time: 12.368 sec
> >> >> Progress: 32%, response time: 13.37 sec
> >> >> Progress: 37%, response time: 14.373 sec
> >> >> Progress: 40%, response time: 15.377 sec
> >> >> Progress: 42%, response time: 16.379 sec
> >> >> Progress: 42%, response time: 17.382 sec
> >> >> Progress: 43%, response time: 18.384 sec
> >> >> Progress: 43%, response time: 19.386 sec
> >> >> Progress: 45%, response time: 20.388 sec
> >> >> Progress: 45%, response time: 21.391 sec
> >> >> Progress: 46%, response time: 22.393 sec
> >> >> Progress: 46%, response time: 23.395 sec
> >> >> Progress: 48%, response time: 24.398 sec
> >> >> Progress: 48%, response time: 25.401 sec
> >> >> Progress: 50%, response time: 26.403 sec
> >> >> Progress: 100%, response time: 26.95 sec
> >> >> ?count
> >> >> -------------------------------
> >> >> 4487999
> >> >> (1 rows, 26.95 sec, 8 B selected)
> >> >>
> >> >> default> select count(distinct auid) from test_pl_00_0;
> >> >> Progress: 0%, response time: 0.88 sec
> >> >> Progress: 0%, response time: 0.881 sec
> >> >> Progress: 0%, response time: 1.283 sec
> >> >> Progress: 0%, response time: 2.086 sec
> >> >> Progress: 0%, response time: 3.088 sec
> >> >> Progress: 0%, response time: 4.09 sec
> >> >> Progress: 25%, response time: 5.092 sec
> >> >> Progress: 33%, response time: 6.094 sec
> >> >> Progress: 50%, response time: 7.096 sec
> >> >> Progress: 50%, response time: 8.098 sec
> >> >> Progress: 50%, response time: 9.099 sec
> >> >> Progress: 66%, response time: 10.101 sec
> >> >> Progress: 66%, response time: 11.103 sec
> >> >> Progress: 83%, response time: 12.105 sec
> >> >> Progress: 100%, response time: 12.268 sec
> >> >> ?count
> >> >> -------------------------------
> >> >> 1222356
> >> >> (1 rows, 12.268 sec, 8 B selected)
> >> >>
> >> >> On Sat, Jan 17, 2015 at 11:00 PM, Jinho Kim <[email protected]> wrote:
> >> >>
> >> >> > Thank you for your sharing
> >> >> >
> >> >> > Can you enable the dfs.datanode.hdfs-blocks-metadata.enabled in
> >> >> > hdfs-site.xml ?
> >> >> > If you enable the block-metadata, tajo-cluster can use the volume load
> >> >> > balancing. You should restart the datanode and tajo cluster. I will
> >> >> > investigate performance of count-distinct operator.
> >> >> > And you can change to
> >> >> > “select count(a.auid) from ( select auid from test_pl_00_0 ) a”
> >> >> >
> >> >> > -Jinho
> >> >> > Best regards
> >> >> >
> >> >> > 2015-01-16 18:05 GMT+09:00 Azuryy Yu <[email protected]>:
> >> >> >
> >> >> > > default> select count(*) from test_pl_00_0;
> >> >> > > Progress: 0%, response time: 0.718 sec
> >> >> > > Progress: 0%, response time: 0.72 sec
> >> >> > > Progress: 0%, response time: 1.121 sec
> >> >> > > Progress: 12%, response time: 1.923 sec
> >> >> > > Progress: 28%, response time: 2.925 sec
> >> >> > > Progress: 41%, response time: 3.927 sec
> >> >> > > Progress: 50%, response time: 4.931 sec
> >> >> > > Progress: 100%, response time: 5.323 sec
> >> >> > > 2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800: [ParNew: 26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K), 0.0080700 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> > > 2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800: [ParNew: 27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K), 0.0068130 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> > > 2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800: [ParNew: 28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K), 0.0092430 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> > > 2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800: [ParNew: 26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K), 0.0068160 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> > > ?count
> >> >> > > -------------------------------
> >> >> > > 4487999
> >> >> > > (1 rows, 5.323 sec, 8 B selected)
> >> >> > >
> >> >> > > On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <[email protected]> wrote:
> >> >> > >
> >> >> > > > Hi,
> >> >> > > > There is no big improvement,
> >> >> > > > sometimes even slower than before. I also tried increasing the
> >> >> > > > worker's heap size and parallelism, but nothing improved.
> >> >> > > >
> >> >> > > > default> select count(distinct auid) from test_pl_00_0;
> >> >> > > > Progress: 0%, response time: 0.963 sec
> >> >> > > > Progress: 0%, response time: 0.964 sec
> >> >> > > > Progress: 0%, response time: 1.366 sec
> >> >> > > > Progress: 0%, response time: 2.168 sec
> >> >> > > > Progress: 0%, response time: 3.17 sec
> >> >> > > > Progress: 0%, response time: 4.172 sec
> >> >> > > > Progress: 16%, response time: 5.174 sec
> >> >> > > > Progress: 16%, response time: 6.176 sec
> >> >> > > > Progress: 16%, response time: 7.178 sec
> >> >> > > > Progress: 33%, response time: 8.18 sec
> >> >> > > > Progress: 50%, response time: 9.181 sec
> >> >> > > > Progress: 50%, response time: 10.183 sec
> >> >> > > > Progress: 50%, response time: 11.185 sec
> >> >> > > > Progress: 50%, response time: 12.187 sec
> >> >> > > > Progress: 66%, response time: 13.189 sec
> >> >> > > > Progress: 66%, response time: 14.19 sec
> >> >> > > > Progress: 100%, response time: 15.003 sec
> >> >> > > > 2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800: [ParNew: 26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K), 0.0105720 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
> >> >> > > > 2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800: [ParNew: 27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K), 0.0086940 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> > > > 2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800: [ParNew: 27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K), 0.0123210 secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
> >> >> > > > 2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800:
> >> >> > > > [ParNew: 26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K), 0.0071470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> >> >> > > > ?count
> >> >> > > > -------------------------------
> >> >> > > > 1222356
> >> >> > > > (1 rows, 15.003 sec, 8 B selected)
> >> >> > > >
> >> >> > > > On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <[email protected]> wrote:
> >> >> > > >
> >> >> > > >> Thanks Kim, I'll try and post back.
> >> >> > > >>
> >> >> > > >> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <[email protected]> wrote:
> >> >> > > >>
> >> >> > > >>> Thanks Azuryy Yu
> >> >> > > >>>
> >> >> > > >>> Your tajo-worker runs 10 tasks in parallel, but its heap memory is
> >> >> > > >>> 3GB. This causes long JVM pauses.
> >> >> > > >>> I recommend the following:
> >> >> > > >>>
> >> >> > > >>> tajo-env.sh
> >> >> > > >>> TAJO_WORKER_HEAPSIZE=3000 or more
> >> >> > > >>>
> >> >> > > >>> tajo-site.xml
> >> >> > > >>> <!-- worker -->
> >> >> > > >>> <property>
> >> >> > > >>>   <name>tajo.worker.resource.memory-mb</name>
> >> >> > > >>>   <value>3512</value> <!-- 3 tasks + 1 qm task -->
> >> >> > > >>> </property>
> >> >> > > >>> <property>
> >> >> > > >>>   <name>tajo.task.memory-slot-mb.default</name>
> >> >> > > >>>   <value>1000</value> <!-- default 512 -->
> >> >> > > >>> </property>
> >> >> > > >>> <property>
> >> >> > > >>>   <name>tajo.worker.resource.dfs-dir-aware</name>
> >> >> > > >>>   <value>true</value>
> >> >> > > >>> </property>
> >> >> > > >>> <!-- end -->
> >> >> > > >>>
> >> >> > > >>> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
> >> >> > > >>>
> >> >> > > >>> -Jinho
> >> >> > > >>> Best regards
> >> >> > > >>>
> >> >> > > >>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <[email protected]>:
> >> >> > > >>>
> >> >> > > >>> > Thanks Kim.
> >> >> > > >>> >
> >> >> > > >>> > The following is my tajo-env and tajo-site
> >> >> > > >>> >
> >> >> > > >>> > *tajo-env.sh:*
> >> >> > > >>> > export HADOOP_HOME=/usr/local/hadoop
> >> >> > > >>> > export JAVA_HOME=/usr/local/java
> >> >> > > >>> > _TAJO_OPTS="-server -verbose:gc
> >> >> > > >>> > -XX:+PrintGCDateStamps
> >> >> > > >>> > -XX:+PrintGCDetails
> >> >> > > >>> > -XX:+UseGCLogFileRotation
> >> >> > > >>> > -XX:NumberOfGCLogFiles=9
> >> >> > > >>> > -XX:GCLogFileSize=256m
> >> >> > > >>> > -XX:+DisableExplicitGC
> >> >> > > >>> > -XX:+UseCompressedOops
> >> >> > > >>> > -XX:SoftRefLRUPolicyMSPerMB=0
> >> >> > > >>> > -XX:+UseFastAccessorMethods
> >> >> > > >>> > -XX:+UseParNewGC
> >> >> > > >>> > -XX:+UseConcMarkSweepGC
> >> >> > > >>> > -XX:+CMSParallelRemarkEnabled
> >> >> > > >>> > -XX:CMSInitiatingOccupancyFraction=70
> >> >> > > >>> > -XX:+UseCMSCompactAtFullCollection
> >> >> > > >>> > -XX:CMSFullGCsBeforeCompaction=0
> >> >> > > >>> > -XX:+CMSClassUnloadingEnabled
> >> >> > > >>> > -XX:CMSMaxAbortablePrecleanTime=300
> >> >> > > >>> > -XX:+CMSScavengeBeforeRemark
> >> >> > > >>> > -XX:PermSize=160m
> >> >> > > >>> > -XX:GCTimeRatio=19
> >> >> > > >>> > -XX:SurvivorRatio=2
> >> >> > > >>> > -XX:MaxTenuringThreshold=60"
> >> >> > > >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> >> >> > > >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> >> >> > > >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> >> >> > > >>> > export TAJO_OPTS=$_TAJO_OPTS
> >> >> > > >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> >> >> > > >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> >> >> > > >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> >> >> > > >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> >> >> > > >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> >> >> > > >>> > export TAJO_WORKER_STANDBY_MODE=true
> >> >> > > >>> >
> >> >> > > >>> > *tajo-site.xml:*
> >> >> > > >>> >
> >> >> > > >>> > <configuration>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.rootdir</name>
> >> >> > > >>> >     <value>hdfs://test-cluster/tajo</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.master.umbilical-rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26001</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.master.client-rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26002</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.resource-tracker.rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26003</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.catalog.client-rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26005</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.tmpdir.locations</name>
> >> >> > > >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <!-- worker -->
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> >> >> > > >>> >     <value>4</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.memory-mb</name>
> >> >> > > >>> >     <value>5120</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> >> >> > > >>> >     <value>true</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.dedicated</name>
> >> >> > > >>> >     <value>true</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
> >> >> > > >>> >     <value>0.6</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> > </configuration>
> >> >> > > >>> >
> >> >> > > >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <[email protected]> wrote:
> >> >> > > >>> >
> >> >> > > >>> > > Hello Azuryy Yu
> >> >> > > >>> > >
> >> >> > > >>> > > I left some comments.
> >> >> > > >>> > >
> >> >> > > >>> > > -Jinho
> >> >> > > >>> > > Best regards
> >> >> > > >>> > >
> >> >> > > >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <[email protected]>:
> >> >> > > >>> > >
> >> >> > > >>> > > > Hi,
> >> >> > > >>> > > >
> >> >> > > >>> > > > I tested Tajo half a year ago, then stopped focusing on Tajo
> >> >> > > >>> > > > because of other work.
> >> >> > > >>> > > >
> >> >> > > >>> > > > This week I set up a small dev Tajo cluster (six nodes, VMs)
> >> >> > > >>> > > > based on Hadoop-2.6.0.
> >> >> > > >>> > > >
> >> >> > > >>> > > > So my questions are:
> >> >> > > >>> > > >
> >> >> > > >>> > > > 1) From what I knew half a year ago, Tajo worked on Yarn, using
> >> >> > > >>> > > > the Yarn scheduler to manage job resources. But now I found it
> >> >> > > >>> > > > doesn't rely on Yarn, because I only started HDFS daemons, no
> >> >> > > >>> > > > Yarn daemons. So does Tajo have its own job scheduler?
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> > > Now, Tajo uses its own task scheduler, and you can start Tajo
> >> >> > > >>> > > without Yarn daemons.
> >> >> > > >>> > > Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
> >> >> > > >>> > >
> >> >> > > >>> > > >
> >> >> > > >>> > > > 2) Do we need to put file replicas on every node in the Tajo
> >> >> > > >>> > > > cluster?
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> > > No, Tajo does not need more replication.
> >> >> > > >>> > > If you set more replication, data locality can be increased.
> >> >> > > >>> > >
> >> >> > > >>> > > > For example, I have a six-node Tajo cluster; should I set HDFS
> >> >> > > >>> > > > block replication to six? Because:
> >> >> > > >>> > > >
> >> >> > > >>> > > > I noticed that when I run a Tajo query, some nodes are busy but
> >> >> > > >>> > > > some are idle, because the file's blocks are located only on
> >> >> > > >>> > > > those nodes and not on the others.
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> > > In my opinion, you need to run the balancer:
> >> >> > > >>> > >
> >> >> > > >>> > > http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> >> >> > > >>> > >
> >> >> > > >>> > > > 3) The test data set is 4 million rows, several GB in size, but
> >> >> > > >>> > > > it's very slow when I run: select count(distinct ID) from ****;
> >> >> > > >>> > > > Any possible problems here?
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> > > Could you share tajo-env.sh, tajo-site.xml ?
> >> >> > > >>> > >
> >> >> > > >>> > > >
> >> >> > > >>> > > > Thanks
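
The memory figures quoted in this thread can be sanity-checked with a little arithmetic. The sketch below is only an illustration, not Tajo's actual scheduling logic: it assumes the number of concurrent tasks is roughly the worker memory divided by the task slot size, and that one 512 MB slot is set aside for the query master (inferred from the "3 tasks + 1 qm task" comment). The function names are hypothetical, and CPU and disk slots are ignored.

```python
# Rough estimate of concurrent tasks per Tajo worker, using only the values
# quoted in this thread. This is NOT Tajo's real scheduler; the formula and
# the query-master reservation are assumptions made for illustration.

DEFAULT_SLOT_MB = 512  # "default 512" per the tajo-site.xml comment above


def concurrent_tasks(worker_memory_mb: int,
                     slot_mb: int = DEFAULT_SLOT_MB,
                     qm_reserved_mb: int = 0) -> int:
    """Return floor((worker memory - reserved memory) / slot size)."""
    if slot_mb <= 0:
        raise ValueError("slot_mb must be positive")
    usable_mb = max(worker_memory_mb - qm_reserved_mb, 0)
    return usable_mb // slot_mb


def heap_per_task_mb(heap_mb: int, tasks: int) -> float:
    """Average JVM heap available to each concurrently running task."""
    if tasks <= 0:
        raise ValueError("tasks must be positive")
    return heap_mb / tasks


if __name__ == "__main__":
    heap_mb = 3 * 1024  # -Xmx3g from tajo-env.sh

    # Original settings: memory-mb=5120 with the default 512 MB slot.
    original = concurrent_tasks(5120)
    # Suggested settings: memory-mb=3512, slot=1000, one 512 MB QM slot (assumed).
    suggested = concurrent_tasks(3512, slot_mb=1000, qm_reserved_mb=512)

    print("original :", original, "tasks,", heap_per_task_mb(heap_mb, original), "MB heap each")
    print("suggested:", suggested, "tasks,", heap_per_task_mb(heap_mb, suggested), "MB heap each")
```

Under these assumptions the original settings give 10 tasks sharing a 3 GB heap, which matches the "10 parallel tasks, 3GB heap" observation earlier in the thread, while the suggested settings give 3 tasks with about 1 GB of heap each.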
