Thank you Selva, Gunnar pointed me to the sql manual guide before. I should read it more carefully.
There are a lot of information in your reply, I need some time to understand all of them. But for my major question mark ‘how to tell running time in different part for a given query’ is partially answered by your reply. I can use ‘Oper CPU time’ for this purpose, along with other tips described in your message. But as you correctly pointed out, it is an art ☺ So I need more practice to fully grasp it. Once I have more concrete question, I will ask for help again. Thanks, Ming 发件人: Selva Govindarajan [mailto:selva.govindara...@esgyn.com] 发送时间: 2016年3月9日 13:36 收件人: user@trafodion.incubator.apache.org 主题: RE: how to tell the most time-consuming part for a given Trafodion query plan? Hi Ming, The counters or metrics returned by RMS in Trafodion are documented at http://trafodion.apache.org/docs/2.0.0/sql_reference/index.html#sql_runtime_statistics. Counters displayed in operator stats: The DOP(Degree of Parallelism> determines the number of ESPs involved in executing the Trafodion Operator or TDB(Task Definition Block). The TDB can be identified by the number in ID column. LC and RC denotes the left and right child of the operator. Using these IDs and the parent TDBID (PaID), one can construct the query plan from this output. Dispatches column gives an indication how often the operator is scheduled for execution by the Trafodion SQL Engine scheduler. An operator is scheduled and run traversing the different steps within itself till it can't continue or it gives up on its own for other operators to be scheduled. During query execution, you will see these metrics being changed continuously for all the operators as the data flows across it till a blocking operator is encountered in the plan. The blocking operators are EX_SORT, EX_HASH_JOIN and EX_HASH_GRBY.. The operator cpu time is the sum of the cpu time spent in the operator in the executor thread of all the processes hosting the operator. Operator cpu time is real and measured in real time in microseconds and it is NOT a relative number. It doesn’t include the cpu time spent by other threads in executing the tasks on behalf of the executor thread. Usually, trafodion executor instance is run in a single thread and the engine can have multiple executor instances running in a process to support multi-threaded client applications. Most notably, the Trafodion engine uses thread pool to pre-fetch the rows while rows are fetched sequentially. It is also possible that Hbase uses thread pools to complete the operation requested by Trafodion. These thread timings are not included in the operator cpu time. To account for this, RMS provides another counter in a different view. It is the pertable view. GET STATISTICS FOR QID <qid> PERTABLE provides the following counters: HBase/Hive IOs Numbers of messages sent to HBase Region Servers(RS) HBase/Hive IO MBytes The cumulative size of these messages in MB accounted at the Trafodion layer. HBase/Hive Sum IO Time The cumulative time taken in microseconds by RS to respond and summed up across all ESPs HBase/Hive Max IO Time The maximum of the cumulative time taken in microseconds by RS to respond for any ESP. This gives an indication how much of the elapsed time is spent in HBase because the messages to RS are blocking The Sum and Max IO time are the elapsed time measured as wall clock time in microseconds. The max IO time should be less than the elapsed time or response time of the query. If the max IO time is closer to the elapsed time, then most of the time is spent in Hbase. The sum IO time should be less than the DOP * elapsed time. The Operator time is the CPU time. I sincerely hope you will find the above information useful to digest the output from RMS. I would say reading, analyzing and interpreting the output from RMS is an art that you would develop over time and it is always difficult to document every usage scenario. If you find something that needs to be added or isn’t correct, please let us know. Selva From: Liu, Ming (Ming) [mailto:ming....@esgyn.cn<mailto:ming....@esgyn.cn>] Sent: Tuesday, March 8, 2016 5:41 PM To: user@trafodion.incubator.apache.org<mailto:user@trafodion.incubator.apache.org> Subject: how to tell the most time-consuming part for a given Trafodion query plan? Hi, all, We have running some complex queries using Trafodion, and need to analyze the performance. One question is, if we want to know which part of the plan take longest time, is there any good tool/skills to answer this? I can use ‘get statistics for qid <qid> default’ to get runtime stats. But it is rather hard to interpret the output. I assume the “Oper CPU Time” is the best one we can trust? But I am not sure it is the pure CPU time, or it also include ‘waiting time’? If I want to know the whole time an operation from start to end, is there any way? And if it is CPU time, is it ns or something else, or just a relative number? Here is an example output of ‘get statistics’ LC RC Id PaId ExId Frag TDB Name DOP Dispatches Oper CPU Time Est. Records Used Act. Records Used Details 12 . 13 . 7 0 EX_ROOT 1 1 69 0 0 1945 11 . 12 13 6 0 EX_SPLIT_TOP 1 1 32 99,550,560 0 10 . 11 12 6 0 EX_SEND_TOP 10 32 1,844 99,550,560 0 9 . 10 11 6 2 EX_SEND_BOTTOM 10 20 666 99,550,560 0 8 . 9 10 6 2 EX_SPLIT_BOTTOM 10 40 411 99,550,560 0 53670501 6 7 8 9 5 2 EX_TUPLE_FLOW 10 10 57 99,550,560 0 . . 7 8 4 2 EX_TRAF_LOAD_PREPARATION 10 0 0 1 0 TRAFODION.SEABASE.BLTEST|0|0 5 . 6 8 3 2 EX_SORT 10 316,410 40,033,167 99,550,560 0 0|15880|10 4 . 5 6 2 2 EX_SPLIT_TOP 10 316,411 559,691 99,550,560 5,690,184 3 . 4 5 2 2 EX_SEND_TOP 160 474,849 13,076,509 99,550,560 5,690,196 2 . 3 4 2 3 EX_SEND_BOTTOM 160 919,425 90,107,363 99,550,560 5,695,235 1 . 2 3 2 3 EX_SPLIT_BOTTOM 16 94,836 4,236,816 99,550,560 5,698,863 350792654 . . 1 2 1 3 EX_HDFS_SCAN 16 48,227 256,448,475 0 5,715,193 HIVE.BLTEST|5715193|1664264993 Thanks in advance. Thanks, Ming