[
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630607#comment-13630607
]
Gunther Hagleitner commented on HIVE-4318:
------------------------------------------
[~pamelavagata]: I saw that too, and I am sure it would make the numbers
slightly better. There's also the issue of allocating a new object for each
invocation, which is probably even worse than the empty list. My point, though,
is this: even if we get it down to where I fixed counters too, you would still
pay a price for the feature. No counters vs. fixed counters is still faster (see
above).
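
A minimal sketch of the two costs mentioned above (the per-invocation object and iterating an empty hook list), assuming a simple guard on the hot path. All names here (HookGuardSketch, HookContext, Hook, Op) are hypothetical stand-ins for illustration, not Hive's actual classes or the contents of either patch:

```java
import java.util.ArrayList;
import java.util.List;

public class HookGuardSketch {

    /** Hypothetical per-row context object; counts how often it is allocated. */
    static final class HookContext {
        static int allocations = 0;
        final Object row;
        final int tag;

        HookContext(Object row, int tag) {
            this.row = row;
            this.tag = tag;
            allocations++;
        }
    }

    /** Hypothetical hook interface, called before and after each row. */
    interface Hook {
        void enter(HookContext ctx);
        void exit(HookContext ctx);
    }

    /** Hypothetical operator that skips all hook work when no hook is registered. */
    static final class Op {
        private final List<Hook> hooks = new ArrayList<>();
        long rows = 0;

        void addHook(Hook hook) {
            hooks.add(hook);
        }

        void process(Object row, int tag) {
            if (hooks.isEmpty()) {
                // Fast path: no context allocation and no list iteration.
                processOp(row, tag);
                return;
            }
            HookContext ctx = new HookContext(row, tag);
            for (Hook h : hooks) {
                h.enter(ctx);
            }
            processOp(row, tag);
            for (Hook h : hooks) {
                h.exit(ctx);
            }
        }

        void processOp(Object row, int tag) {
            rows++; // stand-in for the real per-row work
        }
    }
}
```

Even in this form the isEmpty() check stays on the per-row path, so the feature is cheaper but not free.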
From this thread it seems that the profiler is a valuable feature for keeping
tabs on performance in the dev cycle; operator hooks, on the other hand, are not
that useful. Anything you add there has a tremendously bad effect on
performance.
From that I concluded that we should change the profiler so that it does not
rely on operator hooks and does not contribute to run time in production. The
best way, to me, is to remove it temporarily and handle it in a new jira (where
we can discuss the how in more detail).
Does that make sense?
> OperatorHooks hit performance even when not used
> ------------------------------------------------
>
> Key: HIVE-4318
> URL: https://issues.apache.org/jira/browse/HIVE-4318
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Environment: Ubuntu LXC (64 bit)
> Reporter: Gopal V
> Assignee: Gunther Hagleitner
> Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when
> they are not being used.
> Results for a count(1) query tested with and without the operator hook calls:
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have
> to forward the row: the more operators there are, the slower the query.
> The modification made to test this was:
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
>        return;
>      }
>      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
> -    preProcessCounter();
> -    enterOperatorHooks(opHookContext);
> +    //preProcessCounter();
> +    //enterOperatorHooks(opHookContext);
>      processOp(row, tag);
> -    exitOperatorHooks(opHookContext);
> -    postProcessCounter();
> +    //exitOperatorHooks(opHookContext);
> +    //postProcessCounter();
>    }
> {code}