zabetak commented on code in PR #4755:
URL: https://github.com/apache/hive/pull/4755#discussion_r1444696611
##########
common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java:
##########
@@ -129,9 +133,31 @@ public static void setPerfLogger(PerfLogger
resetPerfLogger) {
* @param method method or ID that identifies this perf log element.
*/
public void perfLogBegin(String callerName, String method) {
+ perfLogBegin(callerName, method, null);
+ }
+
+ /**
+ * Call this function when you start to measure time spent by a piece of
code.
+ * @param callerName the logging object to be used.
+ * @param method method or ID that identifies this perf log element.
+ * @param additionalInfo particular step within the method that is being
measured
+ */
+ public void perfLogBegin(String callerName, String method, String
additionalInfo) {
long startTime = System.currentTimeMillis();
- startTimes.put(method, Long.valueOf(startTime));
- LOG.debug("<PERFLOG method={} from={}>", method, callerName);
+ String key = method;
+ if (additionalInfo != null) {
+ key = key + " - " + additionalInfo;
+ }
+ startTimes.put(key, Long.valueOf(startTime));
+ if (LOG.isDebugEnabled()) {
+ StringBuilder sb = new StringBuilder("<PERFLOG method=").append(method);
+ sb.append(" from=").append(callerName);
+ if (additionalInfo != null) {
+ sb.append(" ").append(additionalInfo);
+ }
+ sb.append(">");
+ LOG.debug(sb.toString());
+ }
beginMetrics(method);
Review Comment:
It seems that the `PerfLogger` already tracks some metrics for each call on
begin/end. Did you check if its possible to exploit the metrics to provide
these kind of cumulative optimization times? Why do we keep these metrics are
they useful in any way?
`PerfLogger` is called in many places and also gets heavier and more
involved every time. I want to ensure that we are not adding new information
and logic that we don't already have somewhere.
##########
common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java:
##########
@@ -92,6 +94,8 @@ public class PerfLogger {
protected final Map<String, Long> startTimes = new ConcurrentHashMap<>();
protected final Map<String, Long> endTimes = new ConcurrentHashMap<>();
+ protected final Map<String, Long> durations = new ConcurrentHashMap<>();
Review Comment:
I see your point. However, the additional of the new information comes in
conflict with the current APIs. For instance, we have `PerfLogger#getDuration`
that uses the `startTimes` and `endTimes` data structures. Moreover, the
start/end times are exposed to the WebUI via public getters.
Should we adapt some of the old APIs? Should we add new ones?
##########
common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java:
##########
@@ -152,10 +178,18 @@ public long perfLogEnd(String callerName, String method) {
* @return long duration the difference between now and startTime, or -1 if
startTime is null
*/
public long perfLogEnd(String callerName, String method, String
additionalInfo) {
- long startTime = startTimes.getOrDefault(method, -1L);
+ String key = method;
+ if (additionalInfo != null) {
+ key = key + " - " + additionalInfo;
+ }
+ long startTime = startTimes.getOrDefault(key, -1L);
long endTime = System.currentTimeMillis();
- long duration = startTime < 0 ? -1 : endTime - startTime;
- endTimes.put(method, Long.valueOf(endTime));
+ long duration = startTime < 0 ? -1 : (endTime - startTime +
durations.getOrDefault(key, 0L));
+ endTimes.put(key, Long.valueOf(endTime));
+ durations.put(key, duration);
+ synchronized(querySteps) {
+ querySteps.add(key);
Review Comment:
Isn't it possible to derive the order by looking at `startTimes`?
As you mentioned elsewhere sometimes the same key may appears multiple times
during compilation. In that case there is no total order among the compilation
events.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]