zabetak commented on code in PR #4755:
URL: https://github.com/apache/hive/pull/4755#discussion_r1532149473


##########
ql/src/java/org/apache/hadoop/hive/ql/QueryCompilationBreakdownSummary.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql;
+
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.ql.exec.tez.monitoring.PrintSummary;
+import org.apache.hadoop.hive.ql.log.PerfLogger;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static 
org.apache.hadoop.hive.ql.exec.tez.monitoring.Constants.SEPARATOR;
+
+public class QueryCompilationBreakdownSummary implements PrintSummary {
+  private static final String OPERATION_SUMMARY = "%-84s %9s";
+  private static final String QUERY_COMPILATION_SUMMARY = "Query Compilation 
Summary";
+  private static final String LINE_SEPARATOR = "line.separator";
+  private static final String TOTAL_COMPILATION_TIME = "Total Compilation 
Time";
+  private List<Pair<String, String>> compileSteps = new ArrayList<>();
+
+  private String format(String value, long number) {
+    return String.format(OPERATION_SUMMARY, value, number + "ms");
+  }
+
+  private void populateCompileStepNumbers() {

Review Comment:
   The content is immutable so we don't need to populate the `compileSteps` on 
every call. The whole thing can be become static final and initialized only 
once.



##########
ql/src/java/org/apache/hadoop/hive/ql/QueryCompilationBreakdownSummary.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql;
+
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.ql.exec.tez.monitoring.PrintSummary;
+import org.apache.hadoop.hive.ql.log.PerfLogger;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static 
org.apache.hadoop.hive.ql.exec.tez.monitoring.Constants.SEPARATOR;
+
+public class QueryCompilationBreakdownSummary implements PrintSummary {

Review Comment:
   I just noticed that `PrintSummary` is a Tez interface so it doesn't seem 
appropriate for the use-case. It might have been part of my previous suggestion 
but I failed to notice then that it was tailored around Tez.
   
   Thinking a bit more on the topic it seems that the 
`QueryCompilationBreakdownSummary` could be implemented as a `Hook`. In 
particular, the `QueryLifeTimeHook` interface seems to be fit well for this use 
case.
   
   Two advantages of implementing this as a `QueryLifeTimeHook`:
   1. we wouldn't need a new configuration property
   2. we could easily test the summary feature via qtests



##########
ql/src/java/org/apache/hadoop/hive/ql/QueryCompilationBreakdownSummary.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql;
+
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.ql.exec.tez.monitoring.PrintSummary;
+import org.apache.hadoop.hive.ql.log.PerfLogger;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static 
org.apache.hadoop.hive.ql.exec.tez.monitoring.Constants.SEPARATOR;
+
+public class QueryCompilationBreakdownSummary implements PrintSummary {
+  private static final String OPERATION_SUMMARY = "%-84s %9s";
+  private static final String QUERY_COMPILATION_SUMMARY = "Query Compilation 
Summary";
+  private static final String LINE_SEPARATOR = "line.separator";
+  private static final String TOTAL_COMPILATION_TIME = "Total Compilation 
Time";
+  private List<Pair<String, String>> compileSteps = new ArrayList<>();
+
+  private String format(String value, long number) {
+    return String.format(OPERATION_SUMMARY, value, number + "ms");
+  }
+
+  private void populateCompileStepNumbers() {
+    compileSteps.add(Pair.of(PerfLogger.PARSE, "1"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_RESOLVED_PARSETREE, "2"));
+    compileSteps.add(Pair.of(PerfLogger.LOGICALPLAN_AND_HIVE_OPERATOR_TREE, 
"3"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_LOGICAL_PLAN, "3.1"));
+    compileSteps.add(Pair.of(PerfLogger.PLAN_GENERATION, "3.1.1"));
+    compileSteps.add(Pair.of(PerfLogger.MV_REWRITE_FIELD_TRIMMER, "3.1.2"));
+    compileSteps.add(Pair.of(PerfLogger.REMOVING_SUBQUERY, "3.1.3"));
+    compileSteps.add(Pair.of(PerfLogger.DECORRELATION, "3.1.4"));
+    compileSteps.add(Pair.of(PerfLogger.VALIDATE_QUERY_MATERIALIZATION, 
"3.1.5"));
+    compileSteps.add(Pair.of(PerfLogger.PREJOIN_ORDERING, "3.1.6"));
+    compileSteps.add(Pair.of(PerfLogger.MV_REWRITING, "3.1.7"));
+    compileSteps.add(Pair.of(PerfLogger.JOIN_REORDERING, "3.1.8"));
+    compileSteps.add(Pair.of(PerfLogger.POSTJOIN_ORDERING, "3.1.9"));
+    compileSteps.add(Pair.of(PerfLogger.HIVE_SORT_PREDICATES, "3.1.10"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_OPERATOR_TREE, "3.2"));
+    compileSteps.add(Pair.of(PerfLogger.DEDUCE_RESULTSET_SCHEMA, "4"));
+    compileSteps.add(Pair.of(PerfLogger.PARSE_CONTEXT_GENERATION, "5"));
+    compileSteps.add(Pair.of(PerfLogger.SAVE_AND_VALIDATE_VIEW, "6"));
+    compileSteps.add(Pair.of(PerfLogger.LOGICAL_OPTIMIZATION, "7"));
+    compileSteps.add(Pair.of(PerfLogger.PHYSICAL_OPTIMIZATION, "8"));
+    compileSteps.add(Pair.of(PerfLogger.POST_PROCESSING, "9"));
+  }
+
+  @Override
+  public void print(SessionState.LogHelper console) {
+    console.printInfo(getCompileSummary(console));
+  }
+
+  public String getCompileSummary(SessionState.LogHelper console) {

Review Comment:
   The parameter to the method is not used.



##########
common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java:
##########
@@ -90,9 +118,10 @@ public class PerfLogger {
   public static final String HIVE_GET_NOT_NULL_CONSTRAINT = 
"getNotNullConstraints";
   public static final String HIVE_GET_TABLE_CONSTRAINTS = 
"getTableConstraints";
 
-  protected final Map<String, Long> startTimes = new ConcurrentHashMap<>();
+  protected final Map<String, Long> startTimes = 
Collections.synchronizedMap(new LinkedHashMap<>());

Review Comment:
   I am afraid that using `Collections.synchronizedMap` here breaks thread 
safety (reintroduce HIVE-27733). When using synchronized wrapper it is 
imperative (as described also in the Javadoc) to synchronize explicitly during 
iteration. Either we review all usages of `startTimes` and add external 
synchronize when necessary or we revert back to `ConcurrentHashMap`.



##########
ql/src/java/org/apache/hadoop/hive/ql/QueryCompilationBreakdownSummary.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql;
+
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.ql.exec.tez.monitoring.PrintSummary;
+import org.apache.hadoop.hive.ql.log.PerfLogger;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static 
org.apache.hadoop.hive.ql.exec.tez.monitoring.Constants.SEPARATOR;
+
+public class QueryCompilationBreakdownSummary implements PrintSummary {
+  private static final String OPERATION_SUMMARY = "%-84s %9s";
+  private static final String QUERY_COMPILATION_SUMMARY = "Query Compilation 
Summary";
+  private static final String LINE_SEPARATOR = "line.separator";
+  private static final String TOTAL_COMPILATION_TIME = "Total Compilation 
Time";
+  private List<Pair<String, String>> compileSteps = new ArrayList<>();
+
+  private String format(String value, long number) {
+    return String.format(OPERATION_SUMMARY, value, number + "ms");
+  }
+
+  private void populateCompileStepNumbers() {
+    compileSteps.add(Pair.of(PerfLogger.PARSE, "1"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_RESOLVED_PARSETREE, "2"));
+    compileSteps.add(Pair.of(PerfLogger.LOGICALPLAN_AND_HIVE_OPERATOR_TREE, 
"3"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_LOGICAL_PLAN, "3.1"));
+    compileSteps.add(Pair.of(PerfLogger.PLAN_GENERATION, "3.1.1"));
+    compileSteps.add(Pair.of(PerfLogger.MV_REWRITE_FIELD_TRIMMER, "3.1.2"));
+    compileSteps.add(Pair.of(PerfLogger.REMOVING_SUBQUERY, "3.1.3"));
+    compileSteps.add(Pair.of(PerfLogger.DECORRELATION, "3.1.4"));
+    compileSteps.add(Pair.of(PerfLogger.VALIDATE_QUERY_MATERIALIZATION, 
"3.1.5"));
+    compileSteps.add(Pair.of(PerfLogger.PREJOIN_ORDERING, "3.1.6"));
+    compileSteps.add(Pair.of(PerfLogger.MV_REWRITING, "3.1.7"));
+    compileSteps.add(Pair.of(PerfLogger.JOIN_REORDERING, "3.1.8"));
+    compileSteps.add(Pair.of(PerfLogger.POSTJOIN_ORDERING, "3.1.9"));
+    compileSteps.add(Pair.of(PerfLogger.HIVE_SORT_PREDICATES, "3.1.10"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_OPERATOR_TREE, "3.2"));
+    compileSteps.add(Pair.of(PerfLogger.DEDUCE_RESULTSET_SCHEMA, "4"));
+    compileSteps.add(Pair.of(PerfLogger.PARSE_CONTEXT_GENERATION, "5"));
+    compileSteps.add(Pair.of(PerfLogger.SAVE_AND_VALIDATE_VIEW, "6"));
+    compileSteps.add(Pair.of(PerfLogger.LOGICAL_OPTIMIZATION, "7"));
+    compileSteps.add(Pair.of(PerfLogger.PHYSICAL_OPTIMIZATION, "8"));
+    compileSteps.add(Pair.of(PerfLogger.POST_PROCESSING, "9"));
+  }
+
+  @Override
+  public void print(SessionState.LogHelper console) {
+    console.printInfo(getCompileSummary(console));
+  }
+
+  public String getCompileSummary(SessionState.LogHelper console) {
+    StringBuilder compileSummary = new StringBuilder();
+    populateCompileStepNumbers();
+    compileSummary.append(QUERY_COMPILATION_SUMMARY);
+    compileSummary.append(System.getProperty(LINE_SEPARATOR));
+    compileSummary.append(SEPARATOR);
+    compileSummary.append(System.getProperty(LINE_SEPARATOR));
+    PerfLogger perfLogger = SessionState.getPerfLogger();
+    for (Pair<String, String> compileStep : compileSteps) {
+      compileSummary.append(format(PerfLogger.COMPILE_STEP + " - " +
+          compileStep.getRight() + " " + compileStep.getLeft(),
+          perfLogger.getDuration(compileStep.getLeft())));
+      compileSummary.append(System.getProperty(LINE_SEPARATOR));
+    }
+    compileSummary.append(format(TOTAL_COMPILATION_TIME, 
perfLogger.getDuration(PerfLogger.COMPILE)));
+    compileSummary.append(System.getProperty(LINE_SEPARATOR));
+    compileSummary.append(SEPARATOR);
+    return compileSummary.toString();
+  }
+}

Review Comment:
   nit: no newline at the end of file



##########
ql/src/java/org/apache/hadoop/hive/ql/QueryCompilationBreakdownSummary.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql;
+
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.ql.exec.tez.monitoring.PrintSummary;
+import org.apache.hadoop.hive.ql.log.PerfLogger;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static 
org.apache.hadoop.hive.ql.exec.tez.monitoring.Constants.SEPARATOR;

Review Comment:
   nit: Using the SEPARATOR from 
`org.apache.hadoop.hive.common.log.InPlaceUpdate#SEPARATOR` seems more 
appropriate in terms of package dependencies
   
   Anyways this is minor even if the text was not there at all that would be 
fine with me.



##########
ql/src/java/org/apache/hadoop/hive/ql/QueryCompilationBreakdownSummary.java:
##########
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql;
+
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.ql.exec.tez.monitoring.PrintSummary;
+import org.apache.hadoop.hive.ql.log.PerfLogger;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static 
org.apache.hadoop.hive.ql.exec.tez.monitoring.Constants.SEPARATOR;
+
+public class QueryCompilationBreakdownSummary implements PrintSummary {
+  private static final String OPERATION_SUMMARY = "%-84s %9s";
+  private static final String QUERY_COMPILATION_SUMMARY = "Query Compilation 
Summary";
+  private static final String LINE_SEPARATOR = "line.separator";
+  private static final String TOTAL_COMPILATION_TIME = "Total Compilation 
Time";
+  private List<Pair<String, String>> compileSteps = new ArrayList<>();
+
+  private String format(String value, long number) {
+    return String.format(OPERATION_SUMMARY, value, number + "ms");
+  }
+
+  private void populateCompileStepNumbers() {
+    compileSteps.add(Pair.of(PerfLogger.PARSE, "1"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_RESOLVED_PARSETREE, "2"));
+    compileSteps.add(Pair.of(PerfLogger.LOGICALPLAN_AND_HIVE_OPERATOR_TREE, 
"3"));
+    compileSteps.add(Pair.of(PerfLogger.GENERATE_LOGICAL_PLAN, "3.1"));
+    compileSteps.add(Pair.of(PerfLogger.PLAN_GENERATION, "3.1.1"));
+    compileSteps.add(Pair.of(PerfLogger.MV_REWRITE_FIELD_TRIMMER, "3.1.2"));

Review Comment:
   The hardcoded string numbering may be hard to maintain if we decide to add 
or remove steps in the future since it will require manually updating the 
counters.
   
   Using a dedicated structure for keeping the steps could alleviate this by 
making the implicit nesting explicit. For instance:
   ```java
   private static final CompileStep {
       private final String name;
       private final List<CompileStep> substeps;
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

Reply via email to