[
https://issues.apache.org/jira/browse/HIVE-26438?focusedWorklogId=797180&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-797180
]
ASF GitHub Bot logged work on HIVE-26438:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Aug/22 10:01
Start Date: 02/Aug/22 10:01
Worklog Time Spent: 10m
Work Description: zabetak commented on code in PR #3487:
URL: https://github.com/apache/hive/pull/3487#discussion_r935321551
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:
##########
@@ -974,44 +972,40 @@ Pair<Boolean, String> canCBOHandleAst(ASTNode ast, QB qb,
PreCboCtx cboCtx) {
* Query<br>
* 2. Nested Subquery will return false for qbToChk.getIsQuery()
*/
- private static String canHandleQbForCbo(QueryProperties queryProperties,
HiveConf conf,
- boolean topLevelQB, boolean verbose) {
-
- if (!queryProperties.hasClusterBy() && !queryProperties.hasDistributeBy()
- && !(queryProperties.hasSortBy() && queryProperties.hasLimit())
- && !queryProperties.hasPTF() && !queryProperties.usesScript()
- && queryProperties.isCBOSupportedLateralViews()) {
- // Ok to run CBO.
- return null;
- }
-
+ private static String canHandleQbForCbo(QueryProperties queryProperties,
+ HiveConf conf, boolean topLevelQB) {
+ List reasons = new ArrayList();
Review Comment:
```suggestion
List<String> reasons = new ArrayList<>();
```
Check "Effective Java, Item 23: Don’t use raw types in new code" for more
info regarding the suggestion.
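As a minimal illustration of why the parameterized type is preferable, consider the sketch below. The class `RawTypeSketch` and the method `collectReasons` are hypothetical names used only for this example and are not part of the Hive code base.

```java
import java.util.ArrayList;
import java.util.List;

public class RawTypeSketch {
    // Hypothetical helper: collects reasons into a parameterized list,
    // as the suggestion proposes.
    static List<String> collectReasons(boolean hasClusterBy, boolean hasPTF) {
        List<String> reasons = new ArrayList<>();
        if (hasClusterBy) {
            reasons.add("has cluster by");
        }
        if (hasPTF) {
            reasons.add("has PTF");
        }
        // reasons.add(42); // rejected by the compiler for List<String>,
        //                  // but accepted (with a warning) for a raw List
        return reasons;
    }

    public static void main(String[] args) {
        List<String> reasons = collectReasons(true, true);
        // No cast is needed to read an element back as a String.
        String first = reasons.get(0);
        System.out.println(first);
        System.out.println(String.join("; ", reasons));
    }
}
```

With a raw `List`, the element type is only checked when a value is cast on the way out, so a mistake surfaces as a `ClassCastException` at run time instead of a compile error.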
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:
##########
@@ -974,44 +972,40 @@ Pair<Boolean, String> canCBOHandleAst(ASTNode ast, QB qb,
PreCboCtx cboCtx) {
* Query<br>
* 2. Nested Subquery will return false for qbToChk.getIsQuery()
*/
- private static String canHandleQbForCbo(QueryProperties queryProperties,
HiveConf conf,
- boolean topLevelQB, boolean verbose) {
-
- if (!queryProperties.hasClusterBy() && !queryProperties.hasDistributeBy()
- && !(queryProperties.hasSortBy() && queryProperties.hasLimit())
- && !queryProperties.hasPTF() && !queryProperties.usesScript()
- && queryProperties.isCBOSupportedLateralViews()) {
- // Ok to run CBO.
- return null;
- }
-
+ private static String canHandleQbForCbo(QueryProperties queryProperties,
+ HiveConf conf, boolean topLevelQB) {
+ List reasons = new ArrayList();
// Not ok to run CBO, build error message.
- String msg = "";
- if (verbose) {
- if (queryProperties.hasClusterBy()) {
- msg += "has cluster by; ";
- }
- if (queryProperties.hasDistributeBy()) {
- msg += "has distribute by; ";
- }
- if (queryProperties.hasSortBy() && queryProperties.hasLimit()) {
- msg += "has sort by with limit; ";
- }
- if (queryProperties.hasPTF()) {
- msg += "has PTF; ";
- }
- if (queryProperties.usesScript()) {
- msg += "uses scripts; ";
- }
- if (queryProperties.hasLateralViews()) {
- msg += "has lateral views; ";
- }
- if (msg.isEmpty()) {
- msg += "has some unspecified limitations; ";
- }
- msg = msg.substring(0, msg.length() - 2);
+ String errorMsg = "";
+ if (queryProperties.hasClusterBy()) {
+ errorMsg = "has cluster by";
+ reasons.add(errorMsg);
+ }
+ if (queryProperties.hasDistributeBy()) {
+ errorMsg = "has distribute by";
+ reasons.add(errorMsg);
+ }
+ if (queryProperties.hasSortBy() && queryProperties.hasLimit()) {
+ errorMsg = "has sort by with limit";
+ reasons.add(errorMsg);
+ }
+ if (queryProperties.hasPTF()) {
+ errorMsg = "has PTF";
+ reasons.add(errorMsg);
+ }
+ if (queryProperties.usesScript()) {
+ errorMsg = "uses scripts";
+ reasons.add(errorMsg);
+ }
+ if (queryProperties.hasLateralViews()) {
+ errorMsg = "has lateral views";
+ reasons.add(errorMsg);
+ }
+ if (!queryProperties.isCBOSupportedLateralViews()) {
Review Comment:
If I am not mistaken, the proposed refactoring introduces an important change
in behavior. From a quick look at the code, `hasLateralViews` returns `true`
whenever the query contains lateral views. We do not want to raise an error
for every lateral view, only for those that CBO cannot handle. I think the
following refactoring is valid:
```java
if (!queryProperties.isCBOSupportedLateralViews()) {
  reasons.add("has lateral views");
}
```
We could also include `queryProperties.hasLateralViews()` in the condition,
but I think it is redundant.
Moreover, I get the impression that `"has some unspecified limitations"` is
not reachable, so it can be omitted completely.
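To make the behavior difference concrete, here is a minimal sketch. `Props` is a hypothetical stand-in for the two `QueryProperties` flags involved, and the sketch assumes that `isCBOSupportedLateralViews()` remains `true` when the query has no lateral views; that assumption is not verified against the Hive source here.

```java
public class LateralViewCheckSketch {
    // Hypothetical stand-in for the two QueryProperties flags discussed above.
    static final class Props {
        final boolean hasLateralViews;
        final boolean cboSupportedLateralViews;

        Props(boolean hasLateralViews, boolean cboSupportedLateralViews) {
            this.hasLateralViews = hasLateralViews;
            this.cboSupportedLateralViews = cboSupportedLateralViews;
        }
    }

    // Condition in the PR as posted: flags every query that contains a lateral view.
    static boolean flaggedByHasLateralViews(Props p) {
        return p.hasLateralViews;
    }

    // Condition proposed in the review: flags only lateral views CBO cannot handle.
    static boolean flaggedByUnsupportedCheck(Props p) {
        return !p.cboSupportedLateralViews;
    }

    public static void main(String[] args) {
        // A query with a lateral view that CBO can handle.
        Props supported = new Props(true, true);
        System.out.println(flaggedByHasLateralViews(supported));
        System.out.println(flaggedByUnsupportedCheck(supported));
    }
}
```

For a lateral view that CBO supports, the first condition reports a reason while the second does not, which is the difference in behavior described above.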
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java:
##########
@@ -974,44 +972,40 @@ Pair<Boolean, String> canCBOHandleAst(ASTNode ast, QB qb,
PreCboCtx cboCtx) {
* Query<br>
* 2. Nested Subquery will return false for qbToChk.getIsQuery()
*/
- private static String canHandleQbForCbo(QueryProperties queryProperties,
HiveConf conf,
- boolean topLevelQB, boolean verbose) {
-
- if (!queryProperties.hasClusterBy() && !queryProperties.hasDistributeBy()
- && !(queryProperties.hasSortBy() && queryProperties.hasLimit())
- && !queryProperties.hasPTF() && !queryProperties.usesScript()
- && queryProperties.isCBOSupportedLateralViews()) {
- // Ok to run CBO.
- return null;
- }
-
+ private static String canHandleQbForCbo(QueryProperties queryProperties,
+ HiveConf conf, boolean topLevelQB) {
+ List reasons = new ArrayList();
// Not ok to run CBO, build error message.
- String msg = "";
- if (verbose) {
- if (queryProperties.hasClusterBy()) {
- msg += "has cluster by; ";
- }
- if (queryProperties.hasDistributeBy()) {
- msg += "has distribute by; ";
- }
- if (queryProperties.hasSortBy() && queryProperties.hasLimit()) {
- msg += "has sort by with limit; ";
- }
- if (queryProperties.hasPTF()) {
- msg += "has PTF; ";
- }
- if (queryProperties.usesScript()) {
- msg += "uses scripts; ";
- }
- if (queryProperties.hasLateralViews()) {
- msg += "has lateral views; ";
- }
- if (msg.isEmpty()) {
- msg += "has some unspecified limitations; ";
- }
- msg = msg.substring(0, msg.length() - 2);
+ String errorMsg = "";
Review Comment:
The local variable `errorMsg` is redundant and can be removed. You can
directly do `reasons.add("has cluster by")` etc.
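A rough sketch of the shape this could take without the intermediate variable follows. The simplified boolean parameters, the class name `ReasonsSketch`, and the `"; "` separator used to join the reasons are assumptions made for illustration; the real method takes `QueryProperties`, `HiveConf` and `topLevelQB`, and its return statement is not shown in this hunk.

```java
import java.util.ArrayList;
import java.util.List;

public class ReasonsSketch {
    // Hypothetical simplified signature standing in for canHandleQbForCbo.
    static String describe(boolean hasClusterBy, boolean hasDistributeBy,
                           boolean usesScript) {
        List<String> reasons = new ArrayList<>();
        // Each reason is added directly; no errorMsg variable is needed.
        if (hasClusterBy) {
            reasons.add("has cluster by");
        }
        if (hasDistributeBy) {
            reasons.add("has distribute by");
        }
        if (usesScript) {
            reasons.add("uses scripts");
        }
        // null means CBO can handle the query; otherwise return the reasons.
        return reasons.isEmpty() ? null : String.join("; ", reasons);
    }

    public static void main(String[] args) {
        System.out.println(describe(true, false, true));
        System.out.println(describe(false, false, false));
    }
}
```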
Issue Time Tracking
-------------------
Worklog Id: (was: 797180)
Time Spent: 50m (was: 40m)
> Remove unnecessary optimization in canHandleQbForCbo() method
> -------------------------------------------------------------
>
> Key: HIVE-26438
> URL: https://issues.apache.org/jira/browse/HIVE-26438
> Project: Hive
> Issue Type: Bug
> Reporter: Abhay
> Assignee: Abhay
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> This ticket is an improvement on
> https://issues.apache.org/jira/browse/HIVE-26426. The canHandleQbForCbo()
> method checks whether Calcite can handle the query. It returns null if the
> query can be handled and a non-null reason string if it cannot.
> Currently, however, it returns an empty string if INFO logging is not
> enabled. This is probably a performance optimization that is not needed, so
> the method can be simplified.