[ 
https://issues.apache.org/jira/browse/HIVE-23730?focusedWorklogId=448971&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-448971
 ]

ASF GitHub Bot logged work on HIVE-23730:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/20 00:28
            Start Date: 22/Jun/20 00:28
    Worklog Time Spent: 10m 
      Work Description: jcamachor commented on a change in pull request #1152:
URL: https://github.com/apache/hive/pull/1152#discussion_r443271395



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
##########
@@ -1566,13 +1569,38 @@ private void removeSemijoinsParallelToMapJoin(OptimizeTezProcContext procCtx)
 
       List<ExprNodeDesc> keyDesc = selectedMJOp.getConf().getKeys().get(posBigTable);
       ExprNodeColumnDesc keyCol = (ExprNodeColumnDesc) keyDesc.get(0);
-
-      tsProbeDecodeCtx = new TableScanOperator.ProbeDecodeContext(mjCacheKey, mjSmallTablePos,
-          keyCol.getColumn(), selectedMJOpRatio);
+      String realTSColName = getOriginalTSColName(selectedMJOp, keyCol.getColumn());
+      if (realTSColName != null) {
+        tsProbeDecodeCtx = new TableScanOperator.ProbeDecodeContext(mjCacheKey, mjSmallTablePos,
+                realTSColName, selectedMJOpRatio);
+      } else {
+        LOG.warn("ProbeDecode could not find TSColName for ColKey {} with MJ Schema {} ", keyCol, selectedMJOp.getSchema());
+      }
     }
     return tsProbeDecodeCtx;
   }
 
+  private static String getOriginalTSColName(MapJoinOperator mjOp, String internalCoName) {

Review comment:
       Can you check whether any of the utility methods in `OperatorUtils` 
already does this? If not, can we move it to that class with the rest of the 
utility methods in case it is useful in the future?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
##########
@@ -1566,13 +1569,38 @@ private void removeSemijoinsParallelToMapJoin(OptimizeTezProcContext procCtx)
 
       List<ExprNodeDesc> keyDesc = selectedMJOp.getConf().getKeys().get(posBigTable);
       ExprNodeColumnDesc keyCol = (ExprNodeColumnDesc) keyDesc.get(0);
-
-      tsProbeDecodeCtx = new TableScanOperator.ProbeDecodeContext(mjCacheKey, mjSmallTablePos,
-          keyCol.getColumn(), selectedMJOpRatio);
+      String realTSColName = getOriginalTSColName(selectedMJOp, keyCol.getColumn());
+      if (realTSColName != null) {
+        tsProbeDecodeCtx = new TableScanOperator.ProbeDecodeContext(mjCacheKey, mjSmallTablePos,
+                realTSColName, selectedMJOpRatio);
+      } else {
+        LOG.warn("ProbeDecode could not find TSColName for ColKey {} with MJ Schema {} ", keyCol, selectedMJOp.getSchema());

Review comment:
       You could throw an error only if you are running in test mode:
   ```
   if (conf.getBoolVar(ConfVars.HIVE_IN_TEST)) {
   ...
   ```
   While I am not a huge fan of interleaving a check like this within the 
production code, it may help you identify any issues, gaps, or regressions in 
the future.
   If you do not want to do that, an alternative is to upload a patch that 
throws an error instead of printing the warning, get a complete test run, and 
then create follow-up issues if anything fails. It will not help you with 
regressions in the future, but you will be able to identify any existing issues.
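
To make the first suggestion concrete, here is a minimal, self-contained sketch of a "fail fast in tests, warn in production" guard. It does not use Hive classes: `ProbeDecodeGuard` and `checkResolved` are hypothetical names, the `inTest` flag stands in for `conf.getBoolVar(ConfVars.HIVE_IN_TEST)`, and `java.util.logging` replaces the `LOG` used in `TezCompiler`.

```java
import java.util.logging.Logger;

/** Hypothetical helper, not part of Hive: fail fast in test mode, warn otherwise. */
final class ProbeDecodeGuard {
  private static final Logger LOG = Logger.getLogger(ProbeDecodeGuard.class.getName());

  private ProbeDecodeGuard() {}

  /**
   * Returns true when the original TS column name was resolved, so the caller
   * may build the probe-decode context. When it was not resolved, throws in
   * test mode and logs a warning otherwise.
   *
   * @param realTSColName resolved table-scan column name, or null if not found
   * @param keyCol        MJ key column being resolved (used in the message only)
   * @param inTest        assumption: stands in for the HIVE_IN_TEST setting
   */
  static boolean checkResolved(String realTSColName, String keyCol, boolean inTest) {
    if (realTSColName != null) {
      return true;
    }
    String msg = "ProbeDecode could not find TSColName for ColKey " + keyCol;
    if (inTest) {
      // Surface the gap as a test failure instead of a log line.
      throw new IllegalStateException(msg);
    }
    LOG.warning(msg);
    return false;
  }
}
```

With a guard like this, an unresolved column would fail any test run that exercises the path, while production queries would simply skip the probe-decode optimization.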






Issue Time Tracking
-------------------

    Worklog Id:     (was: 448971)
    Time Spent: 20m  (was: 10m)

> Compiler support tracking TS keyColName for Probe MapJoin
> ---------------------------------------------------------
>
>                 Key: HIVE-23730
>                 URL: https://issues.apache.org/jira/browse/HIVE-23730
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The compiler needs to track the original TS key column name used for MJ 
> probedecode.
> Even though we know the MJ keyCol at compile time, it may have been generated 
> by previous (parent) operators, so we don't always know the original TS 
> column it maps to.
> To find the original column mapping, we need to track the MJ keyCol through 
> the operator pipeline. Tracking can be done through the parent operators' 
> ColumnExprMap and RowSchema.


