[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-22 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=348394=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-348394
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 22/Nov/19 22:34
Start Date: 22/Nov/19 22:34
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #845: 
HIVE-22369 Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 348394)
Time Spent: 1h 20m  (was: 1h 10m)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch, HIVE-22369.02.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The 
> [optimizedOptiqPlan|https://github.com/apache/hive/blob/5c91d324f22c2ae47e234e76a9bc5ee1a71e6a70/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L1573]
>  at CalcitePlanner.getOptimizedHiveOPDag is ultimately generated by 
> CalcitePlanner.internalGenSelectLogicalPlan, which may provide either a 
> [HiveProject|https://github.com/apache/hive/blob/5c91d324f22c2ae47e234e76a9bc5ee1a71e6a70/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L4831]
>  or a 
> [HiveTableFunctionScan|https://github.com/apache/hive/blob/5c91d324f22c2ae47e234e76a9bc5ee1a71e6a70/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L4776].
>  When HiveCalciteUtil.getTopLevelSelect is invoked on this plan, it looks for 
> a 
> [HiveProject|https://github.com/apache/hive/blob/5c91d324f22c2ae47e234e76a9bc5ee1a71e6a70/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java#L633]
>  node in the tree, which it won't find if a HiveTableFunctionScan was 
> returned. This is why TestNewGetSplitsFormat fails with the return path.
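
The failure mode described in the issue can be illustrated with a minimal, self-contained sketch. It assumes the lookup walks down the first input of each node until it meets a project node; the classes `Node`, `Project`, `TableFunctionScan`, `TableScan` and the method `findTopLevelProject` are hypothetical stand-ins, not the Hive or Calcite API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for plan nodes; these are NOT the real Hive/Calcite classes.
final class TopLevelSelectSketch {

    static class Node {
        final List<Node> inputs;

        Node(Node... inputs) {
            this.inputs = new ArrayList<>(Arrays.asList(inputs));
        }
    }

    static final class Project extends Node {
        Project(Node... inputs) { super(inputs); }
    }

    static final class TableFunctionScan extends Node {
        TableFunctionScan(Node... inputs) { super(inputs); }
    }

    static final class TableScan extends Node {
        TableScan() { super(); }
    }

    // Assumed shape of the lookup: follow the first input of each node
    // and return the first Project found, or null if the chain has none.
    static Project findTopLevelProject(Node root) {
        Node current = root;
        while (current != null) {
            if (current instanceof Project) {
                return (Project) current;
            }
            current = current.inputs.isEmpty() ? null : current.inputs.get(0);
        }
        return null;
    }

    public static void main(String[] args) {
        Node withProject = new Project(new TableScan());
        Node withUdtf = new TableFunctionScan(new TableScan());

        System.out.println(findTopLevelProject(withProject) != null); // true
        System.out.println(findTopLevelProject(withUdtf) != null);    // false
    }
}
```

In this sketch a plan rooted at a table function scan never yields a project node, so any caller that assumes a non-null result would break, which matches the symptom reported for TestNewGetSplitsFormat.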



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-20 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=347096=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-347096
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 21/Nov/19 00:40
Start Date: 21/Nov/19 00:40
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #845: HIVE-22369 
Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#discussion_r348844843
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java
 ##
 @@ -530,12 +498,10 @@ protected int processQuery(String currentDatabase, String query, int numSplits,
     InputSplit[] splits = inputFormat.getSplits(job, numSplits);
 
     // Fetch rows from splits
-    boolean first = true;
 
 Review comment:
   These changes do not seem related to this patch. If they are not, can we 
submit them in a separate one?
 



Issue Time Tracking
---

Worklog Id: (was: 347096)
Time Spent: 1h 10m  (was: 1h)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch, HIVE-22369.02.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-20 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=347020=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-347020
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 20/Nov/19 21:50
Start Date: 20/Nov/19 21:50
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on issue #845: HIVE-22369 
Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#issuecomment-556444626
 
 
   @jcamachor, I think I understood what you asked for. Please check out the 
new patch and let me know if this is how you wanted it.
 



Issue Time Tracking
---

Worklog Id: (was: 347020)
Time Spent: 1h  (was: 50m)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch, HIVE-22369.02.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-19 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=346094=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346094
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 19/Nov/19 17:03
Start Date: 19/Nov/19 17:03
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on issue #845: HIVE-22369 
Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#issuecomment-555606077
 
 
   Added unit tests for the modifications.
 



Issue Time Tracking
---

Worklog Id: (was: 346094)
Time Spent: 50m  (was: 40m)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-19 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=346093=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346093
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 19/Nov/19 17:02
Start Date: 19/Nov/19 17:02
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #845: 
HIVE-22369 Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#discussion_r348050730
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 ##
 @@ -186,12 +193,67 @@ OpAttr dispatch(RelNode rn) throws SemanticException {
       return visit((HiveSortExchange) rn);
     } else if (rn instanceof HiveAggregate) {
       return visit((HiveAggregate) rn);
+    } else if (rn instanceof HiveTableFunctionScan) {
+      return visit((HiveTableFunctionScan) rn);
     }
     LOG.error(rn.getClass().getCanonicalName() + "operator translation not supported"
         + " yet in return path.");
     return null;
   }
 
+  private OpAttr visit(HiveTableFunctionScan scanRel) throws SemanticException {
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Translating operator rel#" + scanRel.getId() + ":"
+          + scanRel.getRelTypeName() + " with row type: [" + scanRel.getRowType() + "]");
+    }
+
+    RexCall call = (RexCall)scanRel.getCall();
+
+    String functionName = call.getOperator().getName();
+    FunctionInfo fi = FunctionRegistry.getFunctionInfo(functionName);
+    GenericUDTF genericUDTF = fi.getGenericUDTF();
+
+    RowResolver rowResolver = new RowResolver();
+    List fieldNames = new ArrayList<>(scanRel.getRowType().getFieldNames());
+    List exprNames = new ArrayList<>(fieldNames);
+    List exprCols = new ArrayList<>();
+    Map colExprMap = new HashMap<>();
+    for (int pos = 0; pos < call.getOperands().size(); pos++) {
+      ExprNodeConverter converter = new ExprNodeConverter(SemanticAnalyzer.DUMMY_TABLE, fieldNames.get(pos),
+          scanRel.getRowType(), scanRel.getRowType(), ((HiveTableScan)scanRel.getInput(0)).getPartOrVirtualCols(),
+          scanRel.getCluster().getTypeFactory(), true);
+      ExprNodeDesc exprCol = call.getOperands().get(pos).accept(converter);
+      colExprMap.put(exprNames.get(pos), exprCol);
+      exprCols.add(exprCol);
+
+      ColumnInfo columnInfo = new ColumnInfo(fieldNames.get(pos), exprCol.getWritableObjectInspector(), null, false);
+      rowResolver.put(columnInfo.getTabAlias(), columnInfo.getAlias(), columnInfo);
+    }
+
+    QB qb = new QB(semanticAnalyzer.getQB().getId(), nextAlias(), true);
+    qb.getMetaData().setSrcForAlias(SemanticAnalyzer.DUMMY_TABLE, semanticAnalyzer.getDummyTable());
+    TableScanOperator op = (TableScanOperator) semanticAnalyzer.genTablePlan(SemanticAnalyzer.DUMMY_TABLE, qb);
 
 Review comment:
   Will QB be removed completely? I'll check how we could integrate this into 
`visit(HiveTableScan scanRel)`.
 



Issue Time Tracking
---

Worklog Id: (was: 346093)
Time Spent: 40m  (was: 0.5h)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-19 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=346092=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346092
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 19/Nov/19 17:02
Start Date: 19/Nov/19 17:02
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #845: 
HIVE-22369 Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#discussion_r348049562
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 ##
 @@ -186,12 +193,67 @@ OpAttr dispatch(RelNode rn) throws SemanticException {
       return visit((HiveSortExchange) rn);
     } else if (rn instanceof HiveAggregate) {
       return visit((HiveAggregate) rn);
+    } else if (rn instanceof HiveTableFunctionScan) {
+      return visit((HiveTableFunctionScan) rn);
     }
     LOG.error(rn.getClass().getCanonicalName() + "operator translation not supported"
         + " yet in return path.");
     return null;
   }
 
+  private OpAttr visit(HiveTableFunctionScan scanRel) throws SemanticException {
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Translating operator rel#" + scanRel.getId() + ":"
+          + scanRel.getRelTypeName() + " with row type: [" + scanRel.getRowType() + "]");
+    }
+
+    RexCall call = (RexCall)scanRel.getCall();
+
+    String functionName = call.getOperator().getName();
+    FunctionInfo fi = FunctionRegistry.getFunctionInfo(functionName);
+    GenericUDTF genericUDTF = fi.getGenericUDTF();
+
+    RowResolver rowResolver = new RowResolver();
+    List fieldNames = new ArrayList<>(scanRel.getRowType().getFieldNames());
+    List exprNames = new ArrayList<>(fieldNames);
+    List exprCols = new ArrayList<>();
+    Map colExprMap = new HashMap<>();
+    for (int pos = 0; pos < call.getOperands().size(); pos++) {
+      ExprNodeConverter converter = new ExprNodeConverter(SemanticAnalyzer.DUMMY_TABLE, fieldNames.get(pos),
+          scanRel.getRowType(), scanRel.getRowType(), ((HiveTableScan)scanRel.getInput(0)).getPartOrVirtualCols(),
+          scanRel.getCluster().getTypeFactory(), true);
+      ExprNodeDesc exprCol = call.getOperands().get(pos).accept(converter);
+      colExprMap.put(exprNames.get(pos), exprCol);
+      exprCols.add(exprCol);
+
+      ColumnInfo columnInfo = new ColumnInfo(fieldNames.get(pos), exprCol.getWritableObjectInspector(), null, false);
+      rowResolver.put(columnInfo.getTabAlias(), columnInfo.getAlias(), columnInfo);
+    }
+
+    QB qb = new QB(semanticAnalyzer.getQB().getId(), nextAlias(), true);
+    qb.getMetaData().setSrcForAlias(SemanticAnalyzer.DUMMY_TABLE, semanticAnalyzer.getDummyTable());
+    TableScanOperator op = (TableScanOperator) semanticAnalyzer.genTablePlan(SemanticAnalyzer.DUMMY_TABLE, qb);
+    op.getConf().setRowLimit(1);
+    qb.addAlias(SemanticAnalyzer.DUMMY_TABLE);
+    qb.setTabAlias(SemanticAnalyzer.DUMMY_TABLE, SemanticAnalyzer.DUMMY_TABLE);
+
+    Operator output = OperatorFactory.getAndMakeChild(new SelectDesc(exprCols, fieldNames, false),
+        new RowSchema(rowResolver.getRowSchema()), op);
+    output.setColumnExprMap(colExprMap);
+    semanticAnalyzer.putOpInsertMap(output, rowResolver);
+
+    Operator funcOp = semanticAnalyzer.genUDTFPlan(genericUDTF, null, fieldNames, qb, output, false);
 
 Review comment:
   Moving `genUDTFPlan` to `HiveOpConverter` would mean copying it from 
`SemanticAnalyzer` and living with the duplication until the old one gets 
removed? That's OK with me, but we should also consider that `HiveOpConverter` 
is already 1300 lines long, and it wouldn't be wise to create another monster 
class. So I'm OK with moving it here, but then we should soon have a Jira for 
creating a better design here too, and cutting `HiveOpConverter` into pieces.
 



Issue Time Tracking
---

Worklog Id: (was: 346092)
Time Spent: 40m  (was: 0.5h)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch
>
>  Time Spent: 40m
>  Remaining

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=345778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-345778
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 19/Nov/19 03:25
Start Date: 19/Nov/19 03:25
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #845: HIVE-22369 
Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#discussion_r347714603
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 ##
 @@ -186,12 +193,67 @@ OpAttr dispatch(RelNode rn) throws SemanticException {
       return visit((HiveSortExchange) rn);
     } else if (rn instanceof HiveAggregate) {
       return visit((HiveAggregate) rn);
+    } else if (rn instanceof HiveTableFunctionScan) {
+      return visit((HiveTableFunctionScan) rn);
     }
     LOG.error(rn.getClass().getCanonicalName() + "operator translation not supported"
         + " yet in return path.");
     return null;
   }
 
+  private OpAttr visit(HiveTableFunctionScan scanRel) throws SemanticException {
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Translating operator rel#" + scanRel.getId() + ":"
+          + scanRel.getRelTypeName() + " with row type: [" + scanRel.getRowType() + "]");
+    }
+
+    RexCall call = (RexCall)scanRel.getCall();
+
+    String functionName = call.getOperator().getName();
+    FunctionInfo fi = FunctionRegistry.getFunctionInfo(functionName);
+    GenericUDTF genericUDTF = fi.getGenericUDTF();
+
+    RowResolver rowResolver = new RowResolver();
+    List fieldNames = new ArrayList<>(scanRel.getRowType().getFieldNames());
+    List exprNames = new ArrayList<>(fieldNames);
+    List exprCols = new ArrayList<>();
+    Map colExprMap = new HashMap<>();
+    for (int pos = 0; pos < call.getOperands().size(); pos++) {
+      ExprNodeConverter converter = new ExprNodeConverter(SemanticAnalyzer.DUMMY_TABLE, fieldNames.get(pos),
+          scanRel.getRowType(), scanRel.getRowType(), ((HiveTableScan)scanRel.getInput(0)).getPartOrVirtualCols(),
+          scanRel.getCluster().getTypeFactory(), true);
+      ExprNodeDesc exprCol = call.getOperands().get(pos).accept(converter);
+      colExprMap.put(exprNames.get(pos), exprCol);
+      exprCols.add(exprCol);
+
+      ColumnInfo columnInfo = new ColumnInfo(fieldNames.get(pos), exprCol.getWritableObjectInspector(), null, false);
+      rowResolver.put(columnInfo.getTabAlias(), columnInfo.getAlias(), columnInfo);
+    }
+
+    QB qb = new QB(semanticAnalyzer.getQB().getId(), nextAlias(), true);
+    qb.getMetaData().setSrcForAlias(SemanticAnalyzer.DUMMY_TABLE, semanticAnalyzer.getDummyTable());
+    TableScanOperator op = (TableScanOperator) semanticAnalyzer.genTablePlan(SemanticAnalyzer.DUMMY_TABLE, qb);
+    op.getConf().setRowLimit(1);
+    qb.addAlias(SemanticAnalyzer.DUMMY_TABLE);
+    qb.setTabAlias(SemanticAnalyzer.DUMMY_TABLE, SemanticAnalyzer.DUMMY_TABLE);
+
+    Operator output = OperatorFactory.getAndMakeChild(new SelectDesc(exprCols, fieldNames, false),
+        new RowSchema(rowResolver.getRowSchema()), op);
+    output.setColumnExprMap(colExprMap);
+    semanticAnalyzer.putOpInsertMap(output, rowResolver);
+
+    Operator funcOp = semanticAnalyzer.genUDTFPlan(genericUDTF, null, fieldNames, qb, output, false);
 
 Review comment:
   Should we move this method to `HiveOpConverter` and clean up any context 
information that is not needed? Similar to `genReduceSink`. The idea is that 
when we enable this path, we can easily get rid of most dependencies on 
`SemanticAnalyzer`.
 



Issue Time Tracking
---

Worklog Id: (was: 345778)
Time Spent: 20m  (was: 10m)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=345779=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-345779
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 19/Nov/19 03:25
Start Date: 19/Nov/19 03:25
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #845: HIVE-22369 
Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845#discussion_r347714909
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
 ##
 @@ -186,12 +193,67 @@ OpAttr dispatch(RelNode rn) throws SemanticException {
       return visit((HiveSortExchange) rn);
     } else if (rn instanceof HiveAggregate) {
       return visit((HiveAggregate) rn);
+    } else if (rn instanceof HiveTableFunctionScan) {
+      return visit((HiveTableFunctionScan) rn);
     }
     LOG.error(rn.getClass().getCanonicalName() + "operator translation not supported"
         + " yet in return path.");
     return null;
   }
 
+  private OpAttr visit(HiveTableFunctionScan scanRel) throws SemanticException {
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Translating operator rel#" + scanRel.getId() + ":"
+          + scanRel.getRelTypeName() + " with row type: [" + scanRel.getRowType() + "]");
+    }
+
+    RexCall call = (RexCall)scanRel.getCall();
+
+    String functionName = call.getOperator().getName();
+    FunctionInfo fi = FunctionRegistry.getFunctionInfo(functionName);
+    GenericUDTF genericUDTF = fi.getGenericUDTF();
+
+    RowResolver rowResolver = new RowResolver();
+    List fieldNames = new ArrayList<>(scanRel.getRowType().getFieldNames());
+    List exprNames = new ArrayList<>(fieldNames);
+    List exprCols = new ArrayList<>();
+    Map colExprMap = new HashMap<>();
+    for (int pos = 0; pos < call.getOperands().size(); pos++) {
+      ExprNodeConverter converter = new ExprNodeConverter(SemanticAnalyzer.DUMMY_TABLE, fieldNames.get(pos),
+          scanRel.getRowType(), scanRel.getRowType(), ((HiveTableScan)scanRel.getInput(0)).getPartOrVirtualCols(),
+          scanRel.getCluster().getTypeFactory(), true);
+      ExprNodeDesc exprCol = call.getOperands().get(pos).accept(converter);
+      colExprMap.put(exprNames.get(pos), exprCol);
+      exprCols.add(exprCol);
+
+      ColumnInfo columnInfo = new ColumnInfo(fieldNames.get(pos), exprCol.getWritableObjectInspector(), null, false);
+      rowResolver.put(columnInfo.getTabAlias(), columnInfo.getAlias(), columnInfo);
+    }
+
+    QB qb = new QB(semanticAnalyzer.getQB().getId(), nextAlias(), true);
+    qb.getMetaData().setSrcForAlias(SemanticAnalyzer.DUMMY_TABLE, semanticAnalyzer.getDummyTable());
+    TableScanOperator op = (TableScanOperator) semanticAnalyzer.genTablePlan(SemanticAnalyzer.DUMMY_TABLE, qb);
 
 Review comment:
   We could create the operator in this class (we could add a method 
`createDummyTableScan`). It should probably be a subset of `visit(HiveTableScan 
scanRel)`? It could also be useful for other translations. That way we can also 
remove the dependency on QB, which we eventually want to get rid of in this 
path.
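
One way to picture the suggestion above is the rough sketch below. It only shows the shape of the refactoring: the converter builds the single-row dummy scan itself, so no query-block object has to be passed around. Every type here (`ScanOperator`, `UdtfOperator`, `Converter`) and the `DUMMY_TABLE` placeholder value are hypothetical stand-ins rather than Hive classes; only the helper name `createDummyTableScan` and the row limit of 1 come from this thread.

```java
// Hypothetical stand-ins only; none of these types are Hive classes.
final class DummyScanSketch {

    // Placeholder for SemanticAnalyzer.DUMMY_TABLE; the real value is not shown in this thread.
    static final String DUMMY_TABLE = "dummy_table_placeholder";

    static final class ScanOperator {
        final String tableAlias;
        final int rowLimit;

        ScanOperator(String tableAlias, int rowLimit) {
            this.tableAlias = tableAlias;
            this.rowLimit = rowLimit;
        }
    }

    static final class UdtfOperator {
        final String functionName;
        final ScanOperator input;

        UdtfOperator(String functionName, ScanOperator input) {
            this.functionName = functionName;
            this.input = input;
        }
    }

    static final class Converter {
        // Builds the one-row dummy scan locally, without any QB-like argument.
        ScanOperator createDummyTableScan() {
            return new ScanOperator(DUMMY_TABLE, 1);
        }

        // The table function operator is stacked on top of the dummy scan.
        UdtfOperator visitTableFunctionScan(String functionName) {
            return new UdtfOperator(functionName, createDummyTableScan());
        }
    }

    public static void main(String[] args) {
        UdtfOperator op = new Converter().visitTableFunctionScan("explode");
        System.out.println(op.functionName + " over " + op.input.tableAlias
            + " with row limit " + op.input.rowLimit);
    }
}
```

Because the helper takes no arguments, other translations could reuse it, which is the reuse benefit mentioned in the review comment.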
 



Issue Time Tracking
---

Worklog Id: (was: 345779)
Time Spent: 0.5h  (was: 20m)

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-22369) Handle HiveTableFunctionScan at return path

2019-11-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22369?focusedWorklogId=345369=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-345369
 ]

ASF GitHub Bot logged work on HIVE-22369:
-

Author: ASF GitHub Bot
Created on: 18/Nov/19 16:00
Start Date: 18/Nov/19 16:00
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #845: 
HIVE-22369 Handle HiveTableFunctionScan at return path
URL: https://github.com/apache/hive/pull/845
 
 
   The optimizedOptiqPlan at CalcitePlanner.getOptimizedHiveOPDag is ultimately 
generated by CalcitePlanner.internalGenSelectLogicalPlan, which may provide 
either a HiveProject or a HiveTableFunctionScan. When 
HiveCalciteUtil.getTopLevelSelect is invoked on this plan, it looks for a 
HiveProject node in the tree, which it won't find if a HiveTableFunctionScan 
was returned. This is why TestNewGetSplitsFormat fails with the return path.
 



Issue Time Tracking
---

Worklog Id: (was: 345369)
Remaining Estimate: 0h
Time Spent: 10m

> Handle HiveTableFunctionScan at return path
> ---
>
> Key: HIVE-22369
> URL: https://issues.apache.org/jira/browse/HIVE-22369
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22369.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
