[jira] [Work logged] (HIVE-25673) Column pruning fix for MR tasks

ASF GitHub Bot (Jira) Mon, 08 Nov 2021 02:02:04 -0800


     [ 
https://issues.apache.org/jira/browse/HIVE-25673?focusedWorklogId=678397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-678397
 ]


ASF GitHub Bot logged work on HIVE-25673:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Nov/21 10:01
            Start Date: 08/Nov/21 10:01
    Worklog Time Spent: 10m 
      Work Description: pvary commented on a change in pull request #2765:
URL: https://github.com/apache/hive/pull/2765#discussion_r744568098



##########
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergSelects.java
##########
@@ -203,4 +204,29 @@ public void testScanTableCaseInsensitive() throws IOException {
     Assert.assertArrayEquals(new Object[] {0L, "Alice", "Brown"}, rows.get(0));
     Assert.assertArrayEquals(new Object[] {1L, "Bob", "Green"}, rows.get(1));
   }
+
+  /**
+   * Column pruning could become problematic when a single Map Task contains multiple TableScan operators where
+   * different columns are pruned. This only occurs on MR, as Tez initializes a single Map task for every TableScan
+   * operator.
+   */
+  @Test
+  public void testMultiColumnPruning() throws IOException {
+    shell.setHiveSessionValue("hive.cbo.enable", true);
+
+    Schema schema1 = new Schema(optional(1, "fk", Types.StringType.get()));
+    List<Record> records1 = TestHelper.RecordsBuilder.newInstance(schema1).add("fk1").build();
+    testTables.createTable(shell, "table1", schema1, fileFormat, records1);
+
+    Schema schema2 = new Schema(optional(1, "fk", Types.StringType.get()), optional(2, "val", Types.StringType.get()));
+    List<Record> records2 = TestHelper.RecordsBuilder.newInstance(schema2).add("fk1", "val").build();
+    testTables.createTable(shell, "table2", schema2, fileFormat, records2);
+
+    // MR is needed for the reproduction
+    shell.setHiveSessionValue("hive.execution.engine", "mr");

Review comment:
       I am not sure whether Hive has removed MR support (I think not yet, though
nobody is actively maintaining it), but the Hive-Iceberg integration definitely
needs Tez.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



Issue Time Tracking
-------------------

    Worklog Id:     (was: 678397)
    Time Spent: 1h 40m  (was: 1.5h)

> Column pruning fix for MR tasks
> -------------------------------
>
>                 Key: HIVE-25673
>                 URL: https://issues.apache.org/jira/browse/HIVE-25673
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When running join tests for Iceberg tables, we got the following exception:
> {code}
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>       at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:131)
>       ... 23 more
> Caused by: java.lang.RuntimeException: cannot find field val from [org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@45f29d]
>       at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:550)
>       at org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldRef(IcebergRecordObjectInspector.java:70)
>       at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56)
>       at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1073)
>       at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1099)
>       at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:74)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>       at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>       at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>       at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:505)
>       at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:110)
>       ... 23 more
> {code}
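
The message "cannot find field val" is consistent with the situation described in the test Javadoc: one Map task holding several TableScan operators that need different columns. The ticket text does not spell out the exact mechanism, so the following is only a toy model under an assumption: that a single pruned column list ends up being applied to every scan in the task. Nothing here is Hive or Iceberg code; the class `ColumnPruningSketch` and all of its methods are hypothetical names made up for illustration. The table and column names are taken from the test above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not Hive code: each "scan" is a table name mapped
// to the columns its downstream operators still need after pruning.
class ColumnPruningSketch {

    // Mirrors the tables in testMultiColumnPruning: table1(fk), table2(fk, val).
    static Map<String, List<String>> exampleScans() {
        Map<String, List<String>> scans = new LinkedHashMap<>();
        scans.put("table1", Arrays.asList("fk"));
        scans.put("table2", Arrays.asList("fk", "val"));
        return scans;
    }

    // Imitates the failing lookup: the requested column must be present
    // in the projected field list, otherwise initialization fails.
    static int fieldIndex(List<String> projected, String column) {
        int idx = projected.indexOf(column);
        if (idx < 0) {
            throw new RuntimeException("cannot find field " + column + " from " + projected);
        }
        return idx;
    }

    // ASSUMED faulty behaviour: the first scan's pruned columns are reused
    // for every scan that runs in the same map task.
    static Map<String, List<String>> sharedProjection(Map<String, List<String>> scans) {
        List<String> first = scans.values().iterator().next();
        Map<String, List<String>> projections = new LinkedHashMap<>();
        for (String table : scans.keySet()) {
            projections.put(table, first);
        }
        return projections;
    }

    // ASSUMED corrected behaviour: every scan keeps its own pruned columns.
    static Map<String, List<String>> perScanProjection(Map<String, List<String>> scans) {
        Map<String, List<String>> projections = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : scans.entrySet()) {
            projections.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return projections;
    }

    // Returns true when every scan can resolve all the columns it needs.
    static boolean initializes(Map<String, List<String>> scans,
                               Map<String, List<String>> projections) {
        try {
            for (Map.Entry<String, List<String>> e : scans.entrySet()) {
                for (String column : e.getValue()) {
                    fieldIndex(projections.get(e.getKey()), column);
                }
            }
            return true;
        } catch (RuntimeException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> scans = exampleScans();
        System.out.println(initializes(scans, sharedProjection(scans)));  // false
        System.out.println(initializes(scans, perScanProjection(scans))); // true
    }
}
```

In this model the shared projection contains only `fk`, so the lookup of `val` for `table2` fails, much like the exception above. Whether the real fix in the pull request works this way is not stated in this message.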



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
