[jira] [Work logged] (HIVE-25673) Column pruning fix for MR tasks

ASF GitHub Bot (Jira) Mon, 08 Nov 2021 01:53:04 -0800


     [ 
https://issues.apache.org/jira/browse/HIVE-25673?focusedWorklogId=678383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-678383
 ]


ASF GitHub Bot logged work on HIVE-25673:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Nov/21 09:52
            Start Date: 08/Nov/21 09:52
    Worklog Time Spent: 10m 
      Work Description: pvary commented on a change in pull request #2765:
URL: https://github.com/apache/hive/pull/2765#discussion_r744560669



##########
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergSelects.java
##########
@@ -203,4 +204,29 @@ public void testScanTableCaseInsensitive() throws IOException {
     Assert.assertArrayEquals(new Object[] {0L, "Alice", "Brown"}, rows.get(0));
     Assert.assertArrayEquals(new Object[] {1L, "Bob", "Green"}, rows.get(1));
   }
+
+  /**
+   * Column pruning could become problematic when a single Map Task contains multiple TableScan operators where
+   * different columns are pruned. This only occurs on MR, as Tez initializes a single Map task for every TableScan
+   * operator.
+   */
+  @Test
+  public void testMultiColumnPruning() throws IOException {
+    shell.setHiveSessionValue("hive.cbo.enable", true);
+
+    Schema schema1 = new Schema(optional(1, "fk", Types.StringType.get()));
+    List<Record> records1 = TestHelper.RecordsBuilder.newInstance(schema1).add("fk1").build();
+    testTables.createTable(shell, "table1", schema1, fileFormat, records1);
+
+    Schema schema2 = new Schema(optional(1, "fk", Types.StringType.get()), optional(2, "val", Types.StringType.get()));
+    List<Record> records2 = TestHelper.RecordsBuilder.newInstance(schema2).add("fk1", "val").build();
+    testTables.createTable(shell, "table2", schema2, fileFormat, records2);
+
+    // MR is needed for the reproduction
+    shell.setHiveSessionValue("hive.execution.engine", "mr");

Review comment:
       With Hive 4.0.0 we do not support Iceberg. When I tried to run the tests with MR, the inserts did not work, so I had to run the inserts with Tez and then the test query with MR.
   
   OTOH this is a valid issue with MR, and with older versions of Hive, where MR is supported. (Maybe with newer versions as well.)
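
To make the scenario from the test's Javadoc concrete, here is a small toy model in plain Java. It is only a sketch under stated assumptions: `ColumnPruningSketch`, `Scan`, and both `init...` methods are hypothetical names that do not exist in Hive or Iceberg, and the "shared projection, last writer wins" mechanism is an assumed simplification of how one Map task with several TableScan operators could end up with the wrong pruned column list. The per-scan variant is one possible remedy, not necessarily what the pull request does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes or methods exist in Hive or Iceberg.
final class ColumnPruningSketch {

  /** Stand-in for a TableScan operator: a table alias plus the columns it still needs after pruning. */
  static final class Scan {
    final String alias;
    final List<String> neededColumns;

    Scan(String alias, List<String> neededColumns) {
      this.alias = alias;
      this.neededColumns = neededColumns;
    }
  }

  /** Mimics the field lookup from the stack trace: fails if a needed column is not in the projection. */
  static void checkFields(Scan scan, List<String> projected) {
    for (String column : scan.neededColumns) {
      if (!projected.contains(column)) {
        throw new RuntimeException("cannot find field " + column + " from " + projected);
      }
    }
  }

  /** Assumed failure mode: a single projection for the whole map task, overwritten by each scan. */
  static void initWithSharedProjection(List<Scan> scans) {
    List<String> shared = new ArrayList<>();
    for (Scan scan : scans) {
      shared = scan.neededColumns; // last writer wins
    }
    for (Scan scan : scans) {
      checkFields(scan, shared);
    }
  }

  /** One possible remedy (an assumption, not the actual patch): keep a projection per scan. */
  static void initWithPerScanProjection(List<Scan> scans) {
    Map<String, List<String>> perScan = new LinkedHashMap<>();
    for (Scan scan : scans) {
      perScan.put(scan.alias, scan.neededColumns);
    }
    for (Scan scan : scans) {
      checkFields(scan, perScan.get(scan.alias));
    }
  }
}
```

With a scan on `table2` needing `fk` and `val`, followed by a scan on `table1` needing only `fk`, `initWithSharedProjection` throws `cannot find field val from [fk]`, which resembles the exception in the issue description, while `initWithPerScanProjection` completes normally.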





Issue Time Tracking
-------------------

    Worklog Id:     (was: 678383)
    Time Spent: 1h  (was: 50m)

> Column pruning fix for MR tasks
> -------------------------------
>
>                 Key: HIVE-25673
>                 URL: https://issues.apache.org/jira/browse/HIVE-25673
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> When running join tests for Iceberg tables, we got the following exception:
> {code}
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>       at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:131)
>       ... 23 more
> Caused by: java.lang.RuntimeException: cannot find field val from [org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector$IcebergRecordStructField@45f29d]
>       at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:550)
>       at org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldRef(IcebergRecordObjectInspector.java:70)
>       at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56)
>       at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1073)
>       at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1099)
>       at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:74)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>       at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>       at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>       at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>       at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:505)
>       at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:110)
>       ... 23 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
