[jira] [Work logged] (HIVE-26319) Iceberg integration: Perform update split early

ASF GitHub Bot (Jira) Mon, 27 Jun 2022 05:18:05 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-26319?focusedWorklogId=785057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-785057
 ]


ASF GitHub Bot logged work on HIVE-26319:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Jun/22 12:17
            Start Date: 27/Jun/22 12:17
    Worklog Time Spent: 10m 
      Work Description: pvary commented on code in PR #3362:
URL: https://github.com/apache/hive/pull/3362#discussion_r907320335


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -858,19 +861,25 @@ private static boolean hasParquetListColumnSupport(Properties tableProps, Schema
    * @param overwrite If we have to overwrite the existing table or just add the new data
    * @return The generated JobContext
    */
-  private Optional<JobContext> generateJobContext(Configuration configuration, String tableName, boolean overwrite) {
+  private Optional<List<JobContext>> generateJobContext(Configuration configuration, String tableName,
+      boolean overwrite) {
     JobConf jobConf = new JobConf(configuration);
-    Optional<SessionStateUtil.CommitInfo> commitInfo = SessionStateUtil.getCommitInfo(jobConf, tableName);
-    if (commitInfo.isPresent()) {
-      JobID jobID = JobID.forName(commitInfo.get().getJobIdStr());
-      commitInfo.get().getProps().forEach(jobConf::set);
-      jobConf.setBoolean(InputFormatConfig.IS_OVERWRITE, overwrite);
-
-      // we should only commit this current table because
-      // for multi-table inserts, this hook method will be called sequentially for each target table
-      jobConf.set(InputFormatConfig.OUTPUT_TABLES, tableName);
-
-      return Optional.of(new JobContextImpl(jobConf, jobID, null));
+    Optional<Map<String, SessionStateUtil.CommitInfo>> commitInfoMap =
+        SessionStateUtil.getCommitInfo(jobConf, tableName);
+    if (commitInfoMap.isPresent()) {
+      List<JobContext> jobContextList = Lists.newLinkedList();
+      for (SessionStateUtil.CommitInfo commitInfo : commitInfoMap.get().values()) {
+        JobID jobID = JobID.forName(commitInfo.getJobIdStr());
+        commitInfo.getProps().forEach(jobConf::set);
+        jobConf.setBoolean(InputFormatConfig.IS_OVERWRITE, overwrite);
+
+        // we should only commit this current table because
+        // for multi-table inserts, this hook method will be called sequentially for each target table
+        jobConf.set(InputFormatConfig.OUTPUT_TABLES, tableName);
+
+        jobContextList.add(new JobContextImpl(jobConf, jobID, null));
+      }
+      return Optional.of(jobContextList);

Review Comment:
   Why not return an empty list instead of an Optional?
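
   A minimal sketch of what that alternative could look like, not the PR's actual code. To stay self-contained it uses hypothetical stand-ins: `CommitInfo` stands in for `SessionStateUtil.CommitInfo`, `JobContextStub` stands in for Hadoop's `JobContext`, and `generateJobContexts` is an assumed name. The point is only that a missing commit-info map becomes an empty list, so callers can iterate without unwrapping an Optional first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class EmptyListSketch {

  // Hypothetical stand-in for SessionStateUtil.CommitInfo; only the job id is modelled.
  static final class CommitInfo {
    private final String jobIdStr;

    CommitInfo(String jobIdStr) {
      this.jobIdStr = jobIdStr;
    }

    String getJobIdStr() {
      return jobIdStr;
    }
  }

  // Hypothetical stand-in for a Hadoop JobContext; it just records what was configured.
  static final class JobContextStub {
    final String jobId;
    final String outputTable;

    JobContextStub(String jobId, String outputTable) {
      this.jobId = jobId;
      this.outputTable = outputTable;
    }
  }

  // Returns one context per commit info, or an empty list when there is nothing to commit.
  static List<JobContextStub> generateJobContexts(
      Optional<Map<String, CommitInfo>> commitInfoMap, String tableName) {
    if (!commitInfoMap.isPresent()) {
      return Collections.emptyList();
    }
    List<JobContextStub> contexts = new ArrayList<>();
    for (CommitInfo commitInfo : commitInfoMap.get().values()) {
      contexts.add(new JobContextStub(commitInfo.getJobIdStr(), tableName));
    }
    return contexts;
  }

  public static void main(String[] args) {
    Map<String, CommitInfo> infos = new LinkedHashMap<>();
    infos.put("delete", new CommitInfo("job_0001_0001"));
    infos.put("insert", new CommitInfo("job_0001_0002"));

    // The caller loops directly; the "nothing to commit" case needs no special branch.
    for (JobContextStub ctx : generateJobContexts(Optional.of(infos), "tbl")) {
      System.out.println(ctx.jobId + " -> " + ctx.outputTable);
    }
    System.out.println(generateJobContexts(Optional.empty(), "tbl").size());
  }
}
```

   With this shape, a call site that currently checks `isPresent()` before looping could loop unconditionally, assuming no caller needs to distinguish "no commit info" from "commit info with zero entries".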





Issue Time Tracking
-------------------

    Worklog Id:     (was: 785057)
    Time Spent: 3.5h  (was: 3h 20m)

> Iceberg integration: Perform update split early
> -----------------------------------------------
>
>                 Key: HIVE-26319
>                 URL: https://issues.apache.org/jira/browse/HIVE-26319
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Extend the early update split to Iceberg tables, as HIVE-21160 did for native
> ACID tables.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
