[GitHub] [hive] pvary commented on a change in pull request #2161: WIP: HIVE-XXX: Move Iceberg job commit to HS2 side

GitBox Thu, 08 Apr 2021 09:33:07 -0700


pvary commented on a change in pull request #2161:
URL: https://github.com/apache/hive/pull/2161#discussion_r609885766




##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java
##########
@@ -250,9 +255,32 @@ public int execute() {
           this.setException(new HiveException(monitor.getDiagnostics()));
         }
 
-        // fetch the counters
         try {
           Set<StatusGetOpts> statusGetOpts = EnumSet.of(StatusGetOpts.GET_COUNTERS);
+          // save useful commit information into session conf, e.g. for custom commit hooks
+          List<BaseWork> allWork = work.getAllWork();
+          boolean hasReducer = allWork.stream().map(workToVertex::get).anyMatch(v -> v.getName().startsWith("Reducer"));
+          for (BaseWork baseWork : allWork) {
+            Vertex vertex = workToVertex.get(baseWork);
+            if (!hasReducer || vertex.getName().startsWith("Reducer")) {
+              // construct the parsable job id
+              VertexStatus status = dagClient.getVertexStatus(vertex.getName(), statusGetOpts);
+              String[] jobIdParts = status.getId().split("_");
+              // status.getId() returns something like: vertex_1617722404520_0001_1_00
+              // this should be transformed to a parsable JobID: job_16177224045200_0001
+              int vertexId = Integer.parseInt(jobIdParts[jobIdParts.length - 1]);
+              String jobId = String.format("job_%s%d_%s", jobIdParts[1], vertexId, jobIdParts[2]);
+              // prefix with table name (for multi-table inserts), if available
+              String tableName = Optional.ofNullable(workToConf.get(baseWork)).map(c -> c.get("name")).orElse(null);
+              String jobIdKey = HIVE_TEZ_COMMIT_JOB_ID + (tableName == null ? "" : "." + tableName);;
+              String taskCountKey = HIVE_TEZ_COMMIT_TASK_COUNT + (tableName == null ? "" : "." + tableName);
+              // save info into session conf
+              HiveConf sessionConf = SessionState.get().getConf();
+              sessionConf.set(jobIdKey, jobId);
+              sessionConf.setInt(taskCountKey, status.getProgress().getSucceededTaskCount());

Review comment:
       I think we should extend the `queryState` for Iceberg queries. This 
could be a candidate to put there, and it would also be good to store the 
snapshots of the tables the query is executing on.
   
   What do you think?
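
   For reference, the vertex-id-to-JobID transformation in the hunk above can be sketched in isolation as below. This is only a minimal illustration of the string handling, not the PR's code: the class name `VertexJobIdSketch`, the helpers `toJobId` and `keyFor`, and the value of `COMMIT_JOB_ID_KEY` are assumptions made for the example (the diff does not show the real value of `HIVE_TEZ_COMMIT_JOB_ID`), and the table name `db.tbl` is made up.

```java
public class VertexJobIdSketch {
  // Hypothetical stand-in for HIVE_TEZ_COMMIT_JOB_ID; the real value is not shown in the diff.
  static final String COMMIT_JOB_ID_KEY = "hive.tez.commit.job.id";

  // Turns a vertex status id such as vertex_1617722404520_0001_1_00
  // into a parsable job id such as job_16177224045200_0001.
  static String toJobId(String vertexStatusId) {
    String[] parts = vertexStatusId.split("_");
    // Assumed shape: vertex_<timestamp>_<appSeq>_<dagSeq>_<vertexSeq>
    if (parts.length < 5) {
      throw new IllegalArgumentException("Unexpected vertex id: " + vertexStatusId);
    }
    // The last part is the vertex sequence; parsing drops leading zeros ("00" -> 0).
    int vertexId = Integer.parseInt(parts[parts.length - 1]);
    // The vertex sequence is appended to the timestamp so each vertex gets a distinct job id.
    return String.format("job_%s%d_%s", parts[1], vertexId, parts[2]);
  }

  // Appends the table name to the conf key when one is available (multi-table inserts).
  static String keyFor(String baseKey, String tableName) {
    return tableName == null ? baseKey : baseKey + "." + tableName;
  }

  public static void main(String[] args) {
    System.out.println(toJobId("vertex_1617722404520_0001_1_00"));
    System.out.println(keyFor(COMMIT_JOB_ID_KEY, "db.tbl"));
    System.out.println(keyFor(COMMIT_JOB_ID_KEY, null));
  }
}
```

   Whichever place ends up holding these values (session conf or `queryState`), keeping the transformation in one small helper like this would make it easy to unit test.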




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



