marton-bod commented on a change in pull request #2161:
URL: https://github.com/apache/hive/pull/2161#discussion_r612455851
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java
##########
@@ -250,9 +255,32 @@ public int execute() {
         this.setException(new HiveException(monitor.getDiagnostics()));
       }
-      // fetch the counters
       try {
         Set<StatusGetOpts> statusGetOpts = EnumSet.of(StatusGetOpts.GET_COUNTERS);
+        // save useful commit information into session conf, e.g. for custom commit hooks
+        List<BaseWork> allWork = work.getAllWork();
+        boolean hasReducer = allWork.stream().map(workToVertex::get).anyMatch(v -> v.getName().startsWith("Reducer"));
+        for (BaseWork baseWork : allWork) {
+          Vertex vertex = workToVertex.get(baseWork);
+          if (!hasReducer || vertex.getName().startsWith("Reducer")) {
+            // construct the parsable job id
+            VertexStatus status = dagClient.getVertexStatus(vertex.getName(), statusGetOpts);
+            String[] jobIdParts = status.getId().split("_");
+            // status.getId() returns something like: vertex_1617722404520_0001_1_00
+            // this should be transformed to a parsable JobID: job_16177224045200_0001
+            int vertexId = Integer.parseInt(jobIdParts[jobIdParts.length - 1]);
+            String jobId = String.format("job_%s%d_%s", jobIdParts[1], vertexId, jobIdParts[2]);
+            // prefix with table name (for multi-table inserts), if available
+            String tableName = Optional.ofNullable(workToConf.get(baseWork)).map(c -> c.get("name")).orElse(null);
+            String jobIdKey = HIVE_TEZ_COMMIT_JOB_ID + (tableName == null ? "" : "." + tableName);;
+            String taskCountKey = HIVE_TEZ_COMMIT_TASK_COUNT + (tableName == null ? "" : "." + tableName);
+            // save info into session conf
+            HiveConf sessionConf = SessionState.get().getConf();
+            sessionConf.set(jobIdKey, jobId);
+            sessionConf.setInt(taskCountKey, status.getProgress().getSucceededTaskCount());
Review comment:
I'll look into this in the following PR, once we've replaced the
temporary listing solution with the permanent one and upgraded the Tez
dependency.
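
For reference, the job ID construction in the hunk above can be sketched in isolation, without any Tez or Hive classes. This is only an illustration of the string transformation described in the diff comments: the class name `VertexJobIds`, the helper names `toJobId` and `confKey`, and the five-segment input check are assumptions added for the example and are not part of the PR.

```java
import java.util.Objects;

// Hypothetical helper, not part of the PR: mirrors the string handling in the hunk.
final class VertexJobIds {
    private VertexJobIds() {}

    /**
     * Turns a vertex id such as vertex_1617722404520_0001_1_00 into
     * job_16177224045200_0001: the numeric vertex index (last segment)
     * is appended to the cluster timestamp (second segment).
     */
    static String toJobId(String vertexId) {
        Objects.requireNonNull(vertexId, "vertexId");
        String[] parts = vertexId.split("_");
        // Assumption for this sketch: ids always have five "_"-separated segments.
        if (parts.length != 5 || !"vertex".equals(parts[0])) {
            throw new IllegalArgumentException("Unexpected vertex id: " + vertexId);
        }
        // parseInt drops leading zeros, so "00" becomes 0 and "01" becomes 1.
        int vertexIndex = Integer.parseInt(parts[parts.length - 1]);
        return String.format("job_%s%d_%s", parts[1], vertexIndex, parts[2]);
    }

    /** Appends ".tableName" to the base key when a table name is available. */
    static String confKey(String baseKey, String tableName) {
        return tableName == null ? baseKey : baseKey + "." + tableName;
    }
}
```

With the example from the diff comment, `VertexJobIds.toJobId("vertex_1617722404520_0001_1_00")` yields `job_16177224045200_0001`. Any base key passed to `confKey` here is a placeholder; the real values of `HIVE_TEZ_COMMIT_JOB_ID` and `HIVE_TEZ_COMMIT_TASK_COUNT` are not shown in this hunk.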