marton-bod commented on a change in pull request #2161:
URL: https://github.com/apache/hive/pull/2161#discussion_r609880455
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java
##########
@@ -250,9 +255,32 @@ public int execute() {
this.setException(new HiveException(monitor.getDiagnostics()));
}
- // fetch the counters
try {
Set<StatusGetOpts> statusGetOpts = EnumSet.of(StatusGetOpts.GET_COUNTERS);
+ // save useful commit information into session conf, e.g. for custom commit hooks
+ List<BaseWork> allWork = work.getAllWork();
+ boolean hasReducer = allWork.stream().map(workToVertex::get).anyMatch(v -> v.getName().startsWith("Reducer"));
+ for (BaseWork baseWork : allWork) {
+ Vertex vertex = workToVertex.get(baseWork);
+ if (!hasReducer || vertex.getName().startsWith("Reducer")) {
+ // construct the parsable job id
+ VertexStatus status = dagClient.getVertexStatus(vertex.getName(), statusGetOpts);
+ String[] jobIdParts = status.getId().split("_");
+ // status.getId() returns something like: vertex_1617722404520_0001_1_00
+ // this should be transformed to a parsable JobID: job_16177224045200_0001
+ int vertexId = Integer.parseInt(jobIdParts[jobIdParts.length - 1]);
+ String jobId = String.format("job_%s%d_%s", jobIdParts[1], vertexId, jobIdParts[2]);
+ // prefix with table name (for multi-table inserts), if available
+ String tableName = Optional.ofNullable(workToConf.get(baseWork)).map(c -> c.get("name")).orElse(null);
+ String jobIdKey = HIVE_TEZ_COMMIT_JOB_ID + (tableName == null ? "" : "." + tableName);;
+ String taskCountKey = HIVE_TEZ_COMMIT_TASK_COUNT + (tableName == null ? "" : "." + tableName);
+ // save info into session conf
+ HiveConf sessionConf = SessionState.get().getConf();
+ sessionConf.set(jobIdKey, jobId);
+ sessionConf.setInt(taskCountKey, status.getProgress().getSucceededTaskCount());
Review comment:
Good point. I meant to unset these variables in a finally block in
`commitInsertTable`, but forgot to add it.
I did not find a place to put these in `queryState`, but maybe I missed
something.
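
For illustration, here is a minimal, self-contained sketch of the two ideas discussed here: deriving the parsable job ID from the vertex ID, and removing the saved keys in a finally block once the commit is done. It is only a sketch under stated assumptions: a plain `Map` stands in for the session conf, the key strings are hypothetical (the real values of `HIVE_TEZ_COMMIT_JOB_ID` and `HIVE_TEZ_COMMIT_TASK_COUNT` are not shown in the diff), and `commitWithCleanup` is a made-up stand-in for `commitInsertTable`.

```java
import java.util.HashMap;
import java.util.Map;

public class CommitInfoSketch {
    // Hypothetical key prefixes; the real constant values are not shown in the diff.
    static final String JOB_ID_KEY = "hive.tez.commit.job.id";
    static final String TASK_COUNT_KEY = "hive.tez.commit.task.count";

    // vertex_1617722404520_0001_1_00 -> job_16177224045200_0001
    // The numeric vertex index (last part) is appended to the timestamp part.
    static String toJobId(String vertexId) {
        String[] parts = vertexId.split("_");
        int vertexIndex = Integer.parseInt(parts[parts.length - 1]);
        return String.format("job_%s%d_%s", parts[1], vertexIndex, parts[2]);
    }

    // Suffix the key with the table name when one is available
    // (multi-table inserts), otherwise use the bare prefix.
    static String keyFor(String prefix, String tableName) {
        return tableName == null ? prefix : prefix + "." + tableName;
    }

    // Stand-in for the commit path: the saved keys are removed in a
    // finally block so they cannot leak into later queries in the session.
    static void commitWithCleanup(Map<String, String> conf, String tableName, Runnable commit) {
        try {
            commit.run();
        } finally {
            conf.remove(keyFor(JOB_ID_KEY, tableName));
            conf.remove(keyFor(TASK_COUNT_KEY, tableName));
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>(); // stand-in for the session conf
        String jobId = toJobId("vertex_1617722404520_0001_1_00");
        conf.put(keyFor(JOB_ID_KEY, "tbl"), jobId);
        conf.put(keyFor(TASK_COUNT_KEY, "tbl"), "3");
        commitWithCleanup(conf, "tbl", () -> System.out.println("committing " + jobId));
        System.out.println("conf empty after commit: " + conf.isEmpty());
    }
}
```

With a real Hadoop `Configuration`, the cleanup step would presumably call `unset(key)` rather than `remove(key)`; the try/finally shape stays the same.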