marton-bod commented on a change in pull request #2161:
URL: https://github.com/apache/hive/pull/2161#discussion_r609892146
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java
##########
@@ -250,9 +255,32 @@ public int execute() {
this.setException(new HiveException(monitor.getDiagnostics()));
}
- // fetch the counters
try {
Set<StatusGetOpts> statusGetOpts = EnumSet.of(StatusGetOpts.GET_COUNTERS);
+ // save useful commit information into session conf, e.g. for custom commit hooks
+ List<BaseWork> allWork = work.getAllWork();
+ boolean hasReducer = allWork.stream().map(workToVertex::get).anyMatch(v -> v.getName().startsWith("Reducer"));
+ for (BaseWork baseWork : allWork) {
+ Vertex vertex = workToVertex.get(baseWork);
+ if (!hasReducer || vertex.getName().startsWith("Reducer")) {
+ // construct the parsable job id
+ VertexStatus status = dagClient.getVertexStatus(vertex.getName(), statusGetOpts);
+ String[] jobIdParts = status.getId().split("_");
+ // status.getId() returns something like: vertex_1617722404520_0001_1_00
+ // this should be transformed to a parsable JobID: job_16177224045200_0001
+ int vertexId = Integer.parseInt(jobIdParts[jobIdParts.length - 1]);
+ String jobId = String.format("job_%s%d_%s", jobIdParts[1], vertexId, jobIdParts[2]);
+ // prefix with table name (for multi-table inserts), if available
+ String tableName = Optional.ofNullable(workToConf.get(baseWork)).map(c -> c.get("name")).orElse(null);
+ String jobIdKey = HIVE_TEZ_COMMIT_JOB_ID + (tableName == null ? "" : "." + tableName);;
+ String taskCountKey = HIVE_TEZ_COMMIT_TASK_COUNT + (tableName == null ? "" : "." + tableName);
+ // save info into session conf
+ HiveConf sessionConf = SessionState.get().getConf();
+ sessionConf.set(jobIdKey, jobId);
+ sessionConf.setInt(taskCountKey, status.getProgress().getSucceededTaskCount());
Review comment:
Yeah, I'll look into it. One question though: will the queryState be
available in the meta hook? Is it also available via the SessionState?
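
For context, here is a minimal standalone sketch of the ID transformation and per-table key naming done in the hunk above, so the string handling can be checked in isolation. The class name, the method names, and the `example.tez.commit.job.id` key are hypothetical stand-ins; the real value of `HIVE_TEZ_COMMIT_JOB_ID` is not shown in this hunk, and no Tez or Hive classes are used.

```java
public class CommitJobIdSketch {

    // Hypothetical stand-in for HIVE_TEZ_COMMIT_JOB_ID (real value not shown in the diff).
    static final String JOB_ID_KEY = "example.tez.commit.job.id";

    // Turns a vertex status id such as "vertex_1617722404520_0001_1_00"
    // into "job_<timestamp><vertexId>_<appSeq>", mirroring the hunk's String.format call.
    static String toJobId(String vertexStatusId) {
        String[] parts = vertexStatusId.split("_");
        if (parts.length < 3) {
            throw new IllegalArgumentException("Unexpected vertex id: " + vertexStatusId);
        }
        // Integer.parseInt drops leading zeros: "00" -> 0, "01" -> 1
        int vertexId = Integer.parseInt(parts[parts.length - 1]);
        return String.format("job_%s%d_%s", parts[1], vertexId, parts[2]);
    }

    // Appends ".<tableName>" to the key when a table name is known,
    // so multi-table inserts get one entry per table.
    static String keyFor(String baseKey, String tableName) {
        return tableName == null ? baseKey : baseKey + "." + tableName;
    }

    public static void main(String[] args) {
        System.out.println(toJobId("vertex_1617722404520_0001_1_00"));
        System.out.println(keyFor(JOB_ID_KEY, "db.tbl"));
        System.out.println(keyFor(JOB_ID_KEY, null));
    }
}
```

Note that the vertex index is appended directly to the timestamp part, so the sample id from the code comment yields `job_16177224045200_0001`, matching the value quoted there.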