belugabehr commented on a change in pull request #1963:
URL: https://github.com/apache/hive/pull/1963#discussion_r573979578
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java
##########
@@ -236,6 +239,10 @@ public int execute() {
throw new HiveException("Operation cancelled");
}
+ // Log all the info required to find the various logs for this query
+ LOG.info("HS2 Host: [{}], Query ID: [{}], Dag ID: [{}], DAG Session ID: [{}]", getHostNameIP(), queryId,
Review comment:
Hey @pgaref, thanks for the valuable input.
On one hand, logging the host name may seem redundant: anyone looking at this
log file already knows the hostname. However, as I understand the code, these
log messages are also sent (redirected) to the client via the Thrift RPC APIs.
That makes the host name useful for client-side debugging, because it is not
otherwise clear which instance of HS2 is processing the query when, for
example, a load balancer sits between the client and HS2. I had thought about
including some sort of unique HS2 ID as well, but I did not find such a
capability in the project and did not want to introduce one here.
Together, these four pieces of information allow a client to report a problem
to the admin, and allow the admin to grab all the relevant log files: the HS2
logs and the YARN Tez DAG logs.
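For illustration only, here is a minimal, self-contained sketch of how such a
line could be assembled. The class name QueryLogLocator and the helpers
hostNameAndIp() and format() are hypothetical; in particular, hostNameAndIp()
is only a guess at what getHostNameIP() in the PR might return, and the sketch
uses String.format rather than the SLF4J-style placeholders in the diff.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class QueryLogLocator {

    // Hypothetical stand-in for the PR's getHostNameIP(); the real helper may differ.
    static String hostNameAndIp() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return local.getHostName() + "/" + local.getHostAddress();
        } catch (UnknownHostException e) {
            // Fall back to a fixed marker instead of failing the query.
            return "unknown";
        }
    }

    // Builds the same four-field message as the log statement in the diff.
    static String format(String host, String queryId, String dagId, String sessionId) {
        return String.format(
            "HS2 Host: [%s], Query ID: [%s], Dag ID: [%s], DAG Session ID: [%s]",
            host, queryId, dagId, sessionId);
    }

    public static void main(String[] args) {
        // The IDs below are placeholders, not real Hive or Tez identifiers.
        System.out.println(format(hostNameAndIp(),
            "example-query-id", "example-dag-id", "example-session-id"));
    }
}
```

A client that receives this line over the redirected log stream can hand all
four values to the admin in a single report.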