hey guys, I have a question on why Hiveserver2 would issue a "killjob" signal.
We run Yarn on Hadoop 5.6 with the HiveServer2 process. It uses the fair-scheduler. Pre-emption is turned off. At least twice a day we have jobs that are randomly killed. they can be big jobs, they can be small ones. they can be Tez jobs, they can be MR jobs. I can't find any pattern and i can't find *WHY* this is occurring so i'm reaching out. i have the RM process running at DEBUG level logging as well as all the NM processes so i have pretty detailed logs. Still i can't find a reason - all is see is something like this: 2017-01-18 11:18:15,732 DEBUG [IPC Server handler 0 on 42807] org.apache.hadoop.ipc.Server: IPC Server handler 0 on 42807: org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB.killJob from 172.19.75.137:39623 Call#5981979 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER 2017-01-18 11:18:15,732 DEBUG [IPC Server handler 0 on 42807] org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:dwr (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server. java:2038) *2017-01-18 11:18:15,736 INFO [IPC Server handler 0 on 42807] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job job_1484692657978_3312 received from dwr (auth:SIMPLE) at 172.19.75.137* 2017-01-18 11:18:15,736 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.mapreduce.v2.app.job.event.JobDiagnosticsUpdateEvent.EventType: JOB_DIAGNOSTIC_UPDATE 2017-01-18 11:18:15,736 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Processing job_1484692657978_3312 of type JOB_DIAGNOSTIC_UPDATE 2017-01-18 11:18:15,736 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.mapreduce.v2.app.job.event.JobEvent.EventType: JOB_KILL 2017-01-18 11:18:15,736 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Processing job_1484692657978_3312 of type JOB_KILL 2017-01-18 11:18:15,737 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1484692657978_3312Job Transitioned from RUNNING to KILL_WAIT 2017-01-18 11:18:15,737 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.mapreduce.v2.app.job.event.TaskEvent.EventType: T_KILL 2017-01-18 11:18:15,737 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Processing task_1484692657978_3312_m_000000 of type T_KILL 2017-01-18 11:18:15,737 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.mapreduce.v2.app.job.event.TaskEvent.EventType: T_KILL 2017-01-18 11:18:15,737 DEBUG [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Processing task_1484692657978_3312_m_000001 of type T_KILL * 172.19.75.137 is node where the HiveServer2 process is running - so i'm pretty sure this is the process issuing the kill. but why? why kill the job? makes no sense to me. what would cause this to happen? The Hiveserver2 log doesn't have anything useful in it at the INFO level and i can't seem to get it to log at the DEBUG level. the bit in red below doesn't seem to work so any suggestions here gladly accepted too. $ ps -ef | grep -i hiveserver2 dwr 21927 1785 8 Jan14 ? 10:10:11 /usr/lib/jvm/java-8-oracle/jre//bin/java -Xmx12288m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/usr/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx268435456 -Xmx10G -Dlog4j.configurationFile=hive-log4j2.properties -Dorg.apache.logging.log4j.simplelog.StatusLogger.level=DEBUG -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/usr/lib/apache-hive-2.1.0-bin/bin/../conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/lib/apache-hive-2.1.0-bin/lib/hive-service-2.1.0.jar org.apache.hive.service.server.HiveServer2 *--hiveconf hive.root.logger=DEBUG,console* thanks, Stephen