gaozhan ding created HIVE-23802:
-----------------------------------
Summary: “merge files” job is submitted to the default queue when
hive.merge.tezfiles is set to true
Key: HIVE-23802
URL: https://issues.apache.org/jira/browse/HIVE-23802
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 3.1.0
Reporter: gaozhan ding
Assignee: gaozhan ding
We use Tez as the query engine. When hive.merge.tezfiles is set to true, the
merge files task, which follows the original task, is submitted to the default
queue rather than to the same queue as the original task.
I studied this issue for several days and found that every time a container is
started, "tez.queue.name" is unset in the current session. The code is as below:
{panel:title=TezSessionState.startSessionAndContainers()}
// sessionState.getQueueName() comes from cluster wide configured queue names.
// sessionState.getConf().get("tez.queue.name") is explicitly set by user in a session.
// TezSessionPoolManager sets tez.queue.name if user has specified one or use the one from
// cluster wide queue names.
// There is no way to differentiate how this was set (user vs system).
// Unset this after opening the session so that reopening of session uses the correct queue
// names i.e, if client has not died and if the user has explicitly set a queue name
// then reopened session will use user specified queue name else default cluster queue names.
conf.unset(TezConfiguration.TEZ_QUEUE_NAME);{panel}
So after the original task is submitted to YARN, "tez.queue.name" is unset.
When the merge files task starts, it tries to reuse the session of the original
job, but the check returns false because "tez.queue.name" was unset.
{panel:title=TezSessionPoolManager.canWorkWithSameSession()}
if (!session.isDefault()) {
  String queueName = session.getQueueName();
  String confQueueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
  LOG.info("Current queue name is " + queueName + " incoming queue name is " + confQueueName);
  return (queueName == null) ? confQueueName == null : queueName.equals(confQueueName);
} else {
  // this session should never be a default session unless something has messed up.
  throw new HiveException("The pool session " + session + " should have been returned to the pool");
}{panel}
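
The interaction between the two snippets above can be reproduced in isolation. The following is only a minimal standalone sketch, not Hive code: a plain HashMap stands in for the session configuration, `conf.remove(...)` stands in for `conf.unset(...)`, the class name QueueMismatchSketch and the queue name "etl" are hypothetical, and it assumes (as described above) that the session still remembers the queue it was opened on.

```java
import java.util.HashMap;
import java.util.Map;

public class QueueMismatchSketch {
    // Same key as TezConfiguration.TEZ_QUEUE_NAME.
    static final String TEZ_QUEUE_NAME = "tez.queue.name";

    // Mirrors the null-safe comparison quoted from canWorkWithSameSession().
    static boolean sameQueue(String sessionQueue, Map<String, String> conf) {
        String confQueue = conf.get(TEZ_QUEUE_NAME);
        return (sessionQueue == null) ? confQueue == null : sessionQueue.equals(confQueue);
    }

    public static void main(String[] args) {
        // Hypothetical: the user sets a queue explicitly in the session.
        Map<String, String> conf = new HashMap<>();
        conf.put(TEZ_QUEUE_NAME, "etl");

        // The session is opened on that queue and remembers it.
        String sessionQueue = conf.get(TEZ_QUEUE_NAME);
        System.out.println("before unset: " + sameQueue(sessionQueue, conf)); // true

        // Stand-in for conf.unset(TezConfiguration.TEZ_QUEUE_NAME).
        conf.remove(TEZ_QUEUE_NAME);

        // The merge files task now sees no queue name, so the check fails.
        System.out.println("after unset: " + sameQueue(sessionQueue, conf)); // false
    }
}
```

In this model the second check fails only because the configuration value is gone while the session's queue name is still set, which matches the behaviour reported here.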