Cole Mackenzie created LIVY-799:
-----------------------------------
Summary: SASL Negotiation Failed, Application spawned successfully
on remote YARN
Key: LIVY-799
URL: https://issues.apache.org/jira/browse/LIVY-799
Project: Livy
Issue Type: Bug
Components: RSC, Server
Affects Versions: 0.7.0
Environment: Java Version: 1.8.0_265 (Oracle Corporation)
Scala Version: 2.11.12
Hadoop Version: 2.7 on Livy Server (bundled with Spark), 2.8.5 on EMR
Spark: 2.4.4 and 2.4.7 (problem exists on both versions)
Reporter: Cole Mackenzie
Attachments: Screen Shot 2020-10-30 at 2.33.37 PM.png, Screen Shot
2020-10-30 at 2.34.03 PM.png, container-logs.txt, livy-server-logs.txt
h2. Issue description
Running Apache Livy on a separate machine from the YARN cluster and submitting
jobs via YARN results in an abandoned YARN application.
h2. Steps to reproduce the issue
1. Livy on Machine A (a.example.com)
2. YARN Cluster on Machine B (b.example.com)
3. Create session via API.
{code:python}
import json
from uuid import uuid4

import requests

# Livy server URL; port is an assumption (8998 is Livy's default REST port).
host = "http://a.example.com:8998"

data = {
    "kind": "spark",
    "name": f"Demo-{str(uuid4())[:8]}",
    "proxyUser": "hadoop",
    "driverMemory": "4G",
    "conf": {
        "livy.rsc.rpc.server.address": "a.example.com"
    }
}
headers = {"Content-Type": "application/json"}
session = requests.post(host + "/sessions", data=json.dumps(data),
                        headers=headers)
session.json(){code}
h2. What's the expected result?
* Remote Spark context is created using YARN on the remote cluster.
* Communication back via RPC is established with the Livy server.
* Session moves from "starting" to "idle".
h2. What's the actual result?
* Remote Spark context is created successfully using YARN on the remote cluster.
* Connection back to the Livy server is established but later dropped due to a
SASL negotiation failure.
* Spark context is running and healthy on the remote YARN cluster but remains
stuck in the "starting" state on the Livy server until cleanup occurs.
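The stuck state described above can be confirmed by polling Livy's session-state endpoint (GET /sessions/{id}/state). A minimal sketch follows; the host, port, and session id here are assumptions for illustration, not values from this report:

{code:python}
# Sketch: poll the Livy session-state endpoint until the session leaves
# "starting". In this bug it never does, despite a healthy YARN application.

def state_url(host: str, session_id: int) -> str:
    """Build the URL for Livy's GET /sessions/{id}/state endpoint."""
    return f"{host}/sessions/{session_id}/state"

def is_stuck(state: str) -> bool:
    """The failure mode here: session stays in 'starting' indefinitely."""
    return state == "starting"

if __name__ == "__main__":
    import time
    import requests  # third-party; pip install requests

    host = "http://a.example.com:8998"  # assumed Livy URL (8998 is the default port)
    for _ in range(30):
        state = requests.get(state_url(host, 0)).json()["state"]
        print(state)
        if not is_stuck(state):
            break
        time.sleep(10)
{code}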
h3. Livy Configuration
{code}
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
livy.impersonation.enabled = true
livy.repl.enable-hive-context = true
livy.server.yarn.app-lookup-timeout = 300s
livy.server.yarn.app-leakage.check-timeout = 600s
livy.server.yarn.app-leakage.check-interval = 60s
livy.rsc.rpc.server.address = a.example.com
livy.rsc.launcher.port.range = 11000~11010
livy.rsc.proxy-user = hadoop
livy.rsc.channel.log.level = DEBUG
livy.rsc.server.connect.timeout = 300s {code}
h2. Logs
[^container-logs.txt]
[^livy-server-logs.txt]
h2. Screenshots
!Screen Shot 2020-10-30 at 2.33.37 PM.png!
!Screen Shot 2020-10-30 at 2.34.03 PM.png!
Related Issues:
LIVY-686, LIVY-465
--
This message was sent by Atlassian Jira
(v8.3.4#803005)