-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64340/
-----------------------------------------------------------
(Updated Dec. 5, 2017, 7:29 p.m.)
Review request for Ambari, Dmytro Grinenko, Jonathan Hurley, and Nate Cole.
Changes
-------
tested changes
Summary (updated)
-----------------
Livy server start fails during EU with 'Address already in use' error
Bugs: AMBARI-22594
https://issues.apache.org/jira/browse/AMBARI-22594
Repository: ambari
Description
-------
This issue was observed quite consistently in Ambari-2.6.1 Upgrade ST runs.
*STR*
# Deploy a cluster with Ambari version 2.5.1.0-159 and HDP version 2.6.1.0-129.
# Upgrade Ambari to target version 2.6.1.0-43 (hash:
acbce28fdd119c72625c6beff63fc169de58ba22).
# Regenerate keytabs after the Ambari upgrade; this step restarts all
services. At this point the Livy server is operational and restarts fine (at
timestamp: 09:29).
# Register the HDP-2.6.4.0-36 version and perform EU. During EU, the 'Restart
Livy server' task runs and reports success (at timestamp: 10:26).
# However, the Livy logs show that the restart failed with the exception
below, because the previous process was not killed/stopped:
{code}
17/11/21 10:26:22 WARN AbstractLifeCycle: FAILED
org.eclipse.jetty.server.Server@3bc735b3: java.net.BindException: Address
already in use
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321)
at org.apache.livy.server.LivyServer.main(LivyServer.scala)
Exception in thread "main" java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
{code}
- Post upgrade, I also tried to stop/start Spark, and Livy still gave the same
exception, although the web UI reported the operation as successful (at
timestamp: 11:37).
- Finally, the web UI shows Livy as down, even though the process started in
the initial step (at timestamp: 09:29) is still running.
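
To illustrate the two symptoms described above (the listening port is still
held, and the process from the initial step is still alive), here is a minimal
Python sketch of how both conditions could be checked on a host. It is not the
code from the diff: the helper names is_port_in_use and is_pid_alive are
hypothetical, the loopback host is an assumption, and the PID check is
POSIX-only.

```python
import errno
import os
import socket


def is_port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; host and timeout are assumptions.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


def is_pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only).

    Hypothetical helper for illustration.
    """
    if pid <= 0:
        # Non-positive PIDs address process groups, not a single process.
        return False
    try:
        # Signal 0 only performs error checking; no signal is delivered.
        os.kill(pid, 0)
    except OSError as err:
        # EPERM: the process exists but is owned by another user.
        # ESRCH: no such process.
        return err.errno == errno.EPERM
    return True
```

If is_port_in_use returns True for the configured Livy server port before a
start is attempted, a new server would be expected to fail with the
BindException shown in the log, and is_pid_alive can confirm whether an
earlier process is still running.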
Diffs (updated)
-----
ambari-server/src/main/resources/common-services/SPARK/1.2.1/package/scripts/livy_service.py
a78f50c077
ambari-server/src/main/resources/common-services/SPARK/1.2.1/package/scripts/params.py
9b813a13f0
ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/livy2_service.py
0d60cf41ad
ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/params.py
1968b0e8fa
Diff: https://reviews.apache.org/r/64340/diff/2/
Changes: https://reviews.apache.org/r/64340/diff/1-2/
Testing (updated)
-------
mvn clean test
check on live cluster
Thanks,
Dmitro Lisnichenko