[ https://issues.apache.org/jira/browse/LIVY-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760029#comment-16760029 ]

Ruslan Dautkhanov edited comment on LIVY-541 at 2/4/19 5:11 PM:
----------------------------------------------------------------

It might be a one-liner change here

 

[https://github.com/apache/incubator-livy/blob/56c76bc2d4563593edce062a563603fe63e5a431/server/src/main/scala/org/apache/livy/server/interactive/InteractiveSession.scala#L99]

 

Change
{code:scala}
builderProperties.getOrElseUpdate("spark.app.name", s"livy-session-$id")
{code}
to
{code:scala}
builderProperties.getOrElseUpdate("spark.app.name", apptag)
{code}

 

I don't think there is a drawback to doing this.

A better approach might be to add a configuration knob 
`livy.yarn.session.prefix` so you could specify prefixes for different Livy 
servers and they wouldn't overlap.
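A minimal sketch of the config-knob idea, using a plain mutable map to mirror how {{getOrElseUpdate}} only applies the default when the user hasn't set {{spark.app.name}} themselves. The {{sessionPrefix}} parameter and the {{livy.yarn.session.prefix}} key are assumptions for illustration, not existing Livy configuration:

```scala
import scala.collection.mutable

object SessionNameSketch {
  // Hypothetical: sessionPrefix would be read from a new
  // livy.yarn.session.prefix config knob (assumed, not in Livy today).
  def appName(builderProperties: mutable.Map[String, String],
              id: Int,
              sessionPrefix: String = "livy-session"): String = {
    // Same pattern as InteractiveSession: only fill in a default
    // app name when the user hasn't supplied one.
    builderProperties.getOrElseUpdate("spark.app.name", s"$sessionPrefix-$id")
  }
}
```

With distinct prefixes per server (e.g. {{livy-a}} and {{livy-b}}), two Livy servers that both reach session id 197 would produce non-overlapping names ({{livy-a-197}} vs {{livy-b-197}}), so neither server could mistake the other's YARN application for its own.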

 



> Multiple Livy servers submitting to Yarn results in LivyException: Session is 
> finished ... No YARN application is found with tag livy-session-197-uveqmqyj 
> in 300 seconds. Please check your cluster status, it is may be very busy
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LIVY-541
>                 URL: https://issues.apache.org/jira/browse/LIVY-541
>             Project: Livy
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 0.5.0
>         Environment: Hortonworks HDP 2.6
>            Reporter: Hari Sekhon
>            Priority: Critical
>
> It appears Livy doesn't differentiate sessions properly in Yarn causing 
> errors when running multiple Livy servers behind a load balancer for HA / 
> performance scaling on the same Hadoop cluster.
> Each livy server uses monotonically incrementing session IDs with a random 
> suffix but it appears that the random suffix isn't passed to Yarn which 
> results in the following errors on the Livy server which is further behind in 
> session numbers because it appears to find the session with the same number 
> has already finished (submitted earlier by a different user on another Livy 
> server as seen in Yarn RM UI):
> {code:java}
> org.apache.zeppelin.livy.LivyException: Session 197 is finished, appId: null, 
> log: [  at 
> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2887), at 
> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2904),
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511), 
> at java.util.concurrent.FutureTask.run(FutureTask.java:266), at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142),
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617),
>  at java.lang.Thread.run(Thread.java:748), 
> YARN Diagnostics: , java.lang.Exception: No YARN application is found with 
> tag livy-session-197-uveqmqyj in 300 seconds. Please check your cluster 
> status, it is may be very busy., 
> org.apache.livy.utils.SparkYarnApp.org$apache$livy$utils$SparkYarnApp$$getAppIdFromTag(SparkYarnApp.scala:182)
>  
> org.apache.livy.utils.SparkYarnApp$$anonfun$1$$anonfun$4.apply(SparkYarnApp.scala:239)
>  
> org.apache.livy.utils.SparkYarnApp$$anonfun$1$$anonfun$4.apply(SparkYarnApp.scala:236)
>  scala.Option.getOrElse(Option.scala:120) 
> org.apache.livy.utils.SparkYarnApp$$anonfun$1.apply$mcV$sp(SparkYarnApp.scala:236)
>  org.apache.livy.Utils$$anon$1.run(Utils.scala:94)]
> at 
> org.apache.zeppelin.livy.BaseLivyInterpreter.createSession(BaseLivyInterpreter.java:300)
> at 
> org.apache.zeppelin.livy.BaseLivyInterpreter.initLivySession(BaseLivyInterpreter.java:184)
> at 
> org.apache.zeppelin.livy.LivySharedInterpreter.open(LivySharedInterpreter.java:57)
> at 
> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
> at 
> org.apache.zeppelin.livy.BaseLivyInterpreter.getLivySharedInterpreter(BaseLivyInterpreter.java:165)
> at 
> org.apache.zeppelin.livy.BaseLivyInterpreter.open(BaseLivyInterpreter.java:139)
> at 
> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
> at 
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:493)
> at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
> at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
