pavan kumar kolamuri created OOZIE-2253:
-------------------------------------------

             Summary: Spark Job is failing when it is running in standalone server
                 Key: OOZIE-2253
                 URL: https://issues.apache.org/jira/browse/OOZIE-2253
             Project: Oozie
          Issue Type: Bug
            Reporter: pavan kumar kolamuri
            Assignee: pavan kumar kolamuri


When a Spark job is run through Oozie against a Spark standalone cluster, the 
job is launched and succeeds, but the same job is then resubmitted to the Spark 
cluster again and again without end. The Oozie workflow stays in the RUNNING 
state forever because Spark keeps relaunching the job.

This is probably because Spark always calls System.exit(0) when a job 
succeeds. LauncherSecurityManager throws an exception for that call. It looks 
like Spark (through the Akka framework) catches the exception and launches one 
more attempt of the same job, and this repeats indefinitely.
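
The suspected loop can be illustrated with a small, self-contained Java 
simulation. This is only a sketch of the hypothesis above, not Oozie or Spark 
code: the class and method names (ExitInterceptSketch, interceptedExit, 
submitAndExit, superviseWithRestart) are hypothetical, the exit hook is a plain 
method rather than a real SecurityManager, and the restart count is capped so 
the example terminates.

```java
import java.util.ArrayList;
import java.util.List;

class ExitInterceptSketch {

    // Hypothetical stand-in for an exit hook that refuses to terminate the
    // JVM and throws instead, as the stack trace in this report suggests.
    static void interceptedExit(int status) {
        throw new SecurityException("Intercepted System.exit(" + status + ")");
    }

    // Hypothetical client: records a driver submission, then tries to exit
    // with status 0 because the submission succeeded.
    static void submitAndExit(int attempt, List<String> submitted) {
        submitted.add(String.format("driver-%04d", attempt));
        interceptedExit(0);
    }

    // Hypothetical supervisor: treats any exception from the client as a
    // failure and restarts it. The cap exists only so this sketch ends.
    static List<String> superviseWithRestart(int maxRestarts) {
        List<String> submitted = new ArrayList<>();
        for (int attempt = 0; attempt <= maxRestarts; attempt++) {
            try {
                submitAndExit(attempt, submitted);
                break; // reached only if the client returns normally
            } catch (SecurityException e) {
                // The successful exit looks like a crash, so loop and resubmit.
            }
        }
        return submitted;
    }

    public static void main(String[] args) {
        // Prints every driver ID that was submitted before the cap was hit.
        System.out.println(superviseWithRestart(4));
    }
}
```

In this sketch every attempt succeeds and still triggers a restart, because the 
only way the client signals success is the exit call that gets turned into an 
exception. Without the cap the loop would never end.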

{noformat}
Sending launch command to spark://inmobi-Precision-T3610:7077
Driver successfully submitted as driver-20150526105806-0000
... waiting before polling master for driver state
... polling master for driver state
State of driver-20150526105806-0000 is SUBMITTED
Sending launch command to spark://inmobi-Precision-T3610:7077
Driver successfully submitted as driver-20150526105811-0001
... waiting before polling master for driver state
... polling master for driver state
State of driver-20150526105811-0001 is SUBMITTED
Sending launch command to spark://inmobi-Precision-T3610:7077
Driver successfully submitted as driver-20150526105816-0002
... waiting before polling master for driver state
... polling master for driver state
State of driver-20150526105816-0002 is SUBMITTED
Sending launch command to spark://inmobi-Precision-T3610:7077
Driver successfully submitted as driver-20150526105821-0003
... waiting before polling master for driver state
... polling master for driver state
State of driver-20150526105821-0003 is SUBMITTED
Sending launch command to spark://inmobi-Precision-T3610:7077
Driver successfully submitted as driver-20150526105826-0004
... waiting before polling master for driver state
{noformat}

{noformat}
2015-05-26 10:58:11,573 ERROR [driverClient-akka.actor.default-dispatcher-4] akka.actor.OneForOneStrategy: Intercepted System.exit(0)
java.lang.SecurityException: Intercepted System.exit(0)
        at org.apache.oozie.action.hadoop.LauncherSecurityManager.checkExit(LauncherMapper.java:601)
        at java.lang.Runtime.exit(Runtime.java:107)
        at java.lang.System.exit(System.java:962)
        at org.apache.spark.deploy.ClientActor.pollAndReportStatus(Client.scala:115)
        at org.apache.spark.deploy.ClientActor$$anonfun$receiveWithLogging$1.applyOrElse(Client.scala:123)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
        at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:53)
        at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
        at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
