Hi All,

I have a web application that submits Spark jobs to a Cloudera Spark cluster 
using the spark-launcher library. It submits the Spark job to the cluster 
successfully. However, the listener class methods are never called back, and 
getState() on the returned SparkAppHandle never changes from UNKNOWN, even 
after the job finishes executing on the cluster.

I am using yarn-cluster mode. Here is my code. Does anything else need to be 
done, or is this a bug?
    SparkLauncher launcher = new SparkLauncher()
            .setSparkHome("sparkhome")
            .setMaster("yarn-cluster")
            .setAppResource("spark job jar file")
            .setMainClass("spark job driver class")
            .setAppName("appname")
            .addAppArgs(argsArray)
            .setVerbose(true)
            .addSparkArg("--verbose");

    SparkAppHandle handle = launcher.startApplication(new LauncherListener());
    int c = 0;
    while (!handle.getState().isFinal()) {
        LOG.info(">>>>>>>> state is= " + handle.getState());
        LOG.info(">>>>>>>> state is not final yet. counter= " + c++);
        LOG.info(">>>>>>>> sleeping for a second");
        try {
            Thread.sleep(1000L);
        } catch (InterruptedException e) {
            // ignored
        }
        if (c == 200) {
            break;
        }
    }

Here are the things I have already tried:
1. Added the listener instance to the SparkAppHandle after the application was 
launched.
2. Made the current class implement SparkAppHandle.Listener and passed it 
(this) both ways (while launching, and by setting it on the SparkAppHandle).
3. Tried the launcher.launch() method so that I could at least block on the 
resulting Process object with process.waitFor() until the Spark job finishes 
on the cluster. However, for long-running Spark jobs the corresponding process 
on this node never returns (it works fine for Spark jobs that finish within 
1 or 2 minutes).
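For reference, here is a minimal sketch of what the LauncherListener used 
above could look like. The class name and the logging are assumptions on my 
part; stateChanged() and infoChanged() are the two callbacks defined by the 
SparkAppHandle.Listener interface in the spark-launcher library:

```java
import org.apache.spark.launcher.SparkAppHandle;

// Hypothetical sketch of the LauncherListener referenced in the code above.
// It implements the two SparkAppHandle.Listener callbacks and simply logs
// each transition; these are the methods that are expected to fire but don't.
public class LauncherListener implements SparkAppHandle.Listener {

    @Override
    public void stateChanged(SparkAppHandle handle) {
        // Fired whenever the application's state changes
        // (e.g. CONNECTED, SUBMITTED, RUNNING, FINISHED).
        System.out.println("state changed to " + handle.getState()
                + " (final=" + handle.getState().isFinal() + ")");
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {
        // getAppId() may be null until the cluster has accepted the app.
        System.out.println("app id is now " + handle.getAppId());
    }
}
```

(This sketch needs the spark-launcher jar on the classpath to compile.)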
 


Thanks,
Reddy
