Hi Sarthak Sharma,

On the Zeppelin server, you can run
./bin/spark-submit --class org.apache.spark.examples.SparkPi
to check whether there is a problem with the Spark runtime environment on that
server.
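
The command above leaves out the application jar and the yarn options. Below is a minimal sketch of a fuller smoke test, written in Python so the command can be assembled and inspected before it is run. The function name build_sparkpi_command, the default of 10 slices, and the jar path you pass in are assumptions for illustration. The queue name root.zep comes from the log quoted below, and --master, --deploy-mode and --queue are standard spark-submit options.

```python
import os


def build_sparkpi_command(spark_home, queue, examples_jar, slices=10):
    """Return the argv list for a SparkPi smoke test in yarn cluster mode.

    spark_home   -- Spark installation directory on the Zeppelin server
    queue        -- yarn queue to submit to (e.g. the one Zeppelin uses)
    examples_jar -- path to the Spark examples jar (assumed to exist locally)
    slices       -- SparkPi argument: number of partitions to compute over
    """
    return [
        os.path.join(spark_home, "bin", "spark-submit"),
        "--class", "org.apache.spark.examples.SparkPi",
        "--master", "yarn",
        "--deploy-mode", "cluster",  # same mode the 0.8 interpreter is using
        "--queue", queue,
        examples_jar,
        str(slices),
    ]
```

To run it, pass the returned list to subprocess.call(...) on the Zeppelin server, using the same user and queue as the interpreter. In a Spark 2.x layout the examples jar is usually under $SPARK_HOME/examples/jars, but that depends on your installation. If SparkPi also stays in ACCEPTED state, the problem is more likely in the Spark/yarn environment than in Zeppelin itself.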

> On Nov 20, 2018, at 5:39 PM, Sarthak Sharma <sarthak...@media.net> wrote:
> 
> Is this similar to the existing bug where interpreter processes get stuck? 
> (The workaround there is to kill the application on yarn, restart the 
> interpreter from the interface, and then resubmit the query.) 
> The problem in this case is that it happens intermittently on some spark 
> interpreters, seemingly at random. And since the driver app is not scheduled 
> on yarn, there are no logs available to figure out the reason for this issue.
> 
> Thanks and Regards
> 
> Sarthak Sharma
> DevOps Engineer, Media.Net
> +918002228376 <tel:+918002228376> | sarthak...@media.net 
> <mailto:sarthak...@media.net>
>  <http://en-gb.facebook.com/people/Sarthak-Sharma/100006006014244>  
> <http://in.linkedin.com/in/sarthaksharma96> 
> 
> 
> On Tue, Nov 20, 2018 at 2:22 PM Jeff Zhang <zjf...@gmail.com 
> <mailto:zjf...@gmail.com>> wrote:
> If zeppelin.interpreter.connect.timeout is reached but the yarn app is still 
> in ACCEPTED state, then this should be a bug. The yarn app should be killed 
> if it cannot be created within the timeout threshold.
> 
> Sarthak Sharma <sarthak...@media.net <mailto:sarthak...@media.net>> 
> wrote on Tue, Nov 20, 2018 at 4:47 PM:
> Hey,
> 
> As you mentioned, I'm already using the spark.yarn.queue parameter, so I know 
> which yarn queue the application is scheduled in, and that queue has 
> resources available, since other apps are also being scheduled there. 
> However, even assuming the queue does NOT have resources to schedule it 
> within the given time frame, so that an exception is thrown once 
> zeppelin.interpreter.connect.timeout is reached, the application should 
> still get scheduled eventually, which is not the case here. The interpreter 
> driver process remains stuck in ACCEPTED state. Has the implementation 
> changed in this version? We never experienced this on the previous one 
> (zeppelin-0.7.3), where drivers would eventually get scheduled in their 
> respective queues. 
> 
> On Tue, Nov 20, 2018, 7:29 AM Xun Liu <neliu...@163.com 
> <mailto:neliu...@163.com> wrote:
> Hi Sarthak Sharma,
> 
> The log shows that the task submitted by spark-submit has been waiting in 
> the YARN queue for execution. Does the YARN queue have no resources available?
> You can specify a queue that has resources for the spark interpreter via the 
> spark.yarn.queue parameter.
> 
> 
>> On Nov 19, 2018, at 7:41 PM, Sarthak Sharma <sarthak...@media.net 
>> <mailto:sarthak...@media.net>> wrote:
>> 
>> Hi, 
>> 
>> We already have a zeppelin-0.7.3 setup that runs fine and is currently in 
>> use, but we are looking into the yarn cluster mode support for the spark 
>> interpreter in zeppelin-0.8. I've built it from source from branch-0.8 (as 
>> of Nov 15) and am intermittently facing the following issue in some of the 
>> spark interpreters while trying to use spark-sql on it.
>> 
>> 18/11/19 10:04:07 INFO yarn.Client: Submitting application 
>> application_1542587655772_35129 to ResourceManager
>> 18/11/19 10:04:07 INFO impl.YarnClientImpl: Submitted application 
>> application_1542587655772_35129
>> 18/11/19 10:04:08 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:08 INFO yarn.Client:
>>       client token: N/A
>>       diagnostics: N/A
>>       ApplicationMaster host: N/A
>>       ApplicationMaster RPC port: -1
>>       queue: root.zep
>>       start time: 1542621847537
>>       final status: UNDEFINED
>>       tracking URL: 
>> http://resource-manager-addr/proxy/application_1542587655772_35129/ 
>> <http://c8-auto-hadoop-service-1.srv.media.net:8088/proxy/application_1542587655772_35129/>
>>       user: sarthak.sh
>> 18/11/19 10:04:09 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:10 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:11 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:12 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:13 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:14 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:15 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:16 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:17 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:18 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:19 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:20 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:21 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:22 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:23 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:24 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:25 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:26 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:27 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:28 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:29 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:30 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:31 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:32 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:33 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:34 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:35 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:36 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:37 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:38 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:39 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:40 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:41 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:42 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:43 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:44 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:45 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:46 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:47 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:48 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:49 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:50 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:51 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:52 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:53 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:54 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:55 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:56 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:57 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:58 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:04:59 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:00 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:01 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:02 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:03 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:04 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:05 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:06 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:07 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:08 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:09 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:10 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:11 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:12 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:13 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:14 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:15 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:16 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:17 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:18 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:19 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:20 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:21 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:22 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:23 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:24 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:25 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:26 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:27 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:28 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:29 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:30 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:31 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:32 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:33 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:34 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:35 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:36 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:37 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:38 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:39 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:40 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:41 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:42 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:43 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:44 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:45 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:46 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:47 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:48 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:49 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:50 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:51 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:52 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:53 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:54 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:55 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:56 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:57 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:58 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 18/11/19 10:05:59 INFO yarn.Client: Application report for 
>> application_1542587655772_35129 (state: ACCEPTED)
>> 
>>      at 
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:205)
>>      at 
>> org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:64)
>>      at 
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:111)
>>      at 
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:164)
>>      at 
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)
>>      at 
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)
>>      at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:407)
>>      at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
>>      at 
>> org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:315)
>>      at 
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>      at 
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>      at 
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>      at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>      at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>      at java.lang.Thread.run(Thread.java:748)
>> 
>> Any further submission to this interpreter results in null pointer 
>> exceptions because there is no interpreter process. 
>> It looks like the interpreter driver process, while being submitted to 
>> yarn, gets stuck in ACCEPTED state, so we're not able to connect to the 
>> remote interpreter process. This happens even when resources are available 
>> on the yarn cluster. 
>> I've also tried increasing zeppelin.interpreter.connect.timeout, but that 
>> didn't help, since the application stays in ACCEPTED state indefinitely 
>> and no logs are available either.
>> It would be great if you could point me to something that can help. Please 
>> also let me know if any configuration files are required to debug this.
>> 
>> 
>> Thanks and Regards
>> 
>> 
>> Sarthak Sharma
>> DevOps Engineer, Media.Net <http://media.net/>
>> +918002228376 <tel:+918002228376> | sarthak...@media.net 
>> <mailto:sarthak...@media.net>
>>  <http://en-gb.facebook.com/people/Sarthak-Sharma/100006006014244>  
>> <http://in.linkedin.com/in/sarthaksharma96> 
> 
> 
> 
> -- 
> Best Regards
> 
> Jeff Zhang
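
To put a number on how long an application sat in ACCEPTED state, the yarn.Client report lines quoted above can be parsed. This is a rough sketch and not part of Zeppelin or Spark. The function name seconds_in_state and the regular expression are assumptions based only on the log format shown in this thread (timestamps of the form yy/MM/dd HH:mm:ss, lines possibly wrapped and prefixed with quote markers by the mail client).

```python
import re
from datetime import datetime

# One yarn.Client report, e.g.
# "18/11/19 10:04:08 INFO yarn.Client: Application report for
#  application_1542587655772_35129 (state: ACCEPTED)"
REPORT = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO yarn\.Client: "
    r"Application report for\s+(application_\d+_\d+) \(state: (\w+)\)"
)


def seconds_in_state(log_text, state="ACCEPTED"):
    """Return {app_id: seconds between first and last report in `state`}."""
    # Undo mail-client wrapping: drop leading quote markers, join the lines.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    first, last = {}, {}
    for stamp, app_id, seen in REPORT.findall(flat):
        if seen != state:
            continue
        when = datetime.strptime(stamp, "%y/%m/%d %H:%M:%S")
        first.setdefault(app_id, when)  # keep the earliest report only
        last[app_id] = when             # overwritten until the latest one
    return {app: int((last[app] - first[app]).total_seconds()) for app in first}
```

Applied to the excerpt in this thread (first ACCEPTED report at 10:04:08, last at 10:05:59), it returns 111 seconds for application_1542587655772_35129. Comparing that value with zeppelin.interpreter.connect.timeout shows whether the timeout fired while yarn had still not scheduled the driver.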
