Hey,

As you suggested, I'm already setting the *spark.yarn.queue* parameter, so I
know which YARN queue the application is scheduled in. That queue has
resources available, since other applications are being scheduled there.

Even assuming the queue does NOT have the resources to schedule the
application within the given time frame, so that an exception is thrown once
*zeppelin.interpreter.connect.timeout* is reached, the application should
still get scheduled eventually. That is not the case here: the interpreter
driver process remains stuck in the ACCEPTED state. Has the implementation
changed in this version? We never experienced this on the previous one
(zeppelin-0.7.3), where drivers would eventually get scheduled in their
respective queues.
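
To see what the ResourceManager itself reports for the stuck application, it can be queried directly rather than relying on the yarn.Client output. Below is a minimal sketch in Python that reads one application from the YARN ResourceManager REST API (the /ws/v1/cluster/apps/{appid} endpoint). The helper names (app_url, summarize_app, fetch_app) are made up for illustration, and the ResourceManager address has to be replaced with the real host and port; it assumes the REST endpoint is reachable over plain HTTP without authentication.

```python
import json
from urllib.request import urlopen


def app_url(rm_address: str, app_id: str) -> str:
    """Build the ResourceManager REST URL for a single application.

    rm_address is "host:port" of the ResourceManager web UI (assumed to be
    plain HTTP here; adjust for HTTPS or Kerberos-secured clusters).
    """
    return f"http://{rm_address}/ws/v1/cluster/apps/{app_id}"


def summarize_app(payload: dict) -> dict:
    """Keep only the fields that matter when an app sits in ACCEPTED."""
    app = payload.get("app", {})
    keys = ("state", "finalStatus", "queue", "diagnostics")
    return {key: app.get(key) for key in keys}


def fetch_app(rm_address: str, app_id: str, timeout: float = 10.0) -> dict:
    """Fetch the application report from the ResourceManager and summarize it."""
    with urlopen(app_url(rm_address, app_id), timeout=timeout) as resp:
        return summarize_app(json.load(resp))
```

For example, fetch_app("resource-manager-addr:8088", "application_1542587655772_35129") would return the state, queue and diagnostics for the application in the log below. On some Hadoop versions the diagnostics field explains why the ApplicationMaster container has not been allocated yet, but that depends on the version and scheduler in use.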

On Tue, Nov 20, 2018, 7:29 AM Xun Liu <neliu...@163.com> wrote:

> Hi Sarthak Sharma,
>
> The log shows that the task submitted by spark-submit is still waiting
> for execution in the YARN queue. Does the YARN queue have no resources
> available?
> You can specify a queue that has resources in the spark interpreter via the
> spark.yarn.queue parameter.
>
>
> On Nov 19, 2018, at 7:41 PM, Sarthak Sharma <sarthak...@media.net> wrote:
>
> Hi,
>
> We already have a zeppelin-0.7.3 setup that runs fine and is currently in
> use, but we are looking into the yarn cluster mode support for the spark
> interpreter in zeppelin-0.8. I've built it from source from *branch-0.8*
> (as of Nov-15) and am intermittently facing the following issue in some
> of the spark interpreters while trying to use spark-sql on it.
>
> *18/11/19 10:04:07 INFO yarn.Client: Submitting application
> application_1542587655772_35129 to ResourceManager*
> *18/11/19 10:04:07 INFO impl.YarnClientImpl: Submitted application
> application_1542587655772_35129*
> *18/11/19 10:04:08 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:08 INFO yarn.Client:*
> *  client token: N/A*
> *  diagnostics: N/A*
> *  ApplicationMaster host: N/A*
> *  ApplicationMaster RPC port: -1*
> *  queue: root.zep*
> *  start time: 1542621847537*
> *  final status: UNDEFINED*
> *  tracking
> URL: http://resource-manager-addr/proxy/application_1542587655772_35129/
> <http://c8-auto-hadoop-service-1.srv.media.net:8088/proxy/application_1542587655772_35129/>*
> *  user: sarthak.sh*
> *18/11/19 10:04:09 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:10 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:11 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:12 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:13 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:14 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:15 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:16 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:17 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:18 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:19 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:20 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:21 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:22 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:23 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:24 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:25 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:26 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:27 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:28 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:29 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:30 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:31 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:32 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:33 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:34 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:35 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:36 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:37 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:38 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:39 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:40 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:41 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:42 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:43 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:44 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:45 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:46 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:47 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:48 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:49 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:50 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:51 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:52 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:53 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:54 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:55 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:56 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:57 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:58 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:04:59 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:00 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:01 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:02 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:03 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:04 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:05 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:06 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:07 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:08 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:09 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:10 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:11 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:12 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:13 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:14 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:15 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:16 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:17 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:18 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:19 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:20 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:21 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:22 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:23 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:24 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:25 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:26 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:27 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:28 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:29 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:30 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:31 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:32 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:33 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:34 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:35 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:36 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:37 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:38 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:39 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:40 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:41 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:42 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:43 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:44 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:45 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:46 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:47 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:48 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:49 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:50 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:51 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:52 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:53 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:54 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:55 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:56 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:57 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:58 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
> *18/11/19 10:05:59 INFO yarn.Client: Application report for
> application_1542587655772_35129 (state: ACCEPTED)*
>
> * at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:205)*
> * at
> org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:64)*
> * at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:111)*
> * at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:164)*
> * at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)*
> * at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)*
> * at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:407)*
> * at org.apache.zeppelin.scheduler.Job.run(Job.java:188)*
> * at
> org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:315)*
> * at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)*
> * at java.util.concurrent.FutureTask.run(FutureTask.java:266)*
> * at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)*
> * at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)*
> * at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*
> * at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*
> * at java.lang.Thread.run(Thread.java:748)*
>
> Any further submission to this interpreter throws null pointer exceptions
> because there is no interpreter process.
> It looks like the interpreter driver process, once submitted to YARN, gets
> stuck in the ACCEPTED state, so we cannot connect to the remote interpreter
> process. This happens even when resources are available on the YARN
> cluster.
> I've also tried increasing *zeppelin.interpreter.connect.timeout*, but
> that didn't help, since the application stays in the ACCEPTED state
> indefinitely and no logs are available either.
> It would be great if you could point me to something that can help. Please
> also let me know if any configuration files are needed to debug this.
>
>
> Thanks and Regards
>
>
> *Sarthak Sharma*
> DevOps Engineer, Media.Net <http://media.net/>
> +918002228376 | sarthak...@media.net
> <http://en-gb.facebook.com/people/Sarthak-Sharma/100006006014244>
> <http://in.linkedin.com/in/sarthaksharma96>
>
>
>
