Keiji Yoshida created ZEPPELIN-2997:
---------------------------------------

             Summary: Add configuration for disabling the feature of "auto-restart interpreter on cron execution"
                 Key: ZEPPELIN-2997
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2997
             Project: Zeppelin
          Issue Type: Improvement
            Reporter: Keiji Yoshida


Add a configuration option for disabling the "auto-restart interpreter on cron execution" feature.

There is a risk of deadlock when several notebooks bound to the same interpreters are scheduled to run at the same time and the "auto-restart interpreter on cron execution" checkbox is checked on some of them. A configuration option for disabling this feature should be added to Zeppelin.

Here is the deadlock information reported by the jstack command when the deadlock occurs in the ZeppelinServer process:

################################################################
Found one Java-level deadlock:
=============================
"pool-2-thread-63":
  waiting to lock monitor 0x00007fd2fc0073e8 (object 0x00000000c04f9c50, a java.util.concurrent.ConcurrentHashMap),
  which is held by "DefaultQuartzScheduler_Worker-2"
"DefaultQuartzScheduler_Worker-2":
  waiting to lock monitor 0x00007fd28c0036c8 (object 0x00000000c16f4738, a org.apache.zeppelin.notebook.Note),
  which is held by "DefaultQuartzScheduler_Worker-4"
"DefaultQuartzScheduler_Worker-4":
  waiting to lock monitor 0x00007fd2fc0073e8 (object 0x00000000c04f9c50, a java.util.concurrent.ConcurrentHashMap),
  which is held by "DefaultQuartzScheduler_Worker-2"

Java stack information for the threads listed above:
===================================================
"pool-2-thread-63":
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.get(InterpreterSettingManager.java:981)
        - waiting to lock <0x00000000c04f9c50> (a java.util.concurrent.ConcurrentHashMap)
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.getInterpreterSettings(InterpreterSettingManager.java:450)
        at org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:382)
        at org.apache.zeppelin.notebook.Paragraph.getRepl(Paragraph.java:255)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:361)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
"DefaultQuartzScheduler_Worker-2":
        at org.apache.zeppelin.notebook.Note.stopDelayedPersistTimer(Note.java:789)
        - waiting to lock <0x00000000c16f4738> (a org.apache.zeppelin.notebook.Note)
        at org.apache.zeppelin.notebook.Note.persist(Note.java:725)
        at org.apache.zeppelin.socket.NotebookServer$ParagraphListenerImpl.afterStatusChange(NotebookServer.java:2070)
        at org.apache.zeppelin.scheduler.Job.setStatus(Job.java:149)
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.stopJobAllInterpreter(InterpreterSettingManager.java:966)
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.restart(InterpreterSettingManager.java:942)
        - locked <0x00000000c04f9c50> (a java.util.concurrent.ConcurrentHashMap)
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.restart(InterpreterSettingManager.java:956)
        at org.apache.zeppelin.notebook.Notebook$CronJob.execute(Notebook.java:907)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
        - locked <0x00000000c0ef6190> (a java.lang.Object)
"DefaultQuartzScheduler_Worker-4":
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.get(InterpreterSettingManager.java:981)
        - waiting to lock <0x00000000c04f9c50> (a java.util.concurrent.ConcurrentHashMap)
        at org.apache.zeppelin.interpreter.InterpreterSettingManager.getInterpreterSettings(InterpreterSettingManager.java:450)
        at org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:382)
        at org.apache.zeppelin.notebook.Paragraph.isValidInterpreter(Paragraph.java:701)
        at org.apache.zeppelin.notebook.Paragraph.getMagic(Paragraph.java:690)
        at org.apache.zeppelin.notebook.Paragraph.isBlankParagraph(Paragraph.java:354)
        at org.apache.zeppelin.notebook.Note.run(Note.java:609)
        at org.apache.zeppelin.notebook.Note.runAll(Note.java:596)
        at org.apache.zeppelin.notebook.Note.runAll(Note.java:587)
        - locked <0x00000000c16f4738> (a org.apache.zeppelin.notebook.Note)
        at org.apache.zeppelin.notebook.Notebook$CronJob.execute(Notebook.java:885)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
        - locked <0x00000000c0f4b800> (a java.lang.Object)

Found 1 deadlock.
################################################################



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
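
The cycle in the jstack report above appears to reduce to a lock-order inversion: "DefaultQuartzScheduler_Worker-2" holds the ConcurrentHashMap monitor and wants the Note monitor, while "DefaultQuartzScheduler_Worker-4" holds the Note monitor and wants the ConcurrentHashMap monitor. The following is a minimal, self-contained sketch of that pattern, not Zeppelin code. The class, field, and method names (LockOrderSketch, settingsLock, noteLock, holdThenTry, reproduceInversion) are hypothetical stand-ins, and ReentrantLock.tryLock with a timeout is used in place of synchronized so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {
    // Hypothetical stand-ins for the two monitors named in the trace.
    static final ReentrantLock settingsLock = new ReentrantLock(); // the ConcurrentHashMap monitor
    static final ReentrantLock noteLock = new ReentrantLock();     // the Note monitor

    /**
     * Takes `first`, waits until the peer thread also holds its first lock,
     * then tries to take `second` with a timeout. Returns whether `second`
     * was acquired. `first` is held until both threads have finished trying,
     * which mirrors monitors that are never released in a real deadlock.
     */
    static boolean holdThenTry(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With synchronized blocks this attempt would block forever.
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    /** Returns true when neither thread could take its second lock. */
    static boolean reproduceInversion() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];

        // Like the restart path in the trace: map monitor first, then Note.
        Thread restart = new Thread(
            () -> acquired[0] = holdThenTry(settingsLock, noteLock, bothHoldFirst, bothTried));
        // Like the runAll path in the trace: Note first, then map monitor.
        Thread runAll = new Thread(
            () -> acquired[1] = holdThenTry(noteLock, settingsLock, bothHoldFirst, bothTried));

        restart.start();
        runAll.start();
        restart.join();
        runAll.join();
        return !acquired[0] && !acquired[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inversion reproduced: " + reproduceInversion());
    }
}
```

If the restart step were skipped, for example behind the configuration option requested in this issue, only the runAll-style ordering would remain in this sketch and the cycle could not form. Whether that holds for every code path in Zeppelin is not something the sketch can show.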