Thanks for the guidance! Passing the debug flags via --driver-java-options when launching spark-shell, instead of via SPARK_MASTER_OPTS, made the debugger attach to the right JVM (the driver, where the DAGScheduler and TaskSchedulerImpl actually run). My breakpoints get hit now.
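For the archives, this is roughly the invocation that worked for me (the master URL is a placeholder for our setup; the JDWP string is the same one I had previously put into SPARK_MASTER_OPTS):

  ./bin/spark-shell --master spark://<master-host>:7077 \
    --driver-java-options "-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n"

With suspend=n the shell starts up normally and IntelliJ can attach to port 4000 at any time. The breakpoints in DAGScheduler and TaskSchedulerImpl then trigger as soon as a job like sc.parallelize(1 to 1000).count() is submitted, since both classes live in the driver JVM.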
nirandap [via Apache Spark Developers List] wrote on Fri, 1 Jul 2016 at 04:39:

> Guys,
>
> Aren't the TaskScheduler and DAGScheduler residing in the SparkContext? So
> the debug configs need to be set in the JVM where the SparkContext is
> running? [1]
>
> But yes, I agree: if you really need to inspect the execution itself, you
> need to set those configs on the executors as well [2]
>
> [1] https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sparkcontext.html
> [2] http://spark.apache.org/docs/latest/configuration.html#runtime-environment
>
> On Fri, Jul 1, 2016 at 12:30 AM, rxin wrote:
>
>> Yes, scheduling is centralized in the driver.
>>
>> For debugging, I think you'd want to set the executor JVM flags, not the
>> worker JVM flags.
>>
>> On Thu, Jun 30, 2016 at 11:36 AM, cbruegg wrote:
>>
>>> Hello everyone,
>>>
>>> I'm a student research assistant at the University of Paderborn, working
>>> on integrating Spark (v1.6.2) with a new network resource management
>>> system. I have already taken a deep dive into the source code of
>>> spark-core w.r.t. its scheduling systems.
>>>
>>> We are running a cluster in standalone mode consisting of a master node
>>> and three slave nodes. Am I right to assume that in this mode tasks are
>>> scheduled by the TaskSchedulerImpl together with the DAGScheduler? I
>>> need to find the place where the execution plan (and each of its stages)
>>> for a job is computed and can be analyzed, so I placed breakpoints in
>>> these two classes.
>>>
>>> I established the remote debugging session in IntelliJ IDEA by running
>>> the following commands on the master node beforehand:
>>>
>>> export SPARK_WORKER_OPTS="-Xdebug
>>> -Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n"
>>> export SPARK_MASTER_OPTS="-Xdebug
>>> -Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n"
>>>
>>> Port 4000 has been forwarded to my local machine. Unfortunately, none of
>>> my breakpoints in those classes get hit when I run a job like
>>> sc.parallelize(1 to 1000).count() in spark-shell on the master node
>>> (using --master spark://...). However, when I pause all threads I can
>>> see that the process I am debugging runs some kind of event queue, so
>>> the debugger is at least connected to /something/.
>>>
>>> Do I rely on false assumptions, or should these breakpoints in fact get
>>> hit? I am not too familiar with Spark, so please bear with me if I got
>>> something wrong. Many thanks in advance for your help.
>>>
>>> Best regards,
>>> Christian Brüggemann
>
> --
> Niranda
> @n1r44 <https://twitter.com/N1R44>
> +94-71-554-8430
> https://pythagoreanscript.wordpress.com/
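P.S. In case someone later needs to step into executor-side code as Reynold suggested: I haven't tried this myself yet, but from the runtime-environment docs in [2] my understanding is that the executor JVMs would pick up the debug flags via spark.executor.extraJavaOptions, along these lines (again with a placeholder master URL, and a different port than the driver's):

  ./bin/spark-shell --master spark://<master-host>:7077 \
    --conf "spark.executor.extraJavaOptions=-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=4001,suspend=n"

Each executor would then listen on port 4001 on its own host; with several executors per host they would clash on that port, so I believe you'd have to let each JVM pick a free port (e.g. by omitting the address, in which case JDWP prints the chosen port). Corrections welcome.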