tszerszen commented on a change in pull request #13743:
URL: https://github.com/apache/beam/pull/13743#discussion_r567409686
##########
File path:
runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineRunner.java
##########
@@ -123,10 +134,33 @@ public PortablePipelineResult run(RunnerApi.Pipeline
pipeline, JobInfo jobInfo)
"Will stage {} files. (Enable logging at DEBUG level to see which
files will be staged.)",
pipelineOptions.getFilesToStage().size());
LOG.debug("Staging files: {}", pipelineOptions.getFilesToStage());
-
PortablePipelineResult result;
final JavaSparkContext jsc =
SparkContextFactory.getSparkContext(pipelineOptions);
+ EventLoggingListener eventLoggingListener = null;
+ if (pipelineOptions.getEventLogEnabled()) {
+ eventLoggingListener =
+ new EventLoggingListener(
+ jobInfo.jobId(),
+ scala.Option.apply(jobInfo.jobName()),
+ new URI(pipelineOptions.getSparkHistoryDir()),
+ jsc.getConf(),
+ jsc.hadoopConfiguration());
+ eventLoggingListener.initializeLogIfNecessary(false, false);
+ eventLoggingListener.start();
+ scala.collection.immutable.Map<String, String> logUrlMap =
+ new scala.collection.immutable.HashMap<String, String>();
+ Tuple2<String, String>[] sparkMasters =
jsc.getConf().getAllWithPrefix("spark.master");
+ Tuple2<String, String>[] sparkExecutors =
jsc.getConf().getAllWithPrefix("spark.executor.id");
+ for (int i = 0; i < sparkMasters.length; i++) {
+ eventLoggingListener.onExecutorAdded(
+ new SparkListenerExecutorAdded(
+ Instant.now().getMillis(),
+ sparkExecutors[i]._2(),
Review comment:
@ibzib done
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org