[ 
https://issues.apache.org/jira/browse/BEAM-13981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504775#comment-17504775
 ] 

Ismaël Mejía commented on BEAM-13981:
-------------------------------------

[~jvilcek] I opened the PR and asked [~aromanenko] to take a look in the absence 
of Kyle so we hopefully get this into the 2.38.0 release.

> Job event log empty on Spark History Server
> -------------------------------------------
>
>                 Key: BEAM-13981
>                 URL: https://issues.apache.org/jira/browse/BEAM-13981
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>    Affects Versions: 2.33.0
>            Reporter: Jozef Vilcek
>            Assignee: Ismaël Mejía
>            Priority: P2
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> After upgrading from Beam 2.24.0 to 2.33.0, Spark jobs run on YARN show empty 
> data on the History Server after they complete.
> The problem seems to be a race condition between two EventLoggingListener 
> instances logging events to the same file:
> {noformat}
> 22/02/22 10:51:11 INFO EventLoggingListener: Logging events to 
> hdfs:/user/spark/jobhistory/application_1553109013416_12079975_1
> ...
> 22/02/22 10:51:41 INFO EventLoggingListener: Logging events to 
> hdfs:/user/spark/jobhistory/application_1553109013416_12079975_1{noformat}
> At the end failure:
> {noformat}
> 22/02/22 11:17:57 INFO SchedulerExtensionServices: Stopping 
> SchedulerExtensionServices
> (serviceOption=None,
>  services=List(),
>  started=false)
> 22/02/22 11:17:57 ERROR Utils: Uncaught exception in thread Driver
> java.io.IOException: Target log file already exists 
> (hdfs:/user/spark/jobhistory/application_1553109013416_12079975_1)
>       at 
> org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:255)
>       at 
> org.apache.spark.SparkContext.$anonfun$stop$13(SparkContext.scala:1960)
>       at 
> org.apache.spark.SparkContext.$anonfun$stop$13$adapted(SparkContext.scala:1960)
>       at scala.Option.foreach(Option.scala:274)
>       at 
> org.apache.spark.SparkContext.$anonfun$stop$12(SparkContext.scala:1960)
>       at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1340)
>       at org.apache.spark.SparkContext.stop(SparkContext.scala:1960)
>       at 
> org.apache.spark.api.java.JavaSparkContext.stop(JavaSparkContext.scala:654)
>       at 
> org.apache.beam.runners.spark.translation.SparkContextFactory.stopSparkContext(SparkContextFactory.java:73)
>       at 
> org.apache.beam.runners.spark.SparkPipelineResult$BatchMode.stop(SparkPipelineResult.java:133)
>       at 
> org.apache.beam.runners.spark.SparkPipelineResult.offerNewState(SparkPipelineResult.java:234)
>       at 
> org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:99)
>       at 
> org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:92)
>       at 
> com.sizmek.dp.dsp.pipeline.driver.PipelineDriver.$anonfun$main$1(PipelineDriver.scala:41)
>       at scala.util.Try$.apply(Try.scala:213)
>       at 
> com.sizmek.dp.dsp.pipeline.driver.PipelineDriver.main(PipelineDriver.scala:34)
>       at 
> com.sizmek.dp.dsp.pipeline.driver.PipelineDriver.main$(PipelineDriver.scala:17)
>       at 
> com.zetaglobal.dp.dsp.jobs.dealstats.DealStatsDriver$.main(DealStatsDriver.scala:18)
>       at 
> com.zetaglobal.dp.dsp.jobs.dealstats.DealStatsDriver.main(DealStatsDriver.scala)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:497)
>       at 
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684){noformat}
> This ends up with an essentially empty file in HDFS and empty job details in the 
> History Server.
>  
> The problem seems to be introduced by this change:
> [https://github.com/apache/beam/pull/14409]
> Why does Beam run an event logging listener concurrently with what Spark is 
> already doing internally? When I roll back the change for the SparkRunner, the 
> problem disappears for me.
> I am running the native SparkRunner with Spark 2.4.4.
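> 
> A minimal sketch of the configuration under which the race can occur, assuming 
> event logging is enabled through SparkConf (the app name is hypothetical and the 
> directory value is an assumption matching the paths in the log above, not taken 
> from this job's actual configuration). With these properties set, Spark's own 
> internal EventLoggingListener already writes the event log, so any additional 
> listener targeting the same file can race with it:
> {noformat}
> import org.apache.spark.SparkConf;
> 
> // With spark.eventLog.enabled=true, SparkContext registers its own internal
> // EventLoggingListener that writes to spark.eventLog.dir.
> SparkConf conf = new SparkConf()
>     .setAppName("event-log-race-illustration")                    // hypothetical app name
>     .set("spark.eventLog.enabled", "true")
>     .set("spark.eventLog.dir", "hdfs:///user/spark/jobhistory");  // assumed, matches the log paths above
> {noformat}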
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
