[ https://issues.apache.org/jira/browse/SPARK-25645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj K resolved SPARK-25645.
-------------------------------
    Resolution: Duplicate

> Add provision to disable EventLoggingListener default flush/hsync/hflush for all events
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-25645
>                 URL: https://issues.apache.org/jira/browse/SPARK-25645
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.2
>            Reporter: Devaraj K
>            Priority: Major
>
> {code:java|title=EventLoggingListener.scala|borderStyle=solid}
> private def logEvent(event: SparkListenerEvent, flushLogger: Boolean = false) {
>   val eventJson = JsonProtocol.sparkEventToJson(event)
>   // scalastyle:off println
>   writer.foreach(_.println(compact(render(eventJson))))
>   // scalastyle:on println
>   if (flushLogger) {
>     writer.foreach(_.flush())
>     hadoopDataStream.foreach(ds => ds.getWrappedStream match {
>       case wrapped: DFSOutputStream =>
>         wrapped.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH))
>       case _ => ds.hflush()
>     })
>   }
> }
> {code}
>
> Some events arrive with flushLogger=true and trigger a flush of the underlying stream. I tried running apps with the flush/hsync/hflush disabled for all events and saw a significant improvement in app completion time, with no event drops. I will post more details in the comments section.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
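For context, the improvement the reporter asks for amounts to gating the per-event flush behind a switch. A minimal sketch of that idea, in plain Scala with a simple Writer standing in for the HDFS stream (the flushEnabled flag and the SimpleEventLogger class are hypothetical illustrations, not actual Spark code or configuration):

```scala
import java.io.{StringWriter, Writer}

// Hypothetical logger: like EventLoggingListener.logEvent, it writes one
// JSON line per event, but a flushEnabled switch (which real Spark 2.3.x
// lacks -- the gap this issue describes) can suppress the costly flush.
class SimpleEventLogger(writer: Writer, flushEnabled: Boolean) {
  private var flushCount = 0

  def logEvent(eventJson: String, flushLogger: Boolean = false): Unit = {
    writer.write(eventJson + "\n")
    // Flush only when the caller requests it AND the switch allows it;
    // in real Spark this is where hflush()/hsync() would be invoked.
    if (flushLogger && flushEnabled) {
      writer.flush()
      flushCount += 1
    }
  }

  def flushes: Int = flushCount
}
```

With flushEnabled = false, events marked flushLogger = true are still written but never force a flush, which is the behavior the reporter says avoided the slowdown without dropping events.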