Devaraj K created SPARK-25645:
---------------------------------

             Summary: Add provision to disable EventLoggingListener default flush/hsync/hflush for all events
                 Key: SPARK-25645
                 URL: https://issues.apache.org/jira/browse/SPARK-25645
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.3.2
            Reporter: Devaraj K
{code:java|title=EventLoggingListener.scala|borderStyle=solid}
private def logEvent(event: SparkListenerEvent, flushLogger: Boolean = false) {
  val eventJson = JsonProtocol.sparkEventToJson(event)
  // scalastyle:off println
  writer.foreach(_.println(compact(render(eventJson))))
  // scalastyle:on println
  if (flushLogger) {
    writer.foreach(_.flush())
    hadoopDataStream.foreach(ds => ds.getWrappedStream match {
      case wrapped: DFSOutputStream => wrapped.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH))
      case _ => ds.hflush()
    })
  }
}
{code}

Some events arrive with flushLogger=true and therefore trigger a flush of the underlying stream (hsync/hflush on HDFS). I tried running apps with the flush/hsync/hflush disabled for all events and observed a significant improvement in app completion time, with no event drops. I will post more details in the comments section.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
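As a rough sketch of the proposed provision (not the actual patch; the config key spark.eventLog.flush.disabled and the conf field name are hypothetical placeholders), the per-event flush path could be gated by a boolean setting read once from SparkConf:

{code:java|title=Sketch of a config-gated flush (hypothetical config key)|borderStyle=solid}
// Hypothetical: read once at listener construction; key name is illustrative only.
private val flushDisabled =
  sparkConf.getBoolean("spark.eventLog.flush.disabled", false)

private def logEvent(event: SparkListenerEvent, flushLogger: Boolean = false) {
  val eventJson = JsonProtocol.sparkEventToJson(event)
  writer.foreach(_.println(compact(render(eventJson))))
  // Skip the expensive hsync/hflush entirely when the user opts out.
  if (flushLogger && !flushDisabled) {
    writer.foreach(_.flush())
    hadoopDataStream.foreach(ds => ds.getWrappedStream match {
      case wrapped: DFSOutputStream =>
        wrapped.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH))
      case _ => ds.hflush()
    })
  }
}
{code}

With flushing disabled, events are still written to the buffered writer; the trade-off is that a crash before close() may lose buffered events in exchange for avoiding a round trip to the DataNodes on every flush-worthy event.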