[ https://issues.apache.org/jira/browse/SPARK-22783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290620#comment-16290620 ]
omkar kankalapati commented on SPARK-22783:
-------------------------------------------

EventLoggingListener (org.apache.spark.scheduler.EventLoggingListener) opens the log file named <application_id>.inprogress for writing in its start() method, as a Writer object. For every event, logEvent() is invoked, which simply appends to the writer without checking the file size or performing any rotation. The writer is closed, and the file is renamed to remove the ".inprogress" suffix, only in the stop() method.

Thus, the *.inprogress file is held open for the lifetime of the application and keeps growing.

It would be very helpful if EventLoggingListener could be enhanced to rotate the .inprogress file when it reaches a configured threshold size or interval.

> event log directory(spark-history) filled by large .inprogress files for spark streaming applications
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-22783
>                 URL: https://issues.apache.org/jira/browse/SPARK-22783
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 2.1.0
>         Environment: Linux(Generic)
>            Reporter: omkar kankalapati
>            Priority: Critical
>
> When running long-running streaming applications, the HDFS storage gets
> filled up with large *.inprogress files in the hdfs://spark-history/ directory.
> For example:
> hadoop fs -du -h /spark-history
> 234 /spark-history/<Application_1_ID>.inprogress
> 46.6 G /spark-history/<Application_2_ID>.inprogress
> Instead of continuing to write to a very large (multi-GB) .inprogress file,
> Spark should rotate the current log file when it reaches a given size (for
> example, 100 MB) or interval, and perhaps expose a configuration parameter
> for the size/interval.
> This is also mentioned in SPARK-12140 as a concern.
> Support for rotating the log files is important because users may have a
> limited HDFS quota, and these large files consume the available
> limited quota.
> Also, users do not have a viable workaround:
> 1) The files cannot be moved to another location, because moving the file
> causes event logging to stop.
> 2) Copying the .inprogress file to another location and then truncating the
> .inprogress file fails, because the file is still held open for writing by
> EventLoggingListener:
> hdfs dfs -truncate -w 0 /spark-history/<application_id>.inprogress
> truncate: Failed to TRUNCATE_FILE /spark-history/<application_id>.inprogress
> for DFSClient_NONMAPREDUCE_<#ID>on <IP> because this file lease is currently
> owned by DFSClient_NONMAPREDUCE_<#ID> on <IP>
> The only workaround available is to disable event logging for streaming
> applications by setting "spark.eventLog.enabled" to false.
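
To illustrate the size-based rotation requested above, here is a minimal sketch in Python. It is not Spark code and not a proposed patch: the class RotatingEventLog, the max_bytes parameter, and the <app_id>_<n> segment naming are all hypothetical, and it writes to a local directory rather than HDFS. It only mirrors the start() / logEvent() / stop() lifecycle described in the comment, with a size check added before each append.

```python
import json
import os
import tempfile


class RotatingEventLog:
    """Hypothetical size-based rotating event log (local files, not HDFS)."""

    SUFFIX = ".inprogress"

    def __init__(self, log_dir, app_id, max_bytes):
        self.log_dir = log_dir
        self.app_id = app_id
        self.max_bytes = max_bytes  # hypothetical rotation threshold, in bytes
        self.index = 0              # number of the segment currently being written
        self.written = 0            # bytes written to the current segment
        self.writer = None

    def _final_path(self):
        # Name of the current segment once it is complete (no suffix).
        return os.path.join(self.log_dir, f"{self.app_id}_{self.index}")

    def start(self):
        # Open the current segment with the .inprogress suffix.
        path = self._final_path() + self.SUFFIX
        self.writer = open(path, "w", encoding="utf-8", newline="\n")
        self.written = 0

    def _close_segment(self):
        # Close the writer, then rename the file to drop the .inprogress suffix.
        self.writer.close()
        self.writer = None
        os.rename(self._final_path() + self.SUFFIX, self._final_path())

    def log_event(self, event):
        line = json.dumps(event) + "\n"
        size = len(line.encode("utf-8"))
        # Rotate first if this event would push a non-empty segment past the limit.
        if self.written > 0 and self.written + size > self.max_bytes:
            self._close_segment()
            self.index += 1
            self.start()
        self.writer.write(line)
        self.writer.flush()
        self.written += size

    def stop(self):
        # Finalize the last segment.
        self._close_segment()


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    log = RotatingEventLog(log_dir, "app-1", max_bytes=100)
    log.start()
    for i in range(10):
        log.log_event({"Event": "Batch", "id": i})
    log.stop()
    # Completed segments no longer carry the .inprogress suffix.
    print(sorted(os.listdir(log_dir)))
```

With this shape, only the newest segment is held open by the writer, so completed segments could be moved or deleted without touching the file that is still being written. A single event larger than max_bytes still produces an oversized segment; the sketch does not split events.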