[ 
https://issues.apache.org/jira/browse/SPARK-28594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012562#comment-17012562
 ] 

Jungtaek Lim commented on SPARK-28594:
--------------------------------------

I'm enumerating the items that are "good to do". It is probably better to file 
JIRA issues for them once we decide to do them, or once all required 
functionality is done and we have the resources to deal with them.

For now, the items I have are below:
 * Retain a specific number of finished jobs / executions, so that the compact 
file still contains some finished jobs / executions
 ** [https://github.com/apache/spark/pull/27085#discussion_r363428336]
 * Separate compaction from cleaning to allow leaving some old event log files 
after compaction
 ** [https://github.com/apache/spark/pull/27085#issuecomment-572792067]
 * Cache the state of the compactor to avoid replaying event log files that 
were already loaded
 ** [https://github.com/apache/spark/pull/26416#discussion_r358260674]
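
To make the first item concrete, here is a rough sketch of what "retain a specific number of finished jobs" could look like. It is only an illustration and not how the compactor in the linked PRs works: the Event class, the event kinds and the compact() function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical, simplified event model; Spark's real listener events differ.
@dataclass(frozen=True)
class Event:
    kind: str    # "job_start", "job_end" or "task" (illustrative names only)
    job_id: int


def compact(events: List[Event], retain_finished: int) -> List[Event]:
    """Keep events of live jobs plus the most recently finished jobs.

    `retain_finished` is an assumed knob: how many finished jobs survive
    compaction. With 0, only events of still-running jobs are kept.
    """
    # Job ids in the order in which their "job_end" event was seen.
    finished = [e.job_id for e in events if e.kind == "job_end"]
    finished_set = set(finished)

    # Guard against finished[-0:], which would return the whole list.
    retained = set(finished[-retain_finished:]) if retain_finished > 0 else set()

    # A job's events are kept if the job never finished (still live)
    # or if it is among the retained finished jobs.
    return [
        e for e in events
        if e.job_id not in finished_set or e.job_id in retained
    ]


events = [
    Event("job_start", 1), Event("task", 1), Event("job_end", 1),
    Event("job_start", 2), Event("job_end", 2),
    Event("job_start", 3), Event("task", 3),   # job 3 is still running
]

kept = compact(events, retain_finished=1)
print([e.job_id for e in kept])
```

With retain_finished=1 the oldest finished job (1) is dropped, the most recently finished job (2) stays, and the live job (3) is always kept. Setting it to 0 keeps only live jobs.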

 

> Allow event logs for running streaming apps to be rolled over.
> --------------------------------------------------------------
>
>                 Key: SPARK-28594
>                 URL: https://issues.apache.org/jira/browse/SPARK-28594
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>         Environment: This has been reported on 2.0.2.22 but affects all 
> currently available versions.
>            Reporter: Stephen Levett
>            Priority: Major
>
> In all current Spark releases, when event logging is enabled for Spark 
> Streaming, the event logs grow massively.  The files continue to grow until 
> the application is stopped or killed.
> The Spark history server then has difficulty processing the files.
> https://issues.apache.org/jira/browse/SPARK-8617
> Addresses .inprogress files but not event log files that are still running.
> Identify a mechanism to set a "max file" size so that the file is rolled over 
> when it reaches this size?
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

