[ 
https://issues.apache.org/jira/browse/SPARK-28294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-28294:
-------------------------------------

    Assignee: Dongjoon Hyun

> Support `spark.history.fs.cleaner.maxNum` configuration
> -------------------------------------------------------
>
>                 Key: SPARK-28294
>                 URL: https://issues.apache.org/jira/browse/SPARK-28294
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Major
>
> Up to now, Apache Spark maintains the event log directory with a single 
> time-based policy, `spark.history.fs.cleaner.maxAge`. However, there are 
> two issues.
> 1. Some file systems limit the maximum number of files in a single 
> directory. For example, HDFS `dfs.namenode.fs-limits.max-directory-items` 
> defaults to 1024 * 1024.
> - 
> https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> 2. Spark is sometimes unable to clean up some old log files due to 
> permission issues.
> To handle both (1) and (2), this issue aims to support an additional 
> number-based policy configuration for the event log directory, 
> `spark.history.fs.cleaner.maxNum`. Spark can then try to keep the number 
> of files in the event log directory at or below this limit.
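A minimal sketch of how the two policies could be combined in `spark-defaults.conf`. The `enabled` and `maxAge` settings are existing History Server options; `maxNum` is the configuration proposed by this issue, and the value shown is purely illustrative:

```properties
# Enable periodic cleaning of event logs (existing option).
spark.history.fs.cleaner.enabled   true
# Existing time-based policy: delete logs older than 7 days.
spark.history.fs.cleaner.maxAge    7d
# Proposed number-based policy (illustrative value): keep at most
# ~50k files in the event log directory, well under HDFS's default
# dfs.namenode.fs-limits.max-directory-items of 1024 * 1024.
spark.history.fs.cleaner.maxNum    50000
```

With both set, logs past the age limit are removed as before, and the count-based limit bounds the directory size even when old files cannot be deleted for other reasons.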



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
