[
https://issues.apache.org/jira/browse/SPARK-28294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-28294:
------------------------------------
Assignee: (was: Apache Spark)
> Support `spark.history.fs.cleaner.maxNum` configuration
> -------------------------------------------------------
>
> Key: SPARK-28294
> URL: https://issues.apache.org/jira/browse/SPARK-28294
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.0
> Reporter: Dongjoon Hyun
> Priority: Major
>
> Up to now, Apache Spark maintains the event log directory with a time-based
> policy, `spark.history.fs.cleaner.maxAge`. However, there are two issues.
> 1. Some file systems have a limit on the maximum number of files in a
> single directory. For example, HDFS's
> `dfs.namenode.fs-limits.max-directory-items` defaults to 1024 * 1024.
> -
> https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> 2. Spark is sometimes unable to clean up some old log files due to
> permission issues.
> To handle both (1) and (2), this issue aims to support an additional
> count-based policy configuration for the event log directory,
> `spark.history.fs.cleaner.maxNum`. Spark can then try to keep the number of
> files in the event log directory within this limit.
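> To illustrate, the new setting could be combined with the existing
> time-based cleaner in `spark-defaults.conf`. The values below are
> illustrative assumptions, not proposed defaults:
> {code}
> # Enable the History Server event log cleaner (existing settings)
> spark.history.fs.cleaner.enabled  true
> spark.history.fs.cleaner.maxAge   7d
>
> # Proposed count-based policy: cap the number of event log files
> # (example value; the actual default is not specified in this issue)
> spark.history.fs.cleaner.maxNum   100000
> {code}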
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]