[ https://issues.apache.org/jira/browse/SPARK-28294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-28294:
-------------------------------------

    Assignee: Dongjoon Hyun

> Support `spark.history.fs.cleaner.maxNum` configuration
> -------------------------------------------------------
>
>                 Key: SPARK-28294
>                 URL: https://issues.apache.org/jira/browse/SPARK-28294
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Major
>
> Up to now, Apache Spark has maintained the event log directory with a time-based policy, `spark.history.fs.cleaner.maxAge`. However, there are two issues.
>
> 1. Some file systems limit the maximum number of files in a single directory. For example, HDFS's `dfs.namenode.fs-limits.max-directory-items` is 1024 * 1024 by default.
>    - https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> 2. Spark is sometimes unable to clean up some old log files due to permission issues.
>
> To handle both (1) and (2), this issue aims to support an additional count-based policy configuration for the event log directory, `spark.history.fs.cleaner.maxNum`. Spark can then try to keep the number of files in the event log directory within this limit.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
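As a hedged sketch of how the two policies described above might sit side by side in a History Server deployment, a `spark-defaults.conf` fragment could look like the following. The `spark.history.fs.cleaner.enabled` and `spark.history.fs.cleaner.maxAge` settings are existing Spark configurations; `spark.history.fs.cleaner.maxNum` is the new configuration this issue proposes, and the value shown for it is purely illustrative, not a documented default.

```properties
# spark-defaults.conf (sketch) -- combining the existing time-based cleaner
# with the count-based cleaner proposed by SPARK-28294.

# Existing settings: enable the cleaner and expire event logs older than 7 days.
spark.history.fs.cleaner.enabled   true
spark.history.fs.cleaner.maxAge    7d

# Proposed setting (this issue): cap the number of files kept in the event
# log directory. 100000 is an illustrative value chosen to stay well under
# HDFS's dfs.namenode.fs-limits.max-directory-items (1024 * 1024 by default).
spark.history.fs.cleaner.maxNum    100000
```

The intent, per the description above, is that even when the time-based policy cannot shrink the directory (for example, when old files survive due to permission errors), the count-based limit gives the cleaner a second criterion to keep the directory under per-directory file-system limits.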