HeartSaVioR commented on a change in pull request #27085: [SPARK-29779][CORE]
Compact old event log files and cleanup
URL: https://github.com/apache/spark/pull/27085#discussion_r363569530
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -195,6 +195,24 @@ package object config {
"configured to be at least 10 MiB.")
.createWithDefaultString("128m")
+ private[spark] val EVENT_LOG_ROLLING_MAX_FILES_TO_RETAIN =
+ ConfigBuilder("spark.eventLog.rolling.maxFilesToRetain")
+ // TODO: remove this when integrating compactor with FsHistoryProvider
+ .internal()
+ .doc("The maximum number of event log files which will be retained as non-compacted. " +
+ "By default, all event log files will be retained. Please set the configuration " +
+ s"and ${EVENT_LOG_ROLLING_MAX_FILE_SIZE.key} accordingly if you want to control " +
+ "the overall size of event log files.")
+ .intConf
+ .checkValue(_ > 0, "Max event log files to retain should be higher than 0.")
Review comment:
> On a side note, compaction currently overrides all the "retained blah" configurations

Yeah, that's actually the point I have been thinking about, as end users may
see only a few jobs if they set this to 1 and the log files have just been
compacted.

Maybe we could improve the event filter so that max retained jobs can be
configured for the compacted file, letting some finished jobs still be tracked
(`live jobs + finished jobs <= max retained jobs`). It cannot be a strict max,
as there could be more live jobs than the configured value, so it may need a
better name.

Looks like this could be a further TODO worth filing a JIRA issue for.
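
To make the idea concrete, here is a rough sketch of the selection rule only. Everything in it is hypothetical: `RetainedJobsSketch`, `jobsToRetain` and `maxRetainedJobs` are made-up names, not existing Spark APIs or configs, and the real event filter would have to work on events rather than plain job IDs.

```scala
object RetainedJobsSketch {
  /**
   * Hypothetical helper: picks the job IDs a compacted file would keep.
   *
   * @param liveJobs        IDs of jobs still running (always kept)
   * @param finishedJobs    IDs of finished jobs, oldest first
   * @param maxRetainedJobs assumed config value, not an existing Spark config
   */
  def jobsToRetain(
      liveJobs: Set[Int],
      finishedJobs: Seq[Int],
      maxRetainedJobs: Int): Set[Int] = {
    require(maxRetainedJobs > 0, "maxRetainedJobs should be higher than 0")
    // Live jobs are never dropped, so the budget left for finished jobs
    // can shrink to zero when live jobs alone reach the configured value.
    val budget = math.max(0, maxRetainedJobs - liveJobs.size)
    // Spend the remaining budget on the most recently finished jobs.
    liveJobs ++ finishedJobs.takeRight(budget)
  }
}
```

With 2 live jobs, 4 finished jobs and a value of 4, this keeps the 2 live jobs plus the 2 most recently finished ones. With 3 live jobs and a value of 2, all 3 live jobs are still kept, which is why the value cannot be a strict max.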