[GitHub] [spark] HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] Compact old event log files and cleanup

2020-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] 
Compact old event log files and cleanup
URL: https://github.com/apache/spark/pull/27085#discussion_r373345690
 
 

 ##
 File path: core/src/main/resources/META-INF/services/org.apache.spark.deploy.history.EventFilterBuilder
 ##
 @@ -0,0 +1 @@
+org.apache.spark.deploy.history.BasicEventFilterBuilder
 
 Review comment:
   Okay, thanks. At least it's consistent.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] Compact old event log files and cleanup

2020-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] 
Compact old event log files and cleanup
URL: https://github.com/apache/spark/pull/27085#discussion_r373307820
 
 

 ##
 File path: core/src/main/resources/META-INF/services/org.apache.spark.deploy.history.EventFilterBuilder
 ##
 @@ -0,0 +1 @@
+org.apache.spark.deploy.history.BasicEventFilterBuilder
 
 Review comment:
   I see. I think that's possible by simply using reflection, which I think makes 
the code easier to read. I think we're already doing this in a few places, such 
as `FileCommitProtocol.instantiate`.
   
   It seems a bit odd to use a service loader for internal classes.
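
As a rough illustration of the reflection approach suggested above, here is a minimal sketch. The trait and class below are hypothetical stand-ins for Spark's private `EventFilterBuilder` and `BasicEventFilterBuilder`; the `description` member and the `EventFilterBuilders.instantiate` helper are assumptions made for the example, not Spark's actual API.

```scala
// Hypothetical stand-ins for Spark's private trait and its default implementation.
trait EventFilterBuilder {
  def description: String
}

class BasicEventFilterBuilder extends EventFilterBuilder {
  override def description: String = "basic"
}

object EventFilterBuilders {
  // Load a builder by fully qualified class name through its no-arg constructor.
  def instantiate(className: String): EventFilterBuilder = {
    val clazz = Class.forName(className)
    // Fail early with a readable message if the class is of the wrong type.
    require(classOf[EventFilterBuilder].isAssignableFrom(clazz),
      s"$className does not implement EventFilterBuilder")
    clazz.getDeclaredConstructor().newInstance().asInstanceOf[EventFilterBuilder]
  }
}
```

With this shape, `EventFilterBuilders.instantiate(classOf[BasicEventFilterBuilder].getName)` returns a fresh instance, and no `META-INF/services` registration file is involved.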





[GitHub] [spark] HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] Compact old event log files and cleanup

2020-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] 
Compact old event log files and cleanup
URL: https://github.com/apache/spark/pull/27085#discussion_r373306885
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
 ##
 @@ -195,6 +195,24 @@ package object config {
 "configured to be at least 10 MiB.")
   .createWithDefaultString("128m")
 
+  private[spark] val EVENT_LOG_ROLLING_MAX_FILES_TO_RETAIN =
+    ConfigBuilder("spark.eventLog.rolling.maxFilesToRetain")
+      // TODO: remove this when integrating compactor with FsHistoryProvider
+      .internal()
+      .doc("The maximum number of event log files which will be retained as non-compacted. " +
+        "By default, all event log files will be retained. Please set the configuration " +
+        s"and ${EVENT_LOG_ROLLING_MAX_FILE_SIZE.key} accordingly if you want to control " +
+        "the overall size of event log files.")
+      .intConf
+      .checkValue(_ > 0, "Max event log files to retain should be higher than 0.")
+      .createWithDefault(Integer.MAX_VALUE)
 
 Review comment:
   Why didn't we make it optional, or default it to -1, to express "all event 
log files will be retained"?
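
To make the two alternatives concrete, here is a small plain-Scala sketch that does not use Spark's `ConfigBuilder`. The `RetentionSetting` object and both of its methods are hypothetical names chosen for the example. It assumes that files beyond the retention limit are the ones that become candidates for compaction.

```scala
object RetentionSetting {
  // Sentinel style: -1 means "retain everything"; any other value must be positive.
  def fromSentinel(raw: Int): Option[Int] = {
    require(raw == -1 || raw > 0,
      "Max event log files to retain should be -1 or higher than 0.")
    if (raw == -1) None else Some(raw)
  }

  // Number of oldest files that become candidates for compaction.
  // None means there is no limit, so nothing is compacted.
  def filesToCompact(totalFiles: Int, maxFilesToRetain: Option[Int]): Int =
    maxFilesToRetain match {
      case Some(limit) => math.max(0, totalFiles - limit)
      case None => 0
    }
}
```

Either style makes "no limit" an explicit state, so callers do not have to treat `Integer.MAX_VALUE` as a magic number.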





[GitHub] [spark] HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] Compact old event log files and cleanup

2020-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] 
Compact old event log files and cleanup
URL: https://github.com/apache/spark/pull/27085#discussion_r373304218
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
 ##
 @@ -195,6 +195,24 @@ package object config {
 "configured to be at least 10 MiB.")
   .createWithDefaultString("128m")
 
+  private[spark] val EVENT_LOG_ROLLING_MAX_FILES_TO_RETAIN =
+    ConfigBuilder("spark.eventLog.rolling.maxFilesToRetain")
+      // TODO: remove this when integrating compactor with FsHistoryProvider
+      .internal()
+      .doc("The maximum number of event log files which will be retained as non-compacted. " +
+        "By default, all event log files will be retained. Please set the configuration " +
+        s"and ${EVENT_LOG_ROLLING_MAX_FILE_SIZE.key} accordingly if you want to control " +
+        "the overall size of event log files.")
+      .intConf
+      .checkValue(_ > 0, "Max event log files to retain should be higher than 0.")
+      .createWithDefault(Integer.MAX_VALUE)
+
+  private[spark] val EVENT_LOG_COMPACTION_SCORE_THRESHOLD =
+    ConfigBuilder("spark.eventLog.rolling.compaction.score.threshold")
+      .internal()
 
 Review comment:
   I think we should have added some docs here too.





[GitHub] [spark] HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] Compact old event log files and cleanup

2020-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #27085: [SPARK-29779][CORE] 
Compact old event log files and cleanup
URL: https://github.com/apache/spark/pull/27085#discussion_r373302332
 
 

 ##
 File path: core/src/main/resources/META-INF/services/org.apache.spark.deploy.history.EventFilterBuilder
 ##
 @@ -0,0 +1 @@
+org.apache.spark.deploy.history.BasicEventFilterBuilder
 
 Review comment:
   `EventFilterBuilder` is private in Spark. Do you mind if I ask why we use a 
service loader?
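
For context, a one-line file under `META-INF/services` like the one in this diff is what `java.util.ServiceLoader` reads to discover implementations. A minimal sketch of the lookup side follows. The trait is a hypothetical stand-in for Spark's private `EventFilterBuilder`, and `ServiceDiscovery.loadBuilders` is an assumed helper name, not code from this PR.

```scala
import java.util.ServiceLoader

// Hypothetical stand-in for Spark's private trait.
trait EventFilterBuilder

object ServiceDiscovery {
  // Collect every implementation registered in a META-INF/services file
  // named after the trait's fully qualified class name.
  def loadBuilders(loader: ClassLoader): List[EventFilterBuilder] = {
    val it = ServiceLoader.load(classOf[EventFilterBuilder], loader).iterator()
    val found = List.newBuilder[EventFilterBuilder]
    while (it.hasNext) found += it.next()
    found.result()
  }
}
```

If no registration file is on the classpath, the returned list is empty. The mechanism is aimed at implementations supplied from outside the code base, which is why using it for a private trait raises the question above.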

