xuanyuanking commented on a change in pull request #27620:
URL: https://github.com/apache/spark/pull/27620#discussion_r471878372
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
##########
@@ -311,6 +344,9 @@ object FileStreamSource {
/** Timestamp for file modification time, in ms since January 1, 1970 UTC. */
type Timestamp = Long
+ val DISCARD_UNSEEN_FILES_RATIO = 0.2
+ val MAX_CACHED_UNSEEN_FILES = 10000
Review comment:
Is there a reason to keep these two parameters hard-coded instead of making
them configurable? Are they too detailed to expose to the end user?