HeartSaVioR commented on a change in pull request #27620:
URL: https://github.com/apache/spark/pull/27620#discussion_r471881844



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
##########
@@ -311,6 +344,9 @@ object FileStreamSource {
   /** Timestamp for file modification time, in ms since January 1, 1970 UTC. */
   type Timestamp = Long
 
+  val DISCARD_UNSEEN_FILES_RATIO = 0.2
+  val MAX_CACHED_UNSEEN_FILES = 10000

Review comment:
       I just wanted to avoid turning Spark configuration into an "airplane
control panel": end users already have a bunch of things to tune. It's
completely OK to make these values configurable if we find a case where the
defaults don't work.






