dongjoon-hyun commented on a change in pull request #27380: [SPARK-30669][SS]
Introduce AdmissionControl APIs for StructuredStreaming
URL: https://github.com/apache/spark/pull/27380#discussion_r373264223
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
##########
@@ -115,15 +117,19 @@ class FileStreamSource(
* `synchronized` on this method is for solving race conditions in tests. In the normal usage,
* there is no race here, so the cost of `synchronized` should be rare.
*/
- private def fetchMaxOffset(): FileStreamSourceOffset = synchronized {
+ private def fetchMaxOffset(limit: ReadLimit): FileStreamSourceOffset = synchronized {
// All the new files found - ignore aged files and files that we have seen.
val newFiles = fetchAllFiles().filter {
case (path, timestamp) => seenFiles.isNewFile(path, timestamp)
}
// Obey user's setting to limit the number of files in this batch trigger.
- val batchFiles =
- if (maxFilesPerBatch.nonEmpty) newFiles.take(maxFilesPerBatch.get) else newFiles
+ val batchFiles = limit match {
+ case files: ReadMaxFiles =>
+ newFiles.take(files.maxFiles())
+ case all: ReadAllAvailable =>
Review comment:
nit: if the bound name `all` is not used in the case body, prefer a wildcard pattern.
```scala
- case all: ReadAllAvailable =>
+ case _: ReadAllAvailable =>
```
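For context, a minimal sketch of the suggested pattern follows. The `ReadLimit`, `ReadMaxFiles`, and `ReadAllAvailable` types here are hypothetical stand-ins written for this example, not Spark's actual classes, and `selectBatch` is an illustrative helper name. The body of the `ReadAllAvailable` case is cut off in the quoted diff, so returning `newFiles` unchanged is an assumption.

```scala
// Hypothetical stand-ins for the ReadLimit types named in the diff;
// these are NOT Spark's real classes.
sealed trait ReadLimit
final class ReadMaxFiles(n: Int) extends ReadLimit {
  def maxFiles(): Int = n
}
final class ReadAllAvailable extends ReadLimit

object ReadLimitSketch {
  // Illustrative helper (not in the PR): choose the files for one batch.
  def selectBatch[A](newFiles: Seq[A], limit: ReadLimit): Seq[A] = limit match {
    // The bound name `files` is used in the body, so it stays.
    case files: ReadMaxFiles => newFiles.take(files.maxFiles())
    // Only the type is tested here, so `_` replaces an unused name.
    // Assumption: this case returns every new file.
    case _: ReadAllAvailable => newFiles
  }
}
```

A typed pattern with `_` still checks the runtime type; it only avoids introducing a binding that nothing reads.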