HeartSaVioR commented on issue #27620: [SPARK-30866][SS] FileStreamSource: Cache fetched list of files beyond maxFilesPerTrigger as unread files
URL: https://github.com/apache/spark/pull/27620#issuecomment-614656637

In that case, the batch will use only the one file left in the unread list. The design avoids calling the list operation whenever possible, but in some cases it might be valid to drop the unread files and call the list operation again when the number of remaining files is relatively small compared to maxFilesPerTrigger. I think it affects only a few batches, though.
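
To make the trade-off concrete, here is a minimal toy model in Python. It is a sketch, not the code in this PR: `CachingFileSource`, `next_batch`, and `relist_ratio` are hypothetical names invented for illustration, and the threshold rule is only one possible reading of "relatively small". With `relist_ratio=0.0` the cache is always drained before listing again, which produces the one-file batch described above. A positive ratio drops a small leftover cache and lists again instead.

```python
from typing import Callable, List, Set


class CachingFileSource:
    """Toy model of a file source that caches listed-but-unread files.

    Hypothetical illustration only; this is not Spark's FileStreamSource.
    """

    def __init__(
        self,
        list_files: Callable[[], List[str]],
        max_files_per_trigger: int,
        relist_ratio: float = 0.0,  # assumed knob: 0.0 means "never drop the cache"
    ) -> None:
        self.list_files = list_files
        self.max_files = max_files_per_trigger
        self.relist_ratio = relist_ratio
        self.unread: List[str] = []   # files listed earlier but not yet used
        self.seen: Set[str] = set()   # files already handed out in a batch
        self.list_calls = 0           # counts the expensive list operations

    def next_batch(self) -> List[str]:
        threshold = self.max_files * self.relist_ratio
        if not self.unread or len(self.unread) < threshold:
            # Cache is empty or too small: drop it and list again.
            # Dropped files were never marked as seen, so the new listing
            # picks them up again.
            self.list_calls += 1
            self.unread = [f for f in self.list_files() if f not in self.seen]
        batch = self.unread[: self.max_files]
        self.unread = self.unread[self.max_files:]
        self.seen.update(batch)
        return batch
```

With 11 files and a limit of 5, the default setting yields batches of 5, 5, and 1 from a single list call. Setting `relist_ratio=0.5` triggers a second list call before the third batch, so that batch can also include files that arrived in the meantime.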
