The first time, it needs to list them all. After that, the list should be cached by the file stream implementation (as far as I remember).
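To illustrate the general idea only, here is a minimal Python sketch of "remember what was already seen, report only what is new". This is not Spark's code: `NewFileTracker` and `poll` are hypothetical names made up for this example, and the real file stream implementation may work differently.

```python
import os


class NewFileTracker:
    """Hypothetical helper (not a Spark API): remembers files already reported."""

    def __init__(self, directory):
        self.directory = directory
        self.seen = set()  # cache of file names reported by earlier polls

    def poll(self):
        # The directory is still listed on every poll in this sketch;
        # the cache only prevents already-reported files from being returned again.
        current = set(os.listdir(self.directory))
        new_files = sorted(current - self.seen)
        self.seen |= current
        return new_files
```

In this sketch the first `poll()` returns everything in the directory, and later calls return only files that appeared since the previous call.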
On Thu, Jul 30, 2015 at 3:55 PM, Brandon White <[email protected]> wrote:
> Is this a known bottle neck for Spark Streaming textFileStream? Does it
> need to list all the current files in a directory before he gets the new
> files? Say I have 500k files in a directory, does it list them all in order
> to get the new files?
