yes, you can always specify a minimum number of partitions, and that would force some parallelism (assuming you have enough cores)
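
For instance, with the batch API the minimum-partition hint can be passed straight to sc.textFile. This is only a minimal sketch of that idea; the file name and the partition count are illustrative:

    import org.apache.spark.{SparkConf, SparkContext}

    object MinPartitionsExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("MinPartitionsExample"))

        // The second argument is a *minimum* number of partitions: Spark may create
        // more splits, but not fewer, so the read is spread over at least this many
        // tasks (given enough cores).
        val lines = sc.textFile("logs/big.log", 16)
        println(lines.count())

        sc.stop()
      }
    }
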
On Wed, Nov 12, 2014 at 9:36 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:
> What if the window is of 5 seconds, and the file takes longer than 5
> seconds to be completely scanned? Will it still attempt to load the whole
> file?
>
> On Mon, Nov 10, 2014 at 6:24 PM, Soumitra Kumar <kumar.soumi...@gmail.com>
> wrote:
>
>> Entire file in a window.
>>
>> On Mon, Nov 10, 2014 at 9:20 AM, Saiph Kappa <saiph.ka...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> In my application I am doing something like this: "new
>>> StreamingContext(sparkConf, Seconds(10)).textFileStream("logs/")", and I
>>> get some unknown exceptions when I copy a file of about 800 MB to that
>>> folder ("logs/"). I have a single worker running with 512 MB of memory.
>>>
>>> Can anyone tell me whether every 10 seconds Spark reads parts of that
>>> big file, or whether it attempts to read the entire file in a single
>>> window? How does it work?
>>>
>>> Thanks.
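
For the streaming setup quoted above: textFileStream has no partition-count argument, so one way to apply the same parallelism idea is to repartition each batch's data after it is read. This is a sketch only, with the directory and partition count as illustrative values; each new file dropped into the watched directory is picked up as a whole within a single batch interval:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object LogStreamExample {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(
          new SparkConf().setAppName("LogStreamExample"), Seconds(10))

        // Every 10 seconds, any new file that appeared under "logs/" is read in full.
        val lines = ssc.textFileStream("logs/")

        // Spread the processing of each batch across more tasks.
        val repartitioned = lines.repartition(8)
        repartitioned.count().print()

        ssc.start()
        ssc.awaitTermination()
      }
    }
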