Yes, you can always specify a minimum number of partitions, and that would
force some parallelism (assuming you have enough cores).
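
For reference, a minimal sketch of what that could look like: for a plain RDD
the minimum partition count can be passed directly as sc.textFile(path,
minPartitions); textFileStream takes no such argument, so repartitioning the
resulting DStream is one way to get a similar effect. The app name, directory,
and partition count below are only example values.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object LogStream {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf().setAppName("LogStream")
        val ssc = new StreamingContext(sparkConf, Seconds(10))

        // Each new file that appears in "logs/" is read as a whole in the
        // batch in which it is detected; repartition() spreads those records
        // across more partitions so downstream stages can use all cores.
        val lines = ssc.textFileStream("logs/").repartition(8)

        lines.count().print()

        ssc.start()
        ssc.awaitTermination()
      }
    }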

On Wed, Nov 12, 2014 at 9:36 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:

> What if the window is 5 seconds and the file takes longer than 5 seconds
> to be scanned completely? Will it still attempt to load the whole file?
>
> On Mon, Nov 10, 2014 at 6:24 PM, Soumitra Kumar <kumar.soumi...@gmail.com>
> wrote:
>
>> The entire file is read in a single window.
>>
>> On Mon, Nov 10, 2014 at 9:20 AM, Saiph Kappa <saiph.ka...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> In my application I am doing something like this: "new
>>> StreamingContext(sparkConf, Seconds(10)).textFileStream("logs/")", and I
>>> get some unknown exceptions when I copy a file of about 800 MB into that
>>> folder ("logs/"). I have a single worker running with 512 MB of memory.
>>>
>>> Can anyone tell me whether, every 10 seconds, Spark reads parts of that
>>> big file, or whether it attempts to read the entire file in a single
>>> window? How does it work?
>>>
>>> Thanks.
>>>
>>>
>>
>
