If you set the concurrentJobs flag to 2, it lets you run two jobs in
parallel. However, the application's behavior becomes harder to predict
with this flag, so debugging can be a headache.
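
For reference, the flag can be passed at submit time. A minimal sketch
(the jar name is a hypothetical placeholder; spark.streaming.concurrentJobs
is an undocumented, experimental setting):

```shell
# Run a streaming app with two jobs scheduled concurrently per batch.
spark-submit \
  --master local[4] \
  --conf spark.streaming.concurrentJobs=2 \
  your-streaming-app.jar
```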

Thanks
Best Regards
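
As a side note, the thread-overlap idea described in the quoted message
below can be sketched in plain Python (illustrative only; the sleeps are
stand-ins for blocking database I/O):

```python
import threading
import time

def handle_request(request_id, results):
    # Simulated blocking "read from DB". The interpreter releases the
    # GIL while sleeping, so other threads can run in the meantime.
    time.sleep(0.2)
    value = request_id * 2   # "process data" (trivial CPU work)
    time.sleep(0.2)          # simulated blocking "store to DB"
    results[request_id] = value

results = {}
threads = [threading.Thread(target=handle_request, args=(i, results))
           for i in range(4)]

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

# Because the four threads overlap their I/O waits, total wall time is
# close to one request's latency (~0.4 s) rather than 4 x 0.4 s = 1.6 s.
print(sorted(results.items()), elapsed)
```

The same reasoning is what the question below asks about for a Spark
executor: whether another job can use the CPU while the current one is
blocked on I/O.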

On Sun, Aug 23, 2015 at 10:36 AM, Sateesh Kavuri <sateesh.kav...@gmail.com>
wrote:

> Hi Akhil,
>
> Think of the scenario as running a piece of code in plain Java with
> multiple threads. Let's say a Java process spawns 4 threads to handle
> reading from a database, some processing, and storing back to the
> database. In this process, while one thread is performing database I/O,
> the CPU can run another thread's processing, thus using the resources
> efficiently.
>
> In the case of Spark, while a node executor is running the same "read
> from DB => process data => store to DB" pipeline, the CPU is not given
> to other requests in the queue during the "read from DB" and "store to
> DB" phases, since the executor allocates its resources entirely to the
> current ongoing request.
>
> Doesn't the spark.streaming.concurrentJobs flag enable this kind of
> scenario, or is there any other way to achieve what I am looking for?
>
> Thanks,
> Sateesh
>
> On Sat, Aug 22, 2015 at 7:26 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> Hmm, for a single-core VM you will have to run it in local mode
>> (specifying master=local[4]). The flag is available in all versions of
>> Spark, I guess.
>> On Aug 22, 2015 5:04 AM, "Sateesh Kavuri" <sateesh.kav...@gmail.com>
>> wrote:
>>
>>> Thanks Akhil. Does this mean that the executor running in the VM can
>>> spawn two concurrent jobs on the same core? If this is the case, this is
>>> what we are looking for. Also, which version of Spark is this flag in?
>>>
>>> Thanks,
>>> Sateesh
>>>
>>> On Sat, Aug 22, 2015 at 1:44 AM, Akhil Das <ak...@sigmoidanalytics.com>
>>> wrote:
>>>
>>>> You can look at spark.streaming.concurrentJobs; by default it runs a
>>>> single job. If you set it to 2, then it can run 2 jobs in parallel.
>>>> It's an experimental flag, but go ahead and give it a try.
>>>> On Aug 21, 2015 3:36 AM, "Sateesh Kavuri" <sateesh.kav...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> My scenario goes like this:
>>>>> I have an algorithm running in Spark streaming mode on a 4-core
>>>>> virtual machine. The majority of the time, the algorithm does disk
>>>>> I/O and database I/O. The question is: during the I/O, when the CPU
>>>>> is not heavily loaded, is it possible to run any other task/thread
>>>>> so as to utilize the CPU efficiently?
>>>>>
>>>>> Note that one DStream of the algorithm runs entirely on a single CPU.
>>>>>
>>>>> Thank you,
>>>>> Sateesh
>>>>>
>>>>
>>>
>
