If you set the spark.streaming.concurrentJobs flag to 2, it lets you run two jobs in parallel. It can be hard to predict the application's behavior with this flag, though, so debugging may become a headache.
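For reference, a minimal sketch of how the flag might be set when building a streaming context (assuming PySpark; the app name and batch interval below are made up, and spark.streaming.concurrentJobs is experimental and undocumented, so verify it against your Spark version):

```python
# Sketch only: spark.streaming.concurrentJobs is an experimental,
# undocumented setting. The surrounding calls are standard PySpark API.
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

conf = (SparkConf()
        .setAppName("concurrent-jobs-demo")            # hypothetical app name
        .setMaster("local[4]")                         # 4 local threads, as suggested below
        .set("spark.streaming.concurrentJobs", "2"))   # allow 2 jobs to run in parallel

sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, batchDuration=1)            # 1-second batches (illustrative)
```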
Thanks
Best Regards

On Sun, Aug 23, 2015 at 10:36 AM, Sateesh Kavuri <sateesh.kav...@gmail.com> wrote:

> Hi Akhil,
>
> Think of the scenario as running a piece of code in normal Java with
> multiple threads. Let's say there are 4 threads spawned by a Java process
> to handle reading from a database, some processing, and storing back to
> the database. In this process, while one thread is performing database
> I/O, the CPU can allow another thread to perform the processing, thus
> efficiently using the resources.
>
> In the case of Spark, while a node executor is running the same "read
> from DB => process data => store to DB", during the "read from DB" and
> "store to DB" phases the CPU is not given to other requests in the queue,
> since the executor allocates the resources completely to the current
> ongoing request.
>
> Does the flag spark.streaming.concurrentJobs enable this kind of
> scenario, or is there any other way to achieve what I am looking for?
>
> Thanks,
> Sateesh
>
> On Sat, Aug 22, 2015 at 7:26 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
>> Hmm, for a single-core VM you will have to run it in local mode
>> (specifying master=local[4]). The flag is available in all versions of
>> Spark, I guess.
>>
>> On Aug 22, 2015 5:04 AM, "Sateesh Kavuri" <sateesh.kav...@gmail.com> wrote:
>>
>>> Thanks Akhil. Does this mean that the executor running in the VM can
>>> spawn two concurrent jobs on the same core? If this is the case, this
>>> is what we are looking for. Also, which version of Spark is this flag
>>> in?
>>>
>>> Thanks,
>>> Sateesh
>>>
>>> On Sat, Aug 22, 2015 at 1:44 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>
>>>> You can look at spark.streaming.concurrentJobs; by default it runs a
>>>> single job. If you set it to 2, then it can run 2 jobs in parallel.
>>>> It's an experimental flag, but go ahead and give it a try.
>>>>
>>>> On Aug 21, 2015 3:36 AM, "Sateesh Kavuri" <sateesh.kav...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> My scenario goes like this:
>>>>> I have an algorithm running in Spark streaming mode on a 4-core
>>>>> virtual machine. The majority of the time, the algorithm does disk
>>>>> I/O and database I/O. The question is: during the I/O, when the CPU
>>>>> is not considerably loaded, is it possible to run any other
>>>>> task/thread so as to efficiently utilize the CPU?
>>>>>
>>>>> Note that one DStream of the algorithm runs completely on a single
>>>>> CPU.
>>>>>
>>>>> Thank you,
>>>>> Sateesh