at 7:26 PM, Akhil Das wrote:
> Hmm, for a single-core VM you will have to run it in local mode (specifying
> master=local[4]). The flag is available in all the versions of Spark, I
> guess.
> On Aug 22, 2015 5:04 AM, "Sateesh Kavuri" wrote:
>
>> Thanks Akhil
> Look at spark.streaming.concurrentJobs; by default it runs a
> single job. If you set it to 2 then it can run 2 jobs in parallel. It's an
> experimental flag, but go ahead and give it a try.
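For reference, the flag is an ordinary Spark configuration key set before the StreamingContext is created. A configuration sketch (the master, app name, and batch interval here are placeholders, not from the thread):

```python
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

conf = (SparkConf()
        .setMaster("local[4]")                        # placeholder master
        .setAppName("concurrent-jobs-demo")           # placeholder app name
        .set("spark.streaming.concurrentJobs", "2"))  # experimental: allow 2 concurrent jobs
sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, batchDuration=10)          # 10-second batches, arbitrary choice
```

The key is undocumented and experimental, as Akhil notes; jobs from different batches may then run concurrently, so output operations must tolerate that.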
> On Aug 21, 2015 3:36 AM, "Sateesh Kavuri" wrote:
>
>> Hi,
>>
>
> Could you not start the disk I/O in a separate
> thread, so that the scheduler can go ahead and assign other tasks?
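The suggestion above can be sketched in plain Python (outside Spark): hand the blocking call to a worker thread and keep the CPU busy in the meantime. `blocking_io` is a hypothetical stand-in for the disk/database work, not anything from the thread:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def blocking_io(x):
    # Stand-in for disk or database I/O; sleeps instead of touching a disk.
    time.sleep(0.1)
    return x * 2

executor = ThreadPoolExecutor(max_workers=4)

# Submit the I/O; the futures run in worker threads...
futures = [executor.submit(blocking_io, i) for i in range(4)]

# ...while the main thread stays free for CPU-bound work.
other_work = sum(i * i for i in range(1000))

results = [f.result() for f in futures]  # block only when the answers are needed
executor.shutdown()
print(results)  # [0, 2, 4, 6]
```

Inside a Spark task the same pattern applies, with the caveat that the task does not complete (and its slot is not freed) until the futures are resolved.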
> On 21 Aug 2015 16:06, "Sateesh Kavuri" wrote:
Hi,
My scenario goes like this:
I have an algorithm running in Spark streaming mode on a 4-core virtual
machine. The majority of the time, the algorithm does disk I/O and database
I/O. The question is: during the I/O, when the CPU is not considerably loaded,
is it possible to run any other task/thread so that the CPU stays utilized?
Probably overloading the question a bit.
In Storm, Bolts have the functionality of getting triggered on events. Is
that kind of functionality possible with Spark streaming? During each phase
of the data processing, the transformed data is stored to the database and
this transformed data should then trigger the next phase of processing.
On Jun 4, 2015 at 2:14 AM, Sateesh Kavuri wrote:
Hi,
I have used the Weka machine learning library for generating a model for my
training set. I have used the PART algorithm (decision lists) from Weka.
Now, I would like to use Spark ML for the PART algorithm on my training set and
could not seem to find a parallel. Could anyone point out the corresponding algorithm?
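As far as I know, Spark ML has no PART (decision-list) implementation; the nearest built-in analogue is a single decision tree. A non-runnable sketch, assuming an existing SparkSession and DataFrames `training` and `test_df` with assembled `features` and `label` columns (all of these names are assumptions, not from the thread):

```python
from pyspark.ml.classification import DecisionTreeClassifier

# Nearest built-in analogue to a PART decision list: a decision tree classifier.
dt = DecisionTreeClassifier(labelCol="label", featuresCol="features", maxDepth=5)
model = dt.fit(training)                # `training` is an assumed DataFrame
predictions = model.transform(test_df)  # `test_df` likewise assumed
```

A tree is not a decision list (PART derives ordered rules from partial trees), so models trained in Weka would need retraining, not conversion.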
This would have the overall effect of decreasing performance
> if your required number of connections outstrips the database's resources.
>
> On Fri, Apr 3, 2015 at 12:22 AM Sateesh Kavuri
> wrote:
>
>> But this basically means that the pool is confined to the job (of a
>> sin
pache.org/docs/latest/streaming-programming-guide.html#transformations-on-dstreams
>
> On Thu, Apr 2, 2015 at 7:52 AM, Sateesh Kavuri
> wrote:
>
>> Right, I am aware of how to use connection pooling with Oracle, but the
>> specific question is how to use it in the context of Spark job execution.
This doesn't seem to be Spark specific, btw.
>
> > On Apr 2, 2015, at 4:45 AM, Sateesh Kavuri wrote:
Hi,
We have a case where we will have to run concurrent jobs (for the same
algorithm) on different data sets. These jobs can run in parallel, and
each one of them would be fetching its data from the database.
We would like to optimize the database connections by making use of
connection pooling.
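The usual pattern for this in Spark is a pool created lazily once per executor process and borrowed inside each partition, so connections are reused across tasks rather than opened per record. A minimal sketch in plain Python; `make_conn` is a dummy stand-in for a real database driver, and `process_partition` stands in for the function you would pass to `foreachPartition` (all names here are illustrative, not from the thread):

```python
class ConnectionPool:
    """Lazily-created, process-wide pool shared by all partitions on an executor."""
    _pool = None

    @classmethod
    def get(cls, factory, size=4):
        # Built once per process; later calls reuse the same pool.
        if cls._pool is None:
            cls._pool = [factory() for _ in range(size)]
        return cls._pool

def make_conn():
    # Stand-in for a real driver call (e.g. an Oracle/JDBC connect).
    return {"open": True}

def process_partition(rows):
    pool = ConnectionPool.get(make_conn)  # reused, not recreated, per call
    conn = pool[0]                        # borrow a connection for this partition
    return [(row, conn["open"]) for row in rows]

# Simulate two partitions landing on the same executor: the pool is built once.
out1 = process_partition([1, 2])
out2 = process_partition([3])
print(out1 + out2)  # [(1, True), (2, True), (3, True)]
```

Because each executor is a separate process, each builds its own pool; sizing the pool times the number of executors against the database's connection limit is what the caution above about outstripping the database's resources refers to.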