Well, that could be the problem. A SQL database is essentially a big synchronizer.
If you have a lot of Spark tasks all bottlenecking on a single database socket
(is the database clustered or colocated with the Spark workers?), then you will
have blocked threads on the database server.
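
To make the "synchronizer" point concrete, here is a rough sketch in plain Python. It uses no Spark and no real database: the SingleSocketDatabase class, the run_tasks helper, and the 10 ms query latency are all made up for illustration. Several worker threads stand in for Spark tasks, and a lock stands in for the single database socket, so only one query is ever in flight no matter how many workers there are.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SingleSocketDatabase:
    """Hypothetical stand-in for a database reached through one socket."""

    def __init__(self, latency=0.01):
        self._socket = threading.Lock()  # models the single shared socket
        self._stats = threading.Lock()   # protects the counters below
        self._latency = latency          # assumed per-query latency (seconds)
        self._in_flight = 0
        self.max_in_flight = 0

    def query(self, task_id):
        # Every caller must take the socket; everyone else blocks here.
        with self._socket:
            with self._stats:
                self._in_flight += 1
                self.max_in_flight = max(self.max_in_flight, self._in_flight)
            time.sleep(self._latency)    # simulated query round trip
            with self._stats:
                self._in_flight -= 1
        return task_id


def run_tasks(num_tasks=8, num_workers=4):
    """Run num_tasks simulated tasks on num_workers threads against one socket."""
    db = SingleSocketDatabase()
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(db.query, range(num_tasks)))
    elapsed = time.monotonic() - start
    return db.max_in_flight, results, elapsed


if __name__ == "__main__":
    max_in_flight, results, elapsed = run_tasks()
    print("tasks completed:", len(results))
    print("max queries in flight:", max_in_flight)
    print("elapsed seconds:", round(elapsed, 3))
```

With four workers, the maximum number of queries in flight still stays at one, so the total time grows with the number of tasks rather than shrinking with the number of workers. A real database server handles more than one connection, of course, so treat this as the worst case of the effect, not a model of your setup.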
-------- Original message --------
From: Malcolm Lockyer <[email protected]>
Date: 05/30/2016 10:40 PM (GMT-05:00)
To: [email protected]
Subject: Re: Spark + Kafka processing trouble
On Tue, May 31, 2016 at 1:56 PM, Darren Govoni <[email protected]> wrote:
> So you are calling a SQL query (to a single database) within a spark
> operation distributed across your workers?
Yes, but currently with very small sets of data (1-10,000) and on a
single (dev) machine right now.
(sorry didn't reply to the list)