Re: Spark Streaming threading model

Gerard Maas Wed, 25 Sep 2013 12:31:36 -0700

Hi Tathagata,

Many thanks for the extended answer and the clarifications on the kafka
data distribution in the cluster.


There are many points to handle, so, to start somewhere:

> Case (ii) could have been implemented as an actor as it just inserts a
> record on an arraybuffer (i.e., a very small task). However, with rates of
> more than 100K records received per second, I was unsure what the overhead
> of sending each record as a message through the actor library would be
> like.

I'm personally curious about this point. I could investigate by creating a
simplified test scenario that isolates the data accumulator case and compares
the performance of both models (actors vs. threads with proper locking)
under different levels of concurrency.
Do you think this could be helpful for the project? I'm looking to
contribute and this could be an interesting starting point.
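
To make the idea concrete, here is a rough sketch of what such a test might
look like. It is plain Java with no dependencies, so it only approximates the
actor model: a single consumer thread draining a queue stands in for an actor
and its mailbox, and an ArrayList stands in for the arraybuffer. The class and
method names (AccumulatorBench, runLocked, runMailbox) are made up for
illustration, and the timing is naive (no warm-up), so the numbers would only
be indicative. A real comparison would use the actual actor library and a
proper benchmark harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.IntConsumer;

// Hypothetical micro-test: N producer threads feed records into one buffer,
// either directly under a lock or through a mailbox owned by one consumer.
class AccumulatorBench {

    // Start `producers` threads that each hand `perProducer` records to `sink`.
    static Thread[] startProducers(int producers, int perProducer, IntConsumer sink) {
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    sink.accept(i);
                }
            });
            threads[p].start();
        }
        return threads;
    }

    // Wait for all threads, converting interruption into an unchecked error.
    static void joinAll(Thread... threads) {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    // Model A: every producer appends to the shared buffer under a lock.
    static int runLocked(int producers, int perProducer) {
        final List<Integer> buffer = new ArrayList<>();
        joinAll(startProducers(producers, perProducer, i -> {
            synchronized (buffer) {
                buffer.add(i);
            }
        }));
        return buffer.size();
    }

    // Model B (actor-like): producers post messages to a mailbox and a single
    // consumer thread is the only one that ever touches the buffer.
    static int runMailbox(int producers, int perProducer) {
        final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
        final List<Integer> buffer = new ArrayList<>();
        final int total = producers * perProducer;
        Thread consumer = new Thread(() -> {
            try {
                for (int n = 0; n < total; n++) {
                    buffer.add(mailbox.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        joinAll(startProducers(producers, perProducer, i -> mailbox.add(i)));
        joinAll(consumer);
        return buffer.size();
    }

    public static void main(String[] args) {
        // Compare both models under different levels of concurrency.
        for (int producers : new int[] {1, 2, 4, 8}) {
            long t0 = System.nanoTime();
            runLocked(producers, 100_000);
            long t1 = System.nanoTime();
            runMailbox(producers, 100_000);
            long t2 = System.nanoTime();
            System.out.printf("producers=%d locked=%d ms mailbox=%d ms%n",
                    producers, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        }
    }
}
```

Both methods return the number of records that ended up in the buffer, which
gives a cheap sanity check that neither model loses data before looking at
the timings.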

>> I probably went into more detail than you wanted to know. :)
Absolutely not. The more, the better :-)

-kr, Gerard.
