I'm not sure I understand this, maybe because the context is missing.
An RDD is immutable, so there is no such thing as writing to an RDD.
I'm not sure which aspect is being referred to as single-threaded. Is
this the Spark Streaming driver?

What is the difference between "streaming into Spark" and "reading
from the stream"? Streaming data into Spark means Spark reads the
stream.

A mini-batch of data is exposed as an RDD, but the stream processing
continues while it is operated on. Saving the RDDs is one of the most
basic output operations exposed by streaming:
http://spark.apache.org/docs/latest/streaming-programming-guide.html#output-operations
No, you do not stop the stream processing to persist it; in fact, you
couldn't.
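
To make the micro-batch model concrete, here is a toy sketch in plain
Python (not Spark itself; the batch sizes and the "save" step are made
up for illustration): a producer keeps emitting mini-batches while a
consumer processes and persists each one, so ingestion never stops for
output -- the same shape as a DStream with an output operation.

```python
# Toy illustration (plain Python, not Spark) of the micro-batch model:
# a producer keeps appending "mini-batches" to a queue while a consumer
# processes and "saves" each batch concurrently.
import queue
import threading

def run_stream(num_batches=5, batch_size=3):
    batches = queue.Queue()
    saved = []  # stands in for an output operation like saveAsTextFiles

    def producer():
        for i in range(num_batches):
            batch = list(range(i * batch_size, (i + 1) * batch_size))
            batches.put(batch)  # expose the mini-batch, like a DStream's RDD
        batches.put(None)  # end-of-stream marker

    def consumer():
        while True:
            batch = batches.get()
            if batch is None:
                break
            # "process" the batch (here, a trivial map) and persist the
            # result; the producer keeps streaming meanwhile -- no need
            # to stop and snapshot anything
            saved.append([x * 2 for x in batch])

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return saved

print(run_stream())
```

Each batch is immutable once handed to the consumer, which mirrors why
"writing to an RDD" isn't a meaningful operation: each interval simply
produces a new RDD.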

On that basis, no, this sounds fairly wrong.

On Tue, Jul 29, 2014 at 1:37 AM, Rohit Pujari <rpuj...@hortonworks.com> wrote:
> Hello folks:
>
> I came across a thread that said
>
> "A Spark RDD read/write access is driven by a context object and is single
> threaded.  You cannot stream into Spark and read from the stream at the same
> time.  You have to stop the stream processing, snapshot the RDD and
> continue"
>
> Can you please offer some insights?
>
>
> Thanks,
> Rohit Pujari
> Solutions Engineer, Hortonworks
> rpuj...@hortonworks.com
> 716-430-6899
>