Re: CallbackServer in PySpark Streaming

Todd Gao Wed, 11 Feb 2015 18:01:05 -0800

Oh I see! Thank you very much, Davies. You have corrected some of my
misunderstandings.
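
To check that I have it right, here is a toy sketch in plain Python. It is
not PySpark's API: ToyRDD, per_batch and the calls log are hypothetical names
made up for illustration, and the "worker" labels only mark where the work
would run in a real cluster. The point is that the per-batch function runs
once in the driver and only describes the per-record work, which is then
applied partition by partition.

```python
# Toy model only: these names are hypothetical and are not PySpark's API.
calls = []  # records (where, what) for each invocation


class ToyRDD:
    """A list of partitions plus a per-record function still to be applied."""

    def __init__(self, partitions, func=None):
        self.partitions = partitions
        self.func = func or (lambda x: x)

    def map(self, f):
        # Runs in the driver: only composes functions, touches no data.
        prev = self.func
        return ToyRDD(self.partitions, lambda x: f(prev(x)))

    def collect(self):
        out = []
        for i, part in enumerate(self.partitions):
            # In real Spark, this loop body would be a task on a worker.
            calls.append(("worker-%d" % i, "rdd compute"))
            out.extend(self.func(x) for x in part)
        return out


def per_batch(rdd):
    # Stands in for the Python function behind a transformed DStream:
    # called once per batch in the driver, it returns a new RDD.
    calls.append(("driver", "dstream compute"))
    return rdd.map(lambda x: x * 2)


batch = ToyRDD([[1, 2], [3, 4]])
result = per_batch(batch).collect()
print(result)
print(calls)
```

If this picture is right, only the driver-side call would ever need the
CallbackServer, which matches what you said about it not running on the
slaves.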


On Thu, Feb 12, 2015 at 9:50 AM, Davies Liu <[email protected]> wrote:

> Yes.
>
> On Wed, Feb 11, 2015 at 5:44 PM, Todd Gao <[email protected]>
> wrote:
> > Thanks Davies.
> > I am not quite familiar with Spark Streaming. Do you mean that the
> > compute routine of DStream is only invoked in the driver node,
> > while only the compute routines of RDD are distributed to the slaves?
> >
> > On Thu, Feb 12, 2015 at 2:38 AM, Davies Liu <[email protected]> wrote:
> >>
> >> The CallbackServer is part of Py4J. It is only used in the driver,
> >> not in the slaves or workers.
> >>
> >> On Wed, Feb 11, 2015 at 12:32 AM, Todd Gao
> >> <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I am reading the code of PySpark and its Streaming module.
> >> >
> >> > In PySpark Streaming, when the `compute` method of an instance of
> >> > PythonTransformedDStream is invoked, a connection to the
> >> > CallbackServer is created internally.
> >> > I wonder where the CallbackServer for each PythonTransformedDStream
> >> > instance is on the slave nodes in a distributed environment.
> >> > Is there a CallbackServer running on every slave node?
> >> >
> >> > thanks
> >> > Todd
> >
> >
>
