Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Davies Liu
The CallbackServer is part of Py4J; it is used only in the driver, not in the slaves or workers. On Wed, Feb 11, 2015 at 12:32 AM, Todd Gao todd.gao.2013+sp...@gmail.com wrote: Hi all, I am reading the code of PySpark and its Streaming module. In PySpark Streaming, when the `compute` method of

Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Todd Gao
Thanks Davies. I am not quite familiar with Spark Streaming. Do you mean that the compute routine of DStream is only invoked in the driver node, while only the compute routines of RDD are distributed to the slaves? On Thu, Feb 12, 2015 at 2:38 AM, Davies Liu dav...@databricks.com wrote: The

Re: CallbackServer in PySpark Streaming

2015-02-11 Thread Todd Gao
Oh I see! Thank you very much, Davies. You have corrected some of my misunderstandings. On Thu, Feb 12, 2015 at 9:50 AM, Davies Liu dav...@databricks.com wrote: Yes. On Wed, Feb 11, 2015 at 5:44 PM, Todd Gao todd.gao.2013+sp...@gmail.com wrote: Thanks Davies. I am not quite familiar with
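
For readers following the thread, the split that Davies confirms (the DStream-level compute step runs once per batch on the driver, while the RDD-level work is distributed) can be pictured with a toy model. This is only a sketch in plain Python and does not use Spark or Py4J: `run_batch`, `transform_func`, `per_record`, and the thread pool standing in for the workers are all hypothetical names invented for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Records which thread ran each kind of function (illustration only).
calls = {"driver": [], "worker": []}


def transform_func(batch):
    """Hypothetical batch-level function: runs once per batch on the 'driver'."""
    calls["driver"].append(threading.current_thread().name)

    def per_record(x):
        # Hypothetical record-level function: runs on the 'workers'.
        calls["worker"].append(threading.current_thread().name)
        return x * 2

    return per_record


def run_batch(batch, func, pool):
    """Toy stand-in for one batch interval; not Spark's actual scheduler."""
    # Driver side: decide what to do with this batch.
    per_record = func(batch)
    # Worker side: apply the record-level function in the pool.
    return list(pool.map(per_record, batch))


batches = [[1, 2, 3], [4, 5]]
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker") as pool:
    results = [run_batch(b, transform_func, pool) for b in batches]

print(results)
print("batch-level calls:", len(calls["driver"]))
print("record-level calls:", len(calls["worker"]))
```

In this model the batch-level function is called once per batch from the calling thread, and only the record-level function ever runs in the pool, which mirrors why a callback channel would be needed on the driver alone.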

CallbackServer in PySpark Streaming

2015-02-11 Thread Todd Gao
Hi all, I am reading the code of PySpark and its Streaming module. In PySpark Streaming, when the `compute` method of the instance of PythonTransformedDStream is invoked, a connection to the CallbackServer is created internally. I wonder where is the CallbackServer for each