As Jeremy said, Spark Streaming has no Python API yet. However, there
are a number of things you can do that allow you to do your main data
manipulation in Python. The Spark API allows the data of a dataset to be
"piped" out to an arbitrary external script (say, a Bash script or a
Python script). Look up the
RDD.pipe()<http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.rdd.RDD>
function. So you can use the Scala- or Java-based Spark Streaming API to read
data from different sources, generate RDDs (the data abstraction in Spark),
and pipe them out to an external Python script for the more complex
processing.
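
To make the Python side of this concrete, here is a rough sketch of what
such an external script could look like. RDD.pipe() writes each element of
the RDD to the script's stdin as one line of text and turns each line the
script prints on stdout into an element of the resulting RDD of strings.
Everything specific below is an assumption for illustration only: the
"device_id,lat,lon" record format, the function names parse_fix and run,
and the rounding step that stands in for the real processing.

```python
import io


def parse_fix(line):
    """Parse one 'device_id,lat,lon' line (hypothetical format).

    Returns a (device_id, lat, lon) tuple, or None if the line is malformed.
    """
    parts = line.strip().split(",")
    if len(parts) != 3:
        return None
    try:
        return parts[0], float(parts[1]), float(parts[2])
    except ValueError:
        return None


def run(infile, outfile):
    """Read one record per line from infile, write one result per line."""
    for line in infile:
        fix = parse_fix(line)
        if fix is None:
            continue  # silently drop malformed records
        device_id, lat, lon = fix
        # Stand-in for the "more complex processing": round the
        # coordinates to two decimal places.
        outfile.write("%s,%.2f,%.2f\n" % (device_id, lat, lon))


# Local simulation of what the piped process would see. When the script is
# actually launched through RDD.pipe(), replace the lines below with:
#     import sys; run(sys.stdin, sys.stdout)
simulated_input = io.StringIO(
    "a1,40.01234,-105.27055\nbad line\nb2,37.7749,-122.4194\n"
)
simulated_output = io.StringIO()
run(simulated_input, simulated_output)
result_lines = simulated_output.getvalue().splitlines()
```

On the Scala side you would then call pipe() on each RDD with the command
that starts the script, for example inside DStream.transform. Note that the
script (and a Python interpreter) has to be available on every worker node,
since the external process is launched where the partitions live.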

TD


On Fri, Feb 21, 2014 at 5:57 PM, Jeremy Freeman <freeman.jer...@gmail.com> wrote:

> There is currently no support for Streaming in the Python API, but I
> believe it's on the roadmap.
>
> -- Jeremy
>
> On Feb 21, 2014, at 6:33 AM, Prasanth Prahladan <prashvirg...@gmail.com>
> wrote:
>
> Hi,
> I am new to Spark, Hadoop and related technologies. I intend to use this
> for gps data stream processing. As I am more comfortable with Python, I
> intend to use Python based technologies for the application development.
>
>
> Is it possible to use the current PySpark API for implementing Stream
> Processing as executed within the Spark Streaming framework?
>
> --
> Regards,
> Prasanth Prahladan