As Jeremy said, Spark Streaming has no Python API yet. However, there are a number of things you can do that allow you to do your main data manipulation in Python. The Spark API allows the data of a dataset to be "piped" out to any arbitrary external script (say, a Bash script or a Python script). Look up the RDD.pipe()<http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.rdd.RDD> function. So you can use the Scala- or Java-based Spark Streaming API to read data from different sources, generate RDDs (the data abstraction in Spark), and pipe them out to an external Python script for the more complex processing.
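To make this a bit more concrete, here is a rough sketch of what the external Python script could look like. pipe() writes each element of a partition to the script's standard input, one per line, and every line the script prints becomes an element of the resulting RDD of strings. The "device_id,lat,lon" record format, the file name gps_filter.py and the function process_line are all made up for illustration; adapt them to your actual GPS data.

```python
#!/usr/bin/env python
"""Hypothetical script for use with RDD.pipe(): one record in per stdin line,
one result out per stdout line."""
import sys


def process_line(line):
    """Parse one 'device_id,lat,lon' record (an assumed format).

    Returns a tab-separated output line, or None if the record is malformed.
    """
    fields = line.strip().split(",")
    if len(fields) != 3:
        return None
    device_id, lat, lon = fields
    try:
        lat, lon = float(lat), float(lon)
    except ValueError:
        return None
    # Placeholder for the "more complex processing": tag the hemisphere.
    hemisphere = "N" if lat >= 0 else "S"
    return "%s\t%s\t%.4f\t%.4f" % (device_id, hemisphere, lat, lon)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Spark feeds the partition's elements on stdin and reads results from stdout.
    for line in stdin:
        result = process_line(line)
        if result is not None:
            stdout.write(result + "\n")


if __name__ == "__main__":
    main()
```

On the Scala side you would then call something along the lines of rdd.pipe("python gps_filter.py") on each RDD of the stream (for example inside DStream.transform). The script has to be available on every worker node, and you can try it locally first with: cat sample.csv | python gps_filter.py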
TD

On Fri, Feb 21, 2014 at 5:57 PM, Jeremy Freeman <freeman.jer...@gmail.com> wrote:

> There is currently no support for Streaming in the Python API, but I
> believe it's on the roadmap.
>
> -- Jeremy
>
> On Feb 21, 2014, at 6:33 AM, Prasanth Prahladan <prashvirg...@gmail.com>
> wrote:
>
> Hi,
> I am new to Spark, Hadoop and related technologies. I intend to use this
> for gps data stream processing. As I am more comfortable with Python, I
> intend to use Python based technologies for the application development.
>
> Is it possible to use the current PySpark API for implementing Stream
> Processing as executed within the Spark Streaming framework?
>
> --
> Regards,
> Prasanth Prahladan