[ https://issues.apache.org/jira/browse/SPARK-5037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-5037:
--------------------------------
    Labels: bulk-closed  (was: )

> support dynamic loading of input DStreams in pyspark streaming
> --------------------------------------------------------------
>
>                 Key: SPARK-5037
>                 URL: https://issues.apache.org/jira/browse/SPARK-5037
>             Project: Spark
>          Issue Type: New Feature
>          Components: DStreams, PySpark
>    Affects Versions: 1.2.0
>            Reporter: Jascha Swisher
>            Priority: Major
>              Labels: bulk-closed
>
> The Scala and Java streaming APIs support "external" InputDStreams (e.g. the 
> ZeroMQReceiver example) through a number of mechanisms, for instance by 
> overriding ActorReceiver or by subclassing Receiver directly. The pyspark 
> streaming API does not currently allow similar flexibility; it is limited at 
> the moment to file-backed text and binary streams or socket text streams.
> It would be great to open up the pyspark streaming API to other stream 
> sources, bringing it closer to parity with the JVM APIs.
> One way of doing this could be to support dynamically loading InputDStream 
> implementations through reflection at the JVM level, analogously to what is 
> currently done for Hadoop InputFormats in the regular pyspark context.py 
> Hadoop methods. 
> I'll submit a PR momentarily with my shot at this. Comments and alternative 
> approaches more than welcome.
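The reflection-based approach described in the report can be sketched in plain Python. The helper below is a hypothetical illustration, not Spark's actual API: it resolves a class from its fully qualified name at runtime, analogous to what `Class.forName` would do for an InputDStream implementation on the JVM side.

```python
import importlib


def load_class(fqcn):
    """Dynamically load a class given its fully qualified name,
    analogous to Class.forName on the JVM side.

    This is an illustrative sketch only; PySpark would additionally
    need to wrap the resulting JVM object in a Python DStream.
    """
    module_name, _, class_name = fqcn.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Example: resolve a standard-library class by name at runtime.
cls = load_class("collections.OrderedDict")
print(cls.__name__)  # -> OrderedDict
```

The same pattern underlies how the existing `context.py` Hadoop methods accept an InputFormat class name as a string and instantiate it reflectively on the JVM.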



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
