[
https://issues.apache.org/jira/browse/SPARK-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Rosen resolved SPARK-867.
------------------------------
Resolution: Won't Fix
I'm going to close this as "Won't Fix" in order to help clear out the Python
issue backlog.
> Add a native Python way to create input RDDs in PySpark
> -------------------------------------------------------
>
> Key: SPARK-867
> URL: https://issues.apache.org/jira/browse/SPARK-867
> Project: Spark
> Issue Type: New Feature
> Components: PySpark
> Reporter: Matei Zaharia
>
> While PySpark can easily read RDDs of Strings from Java/Scala, it would be
> nice to create your own subclass that implements the partitions(),
> preferredLocations() and compute() methods as in Java, to plug in new input
> sources. It shouldn't be too hard to turn this into a special RDD on the Java
> side by, say, parallelizing an array of partition objects and piping them
> through Python.
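
For anyone who finds this issue later, the approach suggested in the description can be approximated in user code today. The following is only a minimal sketch, not an API that PySpark provides: `RangeSource` and `source_to_rdd` are hypothetical names made up for illustration, and `sc` is assumed to be an existing SparkContext. It parallelizes a list of partition descriptors and runs a Python compute() over each one; it does not cover preferredLocations().

```python
class RangeSource:
    """Hypothetical input source: the integers in [0, n), split into chunks."""

    def __init__(self, n, num_partitions):
        self.n = n
        self.num_partitions = num_partitions

    def partitions(self):
        # One small, picklable (start, stop) descriptor per partition.
        step = max(1, -(-self.n // self.num_partitions))  # ceiling division
        return [(lo, min(lo + step, self.n)) for lo in range(0, self.n, step)]

    def compute(self, partition):
        # Produce the records for a single partition descriptor.
        start, stop = partition
        return iter(range(start, stop))


def source_to_rdd(sc, source):
    """Build an RDD by shipping partition descriptors to Python workers.

    `sc` is assumed to be a live pyspark.SparkContext.
    """
    parts = source.partitions()
    # Ask for one slice per descriptor so that each task handles one partition.
    return sc.parallelize(parts, max(1, len(parts))).flatMap(source.compute)
```

With a running context, `source_to_rdd(sc, RangeSource(10, 3)).collect()` should give the integers 0 through 9. Since only the descriptors are shipped from the driver, the records themselves are generated on the workers.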