[ https://issues.apache.org/jira/browse/SPARK-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chandan Kumar closed SPARK-1662.
--------------------------------
    Resolution: Not a Problem

The issue is due to a limitation with Python's pickle mechanism. Probably not worth the effort to use something other than pickle. There are workarounds anyway.

> PySpark fails if python class is used as a data container
> ---------------------------------------------------------
>
>                 Key: SPARK-1662
>                 URL: https://issues.apache.org/jira/browse/SPARK-1662
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.0.0
>         Environment: Ubuntu 14, Python 2.7.6
>            Reporter: Chandan Kumar
>            Priority: Minor
>
> PySpark fails if RDD operations are performed on data encapsulated in Python
> objects (rare use case where plain python objects are used as data containers
> instead of regular dict or tuples).
> I have written a small piece of code to reproduce the bug:
> https://gist.github.com/nrchandan/11394440



--
This message was sent by Atlassian JIRA
(v6.2#6252)
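
The resolution mentions workarounds without naming them. One possible workaround, sketched below, is to convert the objects to plain tuples before handing them to Spark. pickle stores an instance as a reference to its class (module and name) rather than the class code, so the receiving process must be able to import that class; tuples of built-in types avoid the requirement. The `Point` class and the `to_record` / `from_record` helpers are hypothetical names for illustration and are not taken from the reporter's gist.

```python
import pickle


class Point(object):
    """Hypothetical data container; the class in the reporter's gist may differ."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def to_record(point):
    """Flatten a Point into a plain tuple, which pickle can ship by value."""
    return (point.x, point.y)


def from_record(record):
    """Rebuild a Point from a tuple produced by to_record."""
    return Point(*record)


points = [Point(1, 2), Point(3, 4)]

# Only built-in types cross the process boundary here, so the receiving side
# does not need to import Point in order to unpickle the payload.
payload = pickle.dumps([to_record(p) for p in points])
records = pickle.loads(payload)

# Per-record work on plain tuples, as one might do inside rdd.map(...).
sums = [x + y for (x, y) in records]
```

In a PySpark job, the list of tuples would presumably be passed to `sc.parallelize(...)` in place of the list of objects. Another option, assuming the class can live in its own module, is to ship that module to the workers with `sc.addPyFile(...)` so that they can import it.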