[ https://issues.apache.org/jira/browse/SPARK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-5898.
-------------------------------------
Resolution: Fixed
Fix Version/s: 1.3.0
Issue resolved by pull request 4679
[https://github.com/apache/spark/pull/4679]
> Can't create DataFrame from Pandas data frame
> ---------------------------------------------
>
> Key: SPARK-5898
> URL: https://issues.apache.org/jira/browse/SPARK-5898
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Michael Armbrust
> Assignee: Davies Liu
> Priority: Critical
> Fix For: 1.3.0
>
>
> {code}
> data = sqlContext.table("sparkCommits")
> p = data.toPandas()
> sqlContext.createDataFrame(p)
> {code}
> {code}
> ---------------------------------------------------------------------------
> AttributeError Traceback (most recent call last)
> <ipython-input-12-fb4f1895bd2f> in <module>()
> 1 data = sqlContext.table("sparkCommits")
> 2 p = data.toPandas()
> ----> 3 sqlContext.createDataFrame(p)
> /home/ubuntu/databricks/spark/python/pyspark/sql/context.pyc in createDataFrame(self, data, schema, samplingRatio)
> 385 data = self._sc.parallelize(data.to_records(index=False))
> 386 if schema is None:
> --> 387 schema = list(data.columns)
> 388
> 389 if not isinstance(data, RDD):
> AttributeError: 'RDD' object has no attribute 'columns'
> {code}
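
The traceback suggests an ordering problem: line 385 rebinds `data` to the RDD returned by `parallelize`, so line 387 reads `data.columns` from the RDD rather than from the Pandas frame. The sketch below reproduces that pattern in plain Python and shows one plausible way to avoid it, by reading the column names before the conversion. `FakeFrame`, `FakeRDD`, `create_buggy` and `create_fixed` are hypothetical stand-ins, not pyspark or pandas APIs, and this is not necessarily the change made in pull request 4679.

```python
# Hypothetical stand-ins for illustration only; these are NOT the real
# pandas.DataFrame or pyspark RDD classes.
class FakeFrame:
    """Stand-in for a Pandas frame: exposes columns and to_records()."""

    def __init__(self, columns, rows):
        self.columns = columns
        self._rows = rows

    def to_records(self, index=False):
        return [tuple(row) for row in self._rows]


class FakeRDD:
    """Stand-in for an RDD: holds records but has no 'columns' attribute."""

    def __init__(self, records):
        self.records = list(records)


def create_buggy(data, schema=None):
    # Same order as the traceback: 'data' is rebound to the RDD first ...
    data = FakeRDD(data.to_records(index=False))
    if schema is None:
        # ... so this lookup hits the RDD stand-in and raises AttributeError.
        schema = list(data.columns)
    return data, schema


def create_fixed(data, schema=None):
    # Assumed fix: read the column names while 'data' is still the frame.
    if schema is None:
        schema = list(data.columns)
    data = FakeRDD(data.to_records(index=False))
    return data, schema


frame = FakeFrame(["author", "sha"], [("a", "1"), ("b", "2")])

try:
    create_buggy(frame)
    buggy_error = None
except AttributeError as exc:
    buggy_error = str(exc)

rdd, schema = create_fixed(frame)
print(buggy_error)
print(schema)
```

The only difference between the two functions is the order of the two statements. An alternative with the same effect would be to keep the converted RDD in a separate variable instead of reusing `data`.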