Lee moon soo created ZEPPELIN-341:
-------------------------------------

             Summary: Converting pandas dataframe to spark dataframe in Zeppelin does not work
                 Key: ZEPPELIN-341
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-341
             Project: Zeppelin
          Issue Type: Bug
    Affects Versions: 0.5.0
            Reporter: Lee moon soo
             Fix For: 0.6.0

Converting pandas dataframe to spark dataframe does not work in Zeppelin (does work in pyspark shell)

{code}
%pyspark
import pandas as pd
from pyspark.sql import SQLContext
print sc
df = pd.DataFrame([("foo", 1), ("bar", 2)], columns=("k", "v"))
print type(df)
print df
sqlCtx = SQLContext(sc)
sqlCtx.createDataFrame(df).show()
{code}

generates error

{code}
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 162, in <module>
    eval(compiledCode)
  File "<string>", line 8, in <module>
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/context.py", line 406, in createDataFrame
    rdd, schema = self._createFromLocal(data, schema)
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/context.py", line 322, in _createFromLocal
    struct = self._inferSchemaFromList(data)
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/context.py", line 211, in _inferSchemaFromList
    schema = _infer_schema(first)
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/types.py", line 829, in _infer_schema
    raise TypeError("Can not infer schema for type: %s" % type(row))
TypeError: Can not infer schema for type: <type 'str'>
{code}
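
The final TypeError names `str`, not a pandas type. One possible reading, which is an assumption and is not confirmed anywhere in this issue, is that `createDataFrame` did not treat the argument as a pandas DataFrame and iterated over it as a generic iterable. Iterating a pandas DataFrame yields its column labels (here `"k"` and `"v"`), which are strings. The sketch below shows that behaviour and one possible way to hand Spark plain row tuples plus column names instead. `pandas_to_rows` is a hypothetical helper introduced only for illustration, and the workaround has not been verified in Zeppelin.

```python
import pandas as pd


def pandas_to_rows(pdf):
    """Split a pandas DataFrame into (rows, column_names).

    Hypothetical helper for illustration; not part of Spark or Zeppelin.
    """
    columns = [str(c) for c in pdf.columns]
    # to_records(index=False) yields one record per row; tolist() turns each
    # record into a plain tuple of Python values.
    rows = [r.tolist() for r in pdf.to_records(index=False)]
    return rows, columns


df = pd.DataFrame([("foo", 1), ("bar", 2)], columns=("k", "v"))

# Iterating a pandas DataFrame yields its column labels, not its rows.
labels = list(df)

rows, columns = pandas_to_rows(df)

# Inside a %pyspark paragraph (assumes `sc` is provided by Zeppelin,
# as in the report above):
#   sqlCtx = SQLContext(sc)
#   sqlCtx.createDataFrame(rows, columns).show()
```

Passing a list of tuples together with a list of column names avoids depending on pandas detection inside `createDataFrame`, at the cost of materialising every row on the driver.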