[jira] [Created] (ZEPPELIN-341) Converting pandas dataframe to spark dataframe in Zeppelin does not work

Lee moon soo (JIRA) Tue, 13 Oct 2015 03:45:07 -0700

Lee moon soo created ZEPPELIN-341:
-------------------------------------

             Summary: Converting pandas dataframe to spark dataframe in Zeppelin does not work
                 Key: ZEPPELIN-341
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-341
             Project: Zeppelin
          Issue Type: Bug
    Affects Versions: 0.5.0
            Reporter: Lee moon soo
             Fix For: 0.6.0



Converting a pandas DataFrame to a Spark DataFrame fails in a Zeppelin %pyspark 
paragraph, although the same code works in the pyspark shell.

{code}
%pyspark
import pandas as pd
from pyspark.sql import SQLContext
print sc
df = pd.DataFrame([("foo", 1), ("bar", 2)], columns=("k", "v"))
print type(df)
print df
sqlCtx = SQLContext(sc)
sqlCtx.createDataFrame(df).show()
{code}

This generates the following error:

{code}
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 162, in <module>
    eval(compiledCode)
  File "<string>", line 8, in <module>
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/context.py", line 406, in createDataFrame
    rdd, schema = self._createFromLocal(data, schema)
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/context.py", line 322, in _createFromLocal
    struct = self._inferSchemaFromList(data)
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/context.py", line 211, in _inferSchemaFromList
    schema = _infer_schema(first)
  File "/home/bala/Software/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/types.py", line 829, in _infer_schema
    raise TypeError("Can not infer schema for type: %s" % type(row))
TypeError: Can not infer schema for type: <type 'str'>
{code}
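
The <type 'str'> in the error is consistent with the pandas DataFrame being iterated as a plain sequence: iterating a pandas DataFrame yields its column labels ("k", "v"), not its rows. This is an observation, not a confirmed root cause. As a possible workaround (a sketch only, not a fix for the underlying issue), the rows and column names can be extracted explicitly before calling createDataFrame. The helper name pandas_to_rows is hypothetical and is not part of Spark, pandas, or Zeppelin.

```python
def pandas_to_rows(pdf):
    """Split a pandas DataFrame into (rows, column_names).

    Hypothetical helper for illustration. It relies only on the
    DataFrame's `columns` attribute and `itertuples(index=False)`.
    """
    def to_native(value):
        # NumPy scalars expose .item(), which returns a plain Python value;
        # plain Python str/int values are passed through unchanged.
        return value.item() if hasattr(value, "item") else value

    columns = [str(c) for c in pdf.columns]
    rows = [tuple(to_native(v) for v in row)
            for row in pdf.itertuples(index=False)]
    return rows, columns


# In a %pyspark paragraph (assumes `df` and `sqlCtx` from the snippet above):
# rows, columns = pandas_to_rows(df)
# sqlCtx.createDataFrame(rows, columns).show()
```

Passing a list of tuples plus a list of column names avoids relying on createDataFrame recognizing the pandas type, which appears to be the step that behaves differently inside Zeppelin.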




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
