Re: [jira] [Created] (ZEPPELIN-185) z.show does not work on DataFrame in pyspark

IT CTO Mon, 10 Aug 2015 04:49:36 -0700

Does anyone know how to solve this one? My users are using Python, and
iterating through the DataFrame each time is not practical.
Eran


On Sat, Jul 25, 2015 at 10:06 PM Felix Cheung (JIRA) <[email protected]>
wrote:

> Felix Cheung created ZEPPELIN-185:
> -------------------------------------
>
>              Summary: z.show does not work on DataFrame in pyspark
>                  Key: ZEPPELIN-185
>                  URL: https://issues.apache.org/jira/browse/ZEPPELIN-185
>              Project: Zeppelin
>           Issue Type: Bug
>           Components: Core, Interpreters
>     Affects Versions: 0.6.0
>             Reporter: Felix Cheung
>             Assignee: Felix Cheung
>
>
> I’ve tested this out and found these issues. Firstly,
>
>
> http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=createdataframe#pyspark.sql.SQLContext.createDataFrame
> # Code should be changed to this – it does not work in pyspark CLI otherwise
> from pyspark.sql import Row  # needed for Row below
> rdd = sc.parallelize(["1","2","3"])
> Data = Row('first')
> df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
>
> Secondly,
> z.show() doesn’t seem to work properly in Python – I see the same error
> below: "AttributeError: 'DataFrame' object has no attribute
> '_get_object_id'"
> #Python/PySpark – doesn’t work
> rdd = sc.parallelize(["1","2","3"])
> Data = Row('first')
> df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
> print df
> print df.collect()
> z.show(df)
>         AttributeError: 'DataFrame' object has no attribute '_get_object_id'
>
> #Scala – this works
> val a = sc.parallelize(List("1", "2", "3"))
> val df = a.toDF()
> z.show(df)
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
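
A possible workaround until z.show(df) works from pyspark (a sketch only, not something confirmed in this thread): Zeppelin's display system renders paragraph output that starts with %table as a table, with tab-separated cells and one row per line. Assuming that holds for the %pyspark interpreter in your build, the rows can be formatted by hand. The helper name to_zeppelin_table is hypothetical, not part of Zeppelin or PySpark.

```python
def to_zeppelin_table(columns, rows):
    """Build a '%table' string: tab-separated cells, one row per line.

    Hypothetical helper, not a Zeppelin or PySpark API.
    """
    def clean(value):
        # Tabs and newlines inside a cell would break the layout.
        return str(value).replace("\t", " ").replace("\n", " ")

    lines = ["\t".join(clean(c) for c in columns)]
    for row in rows:
        lines.append("\t".join(clean(v) for v in row))
    return "%table " + "\n".join(lines)


# Plain-Python demonstration with the data from the report.
print(to_zeppelin_table(["first"], [("1",), ("2",), ("3",)]))

# Assumed usage in a %pyspark paragraph (untested here):
# print(to_zeppelin_table(df.columns, df.take(100)))
```

Using df.take(n) rather than df.collect() keeps a large DataFrame from being pulled to the driver in full; the limit of 100 is arbitrary.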
