Re: [jira] [Created] (ZEPPELIN-185) z.show does not work on DataFrame in pyspark

IT CTO Tue, 11 Aug 2015 04:35:27 -0700

Hi,
I tested this one and it works for me.
Why is the JIRA bug still open?
Eran


On Mon, Aug 10, 2015 at 7:02 PM IT CTO <[email protected]> wrote:

> Great, I did not know. I will test it tomorrow.
> Eran
>
> On Mon, Aug 10, 2015 at 18:48, Felix Cheung <
> [email protected]> wrote:
>
>> Could you elaborate? Are you referring to working around this issue? The
>> fix for this has been merged.
>>
>> > From: [email protected]
>> > Date: Mon, 10 Aug 2015 11:48:13 +0000
>> > Subject: Re: [jira] [Created] (ZEPPELIN-185) z.show does not work on
>> DataFrame in pyspark
>> > To: [email protected]
>> >
>> > Does anyone know how to solve this one? My users are using Python, and
>> > iterating through the DF each time is not useful.
>> > Eran
>> >
>> > On Sat, Jul 25, 2015 at 10:06 PM Felix Cheung (JIRA) <[email protected]>
>> > wrote:
>> >
>> > > Felix Cheung created ZEPPELIN-185:
>> > > -------------------------------------
>> > >
>> > >              Summary: z.show does not work on DataFrame in pyspark
>> > >                  Key: ZEPPELIN-185
>> > >                  URL:
>> https://issues.apache.org/jira/browse/ZEPPELIN-185
>> > >              Project: Zeppelin
>> > >           Issue Type: Bug
>> > >           Components: Core, Interpreters
>> > >     Affects Versions: 0.6.0
>> > >             Reporter: Felix Cheung
>> > >             Assignee: Felix Cheung
>> > >
>> > >
>> > > I’ve tested this out and found these issues. Firstly,
>> > >
>> > >
>> > >
>> http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=createdataframe#pyspark.sql.SQLContext.createDataFrame
>> > > # Code should be changed to this – it does not work in pyspark CLI
>> > > otherwise
>> > > from pyspark.sql import Row
>> > > rdd = sc.parallelize(["1","2","3"])
>> > > Data = Row('first')
>> > > df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
>> > >
>> > > Secondly,
>> > > z.show() doesn’t seem to work properly in Python – I see the same error
>> > > below: "AttributeError: 'DataFrame' object has no attribute
>> > > '_get_object_id'"
>> > > #Python/PySpark – doesn’t work
>> > > rdd = sc.parallelize(["1","2","3"])
>> > > Data = Row('first')
>> > > df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
>> > > print df
>> > > print df.collect()
>> > > z.show(df)
>> > >         AttributeError: 'DataFrame' object has no attribute
>> > > '_get_object_id'
>> > >
>> > > #Scala – this works
>> > > val a = sc.parallelize(List("1", "2", "3"))
>> > > val df = a.toDF()
>> > > z.show(df)
>> > >
>> > >
>> > >
>> > > --
>> > > This message was sent by Atlassian JIRA
>> > > (v6.3.4#6332)
>> > >
>>
>
>
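
For anyone still seeing the AttributeError quoted above (for example on a
build that does not include the merged fix), one possible workaround is to
print the rows in Zeppelin's %table display format instead of calling
z.show(df). This is only a sketch under assumptions: to_zeppelin_table is a
hypothetical helper, not part of Zeppelin or Spark, and the sample rows
below stand in for df.columns and df.take(n).

```python
def to_zeppelin_table(columns, rows):
    """Build a string in Zeppelin's %table display format.

    Hypothetical helper, not a Zeppelin or Spark API.
    columns: list of column names
    rows: iterable of row tuples or lists (a pyspark Row is a tuple)
    """
    def clean(value):
        # Tabs separate cells and newlines separate rows, so strip both
        # from the cell text.
        return str(value).replace("\t", " ").replace("\n", " ")

    lines = ["\t".join(clean(c) for c in columns)]
    for row in rows:
        lines.append("\t".join(clean(v) for v in row))
    return "%table " + "\n".join(lines)


# Stand-in data; in a pyspark paragraph these could be df.columns and df.take(n).
table = to_zeppelin_table(["first"], [("1",), ("2",), ("3",)])
print(table)
```

In a pyspark paragraph the call might look like
print(to_zeppelin_table(df.columns, df.take(100))), which pulls only the
first 100 rows to the driver rather than the whole DataFrame.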
