Great, I did not know. I will test it tomorrow.
Eran

On Mon, Aug 10, 2015 at 18:48, Felix Cheung <[email protected]> wrote:

> Could you elaborate? Are you referring to working around this issue? The
> fix for this has been merged.
>
> > From: [email protected]
> > Date: Mon, 10 Aug 2015 11:48:13 +0000
> > Subject: Re: [jira] [Created] (ZEPPELIN-185) z.show does not work on
> DataFrame in pyspark
> > To: [email protected]
> >
> > Does anyone know how to solve this one? My users are using Python, and
> > iterating through the DF each time is not useful.
> > Eran
> >
> > On Sat, Jul 25, 2015 at 10:06 PM Felix Cheung (JIRA) <[email protected]>
> > wrote:
> >
> > > Felix Cheung created ZEPPELIN-185:
> > > -------------------------------------
> > >
> > >              Summary: z.show does not work on DataFrame in pyspark
> > >                  Key: ZEPPELIN-185
> > >                  URL:
> https://issues.apache.org/jira/browse/ZEPPELIN-185
> > >              Project: Zeppelin
> > >           Issue Type: Bug
> > >           Components: Core, Interpreters
> > >     Affects Versions: 0.6.0
> > >             Reporter: Felix Cheung
> > >             Assignee: Felix Cheung
> > >
> > >
> > > I’ve tested this out and found these issues. Firstly,
> > >
> > >
> > >
> http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=createdataframe#pyspark.sql.SQLContext.createDataFrame
> > > # Code should be changed to this – it does not work in pyspark CLI
> > > otherwise
> > > from pyspark.sql import Row
> > > rdd = sc.parallelize(["1","2","3"])
> > > Data = Row('first')
> > > df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
> > >
> > > Secondly,
> > > z.show() doesn’t seem to work properly in Python – I see the same error
> > > below: "AttributeError: 'DataFrame' object has no attribute
> > > '_get_object_id'"
> > > #Python/PySpark – doesn’t work
> > > rdd = sc.parallelize(["1","2","3"])
> > > Data = Row('first')
> > > df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
> > > print df
> > > print df.collect()
> > > z.show(df)
> > >         AttributeError: 'DataFrame' object has no attribute
> > > '_get_object_id'
> > >
> > > #Scala – this works
> > > val a = sc.parallelize(List("1", "2", "3"))
> > > val df = a.toDF()
> > > z.show(df)
> > >
> > >
> > >
> > > --
> > > This message was sent by Atlassian JIRA
> > > (v6.3.4#6332)
> > >
>
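
For installations that do not yet have the merged fix, one possible workaround for the z.show(df) failure quoted above is to print the DataFrame in Zeppelin's %table display format (a %table prefix, tab-separated cells, newline-separated rows). The following is only a minimal sketch under that assumption: to_zeppelin_table is a hypothetical helper name, not part of Zeppelin or PySpark, and it expects the rows to have been collected to the driver already.

```python
def to_zeppelin_table(columns, rows):
    """Build a string in Zeppelin's %table display format.

    columns: iterable of column names
    rows: iterable of row tuples (pyspark Row objects behave like tuples)
    NOTE: hypothetical helper for illustration, not a Zeppelin API.
    """
    def clean(value):
        # Tabs and newlines separate cells and rows, so replace them in values.
        return str(value).replace("\t", " ").replace("\n", " ")

    header = "\t".join(clean(c) for c in columns)
    body = ["\t".join(clean(v) for v in row) for row in rows]
    return "%table " + "\n".join([header] + body)


# Stand-in data matching the example in the bug report.
print(to_zeppelin_table(["first"], [("1",), ("2",), ("3",)]))
```

In a %pyspark paragraph this might be called as print(to_zeppelin_table(df.columns, df.limit(1000).collect())), where the limit is an assumed safeguard so that a large DataFrame is not pulled entirely into driver memory.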
