Here is the code. The first paragraph, in pySpark, fails; the second, in
Scala, works.

%pyspark
#This paragraph fails
wordcount = (sc.textFile("some path to file"))
wcDF = wordcount.toDF() #here is where the code fails
z.show(wcDF)

btw, the same code works in scala:

//This paragraph works well
val wordcount = (sc.textFile("some path to file"))
val wcDF = wordcount.toDF()
z.show(wcDF)
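
One difference between the two paragraphs that may be relevant: in Scala, toDF() on an RDD of strings works through the SQLContext implicits, while PySpark cannot infer a schema from bare strings and expects each record to be a tuple, list, dict or Row. Below is a minimal sketch of wrapping each line before calling toDF(). The helper names line_to_row and lines_to_df and the column name "line" are made up for illustration, and this may well not be the cause of the '_get_object_id' error above, so treat it as something to try rather than a confirmed fix.

```python
def line_to_row(line):
    """Wrap a plain string in a one-field tuple so PySpark can infer a schema."""
    return (line,)


def lines_to_df(lines_rdd, column="line"):
    """Turn an RDD of strings into a single-column DataFrame.

    Assumes a SQLContext already exists (Zeppelin normally creates one),
    because toDF() is only attached to RDDs once it has been created.
    """
    return lines_rdd.map(line_to_row).toDF([column])


# Intended use inside a %pyspark paragraph, where Zeppelin provides sc and z:
#   wcDF = lines_to_df(sc.textFile("some path to file"))
#   z.show(wcDF)
```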



On Mon, Jul 20, 2015 at 10:34 AM <felixcheun...@hotmail.com> wrote:

>  Could you post more of your code leading to that?
>
>
>
> On Sun, Jul 19, 2015 at 10:19 PM -0700, "IT CTO" <goi....@gmail.com>
> wrote:
>
>  I am trying to convert the Python RDD to a DF but I am getting an error:
>
>  myRDD_DF = myRDD.toDF()
>
>  error: AttributeError("'list' object has no attribute '_get_object_id'",)
>
>  From what I have read, this has something to do with the Python-to-Java
> conversion, but I am not sure.
> Any help?
>
>  On Mon, Jul 20, 2015 at 4:21 AM <felixcheun...@hotmail.com> wrote:
>
>  You should try to convert the RDD into a DataFrame. Zeppelin can then
> display it as a table automatically.
>
>
>
> On Sun, Jul 19, 2015 at 1:55 AM -0700, "IT CTO" <goi....@gmail.com> wrote:
>
>  Hi,
> I am using pySpark with Zeppelin and would like to print an RDD as a
> table so that it renders in Zeppelin's display system.
> I know how to loop through the records, generate the %table string, and
> print it, but I am looking for a more elegant way.
> I tried z.show(MyRdd) but it failed:
> ... 'PipelinedRDD' object has no attribute '_get_object_id'
>
>  Any help?
> Eran
>
>
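
For reference, the manual approach mentioned in the original question (looping through the records and printing a %table string) can be sketched roughly as follows. Zeppelin's display system renders output that starts with %table as a table, with cells separated by tabs and rows separated by newlines. The helper name to_zeppelin_table and the sample word counts are invented for illustration; in a real paragraph the rows would come from something like myRDD.collect() or myRDD.take(n).

```python
def to_zeppelin_table(header, rows):
    """Build a Zeppelin %table string: tab-separated cells, one row per line."""
    lines = ["\t".join(str(cell) for cell in header)]
    for row in rows:
        lines.append("\t".join(str(cell) for cell in row))
    return "%table " + "\n".join(lines)


# Hypothetical records standing in for the collected contents of an RDD
sample = [("spark", 3), ("zeppelin", 2)]
table = to_zeppelin_table(["word", "count"], sample)
print(table)
```

This only suits small results, since everything is collected to the driver before printing.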
