Thanks for the reply.

It seems strange that in the Scala shell I can do this conversion directly:

scala> sc.parallelize(List(3,2,1,4)).toDF.show
+-----+
|value|
+-----+
|    3|
|    2|
|    1|
|    4|
+-----+

But in PySpark I have to write it as:

sc.parallelize([3,2,1,4]).map(lambda x: (x,1)).toDF(['id','count']).show()
+---+-----+
| id|count|
+---+-----+
|  3|    1|
|  2|    1|
|  1|    1|
|  4|    1|
+---+-----+


So there is a difference between the PySpark and Scala implementations of toDF: Scala's implicits can build a single-column DataFrame from an RDD of plain Ints, while PySpark cannot infer a schema from bare ints.
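
For what it's worth, here is a minimal sketch of one way to get the same single-column result in PySpark, assuming the pyspark shell (so sc is already defined and toDF is patched onto RDDs): wrap each value in a Row so the schema can be inferred.

from pyspark.sql import Row

# Plain ints have no schema PySpark can infer, so wrap each one in a
# Row with an explicit field name; toDF() then yields one column "value".
sc.parallelize([3, 2, 1, 4]).map(lambda x: Row(value=x)).toDF().show()

This prints the same single "value" column as the Scala example, without the dummy count column.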

Thanks
