Hello

For an RDD I can apply the flatMap method:

>>> sc.parallelize(["a few words","ba na ba na"]).flatMap(lambda x: x.split(" ")).collect()
['a', 'few', 'words', 'ba', 'na', 'ba', 'na']
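
To be clear about the behaviour I mean, here is a rough plain-Python sketch of the same semantics: apply a function to every element, then flatten the results by one level. No Spark is involved, and the helper name flat_map is only something I made up for illustration.

```python
from itertools import chain


def flat_map(func, items):
    """Apply func to each item and flatten the returned lists by one level."""
    # func(x) yields one list per input item; chain.from_iterable joins them.
    return list(chain.from_iterable(func(x) for x in items))


# Same input as the RDD example above, split on single spaces.
words = flat_map(lambda x: x.split(" "), ["a few words", "ba na ba na"])
print(words)
```

So each input row can turn into several output rows, which is what I would like to get from the DataFrame.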


But for a DataFrame, how can I apply flatMap in the same way, so that each word ends up in its own row?

>>> df.show()
+----------------+
|           value|
+----------------+
|     a few lines|
|hello world here|
|     ba na ba na|
+----------------+


Thanks
