Hi all,
I want to generate some test data containing about one hundred
million rows.
I created a dataset with ten rows and call df.union on it repeatedly in a
'for' loop, but this causes the operation to happen only on the driver node.
How can I do it on the whole cluster?

2018-12-14
lk_spark
