Re: Avoiding collect but use foreach

2019-02-04 Thread 刘虓
Hi, I think you can turn your Python code into a UDF and call it in foreachPartition.

Aakash Basu wrote on Fri, Feb 1, 2019 at 3:37 PM:
> Hi,
>
> This:
>
> *to_list = [list(row) for row in df.collect()]*
>
> Gives:
>
> [[5, 1, 1, 1, 2, 1, 3, 1, 1, 0], [5, 4, 4, 5, 7, 10, 3, 2, 1, 0], [3, 1,
> 1, 1, 2,
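
A minimal sketch of the per-partition idea suggested above, not necessarily what the poster had in mind. In PySpark, foreachPartition takes an ordinary Python function that receives an iterator over the rows of one partition, so no registered UDF is strictly required. The names rows_to_lists and process_partition are hypothetical helpers introduced here for illustration, and the per-row work is left as a placeholder because the thread does not say what should happen to each row.

```python
from typing import Iterable, Iterator, List, Sequence


def rows_to_lists(rows: Iterable[Sequence]) -> Iterator[List]:
    """Lazily convert each row of one partition into a plain Python list.

    A pyspark.sql.Row is a tuple subclass, so list(row) works on it; plain
    tuples behave the same way, which keeps this testable without Spark.
    """
    for row in rows:
        yield list(row)


def process_partition(rows: Iterable[Sequence]) -> None:
    """Hypothetical per-partition handler; it runs on an executor.

    It returns nothing, so results never travel back to the driver.
    """
    for values in rows_to_lists(rows):
        # Placeholder: replace with the real per-row work,
        # e.g. writing `values` to an external store.
        pass
```

Assuming df is the DataFrame from the original question, df.foreachPartition(process_partition) would run the handler on the executors, and df.rdd.mapPartitions(rows_to_lists) would give a distributed RDD of lists instead of a driver-side list. Neither call brings the data to the driver, so if the full list is needed locally, some form of collect is still required.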

Avoiding collect but use foreach

2019-01-31 Thread Aakash Basu
Hi,

This:

*to_list = [list(row) for row in df.collect()]*

Gives:

[[5, 1, 1, 1, 2, 1, 3, 1, 1, 0], [5, 4, 4, 5, 7, 10, 3, 2, 1, 0], [3, 1, 1, 1, 2, 2, 3, 1, 1, 0], [6, 8, 8, 1, 3, 4, 3, 7, 1, 0], [4, 1, 1, 3, 2, 1, 3, 1, 1, 0]]

I want to avoid collect operation, but still convert the datafr