Re: Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-23 Thread maqy
Hi Jinxin, Thanks for your suggestions, I will try to use foreachPartition later. Best regards, maqy From: Tang Jinxin Sent: April 23, 2020 7:31 To: maqy Cc: Andrew Melo; user@spark.apache.org Subject: Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]? Hi maqy, Thanks

Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-22 Thread Tang Jinxin
Hi maqy, Thanks for your question. After some consideration, I have a few ideas, as follows. First, try not to collect to the driver unless it is necessary; instead, send the data from the executors (using foreachPartition). Second, if you are not already using a high-performance serializer/deserializer such as Kryo, it may be worth a try. As a summary,I
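
A minimal sketch of the two suggestions above, assuming the external machine accepts plain TCP connections. The object name, the host and port, and the comma-joined wire format are hypothetical and do not come from the thread. Whether Kryo helps for a Dataset[Row] workload is also not established here; the setting is shown only because it was suggested.

```scala
import java.io.{BufferedOutputStream, DataOutputStream}
import java.net.Socket

import org.apache.spark.sql.{Dataset, Row, SparkSession}

object SendFromExecutors {

  // Sends every partition straight from the executor that holds it,
  // so the rows never pass through the driver.
  // host/port are hypothetical: the endpoint of the external machine.
  def sendPartitions(ds: Dataset[Row], host: String, port: Int): Unit = {
    // An explicitly typed function value selects the Scala overload
    // of foreachPartition (Iterator[T] => Unit).
    val send: Iterator[Row] => Unit = rows => {
      // One connection per partition, opened on the executor.
      val socket = new Socket(host, port)
      val out = new DataOutputStream(
        new BufferedOutputStream(socket.getOutputStream))
      try {
        // Illustrative wire format only: one comma-joined string per row.
        // writeUTF is limited to 65535 encoded bytes per call.
        rows.foreach(row => out.writeUTF(row.mkString(",")))
        out.flush()
      } finally {
        out.close()
        socket.close()
      }
    }
    ds.foreachPartition(send)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("send-from-executors")
      // Second suggestion: switch the serializer to Kryo.
      .config("spark.serializer",
        "org.apache.spark.serializer.KryoSerializer")
      .getOrCreate()

    val df = spark.range(0, 1000).toDF("id")
    sendPartitions(df, "tf-host.example.com", 9999) // hypothetical endpoint
    spark.stop()
  }
}
```

With this shape the receiving side sees one connection per partition, so it has to accept several connections at once or the dataset has to be repartitioned to a number it can handle.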

Re: Can I collect Dataset[Row] to driver without converting it to Array[Row]?

2020-04-22 Thread maqy
Hi Andrew, Thank you for your reply. I am using the Scala API of Spark, and the TensorFlow machine is not in the Spark cluster. Is this JIRA / PR still valid in this situation? In addition, the current bottleneck of the application is that the amount of data transferred through the