I checked the `collect` of `Dataset`. It calls the `collect` of `RDD` and then applies `decodeUnsafeRows`, so I think the two `collect` methods do different things: the `collect` of `Dataset` is built for Spark SQL. If you really want to use `collectAsync`, write it like this: `df.rdd.collectAsync`
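To make that concrete, here is a minimal sketch, assuming a local `SparkSession` and a toy DataFrame built with `spark.range`. The object name `CollectAsyncSketch`, the helper `collectIds`, and the 60-second timeout are only for illustration and are not from this thread. Going through `df.rdd` converts the rows to `Row` objects first, so it may be slower than `df.collect()`.

```scala
import org.apache.spark.FutureAction
import org.apache.spark.sql.{Row, SparkSession}

import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical example object; the names here are illustrative only.
object CollectAsyncSketch {

  // Builds a one-column DataFrame with ids 0 until n, collects it
  // asynchronously through the underlying RDD, and returns the ids.
  def collectIds(n: Long): Seq[Long] = {
    val spark = SparkSession.builder()
      .master("local[2]") // assumption: local mode, just for the sketch
      .appName("collect-async-sketch")
      .getOrCreate()
    try {
      val df = spark.range(n).toDF("id")

      // df.rdd is an RDD[Row]. collectAsync is not defined on RDD itself;
      // it becomes available through the implicit conversion to AsyncRDDActions.
      val future: FutureAction[Seq[Row]] = df.rdd.collectAsync()

      // The job is already submitted at this point; block only when the
      // result is actually needed.
      Await.result(future, 60.seconds).map(_.getLong(0))
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(collectIds(5))
}
```

`FutureAction` also has a `cancel()` method, so the caller can abort the job instead of waiting for it.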
At 2018-12-24 11:36:14, "Jiaan Geng" <belie...@163.com> wrote:
>RDD itself does not have the method `collectAsync`. There is an implicit
>conversion from RDD to AsyncRDDActions in object RDD. The implicit
>conversion is:
>
>  implicit def rddToAsyncRDDActions[T: ClassTag](rdd: RDD[T]): AsyncRDDActions[T] = {
>    new AsyncRDDActions(rdd)
>  }
>
>The method collect of RDD uses SparkContext.runJob, but the method
>collectAsync of AsyncRDDActions uses SparkContext.submitJob.
>You can use this difference as a reference to implement this function.
>
>--
>Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/