DataFrame is a special case of Dataset: in the Spark 2.0 Scala API, DataFrame is a type alias for Dataset[Row], so the two names refer to the same underlying API. In fact, the ML pipeline API accepts Dataset[_] instead of DataFrame in Spark 2.0. It would be more accurate to say that MLlib will focus on the Dataset-based API for further development.
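To make this concrete, here is a minimal sketch, assuming Spark 2.0 with spark-sql and spark-mllib on the classpath and a local master. The `Doc` case class, the object name, and the sample strings are made up for illustration only. It shows that a DataFrame can be assigned to a Dataset[Row] without conversion, and that an ML transformer such as Tokenizer takes either a typed Dataset or a DataFrame:

```scala
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Doc(id: Long, text: String)

object DatasetVsDataFrame {
  // Returns the tokenized output for a typed Dataset and for a DataFrame.
  def run(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._

    // A typed Dataset[Doc].
    val docs: Dataset[Doc] =
      Seq(Doc(0L, "spark ml pipelines"), Doc(1L, "datasets and dataframes")).toDS()

    // DataFrame is an alias for Dataset[Row], so this assignment
    // compiles without any conversion.
    val df: DataFrame = docs.toDF()
    val rows: Dataset[Row] = df

    // In Spark 2.0, Transformer.transform takes a Dataset[_] and returns
    // a DataFrame, so both inputs below are accepted.
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    (tokenizer.transform(docs), tokenizer.transform(rows))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for demonstration
      .appName("dataset-vs-dataframe")
      .getOrCreate()
    val (fromTyped, fromUntyped) = run(spark)
    fromTyped.show(truncate = false)
    fromUntyped.show(truncate = false)
    spark.stop()
  }
}
```

Both calls return a DataFrame with an added `words` column, which is why "DataFrame-based" and "Dataset-based" describe the same API in this context.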
Thanks
Yanbo

2016-07-10 20:35 GMT-07:00 jinhong lu <lujinho...@gmail.com>:

> Hi,
> Since the DataSet will be the major API in spark2.0, why mllib will
> DataFrame-based, and 'future development will focus on the DataFrame-based
> API.’
>
> Any plan will change mllib form DataFrame-based to DataSet-based?
>
> =============
> Thanks,
> lujinhong