Just sharing some insights from operating SparkML at scale:

- MapReduce may not be the best way to iteratively synchronize partitioned workers.
- Native hardware acceleration is key to adopting the rapid improvements in ML in the foreseeable future.

Chen

On Apr 29, 2019, at 11:02, jincheng sun <sunjincheng...@gmail.com> wrote:
>
> Hi Shaoxuan,
>
> Thanks for the additional effort to enhance the scalability and
> ease of use of Flink ML and take it one step further. Thank you for sharing
> so much context.
>
> A big +1 for this proposal!
>
> I have only one suggestion: since there is little time left before the
> release of Flink 1.9, I recommend adding a detailed
> implementation plan to the FLIP and the Google doc.
>
> What do you think?
>
> Best,
> Jincheng
>
> Shaoxuan Wang <wshaox...@gmail.com> wrote on Mon, Apr 29, 2019 at 10:34 AM:
>
>> Hi everyone,
>>
>> Several months ago, Weihua proposed rebuilding the Flink ML pipeline on top
>> of the Table API in this mail thread:
>>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Embracing-Table-API-in-Flink-ML-td25368.html
>>
>> Luogen, Becket, Xu, Weihua and I have been working on this proposal
>> offline over the past few months. Now we want to share the first phase of
>> the entire proposal as a FLIP. In this FLIP-39, we want to achieve several
>> things (and hope they can be accomplished and released in Flink 1.9):
>>
>> - Provide a new set of ML core interfaces (on top of the Flink Table API)
>> - Provide an ML pipeline interface (on top of the Flink Table API)
>> - Provide interfaces for parameter management and pipeline/model
>>   persistence
>> - All of the above interfaces should facilitate any new ML algorithm. We
>>   will gradually add various standard ML algorithms on top of these newly
>>   proposed interfaces to ensure their feasibility and scalability.
>>
>> Part of this FLIP was presented at Flink Forward 2019 @ San Francisco by
>> Xu and me.
>>
>> https://sf-2019.flink-forward.org/conference-program#when-table-meets-ai--build-flink-ai-ecosystem-on-table-api
>>
>> https://sf-2019.flink-forward.org/conference-program#high-performance-ml-library-based-on-flink
>>
>> You can find the videos & slides at
>> https://www.ververica.com/flink-forward-san-francisco-2019
>>
>> The design document for FLIP-39 can be found here:
>>
>> https://docs.google.com/document/d/1StObo1DLp8iiy0rbukx8kwAJb0BwDZrQrMWub3DzsEo
>>
>> I am looking forward to your feedback.
>>
>> Regards,
>>
>> Shaoxuan
>>