Hi Yun and Zhipeng,

Thanks a lot for starting the discussion. +1 for the Flink ML 2.1.0
release. Looking forward to these ML algorithms. I plan to write a blog
post about PyFlink + Flink ML after the release.

Best,
Xingbo

On Thu, Jun 23, 2022 at 11:15, Zhipeng Zhang <zhangzhipe...@gmail.com> wrote:

> Hi devs,
>
> Yun and I would like to start a discussion on releasing Flink ML
> <https://github.com/apache/flink-ml> 2.1.0.
>
> In the past few months, we focused on improving the Flink ML infra (e.g.
> memory management, benchmark infra, online training, Python support) by
> implementing, benchmarking, and optimizing 9 new algorithms.
> Our results have shown that Flink ML is able to meet or exceed the
> performance of selected algorithms in alternative popular ML libraries.
>
> Please see below for a detailed list of improvements:
>
> - A set of representative machine learning algorithms:
>     - feature engineering:
>         - MinMaxScaler (https://issues.apache.org/jira/browse/FLINK-25552)
>         - StringIndexer (https://issues.apache.org/jira/browse/FLINK-25527)
>         - VectorAssembler (https://issues.apache.org/jira/browse/FLINK-25616)
>         - StandardScaler (https://issues.apache.org/jira/browse/FLINK-26626)
>         - Bucketizer (https://issues.apache.org/jira/browse/FLINK-27072)
>     - online learning:
>         - OnlineKMeans (https://issues.apache.org/jira/browse/FLINK-26313)
>         - OnlineLogisticRegression (https://issues.apache.org/jira/browse/FLINK-27170)
>     - regression:
>         - LinearRegression (https://issues.apache.org/jira/browse/FLINK-27093)
>     - classification:
>         - LinearSVC (https://issues.apache.org/jira/browse/FLINK-27091)
>     - evaluation:
>         - BinaryClassificationEvaluator (https://issues.apache.org/jira/browse/FLINK-27294)
> - A benchmark framework for Flink ML
>   (https://issues.apache.org/jira/browse/FLINK-26443)
> - A website for Flink ML users
>   (https://nightlies.apache.org/flink/flink-ml-docs-stable/)
> - Python support for Flink ML algorithms
>   (https://issues.apache.org/jira/browse/FLINK-26268,
>   https://issues.apache.org/jira/browse/FLINK-26269)
> - Several optimizations for Flink ML infrastructure
>   (https://issues.apache.org/jira/browse/FLINK-27096,
>   https://issues.apache.org/jira/browse/FLINK-27877)
>
> With the improvements and throughput benchmarks we have made, we think it
> is time to release Flink ML 2.1.0, so that interested developers in the
> community can try out the new Flink ML infra to develop algorithms with
> high throughput and low latency.
>
> If there is any concern, please let us know.
>
>
> Best,
> Yun and Zhipeng
>
