[
https://issues.apache.org/jira/browse/FLINK-26828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511623#comment-17511623
]
Yunfeng Zhou commented on FLINK-26828:
--------------------------------------
Hi Jing Ge,
Thanks a lot for the PR! I just had a discussion with Lin Dong and Zhang
Zhipeng. We agree that there should be a flink-ml-examples module in the long
run. For now, however, adding the module raises the following concerns.
* Each operator's test class already contains end-to-end tests, and the example
code would duplicate them. We are not yet sure how to deal with this
duplication.
* Moreover, should the example code demonstrate only the most basic API, like
`fit()` and `transform()`, or also cover higher-level APIs, like save/load and
get/setModelData? The existing test cases already cover all of these levels of
API usage.
* Duplication also exists between the example code and the documentation, as
you illustrated in the PR's description. When adding the flink-ml-examples
module, we would also like to link the example code in this module to the
examples shown in the documentation. Spark's MLlib Guide has achieved this.
* Apart from example code, other parts of the documentation can also be
generated automatically, such as an algorithm's description or its parameter
table. So before linking the example code, we need a complete design for how
the documentation as a whole should be generated. That design has not been
discussed yet.
For the reasons above, we believe that adding example code involves more than
adding a module alone: the module should work coherently with the other
modules. What do you think about adding the example module after the concerns
above have been settled? Please also feel free to share your ideas about these
considerations with us.
Yunfeng
> build flink-ml example module
> -----------------------------
>
> Key: FLINK-26828
> URL: https://issues.apache.org/jira/browse/FLINK-26828
> Project: Flink
> Issue Type: Improvement
> Reporter: Jing Ge
> Assignee: Jing Ge
> Priority: Major
> Labels: pull-request-available
>
> The first example is the KMeans example described in the official Flink ML
> documentation, with some modifications.