> It's mainly used for model evaluation purposes for `ML_EVALUATE`. Different loss functions will be used and different metrics will be output for `ML_EVALUATE` based on the task option of the model. Task option is not necessary if the model is not used in `ML_EVALUATE`. `ML_EVALUATE` also has an overloading method which can override the task type during evaluation.
From your explanation, I personally feel that it might be more appropriate to replace `task` with a word better suited to the scenario. I don't have a good alternative at the moment, so this is just a suggestion.

Best,
Ron

On Wed, May 7, 2025 at 11:24, Hao Li <h...@confluent.io.invalid> wrote:

> Hi Yunfeng, Ron,
>
> Thanks for the feedback.
>
> > it might be better to change the configuration api_key to apikey
> Makes sense. I updated the FLIP.
>
> > Why is it necessary to define the task option in the WITH clause of the
> > Model DDL, and what is its purpose?
> It's mainly used for model evaluation purposes for `ML_EVALUATE`. Different
> loss functions will be used and different metrics will be output for
> `ML_EVALUATE` based on the task option of the model. Task option is not
> necessary if the model is not used in `ML_EVALUATE`. `ML_EVALUATE` also has
> an overloading method which can override the task type during evaluation.
>
> Apart from evaluation, in the future, if model training is supported in
> Flink, it can also serve to determine how the model is trained.
>
> > About the CatalogModel interface, why does it need `getInputSchema` and
> > `getOutputSchema` methods? What is the role of Schema?
> The schema mainly specifies the input and output data types of the model
> when it's used in prediction. During prediction, `ML_PREDICT` takes columns
> from the input table matching the model's input schema types and outputs
> columns based on the model's output schema type.
>
> > Regarding the ModelProvider interface, what is the role of the copy
> > method?
> I think it can be useful in the future if we need to copy it during the
> planning stage and apply mutations to the provider. But it may not be used
> for now. I'm also ok to remove it.
>
> Hope this answers your question.
>
> Thanks,
> Hao
>
> On Tue, May 6, 2025 at 7:49 PM Ron Liu <ron9....@gmail.com> wrote:
>
> > Hi, Hao
> >
> > Thanks for starting this proposal, it's a great feature, +1.
> >
> > Since I was missing some context, I went to FLIP-437. Combining these two
> > FLIPs, I have the following three questions:
> >
> > 1. Why is it necessary to define the task option in the WITH clause of the
> > Model DDL, and what is its purpose? I understand that one model can support
> > various types of tasks such as regression, classification, clustering,
> > etc. But the example you have given gives me the impression that a model
> > can only perform a specific type of task, which confuses me. I think the
> > task option is not needed.
> >
> > 2. About the CatalogModel interface, why does it need the `getInputSchema`
> > and `getOutputSchema` methods? What is the role of Schema?
> >
> > 3. Regarding the ModelProvider interface, what is the role of the copy
> > method? Since I don't know much about the implementation details, I'm
> > curious about what cases need to be copied.
> >
> > Best,
> > Ron
> >
> > On Wed, May 7, 2025 at 09:33, Yunfeng Zhou <flink.zhouyunf...@gmail.com> wrote:
> >
> > > Hi Hao,
> > >
> > > Thanks for the FLIP! It provides a clearer guideline for developers to
> > > implement model functions.
> > >
> > > One minor comment: it might be better to change the configuration api_key
> > > to apikey, which corresponds to GlobalConfiguration.SENSITIVE_KEYS.
> > > Otherwise users’ secrets might be exposed in logs and cause security risks.
> > >
> > > Best,
> > > Yunfeng
> > >
> > > > On Apr 29, 2025, at 07:22, Hao Li <h...@confluent.io.INVALID> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > I would like to start a discussion about FLIP-525 [1]: Model ML_PREDICT,
> > > > ML_EVALUATE Implementation Design. This FLIP is co-authored with Shengkai
> > > > Fang.
> > > >
> > > > This FLIP is a follow-up to FLIP-437 [2] and proposes the implementation
> > > > design for the ML_PREDICT and ML_EVALUATE functions, which were
> > > > introduced in FLIP-437.
> > > >
> > > > For more details, see FLIP-525 [1]. Looking forward to your feedback.
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-525%3A+Model+ML_PREDICT%2C+ML_EVALUATE+Implementation+Design
> > > > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-437%3A+Support+ML+Models+in+Flink+SQL
> > > >
> > > > Thanks,
> > > > Hao
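
To illustrate Yunfeng's point about `api_key` versus `apikey`, here is a minimal Java sketch. It assumes that the sensitivity check is a case-insensitive substring match against a list of fragments, and that the list contains `apikey` but nothing that matches `api_key`. The class, the method names, and the three-entry fragment list are hypothetical stand-ins for illustration, not the actual contents of `GlobalConfiguration.SENSITIVE_KEYS` or any Flink API.

```java
import java.util.List;
import java.util.Locale;

public class SensitiveKeySketch {
    // Hypothetical subset of sensitive key fragments (assumption, not Flink's real list).
    public static final List<String> SENSITIVE_FRAGMENTS =
            List.of("password", "secret", "apikey");

    // Assumed rule: a key is sensitive if its lower-cased form contains any fragment.
    public static boolean isSensitive(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        return SENSITIVE_FRAGMENTS.stream().anyMatch(lower::contains);
    }

    // Renders a key/value pair the way a log line might, masking sensitive values.
    public static String render(String key, String value) {
        return key + "=" + (isSensitive(key) ? "******" : value);
    }

    public static void main(String[] args) {
        // "api_key" contains an underscore, so the fragment "apikey" does not match.
        System.out.println(render("api_key", "my-token"));
        // "apikey" matches the fragment, so the value is masked.
        System.out.println(render("apikey", "my-token"));
    }
}
```

Under these assumptions, a model option named `api_key` would be printed in clear text while `apikey` would be masked, which is the exposure risk raised in the thread.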