Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-09-06 Thread Dong Lin
Hi everyone, Just FYI, if there is no further suggestion on FLIP, we plan to start the voting thread this Friday on 9/10. Thanks, Dong On Fri, Aug 27, 2021 at 10:32 AM Zhipeng Zhang wrote: > Thanks for the post, Dong :) > > We welcome everyone to drop us an email on Flink ML. Let's work

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-26 Thread Zhipeng Zhang
Thanks for the post, Dong :) We welcome everyone to drop us an email on Flink ML. Let's work together to build machine learning on Flink :) On Wed, Aug 25, 2021 at 8:58 PM Dong Lin wrote: > Hi everyone, > > Based on the feedback received in the online/offline discussion in the > past few weeks, we (Zhipeng,

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-25 Thread Dong Lin
Hi everyone, Based on the feedback received in the online/offline discussion in the past few weeks, we (Zhipeng, Fan, myself and a few other developers at Alibaba) have reached agreement on the design to support DAG of algorithms. We have merged the ideas from the initial two options into this

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-23 Thread Becket Qin
Thanks for the comments, Fan. Please see the reply inline. On Thu, Aug 19, 2021 at 10:25 PM Fan Hong wrote: > Hi, Becket, > > Many thanks for your detailed review. I agree that it is easier to involve > more people to discuss if fundamental differences are highlighted. > > > Here are some of my

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-19 Thread Fan Hong
Hi, Becket, Many thanks for your detailed review. I agree that it is easier to involve more people to discuss if fundamental differences are highlighted. Here are some of my thoughts to help other people to think about these differences. (Correct me if these technical details are not right.)

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-19 Thread Fan Hong
Hi, Mingliang and Becket, Thank you for providing a real-world case of heterogeneous topology in the training and inference phases, and Becket has given you two options to choose from. Personally, I think Becket's two options are over-simplified in description, and may be somewhat misleading.

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-18 Thread 洪帆(既起)
Sincerely, Fan Hong -- From: 青雉(祁明良) Sent: Tuesday, August 10, 2021, 11:36 To: dev@flink.apache.org Subject: Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML) Vote for option 2. It is similar to what we are doing with Tensorflow. 1. Define

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-11 Thread Becket Qin
Hi Zhipeng, It looks like there are three different but potentially related things here. 1. How to describe multiple output of a node in the DAG. 2. How to construct / describe the DAG. 3. Do we need an encapsulation class of a DAG, e.g. the Graph class in option 1? It is much easier to discuss

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-10 Thread Zhipeng Zhang
Hi Timo, Becket, Thanks for the feedback. I agree that having named tables can help make the code more readable. Whether there is one output table or multiple output tables, users have to access an output table by a magic index (For the case that there is only one output table, we need to use index

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-10 Thread Becket Qin
Thanks for the feedback, Mingliang. Dong, I think what Mingliang meant by option-2 is the second way mentioned in my email, i.e. having a Graph encapsulation. It does not mean the option 2 in the FLIP. So he actually meant option 1 of the FLIP. Mingliang can correct me if I misunderstood. Hi

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-10 Thread Timo Walther
Hi everyone, I'm not deeply involved in the discussion but I quickly checked out the proposed interfaces because it seems they are using Table API heavily and would like to leave some feedback here: I have the feeling that the proposed interfaces are a bit too simplified. Methods like

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-09 Thread Dong Lin
Thank you Mingliang for providing the comments. Currently option-1 proposes Graph/GraphModel/GraphBuilder to build an Estimator from a graph of Estimator/Transformer, where Estimator could generate the model (as a Transformer) directly. On the other hand, option-2 proposes AlgoOperator that can

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-09 Thread 青雉(祁明良)
Vote for option 2. It is similar to what we are doing with Tensorflow. 1. Define the graph in training phase 2. Export model with different input/output spec for online inference Thanks, Mingliang On Aug 10, 2021, at 9:39 AM, Becket Qin wrote: estimatorInputs

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-09 Thread Becket Qin
Thanks Mingliang. It is super helpful to get your input. At this point, there are two ways mentioned in the FLIP to support heterogeneous topology in training and inference phase. 1. Create two separate DAGs or code for training and inference respectively. 2. An encapsulation API called Graph,

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-09 Thread 青雉(祁明良)
Hi all, This is Mingliang, a machine learning engineer in the recommendation area. I see there’s discussion about “heterogeneous topologies in training and inference.” Actually this is a very common case in recommendation systems, especially in CTR prediction tasks. For training task, usually data is

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-06 Thread Becket Qin
Hi Zhipeng, Yes, I agree that the key difference between the two options is how they support MIMO. My main concern for option 2 is potential inconsistent availability of algorithms in the two sets of API. In order to make an algorithm available to both sets of API, people have to implement the

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-06 Thread Becket Qin
Hi Dong, Sorry for the late reply. I am a bit confused by this description of the semantic change. By "from > Data -> Data conversion to generic Table -> Table", do you mean "Table != > Data"? Yes, I think that Table and Data are not equivalent in this case. It might depend on what people

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-07-20 Thread Dong Lin
Hi Becket, Thank you for the detailed reply! My understanding of your comments is that most of option-1 looks good except its change of the Transformer semantics. Please see my reply inline. On Tue, Jul 20, 2021 at 11:43 AM Becket Qin wrote: > Hi Dong, Zhipeng and Fan, > > Thanks for the

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-07-20 Thread Zhipeng Zhang
Hi Becket, Thanks for the review! I totally agree that it would be easier for people to discuss if we can list the fundamental difference between these two proposals. (So I want to make the discussion even shorter) In my opinion, the fundamental difference between proposal-1 and proposal-2 is

Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-07-19 Thread Becket Qin
Hi Dong, Zhipeng and Fan, Thanks for the detailed proposals. It is quite a lot of reading! Given that we are introducing a lot of stuff here, I find that it might be easier for people to discuss if we can list the fundamental differences first. From what I understand, the very fundamental

[DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-07-01 Thread Dong Lin
Hi all, Zhipeng, Fan (cc'ed) and I are opening this thread to discuss two different designs to extend Flink ML API to support more use-cases, e.g. expressing a DAG of preprocessing and training logics. These two designs have been documented in FLIP-173