Hi devs,

I would like to start a discussion on FLIP-357: Deprecate Iteration API of
DataStream [1].

Currently, the Iteration API of DataStream is incomplete. For instance, it
lacks support for iteration in sync mode and for exactly-once semantics.
Additionally, it does not offer a way to set iteration termination
conditions. As a result, it is hard for developers to build an iteration
pipeline with DataStream in practical applications such as machine learning.

FLIP-176: Unified Iteration to Support Algorithms [2] has introduced a
unified iteration library in the Flink ML repository. This library addresses
all the issues present in the Iteration API of DataStream and can provide a
solution for all iteration use cases. However, maintaining two separate
implementations of iteration, one in the Flink repository and one in the
Flink ML repository, would introduce unnecessary complexity and make the
Iteration API difficult to maintain.

As such, I propose deprecating the Iteration API of DataStream and removing
it completely in the next major version. In the future, if other modules in
the Flink repository need the Iteration API, we can consider extracting all
iteration implementations from the Flink ML repository into an independent
module.

Looking forward to your feedback.


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-357%3A+Deprecate+Iteration+API+of+DataStream
[2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=184615300

Best regards,

Wencong Liu
