[
https://issues.apache.org/jira/browse/FLINK-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757282#comment-16757282
]
Kezhu Wang commented on FLINK-11409:
------------------------------------
[~aljoscha] [~dawidwys] I would like to present example code for this
discussion.
{code:java}
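// Action, OperatorInfo and Event are application-defined types, not Flink classes.
import org.apache.flink.api.common.functions.AbstractRichFunction;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.util.Collector;

public abstract class AbstractFlinkRichFunction<T extends Action>
        extends AbstractRichFunction implements CheckpointedFunction {

    private final OperatorInfo operatorInfo;
    protected transient T action;

    protected AbstractFlinkRichFunction(OperatorInfo operatorInfo) {
        this.operatorInfo = operatorInfo;
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // Open target operator action
    }

    @Override
    public void close() throws Exception {
        // Close target operator action
        super.close();
    }

    @Override
    public void snapshotState(FunctionSnapshotContext snapshotContext) throws Exception {
        // Relay snapshot to target operator action
    }

    @Override
    public void initializeState(FunctionInitializationContext initializationContext) throws Exception {
        // Create operator action based on <T> and operator info
        // Relay initializeState to target operator action
    }
}

public class FlinkFlatMapFunction<T extends Action> extends AbstractFlinkRichFunction<T>
        implements FlatMapFunction<Event, Event> {

    public FlinkFlatMapFunction(OperatorInfo operatorInfo) {
        super(operatorInfo);
    }

    @Override
    public void flatMap(Event value, Collector<Event> out) throws Exception {
        // Relay flatMap to target operator action
    }
}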
{code}
In the code above, `AbstractFlinkRichFunction` handles lifecycle management,
while each `FlinkXyzFunction` handles data processing. This pattern works fine
for `MapFunction`, `FilterFunction`, `SourceFunction` and other functions that
are defined as interfaces. For `ProcessFunction` and similar functions,
however, we have to duplicate `AbstractFlinkRichFunction`, because these
function callbacks are implemented as abstract classes. *Due to Java's single
class inheritance, I think exporting _callback-like APIs_ as classes rather
than interfaces is intrusive and unfriendly to callers.*
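The duplication described above can be illustrated with a minimal, Flink-free sketch. All types in it (`RichBase`, `FlatMapCallback`, `ProcessCallback`, `LifecycleBase`) are hypothetical stand-ins invented for illustration, not Flink's actual classes. They only mirror the shape of the problem: a callback exported as an interface composes with a user-written lifecycle base, while a callback exported as an abstract class forces the lifecycle logic to be written again.

```java
import java.util.ArrayList;
import java.util.List;

class SingleInheritanceSketch {

    // Stand-in for a lifecycle-only base class such as AbstractRichFunction.
    static abstract class RichBase {
        public void open() throws Exception {}
        public void close() throws Exception {}
    }

    // Stand-in for a callback exported as an interface (like FlatMapFunction).
    interface FlatMapCallback<I, O> {
        void flatMap(I value, List<O> out) throws Exception;
    }

    // Stand-in for a callback exported as an abstract class (like ProcessFunction).
    static abstract class ProcessCallback<I, O> extends RichBase {
        public abstract void processElement(I value, List<O> out) throws Exception;
    }

    // Lifecycle logic the user wants to write exactly once.
    static abstract class LifecycleBase extends RichBase {
        final List<String> log = new ArrayList<>();
        @Override public void open() { log.add("open"); }
        @Override public void close() { log.add("close"); }
    }

    // Works: one superclass for lifecycle, one interface for data processing.
    static class MyFlatMap extends LifecycleBase implements FlatMapCallback<String, String> {
        @Override public void flatMap(String value, List<String> out) {
            out.add(value.toUpperCase());
        }
    }

    // Does not compile, because a Java class can have only one superclass:
    // static class MyProcess extends LifecycleBase, ProcessCallback<String, String> {}

    // Workaround: repeat the lifecycle logic on top of the abstract callback class.
    static class MyProcess extends ProcessCallback<String, String> {
        final List<String> log = new ArrayList<>();
        @Override public void open() { log.add("open"); }   // duplicated
        @Override public void close() { log.add("close"); } // duplicated
        @Override public void processElement(String value, List<String> out) {
            out.add(value.toUpperCase());
        }
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        MyFlatMap flatMap = new MyFlatMap();
        flatMap.open();
        flatMap.flatMap("a", out);
        flatMap.close();

        MyProcess process = new MyProcess();
        process.open();
        process.processElement("b", out);
        process.close();

        System.out.println(out + " " + flatMap.log + " " + process.log);
    }
}
```

Both classes behave the same at runtime; the difference is only that `MyProcess` cannot reuse `LifecycleBase` and has to carry its own copy of the lifecycle code.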
Besides this, from an API perspective, I think making `ProcessFunction` and
similar functions subclasses of `AbstractRichFunction` mixes up data
processing with lifecycle management.
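To make this separation concrete, here is a minimal sketch of what an interface-based callback with a Java 8 default method could look like. The types (`RichBase`, `ProcessCallback`, `LifecycleBase`, `UpperCaseProcess`) are hypothetical stand-ins and deliberately much simpler than Flink's real `AbstractRichFunction` and `ProcessFunction`. The sketch only assumes that optional hooks such as `onTimer` can be given no-op defaults.

```java
import java.util.ArrayList;
import java.util.List;

class DefaultMethodSketch {

    // Stand-in for AbstractRichFunction: lifecycle management only.
    static abstract class RichBase {
        public void open() throws Exception {}
        public void close() throws Exception {}
    }

    // Hypothetical interface form of a process-style callback: data processing only.
    interface ProcessCallback<I, O> {
        void processElement(I value, List<O> out) throws Exception;

        // Optional hook with a no-op default, so implementers need not override it.
        default void onTimer(long timestamp, List<O> out) throws Exception {}
    }

    // User-written lifecycle base that can now be shared by every callback kind.
    static abstract class LifecycleBase extends RichBase {
        final List<String> log = new ArrayList<>();
        @Override public void open() { log.add("open"); }
        @Override public void close() { log.add("close"); }
    }

    // A "rich" process function is one declaration: lifecycle base plus callback interface.
    static class UpperCaseProcess extends LifecycleBase
            implements ProcessCallback<String, String> {
        @Override public void processElement(String value, List<String> out) {
            out.add(value.toUpperCase());
        }
    }

    public static void main(String[] args) throws Exception {
        UpperCaseProcess fn = new UpperCaseProcess();
        List<String> out = new ArrayList<>();
        fn.open();
        fn.processElement("flink", out);
        fn.onTimer(0L, out); // inherited default, does nothing
        fn.close();
        System.out.println(out + " " + fn.log);
    }
}
```

With this shape, the lifecycle code lives in a single base class and each data processing function only implements its own callback interface.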
> Make `ProcessFunction`, `ProcessWindowFunction` and etc. pure interfaces
> ------------------------------------------------------------------------
>
> Key: FLINK-11409
> URL: https://issues.apache.org/jira/browse/FLINK-11409
> Project: Flink
> Issue Type: Improvement
> Components: DataStream API
> Reporter: Kezhu Wang
> Priority: Major
> Labels: Breaking-Change
>
> I found that these functions make no opinionated demands on implementing
> classes. It would be nice to define them as interfaces rather than abstract
> classes, since an abstract class is intrusive and hampers callers' use cases.
> For example, a client can't easily write an `AbstractFlinkRichFunction` to
> unify lifecycle management for all data processing functions.
> I dug into the history of some of these functions and found that some were
> converted from interfaces to abstract classes because they needed default
> method implementations. For example, `ProcessFunction` and `CoProcessFunction`
> were converted to abstract classes in FLINK-4460, which predates -FLINK-7242-.
> After -FLINK-7242-, a [Java 8 default
> method|https://docs.oracle.com/javase/tutorial/java/IandI/defaultmethods.html]
> would be a better solution.
> I also notice that some functions introduced after -FLINK-7242-, such as
> `ProcessJoinFunction`, are implemented as abstract classes. I think it would
> be better to establish a well-known principle to guide both API authors and
> callers of data processing functions.
> Personally, I prefer interfaces for all exported function callbacks, for the
> reason I gave in the first paragraph.
> Besides this, with `AbstractRichFunction` and interfaces for data processing
> functions, I think lots of rich data processing functions can be eliminated,
> as they are plain classes extending `AbstractRichFunction` and implementing
> data processing interfaces. Clients can write this in one line of code with
> clear intent for both data processing and lifecycle management.
> The following is a possibly incomplete list of data processing functions
> currently implemented as abstract classes:
> * `ProcessFunction`, `KeyedProcessFunction`, `CoProcessFunction` and
> `ProcessJoinFunction`
> * `ProcessWindowFunction` and `ProcessAllWindowFunction`
> * `BaseBroadcastProcessFunction`, `BroadcastProcessFunction` and
> `KeyedBroadcastProcessFunction`
> All of the above functions are annotated with `@PublicEvolving`, so making
> them interfaces won't break Flink's compatibility guarantee, but
> compatibility is still a big consideration when evaluating this proposal.
> Any thoughts on this proposal? Comments are welcome.