[ 
https://issues.apache.org/jira/browse/FLINK-17012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085785#comment-17085785
 ] 

Stephan Ewen commented on FLINK-17012:
--------------------------------------

I am against an "initialize" phase. There is a lot of existing complexity 
because the tasks don't initialize eagerly in the constructor, but at a later 
stage.
That means that after the invokable instance is created, it is incomplete. That 
leads to a bunch of issues. For example, the instance can already be 
"cancel()"-ed, and cancellation needs to be extra careful because of the 
incomplete initialization. Or checkpoints see a task but need to take extra 
care because the task is not actually fully initialized. Lots of extra status 
checking, easy to forget; it was a frequent source of bugs over the years.
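
To make the problem concrete, here is a minimal sketch of a task with a 
delayed "initialize()" step. All names (DelayedInitDemo, LazyTask, buffers) 
are hypothetical and are not Flink classes; the sketch is single-threaded and 
only meant to show where the status checks come from, and what happens when 
one is forgotten.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names are Flink classes.
final class DelayedInitDemo {

    /** A task whose real setup happens in initialize(), after construction. */
    static final class LazyTask {
        private List<String> buffers;   // stays null until initialize() runs
        private boolean initialized;
        private boolean canceled;

        void initialize() {
            buffers = new ArrayList<>();
            buffers.add("buffer-0");
            initialized = true;
        }

        // cancel() may be called before initialize(), so it needs a guard.
        void cancel() {
            canceled = true;
            if (initialized) {
                buffers.clear();
            }
        }

        // Checkpoints may see the task before it is ready: another guard.
        boolean triggerCheckpoint() {
            return initialized && !canceled;
        }

        // Here the guard was forgotten: this throws a NullPointerException
        // when called before initialize().
        int bufferCount() {
            return buffers.size();
        }
    }

    public static void main(String[] args) {
        LazyTask task = new LazyTask();
        System.out.println("checkpoint before init: " + task.triggerCheckpoint());
        try {
            task.bufferCount();
        } catch (NullPointerException e) {
            System.out.println("forgotten guard led to a NullPointerException");
        }
        task.initialize();
        System.out.println("checkpoint after init: " + task.triggerCheckpoint());
    }
}
```

Every method that can be reached between construction and "initialize()" 
needs such a guard, and real code would additionally need synchronization 
around the flags.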

I would suggest moving more towards "eager initialization", rather than 
hard-wiring the delayed initialization with an "initialize()" method.
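
As a rough sketch of what "eager initialization" could look like (the names 
EagerInitDemo, EagerTask and numBuffers are hypothetical, not Flink's actual 
classes): all setup happens in the constructor, so an instance is either fully 
usable or was never created, and the other methods need no "am I initialized 
yet?" checks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names are Flink classes.
final class EagerInitDemo {

    /** A task that is fully set up by the time the constructor returns. */
    static final class EagerTask {
        private final List<String> buffers;   // final: assigned exactly once
        private boolean canceled;

        EagerTask(int numBuffers) {
            if (numBuffers < 0) {
                throw new IllegalArgumentException("numBuffers must be >= 0");
            }
            // All setup work happens here; if it fails, no instance exists.
            List<String> allocated = new ArrayList<>(numBuffers);
            for (int i = 0; i < numBuffers; i++) {
                allocated.add("buffer-" + i);
            }
            this.buffers = allocated;
        }

        // No initialization guard needed: buffers can never be null here.
        void cancel() {
            canceled = true;
            buffers.clear();
        }

        boolean triggerCheckpoint() {
            return !canceled;
        }

        int bufferCount() {
            return buffers.size();
        }
    }

    public static void main(String[] args) {
        EagerTask task = new EagerTask(2);
        System.out.println("buffers: " + task.bufferCount());
        System.out.println("checkpoint accepted: " + task.triggerCheckpoint());
        task.cancel();
        System.out.println("checkpoint after cancel: " + task.triggerCheckpoint());
    }
}
```

The only remaining state check is the "canceled" flag, which reflects a real 
lifecycle transition rather than an incomplete object.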

As a general thought: My take is that almost always (with very very few 
exceptions), "initialize()" or "configure()" methods should be seen as 
anti-patterns and be avoided if possible.

> Expose stage of task initialization
> -----------------------------------
>
>                 Key: FLINK-17012
>                 URL: https://issues.apache.org/jira/browse/FLINK-17012
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Metrics, Runtime / Task
>            Reporter: Wenlong Lyu
>            Priority: Major
>
> Currently a task switches to running before it is fully initialized; the 
> switch does not take state initialization and operator initialization 
> (#open) into account, which may take a long time to finish. As a result, 
> there is a weird phenomenon where all tasks are running but throughput is 0. 
> I think it would be good if we could expose the initialization stage of tasks. 
> What do you think?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
