[ 
https://issues.apache.org/jira/browse/FLINK-19774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219524#comment-17219524
 ] 

Yuan Mei commented on FLINK-19774:
----------------------------------

Places that need to be changed:

1. Set the parent of the view -> invalid

2. The view is released before being set to null in the subpartition (done)

 

> Introduce sub partition view version for approximate Failover
> -------------------------------------------------------------
>
>                 Key: FLINK-19774
>                 URL: https://issues.apache.org/jira/browse/FLINK-19774
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Yuan Mei
>            Priority: Major
>
>  
> This ticket is to solve a corner case where a downstream task fails 
> multiple times in a row, or an orphan task execution may exist for a short 
> period of time after the new execution is running (as described in the FLIP).
>  
> Here is an idea of how to solve this kind of problem cleanly and thoroughly:
>  # We go with the simplified approach to view release: release a view only 
> before a new one is created (in thread2). That is, we won't clean up the view 
> when the downstream task disconnects ({{releaseView}} would not be called from 
> the reference copy of the view) (in thread1 or 2).
>  ** This would greatly simplify the threading model.
>  ** This won't cause any resource leak, since view release only notifies 
> the upstream result partition to releaseOnConsumption when all subpartitions 
> are consumed in PipelinedSubPartitionView. In our case, we do not release the 
> result partition on consumption anyway (the result partition is tracked 
> in the JobMaster, similar to the ResultPartition blocking type).
>  # Each view is associated with a downstream task execution version.
>  ** This makes sense because we now actually have different versions of the 
> view, corresponding to the vertex.version of the downstream task.
>  ** createView is performed only if the new version to create is greater than 
> the existing one.
>  ** If we decide to create a new view, the old view should be released.
> I think this way we can completely disconnect the old view from the 
> subpartition. Besides that, the working handler in use would always hold the 
> freshest view reference.
>  
> Point 1 has already been addressed in FLINK-19632. This ticket is to address 
> Point 2.
> Detailed discussion in [https://github.com/apache/flink/pull/13648]
>  
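
The versioned view creation described in Point 2 could be sketched roughly as 
follows. This is only an illustration under stated assumptions, not the actual 
Flink code: the classes View and Subpartition, the int version parameter, and 
the choice to return null for a stale request are all hypothetical stand-ins 
introduced for this example.

```java
public class VersionedViewSketch {

    /** Hypothetical stand-in for a subpartition view (not Flink's class). */
    static final class View {
        final int version;
        private volatile boolean released;

        View(int version) {
            this.version = version;
        }

        void release() {
            released = true;
        }

        boolean isReleased() {
            return released;
        }
    }

    /** Hypothetical stand-in for a pipelined subpartition. */
    static final class Subpartition {
        private final Object lock = new Object();
        private View currentView; // guarded by lock

        /**
         * Creates a view for the given downstream execution version.
         * Assumption: a request that is not newer than the current view
         * is treated as stale and answered with null.
         */
        View createView(int version) {
            synchronized (lock) {
                if (currentView != null) {
                    if (version <= currentView.version) {
                        // Request from an older (possibly orphan) execution.
                        return null;
                    }
                    // Release the old view before replacing the reference.
                    currentView.release();
                }
                currentView = new View(version);
                return currentView;
            }
        }

        View getCurrentView() {
            synchronized (lock) {
                return currentView;
            }
        }
    }

    public static void main(String[] args) {
        Subpartition subpartition = new Subpartition();
        View first = subpartition.createView(1);
        View second = subpartition.createView(2);
        View stale = subpartition.createView(1);

        System.out.println("first released: " + first.isReleased());
        System.out.println("second released: " + second.isReleased());
        System.out.println("stale request rejected: " + (stale == null));
    }
}
```

In this sketch the only place a view is released is inside createView, right 
before the newer view replaces it, which mirrors the simplified release model 
from Point 1. The subpartition therefore always holds the view with the 
highest version it has seen.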



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
