[ 
https://issues.apache.org/jira/browse/FLINK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15708810#comment-15708810
 ] 

ASF GitHub Bot commented on FLINK-5114:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/2912

    [FLINK-5114] [network] Handle partition producer state check for 
unregistered executions

    If a partition state request is triggered for a producer that terminates 
before the request arrives, the execution is already unregistered and the 
producer cannot be found. In this case the partition state lookup returns `null` 
and the job fails, although a producer terminating first is perfectly legal.
    
    For these cases, we look up the respective intermediate result partition 
and find the producing execution manually instead of looking it up via the 
registered executions.
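
    Illustrative sketch only, not the code from this pull request: the fallback 
described above could look roughly like the following. All class, field, and 
method names here are hypothetical and heavily simplified (plain strings stand 
in for execution and partition IDs). The check that the partition's producer 
matches the requested execution is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model; these are NOT Flink's real classes.
class ProducerLookupSketch {
    enum State { RUNNING, FINISHED }

    // Executions that are still registered, keyed by execution ID.
    private final Map<String, State> registeredExecutions = new HashMap<>();
    // Producing execution of each result partition; kept after termination.
    private final Map<String, String> producerByPartition = new HashMap<>();
    // Last known state of every execution, including terminated ones.
    private final Map<String, State> lastKnownState = new HashMap<>();

    void start(String executionId, String partitionId) {
        registeredExecutions.put(executionId, State.RUNNING);
        lastKnownState.put(executionId, State.RUNNING);
        producerByPartition.put(partitionId, executionId);
    }

    void finish(String executionId) {
        // A terminated execution is unregistered, so the first lookup misses it.
        registeredExecutions.remove(executionId);
        lastKnownState.put(executionId, State.FINISHED);
    }

    Optional<State> producerState(String executionId, String partitionId) {
        // 1) Regular path: the producer is still registered.
        State registered = registeredExecutions.get(executionId);
        if (registered != null) {
            return Optional.of(registered);
        }
        // 2) Fallback: find the producer via the result partition instead.
        String producer = producerByPartition.get(partitionId);
        if (producer == null || !producer.equals(executionId)) {
            return Optional.empty(); // unknown partition or a different producer
        }
        return Optional.ofNullable(lastKnownState.get(producer));
    }

    public static void main(String[] args) {
        ProducerLookupSketch sketch = new ProducerLookupSketch();
        sketch.start("exec-1", "part-1");
        sketch.finish("exec-1");
        // The producer is unregistered, but the fallback still finds its state.
        System.out.println(sketch.producerState("exec-1", "part-1"));
    }
}
```

    In this sketch a finished producer is reported as finished instead of 
producing a `null` answer, and only a genuinely unknown partition yields an 
empty result.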
    
    I've removed some unused message parameters that have become obsolete with 
other recent refactorings.
    
    This adds a hash map to `IntermediateResult` for lookups by partition ID. I 
didn't dare to change the partition connect logic in other places, which is 
tightly coupled to the partitions being held as an array. As an alternative, we 
could do a linear scan over the partitions, as this lookup happens rarely. The 
memory overhead for the hash map should be acceptable, as it's created once per 
produced result and holds only one entry per partition.
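
    To make the trade-off above concrete, here is a minimal sketch under stated 
assumptions. It is not the `IntermediateResult` code from the pull request: 
`ResultSketch`, `indexViaMap`, and `indexViaScan` are hypothetical names, and 
partition IDs are modeled as plain strings. The array stays in place for 
index-based access, and the map is an additional index over it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a result that holds its partitions as an array.
class ResultSketch {
    // Existing representation: partitions addressed by position.
    private final String[] partitionIds;
    // Additional index: partition ID -> position in the array.
    private final Map<String, Integer> indexById = new HashMap<>();

    ResultSketch(String... partitionIds) {
        this.partitionIds = partitionIds.clone();
        for (int i = 0; i < this.partitionIds.length; i++) {
            indexById.put(this.partitionIds[i], i);
        }
    }

    // Hash map lookup: expected constant time, one extra entry per partition.
    int indexViaMap(String partitionId) {
        Integer index = indexById.get(partitionId);
        return index == null ? -1 : index;
    }

    // Alternative: linear scan, no extra memory, time linear in the partition count.
    int indexViaScan(String partitionId) {
        for (int i = 0; i < partitionIds.length; i++) {
            if (partitionIds[i].equals(partitionId)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ResultSketch result = new ResultSketch("part-a", "part-b", "part-c");
        System.out.println(result.indexViaMap("part-b"));
        System.out.println(result.indexViaScan("part-b"));
    }
}
```

    Both methods return the same position for a known ID and -1 for an unknown 
one, so the choice between them is purely a memory versus lookup-time question.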

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 5114-partition_state

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2912
    
----
commit 6308ff0aba49f026c23c67af4a2f3943b16f2b31
Author: Ufuk Celebi <u...@apache.org>
Date:   2016-11-22T15:15:04Z

    [FLINK-5114] [network] Handle partition producer state check for 
unregistered executions

----


> PartitionState update with finished execution fails
> ---------------------------------------------------
>
>                 Key: FLINK-5114
>                 URL: https://issues.apache.org/jira/browse/FLINK-5114
>             Project: Flink
>          Issue Type: Bug
>          Components: Network
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>
> If a partition state request is triggered for a producer that finishes before 
> the request arrives, the execution is unregistered and the producer cannot be 
> found. In this case the PartitionState returns null and the job fails.
> We need to check the producer location via the intermediate result partition 
> in this case.
> See here: https://api.travis-ci.org/jobs/177668505/log.txt?deansi=true



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
