[ 
https://issues.apache.org/jira/browse/FLINK-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660051#comment-14660051
 ] 

ASF GitHub Bot commented on FLINK-2286:
---------------------------------------

GitHub user gaborhermann opened a pull request:

    https://github.com/apache/flink/pull/994

    [FLINK-2286] [streaming] Wrapped ParallelMerge into stream operator

    Resolved the issue by wrapping the ParallelMerge function into a 
StreamOperator and emitting the remaining records at the end of the data 
processing.
    
    This is not tested (automatically), as I could not reproduce the issue in a 
deterministic way (due to its behaviour depending on time).
    
    It can be reproduced by changing the streaming WordCount to use windows:
    ```java
    final StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();
    
    DataStreamSource<String> text = env.fromElements(
                "To be, or not to be,--that is the question:--",
                "Whether 'tis nobler in the mind to suffer"
    );
    
    DataStream<Tuple2<String, Integer>> counts =
    text.flatMap(new Tokenizer())
                .window(Time.of(1000, TimeUnit.MILLISECONDS))
                .groupBy(0)
                .sum(1)
                .flatten();
    
    counts.print();
    ```
    
    This topology would print nothing before the fix.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gaborhermann/flink FLINK-2286

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/994.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #994
    
----
commit 86096adb688d9233c77986c3012298ae00bfbba0
Author: Gábor Hermann <[email protected]>
Date:   2015-08-06T13:42:54Z

    [FLINK-2286] [streaming] Wrapped ParallelMerge into stream operator

----


> Window ParallelMerge sometimes swallows elements of the last window
> -------------------------------------------------------------------
>
>                 Key: FLINK-2286
>                 URL: https://issues.apache.org/jira/browse/FLINK-2286
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 0.9, 0.10
>            Reporter: Márton Balassi
>            Assignee: Gábor Hermann
>             Fix For: 0.9.1
>
>
> Last windows in the stream that do not have parts at all the parallel 
> operator instances get swallowed by the ParallelMerge.
> To resolve this ParallelMerge should be an operator instead of a function, so 
> the close method can access the collector and emit these.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to