[jira] [Commented] (FLINK-5056) BucketingSink deletes valid data when checkpoint notification is slow.

ASF GitHub Bot (JIRA) Wed, 16 Nov 2016 02:40:08 -0800

    [ https://issues.apache.org/jira/browse/FLINK-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670106#comment-15670106 ]


ASF GitHub Bot commented on FLINK-5056:
---------------------------------------

Github user kl0u commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2797#discussion_r88209971
  
    --- Diff: flink-streaming-connectors/flink-connector-filesystem/src/test/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSinkFaultToleranceITCase.java ---
    @@ -160,7 +167,15 @@ public void postSubmit() throws Exception {
                     while (line != null) {
                         Matcher matcher = messageRegex.matcher(line);
                         if (matcher.matches()) {
    -                        numRead++;
    +                        uniqMessagesRead.add(line);
    +
    +                        // check that in the committed files there are no duplicates
    +                        if (!file.getPath().toString().endsWith(IN_PROGRESS_SUFFIX) && !file.getPath().toString().endsWith(PENDING_SUFFIX)) {
    +                            if (!messagesInCommittedFiles.add(line)) {
    +                                Assert.fail("Duplicate entry in committed bucket.");
    --- End diff --
    
    This test will change once a `dispose()` is introduced in the
    `RichFunction`. The reason for the check is that, given how we close now and
    given that we do not delete pending/invalid files upon restore, we cannot
    verify exactly-once, so this test checks at-least-once instead. I would
    therefore suggest leaving it as is for now and adapting it as soon as the
    other change gets in. When that happens, the method that was not used will
    become the new `close()`, and it will rename all valid pending files to
    committed.
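
    To make the distinction between the two guarantees concrete, here is a
    minimal sketch of what each check would assert over the lines read back
    from the output files. The class and method names (`DeliveryChecks`,
    `atLeastOnce`, `exactlyOnce`) are hypothetical and are not part of the
    pull request or of the Flink test.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, for illustration only; not part of the Flink test. */
class DeliveryChecks {

    /** At-least-once: every expected message was read; duplicates are tolerated. */
    static boolean atLeastOnce(List<String> read, Set<String> expected) {
        return new HashSet<>(read).containsAll(expected);
    }

    /** Exactly-once: the messages read match the expected set and none is repeated. */
    static boolean exactlyOnce(List<String> read, Set<String> expected) {
        Set<String> unique = new HashSet<>(read);
        return unique.equals(expected) && unique.size() == read.size();
    }
}
```

    With leftover pending files still on disk, the same message can show up
    more than once across all files, so only the first check can pass, which
    is why the diff counts unique messages instead of incrementing `numRead`.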


> BucketingSink deletes valid data when checkpoint notification is slow.
> ----------------------------------------------------------------------
>
>                 Key: FLINK-5056
>                 URL: https://issues.apache.org/jira/browse/FLINK-5056
>             Project: Flink
>          Issue Type: Bug
>          Components: filesystem-connector
>    Affects Versions: 1.1.3
>            Reporter: Kostas Kloudas
>            Assignee: Kostas Kloudas
>             Fix For: 1.2.0
>
>
> Currently, if BucketingSink receives no data after a checkpoint and then a 
> notification about a previous checkpoint arrives, it clears its state. This 
> can lead to valid data from intermediate checkpoints, for which a 
> notification has not yet arrived, never being committed. A simple sequence 
> that illustrates the problem:
> -> input data 
> -> snapshot(0) 
> -> input data
> -> snapshot(1)
> -> no data
> -> notifyCheckpointComplete(0)
> The last step clears the state of the sink without committing as final the 
> data that arrived for checkpoint 1.
>  
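
The sequence in the issue description can be replayed against a toy model. The sketch below is not the BucketingSink code: `PendingFilesModel` and its methods are hypothetical, and the "buggy" branch only mimics the behaviour as the report describes it (state cleared when no data has arrived since the last snapshot). The "safe" variant shows one possible way to handle a late notification, by committing and removing only the entries up to the notified checkpoint.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of per-checkpoint pending files; an assumption, not Flink's implementation. */
class PendingFilesModel {
    // checkpoint id -> files written before that snapshot and still awaiting commit
    private final TreeMap<Long, List<String>> pendingPerCheckpoint = new TreeMap<>();
    private final List<String> current = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();

    void write(String file) {
        current.add(file);
    }

    void snapshot(long checkpointId) {
        pendingPerCheckpoint.put(checkpointId, new ArrayList<>(current));
        current.clear();
    }

    /** Mimics the reported behaviour: with no new data, all state is cleared. */
    void notifyCompleteBuggy(long checkpointId) {
        List<String> files = pendingPerCheckpoint.remove(checkpointId);
        if (files != null) {
            committed.addAll(files);
        }
        if (current.isEmpty()) {
            pendingPerCheckpoint.clear(); // also drops entries of later checkpoints
        }
    }

    /** One possible alternative: touch only checkpoints up to and including the notified id. */
    void notifyCompleteSafe(long checkpointId) {
        Map<Long, List<String>> done = pendingPerCheckpoint.headMap(checkpointId, true);
        for (List<String> files : done.values()) {
            committed.addAll(files);
        }
        done.clear(); // headMap is a view, so this removes only the committed entries
    }

    List<String> committed() {
        return committed;
    }

    int pendingCheckpoints() {
        return pendingPerCheckpoint.size();
    }
}
```

Replaying input, snapshot(0), input, snapshot(1), no data, notifyCheckpointComplete(0) against the buggy variant leaves nothing pending for checkpoint 1, so a later notification for checkpoint 1 has nothing to commit. The safe variant keeps that entry until its own notification arrives.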



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
