Github user kl0u commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2797#discussion_r88209971
  
    --- Diff: 
flink-streaming-connectors/flink-connector-filesystem/src/test/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSinkFaultToleranceITCase.java
 ---
    @@ -160,7 +167,15 @@ public void postSubmit() throws Exception {
                                while (line != null) {
                                        Matcher matcher = 
messageRegex.matcher(line);
                                        if (matcher.matches()) {
    -                                           numRead++;
    +                                           uniqMessagesRead.add(line);
    +
    +                                           // check that in the committed 
files there are no duplicates
    +                                           if 
(!file.getPath().toString().endsWith(IN_PROGRESS_SUFFIX) && 
!file.getPath().toString().endsWith(PENDING_SUFFIX)) {
    +                                                   if 
(!messagesInCommittedFiles.add(line)) {
    +                                                           
Assert.fail("Duplicate entry in committed bucket.");
    --- End diff --
    
    This test will change after introducing a `dispose()` in the 
`RichFunction`. The reason for the check is that now when we close and given 
that we do not delete pending/invalid files upon restore, we cannot check the 
exactly-once so this test checks the "at-least" once. So I would suggest to 
leave it as is for now and adapt it as soon as the other change gets in. When 
this happens, the method that was not used will be the new `close()` and it 
will rename all valid pending files to committed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to