[ 
https://issues.apache.org/jira/browse/FLINK-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978974#comment-15978974
 ] 

ASF GitHub Bot commented on FLINK-6306:
---------------------------------------

GitHub user sjwiesman opened a pull request:

    https://github.com/apache/flink/pull/3752

    [FLINK-6306] [filesystem-connectors] Sink for eventually consistent file 
systems

    https://issues.apache.org/jira/browse/FLINK-6306
    
    This PR introduces a bucketer for eventually consistent file systems such 
as Amazon S3, guaranteeing exactly-once output across failures and concurrent 
checkpoints (thank you @StephanEwen). I have attempted to keep the API as 
similar to the BucketingSink as possible, including the shared use of writers 
for specifying the output format. 
    
    Currently the only documentation is Javadoc; once the API is settled, 
I will open another PR with updated documentation.  

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sjwiesman/flink FLINK-6306

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3752.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3752
    
----
commit c778ce282e60577f1d9a105e9cffa295938642b9
Author: Seth Wiesman <[email protected]>
Date:   2017-04-14T19:11:53Z

    FLINK-6306 Sink for eventually consistent file systems
    
    https://issues.apache.org/jira/browse/FLINK-6306

----


> Sink for eventually consistent file systems
> -------------------------------------------
>
>                 Key: FLINK-6306
>                 URL: https://issues.apache.org/jira/browse/FLINK-6306
>             Project: Flink
>          Issue Type: New Feature
>          Components: filesystem-connector
>            Reporter: Seth Wiesman
>            Assignee: Seth Wiesman
>         Attachments: eventually-consistent-sink
>
>
> Currently, Flink provides the BucketingSink as an exactly-once method for 
> writing to a file system. It provides these guarantees by moving files 
> through several stages and deleting or truncating files that get into a bad 
> state. While this is a powerful abstraction, it causes issues with eventually 
> consistent file systems such as Amazon's S3, where most operations (i.e., 
> rename, delete, truncate) are not guaranteed to become consistent within a 
> reasonable amount of time. Flink should provide a sink that offers 
> exactly-once writes to a file system where only PUT operations are considered 
> consistent. 
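
As a rough illustration of the requirement described above, the following is a minimal Python sketch of one way a PUT-only, exactly-once commit could work. It is not the implementation in the pull request. The names InMemoryObjectStore, PutOnlySink, snapshot, restore, and notify_checkpoint_complete are hypothetical, and the object store is an in-memory stand-in. The idea it assumes: records are buffered per checkpoint, and each batch is published with a single PUT under a key derived from the checkpoint id, so a repeated PUT after recovery writes the same bytes to the same key and no rename, delete, or truncate is needed.

```python
from typing import Dict, List


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store where only PUT is relied upon."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Writes a complete object; repeating the call with the same key and
        # bytes leaves the store unchanged.
        self.objects[key] = data


class PutOnlySink:
    """Buffers records per checkpoint and publishes each batch with one PUT."""

    def __init__(self, store: InMemoryObjectStore, prefix: str, subtask: int) -> None:
        self.store = store
        self.prefix = prefix
        self.subtask = subtask
        self.current: List[str] = []            # records since the last snapshot
        self.pending: Dict[int, List[str]] = {}  # snapshotted, not yet published

    def write(self, record: str) -> None:
        self.current.append(record)

    def snapshot(self, checkpoint_id: int) -> Dict[int, List[str]]:
        # Freeze the records seen so far under this checkpoint id and return
        # a copy of all unpublished batches as the state to persist.
        self.pending[checkpoint_id] = self.current
        self.current = []
        return {cid: list(records) for cid, records in self.pending.items()}

    def restore(self, state: Dict[int, List[str]]) -> None:
        # Records written after the snapshot are dropped here; this sketch
        # assumes the source replays them after recovery.
        self.pending = {cid: list(records) for cid, records in state.items()}
        self.current = []

    def notify_checkpoint_complete(self, checkpoint_id: int) -> None:
        # Publish every pending batch up to and including this checkpoint,
        # which also covers concurrent checkpoints completing out of order.
        for cid in sorted(c for c in self.pending if c <= checkpoint_id):
            key = f"{self.prefix}/part-{self.subtask}-{cid}"
            data = "".join(record + "\n" for record in self.pending[cid]).encode("utf-8")
            self.store.put(key, data)
            del self.pending[cid]
```

If a failure happens after the PUT but before a later checkpoint records that the batch was published, the restored sink simply issues the same PUT again. Because the key depends only on the prefix, subtask, and checkpoint id, the output is unchanged.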



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
