[jira] [Commented] (FLINK-6988) Add Apache Kafka 0.11 connector

ASF GitHub Bot (JIRA) Thu, 10 Aug 2017 01:09:53 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121233#comment-16121233
 ]


ASF GitHub Bot commented on FLINK-6988:
---------------------------------------

Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4239
  
    Indeed, it seems you are right: `read_committed` doesn't play well 
with a long `max.transaction.timeout.ms`. I'm not sure about Beam, but in Flink 
we cannot use a single `transactional.id`, because our checkpoints are 
asynchronous: `notifyCheckpointComplete` (which triggers 
`KafkaProducer#commit`) can arrive long after `preCommit`. Until it arrives, we 
cannot reuse the same `transactional.id` for new transactions. 
    
    We can work around this issue by implementing a pool of 
`transactional.id`s, which we can save in the state. On restoring state, this 
will allow us not only to `recoverAndCommit` all pending transactions, but also 
to abort all other unknown "lingering" transactions.
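
A minimal sketch of the pool idea, not the code from the pull request: the class 
`TransactionalIdPool` and all of its method names are hypothetical, and no Kafka 
client is involved. It only shows the bookkeeping: an id stays reserved between 
`preCommit` and `notifyCheckpointComplete`, and on restore every id of the pool 
that is not known to be pending is a candidate for an abort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: a fixed pool of transactional.ids whose pending part is checkpointed. */
class TransactionalIdPool {
    private final List<String> allIds = new ArrayList<>();
    private final Deque<String> free = new ArrayDeque<>();
    private final Set<String> pendingCommit = new HashSet<>();

    TransactionalIdPool(String prefix, int size) {
        for (int i = 0; i < size; i++) {
            String id = prefix + "-" + i;
            allIds.add(id);
            free.add(id);
        }
    }

    /** Takes an id for a new transaction; fails if no id is free. */
    String acquire() {
        String id = free.poll();
        if (id == null) {
            throw new IllegalStateException("transactional.id pool exhausted");
        }
        return id;
    }

    /** At preCommit: the id stays reserved until the checkpoint is confirmed. */
    void markPendingCommit(String id) {
        pendingCommit.add(id);
    }

    /** After the commit triggered by notifyCheckpointComplete: the id may be reused. */
    void release(String id) {
        pendingCommit.remove(id);
        free.add(id);
    }

    /** The ids that would be written to the checkpointed state. */
    Set<String> snapshotPending() {
        return new HashSet<>(pendingCommit);
    }

    /** On restore: every pool id that is not pending may hold a lingering transaction. */
    List<String> idsToAbortOnRestore(Set<String> restoredPending) {
        List<String> toAbort = new ArrayList<>(allIds);
        toAbort.removeAll(restoredPending);
        return toAbort;
    }
}
```

With a pool of three ids and one transaction pending, a restore would commit 
the pending one and abort whatever is open under the other two. The pool size 
has to cover the largest number of checkpoints that can be in flight at once, 
otherwise `acquire` fails.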


> Add Apache Kafka 0.11 connector
> -------------------------------
>
>                 Key: FLINK-6988
>                 URL: https://issues.apache.org/jira/browse/FLINK-6988
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>    Affects Versions: 1.3.1
>            Reporter: Piotr Nowojski
>            Assignee: Piotr Nowojski
>
> Kafka 0.11 (which will be released very soon) adds support for transactions. 
> Thanks to that, Flink might be able to implement a Kafka sink supporting 
> "exactly-once" semantics. The API changes and the overall transaction support 
> are described in 
> [KIP-98|https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging].
> The goal is to mimic the implementation of the existing BucketingSink. The new 
> FlinkKafkaProducer011 would 
> * upon creation, begin a transaction, store the transaction identifiers in the 
> state, and write all incoming data to an output Kafka topic using that 
> transaction
> * on a `snapshotState` call, flush the data and record in the state 
> that the current transaction is pending commit
> * on `notifyCheckpointComplete`, commit this pending transaction
> * in case of a crash between `snapshotState` and `notifyCheckpointComplete`, 
> either abort this pending transaction (if not every participant successfully 
> saved the snapshot) or restore and commit it. 
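
The lifecycle in the quoted description can be sketched with an in-memory 
stand-in for Kafka transactions. This is an illustration under assumptions, not 
the proposed FlinkKafkaProducer011: the class `TwoPhaseCommitSketch`, the nested 
`Txn` type, and the `restore` method are hypothetical, and "commit" here only 
means copying buffered records into a list that plays the role of what a 
`read_committed` consumer would see.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the begin / pre-commit / commit / recover cycle. */
class TwoPhaseCommitSketch {
    /** Stand-in for a Kafka transaction: records become visible only on commit. */
    static final class Txn {
        final List<String> records = new ArrayList<>();
    }

    /** What a read_committed consumer would see. */
    final List<String> committedTopic = new ArrayList<>();
    /** checkpointId -> transaction that is flushed and waiting for its commit. */
    private final TreeMap<Long, Txn> pending = new TreeMap<>();
    /** The transaction begun upon creation. */
    private Txn current = new Txn();

    /** Writes an incoming record using the current transaction. */
    void invoke(String record) {
        current.records.add(record);
    }

    /** snapshotState: the current transaction becomes pending, a new one begins. */
    void snapshotState(long checkpointId) {
        pending.put(checkpointId, current);
        current = new Txn();
    }

    /** notifyCheckpointComplete: commits every pending transaction up to checkpointId. */
    void notifyCheckpointComplete(long checkpointId) {
        Map<Long, Txn> done = pending.headMap(checkpointId, true);
        for (Txn txn : done.values()) {
            committedTopic.addAll(txn.records);
        }
        done.clear(); // the view is backed by `pending`, so this removes the entries there
    }

    /** Recovery: commit what the restored checkpoint covers, abort everything else. */
    void restore(long restoredCheckpointId) {
        notifyCheckpointComplete(restoredCheckpointId);
        pending.clear();     // transactions of checkpoints that never completed
        current = new Txn(); // the transaction that was open at crash time
    }
}
```

Records written after the last successful snapshot are never copied into 
`committedTopic`, which is the property the "exactly-once" sink relies on.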



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
