Re: does the flink sink only support bio?

2018-01-08 Thread Tony Wei
Hi Stefan, Your reply really helps me a lot. Thank you. 2018-01-08 19:38 GMT+08:00 Stefan Richter : > Hi, > > 1. If `snapshotState` failed at the first checkpoint, does it mean there > is no state and no transaction can be aborted by default? > > > This is a general problem and not only limited

Re: does the flink sink only support bio?

2018-01-08 Thread Stefan Richter
Hi, > 1. If `snapshotState` failed at the first checkpoint, does it mean there is > no state and no transaction can be aborted by default? This is a general problem and not only limited to the first checkpoint. Whenever you open a transaction, there is no guaranteed way to store it in persist

Re: does the flink sink only support bio?

2018-01-08 Thread Tony Wei
Hi Stefan, Since TwoPhaseCommitSinkFunction is new to me, I would like to know more details. There are two more questions: 1. If `snapshotState` failed at the first checkpoint, does it mean there is no state and no transaction can be aborted by default? 2. I saw FlinkKafkaProducer011 has a trans

Re: does the flink sink only support bio?

2018-01-04 Thread Stefan Richter
Yes, that is how it works. > Am 04.01.2018 um 14:47 schrieb Jinhua Luo : > > The TwoPhaseCommitSinkFunction seems to record the transaction status > in the state just like what I imagine above, correct? > and if the progress fails before commit, in the later restart, the > commit would be trigger

Re: does the flink sink only support bio?

2018-01-04 Thread Jinhua Luo
The TwoPhaseCommitSinkFunction seems to record the transaction status in the state just like what I imagine above, correct? and if the progress fails before commit, in the later restart, the commit would be triggered again, correct? So the commit would not be forgotten, correct? 2018-01-03 22:54 G

Re: does the flink sink only support bio?

2018-01-03 Thread Stefan Richter
I think a mix of async UPDATES and exactly-once all this might be tricky, and the typical use case for async IO is more about reads. So let’s take a step back: what would you like to achieve with this? Do you want a read-modify-update (e.g. a map function that queries and updates a DB) or just

Re: does the flink sink only support bio?

2018-01-03 Thread Jinhua Luo
No, I mean how to implement exactly-once db commit (given our async io target is mysql), not the state used by flink. As mentioned in previous mail, if I commit db in notifyCheckpointComplete, we have a risk to lost data (lost commit, and flink restart would not trigger notifyCheckpointComplete for

Re: does the flink sink only support bio?

2018-01-03 Thread Stefan Richter
Hi, > > Then how to implement exactly-once async io? That is, neither missing > data or duplicating data. From the docs about async IO here https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/asyncio.html

Re: does the flink sink only support bio?

2018-01-03 Thread Jinhua Luo
Then how to implement exactly-once async io? That is, neither missing data or duplicating data. Is there some way to index data by checkpoint id and records which checkpoints already commit to db? But that means we need MapState, right? However, the async-io operator normally follows other operat

Re: does the flink sink only support bio?

2018-01-03 Thread Stefan Richter
> Am 01.01.2018 um 15:22 schrieb Jinhua Luo : > > 2017-12-08 18:25 GMT+08:00 Stefan Richter : >> You need to be a bit careful if your sink needs exactly-once semantics. In >> this case things should either be idempotent or the db must support rolling >> back changes between checkpoints, e.g. v

Re: does the flink sink only support bio?

2018-01-01 Thread Jinhua Luo
2017-12-08 18:25 GMT+08:00 Stefan Richter : > You need to be a bit careful if your sink needs exactly-once semantics. In > this case things should either be idempotent or the db must support rolling > back changes between checkpoints, e.g. via transactions. Commits should be > triggered for conf

Re: does the flink sink only support bio?

2017-12-08 Thread Stefan Richter
> I have two new questions: > > 1) the async operator must emit some value to the async collector > (even it acts as a sink), right? > I think so, but you should be able to simply return empty collection. > 2) How could I use CheckpointListener with async operator? Could you > give a simple ex

Re: does the flink sink only support bio?

2017-12-08 Thread Jinhua Luo
Thank you very much! I have two new questions: 1) the async operator must emit some value to the async collector (even it acts as a sink), right? 2) How could I use CheckpointListener with async operator? Could you give a simple example or doc page? 2017-12-08 18:25 GMT+08:00 Stefan Richter :

Re: does the flink sink only support bio?

2017-12-08 Thread Stefan Richter
Hi, Flink currently does not offer async sinks out of the box, but there is no fundamental problem against having them and we will probably offer something is this direction in the future. In the meantime, you can build something like this by replacing the sink with an async io operator that ac

does the flink sink only support bio?

2017-12-07 Thread Jinhua Luo
Hi, all. The invoke method of sink seems no way to make async io? e.g. returns Future? For example, the redis connector uses jedis lib to execute redis command synchronously: https://github.com/apache/bahir-flink/blob/master/flink-connector-redis/src/main/java/org/apache/flink/streaming/connecto