Hi all,

I'm using Beam 2.23.0 with the Flink runner (Flink 1.8.2). I want to experiment with
enabling checkpointing and Beam KafkaIO EOS (exactly-once semantics), then adding or
removing source/sink operators and restoring the job from a savepoint. I'm running
Kafka and a Flink cluster (1 job manager, 1 task manager) locally on my machine.
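For reference, this is the shape of the pipeline I'm describing — a minimal sketch, not my exact code; the bootstrap servers, topic names, serializers, and the sinkGroupId value are placeholders:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class EosPipeline {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Source: before the savepoint the pipeline reads only input1.
    p.apply(KafkaIO.<String, String>read()
            .withBootstrapServers("localhost:9092")
            .withTopic("input1")
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            .withoutMetadata())
     // Sink: the exactly-once write. The sinkGroupId identifies the sink's
     // state across restores, which is why scenario 4 below requires
     // changing it after a sink topic is removed.
     .apply(KafkaIO.<String, String>write()
            .withBootstrapServers("localhost:9092")
            .withTopic("output")
            .withKeySerializer(StringSerializer.class)
            .withValueSerializer(StringSerializer.class)
            .withEOS(1, "eos-sink-group"));

    p.run().waitUntilFinish();
  }
}
```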

Here are the scenarios I tried:

1. Add a source topic after the savepoint
Before the savepoint: read from input1 and write to output
Take a savepoint
After the savepoint: read from input1 and input2, write to output
Result: no data from input2 ever appears in output

2. Remove a source topic after the savepoint
Before the savepoint: read from input1 and input2, write to output
Take a savepoint
After the savepoint: read from input1 and write to output
Result: runs normally; output only contains data from input1

3. Add a sink topic after the savepoint
Before the savepoint: read from input1 and write to output1
Take a savepoint
After the savepoint: read from input1, write to output1 and output2
Result: the pipeline fails with an exception
(exception screenshot attached: image.png)

4. Remove a sink topic after the savepoint
Before the savepoint: read from input1, write to output1 and output2
Take a savepoint
After the savepoint: read from input1 and write to output1
Result: it requires changing the sinkGroupId, otherwise I get an exception
(exception screenshot attached: image.png)
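For context, the savepoint/restore flow follows the standard Flink CLI; the job ID, savepoint path, and jar name below are placeholders:

```shell
# Trigger a savepoint for the running job
flink savepoint <jobId> /tmp/savepoints

# Cancel the job, then resubmit the modified pipeline from the savepoint
flink cancel <jobId>
flink run -s /tmp/savepoints/savepoint-<id> \
    --allowNonRestoredState \
    beam-pipeline.jar

# --allowNonRestoredState lets the restore proceed when operators were
# removed (scenarios 2 and 4); without it Flink refuses to discard the
# state that no longer has a matching operator.
```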

It looks like once the source or sink is changed, the job basically cannot be restored
from a savepoint. Is this the expected behavior in Flink, or could it be caused by how
Beam KafkaIO's exactly-once sink is implemented, or is it a problem with my configuration?

谢谢!
Eleanore
