syhily commented on code in PR #20725:
URL: https://github.com/apache/flink/pull/20725#discussion_r960280066
##########
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/enumerator/PulsarSourceEnumStateSerializer.java:
##########
@@ -54,57 +55,37 @@ private PulsarSourceEnumStateSerializer() {
@Override
public int getVersion() {
- // We use PulsarPartitionSplitSerializer's version because we reuse this class.
- return PulsarPartitionSplitSerializer.CURRENT_VERSION;
+ return CURRENT_VERSION;
}
@Override
public byte[] serialize(PulsarSourceEnumState obj) throws IOException {
- // VERSION 0 serialization
try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
DataOutputStream out = new DataOutputStream(baos)) {
serializeSet(
out, obj.getAppendedPartitions(),
SPLIT_SERIALIZER::serializeTopicPartition);
- serializeSet(
- out,
- obj.getPendingPartitionSplits(),
- SPLIT_SERIALIZER::serializePulsarPartitionSplit);
Review Comment:
This is almost the same logic as in Kafka. Serializing the assigned-splits state is useless because it would change after rescaling anyway. The `appendedPartitions` set in the checkpoint already assumes that every operation performed before the snapshot has completed successfully, which means no pending splits exist at checkpoint time.
See:
https://github.com/apache/flink/blob/8b8245ba46b25c2617d91cff3d3a44b99879d9f2/flink-core/src/main/java/org/apache/flink/api/connector/source/SplitEnumerator.java#L73
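As a minimal, hedged sketch of this idea (stand-alone Java using only plain `java.io`; the names `EnumStateSketch` and `serializeAppendedPartitions` are hypothetical, and partitions are simplified to plain strings instead of Pulsar's `TopicPartition`): only the appended partitions are written into the enumerator checkpoint, and nothing is written for pending splits.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical, not Flink API. */
public final class EnumStateSketch {

    private EnumStateSketch() {}

    /**
     * Serializes only the set of appended partitions. Because the enumerator's
     * snapshotState() is called only after all previously handled operations
     * (split assignments, reader registrations) have completed, there are no
     * pending splits that would need to be written into the checkpoint.
     */
    public static byte[] serializeAppendedPartitions(Set<String> appendedPartitions)
            throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(baos)) {
            out.writeInt(appendedPartitions.size());
            for (String partition : appendedPartitions) {
                out.writeUTF(partition);
            }
            out.flush();
            return baos.toByteArray();
        }
    }
}
```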
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]