Hello all, I would like to discuss the release of Pulsar Connectors v4.0.0.
The flink-connector-pulsar 4.0.0 release adds many new features and bug fixes. It also introduces some breaking changes.

Improvements

[FLINK-28083] PulsarSource works with object-reusing DeserializationSchema.
This change reuses the Message instance created by the Pulsar client. No duplicate objects are created, which improves source performance.

[FLINK-29709] Bump the Pulsar to latest 2.10.2.
Bundles the latest Pulsar client, which is more stable with transactions.

[FLINK-30413] Drop subscription support, remove unordered consumption.
The Shared and Key_Shared subscriptions are no longer supported in this release. They were buggy and performed poorly. All subscriptions created by the connector are now Exclusive by default.

[FLINK-28870] Improve the Pulsar source performance when meeting small data rates.
The old connector would hang for 10 seconds when consuming messages at a low rate. We have removed this limitation and made it faster to switch to another topic partition.

[FLINK-28351] Add dynamic sink topic support for Pulsar connector.
Pulsar supports writing to a topic that does not exist yet. The connector supports this as of this release.

[FLINK-30654][Connector/Pulsar] Force consumption from StartCursor every time the application starts.
Previously, the StartCursor was used only when the subscription did not exist in Pulsar. The connector now also supports using the position from the StartCursor every time the application starts.

New Features

[FLINK-26027] Expose Pulsar producer metrics and add FLIP-33 sink metrics.
You can now monitor all the Pulsar client metrics. The FLIP-33 sink metrics are also supported by default, giving a clear view of how many messages have been written.

[FLINK-25686] Support schema evolution for Pulsar source.
Schema evolution was already supported in the Pulsar sink. This release adds a new option to enable it in the source.

[FLINK-28082] Add end-to-end encryption support for Pulsar connector.
End-to-end encryption encrypts the messages from sink to source in the connector, with an extra signature check. No one except the connector can see the real content.

[FLINK-30689] Support sending message bytes with extra schema check.
The connector sends messages as bytes by default. This feature validates the message bytes against the latest schema on Pulsar.

[FLINK-30622] Support consuming messages with schema auto-detection from Pulsar.
Some topics may contain multiple types of messages. Use this feature to convert these messages into a more general interface.

Bug Fixes

[FLINK-30552] Drop next message id calculation, use resetCursor api to exclude the given message.
The old connector had a critical bug when consuming messages that were sent to Pulsar in batches. We have fixed the bug, and all message types are supported as of this release.

Breaking Changes

Some classes and interfaces annotated with @PublicEvolving have been changed in this release. This may affect end users.

Pulsar Sink

PulsarMessage should be created with the PulsarMessage.builder() method. The PulsarMessageBuilder can no longer be created directly.

The route method in the TopicRouter interface should return a TopicPartition instead of a topic name.

All the static methods in PulsarSerializationSchema have been removed. Set the schema (Schema, SerializationSchema) directly with PulsarSinkBuilder.setSerializationSchema().

Pulsar Source

The subscription type setting has been removed from the connector. All of the connector's subscriptions are created with the Exclusive type. Checkpoints taken with Shared or Key_Shared subscriptions can no longer be used as of this release.

The setSubscriptionType method has been removed from the PulsarSourceBuilder.

The fromMessageTime method has been removed from the StartCursor. Pulsar doesn't support seeking from message time.

The deprecated open method and the keyShareMode method have been removed from the RangeGenerator interface.
This is because the Key_Shared subscription is no longer supported. The RangeGenerator is now used only in Exclusive subscriptions to filter the desired keys.

TopicPartition now exposes only two constructors with the @PublicEvolving annotation.

All the static methods in PulsarDeserializationSchema have been removed. Set the schema (Schema, DeserializationSchema) directly with PulsarSourceBuilder.setDeserializationSchema().

The first argument of the open method in PulsarDeserializationSchema has been changed from InitializationContext to a new PulsarInitializationContext interface.

Weijie Guo will be the release manager.

Thanks,
Yufan
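
P.S. For anyone with a custom TopicRouter, the change to the route method's return type (a TopicPartition instead of a topic name) might look roughly like the sketch below. This is only an illustration: TopicPartition, TopicRouter and KeyHashRouter here are simplified, hypothetical stand-ins defined inside the example so it compiles on its own. They are not the connector's real classes, and the real route method may take different arguments.

```java
import java.util.List;

public class TopicRouterSketch {

    /** Hypothetical stand-in for the connector's TopicPartition: a topic name plus a partition index. */
    public static final class TopicPartition {
        public final String topic;
        public final int partitionId;

        public TopicPartition(String topic, int partitionId) {
            this.topic = topic;
            this.partitionId = partitionId;
        }

        @Override
        public String toString() {
            return topic + "-partition-" + partitionId;
        }
    }

    /** Hypothetical, simplified router contract: returns a TopicPartition rather than a topic name. */
    public interface TopicRouter<T> {
        TopicPartition route(T message, String key, List<TopicPartition> partitions);
    }

    /** Routes by key hash; messages without a key go to the first available partition. */
    public static final class KeyHashRouter<T> implements TopicRouter<T> {
        @Override
        public TopicPartition route(T message, String key, List<TopicPartition> partitions) {
            if (partitions.isEmpty()) {
                throw new IllegalArgumentException("No partitions available for routing");
            }
            if (key == null) {
                return partitions.get(0);
            }
            // floorMod keeps the index non-negative even when hashCode() is negative.
            return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
        }
    }

    public static void main(String[] args) {
        List<TopicPartition> partitions = List.of(
                new TopicPartition("orders", 0),
                new TopicPartition("orders", 1));
        TopicRouter<String> router = new KeyHashRouter<>();
        // The router hands back one of the TopicPartition objects it was given.
        System.out.println(router.route("payload", "customer-42", partitions));
    }
}
```

The point of the sketch is that the router picks from the list of partition objects it was given and returns one of them, instead of building or returning a topic name string.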