This is an automated email from the ASF dual-hosted git repository. penghui pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/pulsar.git
The following commit(s) were added to refs/heads/master by this push: new 735acee6072 [improve][pip] PIP-414: Enforce topic consistency check (#24213) 735acee6072 is described below commit 735acee6072e45e16b1f4a02d038c475ddbdf541 Author: Zixuan Liu <node...@gmail.com> AuthorDate: Tue Jun 17 12:26:18 2025 +0800 [improve][pip] PIP-414: Enforce topic consistency check (#24213) --- pip/pip-414.md | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) diff --git a/pip/pip-414.md b/pip/pip-414.md new file mode 100644 index 00000000000..6b76882457a --- /dev/null +++ b/pip/pip-414.md @@ -0,0 +1,101 @@ +# PIP-414: Enforce topic consistency check + +## Background Knowledge + +In Apache Pulsar, topics can be either non-partitioned or partitioned. A non-partitioned topic handles all messages on a +single topic instance, while a partitioned topic is logically split across multiple partitions for parallelism and +scalability. Partitioned topic is backed by multiple internal non-partitioned topics, one per partition (e.g., +`topic-partition-0`, `topic-partition-1`, etc.). + +## Motivation + +Apache Pulsar allows clients to produce to or consume from a fully qualified partitioned topic name (e.g., +`public/default/test-partitioned-topic-partition-N`), even if the partitioned topic metadata does not yet exist. If topic +auto-creation is enabled and `allowAutoTopicCreationType=non-partitioned` is set, the broker will automatically create a +non-partitioned topic for that partition name. + +This behavior can lead to inconsistencies: users expecting to interact with a partitioned topic may unknowingly create +non-partitioned topics for individual partitions. As a result, listing topics may yield unexpected structures, with +partitioned and non-partitioned topics mixed under similar names. + +While this issue can also occur in GEO-replicated environments due to asynchronous metadata propagation, the core +problem is the lack of topic validation. Disabling this behavior and enforcing +strict topic-type consistency is essential to ensure clarity, prevent accidental misconfigurations, and maintain the +integrity of topic structures in Pulsar. + +## Goals + +### In Scope + +- Introduce validation to enforce consistency between topic metadata and topic type. +- Prevent producing or consuming from a topic when metadata indicates a mismatch. + +## High Level Design + +The proposed solution adds server-side validation to ensure that the topic type matches the metadata. + +- When accessing a non-partitioned topic, the broker checks that no partition metadata exists(which shouldn't be there). +- When accessing a partitioned topic (via a `-partition-N` name), the broker ensures that valid partition metadata + exists and that the topic is indeed part of a partitioned group. + +- If these checks fail, the broker rejects the client request with an appropriate error. + +## Detailed Design + +### Design & Implementation Details + +- Validation is added when loading the topic. +- If a mismatch is detected: + - Non-partitioned topic with partition metadata → reject request. + - Partitioned topic (e.g., `-partition-0`) with missing metadata → reject request. + +### Public-facing Changes + +## Monitoring + +Operators should monitor for client/broker errors indicating rejected produce/consume requests due to topic-type mismatches. +These errors help identify misconfigured clients. + +## Backward & Forward Compatibility + +### Upgrade + +No special upgrade procedure is required. Once upgraded, clients/brokers attempting invalid topic-type interactions will +receive an error. + +If you encounter the error `Partition metadata not found for the partitioned topic`, you can recreate the metadata using +the following command: + +``` +bin/pulsar-admin topics create-partitioned-topic $TOPIC -p $PARITIONS +``` + +This behavior is supported by [#24225](https://github.com/apache/pulsar/pull/24225). + +### Downgrade / Rollback + +On downgrade, previous behavior resumes: mismatched topic types may be silently created or accessed. Operators should +monitor carefully if a downgrade is necessary. + +### Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations + +The old version may allow inconsistent topic-type usage while newer ones enforce validation. + +# Alternatives + +@BewareMyPower proposed adding a configuration option that `validatePartitionMetadataConsistency` to allow enabling or +disabling this validation behavior. While this would provide flexibility, we believe it is not the right direction. + +Pulsar already has an extensive set of configuration parameters. Introducing more options for narrowly scoped behaviors +increases maintenance overhead and adds to user confusion. Since topic-type consistency is fundamental to correct Pulsar +usage, this validation should be enforced by default, not made optional. + +## General Notes + +This change increases Pulsar robustness and predictability by enforcing stricter topic-type expectations. + +## Links + +* Implement PR: https://github.com/apache/pulsar/pull/24118 +* Mailing List discussion thread: https://lists.apache.org/thread/v7v889qzfjpqznm713wvogmx2bs4pkkb +* Mailing List voting thread: https://lists.apache.org/thread/92zpyqwbrfbjrsxbh1dsqksn65scsc5m