RobertIndie commented on code in PR #24328: URL: https://github.com/apache/pulsar/pull/24328#discussion_r2244525811
##########
pip/pip-420.md:
##########
@@ -0,0 +1,276 @@
+# PIP-420: Provides an ability for Pulsar clients to integrate with third-party schema registry service
+
+# Motivation
+
+Apache Pulsar currently provides a built-in schema management system tightly coupled with the broker.
+Pulsar clients interact with this system implicitly when creating producers and consumers.
+
+However, many organizations already have independent schema registry services (such as Confluent Schema Registry)
+and wish to reuse their existing schema governance processes across multiple messaging systems, including Pulsar.
+
+By enabling Pulsar clients to integrate with third-party schema registry services:
+- Users can unify schema management across different platforms.
+- Pulsar brokers can be decoupled from schema storage and validation responsibilities.
+- Pulsar users can integrate with ecosystems that rely on external schema registries easier.
+
+This flexibility is particularly valuable for enterprises with strict schema validation, versioning,
+and governance workflows already centralized in external registries.
+
+# Goals
+
+## In Scope
+
+- Provide the ability for Pulsar clients to leverage third-party schema registry services for schema operations.
+
+## Out Scope
+
+- Providing built-in implementations for third-party schemas.
+- Support `AutoProduceBytesSchema` and `AutoConsumeSchema`.
+- Migrating existing Pulsar-managed schemas to external schema registries.
+
+# High Level Design
+
+- Provide a mechanism to configure the Pulsar client to use either:
+  - The existing Pulsar schema registry (default)
+  - Third-party schema registry implementations
+
+# Detailed Design
+
+## Design & Implementation Details
+
+This PIP aims to enable the Pulsar client to directly integrate with external schema registry services for schema management.
+In this model, the external schema registry is fully responsible for schema storage, retrieval, and validation.
+The Pulsar broker will no longer manage schema data for topics using external schemas.
+
+### SchemaType: EXTERNAL
+
+Pulsar will introduce a new schema type: **SchemaType.EXTERNAL**.
+
+- All schemas that integrate with external schema registries must declare `SchemaType.EXTERNAL`.
+- When using `EXTERNAL` schema type, the Pulsar client will provide empty schema data to the broker.
+- The broker will only record the schema type for topics.
+- Compatibility restrictions:
+  - Introduce a new compatibility check in broker side.
+  - The schema type `SchemaType.EXTERNAL` can't be compatible with other Pulsar schemas
+  - This prevents accidental data corruption or schema conflicts between internal and external schema management systems.

Review Comment:
   Could we talk about how the external compatibility checker works with the compatibility check strategy in this PIP?
   https://pulsar.apache.org/docs/next/schema-understand/#schema-compatibility-check-strategy

   For example, what is the behavior of the external checker when users set each of the different strategies?
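
   To make the question concrete, here is a minimal sketch of the only rule the quoted text states: `EXTERNAL` cannot be compatible with other Pulsar schema types. Everything in it is hypothetical and not taken from the Pulsar code base: the class `ExternalSchemaTypeCheck`, the enum `SchemaKind` (a stand-in for the schema types), and the method `passesExternalRule`. Letting two `EXTERNAL` schemas pass is my assumption, based on the broker recording only the schema type. The sketch deliberately takes no strategy parameter, since how each strategy should affect the result is exactly what is unclear.

```java
import java.util.Objects;

// Hypothetical sketch; none of these names come from the Pulsar code base.
final class ExternalSchemaTypeCheck {

    // Stand-in for the schema types. Only EXTERNAL is taken from the PIP text.
    enum SchemaKind { AVRO, JSON, EXTERNAL }

    // Rule quoted from the PIP: EXTERNAL is not compatible with other Pulsar schemas.
    // Returns true when this rule alone does not reject the pair:
    //  - two non-EXTERNAL schemas pass here and would go on to the existing checks;
    //  - two EXTERNAL schemas pass here (assumption: the broker holds no schema data
    //    to compare, so validation is left to the external registry).
    // No compatibility strategy is consulted; that interaction is the open question.
    static boolean passesExternalRule(SchemaKind existing, SchemaKind incoming) {
        Objects.requireNonNull(existing, "existing");
        Objects.requireNonNull(incoming, "incoming");
        boolean existingExternal = existing == SchemaKind.EXTERNAL;
        boolean incomingExternal = incoming == SchemaKind.EXTERNAL;
        // Mixing external and Pulsar-managed schemas is rejected in either direction.
        return existingExternal == incomingExternal;
    }

    public static void main(String[] args) {
        System.out.println(passesExternalRule(SchemaKind.EXTERNAL, SchemaKind.EXTERNAL)); // true
        System.out.println(passesExternalRule(SchemaKind.AVRO, SchemaKind.EXTERNAL));     // false
        System.out.println(passesExternalRule(SchemaKind.EXTERNAL, SchemaKind.JSON));     // false
    }
}
```

   If the intent is that this rejection applies regardless of the configured strategy (for example even under `ALWAYS_COMPATIBLE`), it would help to say so explicitly in the PIP.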