lhotari commented on code in PR #25721:
URL: https://github.com/apache/pulsar/pull/25721#discussion_r3210295559


##########
pip/pip-475.md:
##########
@@ -0,0 +1,350 @@
+# PIP-475: Regular-to-Scalable Topic Migration
+
+*Sub-PIP of [PIP-460: Scalable Topics](pip-460.md)*
+
+## Motivation
+
+[PIP-460](pip-460.md) introduces scalable topics (`topic://...`) as a new 
topic type that supports range splitting and merging without breaking key 
ordering. For this to be adoptable in real deployments, users with existing 
partitioned and non-partitioned topics need a migration path that:
+
+1. **Doesn't require recreating their topics from scratch.** Existing topics 
may hold months of retained data and have many active subscriptions. Re-create 
+ re-publish is not a viable upgrade story.
+2. **Lets clients adopt the V5 SDK before any topic is migrated.** 
Operationally, applications need to be upgraded one at a time over weeks, while 
the topics they read and write keep working as-is. The V5 SDK has to 
interoperate with the *old* topic types until the migration moment.
+3. **Keeps the migration moment small and surgical.** Once all clients are on 
the V5 SDK, an admin command flips a topic from regular to scalable in a single 
atomic step, without copying data or moving cursors.
+4. **Cannot be reversed.** Once a topic is scalable, regressing to a regular 
topic is unsafe (the new layout can have already split, leaving data in 
segments that don't map back to a fixed partition count). The metadata 
transition has to be one-way.
+
+PIP-460 lists "Tooling for migrating existing partitioned topics to scalable 
topics" in its postponed section. This PIP closes that gap.
+
+This PIP also clarifies the V5 SDK's behavior when given a topic name that may 
or may not be scalable, and tightens the broker so that a v4 client cannot 
accidentally write to (or auto-create) a regular topic that has already been 
migrated.
+
+The longer-term direction for Pulsar is for scalable topics to **fully 
replace** partitioned and non-partitioned topics: the existing topic types stay 
supported for backward compatibility, but new development on the topic surface 
targets scalable topics, and migration tooling like this PIP is what lets 
existing deployments make that transition incrementally instead of all at once.
+
+---
+
+## Background Knowledge
+
+### Topic domains in Pulsar today
+
+A Pulsar topic name encodes its domain in a URI scheme:
+
+- `persistent://t/n/x` — durable topic backed by a managed ledger.
+- `non-persistent://t/n/x` — in-memory topic, no durability.
+- `topic://t/n/x` — scalable topic introduced by PIP-460. Backed by a DAG of 
segments; each segment is itself a `segment://...` topic with its own managed 
ledger.
+

Review Comment:
   One detail that we missed in Pulsar 4.2.0 is the migration from v1 topics to 
v2 topics. Since users might be upgrading directly from 4.0.x to 5.0.x, I'd 
assume that v1 topics would need to be handled in some way. 
   
   The znode paths differ between v2 and v1 topics:
   
   Managed ledger
   • v2: /managed-ledgers/tenant/ns/persistent/topic
   • v1: /managed-ledgers/tenant/cluster/ns/persistent/topic
   
   Partitioned topic metadata
   • v2: /admin/partitioned-topics/tenant/ns/persistent/topic
   • v1: /admin/partitioned-topics/tenant/cluster/ns/persistent/topic
   
   Namespace policies
   • v2: /admin/policies/tenant/ns
   • v1: /admin/policies/tenant/cluster/ns
   
   A common reason v1 topics exist in 4.0.x production deployments is that 
adding an extra slash to a topic name silently makes it a v1 topic.
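
To illustrate the point about the extra slash, here is a minimal Python sketch of the naming rule as I understand it: three path components after the scheme are read as v2 (`tenant/ns/topic`), four as v1 (`tenant/cluster/ns/topic`). The names `ParsedTopic`, `parse_topic` and `managed_ledger_path` are hypothetical and are not Pulsar APIs. The sketch also ignores the encoding Pulsar applies to the local topic name, so treat it as an approximation of the layout listed above rather than the broker's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedTopic:
    """Hypothetical holder for the parts of a topic name (not a Pulsar class)."""
    domain: str
    tenant: str
    cluster: Optional[str]  # only set for v1 names
    namespace: str
    local_name: str

    @property
    def is_v1(self) -> bool:
        return self.cluster is not None


def parse_topic(name: str) -> ParsedTopic:
    """Split a fully qualified topic name using the assumed 3-part / 4-part rule."""
    domain, sep, rest = name.partition("://")
    if not sep:
        raise ValueError(f"missing scheme in topic name: {name!r}")
    # Split into at most 4 parts; any further slashes stay in the local name.
    parts = rest.split("/", 3)
    if len(parts) == 3:
        tenant, namespace, local_name = parts
        return ParsedTopic(domain, tenant, None, namespace, local_name)
    if len(parts) == 4:
        tenant, cluster, namespace, local_name = parts
        return ParsedTopic(domain, tenant, cluster, namespace, local_name)
    raise ValueError(f"too few path components in topic name: {name!r}")


def managed_ledger_path(topic: ParsedTopic) -> str:
    """Build the managed-ledger znode path following the layout listed above."""
    scope = [topic.tenant]
    if topic.is_v1:
        scope.append(topic.cluster)
    scope.append(topic.namespace)
    return "/managed-ledgers/" + "/".join(scope + [topic.domain, topic.local_name])


if __name__ == "__main__":
    for name in ("persistent://tenant/ns/topic", "persistent://tenant/ns/orders/eu"):
        parsed = parse_topic(name)
        print(name, "v1" if parsed.is_v1 else "v2", managed_ledger_path(parsed))
```

Under this rule, a user who writes `persistent://tenant/ns/orders/eu` intending a topic called `orders/eu` in namespace `ns` instead gets a v1 topic `eu` in namespace `orders`, with `ns` interpreted as the cluster.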
   
   In 4.1.0, the configuration setting 
allowAutoTopicCreationWithLegacyNamingScheme was added to prevent creating v1 
topics accidentally:
   https://github.com/apache/pulsar/pull/23620
   
   How are we going to address the possible existence of v1 topics in the 
4.0.x -> 5.0.x migration?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
