codelipenghui commented on code in PR #25242:
URL: https://github.com/apache/pulsar/pull/25242#discussion_r2891549704


##########
pip/pip-455.md:
##########
@@ -0,0 +1,125 @@
+
+---
+
+# PIP-455: Support Namespace Bundle Lookup and Topic Preloading
+
+
+---
+
+## Background Knowledge
+
+Apache Pulsar uses **namespace bundles** as the unit of ownership and load 
balancing.
+
+Key concepts:
+
+- **Namespace Bundle**: A subdivision of a namespace's hash space, 
representing a set of topics whose names hash into that range.
+- **Bundle Ownership**: At any given time, each bundle is owned by exactly one 
broker, which is responsible for serving all topics within that bundle.
+- **Lazy Topic Loading**: By default, topics are not loaded into memory until 
the first producer/consumer request arrives. This reduces startup overhead but 
increases first-call latency.
+- **PulsarAdmin & pulsar-admin CLI**: The administrative interface for 
managing Pulsar clusters, including operations on namespaces, topics, bundles, 
etc.
+
+Currently, there is **no API to proactively look up a namespace bundle or load 
all topics within a bundle**. This forces users to trigger topic creation via 
producer/consumer requests, which is not suitable for:
+- Warm-up scenarios (preloading topics before traffic arrives)
+- Disaster recovery (forcing bundle ownership transfer and topic loading)
+- Observability (checking which broker actually owns a bundle)
+
+---
+
+## Motivation
+
+The current implementation of namespace and bundle management lacks support 
for **explicit lookup and preloading**. This leads to several pain points:
+
+1. **No way to warm up topics**  
+   In production, after a broker restart or bundle unload, topics are loaded 
lazily. The first request experiences high latency due to topic metadata 
loading, cursor recovery, and ownership establishment. There is no API to 
proactively load topics in a bundle to avoid this cold-start penalty. Some use 
cases (e.g., migration validation, pre-warming for large-scale events) require 
loading all topics in a namespace.
+
+2. **Difficult to verify bundle ownership**  
+   While internal lookup mechanisms exist, there is no admin-facing API to 
query the owner of a specific bundle and force-load it onto the current broker. 
This makes operational debugging and manual intervention cumbersome.
+
+3. **Client-Admin API inconsistency**  
+   The `pulsar-admin` CLI provides `unload`, `split`, `clear-backlog` for 
bundles, but no `load` or `lookup` counterpart. This asymmetry complicates 
operational tooling.
+
+4. **Dependency cycle in the codebase**  
+   The `LookupData` class resides in `pulsar-common`, but 
`pulsar-client-admin-api` cannot depend on it directly. This forced a 
workaround via a new interface to avoid cyclic dependencies.
+
+This proposal introduces **bundle-level and namespace-level lookup + load** 
APIs, enabling operators to proactively control bundle ownership and topic 
lifecycle.
+
+---
+
+## Goals
+
+### In Scope
+
+- Provide a new admin API to **look up a namespace bundle**, returning the 
broker serving it (same as topic lookup but at bundle granularity).
+- Provide a new admin API to **load all topics in a namespace bundle** onto 
the owning broker.
+- Provide a new admin API to **load all topics in a namespace** (by iterating 
over its bundles).
+- Extend the `pulsar-admin namespaces` CLI with `lookup` and `lookup-bundle` 
commands.
+- Introduce `LookupDataInterface` to break the cyclic dependency between 
`pulsar-common` and `pulsar-client-admin-api`.
+
+
+
+## High-Level Design
+
+The core idea is to extend the existing `Namespaces` admin resource to support 
**lookup operations at both namespace and bundle granularity**, with an 
optional flag to trigger topic loading.
+
+### 1. New REST Endpoints
+
+**V2:**
+
+```
+PUT /admin/v2/namespaces/{tenant}/{namespace}/lookup

Review Comment:
   Could you please also define the response format of this API? Will it be a 
map from each bundle to the service URL of the broker that owns it?
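
   To make the question concrete, here is a minimal sketch of one possible response shape: a map keyed by bundle range, with the owning broker's URLs as the value. The field names (`brokerUrl`, `httpUrl`), the helper name, and the broker host names are assumptions for illustration only; the PIP does not define them yet.

```python
import json


def build_namespace_lookup_response(owners):
    """Build a hypothetical namespace lookup response.

    owners: iterable of (bundle_range, broker_url, http_url) tuples.
    Returns a dict keyed by bundle range. The field names are assumed,
    not taken from the PIP.
    """
    return {
        bundle: {"brokerUrl": broker_url, "httpUrl": http_url}
        for bundle, broker_url, http_url in owners
    }


# Illustrative data only: two bundles owned by two made-up brokers.
response = build_namespace_lookup_response([
    ("0x00000000_0x80000000", "pulsar://broker-1:6650", "http://broker-1:8080"),
    ("0x80000000_0xffffffff", "pulsar://broker-2:6650", "http://broker-2:8080"),
])

print(json.dumps(response, indent=2))
```

   Keying by bundle range would let a caller see ownership for the whole namespace in one response. Whether the value should be a single URL or a structure with several URLs is for the PIP to decide.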



##########
pip/pip-455.md:
##########
@@ -0,0 +1,125 @@
+### 1. New REST Endpoints
+
+**V2:**
+
+```
+PUT /admin/v2/namespaces/{tenant}/{namespace}/lookup
+PUT /admin/v2/namespaces/{tenant}/{namespace}/{bundle}/lookup
+```
+
+**V1:**
+
+```
+PUT /admin/namespaces/{property}/{cluster}/{namespace}/lookup

Review Comment:
   We don't need to support the V1 API, since we are working on removing the 
V1 endpoints from the repo.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
