Dear Pulsar Community,

I'd like to propose PIP-442, which addresses critical memory exhaustion
issues in Pulsar's topic discovery commands that can cause broker crashes
in production deployments.

Problem:
The current topic listing implementation lacks flow control, so memory
allocation for responses is unbounded. This leads to:
- OutOfMemoryError when multiple clients request large topic lists
simultaneously
- Proxy cascading failures due to unbuffered response forwarding
- Unpredictable resource usage making capacity planning difficult
- Performance degradation from GC pressure affecting all broker operations

A namespace with 10K topics can consume ~1MB per response. With 1K concurrent
requests, this adds up to ~1GB of memory pressure, enough to crash a broker.
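
The arithmetic behind these figures can be checked with a small sketch.
The per-topic size of about 100 bytes is an assumption chosen to match the
~1MB figure above; it is not taken from the proposal, and the class and
method names are illustrative only.

```java
final class TopicListMemoryEstimate {

    // Bytes needed to hold one topic list response.
    static long responseBytes(long topicCount, long bytesPerTopic) {
        return Math.multiplyExact(topicCount, bytesPerTopic);
    }

    // Bytes held at once when many responses are in flight simultaneously.
    static long aggregateBytes(long responseBytes, long concurrentRequests) {
        return Math.multiplyExact(responseBytes, concurrentRequests);
    }

    public static void main(String[] args) {
        // Assumption: ~100 bytes per fully qualified topic name.
        long perResponse = responseBytes(10_000, 100);    // 1,000,000 bytes, ~1MB
        long total = aggregateBytes(perResponse, 1_000);  // 1,000,000,000 bytes, ~1GB
        System.out.printf("per response: %d bytes, in flight: %d bytes%n",
                perResponse, total);
    }
}
```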

Solution:
PIP-442 introduces MaxTopicListInFlightLimiter - a memory-aware
semaphore system:
- Dual memory tracking for heap (topic assembly) and direct memory
(network buffers)
- Asynchronous flow control with configurable timeouts and queue limits
- Permit-based system ensuring memory is released after response transmission
- Graceful degradation instead of broker crashes
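
To make the permit idea concrete, here is a minimal sketch of an
asynchronous, byte-based limiter with a bounded FIFO queue and a timeout.
It is not the PIP-442 implementation: the class name
AsyncMemoryLimiterSketch, its constructor parameters, and the choice to
clamp oversized requests to the limit are all assumptions made for
illustration. Dual tracking could be approximated by using one instance for
heap and one for direct memory. It needs Java 9+ for orTimeout and
failedFuture.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class AsyncMemoryLimiterSketch {

    /** A granted reservation; call release() once the response has been sent. */
    final class Permit {
        private final long bytes;
        private boolean released;
        Permit(long bytes) { this.bytes = bytes; }
        void release() { releasePermit(this); }
    }

    private static final class Waiter {
        final long bytes;
        final CompletableFuture<Permit> future = new CompletableFuture<>();
        Waiter(long bytes) { this.bytes = bytes; }
    }

    private final long limitBytes;
    private final int maxQueued;
    private final long timeoutMillis;
    private final ArrayDeque<Waiter> queue = new ArrayDeque<>();
    private long usedBytes;

    AsyncMemoryLimiterSketch(long limitBytes, int maxQueued, long timeoutMillis) {
        if (limitBytes <= 0) throw new IllegalArgumentException("limitBytes must be > 0");
        this.limitBytes = limitBytes;
        this.maxQueued = maxQueued;
        this.timeoutMillis = timeoutMillis;
    }

    synchronized long usedBytes() { return usedBytes; }

    CompletableFuture<Permit> acquire(long bytes) {
        // Assumption: an oversized request is clamped so it can still run alone.
        long needed = Math.min(bytes, limitBytes);
        Waiter waiter;
        synchronized (this) {
            if (queue.isEmpty() && usedBytes + needed <= limitBytes) {
                usedBytes += needed;
                return CompletableFuture.completedFuture(new Permit(needed));
            }
            if (queue.size() >= maxQueued) {
                // Degrade gracefully: reject instead of allocating without bound.
                return CompletableFuture.failedFuture(
                        new IllegalStateException("topic list queue is full"));
            }
            waiter = new Waiter(needed);
            queue.add(waiter);
        }
        // Fails the future with TimeoutException if no permit arrives in time.
        // Timed-out waiters stay queued until the next release skips them.
        waiter.future.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        return waiter.future;
    }

    private void releasePermit(Permit permit) {
        List<Waiter> granted = new ArrayList<>();
        synchronized (this) {
            if (permit.released) return;                 // release is idempotent
            permit.released = true;
            usedBytes -= permit.bytes;
            Waiter head;
            while ((head = queue.peek()) != null) {
                if (head.future.isDone()) { queue.poll(); continue; } // timed out
                if (usedBytes + head.bytes > limitBytes) break;       // keep FIFO order
                queue.poll();
                usedBytes += head.bytes;
                granted.add(head);
            }
        }
        // Complete futures outside the lock so callbacks never run under it.
        for (Waiter w : granted) {
            Permit p = new Permit(w.bytes);
            if (!w.future.complete(p)) {
                p.release();                             // lost the race with the timeout
            }
        }
    }

    public static void main(String[] args) {
        AsyncMemoryLimiterSketch heap = new AsyncMemoryLimiterSketch(2_000_000, 100, 5_000);
        heap.acquire(1_000_000).thenAccept(permit -> {
            try {
                System.out.println("building topic list");
            } finally {
                permit.release();                        // memory returned after use
            }
        });
    }
}
```

Granting strictly in FIFO order is what gives queued requests fair access;
a smaller request behind a large one waits rather than overtaking it.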

Benefits:
- Prevents broker crashes from topic listing memory exhaustion
- Predictable resource usage for capacity planning
- Maintains full backward compatibility (no client changes required)
- Comprehensive monitoring with detailed metrics
- Fair resource sharing through queueing mechanisms

The implementation adds flow control at two key points: after topic retrieval
from the metadata store (heap) and before response serialization (direct
memory).
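
The two control points can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: it uses two plain
java.util.concurrent.Semaphore instances counted in bytes, it rejects
immediately instead of queueing, and the class name
TwoStageTopicListSketch and its methods are hypothetical rather than part
of the proposal.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.Semaphore;

final class TwoStageTopicListSketch {
    private final Semaphore heapBytes;    // budget for assembled topic lists
    private final Semaphore directBytes;  // budget for serialized network buffers

    TwoStageTopicListSketch(int heapLimitBytes, int directLimitBytes) {
        this.heapBytes = new Semaphore(heapLimitBytes);
        this.directBytes = new Semaphore(directLimitBytes);
    }

    int availableHeapBytes() { return heapBytes.availablePermits(); }
    int availableDirectBytes() { return directBytes.availablePermits(); }

    /** Returns true if the response was "sent", false if either limit rejected it. */
    boolean handle(List<String> topicsFromMetadataStore) {
        // Point 1: the topics are already retrieved; account for their heap footprint.
        // Character count is a rough stand-in for the real heap size.
        int heapSize = topicsFromMetadataStore.stream().mapToInt(String::length).sum();
        if (!heapBytes.tryAcquire(heapSize)) {
            return false;
        }
        try {
            byte[] payload = String.join("\n", topicsFromMetadataStore)
                    .getBytes(StandardCharsets.UTF_8);
            // Point 2: reserve direct memory before serializing the response.
            if (!directBytes.tryAcquire(payload.length)) {
                return false;
            }
            try {
                ByteBuffer buffer = ByteBuffer.allocateDirect(payload.length);
                buffer.put(payload).flip();
                // Stand-in for writing the buffer to the client connection.
                return buffer.remaining() == payload.length;
            } finally {
                directBytes.release(payload.length);  // released after transmission
            }
        } finally {
            heapBytes.release(heapSize);
        }
    }
}
```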

The full proposal can be found at: https://github.com/apache/pulsar/wiki/PIP-442

I welcome your feedback and discussion on this proposal. Please share
your thoughts, concerns, or suggestions.

Best regards,

-Lari
