This is an automated email from the ASF dual-hosted git repository.

mmerli pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git


The following commit(s) were added to refs/heads/master by this push:
     new 5451921cd49 [improve] PIP-384: ManagedLedger interface decoupling 
(#23363)
5451921cd49 is described below

commit 5451921cd49dca03c541617c92ee8a3c83af9e50
Author: Lari Hotari <[email protected]>
AuthorDate: Mon Oct 7 18:37:55 2024 +0300

    [improve] PIP-384: ManagedLedger interface decoupling (#23363)
---
 pip/pip-384.md | 158 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/pip/pip-384.md b/pip/pip-384.md
new file mode 100644
index 00000000000..ba02a147d85
--- /dev/null
+++ b/pip/pip-384.md
@@ -0,0 +1,158 @@
+# PIP-384: ManagedLedger interface decoupling
+
+## Background knowledge
+
+Apache Pulsar uses a component called ManagedLedger to handle persistent 
storage of messages.
+
+The ManagedLedger interfaces and implementation were initially tightly 
coupled, making it difficult to introduce alternative implementations or 
improve the architecture.
+This PIP documents changes that have been made in the master branch for Pulsar 
4.0. Pull Requests [#22891](https://github.com/apache/pulsar/pull/22891) and 
[#23311](https://github.com/apache/pulsar/pull/23311) have already been merged.
+This work happened after lazy consensus on the dev mailing list based on the 
discussion thread ["Preparing for Pulsar 4.0: cleaning up the Managed Ledger 
interfaces"](https://lists.apache.org/thread/l5zjq0fb2dscys3rsn6kfl7505tbndlx).
+There is one remaining PR 
[#23313](https://github.com/apache/pulsar/pull/23313) at the time of writing 
this document.
+The goal of this PIP is to document the changes in this area for later 
reference.
+
+Key concepts:
+
+- **ManagedLedger**: A component that handles the persistent storage of 
messages in Pulsar.
+- **BookKeeper**: The default storage system used by ManagedLedger.
+- **ManagedLedgerStorage interface**: A factory for configuring and creating 
the `ManagedLedgerFactory` instance. [ManagedLedgerStorage.java source 
code](https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/storage/ManagedLedgerStorage.java)
+- **ManagedLedgerFactory interface**: Creates and manages ManagedLedger 
instances. [ManagedLedgerFactory.java source 
code](https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedLedgerFactory.java)
+- **ManagedLedger interface**: Handles the persistent storage of messages in 
Pulsar. [ManagedLedger.java source 
code](https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedLedger.java)
+- **ManagedCursor interface**: Handles the persistent storage of Pulsar 
subscriptions and related message acknowledgements. [ManagedCursor.java source 
code](https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/ManagedCursor.java)
+
+## Motivation
+
+The current ManagedLedger implementation faces several challenges:
+
+1. **Tight coupling**: The interfaces are tightly coupled with their 
implementation, making it difficult to introduce alternative implementations.
+
+2. **Limited flexibility**: The current architecture doesn't allow for easy 
integration of different storage systems or optimizations.
+
+3. **Dependency on BookKeeper**: The ManagedLedger implementation is closely 
tied to BookKeeper, limiting options for alternative storage solutions.
+
+4. **Complexity**: The tight coupling increases the overall complexity of the 
system, making it harder to maintain, test and evolve.
+
+5. **Limited extensibility**: Introducing new features or optimizations often 
requires changes to both interfaces and implementations.
+
+## Goals
+
+### In Scope
+
+- Decouple ManagedLedger interfaces from their current implementation.
+- Introduce a ReadOnlyManagedLedger interface.
+- Decouple OpAddEntry and LedgerHandle from ManagedLedgerInterceptor.
+- Enable support for multiple ManagedLedgerFactory instances.
+- Decouple BookKeeper client from ManagedLedgerStorage.
+- Improve overall architecture by reducing coupling between core Pulsar 
components and specific ManagedLedger implementations.
+- Prepare the groundwork for alternative ManagedLedger implementations in 
Pulsar 4.0.
+
+### Out of Scope
+
+- Implementing alternative ManagedLedger storage backends.
+- Changes to external APIs or behaviors.
+- Comprehensive JavaDocs for the interfaces.
+
+## High Level Design
+
+1. **Decouple interfaces from implementations**:
+   - Move required methods from implementation classes to their respective 
interfaces.
+   - Update code to use interfaces instead of concrete implementations.
+
+2. **Introduce ReadOnlyManagedLedger interface**:
+   - Extract this interface to decouple from ReadOnlyManagedLedgerImpl.
+   - Adjust code to use the new interface where appropriate.
+
+3. **Decouple ManagedLedgerInterceptor**:
+   - Introduce AddEntryOperation and LastEntryHandle interfaces.
+   - Adjust ManagedLedgerInterceptor to use these new interfaces.
+
+4. **Enable multiple ManagedLedgerFactory instances**:
+   - Modify ManagedLedgerStorage interface to support multiple "storage 
classes".
+   - Implement BookkeeperManagedLedgerStorageClass for BookKeeper support.
+   - Update PulsarService and related classes to support multiple 
ManagedLedgerFactory instances.
+   - Add "storage class" to persistence policy part of the namespace level or 
topic level policies.
+
+5. **Decouple BookKeeper client**:
+   - Move BookKeeper client creation and management to 
BookkeeperManagedLedgerStorageClass.
+   - Update ManagedLedgerStorage interface to remove direct BookKeeper 
dependencies.
+
+## Detailed Design
+
+### Interface Decoupling
+
+1. Update ManagedLedger interface:
+   - Add methods from ManagedLedgerImpl to the interface.
+   - Remove dependencies on implementation-specific classes.
+
+2. Update ManagedLedgerFactory interface:
+   - Add necessary methods from ManagedLedgerFactoryImpl.
+   - Remove dependencies on implementation-specific classes.
+
+3. Update ManagedCursor interface:
+   - Add required methods from ManagedCursorImpl.
+   - Remove dependencies on implementation-specific classes.
+
+4. Introduce ReadOnlyManagedLedger interface:
+   - Extract methods specific to read-only operations.
+   - Update relevant code to use this interface where appropriate.
+
+5. Decouple ManagedLedgerInterceptor:
+   - Introduce AddEntryOperation interface for beforeAddEntry method.
+   - Introduce LastEntryHandle interface for 
onManagedLedgerLastLedgerInitialize method.
+   - Update ManagedLedgerInterceptor to use these new interfaces.
+
+### Multiple ManagedLedgerFactory Instances
+
+1. Update ManagedLedgerStorage interface:
+   - Add methods to support multiple storage classes.
+   - Introduce getManagedLedgerStorageClass method to retrieve specific 
storage implementations.
+
+2. Implement BookkeeperManagedLedgerStorageClass:
+   - Create a new class implementing ManagedLedgerStorageClass for BookKeeper.
+   - Move BookKeeper client creation and management to this class.
+
+3. Update PulsarService and related classes:
+   - Modify to support creation and management of multiple 
ManagedLedgerFactory instances.
+   - Update configuration to allow specifying different storage classes for 
different namespaces or topics.
+
+### BookKeeper Client Decoupling
+
+1. Update ManagedLedgerStorage interface:
+   - Remove direct dependencies on BookKeeper client.
+   - Introduce methods to interact with storage without exposing BookKeeper 
specifics.
+
+2. Implement BookkeeperManagedLedgerStorageClass:
+   - Encapsulate BookKeeper client creation and management.
+   - Implement storage operations using BookKeeper client.
+
+3. Update relevant code:
+   - Replace direct BookKeeper client usage with calls to ManagedLedgerStorage 
methods.
+   - Update configuration handling to support BookKeeper-specific settings 
through the new storage class.
+
+## Public-facing Changes
+
+### Configuration
+
+- Add new configuration option to specify default ManagedLedger "storage 
class" at broker level.
+
+### API Changes
+
+- No major changes to external APIs are planned.
+- The only API change is to add `managedLedgerStorageClassName` to 
`PersistencePolicies` which can be used by a custom `ManagedLedgerStorage` to 
control the ManagedLedgerFactory instance that is used for a particular 
namespace or topic.
+
+## Backward & Forward Compatibility
+
+The changes are internal and don't affect external APIs or behaviors.
+Backward compatibility is fully preserved in Apache Pulsar.
+
+## Security Considerations
+
+The decoupling of interfaces and implementation doesn't introduce new security 
concerns.
+
+## Links
+
+- Initial mailing List discussion thread: [Preparing for Pulsar 4.0: cleaning 
up the Managed Ledger 
interfaces](https://lists.apache.org/thread/l5zjq0fb2dscys3rsn6kfl7505tbndlx)
+  - Merged Pull Request #22891: [Replace dependencies on PositionImpl with 
Position interface](https://github.com/apache/pulsar/pull/22891)
+  - Merged Pull Request #23311: [Decouple ManagedLedger interfaces from the 
current implementation](https://github.com/apache/pulsar/pull/23311)
+  - Implementation Pull Request #23313: [Decouple Bookkeeper client from 
ManagedLedgerStorage and enable multiple ManagedLedgerFactory 
instances](https://github.com/apache/pulsar/pull/23313)
+- Mailing List PIP discussion thread: 
https://lists.apache.org/thread/rtnktrj7tp5ppog0235t2mf9sxrdpfr8
+- Mailing List PIP voting thread: 
https://lists.apache.org/thread/4jj5dmk6jtpq05lcd6dxlkqpn7hov5gv
\ No newline at end of file

Reply via email to