poorbarcode commented on code in PR #20493:
URL: https://github.com/apache/pulsar/pull/20493#discussion_r1221173857


##########
pip/pip-274.md:
##########
@@ -0,0 +1,126 @@
+# Background knowledge
+
+Apache Pulsar is a distributed messaging system that supports multiple 
messaging protocols and storage methods. Among them, Pulsar Topic Compaction is 
a mechanism to clean up duplicate messages in topics to reduce storage space 
and improve system efficiency.
+More topic compaction details can be found in [Pulsar Topic 
Compaction](https://pulsar.apache.org/docs/en/concepts-topic-compaction/).
+
+# Motivation
+
+Currently, the implementation of Pulsar Topic Compaction is fixed and does not 
support custom strategies, which prevents users from applying other compaction 
policies in their applications.
+
+
+For example, in KoP we need to parse the Kafka format before compacting 
messages, but the current implementation of Pulsar topic compaction does not 
support this feature.
+In addition, the topic compaction logic implemented in `TwoPhaseCompactor` 
keeps only the last message for each key, but sometimes we need to keep the 
first valid message, e.g. 
[`StrategicTwoPhaseCompactor`](https://github.com/coderzc/pulsar/blob/0e9935c493060b13b322a84c5418146423992369/pulsar-broker/src/main/java/org/apache/pulsar/compaction/StrategicTwoPhaseCompactor.java).
+
+So we need to make the topic compactor pluggable to support more compaction 
strategies.
+
+# Goals
+
+## In Scope
+
+<!--
+What this PIP intends to achieve once it's integrated into Pulsar.
+Why does it benefit Pulsar.
+-->
+
+Make the compactor pluggable.
+
+## Out of Scope
+
+<!--
+Describe what you have decided to keep out of scope, perhaps left for a 
different PIP/s.
+-->
+
+
+# High Level Design
+
+<!--
+Describe the design of your solution in *high level*.
+Describe the solution end to end, from a birds-eye view.
+Don't go into implementation details in this section.
+
+I should be able to finish reading from beginning of the PIP to here 
(including) and understand the feature and 
+how you intend to solve it, end to end.
+
+DON'T
+* Avoid code snippets, unless it's essential to explain your intent.
+-->
+
+Make the topic compactor pluggable so that users can customize the compactor 
implementation for their own scenarios.
+
+
+# Detailed Design
+
+## Design & Implementation Details
+
+<!--
+This is the section where you dive into the details. It can be:
+* Concrete class names and their roles and responsibility, including methods.
+* Code snippets of existing code.
+* Interface names and its methods.
+* ...
+-->
+* Define a standard Compactor interface that specifies the methods and 
properties that the Compactor implementation needs to implement. This interface 
should include methods for Compactor initialization, Compactor execution, and 
getting Compactor stats.
+```java
+public interface Compactor {
+
+    void initialize(ServiceConfiguration conf,
+                    PulsarClient pulsar,
+                    BookKeeper bk,
+                    ScheduledExecutorService scheduler);
+
+    CompletableFuture<Long> compact(String topic);
+
+    CompactorMXBean getStats();
+}
+```
+
+* Rename `org.apache.pulsar.compaction.Compactor` to 
`org.apache.pulsar.compaction.AbstractCompactor` and make it implement the 
`Compactor` interface.
+
+* Load the custom compactor based on configuration in 
`org.apache.pulsar.broker.PulsarService.newCompactor` and `CompactorTool`.
+
+## Public-facing Changes
+
+<!--
+Describe the additions you plan to make for each public facing component. 
+Remove the sections you are not changing.
+Clearly mark any changes which are BREAKING backward compatibility.
+-->
+
+
+### Configuration

Review Comment:
   We reached a consensus after a discussion with @coderzc:
   - This PIP only tries to make the Compactor configurable, with no strong 
relation to KoP.
   - The new Compactor should be compatible with all the functions of 
`TwoPhaseCompactor`, so that compatibility is guaranteed.
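
To make the "configurable" point concrete, here is a minimal sketch of loading a compactor implementation by class name. It is not the PIP's actual code: `PluggableCompactor`, `KeepLastCompactor`, and `CompactorLoader` are hypothetical names, the interface is cut down to a single method so the example has no Pulsar dependencies, and the class name is assumed to come from some broker configuration value that this PIP has yet to define.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical, simplified stand-in for the Compactor interface proposed in the PIP.
interface PluggableCompactor {
    CompletableFuture<Long> compact(String topic);
}

// Hypothetical default implementation, standing in for the built-in TwoPhaseCompactor.
class KeepLastCompactor implements PluggableCompactor {
    @Override
    public CompletableFuture<Long> compact(String topic) {
        // Stub: a real compactor would do the work here; 0L is only a placeholder.
        return CompletableFuture.completedFuture(0L);
    }
}

// Hypothetical loader: picks the implementation from a configured class name.
final class CompactorLoader {
    private CompactorLoader() {
    }

    static PluggableCompactor load(String className) {
        // No class configured: fall back to the default so existing behavior is unchanged.
        if (className == null || className.isEmpty()) {
            return new KeepLastCompactor();
        }
        try {
            return Class.forName(className)
                    .asSubclass(PluggableCompactor.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load compactor: " + className, e);
        }
    }
}

class CompactorLoaderSketch {
    public static void main(String[] args) {
        PluggableCompactor compactor = CompactorLoader.load(KeepLastCompactor.class.getName());
        System.out.println(compactor.getClass().getSimpleName());
    }
}
```

Falling back to the default when nothing is configured is one way to keep the existing `TwoPhaseCompactor` behavior for users who do not opt in.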



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
