lhotari commented on code in PR #24842:
URL: https://github.com/apache/pulsar/pull/24842#discussion_r2468286626
##########
pip/pip-445.md:
##########
@@ -0,0 +1,137 @@
+# PIP-445: Add Builder Methods to Create Message-based TableView
+
+# Background knowledge
+
+* **TableView**: In Pulsar, a `TableView` is a client-side abstraction that provides a key-value map interface over a Pulsar topic. It consumes messages from the topic (typically a compacted one) and maintains an in-memory view of the latest value for each key. This allows applications to easily query the current state of a key without managing a consumer manually.
+
+* **Pulsar `Message<T>`**: A Pulsar message is not just its data payload. The `Message<T>` object is a container that includes the deserialized **payload** (`T`) as well as important **metadata**, such as a message key, user-defined properties (a key-value map), event time, publish time, and more.
+
+# Motivation
+
+The current `TableView` API provides a `get(String key)` method that only returns the deserialized **value** (`T`) of the latest message for a given key. This limits its usefulness for applications that need access to the message's metadata.
+
+For instance, a user might need to inspect the message **properties** to get a trace-id or check the **event time** to determine if the data is recent. Currently, the only way to access this metadata is to create a separate, redundant `Consumer` on the same topic, which is inefficient and undermines the convenience of using a `TableView`.
+
+This proposal aims to solve this problem by providing a way to create a `TableView` that exposes the entire `Message<T>` object.
+
+# Goals
+
+## In Scope
+
+* Add new generic methods, `createMapped()` and `createMappedAsync()`, to the `TableViewBuilder<T>` interface, which accept a mapping function.
+* Allow users to create a `TableView<V>` instance by providing a function that transforms a `Message<T>` into a custom object `V`. This includes the ability to create a `TableView<Message<T>>` by passing an identity function.
+* Ensure the change is fully backward-compatible and does not impact the performance of existing `TableView` users.
+
+## Out of Scope
+
+* Modifying the behavior of the existing `create()` and `createAsync()` methods in the builder.
+* Changing the underlying topic compaction logic or any broker-side functionality.
+* Handling exceptions thrown by the user-provided mapper function within the `TableView` (e.g., "poison pill" message handling).
+
+# High Level Design
+
+The proposed solution is a simple and non-breaking addition to the public client API. Instead of adding a specific method for retrieving messages, we will introduce a more flexible, generic mapping mechanism.
+
+1. New generic methods, `<V> TableView<V> createMapped(...)` and `<V> CompletableFuture<TableView<V>> createMappedAsync(...)`, will be added to the `TableViewBuilder<T>` interface.
+2. These methods will accept a `java.util.function.Function<Message<T>, V>` as a parameter. This `mapper` function defines how to transform an incoming raw `Message<T>` into a value `V` to be stored in the `TableView`.
+3. This approach provides maximum flexibility. Users who need the entire `Message<T>` object can simply pass `Function.identity()` as the mapper. Other users can create custom, memory-efficient objects containing only the necessary data from the message payload and metadata.
+4. The existing `create()` and `createAsync()` methods will remain unchanged, preserving behavior for all existing use cases.
+
+# Detailed Design
+
+## Design & Implementation Details
+
+The changes will be confined to the Pulsar client library.
+
+* **New Abstract Class `org.apache.pulsar.client.impl.AbstractTableView<V>`**:
+    * An abstract base class will be created to contain the common logic for `TableView` implementations, such as managing the underlying consumer and handling topic events. This prevents code duplication.
+
+* **Class `org.apache.pulsar.client.impl.TableViewImpl<T>`**:
+    * This class will be refactored to extend the new `AbstractTableView<T>`. While its internal implementation will change to use the base class, its public-facing functionality and behavior will remain exactly the same. This ensures that the change has no impact on existing users.
+
+* **New Class `org.apache.pulsar.client.impl.MessageTableViewImpl<T>`**:
+    * A new class will be created that extends `AbstractTableView<Message<T>>` and implements the `TableView<Message<T>>` interface. It will be responsible for storing the full `Message<T>` object for each key, which its `get(key)` method will return.
+
+* **Class `org.apache.pulsar.client.impl.TableViewBuilderImpl<T>`**:
+    * The new `createForMessages` methods will be implemented to instantiate a new `MessageTableViewImpl`.

Review Comment:
   Please update these details to match the latest plan.
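
For illustration only, the mapper-based design described under "High Level Design" could be sketched roughly as below. This is not the Pulsar client implementation: `Msg`, `MappedView`, `ValueWithTrace`, and `MappedTableViewSketch` are hypothetical stand-ins invented for this sketch (the real types would be `Message<T>` and whatever `createMapped()` returns), and tombstones and mapper exceptions are not modeled.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for Pulsar's Message<T>: payload plus a little metadata.
final class Msg<T> {
    final String key;
    final T value;
    final long eventTime;
    final Map<String, String> properties;

    Msg(String key, T value, long eventTime, Map<String, String> properties) {
        this.key = key;
        this.value = value;
        this.eventTime = eventTime;
        this.properties = properties;
    }
}

// Hypothetical stand-in for a mapped table view: for each key it stores
// mapper(latest message) instead of only the payload.
final class MappedView<T, V> {
    private final Map<String, V> latest = new ConcurrentHashMap<>();
    private final Function<Msg<T>, V> mapper;

    MappedView(Function<Msg<T>, V> mapper) {
        this.mapper = mapper;
    }

    // Called once per incoming message; a later message for a key replaces the earlier one.
    void onMessage(Msg<T> msg) {
        latest.put(msg.key, mapper.apply(msg));
    }

    V get(String key) {
        return latest.get(key);
    }
}

// Example of a compact custom value that keeps only what the application needs.
final class ValueWithTrace {
    final String value;
    final String traceId;

    ValueWithTrace(String value, String traceId) {
        this.value = value;
        this.traceId = traceId;
    }
}

class MappedTableViewSketch {
    public static void main(String[] args) {
        Msg<String> msg = new Msg<String>(
                "user-1", "active", 1000L, Collections.singletonMap("trace-id", "t-1"));

        // Identity mapper: the view exposes the whole message, metadata included.
        MappedView<String, Msg<String>> full =
                new MappedView<String, Msg<String>>(Function.identity());
        full.onMessage(msg);
        System.out.println(full.get("user-1").eventTime);

        // Custom mapper: keep the payload and a single property.
        MappedView<String, ValueWithTrace> compact =
                new MappedView<String, ValueWithTrace>(
                        m -> new ValueWithTrace(m.value, m.properties.get("trace-id")));
        compact.onMessage(msg);
        System.out.println(compact.get("user-1").traceId);
    }
}
```

With a single mapper parameter, the "full message" case is just `Function.identity()`, so a dedicated message-specific view class or `createForMessages` method would not be needed in this sketch.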
