lhotari commented on code in PR #24842: URL: https://github.com/apache/pulsar/pull/24842#discussion_r2431615285
##########
pip/pip-445.md:
##########
@@ -0,0 +1,126 @@
+# PIP-445: Add Builder Methods to Create Message-based TableView
+
+# Background knowledge
+
+* **TableView**: In Pulsar, a `TableView` is a client-side abstraction that provides a key-value map interface over a Pulsar topic. It consumes messages from the topic (typically a compacted one) and maintains an in-memory view of the latest value for each key. This allows applications to easily query the current state of a key without managing a consumer manually.
+
+* **Pulsar `Message<T>`**: A Pulsar message is not just its data payload. The `Message<T>` object is a container that includes the deserialized **payload** (`T`) as well as important **metadata**, such as a message key, user-defined properties (a key-value map), event time, publish time, and more.
+
+# Motivation
+
+The current `TableView` API provides a `get(String key)` method that only returns the deserialized **value** (`T`) of the latest message for a given key. This limits its usefulness for applications that need access to the message's metadata.
+
+For instance, a user might need to inspect the message **properties** to get a trace-id or check the **event time** to determine if the data is recent. Currently, the only way to access this metadata is to create a separate, redundant `Consumer` on the same topic, which is inefficient and undermines the convenience of using a `TableView`.
+
+This proposal aims to solve this problem by providing a way to create a `TableView` that exposes the entire `Message<T>` object.
+
+# Goals
+
+## In Scope
+
+* Add new methods, `createForMessages()` and `createForMessagesAsync()`, to the `TableViewBuilder<T>` interface.
+* Allow users to create a `TableView<Message<T>>` instance, which provides access to the complete `Message<T>` object for each key, including its payload, properties, and all other metadata.
+* Ensure the change is fully backward-compatible and does not impact the performance of existing `TableView` users.
+
+## Out of Scope
+
+* Modifying the behavior of the existing `create()` and `createAsync()` methods in the builder.
+* Changing the underlying topic compaction logic or any broker-side functionality.
+
+# High Level Design
+
+The proposed solution is a simple and non-breaking addition to the public client API. Instead of modifying the existing `TableView` implementation, we will introduce new methods to the `TableViewBuilder`.
+
+1. New methods, `TableView<Message<T>> createForMessages()` and `CompletableFuture<TableView<Message<T>>> createForMessagesAsync()`, will be added to the `TableViewBuilder<T>` interface.
+2. These methods will create a new, specialized `TableView` implementation (`MessageTableViewImpl`) that stores the entire `Message<T>` object for each key.
+3. The existing `create()` and `createAsync()` methods will continue to create the standard `TableView` that stores only the message value (`T`).
+
+This opt-in design provides the new functionality efficiently without impacting the performance or behavior of existing `TableView` use cases.
+
+# Detailed Design
+
+## Design & Implementation Details
+
+The changes will be confined to the Pulsar client library.
+
+* **Interface `org.apache.pulsar.client.api.TableViewBuilder<T>`**:
+  New methods will be added to this interface to create a `TableView` for messages.
+
+* **Class `org.apache.pulsar.client.impl.TableViewBuilderImpl<T>`**:
+  The new `createForMessages` methods will be implemented to instantiate a new `MessageTableViewImpl`.
+
+* **New Class `org.apache.pulsar.client.impl.MessageTableViewImpl<T>`**:
+  A new class will be created that implements `TableView<Message<T>>`. It will be based on the existing `TableViewImpl` but its internal map will store `Message<T>` objects instead of just `T` values. Its `get(key)` method will return the full `Message<T>` object.
+
+* **Class `org.apache.pulsar.client.impl.TableViewImpl<T>`**:
+  This class will remain unchanged, ensuring no impact on existing users.

Review Comment:
This would change to use the abstract base class, but the functionality won't change.

##########
pip/pip-445.md:
##########
@@ -0,0 +1,126 @@
[...]
+* **New Class `org.apache.pulsar.client.impl.MessageTableViewImpl<T>`**:
+  A new class will be created that implements `TableView<Message<T>>`. It will be based on the existing `TableViewImpl` but its internal map will store `Message<T>` objects instead of just `T` values. Its `get(key)` method will return the full `Message<T>` object.

Review Comment:
There's no need to duplicate code. An abstract base class is a better solution. I did a quick refactoring to show how this can be accomplished: https://github.com/lhotari/pulsar/commit/c5a4991 . This can be used as the basis.
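
For readers following the thread, the abstract-base-class idea from the review comments could look roughly like the sketch below. This is not the code from the linked commit. `Msg`, `AbstractTableViewSketch`, `ValueTableViewSketch`, and `MessageTableViewSketch` are hypothetical names, `Msg<T>` is a simplified stand-in for Pulsar's `Message<T>`, and treating a null value as a delete marker is an assumption made for illustration. The point is only that the shared map handling lives in one place and each variant overrides a single conversion method.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for Pulsar's Message<T>; not the real interface.
final class Msg<T> {
    final String key;
    final T value;
    final Map<String, String> properties;

    Msg(String key, T value, Map<String, String> properties) {
        this.key = key;
        this.value = value;
        this.properties = properties;
    }
}

// Shared logic lives here. T is the payload type, V is the type stored per key.
abstract class AbstractTableViewSketch<T, V> {
    private final Map<String, V> data = new ConcurrentHashMap<>();

    // The only point where the two variants differ.
    protected abstract V toStoredValue(Msg<T> msg);

    // Would be called for every message read from the topic.
    final void handleMessage(Msg<T> msg) {
        if (msg.key == null) {
            return; // a message without a key cannot be indexed
        }
        if (msg.value == null) {
            data.remove(msg.key); // assumption: a null value acts as a delete marker
        } else {
            data.put(msg.key, toStoredValue(msg));
        }
    }

    final V get(String key) {
        return data.get(key);
    }

    final int size() {
        return data.size();
    }
}

// Counterpart of the existing value-based view: stores only the payload.
final class ValueTableViewSketch<T> extends AbstractTableViewSketch<T, T> {
    @Override
    protected T toStoredValue(Msg<T> msg) {
        return msg.value;
    }
}

// Counterpart of the proposed message-based view: stores the whole message.
final class MessageTableViewSketch<T> extends AbstractTableViewSketch<T, Msg<T>> {
    @Override
    protected Msg<T> toStoredValue(Msg<T> msg) {
        return msg;
    }
}
```

With this shape, feeding the same message to both views gives `get("k1")` returning the payload in one and the full message (including `properties`) in the other, without duplicating the map maintenance code.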
