Something to keep in mind with your proposal is that you're moving the
Decompression and Filtering costs into the Brokers. Today a Broker can hand
compressed batches to consumers as-is; filtering by header means it has to
decompress them first, and it probably also adds a new Compression cost if
you want the Broker to keep sending compressed data over the network.
Centralizing those costs on the cluster may not be desirable and would
likely increase latency across the board.
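
For context, that decompression and per-record filtering currently lives on
the consumer side. Here is a rough sketch of the client-side header filtering
described in the quoted message below, assuming string-valued headers and a
made-up "log-type" header key (everything not in your message is a placeholder):

    import java.nio.charset.StandardCharsets;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.header.Header;

    public class ClientSideHeaderFilter {

        // Keep only records whose "log-type" header equals "traffic"
        // (hypothetical key and value, for illustration only).
        static boolean wanted(ConsumerRecord<byte[], byte[]> record) {
            Header h = record.headers().lastHeader("log-type");
            return h != null
                    && "traffic".equals(new String(h.value(), StandardCharsets.UTF_8));
        }

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("group.id", "header-filter-demo");      // placeholder
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("some-topic"));
                while (true) {
                    ConsumerRecords<byte[], byte[]> records =
                            consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<byte[], byte[]> record : records) {
                        if (wanted(record)) {
                            // deserialize the value and process it
                        }
                        // non-matching records were still fetched, decompressed
                        // by the client library, and then dropped right here
                    }
                }
            }
        }
    }

Every cost in that loop (network transfer, decompression, header decoding) is
what the proposal would shift onto the Brokers.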

Additionally, because header values are just byte arrays, the Brokers probably
would not be able to do very sophisticated filtering. Support for basic
comparisons using the built-in Serdes might be simple enough, but anything
more complex, or anything involving custom Serdes, would probably require a
new plug-in type on the broker.
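
To make that last point concrete, here is a purely hypothetical sketch of what
such a broker-side plug-in type might have to look like. Nothing like this
exists in Kafka today; every name below is invented for illustration:

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import java.util.Map;
    import org.apache.kafka.common.header.Header;
    import org.apache.kafka.common.header.Headers;

    // Hypothetical broker-side plug-in interface -- not part of Kafka today.
    public interface FetchHeaderFilter {

        // Filter parameters would have to arrive as opaque strings/bytes in the
        // Fetch request, since the broker knows nothing about the consumer's Serdes.
        void configure(Map<String, String> filterParams);

        // Decide per record whether it goes into the fetch response. The broker
        // would already have had to decompress the batch to expose the headers.
        boolean shouldInclude(String topic, int partition, Headers headers);
    }

    // About the most the broker could do generically: raw byte equality on one header.
    class StringEqualsHeaderFilter implements FetchHeaderFilter {
        private String headerKey;
        private byte[] expected;

        @Override
        public void configure(Map<String, String> filterParams) {
            headerKey = filterParams.get("header.key");
            expected = filterParams.get("expected.value")
                    .getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public boolean shouldInclude(String topic, int partition, Headers headers) {
            Header h = headers.lastHeader(headerKey);
            return h != null && Arrays.equals(h.value(), expected);
        }
    }

Anything beyond that byte-equality case (ranges, numeric comparisons, custom
Serdes) is where the plug-in machinery, class loading, and per-broker
configuration would come in.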

On Mon, Nov 29, 2021 at 10:49 AM Talat Uyarer <tuya...@paloaltonetworks.com>
wrote:

> Hi All,
>
> I want to get your advice about one subject. I want to create a KIP for
> message header base filtering on Fetch API.
>
> Our current use case: we have 1k+ topics and, per topic, 10+ consumers
> for different use cases. However, all consumers are interested in different
> sets of messages on the same topic. Currently we read all messages from a
> given topic and drop logs on the consumer side. To reduce our stream
> processing cost I want to drop logs on the broker side. So far my
> understanding is:
>
> *Broker sends messages as-is (no serialization cost) -> Network transfer ->
> Consumer deserializes messages (user-side deserialization cost) -> User
> space drops or uses messages (user-side filtering cost)*
>
>
> If I could drop messages based on their headers, without serializing and
> deserializing them, it would help us save network bandwidth as well as
> consumer-side CPU cost.
>
> My approach is to build a header index. Consumer clients will define
> their filter in the fetch call. If the filter matches, the broker will
> send the messages. I would like to hear your suggestions about my solution.
>
> Thanks
>
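
On the API shape you sketch above: I'd expect the filter to travel with the
fetch/subscription as opaque header-key -> expected-bytes pairs, so the broker
never has to deserialize anything. Very roughly, and with every name below
invented for illustration (no such API exists today):

    import java.nio.charset.StandardCharsets;
    import java.util.Collections;
    import java.util.Map;

    public class HeaderFilterFetchSketch {
        public static void main(String[] args) {
            // The filter is nothing more than raw header bytes to match against.
            Map<String, byte[]> headerFilter = Collections.singletonMap(
                    "log-type", "traffic".getBytes(StandardCharsets.UTF_8)); // hypothetical header

            // Hypothetical API -- does not exist in Kafka today:
            //   consumer.subscribe(topics, FetchFilter.headerEquals(headerFilter));
            //
            // The broker would evaluate the same byte-equality check shown in
            // the plug-in sketch earlier and drop non-matching records before
            // they ever hit the network.
            System.out.println("filter on headers: " + headerFilter.keySet());
        }
    }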
