[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786242#comment-16786242
 ] 

Yiming Zang commented on KAFKA-6020:
------------------------------------

Any updates on this?

We have similar needs on our side and strongly support this idea of 
broker-side filtering. 

Our use case comes from N-DC replication. Imagine you have 5 data centers 
and need to replicate data everywhere: you typically have to run 
N*(N-1) = 20 mirror-maker jobs in order to replicate the messages in each 
local data center to all remote data centers. Each mirror maker has to 
read all 5 copies of the events, do some processing, and replicate only one 
fifth of them. This is a huge waste of network bandwidth and CPU 
resources. If we had a way to pre-filter the events on the broker side, mirror 
maker would no longer need to read all 5 copies of the events, which would be 
a huge saving as we add even more data centers in the future.
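
To make the arithmetic above concrete, here is a rough sketch. The function 
names are made up for illustration, and it assumes every data center 
contributes an equal share of the events, which may not hold in practice.

```python
def mirror_maker_jobs(n_dcs: int) -> int:
    """Jobs needed when every data center replicates to every other one."""
    return n_dcs * (n_dcs - 1)


def useful_read_fraction(n_dcs: int) -> float:
    """Share of the consumed events one job actually replicates.

    Assumption: events are spread evenly across the data centers, so a job
    that reads all copies keeps only the 1/N that originated locally.
    """
    return 1 / n_dcs


n = 5
jobs = mirror_maker_jobs(n)            # 20 jobs for 5 data centers
useful = useful_read_fraction(n)       # one fifth of what is read
wasted = 1 - useful                    # the rest is read and thrown away
```

Under that assumption the wasted share is (N-1)/N, so it grows toward 100% 
as data centers are added.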

> Broker side filtering
> ---------------------
>
>                 Key: KAFKA-6020
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6020
>             Project: Kafka
>          Issue Type: New Feature
>          Components: consumer
>            Reporter: Pavel Micka
>            Priority: Major
>              Labels: needs-kip
>
> Currently, it is not possible to filter messages on the broker side. Filtering 
> messages on the broker side is convenient for filters with very low selectivity 
> (one message in a few thousand). In my case it means transferring several GB of 
> data to the consumer, throwing it away, taking one message and doing it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).
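
The key-prefix comparison proposed in the quoted issue could look roughly 
like the sketch below. This is not Kafka code or an existing broker API; the 
names `key_has_prefix` and `filter_by_key_prefix` and the (key, value) tuple 
layout are assumptions made for illustration only. It just shows that the 
check works on raw bytes without deserializing the key.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical record shape: (raw key bytes or None, raw value bytes).
Record = Tuple[Optional[bytes], bytes]


def key_has_prefix(key: Optional[bytes], prefix: bytes) -> bool:
    """Plain byte-wise prefix compare; the key is never deserialized."""
    return key is not None and key.startswith(prefix)


def filter_by_key_prefix(records: Iterable[Record], prefix: bytes) -> Iterator[Record]:
    """Yield only the records whose raw key starts with the given prefix."""
    for key, value in records:
        if key_has_prefix(key, prefix):
            yield key, value


records = [
    (b"dc1:order-1", b"payload-a"),
    (b"dc2:order-2", b"payload-b"),
    (None, b"payload-without-key"),
]
local_only = list(filter_by_key_prefix(records, b"dc1:"))
```

Records without a key are treated as non-matching here; that is a choice of 
this sketch, not something the issue specifies.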



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
