[ https://issues.apache.org/jira/browse/KAFKA-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navinder Brar updated KAFKA-6643:
---------------------------------
    Description: 
In the current design, Kafka Streams uses changelog Kafka topics (internal 
topics holding all the data for a store) to build the state of replicas. So, 
if we keep the number of standby replicas at 1, we get better availability for 
persistent state stores, since changelog Kafka topics are also replicated 
according to the broker replication policy. But that also means we are using 
at least 4 times the space (1 active store, 1 standby store, 1 changelog, 
1 changelog replica).
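
For reference, a minimal sketch of the Streams configuration that produces 
this layout; the application id and bootstrap servers below are hypothetical:

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
// Hypothetical application id and broker address.
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
// One standby copy of each state store, kept warm on another instance.
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
// Internal (changelog/repartition) topics replicated once on the brokers.
props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 2);
{code}

With these settings each record lives in roughly 4 places: the active store, 
the standby store, the changelog leader, and the changelog follower.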

Now, if we have a year's worth of data in the persistent stores (RocksDB), we 
don't want the changelog topics to hold a year's worth of data as well, since 
that puts an unnecessary burden on the brokers in terms of space. If we have 
to scale a Kafka Streams application holding 200-300 TB of data, we have to 
scale the Kafka brokers as well. We want to reduce this dependency and find 
ways to use the changelog topic only as a queue, holding just 2 or 3 days of 
data, and warm up new replicas from scratch in some other way.
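
As a sketch of the "changelog as a queue" idea, the per-store changelog topic 
configs can already be overridden when materializing a store (store name and 
serdes below are hypothetical). Note that with delete-based retention a 
restore from the changelog alone can no longer rebuild the full state, which 
is exactly why the warm-up mechanisms proposed below are needed:

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

// Override the changelog topic so it keeps only ~3 days of data
// instead of a compacted full copy of the store.
Map<String, String> changelogConfig = new HashMap<>();
changelogConfig.put("cleanup.policy", "delete");
changelogConfig.put("retention.ms", String.valueOf(3L * 24 * 60 * 60 * 1000));

Materialized<String, Long, KeyValueStore<Bytes, byte[]>> store =
    Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("my-store")
        .withKeySerde(Serdes.String())
        .withValueSerde(Serdes.Long())
        .withLoggingEnabled(changelogConfig);
{code}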

I have a few proposals in that respect:
1. Use a new Kafka topic per partition that needs to be warmed up on the 
fly (i.e., when the node holding that partition crashes). Produce into this 
topic from another replica/active task and build the new replica from this 
topic.
2. Use peer-to-peer file transfer: RocksDB can create backups, which can be 
transferred from a source node to the destination node when a new replica has 
to be built from scratch (see the sketch after this list).
3. Use HDFS as an intermediate store instead of a Kafka topic: keep scheduled 
backups for each partition there and use those to build new replicas.
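
For proposal 2, a rough sketch of what the backup/restore side could look 
like with RocksDB's backup engine (class names as in recent RocksJava 
releases; older versions use BackupableDBOptions instead of 
BackupEngineOptions). Directory paths are hypothetical, and the actual file 
transfer between nodes is left out:

{code:java}
import org.rocksdb.BackupEngine;
import org.rocksdb.BackupEngineOptions;
import org.rocksdb.Env;
import org.rocksdb.RestoreOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class ReplicaWarmupSketch {

    static {
        RocksDB.loadLibrary();
    }

    // Source node: take an incremental backup of the live store. The backup
    // directory would then be shipped to the destination node by whatever
    // transfer mechanism is chosen.
    static void backup(RocksDB db, String backupDir) throws RocksDBException {
        try (BackupEngineOptions opts = new BackupEngineOptions(backupDir);
             BackupEngine engine = BackupEngine.open(Env.getDefault(), opts)) {
            engine.createNewBackup(db, true /* flush memtables before backup */);
        }
    }

    // Destination node: restore the latest backup into the local store
    // directory, then replay the short changelog tail to catch up.
    static void restore(String backupDir, String dbDir) throws RocksDBException {
        try (BackupEngineOptions opts = new BackupEngineOptions(backupDir);
             BackupEngine engine = BackupEngine.open(Env.getDefault(), opts);
             RestoreOptions restoreOpts = new RestoreOptions(false)) {
            engine.restoreDbFromLatestBackup(dbDir, dbDir, restoreOpts);
        }
    }
}
{code}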


> Warm up new replicas from scratch when changelog topic has LIMITED retention 
> time
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-6643
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6643
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>            Reporter: Navinder Brar
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
