Hi Yi,

Thanks a lot for the reply and the hint. It's more of a Kafka issue; luckily, the Samza experts "happen to" be Kafka experts. :)

I just checked the cleaner log and found that we ran into this issue:
https://issues.apache.org/jira/browse/KAFKA-1641
The log cleaner had been stopped for about a month for the checkpoint topic.

I removed the cleaner-offset-checkpoint file on the corresponding brokers and restarted them so that the log cleaner runs again from the beginning. The checkpoint size dropped to the KB level after the first cleanup, so a healthy log cleaner thread will definitely solve this problem. I will keep monitoring the cleaner log.

The other thing is that the issue is fixed in Kafka 0.9.0.0
<https://issues.apache.org/jira/browse/KAFKA/fixforversion/12328745>,
and I'm really looking forward to Samza support for Kafka 0.9.0. I saw some discussion about this on the mailing list, so I guess I will have to wait for a while.

On 10 May 2016 at 01:24, Yi Pan <nickpa...@gmail.com> wrote:

> Hi, Bo,
>
> I embedded my answers in-between:
>
> On Sun, May 8, 2016 at 9:00 PM, Liu Bo <diabl...@gmail.com> wrote:
>
> > The other thing is log retention is set to 24 hour or 30GB. But seems not
> > working for checkpoint topic. As all the *.log file are there unlike the
> > data topic which only has recent ones.
> >
> >
> When your topic cleanup policy is set to log compact, the time-retention
> policy will not be effective again. Hence, the reduction of checkpoint
> topic size purely depends on the log compaction on the Kafka broker.
>
>
> > I am going to dig further on this (never config compaction before), and
> > your suggestions would be grateful.
> >
> >
> It would be good to check whether your log compact thread on the Kafka
> broker works in a healthy mode, and how often it is triggered.
>
>
> >
> >
> > --
> > All the best
> >
> > Liu Bo
> >
>

--
All the best

Liu Bo
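
P.S. For anyone who wants to look at the cleaner-offset-checkpoint file before removing it, here is a minimal Python sketch that parses its contents. It assumes the plain-text layout used by 0.8.x brokers as I understand it: a version line (0), a line with the entry count, then one "topic partition offset" line per partition. The function name and the sample topic names are hypothetical, so please check the layout against your own broker's file first.

```python
from typing import Dict, Tuple


def parse_cleaner_checkpoint(text: str) -> Dict[Tuple[str, int], int]:
    """Parse the text of a cleaner-offset-checkpoint file.

    Assumed layout (not verified against every Kafka version):
      line 1: format version (expected to be 0)
      line 2: number of entries
      rest:   "<topic> <partition> <offset>", one entry per line
    Returns a mapping of (topic, partition) -> last cleaned offset.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("checkpoint file is too short")

    version = int(lines[0])
    if version != 0:
        raise ValueError(f"unexpected checkpoint version: {version}")

    expected = int(lines[1])
    entries: Dict[Tuple[str, int], int] = {}
    for ln in lines[2:]:
        # Split from the right so the last two fields are always
        # the partition and the offset.
        topic, partition, offset = ln.rsplit(" ", 2)
        entries[(topic, int(partition))] = int(offset)

    if len(entries) != expected:
        raise ValueError(
            f"expected {expected} entries, found {len(entries)}"
        )
    return entries


# Hypothetical sample content, for illustration only.
sample = (
    "0\n"
    "2\n"
    "__samza_checkpoint_ver_1_for_my-job_1 0 1523\n"
    "my-compacted-topic 3 98765\n"
)

if __name__ == "__main__":
    for (topic, partition), offset in parse_cleaner_checkpoint(sample).items():
        print(f"{topic}-{partition}: cleaned up to offset {offset}")
```

If a partition's offset in this file stops advancing between checks, that would suggest the cleaner thread is no longer making progress on it.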