Hi Navneeth,

Your configuration looks correct to me.
If you observe that compaction is not cleaning up old records, it could be due to either slow compaction or incorrect configuration. Here are a few things I would check.

First, validate that the log cleaner is running. There are multiple ways to do that:
- Option #1: Check whether a thread with the name "kafka-log-cleaner-thread-" is running. You can use a utility such as jstack or jconsole to inspect the status of running threads, or you can take a thread dump using kill -3 <pid>. You should observe N threads with the prefix "kafka-log-cleaner-thread-", where N is the value of the configuration log.cleaner.threads.
- Option #2: Check the value of the metric kafka.log:type=LogCleanerManager,name=time-since-last-run-ms to observe the last time the cleaner ran. After that, check the value of the metric max-clean-time-secs to verify that the run did not end immediately.
- Option #3: Check the value of the metric kafka.log:type=LogCleaner,name=DeadThreadCount to see whether any cleaner threads are dead. Ideally this should be 0.

Second, validate that the partition you are interested in is being cleaned.
- Option #1: Validate that kafka.log:type=LogCleanerManager,name=uncleanable-partitions-count is not > 0. If it is, the partition you are trying to clean may have been marked uncleanable. This can occur due to unexpected exceptions, which can be found in the logs by searching for the WARN-level message prefix "Unexpected exception thrown when cleaning log".
- Option #2: Validate that the records are actually eligible for compaction by checking the values of log.cleaner.min.compaction.lag.ms and log.cleaner.max.compaction.lag.ms.

Third, if the cleaner is running, it might be slow in cleaning up data. This could be due to resource constraints (such as the IO throttle limit, the dedupe buffer allocation, etc.) that can be fixed by changing the configuration. Some metrics to look out for:
- kafka.log:type=LogCleaner,name=max-buffer-utilization-percent should not be 100%.
If it is, then it's a sign that you need to increase the value of the configuration log.cleaner.dedupe.buffer.size.

For MSK, you can access these metrics if you have set up open monitoring (https://docs.aws.amazon.com/msk/latest/developerguide/open-monitoring.html), and you should contact AWS support for access to the log cleaner application log files.

--
Divij Vaidya

On Thu, Oct 27, 2022 at 11:31 PM Navneeth Krishnan <reachnavnee...@gmail.com> wrote:
> Hi All,
>
> We are using AWS MSK with kafka version 2.6.1. There is a compacted topic
> with the below configurations. After reading the documentation my
> understanding was that null values in the topic can be removed using delete
> retention time but I can see months old keys having null values. Is there
> any other configuration that needs to be changed for removing null values
> from a compacted topic? Thanks!
>
> cleanup.policy=compact
> segment.bytes=1073741824
> min.cleanable.dirty.ratio=0.1
> delete.retention.ms=3600000
> segment.ms=3600000
>
> Regards,
> Navneeth
>