Claudia,

This sounds like an issue that I and other users have reported recently: https://issues.apache.org/jira/browse/KAFKA-8335. It looks like a fix was merged yesterday.
Michael

On Tue, May 14, 2019 at 4:44 AM Claudia Wegmann <c.wegm...@kasasi.de> wrote:

> Dear Kafka users,
>
> I run a Kafka cluster (version 2.1.1) with 6 brokers to process ~100
> messages per second with a number of Kafka Streams apps. There are
> currently 53 topics with 30 partitions each. I have exactly-once processing
> enabled. My problem is that the __consumer_offsets topic is growing
> indefinitely. The __consumer_offsets topic has the default 50 partitions,
> although some of them are not filled yet. The commit interval is also at its
> default value of 100 ms for exactly-once stream processing. All the streams
> apps are running and processing data continuously.
> I therefore did some digging:
>
> 1.) I'm not hitting https://issues.apache.org/jira/browse/KAFKA-8335.
> I checked the content of the __consumer_offsets topic. The last messages
> retained are from a week ago and do not seem to be solely related to
> transactions.
>
> 2.) The LogCleaner thread is running on all brokers. According to its logs,
> __consumer_offsets partitions are cleaned up very rarely. E.g., on one of
> the brokers there are log entries for one partition a day. On 3 of the last
> 7 days, no cleaning of __consumer_offsets happened at all on this
> broker, even though it holds 25 partitions of the __consumer_offsets topic.
>
> 3.) Even on the rare occasions when a __consumer_offsets partition is
> cleaned, its size is only reduced from ~16 GB to ~13 GB.
> I would have expected more cleaning.
>
> 4.) The previous two points led to a total of ~750 GB of disk space being
> occupied by the __consumer_offsets topic. This considerably slows down
> broker startup and so on.
>
> 5.) The cleanup of the __transaction_state topic does seem to work smoothly.
> There, each partition is cleaned ~once per hour and therefore does not
> grow above ~100 MB. In total, the __transaction_state topic occupies ~8 GB of
> disk space.
>
> 6.) Other topics occupy ~3 GB of disk space. They, too, get cleaned up
> regularly, no matter the cleanup policy.
>
> So, the questions are:
>
> a) For some streams apps I have a lot of instances running. Does this
> impact the number of messages in the __consumer_offsets topic? From my
> understanding that should not make a difference, because the offsets are
> stored per consumer group and partition. Is this correct?
>
> b) How can I ensure that the LogCleaner regularly cleans up
> __consumer_offsets partitions? What is special about this topic in regard
> to cleanup?
>
> c) I set segment.bytes for the __consumer_offsets topic to 1 GB. Does the
> LogCleaner work more efficiently with a lot of smaller files?
>
> Thanks for your help.
>
> Best,
> Claudia
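
On question (a), here is a minimal sketch in Python of why the number of instances should not matter once compaction actually runs. It is a toy model, not Kafka's implementation: the helper names (simulate_commits, compact) and the group and topic names are made up for illustration, and it assumes only that offset commits are keyed by (group, topic, partition) and that compaction keeps the latest record per key.

```python
def simulate_commits(group, topic, partitions, rounds):
    """Build an uncompacted log of offset commits.

    Each round appends one commit per partition. The key carries no
    instance or member id, so the number of app instances does not
    change the set of keys. (Hypothetical helper for illustration.)
    """
    log = []
    for committed_offset in range(1, rounds + 1):
        for partition in range(partitions):
            key = (group, topic, partition)
            log.append((key, committed_offset))
    return log


def compact(log):
    """Keep only the most recent record per key, in log order.

    Simplified model of compaction: it ignores segments, tombstones,
    and transaction markers.
    """
    latest = {}
    for position, (key, value) in enumerate(log):
        latest[key] = (position, value)
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (position, value) in survivors]


if __name__ == "__main__":
    log = simulate_commits("app-1", "events", partitions=30, rounds=100)
    compacted = compact(log)
    print(len(log), "records before compaction")
    print(len(compacted), "records after compaction")
```

In this model the compacted size is bounded by the number of distinct keys (30 here), however many commits were written. The real cleaner differs in ways that bear on (b) and (c): it works on whole segments, never cleans the active segment, and only picks up a partition once its dirty ratio exceeds min.cleanable.dirty.ratio.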