Hi Kafka Devs & Users,

We recently had an issue where we processed a lot of old data and crashed our brokers due to too many memory-mapped files. It seems to me that the nature of Kafka / Kafka Streams is a bit suboptimal in terms of resource management. (All segment files are kept open all the time; maybe something should manage this more on demand?)
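To give a feel for the scale (illustrative numbers, not our exact figures): as far as I understand the code, each log segment keeps its log file open and memory-maps its offset and time indexes, so roughly two maps per segment. A repartition topic with 50 partitions that backs up 100 GiB per partition at the default 1 GiB segment.bytes then already accounts for about 50 x 100 x 2 = 10,000 maps, on top of everything else the broker holds open, which is how a limit like vm.max_map_count gets exhausted.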
In the issue I described, the repartition topic was produced very fast but not consumed, causing a lot of segments and files to be open at the same time. I have worked around the issue by making sure I have more threads than partitions, which forces tasks to subscribe to internal topics only (a sketch of that configuration is in the PS below). This still feels a bit hacky, though, and if this behaviour is considered part of the design, perhaps the documentation should offer some guidance.

After quite some testing and digging through the code, it seems that the root of this imbalance lies in how the broker multiplexes the consumed topic-partitions. I have attached a slide deck that I will present to my team to explain the issue in more detail; it might be worth checking out to understand the context.

Any thoughts on my findings and concerns?

Kind regards,
Niklas
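PS: For reference, here is the workaround expressed as Streams configuration. This is a minimal sketch; the application id, bootstrap address, and thread count are placeholders rather than our real values:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class StreamsWorkaroundConfig {
        public static Properties build() {
            Properties props = new Properties();
            // Placeholder values -- substitute your own.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
            // Workaround: configure more stream threads than input partitions,
            // so each thread ends up with at most one task and a single consumer
            // is not fetching source and repartition topic-partitions together.
            props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 16);
            return props;
        }
    }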