Hi Everyone,

Ever since we upgraded from 0.9.0.1 to 0.10.0, our five-node Kafka cluster has been unstable. Before the upgrade, a 6GB heap worked fine; afterwards, all five brokers crashed with out-of-memory errors within an hour. I increased the heap to 10GB, which fixed the OOM errors, but now GC pauses appear to be preventing the cluster from keeping more than one replica in the ISR. I realize I could raise the replica lag settings to improve the ISR numbers, but that would be treating the symptom rather than the root problem.

There appears to have been a change in memory requirements somewhere in the Kafka stack. It could be on the producer side as well, but I want to rule out any configuration issues on the broker side. Is anyone aware of any 0.9 defaults in particular that I should change for 0.10.x to address the root cause of these observations?

--John