2022-09-13 11:09:18 UTC - Nishant Hooda: #general We have switched to AWS MSK for communication between the controller and the invokers. It was working fine until we enabled IAM auth for MSK. With IAM auth enabled, the invokers keep cycling Unhealthy/Offline -> Healthy -> Unhealthy/Offline, with lots of these errors:
`[ERROR] [#tid_sid_dispatcher] [MessageFeed] failed to commit activation consumer offset: org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.` Before enabling IAM auth we saw these errors less often; now they are more frequent. Some activations still go through, so we know the auth configuration is working. How do we resolve this? https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1663067358702399 ----
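For reference, the two settings named in the error message are standard Kafka consumer properties. Below is a minimal sketch of raising `max.poll.interval.ms` and lowering `max.poll.records`, using only `java.util.Properties`. The class name, method name, and the values 600000 and 50 are illustrative assumptions, not recommendations from this thread, and how these properties get passed through to the OpenWhisk invoker's consumer depends on the deployment and is not shown here.

```java
import java.util.Properties;

public class ConsumerTuningSketch {

    // Hypothetical helper: builds consumer properties with the two knobs
    // the CommitFailedException message points at.
    static Properties tunedConsumerProps(int maxPollIntervalMs, int maxPollRecords) {
        Properties props = new Properties();
        // Maximum allowed delay between poll() calls before the consumer
        // is considered failed and the group rebalances (Kafka default: 300000).
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        // Maximum number of records returned by a single poll() (Kafka default: 500).
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        // Assumed example values: a longer poll interval and smaller batches.
        Properties props = tunedConsumerProps(600000, 50);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Whether tuning these is enough here is an open question: the errors became more frequent only after IAM auth was enabled, so the added time may be in connection or authentication setup rather than in message processing.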