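Hi all,

I would like to propose flipping the default value of the Kafka offset fetching config (spark.sql.streaming.kafka.useDeprecatedOffsetFetching). The context is as follows:

Before Spark 3.1, there was only one approach to fetching offsets: consumer.poll(0). This has been pointed out as a root cause of hangs, since there is no timeout for the metadata fetch.

In Spark 3.1, we addressed this by introducing a new approach to fetching offsets, via SPARK-32032 <https://issues.apache.org/jira/browse/SPARK-32032>. Since the new approach leverages AdminClient and a consumer group is no longer needed for fetching offsets, the required security ACLs are loosened.

Reference: https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#offset-fetching

There was some concern about the behavioral change in the security model, so we could not make the new approach the default.

Since then, we have observed various Kafka connector issues that came from the old offset fetching (e.g. hangs, issues with rebalancing of the consumer group, etc.), and we fixed many of them simply by flipping the config.

Based on this, I would consider the current default value "incorrect". The security-related behavioral change is unavoidable (affected users can set topic-based ACL rules instead), but most users will benefit. IMHO this is something we can handle with a release/migration note.

I would like to hear your thoughts on this.

Thanks,
Jungtaek Lim (HeartSaVioR)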
Before Spark 3.1, there was only one approach on fetching offset, using consumer.poll(0). This has been pointed out as a root cause for hang since there is no timeout for metadata fetch. In Spark 3.1, we addressed this via introducing a new approach on fetching offset, via SPARK-32032 <https://issues.apache.org/jira/browse/SPARK-32032>. Since the new approach leverages AdminClient and consumer group is no longer needed for fetching offset, required security ACLs are loosen. Reference: https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#offset-fetching There was some concern about behavioral change on the security model hence we couldn't make the new approach by default. During the time, we have observed various Kafka connector related issues which came from old offset fetching (e.g. hang, issues on rebalance on customer group, etc.) and we fixed many of these issues via simply flipping the config. Based on this, I would consider the default value as "incorrect". The security-related behavioral change would be introduced inevitably (they can set topic based ACL rule), but most people will get benefited. IMHO this is something we can deal with release/migration note. Would like to hear the voices on this. Thanks, Jungtaek Lim (HeartSaVioR)