gaborgsomogyi commented on a change in pull request #29729:
URL: https://github.com/apache/spark/pull/29729#discussion_r500109389
##########
File path: docs/ss-migration-guide.md
##########
@@ -26,6 +26,19 @@ Note that this migration guide describes the items specific
to Structured Stream
Many items of SQL migration can be applied when migrating Structured Streaming
to higher versions.
Please refer [Migration Guide: SQL, Datasets and
DataFrame](sql-migration-guide.html).
+## Upgrading from Structured Streaming 3.0 to 3.1
+
+- In Spark 3.0 and below, secure Kafka processing needed the following ACLs
from driver perspective:
+ * Topic resource describe operation
+ * Topic resource read operation
+ * Group resource read operation
+
+ Since Spark 3.1, offsets are obtained with `AdminClient` instead of
`KafkaConsumer` and now the following ACLs needed from driver perspective:
+ * Topic resource describe operation
+
+ Since `AdminClient` in driver is not connecting to consumer group,
`group.id` based authorization will not work anymore (executors never done
group based authorization).
Review comment:
Many users I've spoken with thought that both the driver and the executors
do group-based authorization, which seems plausible from a distance: in both
cases `KafkaConsumer` is used and `group.id` is set. In reality, executors use
`KafkaConsumer` in a special way:
* Only the `assign` strategy is used
* Auto offset commit is turned off
* Manual offset commit is never called
In this case `KafkaConsumer` does not join any group and does not do any
group-based authorization (this is true with and without this PR).
Not directly relevant but worth mentioning: the ability to either prefix or
override the executor-side `group.id` still has value from the user's
perspective, so it is not modified.
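To make the three points above concrete, here is a minimal sketch of a consumer set up in that style. It is not Spark's actual executor code; the class name, broker address, topic and group names are placeholders chosen for illustration. Only the property map is built so the snippet runs without `kafka-clients` on the classpath, and the `KafkaConsumer` calls appear as comments.

```java
import java.util.Properties;

public class ExecutorStyleConsumerConfig {

    // Builds consumer properties in the style described above (illustrative only).
    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // group.id is still set, but the consumer never joins the group.
        props.setProperty("group.id", groupId);
        // Auto offset commit is turned off.
        props.setProperty("enable.auto.commit", "false");
        props.setProperty("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.setProperty("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group name.
        Properties props = build("localhost:9092", "example-group");

        // With kafka-clients available, the consumer would then be used roughly like:
        //   KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        //   // `assign` only: partitions are set explicitly, subscribe() is never called.
        //   consumer.assign(Collections.singletonList(
        //       new TopicPartition("example-topic", 0)));
        //   consumer.poll(Duration.ofMillis(100));
        //   // Manual commit is never called: no commitSync() / commitAsync().

        System.out.println("enable.auto.commit=" + props.getProperty("enable.auto.commit"));
    }
}
```

Because partitions are assigned explicitly and no offsets are committed, a consumer used this way has no reason to contact the group coordinator, which matches the behavior described above.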