vanzin commented on a change in pull request #24305: [SPARK-27294][SS] Add 
multi-cluster Kafka delegation token
URL: https://github.com/apache/spark/pull/24305#discussion_r279086457
 
 

 ##########
 File path: docs/structured-streaming-kafka-integration.md
 ##########
 @@ -703,11 +703,98 @@ Kafka broker configuration):
 
 After obtaining delegation token successfully, Spark distributes it across 
nodes and renews it accordingly.
 Delegation token uses `SCRAM` login module for authentication and because of 
that the appropriate
-`spark.kafka.sasl.token.mechanism` (default: `SCRAM-SHA-512`) has to be 
configured. Also, this parameter
+`spark.kafka.clusters.${cluster}.sasl.token.mechanism` (default: 
`SCRAM-SHA-512`) has to be configured. Also, this parameter
 must match with Kafka broker configuration.
 
 When delegation token is available on an executor it can be overridden with 
JAAS login configuration.
 
+#### Configuration
+
+Delegation tokens can be obtained from multiple clusters and 
<code>${cluster}</code> is an arbitrary unique identifier which helps to group 
different configurations.
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+  <tr>
+    <td><code>spark.kafka.clusters.${cluster}.bootstrap.servers</code></td>
+    <td>None</td>
+    <td>
+      A list of coma separated host/port pairs to use for establishing the 
initial connection
+      to the Kafka cluster. For further details please see kafka 
documentation. Only used to obtain delegation token.
+    </td>
+  </tr>
+  <tr>
+    
<td><code>spark.kafka.clusters.${cluster}.target.bootstrap.servers.regex</code></td>
+    <td>.*</td>
+    <td>
+      When delegation token obtained it can be used on sources/sinks. If any 
of the sources/sinks <code>bootstrap.servers</code> configuration
 
 Review comment:
   Wording is a bit weird. Suggestion:
   
   ```
   Regular expression to match against the <code>bootstrap.servers</code> 
config for sources and sinks in the application. If a server address matches 
this regex, the delegation token obtained from the respective bootstrap servers 
will be used when connecting. If multiple clusters match the address, an 
exception will be thrown and the query won't be started.
   ```
   
   Also, it would be good to clarify how to configure things, e.g. when 
connecting to one unsecured and one secured Kafka cluster.
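   
   To make the matching semantics in the suggested wording concrete, here is a 
rough sketch in Python. It is not Spark's implementation: 
`find_matching_cluster`, the cluster names and the host names are hypothetical, 
and it assumes that every address in the comma-separated 
<code>bootstrap.servers</code> list must fully match the regex.
   
   ```python
   import re
   from typing import Dict, Optional


   def find_matching_cluster(cluster_regexes: Dict[str, str],
                             bootstrap_servers: str) -> Optional[str]:
       """Return the cluster id whose regex matches the given servers.

       cluster_regexes maps a hypothetical ${cluster} id to its
       target.bootstrap.servers.regex value.
       """
       # Split the source/sink setting into individual host:port entries.
       servers = [s.strip() for s in bootstrap_servers.split(",") if s.strip()]

       # Assumption: a cluster matches only if every address fully matches.
       matches = [
           cluster
           for cluster, regex in cluster_regexes.items()
           if servers and all(re.fullmatch(regex, s) for s in servers)
       ]

       # More than one matching cluster is ambiguous, so fail early.
       if len(matches) > 1:
           raise ValueError(
               "Multiple clusters match %r: %s"
               % (bootstrap_servers, sorted(matches)))

       # No match means no delegation token is used for this source/sink.
       return matches[0] if matches else None
   ```
   
   Under these assumptions, a secured cluster with the default `.*` regex would 
also match the unsecured cluster's addresses, so the mixed secured/unsecured 
case probably needs a narrower regex on the secured cluster. That is the kind of 
detail the docs should spell out.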
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
