Nimfadora commented on a change in pull request #23791: [SPARK-20597][SQL][SS][WIP] KafkaSourceProvider falls back on path as synonym for topic
URL: https://github.com/apache/spark/pull/23791#discussion_r261634612
 
 

 ##########
 File path: docs/structured-streaming-kafka-integration.md
 ##########
 @@ -457,8 +463,17 @@ The following configurations are optional:
   <td>string</td>
   <td>none</td>
   <td>streaming and batch</td>
+  <td>Sets the topic that all rows will be written to in Kafka. This option overrides
+  ```path``` option and any topic column that may exist in the data.</td>
+</tr>
+<tr>
+  <td>path</td>
+  <td>string</td>
+  <td>none</td>
+  <td>streaming and batch</td>
   <td>Sets the topic that all rows will be written to in Kafka. This option overrides any
-  topic column that may exist in the data.</td>
+  topic column that may exist in the data and is overridden by ```topic``` option.
 
 Review comment:
   I agree that all three of them should be checked. However, the validation is currently split and duplicated between [KafkaWriter#validateQuery](https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriter.scala#L45), [KafkaWriteTask#createProjection](https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala#L101) and [KafkaSourceProvider#resolveTopic](https://github.com/apache/spark/pull/23791/files#diff-eeac5bdf3a1ecd7b9f8aaf10fff37f05R197). We can either refactor and move these validations into one place, or leave them as is and add a validation that compares the topic column with the topic/path options to KafkaSourceProvider#validateQuery. The first option is more complicated and error-prone, but will result in more readable code. The second option requires much less code to be rewritten. Which way do you think is right?
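
   To make the first option more concrete, here is a rough, dependency-free sketch of what a single validation point could look like. The object and method names (`TopicResolution`, `topicFromOptions`, `validate`) and the error messages are hypothetical and are not part of Spark. The precedence (`topic` over `path` over the topic column) follows the documentation text in this diff; the empty-string check is an extra assumption.

```scala
// Hypothetical sketch only: none of these names exist in Spark.
object TopicResolution {

  /** Option-level topic: `topic` takes precedence over `path`, per the docs in this diff. */
  def topicFromOptions(options: Map[String, String]): Option[String] =
    options.get("topic").orElse(options.get("path")).map(_.trim)

  /**
   * Single place that decides where the topic comes from.
   * Right(Some(t)) -> every row is written to topic `t` (option overrides any topic column).
   * Right(None)    -> no option set, so the per-row topic column is used.
   * Left(message)  -> no usable topic source at all.
   */
  def validate(
      options: Map[String, String],
      hasTopicColumn: Boolean): Either[String, Option[String]] =
    topicFromOptions(options) match {
      case Some(t) if t.isEmpty => Left("The 'topic'/'path' option must not be empty")
      case Some(t)              => Right(Some(t))
      case None if hasTopicColumn => Right(None)
      case None =>
        Left("No topic found: set the 'topic' or 'path' option, or add a topic column")
    }
}
```

   With something like this, KafkaWriter and KafkaWriteTask could both call the same method instead of each repeating part of the check, although the actual refactoring would of course have to work against the query schema rather than a boolean flag.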
