redsk commented on a change in pull request #26153: [SPARK-29500][SQL][SS] Support partition column when writing to Kafka URL: https://github.com/apache/spark/pull/26153#discussion_r336453652
########## File path: docs/structured-streaming-kafka-integration.md ##########

@@ -622,6 +626,10 @@
 a ```null``` valued key column will be automatically added (see Kafka semantics how ```null``` valued key values are handled). If a topic column exists then its value is used as the topic when writing the given row to Kafka, unless the "topic" configuration option is set i.e., the "topic" configuration option overrides the topic column.
+If a partition column is not specified then the partition is calculated by the Kafka producer
+(using ```org.apache.kafka.clients.producer.internals.DefaultPartitioner```).
+This can be overridden in Spark by setting the ```kafka.partitioner.class``` option.

Review comment:
   Yes, exactly. But this is standard `KafkaProducer` behaviour:
   - it uses the `ProducerRecord` partition field; if that is `null`, it falls back to
   - the partitioner configured via `kafka.partitioner.class`; if that is not set,
   - it uses the default partitioner.

   I don't believe we need a test for this (otherwise we would be testing the Kafka API), but maybe we should state it explicitly in the doc.
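The fallback order described in the comment can be sketched roughly as follows. This is an illustrative Python sketch of the resolution logic, not Kafka's actual implementation; the function and parameter names are hypothetical, and the real logic lives inside the Kafka producer client.

```python
def resolve_partition(record_partition, configured_partitioner, default_partitioner):
    """Sketch of KafkaProducer's partition-resolution order (illustrative only).

    1. Use the ProducerRecord's explicit partition field if present.
    2. Otherwise, use a partitioner configured via kafka.partitioner.class.
    3. Otherwise, fall back to the default partitioner.
    """
    if record_partition is not None:
        return record_partition
    partitioner = configured_partitioner if configured_partitioner is not None else default_partitioner
    return partitioner()

# Stand-ins for the real partitioner implementations (hypothetical):
default = lambda: 0   # plays the role of DefaultPartitioner
custom = lambda: 7    # plays the role of a user-supplied kafka.partitioner.class

print(resolve_partition(3, custom, default))     # 3: explicit partition field wins
print(resolve_partition(None, custom, default))  # 7: configured partitioner
print(resolve_partition(None, None, default))    # 0: default partitioner
```

The point of the sketch is only the precedence: an explicit partition column (which Spark maps to the `ProducerRecord` partition field) overrides any configured or default partitioner.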
