redsk commented on a change in pull request #26153: [SPARK-29500][SQL][SS] Support partition column when writing to Kafka URL: https://github.com/apache/spark/pull/26153#discussion_r336453652
########## File path: docs/structured-streaming-kafka-integration.md ##########

@@ -622,6 +626,10 @@
 a ```null``` valued key column will be automatically added (see Kafka semantics how ```null``` valued key values are handled). If a topic column exists then its value is used as the topic when writing the given row to Kafka, unless the "topic" configuration option is set i.e., the "topic" configuration option overrides the topic column.
+If a partition column is not specified then the partition is calculated by the Kafka producer
+(using ```org.apache.kafka.clients.producer.internals.DefaultPartitioner```).
+This can be overridden in Spark by setting the ```kafka.partitioner.class``` option.

Review comment:
   Yes, exactly. But this is standard `KafkaProducer` behaviour:
   - it uses the `ProducerRecord` partition field; if that is `null`, it falls back to
   - the partitioner configured via `kafka.partitioner.class`; if that is not set,
   - it uses the default partitioner.

   I don't believe we need a test for this (otherwise we would be testing the Kafka API), but maybe we should state it explicitly in the doc.
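The fallback order described in the comment can be sketched roughly as follows. This is an illustrative Python sketch of the resolution logic, not Kafka's actual implementation; the function and parameter names are hypothetical, and the real logic lives inside the Kafka producer client.

```python
def resolve_partition(record_partition, configured_partitioner, default_partitioner):
    """Sketch of KafkaProducer's partition-resolution order (illustrative only).

    1. Use the ProducerRecord's explicit partition field if present.
    2. Otherwise, use a partitioner configured via kafka.partitioner.class.
    3. Otherwise, fall back to the default partitioner.
    """
    if record_partition is not None:
        return record_partition
    partitioner = configured_partitioner if configured_partitioner is not None else default_partitioner
    return partitioner()

# Stand-ins for the real partitioner implementations (hypothetical):
default = lambda: 0   # plays the role of DefaultPartitioner
custom = lambda: 7    # plays the role of a user-supplied kafka.partitioner.class

print(resolve_partition(3, custom, default))     # 3: explicit partition field wins
print(resolve_partition(None, custom, default))  # 7: configured partitioner
print(resolve_partition(None, None, default))    # 0: default partitioner
```

The point of the sketch is only the precedence: an explicit partition column (which Spark maps to the `ProducerRecord` partition field) overrides any configured or default partitioner.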
