[ 
https://issues.apache.org/jira/browse/BEAM-11806?focusedWorklogId=551877&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-551877
 ]

ASF GitHub Bot logged work on BEAM-11806:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Feb/21 15:52
            Start Date: 12/Feb/21 15:52
    Worklog Time Spent: 10m 
      Work Description: aromanenko-dev edited a comment on pull request #13975:
URL: https://github.com/apache/beam/pull/13975#issuecomment-778275873


   Rion, please add your test to `org.apache.beam.sdk.io.kafka.KafkaIOTest`, 
which we use for KafkaIO unit testing in general. I believe it already 
contains some tests for the `ProducerRecord` sink - please take a look at 
`testRecordsSink()`, for example. I think it can simply be adjusted to check 
that the right partition was set on the `ProducerRecord`s.
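   
   As a rough, hypothetical sketch (not the actual `testRecordsSink()` code in 
`KafkaIOTest`), the kind of check being suggested can be illustrated with the 
kafka-clients `MockProducer`, which captures sent records so their partition 
can be asserted on. The topic name, key/value types, and partition number here 
are placeholders:

{code:java}
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.MockProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.IntegerSerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.junit.Assert;
import org.junit.Test;

public class ProducerRecordPartitionTest {

  @Test
  public void testExplicitPartitionIsPreserved()
      throws ExecutionException, InterruptedException {
    // MockProducer records everything sent to it, so no real broker is needed.
    MockProducer<Integer, String> producer =
        new MockProducer<>(true, new IntegerSerializer(), new StringSerializer());

    // Send a record with an explicitly chosen partition (2 here).
    producer.send(new ProducerRecord<>("test-topic", 2, 1, "value-1")).get();

    // The captured record should carry the same explicit partition.
    ProducerRecord<Integer, String> sent = producer.history().get(0);
    Assert.assertEquals(Integer.valueOf(2), sent.partition());
  }
}
{code}

   In the real test the captured records would of course come from the 
`WriteRecords` sink rather than from a direct `send()` call.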
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 551877)
    Time Spent: 50m  (was: 40m)

> KafkaIO - Partition Recognition in WriteRecords
> -----------------------------------------------
>
>                 Key: BEAM-11806
>                 URL: https://issues.apache.org/jira/browse/BEAM-11806
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-kafka
>            Reporter: Rion Williams
>            Assignee: Rion Williams
>            Priority: P2
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> At present, the `WriteRecords` transform in KafkaIO does not recognize the 
> `partition` property defined on the `ProducerRecord` instances it consumes. 
> This ticket would add support so that any explicitly defined partition is 
> honored, while still falling back to the default behavior when no partition 
> is specified.
> This can be seen in the `KafkaWriter` class used behind the scenes by the 
> `WriteRecords` transform:
> {code:java}
> producer.send(
>         // The null argument in the following constructor is the partition
>         new ProducerRecord<>(
>             topicName, null, timestampMillis, record.key(), record.value(), record.headers()),
>         new SendCallback());
> {code}
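> As a rough sketch only (not necessarily the exact change made for this 
> ticket), one way to honor the partition would be to pass the incoming 
> record's partition through instead of hard-coding `null`; when 
> `record.partition()` is `null`, the producer's default partitioner still 
> applies:
> {code:java}
> producer.send(
>         new ProducerRecord<>(
>             topicName,
>             record.partition(),  // null when no partition was set explicitly
>             timestampMillis,
>             record.key(),
>             record.value(),
>             record.headers()),
>         new SendCallback());
> {code}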
> Because of this limitation, a user who wants an explicitly defined 
> partitioning strategy, as opposed to round-robin, has to create a custom 
> DoFn that manages its own KafkaProducer (preferably created in a 
> @StartBundle method), similar to the following approach (in Kotlin):
> {code:java}
> private class ExampleProducerDoFn(...) : DoFn<...>() {
>     private lateinit var producer: KafkaProducer<...>
>
>     @StartBundle
>     fun startBundle(context: StartBundleContext) {
>         val options = context.pipelineOptions.`as`(YourPipelineOptions::class.java)
>         producer = getKafkaProducer(options)
>     }
>
>     @ProcessElement
>     fun processElement(context: ProcessContext) {
>         // Omitted for brevity
>
>         // Produce the record to a specific topic at a specific partition
>         producer.send(ProducerRecord(
>             "your_topic_here",
>             your_partition_here,
>             context.element().kv.key,
>             context.element().kv.value
>         ))
>     }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
