Hi folks, I am taking KafkaIO as an example for the PulsarIO connector, during the development of the new IO, I got some questions on KafkaIO implementation. I was wondering if anyone has some experience with KafkaIO SDF implementation that might help me.
- What was taken into consideration to implement the KafkaSourceDescriptor <https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaSourceDescriptor.java> which is used as input for the SDF in Kafka? - In the ReadFromKafkaDoFn <https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/ReadFromKafkaDoFn.java> class, you have to implement a getSize in order to estimate how much work it will take. What approach do you take in order to get an estimate with an unbounded approach like kafka? - For the SDF implementation, I suppose it will need a Source Interface implementation <https://beam.apache.org/documentation/io/developing-io-java/#implementing-the-source-interface> and a Reader subclass <https://beam.apache.org/documentation/io/developing-io-java/#implementing-the-reader-subclass>? The documentation is kind of confusing in that part when you are working with SDF, Should it be treated as Unbounded for the source/reading part? Thanks in advance -- *Marco Robles* *|* WIZELINE Software Engineer [email protected] Amado Nervo 2200, Esfera P6, Col. Ciudad del Sol, 45050 Zapopan, Jal. -- *This email and its contents (including any attachments) are being sent to you on the condition of confidentiality and may be protected by legal privilege. Access to this email by anyone other than the intended recipient is unauthorized. If you are not the intended recipient, please immediately notify the sender by replying to this message and delete the material immediately from your system. Any further use, dissemination, distribution or reproduction of this email is strictly prohibited. Further, no representation is made with respect to any content contained in this email.*
