eolivelli opened a new pull request #9448: URL: https://github.com/apache/pulsar/pull/9448
### Motivation

Currently KafkaSource can only handle strings and byte arrays; it does not support records with a Schema. In Kafka, messages can be encoded using Avro, and there is a Schema Registry (by Confluent®).

### Modifications

Summary of changes:
- add new `org.apache.pulsar.io.kafka.KafkaAvroRecordSource` that reads from Kafka using `io.confluent.kafka.serializers.KafkaAvroDeserializer` and produces GenericRecords to the Pulsar topic
- this source supports Schema Evolution end-to-end (i.e. add fields to the original schema in the Kafka world, and see the new fields in the Pulsar topic, without any reconfiguration or restart)
- add Confluent® Schema Registry Client to the Kafka Connector NAR; the license is compatible with the Apache 2 license and we can redistribute it
- the configuration of the Schema Registry Client is done in the `consumerProperties` property of the source (usually you add `schema.registry.url`)
- add integration tests with Kafka and Schema Registry

It also adds a few enhancements to the Pulsar IO runtime:
- allow PulsarSink to deal with `org.apache.pulsar.client.api.schema.GenericRecord`: it must not enforce an empty schema on the topic while starting the source
- allow AvroWriter to deal with GenericRecords that are not a subclass of AvroGenericRecord

This patch includes the changes from https://github.com/apache/pulsar/pull/9396, which is to be committed as a prerequisite.

### Verifying this change

The patch introduces new unit tests and integration tests. The integration tests launch a Kafka container and a Confluent Schema Registry container.

### Documentation

I will be happy to provide documentation once this patch is committed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
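
To illustrate the Schema Registry configuration mentioned under Modifications, here is a minimal sketch of how a source configuration map might be assembled. Only `consumerProperties` and `schema.registry.url` come from the description above; the class name, the helper methods, and the top-level keys `bootstrapServers`, `groupId` and `topic` are assumptions made for illustration and may not match the connector's actual configuration fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: it only builds plain maps
// and does not depend on Pulsar or Kafka classes.
class KafkaAvroSourceConfigSketch {

    // Properties passed through to the Kafka consumer; the Schema Registry
    // client is configured here, as the PR description states.
    static Map<String, Object> consumerProperties(String schemaRegistryUrl) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("schema.registry.url", schemaRegistryUrl);
        return props;
    }

    // Top-level source configuration. The keys other than
    // "consumerProperties" are assumed names, used only for illustration.
    static Map<String, Object> sourceConfig(String bootstrapServers,
                                            String groupId,
                                            String topic,
                                            String schemaRegistryUrl) {
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("bootstrapServers", bootstrapServers);
        config.put("groupId", groupId);
        config.put("topic", topic);
        config.put("consumerProperties", consumerProperties(schemaRegistryUrl));
        return config;
    }

    public static void main(String[] args) {
        // Example values; the host names and ports are placeholders.
        Map<String, Object> config = sourceConfig(
                "localhost:9092", "pulsar-io-group", "avro-topic",
                "http://localhost:8081");
        config.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Nesting the registry settings under `consumerProperties` keeps them alongside the other Kafka consumer settings, which matches how the description says the Schema Registry Client is configured.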
