eolivelli opened a new pull request #9448:
URL: https://github.com/apache/pulsar/pull/9448


   ### Motivation
   Currently KafkaSource can only handle strings and byte arrays; it 
does not support records with a Schema.
   In Kafka, messages can be encoded with Avro, with their schemas kept in a 
Schema Registry (by Confluent®).
   
   ### Modifications
   
   Summary of changes:
   - add new `org.apache.pulsar.io.kafka.KafkaAvroRecordSource` that reads from 
Kafka using `io.confluent.kafka.serializers.KafkaAvroDeserializer` and produces 
GenericRecords to the Pulsar topic
   - this source supports Schema Evolution end-to-end (i.e. add fields to the 
original schema in the Kafka world, and see the new fields in the Pulsar topic, 
without any reconfiguration or restart)
   - add Confluent® Schema Registry Client to the Kafka Connector NAR; its 
license is compatible with the Apache 2 license, so we can redistribute it
   - the configuration of the Schema Registry Client is done in the 
consumerProperties property of the source (usually you add schema.registry.url)
   - add integration tests with Kafka and Schema Registry
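
As a rough illustration of the configuration bullet above, the sketch below builds a source configuration as nested maps, with the Schema Registry URL placed inside `consumerProperties`. Only `consumerProperties` and `schema.registry.url` come from this description; the class name, the other keys (`bootstrapServers`, `topic`, `groupId`), and all values are assumptions made for the example and should be checked against the connector's actual configuration class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of this patch: it only shows the shape of the config.
public class KafkaAvroSourceConfigSketch {

    // Builds the source configuration as nested maps, mirroring a YAML/JSON config file.
    static Map<String, Object> buildSourceConfig(String schemaRegistryUrl) {
        // Properties passed through to the Kafka consumer and the Schema Registry client.
        Map<String, Object> consumerProperties = new LinkedHashMap<>();
        consumerProperties.put("schema.registry.url", schemaRegistryUrl);

        Map<String, Object> config = new LinkedHashMap<>();
        // The three keys below are assumed names with placeholder values.
        config.put("bootstrapServers", "localhost:9092");
        config.put("topic", "my-avro-topic");
        config.put("groupId", "pulsar-io-avro");
        // Property name as written in this PR description.
        config.put("consumerProperties", consumerProperties);
        return config;
    }

    public static void main(String[] args) {
        // Placeholder URL for a locally running Schema Registry.
        System.out.println(buildSourceConfig("http://localhost:8081"));
    }
}
```

The same structure would normally live in the source's YAML or JSON config file rather than in code.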
   
   It also adds a few enhancements to the Pulsar IO runtime:
   - allow PulsarSink to deal with 
org.apache.pulsar.client.api.schema.GenericRecord: it must not enforce an empty 
schema on the topic when the source starts
   - allow AvroWriter to deal with GenericRecords that are not subclasses of 
AvroGenericRecord
   
   This patch also includes the changes from the following pull request, which 
is to be committed first as a prerequisite:
   https://github.com/apache/pulsar/pull/9396
   
   ### Verifying this change
   
   The patch introduces new unit tests and integration tests.
   The integration tests launch a Kafka container and a Confluent Schema 
Registry container.
   
   ### Documentation
   
   I will be happy to provide documentation once this patch is committed.



