rmahindra123 opened a new pull request #3592: URL: https://github.com/apache/hudi/pull/3592
## What is the purpose of the pull request

Implement the Kafka Sink Protocol for Hudi for ingesting immutable data. This PR enables Kafka Connect users to readily ingest Kafka Avro/JSON string records into Hudi tables without a Spark engine, within the Kafka Connect framework. Currently, we use the HoodieJavaWriteClient's bulk insert support to insert append-only data (CoW). We use file id indexing to ensure multiple writers per Kafka partition can write to the same Hudi partition path concurrently without locks.

## Brief change log

1. The Kafka Connect protocol is implemented in a new package, hudi-kafka-connect.
2. A few code changes to integrate support for bulk insert with HoodieJavaWriteClient.

## Verify this pull request

1. Wrote unit tests for the key Coordinator <-> Participants interaction.
2. Tested with the kafka console connect in distributed mode as per instructions in README.md.
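To illustrate the lock-free idea behind file id indexing, here is a minimal sketch in plain Java. It is not code from this PR: the class `FileIdPrefixSketch`, the methods `fileIdPrefix` and `fileId`, and the choice of an MD5 digest over the topic and partition are all assumptions made for illustration. The sketch assumes each writer derives its file ids from the Kafka partition it owns, so two writers targeting the same Hudi partition path never produce the same file id and need no coordination.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FileIdPrefixSketch {

    // Hypothetical helper (not a Hudi API): derives a stable prefix from the
    // Kafka topic and partition that a writer owns.
    static String fileIdPrefix(String topic, int kafkaPartition) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(
                (topic + "-" + kafkaPartition).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest not available", e);
        }
    }

    // Hypothetical helper: a full file id is the writer's prefix plus a
    // per-writer sequence number, so ids from different writers stay disjoint
    // as long as their prefixes differ.
    static String fileId(String topic, int kafkaPartition, int sequence) {
        return fileIdPrefix(topic, kafkaPartition) + "-" + sequence;
    }

    public static void main(String[] args) {
        // Two writers, one per Kafka partition, both writing to the same
        // Hudi partition path; each only ever creates files under its prefix.
        System.out.println(fileId("orders", 0, 0));
        System.out.println(fileId("orders", 1, 0));
    }
}
```

Because the prefix depends only on the topic and partition, a writer that restarts recomputes the same prefix without consulting any shared state. Distinct prefixes here rely on the digest not colliding, which is an assumption of the sketch rather than a guarantee.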
