nsivabalan opened a new pull request, #18892: URL: https://github.com/apache/hudi/pull/18892
### Change Logs

`KafkaAvroSchemaDeserializer` previously only overrode `deserialize(String, Boolean, byte[], Schema)` to inject the configured `sourceSchema`. The Kafka consumer / Connect framework can invoke other overloads, namely `deserialize(String, byte[])`, `deserialize(String, byte[], Schema)`, and `deserialize(String, Headers, byte[])`, which bypassed the `sourceSchema` injection. This caused `ArrayIndexOutOfBoundsException` when consuming records serialized with an older schema (fewer fields in a nested record) while the deserializer was configured with an evolved schema, because Avro resolution used the writer's old schema instead of the configured reader schema.

This change overrides all three additional `deserialize` methods to consistently inject `sourceSchema`, ensuring Avro schema resolution handles old → new field evolution correctly (defaulting new nullable fields to null).

### Impact

Bug fix for `KafkaAvroSchemaDeserializer`. No public API change. Behavior for the already-overridden `deserialize(String, Boolean, byte[], Schema)` is unchanged. The three newly overridden methods now behave consistently with the existing override.

### Risk level

low

### Documentation Update

No user-facing config or API change.

### Contributor's checklist

- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Change Logs and Impact were stated clearly
- [x] Adequate tests were added

### Test Plan

- New tests in `TestKafkaAvroSchemaDeserializer` use a Debezium CDC envelope schema with a nested `Value` record that gains 4 nullable fields (`notes`, `search_engine_id`, `locale_id`, `language_id`). All 4 `deserialize` overloads are exercised against old-schema records read with the evolved schema, validating positional index access (indices 20-23) on the nested `before` record to reproduce the `ArrayIndexOutOfBoundsException`.
- `mvn -pl hudi-utilities test -Dtest='TestKafkaAvroSchemaDeserializer'`: 2/2 pass.
- `mvn -pl hudi-utilities checkstyle:check`: 0 violations.

Closes #18891
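
### Illustrative sketch

To make the overload bypass described in the Change Logs concrete, here is a minimal, self-contained sketch of the pattern. It is not the code in this PR and does not use the Confluent or Avro APIs. `BaseDeserializer`, `SchemaInjectingDeserializer`, and `OverloadDemo` are hypothetical names; an `Integer` field count stands in for the Avro reader `Schema`, a `Map<String, String>` stands in for Kafka `Headers`, and the toy `decode` method (one payload byte per field, padded with nulls) stands in for Avro schema resolution.

```java
import java.util.Map;

class OverloadDemo {
    public static void main(String[] args) {
        byte[] oldRecord = {1, 2}; // "written" with the old 2-field schema

        // Base overload ignores any configured schema: only 2 fields come back.
        Object[] viaBase = new BaseDeserializer().deserialize("topic", oldRecord);
        try {
            Object unused = viaBase[3]; // positional access to a newer field
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("base overload: index 3 is out of bounds");
        }

        // Every overridden overload injects the configured 4-field reader schema.
        Object[] viaFix = new SchemaInjectingDeserializer(4).deserialize("topic", oldRecord);
        System.out.println("fixed overload: field 3 = " + viaFix[3]);
    }
}

/** Hypothetical stand-in for the base Avro deserializer (NOT the Confluent API). */
class BaseDeserializer {
    /**
     * Toy "schema resolution": with a reader field count the record is sized to it
     * (missing trailing fields stay null); without one it keeps the writer's size.
     */
    protected Object[] decode(byte[] data, Integer readerFieldCount) {
        int size = (readerFieldCount == null) ? data.length : readerFieldCount;
        Object[] record = new Object[size]; // unset slots stay null
        for (int i = 0; i < Math.min(size, data.length); i++) {
            record[i] = data[i];
        }
        return record;
    }

    public Object[] deserialize(String topic, byte[] data) {
        return decode(data, null);
    }

    public Object[] deserialize(String topic, byte[] data, Integer readerFieldCount) {
        return decode(data, readerFieldCount);
    }

    public Object[] deserialize(String topic, Map<String, String> headers, byte[] data) {
        return decode(data, null);
    }

    public Object[] deserialize(String topic, Boolean isKey, byte[] data, Integer readerFieldCount) {
        return decode(data, readerFieldCount);
    }
}

/** Hypothetical subclass that overrides all four overloads, mirroring the idea of this fix. */
class SchemaInjectingDeserializer extends BaseDeserializer {
    private final int sourceFieldCount; // stand-in for the configured sourceSchema

    SchemaInjectingDeserializer(int sourceFieldCount) {
        this.sourceFieldCount = sourceFieldCount;
    }

    @Override
    public Object[] deserialize(String topic, byte[] data) {
        return decode(data, sourceFieldCount);
    }

    @Override
    public Object[] deserialize(String topic, byte[] data, Integer readerFieldCount) {
        return decode(data, sourceFieldCount); // caller-supplied schema is replaced
    }

    @Override
    public Object[] deserialize(String topic, Map<String, String> headers, byte[] data) {
        return decode(data, sourceFieldCount);
    }

    @Override
    public Object[] deserialize(String topic, Boolean isKey, byte[] data, Integer readerFieldCount) {
        return decode(data, sourceFieldCount);
    }
}
```

If the subclass overrode only the four-argument method, calls arriving through the other three entry points would still reach the base behavior in this sketch, which is the situation the Change Logs describe.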
