gtwuser opened a new issue, #5701: URL: https://github.com/apache/hudi/issues/5701
**Describe the problem you faced**

We are trying to update an existing column `col1` that so far has only held empty arrays, so its schema was inferred as `array<string>` by default. The incoming records have data in `col1`, that is, arrays of non-null values. Upserting them fails with the error `•••binary Utf8 optional element of not group ••••`.

We don't have any predefined schema for these records; everything is inferred. Hence during the initial insert the schema of `col1` becomes `array<string>` by default. Because the new records carry non-null, non-empty array values for this column, the upsert operation fails.

In short, this issue occurs whenever we try to change the schema of a column from `array<string>` to `array<struct<>>` or `array<array<>>`. Kindly let me know if there is a workaround or solution for it.

**To Reproduce**

Steps to reproduce the behavior:

1. `Insert` records in which a column has only empty arrays as values.
2. Upsert records with at least one non-empty array as the value of that column, which previously held only empty arrays.

**Expected behavior**

The expected behaviour would be to upgrade the schema of columns that got the default schema for an empty array (i.e. `array<string>`) to the schema of the newly received non-empty array values. That is, upgrade an array column from the default `array<string>` to the more complex schema of the data that the non-empty array holds.

**Environment Description**

* AWS Glue 3.0
* Hudi version : 0.10.1
* Spark version : 3.1.2
* Running on Docker? (yes/no) : no, we are running Glue jobs using pyspark

**Additional context**

Add any other context about the problem here.

**Stacktrace**

```Add the stacktrace of the error.```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
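
For illustration only, the mismatch described in the report can be modeled in plain Python. This is a toy sketch, not Spark's or Hudi's actual inference or schema-reconciliation code: `infer_type` and `is_compatible` are hypothetical helpers, the sample records are made up, and the empty-array fallback to `array<string>` is taken from the report's description.

```python
# Toy model of the type inference described in the report. This is NOT
# Spark's or Hudi's actual code; infer_type and is_compatible are
# hypothetical helpers used only for illustration.

def infer_type(value):
    """Return a Spark-style type string for a JSON-like Python value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        fields = ",".join(f"{k}:{infer_type(v)}" for k, v in value.items())
        return f"struct<{fields}>"
    if isinstance(value, list):
        # Assumption taken from the report: an empty array carries no
        # element type, so it falls back to array<string>.
        if not value:
            return "array<string>"
        return f"array<{infer_type(value[0])}>"
    raise TypeError(f"unsupported value: {value!r}")


def is_compatible(table_type, incoming_type):
    """Toy rule: an upsert is accepted only if the types match exactly."""
    return table_type == incoming_type


# Step 1: the initial insert only contains an empty array in col1.
inserted = {"id": 1, "col1": []}
# Step 2: a later upsert carries a non-empty array of structs in col1.
upserted = {"id": 1, "col1": [{"name": "a", "qty": 2}]}

table_type = infer_type(inserted["col1"])
incoming_type = infer_type(upserted["col1"])

print(table_type)
print(incoming_type)
print(is_compatible(table_type, incoming_type))
```

In this toy model the table column ends up as `array<string>` while the incoming column is an array of structs, which is the kind of conflict the report describes. Under these assumptions, a possible direction would be to supply an explicit schema for `col1` at the first insert rather than relying on inference; whether that resolves the Hudi error is not confirmed by the report.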
