peach12345 commented on issue #13406: URL: https://github.com/apache/iceberg/issues/13406#issuecomment-3102282909
> Got it. So you need to traverse the list, which is part of the input data, to figure out the correct type for the schema you provide.

Right now, we've established a few requirements: we're not supporting schema evolution within array elements. Instead, the first element must include all expected fields.

Further thoughts and ideas: I'm not sure if this is considered best practice, but we already have the JSON Schema defined for the incoming events. Based on this, our idea is to generate a complete Iceberg schema that fully reflects the event structure. That way, we always provide the full schema with each input, and any fields that aren't set would simply default to null or something equivalent. @mxm curious to hear your thoughts on this approach?

> If that's the case, we need to fix the code. The Dynamic Sink should never use field ids of the schema you provide. Instead, it uses fully-qualified name matching to match fields of the provided schema with the table schemas. Could you perhaps provide an example to reproduce this issue?

As far as I recall, we had some difficulties with schema evolution. That said, it's certainly possible that the problem was on our end. If I can reproduce it, I'll share an example with you.
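
To make the JSON Schema idea above more concrete, here is a minimal sketch in plain Python. It does not use any Iceberg or Flink API: `to_optional_schema`, `qualified_names`, the dict layout, and the `element` path segment are all hypothetical names chosen for illustration, and the type mapping is an assumption. It only shows the shape of the approach: derive one complete schema in which every field is optional, then list the dotted field names that name-based matching would compare.

```python
# Illustrative only: plain dicts stand in for a real schema object.
# Assumed mapping from JSON Schema primitive types to Iceberg type names.
PRIMITIVES = {
    "string": "string",
    "integer": "long",
    "number": "double",
    "boolean": "boolean",
}


def to_optional_schema(json_schema):
    """Convert a JSON Schema node into a nested description with all fields optional."""
    kind = json_schema.get("type")
    if kind == "object":
        properties = json_schema.get("properties", {})
        return {
            "type": "struct",
            "fields": [
                # Every field is optional, so an unset field can be written as null.
                {"name": name, "required": False, "type": to_optional_schema(sub)}
                for name, sub in sorted(properties.items())
            ],
        }
    if kind == "array":
        # The element type comes from the schema, not from the first data element.
        return {"type": "list", "element": to_optional_schema(json_schema.get("items", {}))}
    # Assumption of this sketch: unknown or missing types fall back to string.
    return {"type": PRIMITIVES.get(kind, "string")}


def qualified_names(node, prefix=""):
    """Return the dotted path of every field, descending into structs and lists."""
    names = []
    if node["type"] == "struct":
        for field in node["fields"]:
            path = f"{prefix}.{field['name']}" if prefix else field["name"]
            names.append(path)
            names.extend(qualified_names(field["type"], path))
    elif node["type"] == "list":
        element_path = f"{prefix}.element" if prefix else "element"
        names.extend(qualified_names(node["element"], element_path))
    return names


event_json_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "sku": {"type": "string"},
                    "qty": {"type": "integer"},
                },
            },
        },
    },
}

schema = to_optional_schema(event_json_schema)
print(qualified_names(schema))
```

Because the element struct is taken from the JSON Schema rather than from the data, the result does not depend on which fields the first array element happens to carry.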
