mxm commented on issue #13406: URL: https://github.com/apache/iceberg/issues/13406#issuecomment-3102422664
Thank you, this is very useful and important feedback. Any further examples are very welcome.

> Right now, we’ve established a few requirements: we’re not supporting schema evolution within array elements. Instead, the first element must include all expected fields.

I think that is reasonable. If the type changes, the likelihood is very high that the type change is "incompatible", i.e. it can't be made without dropping the field entirely. The Dynamic Sink does not allow these kinds of changes because they would change the semantics of queries. That said, we've been thinking about loosening this restriction a bit if users opt in to it.

> I’m not sure if this is considered best practice, but we already have the JSON Schema defined for the incoming events. Based on this, our idea is to generate a complete Iceberg schema that fully reflects the event structure. That way, we always provide the full schema with each input, and any fields that aren’t set would simply default to null or something equivalent.

That's how we intended the Dynamic Sink to be used. You create the complete schema for the data you plan to write and pass it on to the sink. The sink handles the schema evolution with some guardrails (no deletion, and only type changes that don't lose data).
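
To make the guardrails above concrete, here is a minimal Python sketch. It is not the Dynamic Sink's implementation and it uses no Iceberg or Flink API: the dict-based schema model and the names `SAFE_PROMOTIONS`, `evolve`, and `fill_missing` are hypothetical. Treating int -> long and float -> double as the lossless type changes is an assumption based on Iceberg's documented type promotions.

```python
# Hypothetical, simplified model: a schema is a dict of field name -> type name.
# This is NOT the Iceberg or Dynamic Sink API, only an illustration of the rules.

# Assumed lossless (widening) type changes.
SAFE_PROMOTIONS = {("int", "long"), ("float", "double")}


def evolve(table_schema, input_schema):
    """Return the evolved schema, or raise if a type change could lose data."""
    # Start from the existing fields, so nothing is ever deleted.
    evolved = dict(table_schema)
    for name, new_type in input_schema.items():
        old_type = evolved.get(name)
        if old_type is None or old_type == new_type:
            # New field, or unchanged field.
            evolved[name] = new_type
        elif (old_type, new_type) in SAFE_PROMOTIONS:
            # Widening change that keeps all existing values representable.
            evolved[name] = new_type
        else:
            raise ValueError(
                f"incompatible change for '{name}': {old_type} -> {new_type}"
            )
    return evolved


def fill_missing(record, schema):
    """Project a sparse event onto the complete schema; unset fields become None."""
    return {name: record.get(name) for name in schema}


table = {"id": "int", "name": "string"}
incoming = {"id": "long", "email": "string"}

evolved = evolve(table, incoming)
print(evolved)
print(fill_missing({"id": 7}, evolved))
```

In this sketch the input schema omits `name`, yet the field stays in the evolved schema, and an event that only sets `id` is written with `None` for the remaining fields. A narrowing change such as long -> int raises instead of being applied.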
