nsivabalan commented on pull request #2012: URL: https://github.com/apache/hudi/pull/2012#issuecomment-847855344
@sbernauer @giaosudau @dirksan28 @sathyaprakashg : There are quite a few flows and use-cases around schema evolution in general. Would you mind explaining your use-case to us? Let me call out a few of them:

1. The existing hudi table is in schema1 with 3 cols, and you are trying to ingest a new batch in schema2 with 4 cols.
2. The existing hudi table is in schema2 with 4 cols (after the schema evolved from schema1), and the new batch to ingest has records in the old schema (schema1).

For both (1) and (2), there could be different flows in deltastreamer:

a. no transformer and no schema provider.
b. no transformer, and the user provides a schema provider with a non-null target schema.
c. no transformer, and the user provides a schema provider with a NULL target schema.
d. has transformer and no schema provider.
e. has transformer, and the user provides a schema provider with a non-null target schema.
f. has transformer, and the user provides a schema provider with a NULL target schema.

Can you call out whether your use-case is 1a or 2e, etc.? The patch we have put up solves most of the above use-cases, but we would like to better understand what exactly your use-case is. Simple schema evolution as in case 1b should already work in hudi w/o any fix. If your use-case does not belong to any of the above categories, please explain it to us so that we can work towards a fix.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
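
For anyone unsure which label applies, the naming scheme in the comment (scenario 1 or 2 combined with flow a to f) can be sketched as a small lookup. This is only an illustration of the labels: the function name, the scenario keys and the schema provider strings are all made up for this sketch and are not part of hudi or deltastreamer.

```python
# Hypothetical helper: maps a setup description to the "1a" .. "2f" labels
# used in the comment. None of these names exist in hudi.

# Scenario number: which schema the table is in vs. the incoming batch.
SCENARIOS = {
    "new_batch_adds_column": "1",    # table in schema1 (3 cols), batch in schema2 (4 cols)
    "new_batch_in_old_schema": "2",  # table in schema2 (4 cols), batch in schema1 (3 cols)
}

# Flow letter: (has_transformer, schema_provider state).
FLOWS = {
    (False, "none"): "a",
    (False, "non_null_target"): "b",
    (False, "null_target"): "c",
    (True, "none"): "d",
    (True, "non_null_target"): "e",
    (True, "null_target"): "f",
}


def use_case_label(scenario: str, has_transformer: bool, schema_provider: str) -> str:
    """Return a label such as '1b' or '2e' for the given setup."""
    return SCENARIOS[scenario] + FLOWS[(has_transformer, schema_provider)]


# All twelve combinations covered by the two scenarios and six flows.
ALL_LABELS = sorted(
    use_case_label(s, t, p) for s in SCENARIOS for (t, p) in FLOWS
)
```

For example, a table in schema1 receiving a schema2 batch, with no transformer and a schema provider that has a non-null target schema, comes out as `use_case_label("new_batch_adds_column", False, "non_null_target")`, i.e. case 1b.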
