nsivabalan commented on pull request #2012:
URL: https://github.com/apache/hudi/pull/2012#issuecomment-847855344


   @sbernauer @giaosudau @dirksan28 @sathyaprakashg : There are quite a few 
flows and use cases around schema evolution. Would you mind helping us 
understand yours? 
   
   Let me call out a few of them: 
   1. The existing hudi table is in schema1 with 3 cols and you are trying to 
ingest a new batch in schema2 with 4 cols. 
   2. The existing hudi table is in schema2 with 4 cols (after the schema 
evolved from schema1) and the new batch has records in the old schema (schema1). 
   For both (1) and (2), there are several possible flows in deltastreamer: 
   a. no transformer and no schema provider. 
   b. no transformer and the user provides a schema provider with a non-null 
target schema.
   c. no transformer and the user provides a schema provider with a NULL target 
schema.
   d. has transformer and no schema provider. 
   e. has transformer and the user provides a schema provider with a non-null 
target schema.
   f. has transformer and the user provides a schema provider with a NULL 
target schema.
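
To make the labelling concrete, here is a minimal sketch that enumerates the twelve combinations of scenario (1, 2) and flow (a to f) listed above. It is illustrative only and not part of hudi: the names `SCENARIOS`, `FLOWS`, `all_labels` and `describe` are hypothetical, and the descriptions are paraphrases of the lists above.

```python
from itertools import product

# Scenario numbers mirror items 1 and 2 above (paraphrased).
SCENARIOS = {
    "1": "table in schema1 (3 cols), new batch in schema2 (4 cols)",
    "2": "table in schema2 (4 cols), new batch in old schema1 (3 cols)",
}

# Flow letters mirror items a to f above: (has_transformer, schema provider setup).
FLOWS = {
    "a": (False, "no schema provider"),
    "b": (False, "schema provider with non-null target schema"),
    "c": (False, "schema provider with NULL target schema"),
    "d": (True, "no schema provider"),
    "e": (True, "schema provider with non-null target schema"),
    "f": (True, "schema provider with NULL target schema"),
}


def all_labels():
    """Return every scenario/flow label, e.g. '1a', '1b', ..., '2f'."""
    return [scenario + flow for scenario, flow in product(SCENARIOS, FLOWS)]


def describe(label):
    """Expand a label such as '2e' into a one-line description."""
    scenario, flow = label[0], label[1]
    has_transformer, provider = FLOWS[flow]
    transformer = "has transformer" if has_transformer else "no transformer"
    return f"{label}: {SCENARIOS[scenario]}; {transformer}, {provider}"


if __name__ == "__main__":
    for label in all_labels():
        print(describe(label))
```

Running the script prints one line per combination, which may make it easier to pick the label that matches a given setup.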
   
   Can you call out whether your use case is 1a, 2e, etc.? The patch we have 
put up solves most of the above use cases, but we would like to better 
understand what exactly your use case is. Simple schema evolution as in case 
1b should already work in hudi w/o any fix. 
   If your use case does not fall into any of the above categories, please 
describe it so that we can work towards a fix. 
   



