liujinhui1994 commented on pull request #2666:
URL: https://github.com/apache/hudi/pull/2666#issuecomment-820931906


   > > sorry for late turn around on reviewing this. We should definitely get 
this before next release.
   > > I am yet to review tests. but few high level thoughts on reviewing 
source code.
   > > 
   > > * Shouldn't we check schema compatibility? what incase new incoming 
batch is not compatible w/ table schema w/ partial updates set to true? did we 
cover this scenario.
   > > * I see we have added support only for COW. should we throw exception if 
the config is set for MOR?
   > > * I don't have a good idea of adding sql DML support to hoodie table. 
But if feasible, once such support is added, do you think we can leverage this 
w/o duplicating the work for sql DML. for eg, "update col1 = 'new_york' where 
col2= '123'" Such partial updates should translate from sql layer to this right.
   > > * In tests, do verify that schema in commit metadata refers to table 
schema and not incoming partial schema.
   > 
   > I have the same feeling, we should still use the old schema with full 
fields there, for new records with partial values, we can patch them up with a 
builtin placeholder values, and when we pre_combine the old and new, if we 
encounter the placeholder values, use the value from the existing record.
   > 
   > In any case, to be consistent with SQL, please do not modify the schema 
which mess the thing up.
   
   Okay, I think of a way to support such a situation


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to