[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-884002932
@vinothchandar Please take the time to look at @nsivabalan's idea of updating only some of the fields in a COW table. Your guidance and approval would be much appreciated.
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-881826299
@nsivabalan Hello, I see that others have already proposed this idea in [HUDI-1884] MergeInto Support Partial Update For COW. Is this PR still necessary? If so, I will continue working on it and improve it.
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-822930145
> > we should still use the old schema with full fields there, for new records with partial values, we can patch them up with a builtin placeholder values
>
> agree. This is actually a fairly involved feature. and I would like to design things for both COW and MOR at the outset. IMO if we design something only for COW, then over time it leads to a pretty confusing matrix of feature gaps.
>
> Can we first write down the high level design in a comment, summarize open issues, and we can tackle them one by one?

Please give me some time to think about it; I will reply once I have a result.
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-820931906
> > sorry for late turn around on reviewing this. We should definitely get this before next release.
> > I am yet to review tests. but few high level thoughts on reviewing source code.
> >
> > * Shouldn't we check schema compatibility? what incase new incoming batch is not compatible w/ table schema w/ partial updates set to true? did we cover this scenario.
> > * I see we have added support only for COW. should we throw exception if the config is set for MOR?
> > * I don't have a good idea of adding sql DML support to hoodie table. But if feasible, once such support is added, do you think we can leverage this w/o duplicating the work for sql DML. for eg, "update col1 = 'new_york' where col2= '123'" Such partial updates should translate from sql layer to this right.
> > * In tests, do verify that schema in commit metadata refers to table schema and not incoming partial schema.
>
> I have the same feeling, we should still use the old schema with full fields there, for new records with partial values, we can patch them up with a builtin placeholder values, and when we pre_combine the old and new, if we encounter the placeholder values, use the value from the existing record.
>
> In any case, to be consistent with SQL, please do not modify the schema which mess the thing up.

Okay, I will think of a way to support such a situation.
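
The placeholder approach suggested in the quoted review could look roughly like the sketch below. It is a plain-Python illustration, not Hudi code: `PLACEHOLDER`, `check_compatible`, `patch_to_full_schema` and `pre_combine` are hypothetical names, and records are modelled as dictionaries. The incoming partial record is first checked against the table schema, then widened to the full schema with a placeholder for every missing field. During the merge, each placeholder is replaced by the value from the existing record.

```python
from typing import Any, Dict, List

# Hypothetical sentinel meaning "field not supplied"; not a Hudi constant.
PLACEHOLDER = object()


def check_compatible(table_fields: List[str], incoming: Dict[str, Any]) -> None:
    """Reject a partial record that carries fields unknown to the table schema."""
    unknown = sorted(set(incoming) - set(table_fields))
    if unknown:
        raise ValueError(f"fields not in table schema: {unknown}")


def patch_to_full_schema(table_fields: List[str],
                         incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Widen a partial record to the full table schema using the placeholder."""
    check_compatible(table_fields, incoming)
    return {field: incoming.get(field, PLACEHOLDER) for field in table_fields}


def pre_combine(existing: Dict[str, Any],
                patched: Dict[str, Any]) -> Dict[str, Any]:
    """Merge old and new: a placeholder keeps the value of the existing record."""
    return {
        field: existing[field] if value is PLACEHOLDER else value
        for field, value in patched.items()
    }
```

Because the sentinel is a dedicated object rather than `None`, an explicitly supplied null still overwrites the stored value in this sketch, and the table schema itself is never modified.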
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-812290646
@nsivabalan Your suggestion is very good; please let me think about it.
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-811861142
> @liujinhui1994 : Can you please update the description with an example.

1. Old data

    id  name       age
    1   liujinhui  26
    2   sam        25
    3   madeline   30

2. New data

    {"id": 1,"name": "abc"}

3. Final result

    1   abc        26
    2   sam        25
    3   madeline   30
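
The example above can be reproduced with a small sketch in plain Python. This only illustrates the intended semantics and is not how the PR implements them: `apply_partial_updates` is a hypothetical helper, rows are modelled as dictionaries, and `id` is assumed to be the record key.

```python
from typing import Any, Dict, List


def apply_partial_updates(table: List[Dict[str, Any]],
                          updates: List[Dict[str, Any]],
                          key: str = "id") -> List[Dict[str, Any]]:
    """Overlay the supplied fields of each update onto the row with the same key."""
    rows_by_key = {row[key]: dict(row) for row in table}  # copy, keep row order
    for update in updates:
        target = rows_by_key.get(update[key])
        if target is not None:  # updates without a matching row are ignored here
            target.update(update)
    return list(rows_by_key.values())


old_data = [
    {"id": 1, "name": "liujinhui", "age": 26},
    {"id": 2, "name": "sam", "age": 25},
    {"id": 3, "name": "madeline", "age": 30},
]
new_data = [{"id": 1, "name": "abc"}]

final_result = apply_partial_updates(old_data, new_data)
for row in final_result:
    print(row["id"], row["name"], row["age"])
```

Only `name` changes for `id` 1; `age` keeps its stored value of 26 because the update does not mention it. How a real table should treat an update whose key has no existing row is left open in this sketch.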
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-811860789
> @liujinhui1994 : Can you please update the description with an example.

1. Old data

    id  name       age
    1   liujinhui  26
    2   sam        25
    3   madeline   30

2. New data

    {"id": 1,"name": "abc"}

3. Final result

    1   abc        26
    2   sam        25
    3   madeline   30
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-805667943
Maybe after 0.8 is released.

-- Original message --
From: "apache/hudi" ***@***.***;
Sent: Wednesday, March 24, 2021, 6:03 PM
***@***.***; ***@***.**@***.***;
Subject: Re: [apache/hudi] [HUDI-1160] Support update partial fields for CoW table (#2666)

Is there any timeline for this pull request?
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-805655005
@Sugamber This branch should be available
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-800745638
@nsivabalan There does not seem to be any existing feature equivalent to this PR at the moment. This PR continues the work from https://github.com/apache/hudi/pull/1929, which will become obsolete.
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-799193666
@n3nash @satishkotha @yanghua
[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-799193044
The review comments on https://github.com/apache/hudi/pull/1929 have been addressed in this PR; please review.