[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie
vinothchandar commented on pull request #3285:
URL: https://github.com/apache/hudi/pull/3285#issuecomment-894475193

Thanks. @codope and @bvaradar are already looking into backwards-incompatible schema evolution, so we can turn this on for Spark and Flink by default in the next release.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie
vinothchandar commented on pull request #3285:
URL: https://github.com/apache/hudi/pull/3285#issuecomment-893991056

> For existing tables, this will cause issues with schema evolution? Or do we only do it for new tables?

Then it will break if someone upgrades from 0.8.0 to 0.9.0 on an existing table with Flink. I understand the criticality of the feature, but I am wondering how the upgrade process looks. Should we check with users and see?
[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie
vinothchandar commented on pull request #3285:
URL: https://github.com/apache/hudi/pull/3285#issuecomment-892123167

@swuferhong @danny0405 is this ready for review?
[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie
vinothchandar commented on pull request #3285:
URL: https://github.com/apache/hudi/pull/3285#issuecomment-888749576

False - we can turn it on next release, after also hardening the end-to-end use case.
[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie
vinothchandar commented on pull request #3285:
URL: https://github.com/apache/hudi/pull/3285#issuecomment-888735362

Folks, @bvaradar and @codope are already looking into supporting such evolution. I suggest we do this in the next release without breaking existing tables. If we want to have this go in this release, it has to be strictly optional. Wdyt?
[GitHub] [hudi] vinothchandar commented on pull request #3285: [HUDI-1771] Propagate CDC format for hoodie
vinothchandar commented on pull request #3285:
URL: https://github.com/apache/hudi/pull/3285#issuecomment-885764276

@codope please help me close out the schema evolution story here. My point was: when mixing old files where `_hoodie_operation` is NOT present with new files where it is present, can the queries break?

@nsivabalan's response:

> Yes. Check row 4 here: https://hudi.apache.org/docs/writing_data.html#schema-evolution
> Here is what could be happening. Your new batch of writes (with the new operation field) might have been routed to all existing file groups, and hence it might have worked. Can you try sending to only partial file groups?
> If you are doing a sanity test, you can do this:
> batch 1: old schema, routed to partition_0
> batch 2: new schema with the operation field, routed to partition_1
> Unless the operation field is appended to the end of the existing schema, this may not work.
> Let us know what you see.
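The failure mode discussed above follows Avro-style schema resolution: a query reading old file groups through the new (reader) schema only succeeds if the added field carries a default, which is why appending `_hoodie_operation` with a null default matters. A minimal stdlib-Python sketch of that resolution rule (the schemas and the `resolve_record` helper are illustrative only, not Hudi's actual code):

```python
# Illustrative sketch (not Hudi code): why appending a new field WITH a
# default keeps old file groups readable. Avro schema resolution fills a
# field missing from the written record using the reader schema's default.

OLD_SCHEMA = {"fields": [
    {"name": "uuid", "type": "string"},
    {"name": "ts", "type": "long"},
]}

# Reader schema after the upgrade: _hoodie_operation appended at the end
# with a null default, so records written under OLD_SCHEMA still resolve.
NEW_SCHEMA = {"fields": [
    {"name": "uuid", "type": "string"},
    {"name": "ts", "type": "long"},
    {"name": "_hoodie_operation", "type": ["null", "string"], "default": None},
]}

def resolve_record(record, reader_schema):
    """Mimic Avro schema resolution: a field absent from the written
    record takes the reader schema's default, or resolution fails."""
    out = {}
    for field in reader_schema["fields"]:
        if field["name"] in record:
            out[field["name"]] = record[field["name"]]
        elif "default" in field:
            out[field["name"]] = field["default"]
        else:
            raise ValueError(f"no value and no default for {field['name']}")
    return out

old_file_row = {"uuid": "a1", "ts": 100}  # written before the upgrade
new_file_row = {"uuid": "b2", "ts": 200, "_hoodie_operation": "I"}

# Both old and new file groups resolve cleanly against the new schema:
print(resolve_record(old_file_row, NEW_SCHEMA))
print(resolve_record(new_file_row, NEW_SCHEMA))
```

With no `"default"` on the new field, the old-file row raises instead of resolving, which is the mixed-file-group query breakage the thread is worried about.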