kasakrisz opened a new pull request, #3362: URL: https://github.com/apache/hive/pull/3362
### What changes were proposed in this pull request?
Rewrite update statements on Iceberg tables into a multi-insert statement, as is already done for native ACID tables.

When generating the rewritten statement:
* Get the virtual columns from the table's storage handler in the case of non-native ACID tables.
* Include the old values in the select clause of the delete branch of the multi-insert statement.

When executing the multi-insert:
* Two Iceberg writers are used, which produce a data delta file and a delete delta file. The results of these writers should be merged into one `FilesForCommit` if both writers run in the same task.
* For more complex statements (e.g. partitioned and/or bucketed), more than one Tez task produces commit info, so this patch enables storing all of them.
* Every `FileSinkOperator` creates its own jobConf instance because the Iceberg write operation is stored in it and differs between the two instances.

### Why are the changes needed?
See #2855. This is also preparation for the Iceberg Merge implementation.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
```
mvn test -Dtest.output.overwrite -DskipSparkTests -Dtest=TestIcebergLlapLocalCliDriver -Dqfile=update_iceberg_partitioned_orc2.q -pl itests/qtest-iceberg -Piceberg -Pitests
```
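
As a rough illustration of the merge step described above (two writers in the same task each report their own files, and the results are combined into a single commit record), here is a minimal Java sketch. `CommitFilesSketch`, its fields, and the file names are hypothetical stand-ins chosen for this example; they are not Hive's actual `FilesForCommit` API, which may be shaped differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the commit info produced by one writer.
// This is NOT Hive's FilesForCommit class; it only illustrates the idea.
final class CommitFilesSketch {
    private final List<String> dataFiles;
    private final List<String> deleteFiles;

    CommitFilesSketch(List<String> dataFiles, List<String> deleteFiles) {
        // Defensive copies so a merged result cannot be changed through the inputs.
        this.dataFiles = Collections.unmodifiableList(new ArrayList<>(dataFiles));
        this.deleteFiles = Collections.unmodifiableList(new ArrayList<>(deleteFiles));
    }

    List<String> dataFiles() {
        return dataFiles;
    }

    List<String> deleteFiles() {
        return deleteFiles;
    }

    // Combine the results of all writers that ran in the same task
    // into one record, preserving the order in which they were given.
    static CommitFilesSketch merge(List<CommitFilesSketch> results) {
        List<String> data = new ArrayList<>();
        List<String> deletes = new ArrayList<>();
        for (CommitFilesSketch result : results) {
            data.addAll(result.dataFiles);
            deletes.addAll(result.deleteFiles);
        }
        return new CommitFilesSketch(data, deletes);
    }
}
```

In this sketch the insert branch would contribute only data files and the delete branch only delete files, so merging the two yields one record that carries both lists.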
