bbasosuho opened a new issue, #4639: URL: https://github.com/apache/iceberg/issues/4639
A problem occurs when updating or deleting rows with spark-sql after loading data through the Iceberg Java API.

**Iceberg version: iceberg-spark-runtime-3.1_2.12-0.13.1.jar**
**Spark version: 3.1.2**

First, load 3 rows of data with the Iceberg Java API.

Then load 1 more row with the Iceberg Java API ( id = 4 ).

Then delete 1 row with the Iceberg Java API ( id = 2 ).

Then retrieve the data from spark-sql. Rows with IDs 1, 3, 4 are retrieved. ( Normal )

Next, update through spark-sql ( id =1 ). (A delete behaves the same way.)

case1 ) UPDATE iceberg.testdb.testtb SET memo='new-value' WHERE id = **2** ;
Then retrieve the data from spark-sql. Rows with IDs 1, 3, 4 are retrieved. ( Normal )

case2 ) UPDATE iceberg.testdb.testtb SET memo='new' WHERE id = **'2'** ;
Then retrieve the data from spark-sql. **Rows with IDs 1, 2, 3, 4 are retrieved. ( Abnormal )**

**Is there a known problem with mixing the Iceberg API and spark-sql on the same table? It seems that the existing delete file information is lost when an UPDATE/DELETE statement is run in spark-sql.**

The Iceberg API is used as shown below. (The writer is implemented by inheriting from BaseEqualityDeltaWriter.)

    ...
    WriteResult files = writer.complete();
    RowDelta newRowDelta = icebergTable.newRowDelta();
    Arrays.stream(files.dataFiles()).forEach(newRowDelta::addRows);
    Arrays.stream(files.deleteFiles()).forEach(newRowDelta::addDeletes);
    ....
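To make the expected and observed results easier to compare, here is a minimal, self-contained sketch of how an equality delete hides rows at read time. It is a toy model, not Iceberg code: the class `EqualityDeleteModel`, its nested types, and the sequence numbers 1, 2, 3 (one per Java API commit) are assumptions made for illustration. The only rule it encodes is the one from the Iceberg table spec, that an equality delete applies to data files with a strictly lower sequence number. It does not diagnose the cause; it only shows that the abnormal result matches a read in which the delete file for id = 2 is no longer applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of equality deletes; all names here are hypothetical, not Iceberg APIs.
public class EqualityDeleteModel {

    // A data file: the ids it holds and the sequence number of the commit that added it.
    static final class DataFile {
        final long sequenceNumber;
        final List<Integer> ids;

        DataFile(long sequenceNumber, List<Integer> ids) {
            this.sequenceNumber = sequenceNumber;
            this.ids = ids;
        }
    }

    // An equality delete file: the ids it removes and its commit's sequence number.
    static final class EqualityDelete {
        final long sequenceNumber;
        final Set<Integer> ids;

        EqualityDelete(long sequenceNumber, Set<Integer> ids) {
            this.sequenceNumber = sequenceNumber;
            this.ids = ids;
        }
    }

    // Returns the ids a reader would see: a row is hidden only by a delete
    // whose sequence number is strictly greater than that of its data file.
    static List<Integer> visibleIds(List<DataFile> dataFiles, List<EqualityDelete> deletes) {
        List<Integer> visible = new ArrayList<>();
        for (DataFile file : dataFiles) {
            for (int id : file.ids) {
                boolean deleted = false;
                for (EqualityDelete delete : deletes) {
                    if (delete.sequenceNumber > file.sequenceNumber && delete.ids.contains(id)) {
                        deleted = true;
                        break;
                    }
                }
                if (!deleted) {
                    visible.add(id);
                }
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Assumed commits: load ids 1..3, load id 4, then equality-delete id 2.
        List<DataFile> data = List.of(
                new DataFile(1, List.of(1, 2, 3)),
                new DataFile(2, List.of(4)));
        List<EqualityDelete> deletes = List.of(new EqualityDelete(3, Set.of(2)));

        // With the delete file applied (the "Normal" result).
        System.out.println(visibleIds(data, deletes));   // [1, 3, 4]
        // With the delete file missing (matches the "Abnormal" result).
        System.out.println(visibleIds(data, List.of())); // [1, 2, 3, 4]
    }
}
```

In this model, the case2 output is exactly what a reader produces when the delete file from the third commit is absent from the scan, which is consistent with the suspicion above but does not confirm it.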
