mehtaashish23 opened a new issue #2041: URL: https://github.com/apache/iceberg/issues/2041
GIST: https://gist.github.com/mehtaashish23/232dda586040a8e9038b61f3553dde76

When doing an `INSERT INTO` on an Iceberg table to append data, the fields are committed in the wrong order, corrupting the table. See the GIST for the full code: if the table has fields `f1,f2,f3` and I insert data using the code below, the wrong data (out-of-order fields) ends up in the table.

NOTE: `dataFrame.write.mode("append").save()` works fine and maintains the correct order.

```scala
sc.parallelize(Seq((s"f3ValueFile2", s"f1ValueFile2", s"f2ValueFile2"))).toDF("f3", "f1", "f2").registerTempTable("input")
spark.read.format("iceberg").load(masterTablePath).registerTempTable("target")
spark.sql("insert into target (select * from input)").show(false)
```

```
scala> spark.read.format("iceberg").load(masterTablePath).show(false)
+------------+------------+------------+
|f1          |f2          |f3          |
+------------+------------+------------+
|f3ValueFile2|f1ValueFile2|f2ValueFile2|
+------------+------------+------------+
```
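The mismatch above is consistent with the `INSERT INTO ... SELECT *` path mapping source columns to target columns by position rather than by name. A minimal plain-Scala sketch of the two mapping strategies (no Spark; all names and values here are illustrative, not taken from the gist), where the workaround corresponds to selecting the source columns in the target schema's order:

```scala
object InsertOrderDemo {
  // Target table schema order, and a source whose columns arrive reordered.
  val targetSchema = Seq("f1", "f2", "f3")
  val sourceSchema = Seq("f3", "f1", "f2")
  val sourceRow    = Seq("f3Value", "f1Value", "f2Value")

  // Positional mapping: the i-th source value lands in the i-th target
  // column, regardless of its name. This is the corrupting behavior.
  def byPosition: Map[String, String] =
    targetSchema.zip(sourceRow).toMap

  // Name-based mapping: look each target column up by name in the source.
  // Equivalent to `select f1, f2, f3 from input` instead of `select *`.
  def byName: Map[String, String] = {
    val named = sourceSchema.zip(sourceRow).toMap
    targetSchema.map(c => c -> named(c)).toMap
  }
}
```

With positional mapping, `byPosition("f1")` yields `"f3Value"` (the reported corruption); with name-based mapping, `byName("f1")` yields `"f1Value"`.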
