vshinde-medacist opened a new issue, #4542:
URL: https://github.com/apache/iceberg/issues/4542

   We are evaluating the schema evolution feature supported by Iceberg. 
The steps carried out so far are below.
   
   1. Create a new Iceberg table
   
   `people.csv` data:
   
   | age|   name|
   |----|-------|
   |  30|   Andy|
   |  19| Justin|
   
   ```scala
   import org.apache.spark.sql.DataFrame
   // Read the CSV with a header row and a comma delimiter.
   val df: DataFrame = spark.read.format("csv")
     .option("header", "true").option("delimiter", ",")
     .load("/spark-apps/people.csv")
   df.write.format("iceberg").saveAsTable("local.demo_table")
   ```
   
   2. Append data containing a new column (`job`) to the table created in Step 1.
   `updated_people.csv` data:
   
   | age|   name|job|
   |----|-------| ---- |
   |  36|   Vikram| Developer|
   |  18| Raj| Developer|
   
   ```scala
   val csvdf = spark.read.format("csv")
     .option("header", "true").option("delimiter", ",")
     .load("/spark-apps/updated_people.csv")
   csvdf.write.format("iceberg").mode("append").save("/path/to/table")
   ```
   But the append fails with the following exception:
   
   ```cmd
   org.apache.spark.sql.AnalysisException: Cannot write to '/path/to/table', too many data columns:
   Table columns: 'age', 'name'
   Data columns: 'age', 'name', 'job'
   ```
   
   Any suggestions on how to enable schema evolution support for DataFrame writes?
   
   For reference, the entire POC script:
   
   ```scala
       // Step 1: load the initial CSV and create the Iceberg table.
       val df: DataFrame = spark.read.format("csv")
         .option("header", "true").option("delimiter", ",")
         .load("/spark-apps/people.csv")
       df.show()

       df.write.format("iceberg").saveAsTable("local.demo_table")

       // Read the table back by path to confirm the initial write.
       val tableData = spark.read.format("iceberg").load("s3a://iceberg-poc/warehouse/demo_table")
       tableData.show()

       // Step 2: load the CSV that carries the extra `job` column.
       val csvdf = spark.read.format("csv")
         .option("header", "true").option("delimiter", ",")
         .load("/spark-apps/updated_people.csv")
       csvdf.show()

       // Exception while executing below statement
       csvdf.write.format("iceberg").mode("append").save("s3a://iceberg-poc/warehouse/demo_table")
   ```
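
   One possible workaround, sketched here and not verified against this setup, is to evolve the table schema explicitly before appending, so that the table already has a `job` column when the write is analyzed. The sketch assumes Spark 3 with the `local` Iceberg catalog from Step 1 already configured on the session. The object and method names (`SchemaEvolutionSketch`, `appendWithNewColumn`) are hypothetical, and the write goes through the table name instead of the path.

   ```scala
   import org.apache.spark.sql.{DataFrame, SparkSession}

   // Hypothetical helper, for illustration only.
   object SchemaEvolutionSketch {

     // Assumes `spark` is configured with the Iceberg `local` catalog
     // and that `local.demo_table` exists with columns `age` and `name`.
     def appendWithNewColumn(spark: SparkSession): Unit = {
       // Load the CSV that carries the extra `job` column.
       // Without schema inference, every CSV column is read as a string.
       val csvdf: DataFrame = spark.read.format("csv")
         .option("header", "true")
         .option("delimiter", ",")
         .load("/spark-apps/updated_people.csv")

       // Add the new column to the table schema first.
       // Existing rows read back NULL for `job`.
       spark.sql("ALTER TABLE local.demo_table ADD COLUMN job string")

       // Append through the catalog table name using the DataFrameWriterV2 API.
       csvdf.writeTo("local.demo_table").append()
     }
   }
   ```

   Newer Iceberg releases also document automatic schema merging on write, enabled by setting the table property `write.spark.accept-any-schema` to `true` and passing `.option("mergeSchema", "true")` on the write. Whether that is available depends on the Iceberg version in use, so treat it as something to check against the documentation for that version.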
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

