vshinde-medacist opened a new issue, #4542:
URL: https://github.com/apache/iceberg/issues/4542
We are evaluating the schema evolution feature supported by Iceberg.
Below are the steps carried out so far.
1. Create a new Iceberg table
`people.csv` data:
| age| name|
|----|-------|
| 30| Andy|
| 19| Justin|
```scala
val df : DataFrame = spark.read.format("csv").option("header",
"true").option("delimiter", ",").load("/spark-apps/people.csv")
df.write.format("iceberg").saveAsTable("local.demo_table")
```
2. Append data containing a new column (`job`) to the table created in Step 1.
`updated_people.csv` data:
| age| name|job|
|----|-------| ---- |
| 36| Vikram| Developer|
| 18| Raj| Developer|
```scala
val csvdf = spark.read.format("csv").option("header",
"true").option("delimiter", ",").load("/spark-apps/updated_people.csv")
csvdf.write.format("iceberg").mode("append").save("/path/to/table")
```
The append fails with the following exception:
```cmd
org.apache.spark.sql.AnalysisException: Cannot write to '/path/to/table',
too many data columns:
Table columns: 'age', 'name'
Data columns: 'age', 'name', 'job'
```
Any suggestions on how to enable schema evolution support when writing a DataFrame?
For reference, the entire POC script:
```scala
val df : DataFrame = spark.read.format("csv").option("header",
"true").option("delimiter", ",").load("/spark-apps/people.csv")
df.show()
df.write.format("iceberg").saveAsTable("local.demo_table")
val tableData =
spark.read.format("iceberg").load("s3a://iceberg-poc/warehouse/demo_table")
tableData.show()
val csvdf = spark.read.format("csv").option("header",
"true").option("delimiter", ",").load("/spark-apps/updated_people.csv")
csvdf.show()
// Exception while executing below statement
csvdf.write.format("iceberg").mode("append").save("s3a://iceberg-poc/warehouse/demo_table")
```
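One approach we have not yet verified in this POC is to add the missing column to the table explicitly before appending. The sketch below only builds the `ALTER TABLE ... ADD COLUMN` statements from the two column lists in the exception above. `SchemaDiff` and `addColumnDdl` are hypothetical helper names, not Iceberg or Spark APIs, and the `string` types are an assumption based on the CSV reader returning string columns when no schema is inferred.

```scala
// Sketch only: computes the DDL needed before the append can succeed.
// `SchemaDiff` and `addColumnDdl` are hypothetical names, not Iceberg APIs.
object SchemaDiff {
  /** Returns one ALTER TABLE ... ADD COLUMN statement for every column that
    * is present in the incoming data but missing from the table.
    *
    * @param table     table identifier, e.g. "local.demo_table"
    * @param tableCols column names currently in the table
    * @param dataCols  (name, SQL type) pairs of the data to append
    */
  def addColumnDdl(
      table: String,
      tableCols: Seq[String],
      dataCols: Seq[(String, String)]): Seq[String] = {
    // Compare names case-insensitively (Spark's default resolution mode).
    val existing = tableCols.map(_.toLowerCase).toSet
    dataCols.collect {
      case (name, sqlType) if !existing.contains(name.toLowerCase) =>
        s"ALTER TABLE $table ADD COLUMN $name $sqlType"
    }
  }
}

// Column lists taken from the exception message in this report.
val ddl = SchemaDiff.addColumnDdl(
  "local.demo_table",
  Seq("age", "name"),
  Seq("age" -> "string", "name" -> "string", "job" -> "string"))

ddl.foreach(println)
// prints: ALTER TABLE local.demo_table ADD COLUMN job string
```

In a Spark session, the statements could then be executed with `ddl.foreach(stmt => spark.sql(stmt))` before retrying the append, for example with `csvdf.writeTo("local.demo_table").append()`. Whether the DataFrame writer can do this automatically is what this issue is asking.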