lcspinter opened a new issue, #4930:
URL: https://github.com/apache/iceberg/issues/4930
I've been experimenting with schema evolution on migrated Hive tables and
noticed some inconsistencies. Here is what I did:
Create a simple Hive table
`CREATE EXTERNAL TABLE customers (id int, first_name string, last_name
string) STORED AS PARQUET;`
Insert one record
`INSERT INTO customers VALUES (11, 'Lisa', 'Truman');`
Migrate it to Iceberg
`ALTER TABLE customers SET TBLPROPERTIES
('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler')`
Drop a column
`ALTER TABLE customers REPLACE COLUMNS (id int, last_name string)`
Re-add the same column
`ALTER TABLE customers ADD COLUMNS (first_name string)`
Running a select query on the re-added column returns the previously
inserted record, which I believe is the expected outcome.
I then added one more step to the test scenario: before dropping the
column, I renamed it.
`ALTER TABLE customers CHANGE COLUMN first_name first_name_1 string`
If I re-add the renamed column and run a query on it, I get `null` values.
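For reference, the second scenario can be sketched end to end as below. This is only a rough reconstruction: the report does not say under which name the column was re-added, so re-adding it as `first_name_1` is an assumption, and the final `SELECT` is a hypothetical query that does not appear in the original steps.

```sql
-- Sketch of the second scenario; statements marked "assumed" are not in the original report.
CREATE EXTERNAL TABLE customers (id int, first_name string, last_name string) STORED AS PARQUET;
INSERT INTO customers VALUES (11, 'Lisa', 'Truman');

-- Migrate the table to Iceberg.
ALTER TABLE customers SET TBLPROPERTIES
  ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler');

-- Extra step: rename the column before dropping it.
ALTER TABLE customers CHANGE COLUMN first_name first_name_1 string;

-- Drop the renamed column (same statement as in the first scenario).
ALTER TABLE customers REPLACE COLUMNS (id int, last_name string);

-- Assumed: the column is re-added under its new name.
ALTER TABLE customers ADD COLUMNS (first_name_1 string);

-- Assumed query; the report says reading the re-added column gives null here.
SELECT first_name_1 FROM customers;
```

The only difference from the first scenario is the `CHANGE COLUMN` step, which narrows the inconsistency down to the rename.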
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]