Hi,
Any ideas on how to go about this? Any insights you have would be helpful; I'm kind of stuck here.
Here are the steps I followed on Hive 0.13:

1) create table t (f1 string, f2 string) stored as parquet;
2) upload Parquet files with 2 fields
3) select * from t;   <---- works fine
4) alter table t add columns (f3 string);
5) select * from t;   <----- ERROR:

Caused by: java.lang.IllegalStateException: Column f3 at index 2 does not exist
    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:116)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:79)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
    at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
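For anyone who wants to reproduce this, the steps above can be collected into a single HiveQL script. This is just a sketch of the commands already listed; the table name `t`, the column names `f1`-`f3`, and the HDFS path in the LOAD statement are placeholders (step 2 above uploads pre-existing Parquet files, which this LOAD approximates):

```sql
-- Step 1: create a Parquet-backed table with two string columns
CREATE TABLE t (f1 STRING, f2 STRING) STORED AS PARQUET;

-- Step 2: load Parquet files that contain only the two fields f1, f2
-- (path is a placeholder; any Parquet data with this schema will do)
LOAD DATA INPATH '/user/hive/staging/two_field_files' INTO TABLE t;

-- Step 3: works fine at this point
SELECT * FROM t;

-- Step 4: add a third column; only the metastore schema changes,
-- the old Parquet files on disk still have just two columns
ALTER TABLE t ADD COLUMNS (f3 STRING);

-- Step 5: on Hive 0.13 this fails with
-- java.lang.IllegalStateException: Column f3 at index 2 does not exist
SELECT * FROM t;
```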
On Wednesday, January 7, 2015 2:55 PM, Kumar V <[email protected]>
wrote:
Hi, I have a Parquet format Hive table with a few columns. I have loaded a lot of data into this table already and it seems to work. I now have to add a few new columns to this table. If I add the new columns, queries don't work anymore, since I have not reloaded the old data. Is there a way to add new fields to the table without reloading the old Parquet files, and still have queries work?
I tried this in Hive 0.10 and also on hive 0.13. Getting an error in both
versions.
Please let me know how to handle this.
Regards,
Kumar.