
Re: SparkSQL - can we add new column(s) to parquet files

2014-11-21 Thread Evan Chan

I would expect an SQL query on c to fail, because c would not be known in the schema of the older Parquet file. What I'd be very interested in is how to add a new column as an incremental new Parquet file, and then somehow join the existing and new files efficiently. I.e., somehow

SparkSQL - can we add new column(s) to parquet files

2014-11-21 Thread Sadhan Sood

We create the table definition by reading the Parquet file for its schema and storing it in the Hive metastore. But if someone adds a new column to the schema, and we rescan the schema from the new Parquet files and update the table definition, would queries on the table still work? So, old
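
To make the question concrete, here is a toy model in plain Python of what "still working" could mean: the table schema is the union of the old and new file schemas, and rows from older files report the new column as null. This is only a sketch of the semantics being asked about, not Spark SQL or Parquet code; the helper names (`merged_schema`, `read_with_schema`) and the sample columns a, b, c are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merged_schema(*schemas: List[str]) -> List[str]:
    """Union of column names across files, keeping first-seen order."""
    seen: List[str] = []
    for schema in schemas:
        for col in schema:
            if col not in seen:
                seen.append(col)
    return seen


def read_with_schema(rows: List[Row], schema: List[str]) -> List[Row]:
    """Project rows onto the table schema; columns a file lacks become None."""
    return [{col: row.get(col) for col in schema} for row in rows]


# Hypothetical data: the old file predates column c, the new file has it.
old_file = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]
new_file = [{"a": 3, "b": "z", "c": 30}]

schema = merged_schema(["a", "b"], ["a", "b", "c"])
table = read_with_schema(old_file, schema) + read_with_schema(new_file, schema)

# Rough equivalent of: SELECT a, c FROM table WHERE c IS NOT NULL
result = [(r["a"], r["c"]) for r in table if r["c"] is not None]
```

Under this model a query on c does not fail on old files; it simply sees nulls there. Whether the Spark SQL version discussed in this thread behaves this way is exactly the open question.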