Parquet Migrations

2014-10-31 Thread Gary Malouf
Apart from the future solution discussed in SPARK-3851 (https://issues.apache.org/jira/browse/SPARK-3851), is there any way to modify a Parquet schema once some data has been written? This seems like the kind of thing that should make people pause when considering whether or not to

Re: Parquet Migrations

2014-10-31 Thread Michael Armbrust
You can't change a Parquet schema without re-encoding the data, because the footer index data has to be recalculated. However, you can manually do today what SPARK-3851 (https://issues.apache.org/jira/browse/SPARK-3851) is going to do. Consider two schemas: Old Schema: (a: Int, b: String) New Schema,
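
The message is cut off before the new schema is given, so what follows is only a rough sketch of the general idea. It assumes the manual workaround means projecting old and new data onto one unified schema, filling absent columns with nulls, and unioning the results. It uses plain Python dicts rather than Spark. The new schema (a, c) and every function name are hypothetical, chosen for illustration.

```python
# Old schema is taken from the thread. The new schema is an ASSUMPTION for
# illustration only, because the original message is truncated.
OLD_SCHEMA = ["a", "b"]
NEW_SCHEMA = ["a", "c"]  # hypothetical


def unified_schema(*schemas):
    """Return all column names across the schemas, in first-seen order."""
    columns = []
    for schema in schemas:
        for name in schema:
            if name not in columns:
                columns.append(name)
    return columns


def project(rows, columns):
    """Project each row onto `columns`, filling missing fields with None."""
    return [{name: row.get(name) for name in columns} for row in rows]


def union_all(old_rows, new_rows):
    """Combine rows written under both schemas into one unified view."""
    columns = unified_schema(OLD_SCHEMA, NEW_SCHEMA)
    return project(old_rows, columns) + project(new_rows, columns)


# Rows as they might have been written under each schema (made-up values).
old_rows = [{"a": 1, "b": "x"}]
new_rows = [{"a": 2, "c": 10}]
unified = union_all(old_rows, new_rows)
```

The files on disk are never rewritten in this approach. Only the read path reconciles the two schemas, which is why no re-encoding is needed.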