Apart from what is discussed in SPARK-3851
(https://issues.apache.org/jira/browse/SPARK-3851) as a future solution, is
there any way to modify a Parquet schema once some data has
been written? This seems like the kind of thing that should make people
pause when considering whether or not to
You can't change a Parquet schema without re-encoding the data, because the
footer index data has to be recalculated. However, you can do manually today
what SPARK-3851 (https://issues.apache.org/jira/browse/SPARK-3851) is going
to do.
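As a rough illustration of the manual approach (this is not Spark code, and
not necessarily how SPARK-3851 will be implemented): read the old and new
data separately, pad each side with nulls for the columns it lacks, and union
the results. The sketch below uses plain Python dicts in place of Parquet
files. The newer layout (b dropped, c added), the sample values, and the
helper name unify are all assumptions made for the example.

```python
# Illustrative only: plain Python rows stand in for Parquet files.

def unify(rows, columns):
    """Project each row onto the full column list; absent columns become None."""
    return [{col: row.get(col) for col in columns} for row in rows]

# Hypothetical data: the old files have columns (a, b); the newer files
# are assumed to have dropped b and added c.
old_rows = [{"a": 1, "b": "x"}]
new_rows = [{"a": 2, "c": 30}]

# The unified view carries every column seen in either layout.
unified_columns = ["a", "b", "c"]
unified = unify(old_rows, unified_columns) + unify(new_rows, unified_columns)
```

In Spark SQL the same idea would probably be a UNION ALL of two SELECTs, each
supplying a typed null for the column its side is missing.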
Consider two schemas:
Old Schema: (a: Int, b: String)
New Schema,