Github user DaimonPl commented on the issue:
https://github.com/apache/spark/pull/16578
@mallman @viirya From my understanding, the current workaround is for the case
of reading columns that are not in the file schema:
> Parquet-mr will throw an exception if we try to read a superset of the
file's schema.
Isn't that somehow dependent on the schema evolution setting?
http://spark.apache.org/docs/latest/sql-programming-guide.html#schema-merging
> Since schema merging is a relatively expensive operation, and is not a
necessity in most cases, we turned it off by default starting from 1.5.0. You
may enable it by
> * setting data source option mergeSchema to true when reading Parquet
files (as shown in the examples below), or
> * setting the global SQL option spark.sql.parquet.mergeSchema to true.
Wouldn't it work fine with `spark.sql.parquet.mergeSchema` enabled?
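
For reference, here is a minimal sketch of the two ways of enabling schema merging that the quoted docs describe. It assumes a local `SparkSession`; the object name, temp directory, partition layout, and column names are made up for illustration and are not taken from the PR. It only shows how the setting is turned on. Whether it avoids the Parquet-mr exception in the case this PR handles is the open question.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the PR.
object MergeSchemaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("merge-schema-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Illustrative base directory for the Parquet files.
    val base = Files.createTempDirectory("merge-schema").toString

    // Two partitions whose file schemas differ: the second adds `score`.
    Seq((1, "a")).toDF("id", "name").write.parquet(s"$base/part=1")
    Seq((2, "b", 3.0)).toDF("id", "name", "score").write.parquet(s"$base/part=2")

    // Option 1: enable schema merging for a single read.
    spark.read.option("mergeSchema", "true").parquet(base).printSchema()

    // Option 2: enable schema merging globally for the session.
    spark.conf.set("spark.sql.parquet.mergeSchema", "true")
    spark.read.parquet(base).printSchema()

    spark.stop()
  }
}
```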