Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22184
> if spark.sql.hive.convertMetastoreParquet and spark.sql.caseSensitive are
> both set to true, we throw an exception
I'd like to just skip the conversion and log a warning message to say why.
> ... which is not consistent
I think that's OK. In the end, they are different data sources and can define
their own behaviors.
But you do have a point about `spark.sql.hive.convertMetastoreParquet`: the
behavior must be consistent for us to do the conversion. My proposal is that the
Parquet data source should provide an option (not a SQL conf) to switch the
behavior when hitting duplicated field names in case-insensitive mode. Then, when
converting a Hive Parquet table to the Parquet data source, we set the option and
ask the Parquet data source to pick the first matched field.
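
To make the proposal concrete, here is a rough sketch of the field resolution I have in mind, written in plain Python rather than against Spark's internals. The function `resolve_field` and the flag `pick_first_on_duplicate` are hypothetical names used only for illustration; they are not existing Spark APIs or options.

```python
from typing import List, Optional


def resolve_field(
    requested: str,
    file_fields: List[str],
    case_sensitive: bool,
    pick_first_on_duplicate: bool = False,  # hypothetical per-source option
) -> Optional[str]:
    """Return the physical field name matching `requested`, or None if absent."""
    if case_sensitive:
        # Exact match only; duplicates differing by case are distinct fields.
        return requested if requested in file_fields else None

    # Case-insensitive mode: collect every field that matches ignoring case,
    # preserving the order in which the fields appear in the file schema.
    matches = [f for f in file_fields if f.lower() == requested.lower()]
    if not matches:
        return None
    if len(matches) > 1 and not pick_first_on_duplicate:
        # Assumed default behavior for the plain data source: refuse to guess.
        raise ValueError(
            f"Found duplicate field(s) for '{requested}' "
            f"in case-insensitive mode: {matches}"
        )
    # With the option set (e.g. by the Hive table conversion), take the first.
    return matches[0]
```

With this shape, the Hive conversion path would call it with `pick_first_on_duplicate=True`, while a direct read through the Parquet data source would keep the default and fail on ambiguous names.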