dongjoon-hyun commented on a change in pull request #23944: [SPARK-26932][DOC] Hive 2.1.1 cannot read ORC table created by Spark 2.4.0 in default
URL: https://github.com/apache/spark/pull/23944#discussion_r261876376
##########
File path: docs/sql-migration-guide-upgrade.md
##########
@@ -193,6 +193,8 @@ displayTitle: Spark SQL Upgrading Guide

   - Since Spark 2.0, Spark converts Parquet Hive tables by default for better performance. Since Spark 2.4, Spark converts ORC Hive tables by default, too. It means Spark uses its own ORC support by default instead of Hive SerDe. As an example, `CREATE TABLE t(id int) STORED AS ORC` would be handled with Hive SerDe in Spark 2.3, and in Spark 2.4, it would be converted into Spark's ORC data source table and ORC vectorization would be applied. To set `false` to `spark.sql.hive.convertMetastoreOrc` restores the previous behavior.

+  - Since Spark 2.4, Spark uses native ORC in default, Which cause Hive 2.1.1 cannot read ORC table created by Spark 2.4. Refer to [HIVE-16683](https://issues.apache.org/jira/browse/HIVE-16683) for details. To set `false` to `spark.sql.hive.convertMetastoreOrc` and set `hive` to `spark.sql.orc.impl` restores the previous behavior.

Review comment:
Hi, @haiboself. Line 172 is the better place to append this kind of notice.
- https://github.com/apache/spark/pull/23944/files#diff-3f19ec3d15dcd8cd42bb25dde1c5c1a9R172
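For readers of the proposed note, the two settings it names can be sketched as below. This is a minimal illustration and not part of the PR: the keys and values are taken from the quoted diff, while `LEGACY_ORC_CONF` and `to_submit_args` are hypothetical names introduced here, and the output assumes the usual `--conf key=value` form accepted by `spark-submit`.

```python
# Settings quoted in the proposed migration note (keys and values from the diff).
# The dict name and helper below are hypothetical, for illustration only.
LEGACY_ORC_CONF = {
    "spark.sql.hive.convertMetastoreOrc": "false",  # keep Hive SerDe for ORC Hive tables
    "spark.sql.orc.impl": "hive",                   # use the Hive ORC implementation
}


def to_submit_args(conf):
    """Render a conf mapping as a flat list of `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Prints the flags that could be appended to a spark-submit command line.
    print(" ".join(to_submit_args(LEGACY_ORC_CONF)))
```

The same two keys could presumably also be placed in `spark-defaults.conf` or set on a session; whether that is enough for Hive 2.1.1 to read the resulting tables is what the note and HIVE-16683 discuss, and is not verified here.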
