Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/23108#discussion_r235670505
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -111,6 +111,8 @@ displayTitle: Spark SQL Upgrading Guide
- Since Spark 2.0, Spark converts Parquet Hive tables by default for
better performance. Since Spark 2.4, Spark converts ORC Hive tables by default,
too. It means Spark uses its own ORC support by default instead of Hive SerDe.
As an example, `CREATE TABLE t(id int) STORED AS ORC` would be handled with
Hive SerDe in Spark 2.3, and in Spark 2.4, it would be converted into Spark's
ORC data source table and ORC vectorization would be applied. Setting
`spark.sql.hive.convertMetastoreOrc` to `false` restores the previous behavior.
+ - In Spark 2.3 and earlier, `spark.sql.hive.convertMetastoreOrc`
defaults to `false`. If you specify a directory in the `LOCATION` clause of a
`CREATE EXTERNAL TABLE ... STORED AS ORC LOCATION` statement, Spark uses the
Hive ORC reader, which reads the data into the table as long as the directory
or a sub-directory contains matching data files. If you specify a wildcard
(`*`), the Hive ORC reader cannot read the data, because it treats the
wildcard as a directory name. For example, with ORC data stored at
`/tmp/orctab1/dir1/`, `CREATE EXTERNAL TABLE tab1(...) STORED AS ORC LOCATION
'/tmp/orctab1/'` reads the data, while `CREATE EXTERNAL TABLE tab2(...) STORED
AS ORC LOCATION '/tmp/orctab1/*'` does not. Since Spark 2.4,
`spark.sql.hive.convertMetastoreOrc` defaults to `true` and Spark uses the
native ORC reader, which reads the data when you specify the wildcard but not
when you specify the parent directory; see the sketch below. Setting
`spark.sql.hive.convertMetastoreOrc` to `false` restores the previous behavior.
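
A minimal sketch of the difference, using the `/tmp/orctab1/dir1/` path from
the example above (the `id int` column is reused from the earlier `CREATE
TABLE` example, and the table names `tab3`/`tab4` are illustrative only):

```sql
-- ORC data files live under /tmp/orctab1/dir1/.

-- Spark 2.3 and earlier (spark.sql.hive.convertMetastoreOrc=false, Hive ORC reader):
CREATE EXTERNAL TABLE tab1(id int) STORED AS ORC LOCATION '/tmp/orctab1/';   -- data is read
CREATE EXTERNAL TABLE tab2(id int) STORED AS ORC LOCATION '/tmp/orctab1/*';  -- data is not read: `*` is treated as a directory name

-- Spark 2.4 (spark.sql.hive.convertMetastoreOrc=true, native ORC reader):
CREATE EXTERNAL TABLE tab3(id int) STORED AS ORC LOCATION '/tmp/orctab1/*';  -- data is read
CREATE EXTERNAL TABLE tab4(id int) STORED AS ORC LOCATION '/tmp/orctab1/';   -- data is not read: the sub-directory is not picked up

-- Restore the old behavior in Spark 2.4:
SET spark.sql.hive.convertMetastoreOrc=false;
```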
--- End diff ---
Could you rephrase `but will not if you specify the parent directory` more
clearly, with examples like the other sentence?