[GitHub] spark pull request #20484: [SPARK-23313][DOC] Add a migration guide for ORC

gatorsmile Mon, 12 Feb 2018 12:45:51 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20484#discussion_r167680857
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1776,6 +1776,35 @@ working with timestamps in `pandas_udf`s to get the best performance, see
     
     ## Upgrading From Spark SQL 2.2 to 2.3
     
    +  - Since Spark 2.3, Spark supports a vectorized ORC reader with a new ORC file format for ORC files. To do that, the following configurations are newly added or change their default values. For creating ORC tables, `USING ORC` or `USING HIVE` syntaxes are recommended.
    --- End diff --
    
    When users create tables with `USING HIVE`, Spark uses the ORC library in Hive 1.2.1 to read and write those ORC tables, unless they manually set `spark.sql.hive.convertMetastoreOrc` to `true`.
    
    The last sentence, which recommends `USING ORC` or `USING HIVE`, is confusing to me.
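
    To make the distinction concrete, here is a minimal configuration sketch of the two syntaxes. It assumes a Spark 2.3 session with Hive support enabled. The table names and columns are hypothetical, and the `fileFormat` option is an assumption about how the Hive table would be declared, since the diff does not show it.

    ```sql
    -- Hypothetical tables, for illustration only.

    -- Spark data source table stored as ORC.
    CREATE TABLE orc_events (id INT, name STRING) USING ORC;

    -- Hive serde table stored as ORC (requires Hive support).
    CREATE TABLE hive_events (id INT, name STRING)
      USING HIVE OPTIONS (fileFormat 'orc');

    -- Per the comment above, hive_events is read and written with the
    -- Hive 1.2.1 ORC library unless this conversion is turned on:
    SET spark.sql.hive.convertMetastoreOrc=true;
    ```

    Both statements produce ORC tables, but they do not necessarily go through the same ORC library, which is why recommending the two syntaxes side by side can mislead.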



