[GitHub] spark pull request #20610: [SPARK-23426][SQL] Use `hive` ORC impl and disabl...

dongjoon-hyun Wed, 14 Feb 2018 19:05:05 -0800

Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20610#discussion_r168371933
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1004,6 +1004,29 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
     </tr>
     </table>
     
    +## ORC Files
    +
    +Since Spark 2.3, Spark supports a vectorized ORC reader with a new ORC file format for ORC files.
    +To do that, the following configurations are newly added. The vectorized reader is used for the
    +native ORC tables (e.g., the ones created using the clause `USING ORC`) when `spark.sql.orc.impl`
    +is set to `native` and `spark.sql.orc.enableVectorizedReader` is set to `true`. For the Hive ORC
    +serde tables (e.g., the ones created using the clause `USING HIVE OPTIONS (fileFormat 'ORC')`),
    +the vectorized reader is used when `spark.sql.hive.convertMetastoreOrc` is set to `true`.
    --- End diff --
    
    Thank you. I see.
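
For readers following the thread, the rule in the quoted documentation text can be sketched as a small decision function. This is only an illustration of the wording in the diff, not Spark's actual implementation: the function name `uses_vectorized_reader`, the two table-kind constants, and the choice to treat an unset key as "not enabled" are all assumptions made for this sketch. Real Spark applies its own defaults and may consult additional settings.

```python
from typing import Mapping

# Hypothetical labels for the two table kinds named in the quoted text.
NATIVE_ORC = "native_orc"        # e.g. created with `USING ORC`
HIVE_ORC_SERDE = "hive_orc"      # e.g. created with `USING HIVE OPTIONS (fileFormat 'ORC')`


def uses_vectorized_reader(table_kind: str, conf: Mapping[str, str]) -> bool:
    """Return True if, per the quoted doc text, the vectorized ORC reader applies.

    Assumption: a key missing from `conf` counts as not enabled. Spark's real
    defaults are deliberately not modeled here.
    """

    def is_true(key: str) -> bool:
        return conf.get(key, "").strip().lower() == "true"

    if table_kind == NATIVE_ORC:
        # Native ORC tables need both the native impl and the reader flag.
        return (
            conf.get("spark.sql.orc.impl", "") == "native"
            and is_true("spark.sql.orc.enableVectorizedReader")
        )
    if table_kind == HIVE_ORC_SERDE:
        # Hive ORC serde tables depend on the metastore conversion flag.
        return is_true("spark.sql.hive.convertMetastoreOrc")
    raise ValueError(f"unknown table kind: {table_kind!r}")
```

With `spark.sql.orc.impl` set to `hive`, the sketch reports that a native ORC table does not use the vectorized reader even if `spark.sql.orc.enableVectorizedReader` is `true`, which matches the wording of the quoted paragraph.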


