[GitHub] spark pull request #20610: [SPARK-23426][SQL] Use `hive` ORC impl and disabl...

viirya Wed, 14 Feb 2018 16:37:25 -0800

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20610#discussion_r168355081
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -1004,6 +1004,29 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
     </tr>
     </table>
     
    +## ORC Files
    +
    +Since Spark 2.3, Spark supports a vectorized ORC reader with a new ORC file format for ORC files.
    +To do that, the following configurations are newly added. The vectorized reader is used for the
    +native ORC tables (e.g., the ones created using the clause `USING ORC`) when `spark.sql.orc.impl`
    +is set to `native` and `spark.sql.orc.enableVectorizedReader` is set to `true`. For the Hive ORC
    +serde tables (e.g., the ones created using the clause `USING HIVE OPTIONS (fileFormat 'ORC')`),
    +the vectorized reader is used when `spark.sql.hive.convertMetastoreOrc` is set to `true`.
    +
    +<table class="table">
    +  <tr><th><b>Property Name</b></th><th><b>Default</b></th><th><b>Meaning</b></th></tr>
    +  <tr>
    +    <td><code>spark.sql.orc.impl</code></td>
    +    <td><code>hive</code></td>
    +    <td>The name of ORC implementation. It can be one of <code>native</code> and <code>hive</code>. <code>native</code> means the native ORC support that is built on Apache ORC 1.4.1. `hive` means the ORC library in Hive 1.2.1 which is used prior to Spark 2.3.</td>
    +  </tr>
    +  <tr>
    +    <td><code>spark.sql.orc.enableVectorizedReader</code></td>
    +    <td><code>true</code></td>
    +    <td>Enables vectorized orc decoding in <code>native</code> implementation. If <code>false</code>, a new non-vectorized ORC reader is used in <code>native</code> implementation. For <code>hive</code> implementation, this is ignored.</td>
    +  </tr>
    +</table>
    +
    --- End diff --
    
    Has the description of `spark.sql.orc.filterPushdown` disappeared?
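
For readers following the quoted paragraph, here is a minimal sketch in plain Python of when the vectorized reader would apply. It follows only the rules as worded in the diff above; Spark's real decision logic may involve more conditions. The function name `uses_vectorized_reader` and the `table_kind` labels are hypothetical, and the fallback of `false` for `spark.sql.hive.convertMetastoreOrc` is an assumption, since the quoted table does not list that property.

```python
# Defaults for the two properties listed in the quoted table.
TABLE_DEFAULTS = {
    "spark.sql.orc.impl": "hive",
    "spark.sql.orc.enableVectorizedReader": "true",
}


def uses_vectorized_reader(table_kind, conf):
    """Return True if the quoted text says the vectorized ORC reader is used.

    table_kind is a hypothetical label: "native" for tables created with
    `USING ORC`, "hive_serde" for tables created with
    `USING HIVE OPTIONS (fileFormat 'ORC')`.
    conf maps property names to string values, overriding TABLE_DEFAULTS.
    """
    merged = {**TABLE_DEFAULTS, **conf}
    if table_kind == "native":
        # Both conditions from the quoted paragraph must hold.
        return (
            merged["spark.sql.orc.impl"] == "native"
            and merged["spark.sql.orc.enableVectorizedReader"] == "true"
        )
    if table_kind == "hive_serde":
        # Assumption: treated as "false" when unset; the quoted table
        # does not state a default for this property.
        return merged.get("spark.sql.hive.convertMetastoreOrc", "false") == "true"
    raise ValueError(f"unknown table kind: {table_kind!r}")
```

With the defaults in the quoted table (`spark.sql.orc.impl` set to `hive`), a native ORC table would not use the vectorized reader until `spark.sql.orc.impl` is switched to `native`.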



