GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/14518

    [SPARK-16610][SQL] Do not ignore `orc.compress` when `compression` option 
is unset

    ## What changes were proposed in this pull request?
    
    For the ORC source, Spark SQL has a writer option `compression`, which 
sets the codec; its value is also written to `orc.compress` (the ORC conf 
used for the codec). However, if a user sets only `orc.compress` in the writer 
options, we should not use the default value of `compression` (snappy) as the 
codec. Instead, we should respect the value of `orc.compress`.
    
    This PR makes the ORC data source respect `orc.compress` when `compression` 
is unset.
    
    So, the behaviour is as follows:
    
     1. Check `compression` and use this if it is set.
     2. If `compression` is not set, check `orc.compress` and use it.
     3. If `compression` and `orc.compress` are not set, then use the default 
snappy.
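    
    The precedence above can be sketched as a small lookup. This is only an illustration of the three rules, not the code in this patch: the helper name `resolve_orc_codec` and the plain-dict options are assumptions made for the example.

```python
from typing import Mapping

# Default codec named in the description above.
DEFAULT_CODEC = "snappy"


def resolve_orc_codec(options: Mapping[str, str]) -> str:
    """Hypothetical helper: choose the ORC codec from writer options.

    Precedence: `compression`, then `orc.compress`, then the default.
    """
    # 1. An explicit `compression` option always wins.
    compression = options.get("compression")
    if compression is not None:
        return compression

    # 2. Otherwise fall back to `orc.compress` if the user set it.
    orc_compress = options.get("orc.compress")
    if orc_compress is not None:
        return orc_compress

    # 3. Neither option is set: use the default.
    return DEFAULT_CODEC
```

    For example, options containing only `orc.compress` set to `ZLIB` resolve to `ZLIB`, while empty options resolve to `snappy`.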
    
    ## How was this patch tested?
    
    Unit test in `OrcQuerySuite`.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-16610

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14518
    
----
commit 2d55a61c7b5e59f2442fa20d8dc5bec8eceda650
Author: hyukjinkwon <[email protected]>
Date:   2016-08-06T02:05:25Z

    [SPARK-16610][SQL] Do not ignore `orc.compress` when `compression` option 
is unset

commit 4f2731370621e1fd9b25105a8f0184c98a7465f7
Author: hyukjinkwon <[email protected]>
Date:   2016-08-06T02:08:28Z

    Use SNAPPY as default

commit 1ad44eca2d796202c894c262efad666249a7b942
Author: hyukjinkwon <[email protected]>
Date:   2016-08-06T02:09:59Z

    Fix indentation

commit af1a3b837a3d384ba2387e2db0b5ae975870b21a
Author: hyukjinkwon <[email protected]>
Date:   2016-08-06T02:11:38Z

    Add a comment for default value

----


