GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/9687

    [SPARK-11677][SQL] ORC filter tests all pass if filters are actually not 
pushed down.

    Currently, the ORC filter tests do not verify pushdown properly. All of them 
pass even if the filters are not pushed down or pushdown is disabled. This PR 
adds checks that catch this.
    
    Several things to mention.
    Firstly, since ORC does not fully filter record by record, I check the row 
count and whether the result contains the expected values.
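    As a rough illustration of why the check is on the count and on containment rather than on exact equality, here is a minimal, self-contained Python sketch. It is not the code in this PR and does not use Spark or ORC APIs; the stripe size, `scan`, and `pushdown_took_effect` are hypothetical names, and stripe skipping by minimum value is a simplified stand-in for coarse-grained pushdown.

```python
from typing import List, Optional

STRIPE_SIZE = 5  # hypothetical; chosen small so the effect is visible


def make_stripes(values: List[int], size: int = STRIPE_SIZE) -> List[List[int]]:
    """Split the data into fixed-size chunks standing in for stripes."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def scan(stripes: List[List[int]], less_than: Optional[int] = None) -> List[int]:
    """Return the rows read from storage.

    With a pushed-down `value < less_than` predicate, a stripe is skipped
    only when its minimum rules out every row. A surviving stripe is
    returned whole, so rows that do not match can still come back.
    """
    rows: List[int] = []
    for stripe in stripes:
        if less_than is not None and min(stripe) >= less_than:
            continue  # the whole stripe cannot match, so skip it
        rows.extend(stripe)
    return rows


def pushdown_took_effect(source_rows: List[int], expected: List[int], total: int) -> bool:
    """Fewer rows than the full data were read, and no expected row is missing."""
    return len(source_rows) < total and set(expected) <= set(source_rows)


values = list(range(1, 11))              # rows 1..10
stripes = make_stripes(values)           # [1..5] and [6..10]
expected = [v for v in values if v < 3]  # rows the filter should keep

pushed = scan(stripes, less_than=3)      # second stripe skipped, first kept whole
not_pushed = scan(stripes)               # no pushdown: every row is read

print(pushdown_took_effect(pushed, expected, len(values)))
print(pushdown_took_effect(not_pushed, expected, len(values)))
```

    In this sketch the pushed-down scan still returns rows 3, 4 and 5, so comparing against the exact expected rows would fail, while the count-plus-containment check distinguishes the two scans.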
    
    Secondly, I wonder whether it is okay to put `extractSourceRDDToDataFrame` in 
`QueryTest`. I did not do so here, but I think `extractSourceRDDToDataFrame` 
could be shared with `ParquetFilterSuite`. 
    
    Lastly, I originally wanted to add a separate `OrcFilterSuite` in order to 
test actual filter evaluation; however, I decided not to do that here (I will do 
it in a separate issue or follow-up PR) and to just make the existing tests work 
properly first.
    
    cc @liancheng  

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-11677

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9687.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9687
    
----
commit 7a13c8e6c73e3824b8188b865fcaeda4c7e04117
Author: hyukjinkwon <gurwls...@gmail.com>
Date:   2015-11-13T05:18:10Z

    [SPARK-11677][SQL] ORC filter tests all pass if filters are actually not 
pushed down.

commit 82d0aa773d58115b0a2b3d5fd782d473e26c2671
Author: hyukjinkwon <gurwls...@gmail.com>
Date:   2015-11-13T07:43:00Z

    [SPARK-11677][SQL] Add tests for is-not-null operator and in-operator

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
