GitHub user dilipbiswal opened a pull request:

    https://github.com/apache/spark/pull/21174

    [SPARK-24085] Query returns UnsupportedOperationException when scalar 
subquery is present in partitioning expression

    ## What changes were proposed in this pull request?
    When a scalar subquery appears in a predicate on a partition column, partition
    pruning runs before the subquery has been planned. Scalar subquery expressions
    are planned late in the cycle (after physical planning), in "PlanSubqueries",
    just before execution. Currently we try to evaluate the scalar subquery
    expression as part of partition pruning, which fails because the expression
    implements Unevaluable.
    
    The fix excludes subquery expressions from the partition pruning computation.
    Another option would be to plan the subqueries before partition pruning. Since
    this is probably not a commonly occurring expression, I am opting for the
    simpler fix.
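
To illustrate the idea only, here is a minimal toy model in Python. It is not Spark code: every class and function name (`Literal`, `Column`, `ScalarSubquery`, `EqualTo`, `has_subquery`, `prune_partitions`) is hypothetical, and `NotImplementedError` merely stands in for the UnsupportedOperationException raised by an Unevaluable expression. The sketch assumes pruning simply drops any predicate that contains an unplanned subquery and leaves that predicate to be applied later as an ordinary filter.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Literal:
    value: str

    def eval(self, row: Dict[str, str]) -> str:
        return self.value


@dataclass(frozen=True)
class Column:
    name: str

    def eval(self, row: Dict[str, str]) -> str:
        return row[self.name]


@dataclass(frozen=True)
class ScalarSubquery:
    """Stand-in for an expression that cannot be evaluated until it is planned."""
    sql: str

    def eval(self, row: Dict[str, str]) -> str:
        raise NotImplementedError(f"cannot evaluate unplanned subquery: {self.sql}")


@dataclass(frozen=True)
class EqualTo:
    left: object
    right: object

    def eval(self, row: Dict[str, str]) -> bool:
        return self.left.eval(row) == self.right.eval(row)


def has_subquery(expr: object) -> bool:
    """Return True if the expression tree contains a ScalarSubquery node."""
    if isinstance(expr, ScalarSubquery):
        return True
    if isinstance(expr, EqualTo):
        return has_subquery(expr.left) or has_subquery(expr.right)
    return False


def prune_partitions(
    partitions: List[Dict[str, str]],
    predicates: List[EqualTo],
    skip_subqueries: bool = True,
) -> List[Dict[str, str]]:
    """Keep the partitions that satisfy every predicate usable at pruning time."""
    usable = [p for p in predicates if not (skip_subqueries and has_subquery(p))]
    return [part for part in partitions if all(p.eval(part) for p in usable)]


if __name__ == "__main__":
    partitions = [{"id_type": "a"}, {"id_type": "b"}]
    predicate = EqualTo(Column("id_type"), ScalarSubquery("select 'b'"))
    # The subquery predicate is skipped, so no partition is pruned here.
    print(prune_partitions(partitions, [predicate]))
```

In this model, calling `prune_partitions(partitions, [predicate], skip_subqueries=False)` raises the error, which mirrors the failure described above. The trade-off is that no partitions are pruned for such a predicate, so the query stays correct but may read more partitions than strictly necessary.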
    
    Repro
    ``` SQL
    CREATE TABLE test_prc_bug (
    id_value string
    )
    partitioned by (id_type string)
    location '/tmp/test_prc_bug'
    stored as parquet;
    
    insert into test_prc_bug values ('1','a');
    insert into test_prc_bug values ('2','a');
    insert into test_prc_bug values ('3','b');
    insert into test_prc_bug values ('4','b');
    
    
    select * from test_prc_bug
    where id_type = (select 'b');
    ```
    ## How was this patch tested?
    Added tests in SubquerySuite and hive/SQLQuerySuite.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dilipbiswal/spark spark-24085

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21174.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21174
    
----
commit 38c769274fca2931d0b0147e5e666b9cd7c99f59
Author: Dilip Biswal <dbiswal@...>
Date:   2018-04-26T00:40:01Z

    [SPARK-24085] Query returns UnsupportedOperationException when scalar 
subquery is present in partitioning expression.

----


---

