dzcxzl created SPARK-31590:
------------------------------

             Summary: The filter used by Metadata-only queries should not have Unevaluable
                 Key: SPARK-31590
                 URL: https://issues.apache.org/jira/browse/SPARK-31590
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: dzcxzl


code:
{code:scala}
        sql("set spark.sql.optimizer.metadataOnly=true")
        sql("CREATE TABLE test_tbl (a INT, d STRING, h STRING) USING PARQUET PARTITIONED BY (d, h)")
        sql("""
            |INSERT OVERWRITE TABLE test_tbl PARTITION(d,h)
            |SELECT 1,'2020-01-01','23'
            |UNION ALL
            |SELECT 2,'2020-01-02','01'
            |UNION ALL
            |SELECT 3,'2020-01-02','02'
            """.stripMargin)
        sql(
          s"""
             |SELECT d, MAX(h) AS h
             |FROM test_tbl
             |WHERE d= (
             |  SELECT MAX(d) AS d
             |  FROM test_tbl
             |)
             |GROUP BY d
        """.stripMargin).collect()
{code}

Exception:
{code:java}
java.lang.UnsupportedOperationException: Cannot evaluate expression: scalar-subquery#48 []

...
at org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.prunePartitions(PartitioningAwareFileIndex.scala:180)
{code}

optimizedPlan:
{code:java}
Aggregate [d#245], [d#245, max(h#246) AS h#243]
+- Project [d#245, h#246]
   +- Filter (isnotnull(d#245) AND (d#245 = scalar-subquery#242 []))
      :  +- Aggregate [max(d#245) AS d#241]
      :     +- LocalRelation <empty>, [d#245]
      +- Relation[a#244,d#245,h#246] parquet
{code}





