[ https://issues.apache.org/jira/browse/SPARK-31590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dzcxzl updated SPARK-31590:
---------------------------
Description:
When spark.sql.optimizer.metadataOnly is enabled (the metadata-only optimization introduced by SPARK-23877), some SQL queries fail at execution time.
Reproduction:
{code:scala}
sql("set spark.sql.optimizer.metadataOnly=true")
sql("CREATE TABLE test_tbl (a INT,d STRING,h STRING) USING PARQUET
PARTITIONED BY (d ,h)")
sql("""
|INSERT OVERWRITE TABLE test_tbl PARTITION(d,h)
|SELECT 1,'2020-01-01','23'
|UNION ALL
|SELECT 2,'2020-01-02','01'
|UNION ALL
|SELECT 3,'2020-01-02','02'
""".stripMargin)
sql(
s"""
|SELECT d, MAX(h) AS h
|FROM test_tbl
|WHERE d= (
| SELECT MAX(d) AS d
| FROM test_tbl
|)
|GROUP BY d
""".stripMargin).collect()
{code}
Exception:
{code:java}
java.lang.UnsupportedOperationException: Cannot evaluate expression: scalar-subquery#48 []
...
at org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.prunePartitions(PartitioningAwareFileIndex.scala:180)
{code}
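The failure only happens when the query is executed; the optimized logical plan itself can still be inspected, which is presumably how the plan below was obtained. A minimal way to print it (same spark-shell style as the reproduction above, assuming the table has already been created):
{code:scala}
// Build the failing query as a DataFrame, but do not collect() it.
val df = sql(
  """
    |SELECT d, MAX(h) AS h
    |FROM test_tbl
    |WHERE d = (
    |  SELECT MAX(d) AS d
    |  FROM test_tbl
    |)
    |GROUP BY d
  """.stripMargin)

// Print the optimized logical plan; at this point the partition filter
// still contains the scalar-subquery expression.
println(df.queryExecution.optimizedPlan.treeString)
{code}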
optimizedPlan:
{code:java}
Aggregate [d#245], [d#245, max(h#246) AS h#243]
+- Project [d#245, h#246]
   +- Filter (isnotnull(d#245) AND (d#245 = scalar-subquery#242 []))
      :  +- Aggregate [max(d#245) AS d#241]
      :     +- LocalRelation <empty>, [d#245]
      +- Relation[a#244,d#245,h#246] parquet
{code}
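As the issue title says, the filter that the metadata-only path ends up evaluating should not contain Unevaluable expressions: the un-planned scalar-subquery in the Filter above is Unevaluable, which is why eval() on the pushed-down partition filter throws. A minimal sketch of such a guard (a hypothetical helper, not the actual patch), using only the Catalyst Expression and Unevaluable APIs:
{code:scala}
import org.apache.spark.sql.catalyst.expressions.{Expression, Unevaluable}

// Hypothetical helper, not the actual fix: a predicate is only safe to
// evaluate against partition values if no node in its expression tree is
// Unevaluable (e.g. a scalar-subquery that has not been planned yet).
def isEvaluablePartitionFilter(predicate: Expression): Boolean =
  predicate.find(_.isInstanceOf[Unevaluable]).isEmpty
{code}
With a check like this, the metadata-only rewrite (or the partition pruning it triggers) could skip predicates it cannot evaluate instead of failing with UnsupportedOperationException.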
> The filter used by Metadata-only queries should not have Unevaluable
> --------------------------------------------------------------------
>
> Key: SPARK-31590
> URL: https://issues.apache.org/jira/browse/SPARK-31590
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: dzcxzl
> Priority: Trivial