[GitHub] [spark] HyukjinKwon commented on a change in pull request #28383: [SPARK-31590][SQL] The filter used by Metadata-only queries should not have Unevaluable
HyukjinKwon commented on a change in pull request #28383:
URL: https://github.com/apache/spark/pull/28383#discussion_r418857861

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/OptimizeMetadataOnlyQuery.scala ##

@@ -117,7 +117,7 @@ case class OptimizeMetadataOnlyQuery(catalog: SessionCatalog) extends Rule[Logic
       case a: AttributeReference =>
         a.withName(relation.output.find(_.semanticEquals(a)).get.name)
     }
-  }
+  }.filterNot(SubqueryExpression.hasSubquery)

Review comment:
Yeah, let's keep the PR title and description matched.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28383: [SPARK-31590][SQL] The filter used by Metadata-only queries should not have Unevaluable
HyukjinKwon commented on a change in pull request #28383:
URL: https://github.com/apache/spark/pull/28383#discussion_r418417828

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/OptimizeMetadataOnlyQuerySuite.scala ##

@@ -150,4 +150,30 @@ class OptimizeMetadataOnlyQuerySuite extends QueryTest with SharedSparkSession {
       }
     }
   }
+
+  test("SPARK-31590 The filter used by Metadata-only queries should not have Unevaluable") {
+    withTable("test_tbl") {
+      withSQLConf(OPTIMIZER_METADATA_ONLY.key -> "true") {
+        sql("CREATE TABLE test_tbl (a INT,d STRING,h STRING) USING PARQUET PARTITIONED BY (d ,h)")

Review comment:
Can we reuse `testMetadataOnly`?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28383: [SPARK-31590][SQL] The filter used by Metadata-only queries should not have Unevaluable
HyukjinKwon commented on a change in pull request #28383:
URL: https://github.com/apache/spark/pull/28383#discussion_r418415403

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/OptimizeMetadataOnlyQuerySuite.scala ##

@@ -150,4 +150,30 @@ class OptimizeMetadataOnlyQuerySuite extends QueryTest with SharedSparkSession {
       }
     }
   }
+
+  test("SPARK-31590 The filter used by Metadata-only queries should not have Unevaluable") {
+    withTable("test_tbl") {
+      withSQLConf(OPTIMIZER_METADATA_ONLY.key -> "true") {
+        sql("CREATE TABLE test_tbl (a INT,d STRING,h STRING) USING PARQUET PARTITIONED BY (d ,h)")

Review comment:
Can you minimise the test case and make it consistent with the style used in this file? I think you can create the partitioned table via the `write.parquetby` syntax instead of relying on SQL syntax here, even for table creation.
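For readers following the thread, here is a minimal sketch of what creating the partitioned table through the DataFrame writer might look like. It assumes the reviewer's `write.parquetby` refers to the `DataFrameWriter.partitionBy` API; the object name, the helper name, and the sample rows are hypothetical and are not taken from the pull request. It needs Spark SQL on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper, not code from the PR: builds a Parquet table
// partitioned by (d, h) without a CREATE TABLE statement.
object PartitionedTableSketch {
  def createPartitionedTable(spark: SparkSession, name: String): Set[String] = {
    import spark.implicits._

    // Made-up sample rows; the column names mirror the table in the diff.
    val data = (1 to 4).map(i => (i, s"d$i", s"h${i % 2}")).toDF("a", "d", "h")

    // partitionBy + saveAsTable registers a table partitioned by d and h.
    data.write
      .format("parquet")
      .partitionBy("d", "h")
      .mode("overwrite")
      .saveAsTable(name)

    // Return the names of the columns the catalog reports as partition columns.
    spark.catalog.listColumns(name).collect().filter(_.isPartition).map(_.name).toSet
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("partitioned-table-sketch")
      .getOrCreate()
    try {
      println(createPartitionedTable(spark, "test_tbl"))
    } finally {
      spark.sql("DROP TABLE IF EXISTS test_tbl")
      spark.stop()
    }
  }
}
```

Inside the suite itself, the existing `withTable` and `withSQLConf` helpers shown in the diff would presumably wrap the write call, so the session setup and cleanup above would not be needed there.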
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28383: [SPARK-31590][SQL] The filter used by Metadata-only queries should not have Unevaluable
HyukjinKwon commented on a change in pull request #28383:
URL: https://github.com/apache/spark/pull/28383#discussion_r418408328

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/OptimizeMetadataOnlyQuery.scala ##

@@ -119,6 +119,10 @@ case class OptimizeMetadataOnlyQuery(catalog: SessionCatalog) extends Rule[Logic
       }
     }

+    if (normalizedFilters.exists(_.find(_.isInstanceOf[Unevaluable]).isDefined)) {
+      return child
+    }

Review comment:
Why don't you just filter out subqueries, consistently with other normalized filters, via `normalizedFilters.filterNot(SubqueryExpression.hasSubquery)`?
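The difference between the two approaches in this comment can be illustrated with a small self-contained model. The sketch below uses toy expression classes, not Spark's Catalyst types; `Expr`, `Attr`, `Lit`, `EqualTo`, `ScalarSubquery`, and `FilterSketch` are all hypothetical names, and the early-return variant is simplified to test for subqueries rather than for `Unevaluable`.

```scala
// Toy stand-ins for Catalyst expressions; these are NOT Spark's real classes.
sealed trait Expr { def children: Seq[Expr] }
case class Attr(name: String) extends Expr { val children: Seq[Expr] = Nil }
case class Lit(value: Any) extends Expr { val children: Seq[Expr] = Nil }
case class EqualTo(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
}
// Models a subquery expression, which cannot be evaluated row by row.
case class ScalarSubquery(sql: String) extends Expr { val children: Seq[Expr] = Nil }

object FilterSketch {
  // Toy analogue of SubqueryExpression.hasSubquery:
  // true if any node in the expression tree is a subquery.
  def hasSubquery(e: Expr): Boolean = e match {
    case _: ScalarSubquery => true
    case other => other.children.exists(hasSubquery)
  }

  // Early-return style: abandon the optimization (None) as soon as
  // any filter contains a subquery.
  def bailOut(filters: Seq[Expr]): Option[Seq[Expr]] =
    if (filters.exists(hasSubquery)) None else Some(filters)

  // Suggested style: drop only the filters that contain a subquery
  // and keep using the remaining ones.
  def dropSubqueries(filters: Seq[Expr]): Seq[Expr] =
    filters.filterNot(hasSubquery)
}
```

With one plain filter and one subquery filter, `bailOut` gives up entirely, while `dropSubqueries` keeps the plain filter, so the remaining predicates can still be applied.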