sunchao commented on a change in pull request #33350:
URL: https://github.com/apache/spark/pull/33350#discussion_r670676507



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitionsSuite.scala
##########
@@ -42,35 +43,27 @@ class PruneFileSourcePartitionsSuite extends PrunePartitionSuiteBase {
 
   test("PruneFileSourcePartitions should not change the output of LogicalRelation") {
     withTable("test") {
-      withTempDir { dir =>
-        sql(
-          s"""
-            |CREATE EXTERNAL TABLE test(i int)
-            |PARTITIONED BY (p int)
-            |STORED AS parquet
-            |LOCATION '${dir.toURI}'""".stripMargin)
-
-        val tableMeta = spark.sharedState.externalCatalog.getTable("default", "test")
-        val catalogFileIndex = new CatalogFileIndex(spark, tableMeta, 0)
-
-        val dataSchema = StructType(tableMeta.schema.filterNot { f =>
-          tableMeta.partitionColumnNames.contains(f.name)
-        })
-        val relation = HadoopFsRelation(
-          location = catalogFileIndex,
-          partitionSchema = tableMeta.partitionSchema,
-          dataSchema = dataSchema,
-          bucketSpec = None,
-          fileFormat = new ParquetFileFormat(),
-          options = Map.empty)(sparkSession = spark)
-
-        val logicalRelation = LogicalRelation(relation, tableMeta)
-        val query = Project(Seq(Symbol("i"), Symbol("p")),
-          Filter(Symbol("p") === 1, logicalRelation)).analyze
-
-        val optimized = Optimize.execute(query)
-        assert(optimized.missingInput.isEmpty)
-      }
+      spark.range(10).selectExpr("id", "id % 3 as p").write.partitionBy("p").saveAsTable("test")
+      val tableMeta = spark.sharedState.externalCatalog.getTable("default", "test")
+      val catalogFileIndex = new CatalogFileIndex(spark, tableMeta, 0)
+
+      val dataSchema = StructType(tableMeta.schema.filterNot { f =>
+        tableMeta.partitionColumnNames.contains(f.name)
+      })
+      val relation = HadoopFsRelation(
+        location = catalogFileIndex,
+        partitionSchema = tableMeta.partitionSchema,
+        dataSchema = dataSchema,
+        bucketSpec = None,
+        fileFormat = new ParquetFileFormat(),
+        options = Map.empty)(sparkSession = spark)
+
+      val logicalRelation = LogicalRelation(relation, tableMeta)
+      val query = Project(Seq(Symbol("id"), Symbol("p")),

Review comment:
       Thanks. I'm not sure whether it's worth doing so, because we changed how the test table is created: the DataFrame API call `spark.range(10).selectExpr("id", "id % 3 as p").write.partitionBy("p").saveAsTable("test")` creates the `id` column by default. The `id` name is also consistent with the rest of the tests in this file and with other tests that use the same API to create tables.
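
       For reference, a minimal spark-shell sketch (illustrative only, not part of this PR) showing that `spark.range(n)` yields a single non-nullable `LongType` column named `id`, which is why the new test projects `'id` rather than `'i`:

       ```scala
       // spark.range(10) produces a Dataset with one column, "id" (long, non-nullable).
       val df = spark.range(10).selectExpr("id", "id % 3 as p")
       df.printSchema()
       // root
       //  |-- id: long (nullable = false)
       //  |-- p: long (nullable = false)

       // Writing it partitioned by `p` yields the partitioned "test" table the suite relies on.
       df.write.partitionBy("p").saveAsTable("test")
       ```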




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


