Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/13622#discussion_r67021276
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
@@ -676,6 +676,46 @@ class ParquetSourceSuite extends
ParquetPartitioningTest {
}
}
+ test("Verify the PARQUET conversion parameter:
CONVERT_METASTORE_PARQUET") {
+ withTempTable("single") {
+ val singleRowDF = Seq((0, "foo")).toDF("key", "value")
+ singleRowDF.createOrReplaceTempView("single")
+
+ Seq("true", "false").foreach { parquetConversion =>
+ withSQLConf(HiveUtils.CONVERT_METASTORE_PARQUET.key ->
parquetConversion) {
+ val tableName = "test_parquet_ctas"
+ withTable(tableName) {
+ sql(
+ s"""
+ |CREATE TABLE $tableName STORED AS PARQUET
+ |AS SELECT tmp.key, tmp.value FROM single tmp
+ """.stripMargin)
+
+ val df = spark.sql(s"SELECT * FROM $tableName WHERE key=0")
+ checkAnswer(df, singleRowDF)
+
+ val queryExecution = df.queryExecution
+ if (parquetConversion == "true") {
+ queryExecution.analyzed.collectFirst {
+ case _: LogicalRelation =>
+ }.getOrElse {
+ fail(s"Expecting the query plan to convert parquet to data
sources, " +
+ s"but got:\n$queryExecution")
+ }
+ } else {
+ queryExecution.analyzed.collectFirst {
+ case _: MetastoreRelation =>
+ }.getOrElse {
+ fail(s"Expecting no conversion from parquet to data
sources, " +
+ s"but got:\n$queryExecution")
+ }
+ }
+ }
+ }
+ }
+ }
+ }
+
--- End diff --
Thank you for your review! @clockfly
I like your suggestion, but the conf `CONVERT_METASTORE_ORC` and
`CONVERT_METASTORE_PARQUET` are not available in `CatalystConf` and
`SimpleCatalystConf`. I am hesitant to change the source codes for improving
test case coverage.
Actually, I am not sure if we should keep them in the `Analyzer` in the
future release. These two rules are not used for resolution. Instead, it is for
performance enhancement. Ideally, I think they are `Optimizer` rules.
cc @cloud-fan @liancheng
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]