[GitHub] spark pull request #20406: [SPARK-23230][SQL]Error by creating a data table ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20406#discussion_r167456217

    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala ---
    @@ -100,6 +100,25 @@ class HiveSerDeSuite extends HiveComparisonTest with PlanTest with BeforeAndAfte
           assert(output == Some("org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"))
           assert(serde == Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"))
         }
    +
    +    withSQLConf("hive.default.fileformat" -> "orc") {
    --- End diff --

    Actually, this PR does not need to improve the test coverage. What we really need to do is confirm whether Hive's default serdes are the ones added by this PR. Can anybody run it and post the results here?

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20406: [SPARK-23230][SQL]Error by creating a data table ...
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20406#discussion_r167440884

    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala ---
    @@ -100,6 +100,25 @@ class HiveSerDeSuite extends HiveComparisonTest with PlanTest with BeforeAndAfte
           assert(output == Some("org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"))
           assert(serde == Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"))
         }
    +
    +    withSQLConf("hive.default.fileformat" -> "orc") {
    --- End diff --

    Please test with all possible values which are supported by Spark.
[GitHub] spark pull request #20406: [SPARK-23230][SQL]Error by creating a data table ...
GitHub user cxzl25 opened a pull request:

    https://github.com/apache/spark/pull/20406

    [SPARK-23230][SQL]Error by creating a data table when using hive.default.fileformat=orc

    When hive.default.fileformat is set to another file format, creating a textfile table causes a serde error.
    We should take the default serde of both textfile and sequencefile to be org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.

    ```
    set hive.default.fileformat=orc;
    create table tbl( i string ) stored as textfile;
    desc formatted tbl;

    Serde Library org.apache.hadoop.hive.ql.io.orc.OrcSerde
    InputFormat   org.apache.hadoop.mapred.TextInputFormat
    OutputFormat  org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cxzl25/spark default_serde

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20406.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20406

commit f370dd6217cf8a590ef52ecc970e4dc33c235631
Author: sychen
Date:   2018-01-26T12:40:48Z

    take the default type of textfile and sequencefile both as org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
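
The mismatch reported in the description (an ORC serde paired with text input/output formats) can be illustrated with a small, self-contained Scala sketch. This is a simplified model and not Spark's actual code: `StorageFormat`, `SerdeFallbackSketch`, and `resolveSerde` are hypothetical names, the ORC input/output format class names are filled in for illustration, and the fallback rule (use the default format's serde when the `STORED AS` format declares none) is an assumption inferred from the `desc formatted` output above.

```scala
// Hypothetical, simplified model of serde resolution; not Spark's real implementation.
final case class StorageFormat(
    inputFormat: Option[String],
    outputFormat: Option[String],
    serde: Option[String])

object SerdeFallbackSketch {
  val LazySimpleSerDe = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
  val OrcSerde = "org.apache.hadoop.hive.ql.io.orc.OrcSerde"

  // Stands in for the format selected by hive.default.fileformat=orc.
  val orc: StorageFormat = StorageFormat(
    Some("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"),
    Some("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat"),
    Some(OrcSerde))

  // Assumed pre-fix state: textfile declares no serde of its own.
  val textfileWithoutSerde: StorageFormat = StorageFormat(
    Some("org.apache.hadoop.mapred.TextInputFormat"),
    Some("org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"),
    None)

  // Proposed state: textfile carries LazySimpleSerDe explicitly.
  val textfileWithSerde: StorageFormat =
    textfileWithoutSerde.copy(serde = Some(LazySimpleSerDe))

  // Assumed rule: prefer the STORED AS serde, else fall back to the default's.
  def resolveSerde(storedAs: StorageFormat, default: StorageFormat): Option[String] =
    storedAs.serde.orElse(default.serde)

  def main(args: Array[String]): Unit = {
    // Without an explicit serde, the ORC default leaks into a textfile table.
    println(resolveSerde(textfileWithoutSerde, orc)) // Some(org.apache.hadoop.hive.ql.io.orc.OrcSerde)
    // With an explicit serde, the default file format no longer matters.
    println(resolveSerde(textfileWithSerde, orc)) // Some(org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe)
  }
}
```

Under these assumptions, giving textfile (and likewise sequencefile) an explicit serde is enough to stop the default format's serde from being picked up, which is the change the commit message describes.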