GitHub user CrazyJacky opened a pull request: https://github.com/apache/spark/pull/19434
    [SPARK-21785][SQL] Support create table from a parquet file schema

    ## Support create table from a parquet file schema

    As described in the JIRA:

    ```sql
    CREATE EXTERNAL TABLE IF NOT EXISTS test LIKE 'PARQUET' '/user/test/abc/a.snappy.parquet'
    STORED AS PARQUET
    LOCATION '/user/test/def/';
    ```

    This is a very ugly fix, and I would like someone to help review and refine it. It only supports creating Hive tables.

    ## Tested by a test case and by building the runnable distribution

    ```scala
    test("create table like parquet") {
      // Resolve the sample Parquet file whose schema the new table should take.
      val f = getClass.getClassLoader.
        getResource("test-data/dec-in-fixed-len.parquet").getPath
      // Build the statement: LIKE 'parquet' '<path>', followed by the storage clauses.
      val v1 =
        """
          |create table if not exists db1.table1 like 'parquet'
        """.stripMargin.concat("'" + f + "'").concat(
        """
          |stored as sequencefile
          |location '/tmp/table1'
        """.stripMargin
      )
      val (desc, allowExisting) = extractTableDesc(v1)
      assert(allowExisting)
      assert(desc.identifier.database == Some("db1"))
      assert(desc.identifier.table == "table1")
      assert(desc.tableType == CatalogTableType.EXTERNAL)
      // The schema is expected to come from the Parquet file itself.
      assert(desc.schema == new StructType()
        .add("fixed_len_dec", "decimal(10,2)"))
      assert(desc.bucketSpec.isEmpty)
      assert(desc.viewText.isEmpty)
      assert(desc.viewDefaultDatabase.isEmpty)
      assert(desc.viewQueryColumnNames.isEmpty)
      // Location and storage format are expected to follow the LOCATION and STORED AS clauses.
      assert(desc.storage.locationUri == Some(new URI("/tmp/table1")))
      assert(desc.storage.inputFormat ==
        Some("org.apache.hadoop.mapred.SequenceFileInputFormat"))
      assert(desc.storage.outputFormat ==
        Some("org.apache.hadoop.mapred.SequenceFileOutputFormat"))
      assert(desc.storage.serde ==
        Some("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"))
    }
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jacshen/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19434.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19434

----
commit 6b23cb8ff5a778f4f1b4ca4f218cbe8c4e422101
Author: Shen <jacs...@lm-sea-11008031.corp.ebay.com>
Date:   2017-10-04T20:35:03Z

    Add support to create table which schema is reading from a given parquet file

commit 877a57ec439db4e688c71568ddd312bdc2a50cec
Author: jacshen <jacs...@ebay.com>
Date:   2017-10-04T20:37:08Z

    Merge branch 'master' of https://github.com/apache/spark

commit a22c39e795ab4a730d0277c4162cdfadd37dbf22
Author: jacshen <jacs...@ebay.com>
Date:   2017-10-04T21:21:02Z

    Add support to create table which schema is reading from a given parquet file

----