Hi Spark users,

I'm trying to create an external table using HiveContext after creating a SchemaRDD and saving the RDD into a Parquet file on HDFS.
I would like to use the schema in the SchemaRDD (rdd_table) when I create the external table. For example:

  rdd_table.saveAsParquetFile("/user/spark/my_data.parquet")
  hiveContext.registerRDDAsTable(rdd_table, "rdd_table")
  hiveContext.sql("CREATE EXTERNAL TABLE my_data LIKE rdd_table LOCATION '/user/spark/my_data.parquet'")

The last line fails with:

org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Table not found rdd_table
        at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:322)
        at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:284)
        at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
        at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
        at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:38)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:382)
        at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:382)

Is this supported?

Best Regards,
Jerry
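
P.S. One workaround I'm considering (untested) is to spell out the columns myself instead of using LIKE, since rdd_table is only registered in the SQLContext catalog and not in the Hive metastore, which is probably why the Hive DDLTask can't find it. A rough sketch, assuming purely for illustration that the table has columns (id INT, name STRING) and that my Hive build understands STORED AS PARQUET:

  // hiveContext and rdd_table as defined above
  // Save the SchemaRDD as a Parquet directory on HDFS, as before
  rdd_table.saveAsParquetFile("/user/spark/my_data.parquet")

  // Declare the external table with an explicit column list instead of LIKE;
  // the column names/types here are placeholders for whatever rdd_table holds
  hiveContext.sql(
    """CREATE EXTERNAL TABLE my_data (id INT, name STRING)
      |STORED AS PARQUET
      |LOCATION '/user/spark/my_data.parquet'""".stripMargin)

I'd still prefer to reuse the schema from the SchemaRDD directly rather than duplicating it by hand, if that's possible.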