Hi Spark users,

I'm trying to create an external table using HiveContext after creating a
SchemaRDD and saving the RDD into a Parquet file on HDFS.

I would like to use the schema of the SchemaRDD (rdd_table) when I create
the external table.
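
For context, rdd_table is built roughly like this (a simplified sketch; the
Record case class, its fields, and the sample data are just placeholders for
the real ones):

import org.apache.spark.sql.SchemaRDD
import org.apache.spark.sql.hive.HiveContext

case class Record(id: Int, name: String)           // placeholder schema

val hiveContext = new HiveContext(sc)              // sc is the existing SparkContext
import hiveContext.createSchemaRDD                 // implicit RDD[Record] -> SchemaRDD

// rdd_table gets its schema from the Record case class via the implicit conversion
val rdd_table: SchemaRDD = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))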

For example:
rdd_table.saveAsParquetFile("/user/spark/my_data.parquet")
hiveContext.registerRDDAsTable(rdd_table, "rdd_table")
hiveContext.sql("CREATE EXTERNAL TABLE my_data LIKE rdd_table LOCATION '/user/spark/my_data.parquet'")

The last line fails with:

org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Table not found rdd_table
  at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:322)
  at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:284)
  at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
  at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
  at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:38)
  at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:382)
  at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:382)

Is this supported?

Best Regards,

Jerry
