[ 
https://issues.apache.org/jira/browse/SPARK-36860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420497#comment-17420497
 ] 

wineternity commented on SPARK-36860:
-------------------------------------

Thanks, [~sarutak]. May I ask why Spark doesn't support creating Hive tables 
using storage handlers? 

It seems Spark's parser already accepts the STORED BY syntax; the parsed 
clause is already available in CreateFileFormatContext.

!image-2021-09-27-14-18-10-910.png!

However, validateRowFormatFileFormat in AstBuilder later checks only the 
file format provided by the STORED AS syntax. Since fileFormat is null here, 
it throws an exception. Maybe the STORED BY clause could be supported by 
fixing this check?

!image-2021-09-27-14-25-28-900.png!
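
To make the suggestion concrete, here is a simplified Python model of the check described above. It is only a sketch, not Spark's code: the names CreateFileFormat, COMPATIBLE, validate and the allow_storage_handler flag are hypothetical, and the compatibility table is illustrative. It shows why a STORED BY clause, which leaves the file format unset, falls into the "unexpected combination" branch, and what a relaxed check might look like. Whether relaxing this check alone would be enough in Spark is not something the sketch can show.

```python
from dataclasses import dataclass
from typing import Optional


class OperationNotAllowed(Exception):
    """Stand-in for the ParseException that Spark raises."""


@dataclass
class CreateFileFormat:
    """Hypothetical stand-in for the parsed STORED AS / STORED BY clause."""
    file_format: Optional[str] = None      # set by STORED AS <format>
    storage_handler: Optional[str] = None  # set by STORED BY '<class>'


# Illustrative compatibility table (an assumption, not copied from Spark).
COMPATIBLE = {
    "SERDE": {"sequencefile", "textfile", "rcfile"},
    "DELIMITED": {"textfile"},
}


def validate(row_format, clause, allow_storage_handler=False):
    """Raise OperationNotAllowed for an unsupported ROW FORMAT / STORED pair."""
    if row_format is None or clause is None:
        return  # nothing to cross-check
    if clause.file_format is None:
        # STORED BY was used, so there is no file format to inspect.
        if allow_storage_handler and clause.storage_handler is not None:
            return  # the relaxed behaviour suggested above
        raise OperationNotAllowed(
            f"Unexpected combination of ROW FORMAT {row_format} "
            f"and STORED BY {clause.storage_handler!r}"
        )
    if clause.file_format.lower() not in COMPATIBLE[row_format]:
        raise OperationNotAllowed(
            f"ROW FORMAT {row_format} is incompatible with {clause.file_format!r}"
        )


hbase = CreateFileFormat(
    storage_handler="org.apache.hadoop.hive.hbase.HBaseStorageHandler"
)
try:
    validate("SERDE", hbase)  # current behaviour in this model: rejected
except OperationNotAllowed as err:
    print("rejected:", err)

validate("SERDE", hbase, allow_storage_handler=True)  # relaxed model: accepted
```

In this model the strict call is rejected only because file_format is unset, which matches the exception reported in the issue below.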

> Create the external hive table for HBase failed 
> ------------------------------------------------
>
>                 Key: SPARK-36860
>                 URL: https://issues.apache.org/jira/browse/SPARK-36860
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: wineternity
>            Priority: Major
>         Attachments: image-2021-09-27-14-18-10-910.png
>
>
> We use the following SQL to create an external Hive table that reads from HBase:
> {code:java}
> CREATE EXTERNAL TABLE if not exists dev.sanyu_spotlight_headline_material(
>    rowkey string COMMENT 'HBase row key',
>    content string COMMENT 'article body text')
> USING HIVE   
> ROW FORMAT SERDE
>    'org.apache.hadoop.hive.hbase.HBaseSerDe'
>  STORED BY
>    'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>  WITH SERDEPROPERTIES (
>    'hbase.columns.mapping'=':key, cf1:content'
> )
>  TBLPROPERTIES (
>    'hbase.table.name'='spotlight_headline_material'
>  );
> {code}
> But the SQL fails in Spark 3.1.2 with this exception:
> {code:java}
> 21/09/27 11:44:24 INFO scheduler.DAGScheduler: Asked to cancel job group 
> 26d7459f-7b58-4c18-9939-5f2737525ff2
> 21/09/27 11:44:24 ERROR thriftserver.SparkExecuteStatementOperation: Error 
> executing query with 26d7459f-7b58-4c18-9939-5f2737525ff2, currentState 
> RUNNING,
> org.apache.spark.sql.catalyst.parser.ParseException:
> Operation not allowed: Unexpected combination of ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.hbase.HBaseSerDe' and STORED BY 
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'WITHSERDEPROPERTIES('hbase.columns.mapping'=':key,
>  cf1:content')(line 5, pos 0)
> {code}
> This check was introduced by this change: 
> [https://github.com/apache/spark/pull/28026]
>  
> Could anyone explain how to create an external table for HBase in Spark 3 
> now? 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

