[jira] [Commented] (HUDI-2426) spark sql extensions breaks read.table from metastore

Raymond Xu (Jira) Sun, 09 Jan 2022 16:47:05 -0800


    [ https://issues.apache.org/jira/browse/HUDI-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471622#comment-17471622 ]


Raymond Xu commented on HUDI-2426:
----------------------------------

I verified this CTAS with Hudi master (0.11.0-SNAPSHOT) against Spark 3.0.3, 
3.1.2, and 3.2.0, and it works as expected.


{code}
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.3
      /_/
         
Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_312)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val sql = """
     | create table ctas_cow_pt_nonpcf_tbl using hudi
     | options (type = 'cow', primaryKey = 'id')
     | partitioned by (dt)
     | as
     | select 1 as id, 'a1' as name, 10 as price, 1000 as dt;
     | """
sql: String =
"
create table ctas_cow_pt_nonpcf_tbl using hudi
options (type = 'cow', primaryKey = 'id')
partitioned by (dt)
as
select 1 as id, 'a1' as name, 10 as price, 1000 as dt;
"

scala> spark.sql(sql)
22/01/10 00:44:29 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
res0: org.apache.spark.sql.DataFrame = []                                       

scala> spark.table("default.ctas_cow_pt_nonpcf_tbl").show
+-------------------+--------------------+------------------+----------------------+--------------------+---+----+-----+----+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name| id|name|price|  dt|
+-------------------+--------------------+------------------+----------------------+--------------------+---+----+-----+----+
|  20220110004428844|20220110004428844...|              id:1|               dt=1000|af4772c8-a56e-48a...|  1|  a1|   10|1000|
+-------------------+--------------------+------------------+----------------------+--------------------+---+----+-----+----+

{code}


> spark sql extensions breaks read.table from metastore
> -----------------------------------------------------
>
>                 Key: HUDI-2426
>                 URL: https://issues.apache.org/jira/browse/HUDI-2426
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Spark Integration
>            Reporter: nicolas paris
>            Assignee: Yann Byron
>            Priority: Critical
>              Labels: sev:critical, user-support-issues
>             Fix For: 0.10.1
>
>
> Adding the Hudi Spark SQL extension breaks the ability to read a Hudi table 
> from the metastore in Spark:
>  bash-4.2$ ./spark3.0.2/bin/spark-shell \
>    --packages org.apache.hudi:hudi-spark3-bundle_2.12:0.9.0,org.apache.spark:spark-avro_2.12:3.1.2 \
>    --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
>    --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
>  
> scala> spark.table("default.test_hudi_table").show
> java.lang.UnsupportedOperationException: Unsupported parseMultipartIdentifier method
>  at org.apache.spark.sql.parser.HoodieCommonSqlParser.parseMultipartIdentifier(HoodieCommonSqlParser.scala:65)
>  at org.apache.spark.sql.SparkSession.table(SparkSession.scala:581)
>  ... 47 elided
>  
> Removing the config makes the Hive table readable again from Spark.
> This affects at least Spark 3.0.x and 3.1.x.
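
The trace in the description shows SparkSession.table calling 
parseMultipartIdentifier on the session's parser, which the extension had 
replaced with HoodieCommonSqlParser. The failure mode can be illustrated with 
a minimal, Spark-free sketch. Every name below (MiniParser, BaseParser, 
ThrowingWrapper, DelegatingWrapper, table) is hypothetical and stands in for 
the real classes. Forwarding the call to the wrapped parser is an assumed 
remedy, not a description of the actual Hudi fix.

```scala
object ParserSketch {
  // Hypothetical stand-in for a SQL parser interface; not Spark's real API.
  trait MiniParser {
    def parsePlan(sqlText: String): String
    def parseMultipartIdentifier(sqlText: String): Seq[String]
  }

  // Stand-in for the session's built-in parser.
  object BaseParser extends MiniParser {
    def parsePlan(sqlText: String): String = s"plan($sqlText)"
    def parseMultipartIdentifier(sqlText: String): Seq[String] =
      sqlText.split('.').toSeq
  }

  // Failure mode: the wrapper forwards parsePlan but rejects identifier parsing.
  class ThrowingWrapper(delegate: MiniParser) extends MiniParser {
    def parsePlan(sqlText: String): String = delegate.parsePlan(sqlText)
    def parseMultipartIdentifier(sqlText: String): Seq[String] =
      throw new UnsupportedOperationException(
        "Unsupported parseMultipartIdentifier method")
  }

  // Assumed remedy: forward every call to the wrapped parser.
  class DelegatingWrapper(delegate: MiniParser) extends MiniParser {
    def parsePlan(sqlText: String): String = delegate.parsePlan(sqlText)
    def parseMultipartIdentifier(sqlText: String): Seq[String] =
      delegate.parseMultipartIdentifier(sqlText)
  }

  // Stand-in for a table lookup that resolves the name through the parser first.
  def table(parser: MiniParser, name: String): Seq[String] =
    parser.parseMultipartIdentifier(name)
}
```

With ThrowingWrapper, table fails before any catalog lookup can happen, which 
matches the shape of the reported trace. With DelegatingWrapper the name is 
split into its parts as usual.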



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
