[ 
https://issues.apache.org/jira/browse/HUDI-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396900#comment-17396900
 ] 

ASF GitHub Bot commented on HUDI-2279:
--------------------------------------

vinothchandar commented on pull request #3415:
URL: https://github.com/apache/hudi/pull/3415#issuecomment-896330648


   @pengzhiwei2018 can you please take a look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


> Support column name matching for insert * and update set *  in merge into 
> when sourceTable's columns contains all targetTable's columns
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-2279
>                 URL: https://issues.apache.org/jira/browse/HUDI-2279
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Example:
> {code:java}
> val tableName = generateTableName
> // Create table
> spark.sql(
>  s"""
>  |create table $tableName (
>  | id int,
>  | name string,
>  | price double,
>  | ts long,
>  | dt string
>  |) using hudi
>  | location '${tmp.getCanonicalPath}/$tableName'
>  | options (
>  | primaryKey ='id',
>  | preCombineField = 'ts'
>  | )
>  """.stripMargin)
> spark.sql(
>   s"""
>      |merge into $tableName as t0
>      |using (
>      |  select 1 as id, '2021-05-05' as dt, 1002 as ts, 97 as price, 'a1' as 
> name union all
>      |  select 1 as id, '2021-05-05' as dt, 1003 as ts, 98 as price, 'a2' as 
> name union all
>      |  select 2 as id, '2021-05-05' as dt, 1001 as ts, 99 as price, 'a3' as 
> name
>      | ) as s0
>      |on t0.id = s0.id
>      |when matched then update set *
>      |when not matched  then insert *
>      |""".stripMargin)
> spark.sql(s"select id, name, price, ts, dt from $tableName").show(){code}
> Fow now,the result is:
> +---+----------+-----+---+---+
> | id| name|price| ts| dt|
> +---+----------+-----+---+---+
> | 2|2021-05-05| 99.0| 99| a3|
> | 1|2021-05-05| 98.0| 98| a2|
> +---+----------+-----+---+---+
> When the order of the column types of souceTable is different from that of 
> the column types of targetTable
>  
> {code:java}
> spark.sql(
>   s"""
>      |merge into ${tableName} as t0
>      |using (
>      |  select 1 as id, 'a1' as name, 1002 as ts, '2021-05-05' as dt, 97 as 
> price union all
>      |  select 1 as id, 'a2' as name, 1003 as ts, '2021-05-05' as dt, 98 as 
> price union all
>      |  select 2 as id, 'a3' as name, 1001 as ts, '2021-05-05' as dt, 99 as 
> price
>      | ) as s0
>      |on t0.id = s0.id
>      |when matched then update set *
>      |when not matched  then insert *
>      |""".stripMargin){code}
>  
> It will throw an exception:
> {code:java}
> [ERROR] 2021-08-05 21:48:53,941 org.apache.hudi.io.HoodieWriteHandle  - Error 
> writing record HoodieRecord{key=HoodieKey { recordKey=id:2 partitionPath=}, 
> currentLocation='null', newLocation='null'}
> java.lang.RuntimeException: Error in execute expression: 
> org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer.
> Expressions is: [boundreference() AS `id`  boundreference() AS `name`  
> CAST(boundreference() AS `price` AS DOUBLE)  CAST(boundreference() AS `ts` AS 
> BIGINT)  CAST(boundreference() AS `dt` AS STRING)]
> CodeBody is: {
> ......
> Caused by: java.lang.ClassCastException: 
> org.apache.spark.unsafe.types.UTF8String cannot be cast to 
> java.lang.IntegerCaused by: java.lang.ClassCastException: 
> org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer 
> at 
> org.apache.hudi.sql.payload.ExpressionPayloadEvaluator_366797ae_4c30_4862_8222_7be486ede4f8.eval(Unknown
>  Source) at 
> org.apache.spark.sql.hudi.command.payload.ExpressionPayload.org$apache$spark$sql$hudi$command$payload$ExpressionPayload$$evaluate(ExpressionPayload.scala:258)
>  ... 18 more{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to