ad1happy2go commented on issue #9838:
URL: https://github.com/apache/hudi/issues/9838#issuecomment-1754472931
@rita-ihnatsyeva Are you saying it worked before but has now stopped working?
I tried the code below and it worked with OSS Hudi.
```
test("Test MergeInto With Complex Data Types") {
  Seq(true, false).foreach { sparkSqlOptimizedWrites =>
    withRecordType()(withTempDir { tmp =>
      spark.sql("set hoodie.payload.combined.schema.validate = false")
      val tableName = generateTableName
      val cls = classOf[HoodieFileIndex]
      //val appender = LoggingTestUtils.attachInMemoryAppender(cls)
      var msg = "Total file slices: 1; candidate file slices after data skipping: 0; skipping percentage 1.0"
      // Create table
      spark.sql(
        s"""
           |create table $tableName (
           |  uuid int,
           |  array_of_structs array<STRUCT<id:string, created_at:string>>,
           |  array_of_deleted_ids array<string>,
           |  ts long
           |) using hudi
           | location 'file:///tmp/$tableName'
           | tblproperties (
           |  primaryKey ='uuid',
           |  preCombineField = 'ts'
           | )
       """.stripMargin)
      // Insert the initial record
      spark.sql(
        s"""INSERT INTO $tableName VALUES (
           |  1,
           |  ARRAY(named_struct('id', 'example_id', 'created_at', '2023-10-10')),
           |  ARRAY('id'),
           |  1633843200000
           |)
       """.stripMargin)
      // Register the incoming record as the merge source
      spark.sql(
        """
          |SELECT
          |  1 as uuid,
          |  ARRAY(named_struct('id', 'example_id1', 'created_at', '2023-10-11')) as array_of_structs,
          |  ARRAY('id') as array_of_deleted_ids,
          |  1633843200001 as ts
          |""".stripMargin).createOrReplaceTempView("source")
      // Merge: union the struct arrays, then drop entries whose id is listed as deleted
      spark.sql(
        s"""
           |merge into $tableName target
           |using source
           |on target.uuid = source.uuid
           |when matched then update set
           |  target.ts = source.ts,
           |  target.array_of_structs = filter(
           |    array_union(coalesce(target.array_of_structs, array()), source.array_of_structs),
           |    x -> !array_contains(source.array_of_deleted_ids, x.id))
           |when not matched then insert *
       """.stripMargin)
      spark.read.format("hudi").table(tableName).show(false)
    })
  }
}
```
Please share more details on this. Is there something I am missing in this
code? Can you try the code above and see if it works for you? Thanks.
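
As a sanity check on the expected result, the update expression in the MERGE can be mirrored with plain Scala collections. This is only a sketch: it does not touch Spark or Hudi, it ignores SQL null semantics inside the arrays, and the names `Item`, `MergeSketch` and `mergeItems` are hypothetical, introduced just for illustration.

```scala
// Hypothetical stand-in for STRUCT<id:string, created_at:string>.
case class Item(id: String, createdAt: String)

object MergeSketch {
  // Mirrors:
  //   filter(array_union(coalesce(target, array()), source),
  //          x -> !array_contains(deleted, x.id))
  def mergeItems(target: Seq[Item], source: Seq[Item], deletedIds: Seq[String]): Seq[Item] = {
    val existing = Option(target).getOrElse(Seq.empty) // coalesce(target, array())
    val union = (existing ++ source).distinct          // array_union drops duplicates
    union.filterNot(item => deletedIds.contains(item.id))
  }

  def main(args: Array[String]): Unit = {
    // Same sample values as the test above.
    val target = Seq(Item("example_id", "2023-10-10"))
    val source = Seq(Item("example_id1", "2023-10-11"))
    println(mergeItems(target, source, Seq("id")))
  }
}
```

With the sample data, neither struct has an id equal to 'id', so both structs should remain in array_of_structs after the merge.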