Selva,
Can you try setting the property "hoodie.combine.before.upsert" to
false? It defaults to true, which is why you see the error.
Regarding your question on key values:
For a non-partitioned dataset (global index): records having the same
record key will be combined if precombine is enabled.
For a partitioned dataset (non-global index): records having the same
HoodieKey (record key plus partition path) will be combined if precombine
is enabled.
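
To illustrate the point above, here is a minimal sketch in plain Python (not Hudi code) of what combining before upsert does to your Batch 2, alongside the option that turns it off. The `precombine` helper, the field names, and the tie-breaking rule are assumptions made for this illustration only; how Hudi resolves equal precombine values is not covered here. The only property taken from this thread is "hoodie.combine.before.upsert".

```python
from typing import Any, Dict, List

# Writer option discussed above. With Spark this dict would typically be
# passed as df.write.format("hudi").options(**hudi_options); the rest of the
# write call (table name, record key, path) is omitted from this sketch.
hudi_options: Dict[str, str] = {
    "hoodie.combine.before.upsert": "false",
}


def precombine(records: List[Dict[str, Any]],
               key_field: str,
               ordering_field: str) -> List[Dict[str, Any]]:
    """Illustrative only: keep one record per key, the one with the
    largest ordering value. On a tie this sketch keeps the first record
    seen, which is an assumption and not a statement about Hudi."""
    best: Dict[Any, Dict[str, Any]] = {}
    for rec in records:
        key = rec[key_field]
        if key not in best or rec[ordering_field] > best[key][ordering_field]:
            best[key] = rec
    return list(best.values())


# Batch 2 from the original mail: two updates for the same request_id,
# both carrying the same transaction timestamp.
batch_2 = [
    {"request_id": 123, "operation_name": "U", "field_1": "Value 1",
     "field_2": "value 0", "transaction_timestamp": "2020-05-15T20:23:26Z"},
    {"request_id": 123, "operation_name": "U", "field_1": "Value 0",
     "field_2": "value 2", "transaction_timestamp": "2020-05-15T20:23:26Z"},
]

combined = precombine(batch_2, "request_id", "transaction_timestamp")
print(len(batch_2), "records in,", len(combined), "record out")
```

With combining enabled, only one of the two updates for request 123 survives, which is the loss you are worried about. With the property set to false, both records reach the upsert, so any logical merge of the two updates has to be done in your own code before the write.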
On Sat, May 16, 2020 at 5:37 PM selvaraj periyasamy <
[email protected]> wrote:
> Team,
>
> Regarding PRECOMBINE_FIELD_OPT_KEY, I see the below description in the
> documentation.
>
> PRECOMBINE_FIELD_OPT_KEY
>
> Property: hoodie.datasource.write.precombine.field, Default: ts
> Field used in preCombining before actual write. When two records have the
> same key value, we will pick the one with the largest value for the
> precombine field, determined by Object.compareTo(..)
>
> It says "same key value". What does "key value" denote here? Is it the
> record key? I have to process small batches of records frequently, and a
> batch could contain several records for the same record key. For example:
>
> *Batch 1:*
>
> Request_id | Operation_name | field 1 | field 2 | transaction_timestamp
> 123        | I              | Value 0 | value 0 | Fri May 15 20:23:26 GMT 2020
>
> *Batch 2:*
>
> Request_id | Operation_name | field 1 | field 2 | transaction_timestamp
> 123        | U              | Value 1 | value 0 | Fri May 15 20:23:26 GMT 2020
> 123        | U              | Value 0 | value 2 | Fri May 15 20:23:26 GMT 2020
>
> Here Batch 1 is processed first, and about 10 minutes later I need to
> process Batch 2 using Hudi. Basically, I don't want Hudi to apply the
> PRECOMBINE_FIELD_OPT_KEY operation; I want to handle the Batch 2 updates
> logically myself, because Batch 2 has 2 updates that I need to combine
> logically and then upsert to HDFS.
>
> PRECOMBINE_FIELD_OPT_KEY seems to be mandatory: it throws an error if I
> don't have a ts column in my table. I also don't want Hudi to do
> precombine, as it could eliminate some of my genuine records. What is
> your suggestion for handling this scenario?
>
> Thanks,
>
> Selva
>
--
Regards,
-Sivabalan