[ https://issues.apache.org/jira/browse/HUDI-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nishith Agarwal updated HUDI-152:
---------------------------------
Labels: sev:normal triaged user-support-issues (was: sev:critical triaged
user-support-issues)
> Invoke preCombine in real time view by converting arrayWritable to Avro
> -----------------------------------------------------------------------
>
> Key: HUDI-152
> URL: https://issues.apache.org/jira/browse/HUDI-152
> Project: Apache Hudi
> Issue Type: Bug
> Components: Hive Integration
> Reporter: Nishith Agarwal
> Assignee: Nishith Agarwal
> Priority: Major
> Labels: sev:normal, triaged, user-support-issues
>
> There are 2 issues with the realtime input format:
>
> # Delta records (updates) might not contain the entire row change log. For
> such an update, we need to be able to call preCombine of the
> HoodieRecordPayload implementation so that the existing data from parquet
> (full row change log) is merged with the new column being updated.
> # If a custom implementation of HoodieRecordPayload computes some columns
> itself, that computation is currently skipped in the realtime input format.
> We need to honor it by calling preCombine.
>
> Both of the above are use cases for power users who implement their own
> custom HoodieRecordPayload. Since this is not common, this is lower priority.
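
To make the two cases above concrete, here is a minimal, self-contained sketch of the merge behaviour the issue asks for. It does not use Hudi's classes: PartialRowMergeSketch, its preCombine method, and the column names (id, price, quantity, total) are all hypothetical stand-ins chosen for illustration, and rows are modelled as plain maps rather than Avro records or ArrayWritable. It only shows the idea of overlaying a partial delta on the full parquet row and then re-running a custom column computation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a custom payload; NOT Hudi's HoodieRecordPayload API.
final class PartialRowMergeSketch {

    // Overlay a partial delta (only the changed columns) on the full base row,
    // then recompute a derived column, mimicking custom payload logic.
    static Map<String, Object> preCombine(Map<String, Object> baseRow,
                                          Map<String, Object> delta) {
        // Case 1: start from the full row so untouched columns are preserved.
        Map<String, Object> merged = new LinkedHashMap<>(baseRow);
        merged.putAll(delta);

        // Case 2: custom computed column (assumed here: total = price * quantity).
        int price = (Integer) merged.get("price");
        int quantity = (Integer) merged.get("quantity");
        merged.put("total", price * quantity);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> baseRow = new LinkedHashMap<>();
        baseRow.put("id", "k1");
        baseRow.put("price", 10);
        baseRow.put("quantity", 2);
        baseRow.put("total", 20);

        // The delta record carries only the column that changed.
        Map<String, Object> delta = new LinkedHashMap<>();
        delta.put("quantity", 5);

        // prints {id=k1, price=10, quantity=5, total=50}
        System.out.println(preCombine(baseRow, delta));
    }
}
```

Without the merge step, a reader that simply returned the delta record would lose price, and without the recomputation, total would stay stale at 20.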
--
This message was sent by Atlassian Jira
(v8.3.4#803005)