[ 
https://issues.apache.org/jira/browse/HUDI-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishith Agarwal updated HUDI-152:
---------------------------------
    Labels: sev:normal triaged user-support-issues  (was: sev:critical triaged 
user-support-issues)

> Invoke preCombine in real time view by converting arrayWritable to Avro
> -----------------------------------------------------------------------
>
>                 Key: HUDI-152
>                 URL: https://issues.apache.org/jira/browse/HUDI-152
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Hive Integration
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Major
>              Labels: sev:normal, triaged, user-support-issues
>
> There are two issues with the realtime input format:
>  
>  # Delta records (updates) might not carry the entire row change log. For 
> such an update, we need to call preCombine of the HoodieRecordPayload 
> implementation so that the existing data from parquet (the full row change 
> log) is merged with the new values of the columns being updated.
>  # Any custom computation of columns in a custom implementation of 
> HoodieRecordPayload is currently skipped by the realtime input format. We 
> need to honor it by calling preCombine.
>  
> Both of the above are use cases for power users who implement their own 
> custom record payload. Since this is not common, this is lower priority.
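
The merge described in the first item can be illustrated with a minimal sketch. It is not Hudi code: PartialRowPayload, its columns map, and orderingValue are hypothetical stand-ins chosen for illustration, and a plain Map replaces the Avro record that the realtime view would build from the ArrayWritable. It only shows the idea of overlaying a partial delta record on the full base row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a record payload.
// This is NOT Hudi's HoodieRecordPayload interface.
final class PartialRowPayload {
    // Only the columns this record actually carries (a delta may hold a subset).
    final Map<String, Object> columns;
    // Assumed ordering field: a larger value means a newer record.
    final long orderingValue;

    PartialRowPayload(Map<String, Object> columns, long orderingValue) {
        this.columns = new LinkedHashMap<>(columns);
        this.orderingValue = orderingValue;
    }

    // Merge two versions of the same key: start from the older record's
    // columns and overlay the columns present in the newer record.
    PartialRowPayload preCombine(PartialRowPayload other) {
        PartialRowPayload older = this.orderingValue <= other.orderingValue ? this : other;
        PartialRowPayload newer = (older == this) ? other : this;
        Map<String, Object> merged = new LinkedHashMap<>(older.columns);
        merged.putAll(newer.columns);
        return new PartialRowPayload(merged, newer.orderingValue);
    }
}
```

For example, combining a base row {id=1, name=a, city=Paris} (ordering 1) with a delta {id=1, city=Berlin} (ordering 2) yields {id=1, name=a, city=Berlin}. Without such a merge step, the realtime view would see only the columns present in the delta record.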


