[ 
https://issues.apache.org/jira/browse/HUDI-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu resolved HUDI-499.
-----------------------------
    Resolution: Implemented

Configuration docs will be published later.

> Allow partition path to be updated with GLOBAL_BLOOM index
> ----------------------------------------------------------
>
>                 Key: HUDI-499
>                 URL: https://issues.apache.org/jira/browse/HUDI-499
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Raymond Xu
>            Assignee: Raymond Xu
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.5.2
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> h3. Context
> When a record is updated with a new partition path and the index type is 
> GLOBAL_BLOOM, the current logic implemented in 
> [https://github.com/apache/incubator-hudi/pull/1091/] ignores the new 
> partition path and updates the record in its original partition path.
> h3. Proposed change
> Allow such records to be inserted into their new partition paths and deleted 
> from their old partition paths. A configuration (e.g. 
> {{hoodie.index.bloom.update.partition.path=true}}) can be added to enable 
> this feature.
> h4. An example use case
> A Hudi dataset manages people's information and is partitioned by birthday. 
> In most cases, when a person's information is updated, the birthday does not 
> change (that is why we chose it as the partition field). In some edge cases, 
> however, a birthday was entered incorrectly, and we want to fix it manually 
> or allow users to update it occasionally. In this case, option 2 would help 
> keep records in the expected partition, so that a query like "show me people 
> who were born after 2000" would work.
>  
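
The difference between the current and the proposed behavior can be sketched in a few lines of plain Python. This is only an illustrative model, not Hudi code: the function `plan_writes`, the dict standing in for the global index, and the `update_partition_path` flag (mirroring the configuration suggested above) are all hypothetical names chosen for this example.

```python
from typing import Dict, List, Tuple

# (operation, record_key, partition_path); a hypothetical shape for this sketch only.
WriteOp = Tuple[str, str, str]


def plan_writes(
    index: Dict[str, str],
    incoming: List[Tuple[str, str]],
    update_partition_path: bool,
) -> List[WriteOp]:
    """Decide what to write for each incoming (record_key, partition_path).

    `index` models a global index: record_key -> partition path where the
    record currently lives. It is not modified by this function.
    """
    ops: List[WriteOp] = []
    for key, new_partition in incoming:
        old_partition = index.get(key)
        if old_partition is None:
            # Key not seen before: plain insert into the incoming partition.
            ops.append(("insert", key, new_partition))
        elif old_partition == new_partition or not update_partition_path:
            # Same partition, or feature disabled: update in the original
            # partition and ignore any new partition path (current behavior).
            ops.append(("update", key, old_partition))
        else:
            # Feature enabled and partition changed: delete the old copy,
            # then insert the record into its new partition (proposed behavior).
            ops.append(("delete", key, old_partition))
            ops.append(("insert", key, new_partition))
    return ops


# A person whose birthday partition was recorded wrongly and is corrected.
index = {"alice": "1999-05-01"}
corrected = [("alice", "2001-05-01")]

print(plan_writes(index, corrected, update_partition_path=False))
print(plan_writes(index, corrected, update_partition_path=True))
```

With the flag off, the record stays in the 1999 partition, so a partition-pruned query for people born after 2000 would miss it. With the flag on, the record ends up in the 2001 partition.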



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
