[ 
https://issues.apache.org/jira/browse/HUDI-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang closed HUDI-1598.
--------------------------
      Assignee: Danny Chen
    Resolution: Done

Done via master branch: 5d2491d10c70e4e5fc9b7aeb62cc64bcaaf6043f

> Write as minor batches during one checkpoint interval for the new writer
> ------------------------------------------------------------------------
>
>                 Key: HUDI-1598
>                 URL: https://issues.apache.org/jira/browse/HUDI-1598
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Flink Integration
>            Reporter: Danny Chen
>            Assignee: Danny Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>
> Buffering all data during one checkpoint interval and then flushing the 
> whole buffer at once is not resource-friendly for streaming writes. A better 
> approach is to cut batches based on the actual in-memory size of the data 
> buffer (say, 128 MB): the writer flushes the buffer whenever its size 
> reaches the configured threshold.
> Thus, after this change, one instant may span one (if every checkpoint 
> succeeds) or more (if there are checkpoint failures) checkpoints. The instant 
> only commits when there is a successful checkpoint.
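
The behavior described above can be illustrated with a minimal, self-contained Java sketch. It is not the code from the referenced commit: the class MiniBatchBuffer, its methods, and the string-length size estimate are all hypothetical names and simplifications chosen for illustration. It only shows the idea of flushing when the buffer reaches a threshold and committing the instant on a successful checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based mini-batch flushing; not Hudi's actual writer. */
class MiniBatchBuffer {
    private final long flushThresholdBytes;   // assumed config value, e.g. 128 MB in practice
    private final List<String> buffer = new ArrayList<>();
    // Batches already written out but not yet committed as part of an instant.
    private final List<List<String>> pendingBatches = new ArrayList<>();
    private long bufferedBytes = 0;
    private int committedInstants = 0;

    MiniBatchBuffer(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    /** Buffers a record and flushes as soon as the size threshold is reached. */
    void write(String record) {
        buffer.add(record);
        bufferedBytes += record.length();     // crude size estimate, for illustration only
        if (bufferedBytes >= flushThresholdBytes) {
            flush();
        }
    }

    /** Writes the current buffer out as one mini batch (no-op if the buffer is empty). */
    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        pendingBatches.add(new ArrayList<>(buffer));
        buffer.clear();
        bufferedBytes = 0;
    }

    /** Checkpoint barrier: flush whatever remains in the buffer. */
    void onCheckpoint() {
        flush();
    }

    /** Called only when a checkpoint succeeds: commit all pending batches as one instant. */
    void onCheckpointComplete() {
        if (!pendingBatches.isEmpty()) {
            committedInstants++;
            pendingBatches.clear();
        }
    }

    int pendingBatchCount() { return pendingBatches.size(); }

    int committedInstantCount() { return committedInstants; }
}
```

If a checkpoint fails, onCheckpointComplete() is never invoked for it, so the pending batches carry over and the same instant spans the next checkpoint as well.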



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
