[
https://issues.apache.org/jira/browse/HUDI-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
vinoyang closed HUDI-1598.
--------------------------
Assignee: Danny Chen
Resolution: Done
Done via master branch: 5d2491d10c70e4e5fc9b7aeb62cc64bcaaf6043f
> Write as minor batches during one checkpoint interval for the new writer
> ------------------------------------------------------------------------
>
> Key: HUDI-1598
> URL: https://issues.apache.org/jira/browse/HUDI-1598
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Flink Integration
> Reporter: Danny Chen
> Assignee: Danny Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.8.0
>
>
> Buffering all data during one checkpoint interval and then flushing the
> whole buffer out at once is not resource-friendly for streaming writes. A
> better approach is to cut the batches based on the actual in-memory size of
> the data buffer (say, 128 MB): the writer flushes the buffer out whenever
> its size reaches the configured threshold.
> Thus, after this change, one instant may span one (if every checkpoint
> succeeds) or more (if there are checkpoint failures) checkpoints. The instant
> only commits when there is a successful checkpoint.
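
The behavior described above can be sketched roughly as follows. This is a minimal illustration only, not the code from the referenced commit: the class `MiniBatchBuffer`, its methods, and the use of UTF-8 string length as the "memory size" are all hypothetical stand-ins, and the real writer would write files to storage instead of just counting flushes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based mini-batch flushing; not Hudi's actual writer. */
class MiniBatchBuffer {
    private final long flushThresholdBytes;          // e.g. 128L * 1024 * 1024
    private final List<String> buffer = new ArrayList<>();
    private long bufferedBytes = 0;
    private int flushCount = 0;          // total mini-batches flushed so far
    private int pendingBatches = 0;      // flushed under the open instant, not yet committed
    private int committedInstants = 0;

    MiniBatchBuffer(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    /** Buffers one record and flushes as soon as the threshold is reached. */
    void add(String record) {
        buffer.add(record);
        bufferedBytes += record.getBytes(StandardCharsets.UTF_8).length;
        if (bufferedBytes >= flushThresholdBytes) {
            flush();
        }
    }

    /** Writes out the current mini-batch (simulated here by clearing the buffer). */
    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        buffer.clear();
        bufferedBytes = 0;
        flushCount++;
        pendingBatches++;
    }

    /**
     * Called at each checkpoint. The remaining buffer is flushed, and the
     * instant commits only if the checkpoint succeeded; otherwise the instant
     * stays open and spans into the next checkpoint.
     */
    void onCheckpoint(boolean success) {
        flush();
        if (success) {
            committedInstants++;
            pendingBatches = 0;
        }
    }

    int flushCount() { return flushCount; }
    int pendingBatches() { return pendingBatches; }
    int committedInstants() { return committedInstants; }
}
```

With a threshold of 10 bytes, two 5-byte records trigger one flush before any checkpoint. A failed checkpoint then leaves the flushed batches pending under the same instant, and they are committed together at the next successful checkpoint.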