Xiaoqiao He created HUDI-3599:
---------------------------------
Summary: Not atomicity commit could cause streaming read loss data
Key: HUDI-3599
URL: https://issues.apache.org/jira/browse/HUDI-3599
Project: Apache Hudi
Issue Type: Bug
Components: core
Reporter: Xiaoqiao He
The call hierarchy of the current `commit` implementation is shown below.
`transitionState` writes the deltacommit file to complete the commit, but
writing a file is not an atomic operation on HDFS, for instance.
{code:java}
HoodieActiveTimeline.transitionState(HoodieInstant, HoodieInstant, Option<byte[]>, boolean) (org.apache.hudi.common.table.timeline)
HoodieActiveTimeline.transitionState(HoodieInstant, HoodieInstant, Option<byte[]>) (org.apache.hudi.common.table.timeline)
HoodieActiveTimeline.saveAsComplete(HoodieInstant, Option<byte[]>) (org.apache.hudi.common.table.timeline)
BaseHoodieWriteClient.commit(HoodieTable, String, String, HoodieCommitMetadata, List<HoodieWriteStat>) (org.apache.hudi.client)
BaseHoodieWriteClient.commitStats(String, List<HoodieWriteStat>, Option<Map<String, String>>, String, Map<String, List<String>>) (org.apache.hudi.client)
HoodieFlinkWriteClient.commit(String, List<WriteStatus>, Option<Map<String, String>>, String, Map<String, List<String>>) (org.apache.hudi.client)
HoodieJavaWriteClient.commit(String, List<WriteStatus>, Option<Map<String, String>>, String, Map<String, List<String>>) (org.apache.hudi.client)
{code}
As
org.apache.hudi.common.table.timeline.HoodieActiveTimeline#createImmutableFileInPath
shows below, completing the write takes three steps: A. create the file, B.
write the data, C. close the file handle. If `StreamReadMonitoring` traverses
this deltacommit file between steps A and B, while its content is still empty,
it reads nothing in that loop iteration. IMO this could lose some commit data
for streaming reads.
{code:java}
private void createImmutableFileInPath(Path fullPath, Option<byte[]> content) {
  FSDataOutputStream fsout = null;
  try {
    // Step A: create the file; it is visible (and empty) from this point on.
    fsout = metaClient.getFs().create(fullPath, false);
    if (content.isPresent()) {
      // Step B: write the data.
      fsout.write(content.get());
    }
  } catch (IOException e) {
    throw new HoodieIOException("Failed to create file " + fullPath, e);
  } finally {
    try {
      if (null != fsout) {
        // Step C: close the file handle.
        fsout.close();
      }
    } catch (IOException e) {
      throw new HoodieIOException("Failed to close file " + fullPath, e);
    }
  }
}
{code}
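The window between steps A and B can be reproduced on a local filesystem. The following is only a minimal sketch using `java.nio.file` as a stand-in for the Hadoop `FileSystem`; the class `EmptyInstantWindowDemo`, the method `observeSizes`, and the file name `001.deltacommit` are hypothetical and not part of Hudi.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not Hudi code: replays steps A, B, C on a local file.
public class EmptyInstantWindowDemo {

  /**
   * Returns two sizes: the bytes a reader would see between steps A and B,
   * and the bytes it would see after step C.
   */
  public static long[] observeSizes(Path instantFile, byte[] content) throws IOException {
    long sizeBetweenAandB;
    // Step A: create the file. CREATE_NEW fails if the file already exists,
    // similar in spirit to create(fullPath, false).
    try (OutputStream out = Files.newOutputStream(instantFile, StandardOpenOption.CREATE_NEW)) {
      // A reader that lists the directory now finds the file, but it is still empty.
      sizeBetweenAandB = Files.readAllBytes(instantFile).length;
      // Step B: write the data.
      out.write(content);
    } // Step C: try-with-resources closes the handle here.
    long sizeAfterC = Files.readAllBytes(instantFile).length;
    return new long[] {sizeBetweenAandB, sizeAfterC};
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("timeline-demo");
    Path instant = dir.resolve("001.deltacommit");
    byte[] metadata = "{\"commit\":\"metadata\"}".getBytes(StandardCharsets.UTF_8);
    long[] sizes = observeSizes(instant, metadata);
    System.out.println("bytes visible between A and B: " + sizes[0]);
    System.out.println("bytes visible after C: " + sizes[1]);
  }
}
```

The read between A and B is done in the same thread here only to make the timing deterministic; in the scenario described above it would be a separate reader process polling the timeline.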
To avoid this corner case, I think we should rely on a `rename` operation to
complete the commit rather than the create-write-close flow. Please correct me
if I missed something.
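A write-then-rename flow could look roughly like the sketch below. It uses `java.nio.file` on a local filesystem rather than the Hadoop `FileSystem` API, and it assumes readers ignore files with a `.tmp` suffix. The class `RenameCommitSketch`, the method `writeThenRename`, and the `.tmp` suffix are hypothetical names for illustration, not an actual Hudi change.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch, not Hudi code: publish a commit file via rename.
public class RenameCommitSketch {

  /**
   * Writes the content to a temporary sibling file, then renames it to the
   * final path, so the final name never points at a partially written file.
   */
  public static void writeThenRename(Path finalPath, byte[] content) throws IOException {
    // Assumption: readers skip names ending in ".tmp".
    Path tmp = finalPath.resolveSibling(finalPath.getFileName() + ".tmp");
    // Create, write, and close happen under the temporary name.
    Files.write(tmp, content);
    try {
      // Publish in one step. ATOMIC_MOVE throws if the filesystem cannot
      // perform the move atomically. Note: unlike create(fullPath, false),
      // this sketch does not guarantee a failure when finalPath already exists.
      Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      // Do not leave the temporary file behind if publishing failed.
      Files.deleteIfExists(tmp);
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("timeline-demo");
    Path instant = dir.resolve("001.deltacommit");
    writeThenRename(instant, new byte[] {1, 2, 3});
    System.out.println("published bytes: " + Files.readAllBytes(instant).length);
  }
}
```

Whether a rename is atomic depends on the storage layer, so the equivalent change in Hudi would need to be checked against each supported filesystem, object stores in particular.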
--
This message was sent by Atlassian Jira
(v8.20.1#820001)