sivabalan narayanan created HUDI-3840:
-----------------------------------------
Summary: Warn logs about not able to read replace commit metadata
Key: HUDI-3840
URL: https://issues.apache.org/jira/browse/HUDI-3840
Project: Apache Hudi
Issue Type: Task
Components: spark
Reporter: sivabalan narayanan
I was trying out the Spark streaming sink with Hudi and saw the WARN logs below.
{code:java}
22/04/09 15:54:16 WARN AbstractTableFileSystemView: Could not read commit details from /tmp/hudi_streaming_kafka/COPY_ON_WRITE/.hoodie/20220409154917240.replacecommit
22/04/09 15:54:16 WARN AbstractTableFileSystemView: Could not read commit details from /tmp/hudi_streaming_kafka/COPY_ON_WRITE/.hoodie/20220409155011647.replacecommit
{code}
However, I ran some validations and confirmed that the data was intact. Further
investigation revealed that this happens just after archival, and that the
replace commits shown above were among the instants that got archived. So a
reload of the active timeline may be missing somewhere. Since it is only a WARN
log and does not cause any correctness issue, I am filing a low-priority ticket.
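One way to cross-check the observation above is to pull the instant times out of the WARN lines and see whether the corresponding files are still present under .hoodie. The following is only a rough sketch: the helper name and regex are illustrative, it assumes the table lives on a local filesystem (as with the /tmp path above), and a missing file is merely consistent with archival rather than proof of it.

```python
import os
import re

# Matches the WARN message even when the mail client wrapped it across lines.
# Group 1 is the full instant file path, group 2 is the instant time.
INSTANT_RE = re.compile(
    r"Could not read commit\s+details from\s+(\S+/\.hoodie/(\d+)\.replacecommit)"
)

def missing_replace_commits(log_text):
    """Return instant times from WARN lines whose files are no longer on disk.

    Hypothetical helper: only checks the local filesystem.
    """
    missing = []
    for path, instant in INSTANT_RE.findall(log_text):
        if not os.path.exists(path):
            missing.append(instant)
    return missing
```

Feeding it the driver log text should list the instants that have already left the active timeline directory; those can then be compared against the instants reported as archived.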
Steps to reproduce:
Run a Spark streaming write to a Hudi COW table with async clustering enabled.
Make archival aggressive, and these logs should appear at some point.
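As a rough illustration of the steps above, the writer options might look like the sketch below. The ticket does not list the actual settings, so the table name and all numeric values are assumptions chosen only to make clustering and archival fire often. Record key, precombine, and partition path fields are left out and would need to be added for the real schema.

```python
# Illustrative options only; the values are assumptions, not taken from the ticket.
HUDI_OPTIONS = {
    "hoodie.table.name": "hudi_repro_tbl",  # hypothetical table name
    "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
    # Async clustering, scheduled after every couple of commits.
    "hoodie.clustering.async.enabled": "true",
    "hoodie.clustering.async.max.commits": "2",
    # Aggressive cleaning and archival: keep very few instants active.
    # keep.min.commits must stay above cleaner.commits.retained.
    "hoodie.cleaner.commits.retained": "1",
    "hoodie.keep.min.commits": "2",
    "hoodie.keep.max.commits": "3",
}

def start_hudi_sink(stream_df, base_path, checkpoint_path):
    """Start a streaming write of stream_df to a Hudi table at base_path.

    stream_df is assumed to be a streaming DataFrame (for example, read from Kafka).
    """
    return (
        stream_df.writeStream.format("hudi")
        .options(**HUDI_OPTIONS)
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start(base_path)
    )
```

With settings along these lines, archival should kick in after only a handful of micro-batches, which is when the WARN lines would be expected to show up.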