Alexander Filipchik created HUDI-784:
----------------------------------------

Summary: CorruptedLogFileException sometimes happens on GCS
Key: HUDI-784
URL: https://issues.apache.org/jira/browse/HUDI-784
Project: Apache Hudi (incubating)
Issue Type: Bug
Reporter: Alexander Filipchik

768726 [Executor task launch worker-2] ERROR org.apache.hudi.common.table.log.AbstractHoodieLogRecordScanner - Got exception when reading log file
org.apache.hudi.exception.CorruptedLogFileException: HoodieLogFile{pathStr=' [gs://.log.|gs://1_20200219014757.log.2] ', fileLen=0}could not be read. Did not find the magic bytes at the start of the block
    at org.apache.hudi.common.table.log.HoodieLogFileReader.readMagic(HoodieLogFileReader.java:313)
    at org.apache.hudi.common.table.log.HoodieLogFileReader.hasNext(HoodieLogFileReader.java:295)
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:103)

I did extensive debugging and it is still unclear why this is happening. It might be an issue with the GCS libraries themselves.

The fix that is working:

1. In HoodieLogFileReader, made magicBuffer non-static, so the declaration is now:
{code:java}
private final byte[] magicBuffer = new byte[6];
{code}
I'm not sure why it was static in the first place, as a buffer shared by all reader instances invites a race.

2. Also in HoodieLogFileReader, added
{code:java}
fsDataInputStream.seek(0);
{code}
right after the stream is created in the constructor.
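
To illustrate why a static scratch buffer invites a race, here is a minimal, self-contained sketch. It is not Hudi code: the class names, the `fill`/`matches` methods, and the `"#HUDI#"` marker value are assumptions made for the example, and the "interleaving" is simulated by calling two readers in a fixed order rather than by real threads. Whether this is the actual cause of the exception above is not established.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MagicBufferRace {
    // Assumed 6-byte marker, used only for this illustration.
    static final byte[] MAGIC = "#HUDI#".getBytes(StandardCharsets.US_ASCII);

    /** Hypothetical reader whose scratch buffer is shared by all instances. */
    static class SharedBufferReader {
        private static final byte[] magicBuffer = new byte[6];
        private final byte[] data;

        SharedBufferReader(byte[] data) { this.data = data; }

        // Step 1: copy the first 6 bytes of this reader's data into the buffer.
        void fill() { System.arraycopy(data, 0, magicBuffer, 0, 6); }

        // Step 2: compare the buffer with the expected marker.
        boolean matches() { return Arrays.equals(MAGIC, magicBuffer); }
    }

    /** Same logic, but every instance owns its buffer (the proposed fix). */
    static class OwnBufferReader {
        private final byte[] magicBuffer = new byte[6];
        private final byte[] data;

        OwnBufferReader(byte[] data) { this.data = data; }

        void fill() { System.arraycopy(data, 0, magicBuffer, 0, 6); }

        boolean matches() { return Arrays.equals(MAGIC, magicBuffer); }
    }

    static final byte[] OTHER = "xxxxxx".getBytes(StandardCharsets.US_ASCII);

    // Reader b fills the buffer between a's fill and a's check.
    static boolean sharedBufferSurvivesInterleaving() {
        SharedBufferReader a = new SharedBufferReader(MAGIC);
        SharedBufferReader b = new SharedBufferReader(OTHER);
        a.fill();
        b.fill();           // overwrites the bytes a just read
        return a.matches(); // a now sees b's bytes
    }

    static boolean ownBufferSurvivesInterleaving() {
        OwnBufferReader a = new OwnBufferReader(MAGIC);
        OwnBufferReader b = new OwnBufferReader(OTHER);
        a.fill();
        b.fill();           // touches only b's own buffer
        return a.matches();
    }

    public static void main(String[] args) {
        System.out.println("shared buffer ok: " + sharedBufferSurvivesInterleaving());
        System.out.println("own buffer ok:    " + ownBufferSurvivesInterleaving());
    }
}
```

With real executor threads the same overwrite can happen whenever two readers run the fill and compare steps concurrently, which would make a valid file look as if its magic bytes were missing.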