[
https://issues.apache.org/jira/browse/HUDI-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lietong Liu updated HUDI-1583:
------------------------------
Description:
When 'spark.speculation' is enabled, there may be log files with zero size.
*HoodieLogFormatReader.hasNext()* returns false when it encounters a
zero-size log file, so the remaining log files are skipped.
{code:java}
@Override
public boolean hasNext() {
  if (currentReader == null) {
    return false;
  } else if (currentReader.hasNext()) {
    return true;
  } else if (logFiles.size() > 0) {
    try {
      HoodieLogFile nextLogFile = logFiles.remove(0);
      // First close previous reader only if readBlockLazily is true
      if (!readBlocksLazily) {
        this.currentReader.close();
      } else {
        this.prevReadersInOpenState.add(currentReader);
      }
      this.currentReader =
          new HoodieLogFileReader(fs, nextLogFile, readerSchema, bufferSize,
              readBlocksLazily, false);
    } catch (IOException io) {
      throw new HoodieIOException("unable to initialize read with log file ", io);
    }
    LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
    return this.currentReader.hasNext() || hasNext();
  }
  return false;
}
{code}
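The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and not Hudi code: `ChainedReaderSketch`, `readAll`, and the `skipEmptyFiles` flag are hypothetical names, each inner list stands in for one log file, and an empty list stands in for a zero-size log file. With `skipEmptyFiles` set to false, the reader stops at the first empty file and never reaches the files after it. With it set to true, the reader keeps advancing until it finds a non-empty file or runs out of files.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for a reader that chains several log files together.
// Each inner list models one log file; an empty list models a zero-size file.
class ChainedReaderSketch {
    private final LinkedList<List<String>> remainingFiles;
    private final boolean skipEmptyFiles;
    private Iterator<String> current;

    ChainedReaderSketch(List<List<String>> files, boolean skipEmptyFiles) {
        this.remainingFiles = new LinkedList<>(files);
        this.skipEmptyFiles = skipEmptyFiles;
        this.current = remainingFiles.isEmpty()
            ? null
            : remainingFiles.removeFirst().iterator();
    }

    boolean hasNext() {
        if (current == null) {
            return false;
        } else if (current.hasNext()) {
            return true;
        } else if (!remainingFiles.isEmpty()) {
            // Move on to the next file in the list.
            current = remainingFiles.removeFirst().iterator();
            // Asking only the new file whether it has data ends the whole
            // scan when that file is empty. Recursing keeps advancing past
            // empty files until data is found or the list is exhausted.
            return skipEmptyFiles ? hasNext() : current.hasNext();
        }
        return false;
    }

    String next() {
        return current.next();
    }

    // Drains the reader and returns every record it yields, in order.
    static List<String> readAll(List<List<String>> files, boolean skipEmptyFiles) {
        ChainedReaderSketch reader = new ChainedReaderSketch(files, skipEmptyFiles);
        List<String> records = new ArrayList<>();
        while (reader.hasNext()) {
            records.add(reader.next());
        }
        return records;
    }
}
```

For the input `[[a, b], [], [c]]`, `readAll(files, false)` yields `[a, b]` and silently drops `c`, while `readAll(files, true)` yields `[a, b, c]`.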
was:
When `spark.speculation` is enabled, there may be log files with zero size.
`HoodieLogFormatReader.hasNext()` returns false when it encounters a
zero-size log file, so the remaining log files are skipped.
```
@Override
public boolean hasNext() {
  if (currentReader == null) {
    return false;
  } else if (currentReader.hasNext()) {
    return true;
  } else if (logFiles.size() > 0) {
    try {
      HoodieLogFile nextLogFile = logFiles.remove(0);
      // First close previous reader only if readBlockLazily is true
      if (!readBlocksLazily) {
        this.currentReader.close();
      } else {
        this.prevReadersInOpenState.add(currentReader);
      }
      this.currentReader =
          new HoodieLogFileReader(fs, nextLogFile, readerSchema, bufferSize,
              readBlocksLazily, false);
    } catch (IOException io) {
      throw new HoodieIOException("unable to initialize read with log file ", io);
    }
    LOG.info("Moving to the next reader for logfile " + currentReader.getLogFile());
    return this.currentReader.hasNext() || hasNext();
  }
  return false;
}
```
> Hudi skips the remaining log files if logFileList contains a zero-size
> log file during merge on read.
> ---------------------------------------------------------------------------------------------------------
>
> Key: HUDI-1583
> URL: https://issues.apache.org/jira/browse/HUDI-1583
> Project: Apache Hudi
> Issue Type: Bug
> Components: Common Core
> Affects Versions: 0.6.0
> Reporter: Lietong Liu
> Priority: Major
> Fix For: 0.6.0
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)