[
https://issues.apache.org/jira/browse/HBASE-27644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang reassigned HBASE-27644:
---------------------------------
Assignee: Duo Zhang
> Should not return false when WALKey has no following KVs while reading WAL file
> -------------------------------------------------------------------------------
>
> Key: HBASE-27644
> URL: https://issues.apache.org/jira/browse/HBASE-27644
> Project: HBase
> Issue Type: Bug
> Components: dataloss, wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
>
> In the current implementation:
> {code}
> if (!walKey.hasFollowingKvCount() || 0 == walKey.getFollowingKvCount()) {
>   LOG.trace("WALKey has no KVs that follow it; trying the next one. current offset={}",
>     this.inputStream.getPos());
>   seekOnFs(originalPosition);
>   return false;
> }
> {code}
> Here we just return false and seek back to the original position. I think the
> intention is to signal that the data is not available yet and that we should
> try to read it again next time.
> But this class is not only used for replication; it is also used for
> splitting. Returning false makes reader.next return null, so WALSplitter
> thinks the WAL file has been fully read and completes the splitting task. If
> there are still other WAL entries in the file, we will miss them, which
> causes data loss.
> In fact, the following KV count is a field in a pb message, so it is
> impossible for it to be 0 now and later become a value greater than 0. We use
> writeDelimited to write the message, so there is a size in front of the
> message; if we read it successfully, we can be sure the message is complete.
> So seeking back in replication is also a useless operation.
> So here we propose that we return true in this case, so that the upper layers
> are free to skip the entry or not, but they still go on to read the other
> entries.
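
The failure mode described above can be illustrated with a small, self-contained
sketch. Everything in it is hypothetical: EmptyWalKeySketch, Entry, ToyReader and
split are stand-ins invented for illustration and are not HBase classes. The
sketch only assumes what the description states, namely that the caller treats a
null from next() as end of file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are HBase classes.
final class EmptyWalKeySketch {

  /** Toy WAL entry: a sequence id plus the number of KVs that follow its key. */
  static final class Entry {
    final long seqId;
    final int followingKvCount;

    Entry(long seqId, int followingKvCount) {
      this.seqId = seqId;
      this.followingKvCount = followingKvCount;
    }
  }

  /** Toy reader whose next() returns null to signal "nothing more to read". */
  static final class ToyReader {
    private final List<Entry> entries;
    private final boolean stopOnEmptyKey;
    private int pos = 0;

    ToyReader(List<Entry> entries, boolean stopOnEmptyKey) {
      this.entries = entries;
      this.stopOnEmptyKey = stopOnEmptyKey;
    }

    Entry next() {
      if (pos >= entries.size()) {
        return null; // genuine end of file
      }
      Entry e = entries.get(pos);
      if (stopOnEmptyKey && e.followingKvCount == 0) {
        // Mirrors the reported behaviour: stay at the original position
        // and report that no entry was read.
        return null;
      }
      pos++;
      return e; // proposed behaviour: hand the entry up, even if it has no KVs
    }
  }

  /** Toy splitter loop: treats null as end of file and skips entries without KVs. */
  static List<Long> split(List<Entry> entries, boolean stopOnEmptyKey) {
    ToyReader reader = new ToyReader(entries, stopOnEmptyKey);
    List<Long> recovered = new ArrayList<>();
    for (Entry e = reader.next(); e != null; e = reader.next()) {
      if (e.followingKvCount > 0) {
        recovered.add(e.seqId);
      }
    }
    return recovered;
  }

  public static void main(String[] args) {
    List<Entry> wal = Arrays.asList(new Entry(1, 2), new Entry(2, 0), new Entry(3, 1));
    System.out.println(split(wal, true));  // [1]    entry 3 is never read
    System.out.println(split(wal, false)); // [1, 3] the empty entry is skipped by the caller
  }
}
```

With stopOnEmptyKey set to true the loop ends at the entry without KVs and the
entry after it is lost. With it set to false the caller decides to skip that
entry and still reads the rest of the file.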