[
https://issues.apache.org/jira/browse/HBASE-27621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687330#comment-17687330
]
Xiaolin Ha commented on HBASE-27621:
------------------------------------
Our cluster works well with this patch applied and value compression disabled,
so the replication stuck issue now seems related to WAL value compression.
That would explain why the stall happens after many delete row operations. But
since checking the WAL file with WALPrettyPrinter shows the stop position is in
the middle of the file, we should watch out for a data loss issue with WAL
value compression. Great job, thanks.
> Also clear the Dictionary when resetting when reading compressed WAL file
> -------------------------------------------------------------------------
>
> Key: HBASE-27621
> URL: https://issues.apache.org/jira/browse/HBASE-27621
> Project: HBase
> Issue Type: Bug
> Components: Replication, wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.4.17, 2.5.4
>
>
> After trying several times, I can now reproduce a critical problem when
> reading a compressed WAL file in replication.
> The problem is in how we construct the LRUDictionary when resetting the
> WALEntryStream. In the current design, we do not reconstruct the
> LRUDictionary when resetting, but when reading again we call addEntry
> directly to add 'new' words into the dict, which messes up the dict and
> causes data corruption.
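The index mismatch can be illustrated with a toy stand-in for the dictionary (the names below are hypothetical, not HBase's actual LRUDictionary API): replaying the stream after a reset without clearing re-adds the same words as duplicates, so later writer-assigned indices decode to the wrong words.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the WAL compression dictionary (hypothetical names):
// indices are assigned in insertion order, so writer and reader must add
// words in exactly the same sequence to stay in sync.
class ToyDict {
    private final List<String> words = new ArrayList<>();

    int addEntry(String w) { words.add(w); return words.size() - 1; }
    String getEntry(int idx) { return words.get(idx); }
    void clear() { words.clear(); }
}

public class DictResetDemo {
    public static void main(String[] args) {
        // Writer side: encodes each new word by its dictionary index.
        ToyDict writer = new ToyDict();
        writer.addEntry("row1");                 // index 0
        writer.addEntry("row2");                 // index 1
        int row3Index = writer.addEntry("row3"); // index 2

        // Reader side: reads the first two entries, then hits a partial
        // entry and resets the stream WITHOUT clearing the dictionary.
        ToyDict reader = new ToyDict();
        reader.addEntry("row1");
        reader.addEntry("row2");
        // Replaying from the start re-adds the same words as duplicates...
        reader.addEntry("row1"); // index 2 -- collides with writer's "row3"
        reader.addEntry("row2"); // index 3
        reader.addEntry("row3"); // index 4

        // ...so the writer's index 2 now decodes to the wrong word.
        System.out.println(reader.getEntry(row3Index)); // prints row1

        // Calling reader.clear() before replaying keeps both sides in sync.
    }
}
```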
> I've implemented a UT to simulate reading a partial WAL entry in replication;
> with the current code base, after resetting and reading again, we get stuck
> there forever.
> -The fix is to always use findEntry when constructing the dict when reading,
> so we will not mess things up.-
> It turns out that the above solution does not work.
> Another possible fix is to always clear and reconstruct the dict after
> resetting. But it is less efficient, as we need to read from the beginning
> of the file up to the position we want to seek to, instead of seeking to
> the position directly, especially when tailing a WAL file which is
> currently being written.
> And notice that the UT can only reproduce the problem on the local file
> system. On HDFS, the available method is implemented, so if there is not
> enough data we throw EOFException earlier, before parsing cells with the
> compression decoder, and thus do not add duplicated words to the dict. But
> in the real world it is possible that, even if there is enough data to
> read, we hit an IOException while reading and end up with the same problem
> described above.
> And while fixing this, I found another problem: in TagCompressionContext
> and CompressionContext we use the result of InputStream.read incorrectly,
> as we just cast it to byte and test whether it is -1 to determine whether
> the field is in the dict. The return value of InputStream.read is an int,
> and it returns -1 on EOF, but here we would treat EOF as 'not in dict'...
> We should throw EOFException instead.
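A minimal sketch of that int-vs-byte pitfall (method and constant names here are illustrative, not the actual HBase code): casting the int returned by InputStream.read to a byte makes EOF (-1 as an int) indistinguishable from a status byte of 0xFF (-1 as a byte), so EOF is silently misread instead of surfaced.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadStatusDemo {
    // Hypothetical marker byte mirroring a "not in dictionary" flag.
    static final byte NOT_IN_DICTIONARY = -1;

    // Buggy pattern: casting the int result to byte conflates EOF (-1 as an
    // int) with the marker byte (0xFF, which is -1 as a byte).
    static boolean buggyIsNotInDict(InputStream in) throws IOException {
        byte status = (byte) in.read();
        return status == NOT_IN_DICTIONARY; // also true on EOF!
    }

    // Fixed pattern: test the int result first and surface EOF explicitly.
    static boolean fixedIsNotInDict(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            throw new EOFException("stream ended while reading status byte");
        }
        return (byte) b == NOT_IN_DICTIONARY;
    }

    public static void main(String[] args) throws IOException {
        // Empty stream: the buggy check silently reports "not in dictionary".
        InputStream empty = new ByteArrayInputStream(new byte[0]);
        System.out.println(buggyIsNotInDict(empty)); // prints true

        // The fixed check turns the truncated read into an EOFException.
        try {
            fixedIsNotInDict(new ByteArrayInputStream(new byte[0]));
        } catch (EOFException e) {
            System.out.println("EOF detected"); // prints EOF detected
        }
    }
}
```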
> I'm not sure whether fixing this will also fix HBASE-27073, but let's have
> a try first.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)