[https://issues.apache.org/jira/browse/HBASE-23205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958714#comment-16958714]
Jeongdae Kim edited comment on HBASE-23205 at 10/24/19 9:33 AM:
----------------------------------------------------------------
In this PR ([https://github.com/apache/hbase/pull/749]), I tried to update
the log position only when the WAL rolls, and removed the log-position
updates on the reader side to resolve the concurrency issue in case 2).
I added tests to cover case 1) and case 3):
* TestReplicationSource.testSetLogPositionForWALCurrentlyReadingWhenLogsRolled
for case 1)
* TestWALEntryStream.testReplicationSourceWALReaderThreadWithFilter for case 3)
I think all the flaws I mentioned can be fixed by this patch.
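To make the intent concrete, here is a minimal standalone sketch of the
roll-triggered update (all names here, e.g. WalRollAwareReader and
persistLogPosition, are illustrative stand-ins, not the classes the PR
actually touches):
{code:java}
// Sketch: defer log-position persistence until the WAL path changes,
// i.e. until a roll is observed, instead of persisting per batch.
class WalRollAwareReader {
  private String currentWal;     // WAL file the reader is currently on
  private long lastReadPosition; // byte offset within currentWal

  /** Called whenever the reader advances to (wal, position). */
  void onAdvance(String wal, long position) {
    if (currentWal != null && !currentWal.equals(wal)) {
      // The WAL rolled: only now persist the final position of the old
      // WAL, so reader and shipper never race over intermediate
      // positions and the old WAL becomes eligible for cleanup.
      persistLogPosition(currentWal, lastReadPosition);
    }
    currentWal = wal;
    lastReadPosition = position;
  }

  private void persistLogPosition(String wal, long position) {
    // Stand-in for the ZooKeeper write HBase performs.
    System.out.printf("persist %s -> %d%n", wal, position);
  }
}
{code}
Because a position is persisted exactly once per WAL, there are no
intermediate updates left for the reader and shipper to race over.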
> Correctly update the position of WALs currently being replicated.
> -----------------------------------------------------------------
>
> Key: HBASE-23205
> URL: https://issues.apache.org/jira/browse/HBASE-23205
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.5.0, 1.4.10, 1.4.11
> Reporter: Jeongdae Kim
> Assignee: Jeongdae Kim
> Priority: Major
>
> While testing with 1.4.10, we observed that many old WALs were not
> removed from the archive or from their corresponding replication
> queues.
> The stacked old WALs are either empty or contain no entries to be
> replicated (their tables are not in the replication table_cfs).
>
> As described in HBASE-22784, if no entries to be replicated are
> appended to a WAL, its log position will never be updated. As a
> consequence, none of those WALs are removed. This issue has existed
> since HBASE-15995.
>
> I think old WALs would no longer stack up with HBASE-22784, but there
> are still a few things to be fixed, as described below.
> case 1) The log position can be updated wrongly when the log rolls,
> because the lastWalPath of a batch might not point to the WAL currently
> being read.
> * For example, suppose the last entry added to a batch was read at
> position P1 in WAL W1, the WAL then rolled, and the reader read to the
> end of the old WAL and continued reading entries from the new WAL W2
> until the batch size was reached; the current read position in W2 is
> P2. In this case, the batch passed to the shipper carries walPath W1
> and position P2, so the shipper tries to update W1's position to P2.
> This may result in data inconsistency in the recovery case, or in an
> update failure to ZooKeeper (the znode may no longer exist because of a
> previous log-position update; I guess this is the same case as
> HBASE-23169?). See the sketch below.
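> A minimal repro sketch of the mismatch (Batch, Case1Demo and the
> W1/W2/P2 literals are illustrative stand-ins, not HBase internals):
> {code:java}
> // The batch records the WAL it started on (W1) but the position it
> // finished at (P2), which belongs to a different WAL (W2) after a roll.
> class Batch {
>   String lastWalPath;   // set when the first entry is read: W1
>   long lastWalPosition; // set when the batch fills up: P2, inside W2!
> }
>
> public class Case1Demo {
>   public static void main(String[] args) {
>     Batch batch = new Batch();
>     batch.lastWalPath = "W1";     // first entry read at P1 in W1
>     // ... WAL rolls; reader drains W1 and continues into W2 ...
>     batch.lastWalPosition = 1024; // P2: an offset within W2, not W1
>     // The shipper now persists (W1, P2): a position that never
>     // existed in W1, hence the recovery inconsistency described above.
>     System.out.println("would persist " + batch.lastWalPath
>         + " @ " + batch.lastWalPosition);
>   }
> }
> {code}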
>
> case 2) The log position may not be updated at all, or may be updated
> to the wrong position, because of the pendingShipment flag introduced
> in HBASE-22784.
> * In the shipper thread, the log-position update is not guaranteed to
> happen, because of how pendingShipment is set to false. If the reader
> sets the flag to true right after the shipper sets it to false inside
> updateLogPosition(), the shipper won't update the log position.
> Conversely, while the reader is reading filtered entries, if the
> shipper sets the flag to false, the reader will update the log position
> to the current read position, which can lose data in the recovery case.
> See the race sketch below.
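> A small timing sketch of the first race (all names are stand-ins for
> the real reader/shipper code; the exact flag protocol in HBASE-22784
> may differ in detail):
> {code:java}
> import java.util.concurrent.atomic.AtomicBoolean;
>
> public class Case2Demo {
>   static final AtomicBoolean pendingShipment = new AtomicBoolean(false);
>
>   // Shipper side: clears the flag, then persists, as in case 2).
>   static void shipperUpdateLogPosition(long pos) {
>     pendingShipment.set(false);
>     // <-- the reader may run pendingShipment.set(true) right here
>     if (!pendingShipment.get()) {
>       System.out.println("persisted position " + pos);
>     } else {
>       System.out.println("skipped update for " + pos); // lost update
>     }
>   }
>
>   public static void main(String[] args) throws InterruptedException {
>     Thread reader = new Thread(() -> pendingShipment.set(true));
>     reader.start();                // reader flips the flag concurrently
>     shipperUpdateLogPosition(42);  // outcome depends purely on timing
>     reader.join();
>   }
> }
> {code}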
>
> case 3) Many log-position updates can happen when most WAL entries are
> filtered out by TableCfWALEntryFilter.
> * I think it would be better to reduce the number of position updates
> in that case, because
> ## ZooKeeper writes are more expensive operations than reads (writes
> involve synchronizing the state of all servers), and
> ## even if the read position is not updated, it is harmless, because
> all the entries will be filtered out again during recovery (see the
> sketch below).
> * It would be enough to update the log position only when the WAL rolls
> in that case (to clean up old WALs).
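> To illustrate why skipping the updates is harmless: on recovery the
> same filter decision is re-applied, so re-reading a span whose entries
> were all filtered out ships nothing twice. (inTableCfs below is a
> hypothetical stand-in for TableCfWALEntryFilter, not the real API.)
> {code:java}
> import java.util.List;
> import java.util.function.Predicate;
>
> public class Case3Demo {
>   public static void main(String[] args) {
>     // A span of WAL entries that all belong to a non-replicated table.
>     List<String> walEntries = List.of("t1.row1", "t1.row2", "t1.row3");
>     Predicate<String> inTableCfs = e -> e.startsWith("t2.");
>
>     // First pass: everything is filtered; position is NOT persisted.
>     long shipped = walEntries.stream().filter(inTableCfs).count();
>     System.out.println("shipped first pass: " + shipped);    // 0
>
>     // Recovery re-reads from the old position: the same entries are
>     // filtered out again, so nothing is replicated twice.
>     long reshipped = walEntries.stream().filter(inTableCfs).count();
>     System.out.println("shipped in recovery: " + reshipped); // 0
>   }
> }
> {code}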
>
> In addition, during this work I found a minor bug: the replication
> buffer size is updated wrongly, by decreasing the total buffer size by
> the size of bulk-loaded files. A sketch of the accounting invariant
> follows below.
> I'd like to fix it here as well, if that's OK.
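> The invariant as I read it, sketched with illustrative names
> (totalBufferUsed and both methods are my stand-ins, not the real
> counter or call sites): only bytes that were previously added to the
> buffer accounting may be subtracted, and bulk-load file sizes were
> never added.
> {code:java}
> import java.util.concurrent.atomic.AtomicLong;
>
> public class BufferAccounting {
>   static final AtomicLong totalBufferUsed = new AtomicLong();
>
>   static void onBatchRead(long entryBytes) {
>     totalBufferUsed.addAndGet(entryBytes);  // charged when read
>   }
>
>   static void onBatchShipped(long entryBytes, long bulkLoadFileBytes) {
>     // Correct: release exactly what was charged. The buggy variant
>     // also subtracts bulkLoadFileBytes, which was never added, and
>     // drives the counter below its true value.
>     totalBufferUsed.addAndGet(-entryBytes);
>   }
>
>   public static void main(String[] args) {
>     onBatchRead(100);
>     onBatchShipped(100, 4096);
>     System.out.println("buffer used = " + totalBufferUsed.get()); // 0
>   }
> }
> {code}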