[
https://issues.apache.org/jira/browse/HBASE-26850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang resolved HBASE-26850.
-------------------------------
Assignee: (was: tianhang tang)
Resolution: Won't Fix
All 1.x release lines are EOL.
Close. Feel free to reopen if it also affects other active branches.
> Optimize the implementation of LRUCache in LRUDictionary
> --------------------------------------------------------
>
> Key: HBASE-26850
> URL: https://issues.apache.org/jira/browse/HBASE-26850
> Project: HBase
> Issue Type: Improvement
> Components: wal
> Affects Versions: 1.7.1, 3.0.0-alpha-2, 2.4.11
> Reporter: tianhang tang
> Priority: Major
> Attachments: image-2022-03-16-17-13-00-871.png
>
>
> *This issue is to unify the behavior of reading and writing links.*
>
> During my research on HBASE-26849, I found that there seems to be something
> wrong with the implementation of LRUDictionary.
> It uses array to save data, and uses hashMap to save array index.
> For the write link:
> {code:java}
> CompressedKvEncoder#write
> ->
> LRUDictionary#findEntry
> ->
> LRUDictionary#addEntry
> {code}
> And for the read link:
> {code:java}
> CompressedKVDecoder#readIntoArray
> ->
> LRUDirectory#addEntry
> {code}
> Then we could see the logic in {_}findEntry{_}:
> {code:java}
> @Override
> public short findEntry(byte[] data, int offset, int length) {
> short ret = backingStore.findIdx(data, offset, length);
> if (ret == NOT_IN_DICTIONARY) {
> addEntry(data, offset, length);
> }
> return ret;
> }
> private short findIdx(byte[] array, int offset, int length) {
> Short s;
> final Node comparisonNode = new Node();
> comparisonNode.setContents(array, offset, length);
> if ((s = nodeToIndex.get(comparisonNode)) != null) {
> moveToHead(indexToNode[s]);
> return s;
> } else {
> return -1;
> }
> }
> {code}
> At first it will try to find if there is same node in the cache, If there is,
> use it directly.
> The problem is, *if we have some same nodes, The behavior on the read and
> write links are inconsistent:*
> On the write link we will reuse just one node, but on the read link multiple
> copies will be stored in the array.
> but there is only one mapping from the node to the array index in the map.
> So, on the read link, if the first 'same' node is evict, we will remove it
> from the map, then we might met a NPE because when the second 'same' node
> evict we can not find it in the map.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)