Re: [PR] HBASE-29890 WAL tailing reader should resume partial cell reads instead of resetting compression [hbase]

via GitHub Mon, 09 Mar 2026 08:48:34 -0700


hgromer commented on code in PR #7741:
URL: https://github.com/apache/hbase/pull/7741#discussion_r2906285521



##########
hbase-common/src/main/java/org/apache/hadoop/hbase/io/util/UndoableLRUDictionary.java:
##########
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.io.util;
+
+import java.util.IdentityHashMap;
+import java.util.Map;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/**
+ * An LRUDictionary that supports checkpoint and rollback. Used for tag dictionary compression in
+ * WAL decoding, where dictionary updates within a single cell must be rolled back if the cell read
+ * fails (e.g., due to EOF on a WAL being tailed).
+ * <p>
+ * On {@link #checkpoint()}, saves the current LRU state. Operations proceed normally against the
+ * real dictionary. On {@link #rollback()}, restores the dictionary to the checkpointed state. On
+ * {@link #commit()}, discards the saved state.
+ */
+@InterfaceAudience.Private
+public class UndoableLRUDictionary extends LRUDictionary {
+
+  private boolean tracking = false;
+  private int savedCurrSize;
+  private BidirectionalLRUMap.Node savedHead;
+  private BidirectionalLRUMap.Node savedTail;
+  private final Map<BidirectionalLRUMap.Node, NodeSnapshot> snapshots = new IdentityHashMap<>();
+
+  private static class NodeSnapshot {
+    final BidirectionalLRUMap.Node savedPrev;
+    final BidirectionalLRUMap.Node savedNext;
+    final byte[] savedContents;
+    final int savedOffset;
+    final int savedLength;
+
+    NodeSnapshot(BidirectionalLRUMap.Node node) {
+      this.savedPrev = node.prev;
+      this.savedNext = node.next;
+      this.savedContents = node.getContents();
+      this.savedOffset = node.offset;
+      this.savedLength = node.length;
+    }
+  }
+
+  public void checkpoint() {
+    tracking = true;
+    savedCurrSize = backingStore.currSize;
+    savedHead = backingStore.head;
+    savedTail = backingStore.tail;
+    snapshots.clear();
+  }
+
+  public void commit() {
+    snapshots.clear();
+    tracking = false;
+  }
+
+  public void rollback() {
+    if (!tracking) {
+      return;
+    }
+    for (Map.Entry<BidirectionalLRUMap.Node, NodeSnapshot> entry : snapshots.entrySet()) {
+      BidirectionalLRUMap.Node node = entry.getKey();
+      NodeSnapshot snap = entry.getValue();
+      node.prev = snap.savedPrev;
+      node.next = snap.savedNext;
+      if (snap.savedContents != null) {
+        backingStore.nodeToIndex.remove(node);
+        node.setContents(snap.savedContents, snap.savedOffset, snap.savedLength);
+        backingStore.nodeToIndex.put(node, findIndexForNode(node));

Review Comment:
   I wonder if doing this in two phases would avoid the O(n²) iteration here.
   
   Additionally, rollback() does remove/setContents/put on nodeToIndex (a content-keyed HashMap) one node at a time, in non-deterministic IdentityHashMap iteration order. If, during the tracked period, a node is overwritten with a value equal to another node's original value (e.g., "a" is evicted from slot 0, then "a" is added to slot 1), rollback passes through an intermediate state in which two nodes hold the same content. Depending on iteration order, the HashMap operations can clobber each other, leaving an entry missing from nodeToIndex.
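
To make the hazard concrete, here is a minimal standalone sketch. It is not the PR's code: `Node`, `SAVED`, `mutatedSlots`, and both `rollback*` helpers are hypothetical stand-ins, with `String` contents in place of byte arrays and an explicit visit order in place of IdentityHashMap iteration. It assumes only what is described above, namely that nodeToIndex is keyed by node contents.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RollbackOrderDemo {

  /** Hypothetical stand-in for a dictionary node; equality is content-based. */
  static final class Node {
    String contents;

    Node(String contents) {
      this.contents = contents;
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Node && Objects.equals(contents, ((Node) o).contents);
    }

    @Override
    public int hashCode() {
      return Objects.hashCode(contents);
    }
  }

  /** Contents at checkpoint time: slot 0 held "a", slot 1 held "b". */
  static final String[] SAVED = { "a", "b" };

  /** State after the tracked period: slot 0 now holds "c", slot 1 now holds "a". */
  static Node[] mutatedSlots(Map<Node, Integer> nodeToIndex) {
    Node[] slots = { new Node("c"), new Node("a") };
    nodeToIndex.put(slots[0], 0);
    nodeToIndex.put(slots[1], 1);
    return slots;
  }

  /** Restores one node at a time (remove, set contents, put) in the given order. */
  static Map<Node, Integer> rollbackOneAtATime(int... order) {
    Map<Node, Integer> nodeToIndex = new HashMap<>();
    Node[] slots = mutatedSlots(nodeToIndex);
    for (int i : order) {
      nodeToIndex.remove(slots[i]);
      slots[i].contents = SAVED[i];
      nodeToIndex.put(slots[i], i);
    }
    return nodeToIndex;
  }

  /** Restores all contents first, then rebuilds the index from the slot array. */
  static Map<Node, Integer> rollbackTwoPhase(int... order) {
    Map<Node, Integer> nodeToIndex = new HashMap<>();
    Node[] slots = mutatedSlots(nodeToIndex);
    for (int i : order) {
      slots[i].contents = SAVED[i];
    }
    nodeToIndex.clear();
    for (int i = 0; i < slots.length; i++) {
      nodeToIndex.put(slots[i], i);
    }
    return nodeToIndex;
  }

  public static void main(String[] args) {
    // Slot 0 first: it becomes "a" while slot 1 is still "a", so the later
    // remove() for slot 1 deletes the shared entry and "a" is lost.
    System.out.println(rollbackOneAtATime(0, 1).size()); // prints 1
    System.out.println(rollbackOneAtATime(1, 0).size()); // prints 2
    // The two-phase variant does not depend on the visit order.
    System.out.println(rollbackTwoPhase(0, 1).size()); // prints 2
    System.out.println(rollbackTwoPhase(1, 0).size()); // prints 2
  }
}
```

In this model the one-at-a-time variant loses the "a" entry only for one of the two visit orders, which is what makes the failure intermittent. Clearing and rebuilding the index after every node has its original contents never exposes the map to two equal keys.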
   
   Would it be more efficient and safer to do:
   
   ```java
   // Phase 1: restore all node links and contents
   for (Map.Entry<BidirectionalLRUMap.Node, NodeSnapshot> entry : snapshots.entrySet()) {
     BidirectionalLRUMap.Node node = entry.getKey();
     NodeSnapshot snap = entry.getValue();
     node.prev = snap.savedPrev;
     node.next = snap.savedNext;
     if (snap.savedContents != null) {
       node.setContents(snap.savedContents, snap.savedOffset, snap.savedLength);
     }
   }
   // Phase 2: rebuild nodeToIndex from scratch
   backingStore.nodeToIndex.clear();
   for (short i = 0; i < savedCurrSize; i++) {
     backingStore.nodeToIndex.put(backingStore.indexToNode[i], i);
   }
   ```
   



