Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-04-22 Thread via GitHub


wchevreuil merged PR #7995:
URL: https://github.com/apache/hbase/pull/7995


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-04-09 Thread via GitHub


dParikesit commented on code in PR #7995:
URL: https://github.com/apache/hbase/pull/7995#discussion_r3058161112


##
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java:
##
@@ -244,9 +244,10 @@ public void onLeak(String s, String s1) {
       Mockito.verify(cache).shouldCacheBlock(Mockito.any(), Mockito.anyLong(), Mockito.any());
       Mockito.verify(cache, Mockito.never()).cacheBlock(Mockito.any(), Mockito.any(),
         Mockito.anyBoolean(), Mockito.anyBoolean());
-      System.gc();
-      Thread.sleep(1000);
-      alloc.allocate(128 * 1024).release();
+      for (int i = 0; i < 15 && counter.get() == 0; i++) {
+        System.gc();
+        alloc.allocate(128 * 1024).release();
+      }

Review Comment:
   Thanks for the feedback! I believe the sleep should be done after we run 
allocate.release(), right? I have updated the commit accordingly.






Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-04-08 Thread via GitHub


vaijosh commented on code in PR #7995:
URL: https://github.com/apache/hbase/pull/7995#discussion_r3055418544


##
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java:
##
@@ -244,9 +244,10 @@ public void onLeak(String s, String s1) {
       Mockito.verify(cache).shouldCacheBlock(Mockito.any(), Mockito.anyLong(), Mockito.any());
       Mockito.verify(cache, Mockito.never()).cacheBlock(Mockito.any(), Mockito.any(),
         Mockito.anyBoolean(), Mockito.anyBoolean());
-      System.gc();
-      Thread.sleep(1000);
-      alloc.allocate(128 * 1024).release();
+      for (int i = 0; i < 15 && counter.get() == 0; i++) {
+        System.gc();
+        alloc.allocate(128 * 1024).release();
+      }

Review Comment:
   @dParikesit 
   You have not added a sleep inside the loop, so it will finish quickly and 
defeat the purpose.
   
   Please keep it something like below:
   
   for (int i = 0; i < 30 && counter.get() == 0; i++) {
     System.gc();
     alloc.allocate(128 * 1024).release();
     try {
       Thread.sleep(1000);
     } catch (InterruptedException e) {
       Thread.currentThread().interrupt();
       break;
     }
   }



##
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java:
##
@@ -258,9 +256,10 @@ public void onLeak(String s, String s1) {
     try {
       writeDataBlocksAndCreateIndex(hbw, outputStream, biw);
       assertTrue(biw.getNumLevels() >= 3);
-      System.gc();
-      Thread.sleep(1000);
-      allocator.allocate(128 * 1024).release();
+      for (int i = 0; i < 15 && counter.get() == 0; i++) {
+        System.gc();
+        allocator.allocate(128 * 1024).release();
+      }

Review Comment:
   @dParikesit 
   Please add a sleep inside the loop and use 30 retries instead of 15.






Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-04-08 Thread via GitHub


dParikesit commented on PR #7995:
URL: https://github.com/apache/hbase/pull/7995#issuecomment-4208552365

   @wchevreuil thanks for the reminder. I have modified the tests to use a 
retry loop instead of a single sleep (including the test added in 
[HBASE-28890](https://issues.apache.org/jira/browse/HBASE-28890)). I've tested 
it on my machine. Could you help trigger the CI pipeline?





Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-04-08 Thread via GitHub


wchevreuil commented on PR #7995:
URL: https://github.com/apache/hbase/pull/7995#issuecomment-4205473343

   Any news on this, @dParikesit ? Please get the reviews addressed so that we 
can move forward here, as this is an important fix you have found.





Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-03-30 Thread via GitHub


dParikesit commented on PR #7995:
URL: https://github.com/apache/hbase/pull/7995#issuecomment-4155825124

   Thanks for the suggestions. I'll get back to you after I fix it.





Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-03-30 Thread via GitHub


wchevreuil commented on PR #7995:
URL: https://github.com/apache/hbase/pull/7995#issuecomment-4154337190

   Please consider @vaijosh's suggestions and also fix the spotless issues.





Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-03-29 Thread via GitHub


vaijosh commented on code in PR #7995:
URL: https://github.com/apache/hbase/pull/7995#discussion_r3007321459


##
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java:
##
@@ -216,6 +216,60 @@ public void onLeak(String s, String s1) {
     assertEquals(0, counter.get());
   }
 
+  @Test
+  public void testIntermediateIndexCacheOnWriteDoesNotLeak() throws Exception {
+    Configuration localConf = new Configuration(TEST_UTIL.getConfiguration());
+    localConf.setInt(HFile.FORMAT_VERSION_KEY, HFile.MAX_FORMAT_VERSION);
+    localConf.setBoolean(CacheConfig.CACHE_INDEX_BLOCKS_ON_WRITE_KEY, true);
+    localConf.setInt(ByteBuffAllocator.BUFFER_SIZE_KEY, 4096);
+    localConf.setInt(ByteBuffAllocator.MAX_BUFFER_COUNT_KEY, 32);
+    localConf.setInt(ByteBuffAllocator.MIN_ALLOCATE_SIZE_KEY, 0);
+    ByteBuffAllocator allocator = ByteBuffAllocator.create(localConf, true);
+    List<ByteBuff> buffers = new ArrayList<>();
+    for (int i = 0; i < allocator.getTotalBufferCount(); i++) {
+      buffers.add(allocator.allocateOneBuffer());
+      assertEquals(0, allocator.getFreeBufferCount());
+    }
+    buffers.forEach(ByteBuff::release);
+    assertEquals(allocator.getTotalBufferCount(), allocator.getFreeBufferCount());
+    ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+    final AtomicInteger counter = new AtomicInteger();
+    RefCnt.detector.setLeakListener(new ResourceLeakDetector.LeakListener() {
+      @Override
+      public void onLeak(String s, String s1) {
+        counter.incrementAndGet();
+      }
+    });
+
+    Path localPath = new Path(TEST_UTIL.getDataTestDir(),
+      "block_index_testIntermediateIndexCacheOnWriteDoesNotLeak_" + compr);
+    HFileContext meta = new HFileContextBuilder().withHBaseCheckSum(true)
+      .withIncludesMvcc(includesMemstoreTS).withIncludesTags(true).withCompression(compr)
+      .withBytesPerCheckSum(HFile.DEFAULT_BYTES_PER_CHECKSUM).build();
+    HFileBlock.Writer hbw = new HFileBlock.Writer(localConf, null, meta, allocator,
+      meta.getBlocksize());
+    FSDataOutputStream outputStream = fs.create(localPath);
+    LruBlockCache cache = new LruBlockCache(8 * 1024 * 1024, 1024, true, localConf);
+    CacheConfig cacheConfig = new CacheConfig(localConf, null, cache, allocator);
+    HFileBlockIndex.BlockIndexWriter biw =
+      new HFileBlockIndex.BlockIndexWriter(hbw, cacheConfig, localPath.getName(), null);
+    biw.setMaxChunkSize(512);
+
+    try {
+      writeDataBlocksAndCreateIndex(hbw, outputStream, biw);
+      assertTrue(biw.getNumLevels() >= 3);
+      System.gc();
+      Thread.sleep(1000);

Review Comment:
   @dParikesit 
   I think the 1-second assumption might bite us on slower systems and make the 
test flaky.
   What if we wrap this in a loop instead? We could retry up to 15 times with a 
small delay, so that it succeeds as soon as it's ready rather than relying on 
luck.






Re: [PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-03-29 Thread via GitHub


vaijosh commented on code in PR #7995:
URL: https://github.com/apache/hbase/pull/7995#discussion_r3007309262


##
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java:
##
@@ -202,6 +207,53 @@ public void testReaderWithLRUBlockCache() throws Exception {
     lru.shutdown();
   }
 
+  @Test
+  public void testWriterCacheOnWriteSkipDoesNotLeak() throws Exception {
+    int bufCount = 32;
+    int blockSize = 4 * 1024;
+    ByteBuffAllocator alloc = initAllocator(true, blockSize, bufCount, 0);
+    fillByteBuffAllocator(alloc, bufCount);
+    ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+    Configuration myConf = HBaseConfiguration.create(conf);
+    myConf.setBoolean(CacheConfig.CACHE_BLOCKS_ON_WRITE_KEY, true);
+    myConf.setBoolean(CacheConfig.CACHE_INDEX_BLOCKS_ON_WRITE_KEY, false);
+    myConf.setBoolean(CacheConfig.CACHE_BLOOM_BLOCKS_ON_WRITE_KEY, false);
+    final AtomicInteger counter = new AtomicInteger();
+    RefCnt.detector.setLeakListener(new ResourceLeakDetector.LeakListener() {
+      @Override
+      public void onLeak(String s, String s1) {
+        counter.incrementAndGet();
+      }
+    });
+    BlockCache cache = Mockito.mock(BlockCache.class);
+    Mockito.when(cache.shouldCacheBlock(Mockito.any(), Mockito.anyLong(), Mockito.any()))
+      .thenReturn(Optional.of(false));
+    Path hfilePath = new Path(TEST_UTIL.getDataTestDir(), "testWriterCacheOnWriteSkipDoesNotLeak");
+    HFileContext context = new HFileContextBuilder().withBlockSize(blockSize).build();
+
+    try {
+      Writer writer = new HFile.WriterFactory(myConf, new CacheConfig(myConf, null, cache, alloc))
+        .withPath(fs, hfilePath).withFileContext(context).create();
+      try {
+        writer.append(new KeyValue(Bytes.toBytes("row"), Bytes.toBytes("cf"), Bytes.toBytes("q"),
+          HConstants.LATEST_TIMESTAMP, Bytes.toBytes("value")));
+      } finally {
+        writer.close();
+      }
+
+      Mockito.verify(cache).shouldCacheBlock(Mockito.any(), Mockito.anyLong(), Mockito.any());
+      Mockito.verify(cache, Mockito.never()).cacheBlock(Mockito.any(), Mockito.any(),
+        Mockito.anyBoolean(), Mockito.anyBoolean());
+      System.gc();
+      Thread.sleep(1000);

Review Comment:
   @dParikesit 
   I think the 1-second assumption might bite us on slower systems and make the 
test flaky.
   What if we wrap this in a loop instead? We could retry up to 15 times with a 
small delay, so that it succeeds as soon as it's ready rather than relying on 
luck.
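
For illustration only, the bounded retry suggested here could be sketched as a small helper along the following lines. `RetryUntil` and `retryUntil` are hypothetical names, not part of HBase or of this PR. In the tests under review, the action would presumably be the `System.gc()` plus allocate/release pair and the stop condition `counter.get() != 0`, so a run without leaks would use every attempt before the final assertion.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class RetryUntil {
  /**
   * Runs the action and then checks the condition, up to maxAttempts times,
   * sleeping delayMillis between attempts. Returns true as soon as the
   * condition holds, false if every attempt was used without it holding.
   */
  static boolean retryUntil(int maxAttempts, long delayMillis, Runnable action,
      BooleanSupplier done) {
    for (int i = 0; i < maxAttempts; i++) {
      action.run();
      if (done.getAsBoolean()) {
        return true; // stop early, no need to wait out the remaining attempts
      }
      if (i < maxAttempts - 1) { // no point sleeping after the last attempt
        try {
          Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt flag
          return done.getAsBoolean();
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Toy condition: becomes true on the third attempt.
    boolean ok = retryUntil(15, 10, calls::incrementAndGet, () -> calls.get() >= 3);
    System.out.println(ok + " after " + calls.get() + " attempts"); // true after 3 attempts
  }
}
```

The delay is a parameter, so a test can trade total wait time against tolerance for slow machines without changing the loop itself.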






[PR] HBASE-30038: RefCnt Leak error when caching [hbase]

2026-03-28 Thread via GitHub


dParikesit opened a new pull request, #7995:
URL: https://github.com/apache/hbase/pull/7995

   JIRA: [HBASE-30038](https://issues.apache.org/jira/browse/HBASE-30038)
   
   This bug is similar to 
[HBASE-28890](https://issues.apache.org/jira/browse/HBASE-28890)
   
   HFileBlock.Writer.getBlockForCaching() creates a ref-counted HFileBlock, and 
the caller must release its own reference after the cache has taken the 
ownership it needs. 
   
   In HFileWriterImpl.java, the leak happened when the shouldCacheBlock() check 
caused an early return before cacheFormatBlock.release() ran.
   
   In HFileBlockIndex.java, the intermediate-index path cached blockForCaching 
but never released the reference at all. 
   
   Because these blocks can be backed by pooled off-heap ByteBuffs, repeated 
HFile writes could steadily drain allocator buffers and effectively leak 
memory, even though the blocks were only meant to live long enough to be 
considered for cache-on-write.
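
As a rough illustration of the ownership rule described above (not the actual patch), the sketch below models a ref-counted block whose caller always drops its own reference, including on the early-return path. `Block`, `Cache`, and `cacheOnWrite` are hypothetical stand-ins and do not mirror the real HFileBlock or BlockCache APIs.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {
  /** Hypothetical ref-counted block; a stand-in, not HBase's HFileBlock. */
  static final class Block {
    private final AtomicInteger refCnt = new AtomicInteger(1); // creator holds one reference
    private final Runnable onFree;

    Block(Runnable onFree) {
      this.onFree = onFree;
    }

    void retain() {
      refCnt.incrementAndGet();
    }

    void release() {
      // The backing buffer returns to the pool only when the last reference is dropped.
      if (refCnt.decrementAndGet() == 0) {
        onFree.run();
      }
    }

    int count() {
      return refCnt.get();
    }
  }

  /** Hypothetical cache that takes its own reference when it keeps a block. */
  static final class Cache {
    private final boolean accept;

    Cache(boolean accept) {
      this.accept = accept;
    }

    boolean shouldCache() {
      return accept;
    }

    void cache(Block block) {
      block.retain();
    }
  }

  /** Offers the block to the cache and always drops the caller's own reference. */
  static void cacheOnWrite(Cache cache, Block block) {
    try {
      if (!cache.shouldCache()) {
        return; // early exit: the finally block still releases the caller's reference
      }
      cache.cache(block);
    } finally {
      block.release();
    }
  }

  public static void main(String[] args) {
    AtomicInteger freed = new AtomicInteger();
    Block skipped = new Block(freed::incrementAndGet);
    cacheOnWrite(new Cache(false), skipped); // cache declines: block is freed
    Block kept = new Block(freed::incrementAndGet);
    cacheOnWrite(new Cache(true), kept); // cache keeps it: one reference remains
    System.out.println(skipped.count() + " " + kept.count() + " " + freed.get()); // 0 1 1
  }
}
```

Putting the release in a `finally` block is one way to keep the early-return and the cached paths symmetric; whether the real fix uses `finally` or explicit release calls is not shown in this thread.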

