[
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529732#comment-16529732
]
Zheng Hu commented on HBASE-20789:
----------------------------------
As [~Apache9] commented on RB, there is a problem here in patch.v3:
{code}
if (replaceExistingCacheBlock) {
  ramCache.put(cacheKey, re);
} else if (ramCache.putIfAbsent(cacheKey, re) != null) {
  return;
}
{code}
We cannot simply replace the entry for cacheKey with a new RAMQueueEntry,
because the heapSize of the bucket cache must be updated whenever an entry is
removed from ramCache. The WriterThread first writes to the io-engine, then
syncs, and then removes the RAMQueueEntry from ramCache, so the entry it
removes may not be the one it wrote.
{code}
t1. thread0 tries to cache block0 with key0 (BucketCache#cacheBlock);
t2. block0 is put into ramCache;
t3. the writer thread writes block0 to the io-engine;
// t4. another thread, thread1, tries to cache block1 with the same key0
//     (BucketCache#cacheBlock);
// t5. block1 replaces block0 in ramCache;
t6. the writer thread removes the entry with key0 (now block1) from ramCache;
{code}
In the end, the write started for thread0's block0 removes the wrong entry,
block1, and the heap size is wrong as well.
So for safety, we still keep putIfAbsent() to ensure that only one thread
removes the entry from ramCache. The flaky UT has been fixed by waiting
until the cache is flushed to the io-engine.
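To make the interleaving above concrete, here is a minimal single-threaded replay of it. This is only a sketch under stated assumptions, not the actual BucketCache code: the class RamCacheRaceSketch, its Entry type, the block sizes, and the heap accounting (add the size on insert, subtract the written entry's size on removal) are all hypothetical simplifications.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of ramCache plus heap accounting.
class RamCacheRaceSketch {
    /** Stand-in for RAMQueueEntry: a block name and its (made-up) heap size. */
    static final class Entry {
        final String block;
        final long size;
        Entry(String block, long size) { this.block = block; this.size = size; }
    }

    final ConcurrentMap<String, Entry> ramCache = new ConcurrentHashMap<>();
    final AtomicLong heapSize = new AtomicLong();

    /** Unconditional overwrite, as in the replaceExistingCacheBlock branch. */
    void cacheWithReplace(String key, Entry e) {
        ramCache.put(key, e);
        heapSize.addAndGet(e.size);
    }

    /** Guarded insert: a second entry for the same key is rejected. */
    boolean cacheIfAbsent(String key, Entry e) {
        if (ramCache.putIfAbsent(key, e) != null) {
            return false;
        }
        heapSize.addAndGet(e.size);
        return true;
    }

    /** Writer thread after write + sync: remove by key, release the written size. */
    Entry finishWrite(String key, Entry written) {
        Entry removed = ramCache.remove(key);
        if (removed != null) {
            heapSize.addAndGet(-written.size);
        }
        return removed;
    }

    /** Replays t1..t6 with put(); returns "<removed block> <heapSize>". */
    static String replayWithReplace() {
        RamCacheRaceSketch c = new RamCacheRaceSketch();
        Entry block0 = new Entry("block0", 100);
        Entry block1 = new Entry("block1", 300);
        c.cacheWithReplace("key0", block0);            // t1, t2
        // t3: the writer has written block0 but has not removed it yet
        c.cacheWithReplace("key0", block1);            // t4, t5
        Entry removed = c.finishWrite("key0", block0); // t6
        return removed.block + " " + c.heapSize.get();
    }

    /** Same interleaving with putIfAbsent(): thread1's insert is rejected. */
    static String replayWithPutIfAbsent() {
        RamCacheRaceSketch c = new RamCacheRaceSketch();
        Entry block0 = new Entry("block0", 100);
        Entry block1 = new Entry("block1", 300);
        c.cacheIfAbsent("key0", block0);
        c.cacheIfAbsent("key0", block1);               // returns false, no change
        Entry removed = c.finishWrite("key0", block0);
        return removed.block + " " + c.heapSize.get();
    }

    public static void main(String[] args) {
        System.out.println(replayWithReplace());
        System.out.println(replayWithPutIfAbsent());
    }
}
```

In this model the put() replay removes block1, which was never written, and leaves a non-zero heap size behind an empty ramCache, while the putIfAbsent() replay removes block0 and returns the heap size to zero.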
> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---------------------------------------------------------------
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
> Issue Type: Bug
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments:
> 0001-HBASE-20789-TestBucketCache-testCacheBlockNextBlockM.patch,
> HBASE-20789.v1.patch, HBASE-20789.v2.patch, HBASE-20789.v3.patch,
> bucket-33718.out
>
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.