[ https://issues.apache.org/jira/browse/HBASE-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15782637#comment-15782637 ]
Eshcar Hillel commented on HBASE-17379:
---------------------------------------
Hi [~ted_yu], some of the changes you suggest here may conflict with the patch
I just attached in HBASE-17373 to resolve the bug found by
TestAsyncTableGetMultiThreaded.
For example, I removed the method drain. It might be better to solve these
issues one at a time to avoid such conflicts.
The idea behind the original design of the compaction pipeline was to reduce
synchronization as much as possible and to use it only where it is necessary
for correctness.
Write-write conflicts in the pipeline are generally handled by using a lock plus
a version number, which also prevents conflicting operations from happening even
when they are not concurrent.
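
To illustrate the lock plus version number idea, here is a minimal sketch. It is
not the actual CompactionPipeline code: the class VersionedPipeline, its method
names, and the use of String in place of segments are all assumptions made for
the example. A swap succeeds only if the version it was computed against is
still current, so a stale swap is rejected even when it runs strictly after the
conflicting one.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical, simplified pipeline; not the HBase class. */
class VersionedPipeline {
  private final LinkedList<String> segments = new LinkedList<>();
  private long version = 0;

  /** A copy of the pipeline contents plus the version it was taken at. */
  static final class Snapshot {
    final List<String> segments;
    final long version;

    Snapshot(List<String> segments, long version) {
      this.segments = segments;
      this.version = version;
    }
  }

  /** New segments enter at the head; this does not change the version. */
  synchronized void pushHead(String segment) {
    segments.addFirst(segment);
  }

  /** Copies the list and reads the version under the same lock. */
  synchronized Snapshot snapshot() {
    return new Snapshot(new ArrayList<>(segments), version);
  }

  /**
   * Replaces the snapshotted suffix with a single segment, but only if no
   * other swap has happened since the snapshot was taken.
   */
  synchronized boolean swap(Snapshot snapshot, String replacement) {
    if (snapshot.version != version) {
      return false; // stale snapshot: reject the write
    }
    for (int i = 0; i < snapshot.segments.size(); i++) {
      segments.removeLast();
    }
    segments.addLast(replacement);
    version++;
    return true;
  }
}
```

If two callers take a snapshot at the same version, only the first swap is
applied; the second one returns false and has to retry from a fresh snapshot.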
So what remains to be handled (the problem captured in the current failure) are
the read-write conflicts.
Wrapping every use of the pipeline in a synchronized block is a correct
solution, but it might block operations even when that is not necessary. For
example, it does not allow concurrent reads.
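
For reference, the read-write conflict can be reproduced deterministically, even
in a single thread, because LinkedList iterators are fail-fast. The sketch below
is only an illustration; the class FailFastDemo and the String elements are made
up for the example and are not HBase code.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical demo class; not part of HBase. */
class FailFastDemo {
  /** Returns true if the reader fails once the list is modified under it. */
  static boolean iterationFailsAfterModification() {
    List<String> pipeline = new LinkedList<>();
    pipeline.add("segment-1");
    pipeline.add("segment-2");

    // The "reader" starts iterating, as a for-each loop would.
    Iterator<String> reader = pipeline.iterator();
    reader.next();

    // The "writer" structurally modifies the list in the meantime.
    pipeline.add(0, "segment-0");

    try {
      reader.next(); // the iterator detects the modification here
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }
}
```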
May I suggest using one of Java's thread-safe lists for the pipeline? The list
here is not long. We can go through all the possible alternatives (there are not
many) and see which implementation best fits our case: specifically, one that
allows concurrent reads and prevents concurrent read-write operations.
We'll still need synchronized blocks wherever we need to atomically update the
pipeline and the version number, as we do now.
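
As one candidate to evaluate, here is a rough sketch based on
java.util.concurrent.CopyOnWriteArrayList. The class CowPipeline and its methods
are hypothetical and only meant to show the shape of the idea. Note that this
list does not block readers while a write is in progress; instead each iterator
works on a snapshot of the list, so whether snapshot semantics are acceptable
for the scanners is an assumption that would need to be checked.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch; not the HBase CompactionPipeline. */
class CowPipeline {
  private final List<String> pipeline = new CopyOnWriteArrayList<>();
  private long version = 0;

  /** Lock-free read: the iterator sees the list as it was when created. */
  Iterator<String> iterator() {
    return pipeline.iterator();
  }

  /** Lock-free copy of the current contents. */
  List<String> contents() {
    return new ArrayList<>(pipeline);
  }

  /** Writers still serialize among themselves. */
  synchronized void pushHead(String segment) {
    pipeline.add(0, segment);
  }

  /**
   * The list and the version must change together, so this update still
   * takes the lock and checks the version first.
   */
  synchronized boolean removeTail(long expectedVersion) {
    if (expectedVersion != version || pipeline.isEmpty()) {
      return false;
    }
    pipeline.remove(pipeline.size() - 1);
    version++;
    return true;
  }
}
```

With this shape, reads never throw ConcurrentModificationException and never
wait on writers. The cost is that every write copies the backing array, which
should be fine for a short list, and that an update made of several list
operations could be observed half-done by a lock-free reader.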
> Lack of synchronization in CompactionPipeline#getScanners()
> -----------------------------------------------------------
>
> Key: HBASE-17379
> URL: https://issues.apache.org/jira/browse/HBASE-17379
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 17379.v1.txt, 17379.v2.txt, 17379.v3.txt
>
>
> From https://builds.apache.org/job/PreCommit-HBASE-Build/5053/testReport/org.apache.hadoop.hbase.regionserver/TestHRegionWithInMemoryFlush/testWritesWhileGetting/ :
> {code}
> java.io.IOException: java.util.ConcurrentModificationException
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleException(HRegion.java:5886)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5856)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7015)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6994)
> at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:4141)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException: null
> at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
> at java.util.LinkedList$ListItr.next(LinkedList.java:888)
> at org.apache.hadoop.hbase.regionserver.CompactionPipeline.getScanners(CompactionPipeline.java:220)
> at org.apache.hadoop.hbase.regionserver.CompactingMemStore.getScanners(CompactingMemStore.java:298)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanners(HStore.java:1154)
> at org.apache.hadoop.hbase.regionserver.Store.getScanners(Store.java:97)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.getScannersNoCompaction(StoreScanner.java:353)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:210)
> at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:1892)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1880)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5842)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> {code}
> The cause is in CompactionPipeline#getScanners(), where there is no
> synchronization around the iteration over pipeline.
> The code causing the ConcurrentModificationException:
> {code}
> for (Segment segment : this.pipeline) {
> {code}
> was introduced by HBASE-17081
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)