[
https://issues.apache.org/jira/browse/HBASE-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797781#comment-15797781
]
ramkrishna.s.vasudevan commented on HBASE-17379:
------------------------------------------------
bq. Consider a get operation and in-memory compaction or a flush to disk running
concurrently. The former scans the pipeline while the latter changes it.
Do you mean the problem was already there before HBASE-17081? I am not sure
about that. The reversal of active/snapshot in another JIRA solves the problem
where the snapshot is not updated. But regarding the pipeline during in-memory
flush/compaction, I think we were safe there.
I remember checking that part of the code. When we add a segment to the
pipeline during in-memory flush, writes are blocked first of all. So any read
that starts at that point will have a read point lower than the current write's
MVCC number. With that read point it tries to create a scanner over the
pipeline. So either the new segment has already been added to the pipeline or
it has not. In getScanners() we call getSegments(), which was actually
synchronizing on the pipeline. So that should have been enough to avoid any
issues (not only concurrent modification but also data loss).
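To make the point concrete, here is a minimal sketch of the pattern I am
describing. It is not the actual HBase code: SyncPipelineSketch, pushHead() and
the String segments are hypothetical stand-ins, and only the idea of copying
the list while holding the pipeline's monitor is taken from what getSegments()
used to do.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for the pipeline; segments are plain strings here.
public class SyncPipelineSketch {
    private final LinkedList<String> pipeline = new LinkedList<>();

    // Writer side: an in-memory flush pushes a new segment at the head.
    public void pushHead(String segment) {
        synchronized (pipeline) {
            pipeline.addFirst(segment);
        }
    }

    // Reader side: copy under the same monitor, then iterate the copy.
    public List<String> getSegments() {
        synchronized (pipeline) {
            return new ArrayList<>(pipeline);
        }
    }

    public static void main(String[] args) {
        SyncPipelineSketch p = new SyncPipelineSketch();
        p.pushHead("seg-1");
        List<String> snapshot = p.getSegments();
        p.pushHead("seg-2"); // does not affect the snapshot already taken
        System.out.println(snapshot);        // [seg-1]
        System.out.println(p.getSegments()); // [seg-2, seg-1]
    }
}
```

The reader sees the pipeline either with or without the new segment, never a
half-updated list, and it iterates over its own copy.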
It was HBASE-17081 that changed things to use size() and
pipeline#getScanners(), which lack synchronization.
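For reference, the failure mode itself is easy to reproduce outside HBase. The
sketch below is only an illustration (CmeSketch and its method name are made
up): it modifies a LinkedList in the middle of a for-each loop on a single
thread, which trips the same fail-fast check that a concurrent writer trips
in the stack trace.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;

public class CmeSketch {
    // Returns true if iterating while the list is structurally modified
    // throws ConcurrentModificationException.
    static boolean iterationFailsOnModification() {
        LinkedList<String> pipeline = new LinkedList<>();
        pipeline.add("seg-1");
        pipeline.add("seg-2");
        try {
            for (String segment : pipeline) {
                // Stands in for a flush pushing a segment while a scan runs.
                pipeline.addFirst("seg-new");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iterationFailsOnModification()); // true
    }
}
```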
So we need to do something like what getSegments() was doing previously, and I
think that is unavoidable. The only decision point here is whether to use a
synchronized block or a read/write lock.
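As a rough sketch of the read/write lock option (again with hypothetical names,
RwLockPipelineSketch and pushHead(), and String segments rather than real
Segment objects), using java.util.concurrent.locks.ReentrantReadWriteLock:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the pipeline, guarded by a read/write lock.
public class RwLockPipelineSketch {
    private final LinkedList<String> pipeline = new LinkedList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Flush/compaction side: exclusive access while the list changes.
    public void pushHead(String segment) {
        lock.writeLock().lock();
        try {
            pipeline.addFirst(segment);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Scan side: many readers may copy the list at the same time.
    public List<String> getSegments() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(pipeline);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        RwLockPipelineSketch p = new RwLockPipelineSketch();
        p.pushHead("seg-1");
        p.pushHead("seg-2");
        System.out.println(p.getSegments()); // [seg-2, seg-1]
    }
}
```

The trade-off as I see it: the read/write lock lets concurrent gets and scans
proceed without blocking each other, while a synchronized block is simpler and
may be enough if the critical section is only a short copy.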
[~eshcar]
Let me know what you think. I also read through your suggestion of using a
dirty bit to know whether the cached version was really updated. Will that
really be needed?
If we really want to know whether the failing tests pass, we should run them
with HBASE-17081 applied. Otherwise I think this problem won't come up.
> Lack of synchronization in CompactionPipeline#getScanners()
> -----------------------------------------------------------
>
> Key: HBASE-17379
> URL: https://issues.apache.org/jira/browse/HBASE-17379
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 17379.v1.txt, 17379.v14.txt, 17379.v2.txt, 17379.v3.txt,
> 17379.v4.txt, 17379.v5.txt, 17379.v6.txt, 17379.v8.txt
>
>
> From
> https://builds.apache.org/job/PreCommit-HBASE-Build/5053/testReport/org.apache.hadoop.hbase.regionserver/TestHRegionWithInMemoryFlush/testWritesWhileGetting/
> :
> {code}
> java.io.IOException: java.util.ConcurrentModificationException
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleException(HRegion.java:5886)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5856)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7015)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6994)
> at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:4141)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException: null
> at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
> at java.util.LinkedList$ListItr.next(LinkedList.java:888)
> at org.apache.hadoop.hbase.regionserver.CompactionPipeline.getScanners(CompactionPipeline.java:220)
> at org.apache.hadoop.hbase.regionserver.CompactingMemStore.getScanners(CompactingMemStore.java:298)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanners(HStore.java:1154)
> at org.apache.hadoop.hbase.regionserver.Store.getScanners(Store.java:97)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.getScannersNoCompaction(StoreScanner.java:353)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:210)
> at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:1892)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1880)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5842)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
> at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> {code}
> The cause is in CompactionPipeline#getScanners(), where there is no
> synchronization around the iteration over the pipeline.
> The code causing ConcurrentModificationException:
> {code}
> for (Segment segment : this.pipeline) {
> {code}
> was introduced by HBASE-17081
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)