[jira] [Commented] (HBASE-17379) Lack of synchronization in CompactionPipeline#getScanners()

ramkrishna.s.vasudevan (JIRA) Wed, 04 Jan 2017 01:35:43 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797781#comment-15797781
 ]


ramkrishna.s.vasudevan commented on HBASE-17379:
------------------------------------------------

bq.Consider a get operation and in-memory compaction or a flush to disk running 
concurrently. The former scans the pipeline while the later changes it. 
Before HBASE-17081 - you mean the problem was there? Am not sure on that. The 
reversal of active/snapshot in another JIRA solves the problem where the 
snapshot is not updated. But regarding the pipeline during inmemory 
flush/compaction I think we were safe there.
I remember checking that part of the code. So when we keep adding it to the 
pipeline during in memory flush- first of all writes are blocked. So any read 
that starts at that point will have a readpt lesser than the current write's 
mvcc. So with that read pt it tries to create a scanner over the pipeline. 
So either the pipeline got added with the new segment or it was not added. In 
getscanners() we do getSegments() which was actually doing synchronize() on the 
pipeline. So that should have been enough to avoid any issues. (Not only 
concurrent modification but also data loss).
Now it was in HBASE-17081 where things were changed to use size() and 
pipeline#getScanners() which lacked synchronization. 
So we should be needing a do something like what getSegments() was doing 
previously. And I think that is unavoidable. But now whether to use a sync 
block or read/write lock is only the decision point here. 
[~eshcar]
Let me know what you think?  I also read thro your suggestion of using dirty 
bit to know if the cached version was really updated. Will that be really 
needed? 
In case we really want to know if the failing tests passes we actually should 
run the tests with HBASE-17081 in it. Otherwise I think this problem won't be 
coming up. 

> Lack of synchronization in CompactionPipeline#getScanners()
> -----------------------------------------------------------
>
>                 Key: HBASE-17379
>                 URL: https://issues.apache.org/jira/browse/HBASE-17379
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 17379.v1.txt, 17379.v14.txt, 17379.v2.txt, 17379.v3.txt, 
> 17379.v4.txt, 17379.v5.txt, 17379.v6.txt, 17379.v8.txt
>
>
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/5053/testReport/org.apache.hadoop.hbase.regionserver/TestHRegionWithInMemoryFlush/testWritesWhileGetting/
>  :
> {code}
> java.io.IOException: java.util.ConcurrentModificationException
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleException(HRegion.java:5886)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5856)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
>       at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
>       at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7015)
>       at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6994)
>       at 
> org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:4141)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.ConcurrentModificationException: null
>       at 
> java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
>       at java.util.LinkedList$ListItr.next(LinkedList.java:888)
>       at 
> org.apache.hadoop.hbase.regionserver.CompactionPipeline.getScanners(CompactionPipeline.java:220)
>       at 
> org.apache.hadoop.hbase.regionserver.CompactingMemStore.getScanners(CompactingMemStore.java:298)
>       at 
> org.apache.hadoop.hbase.regionserver.HStore.getScanners(HStore.java:1154)
>       at org.apache.hadoop.hbase.regionserver.Store.getScanners(Store.java:97)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.getScannersNoCompaction(StoreScanner.java:353)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:210)
>       at 
> org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:1892)
>       at 
> org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1880)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:5842)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:5819)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2786)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2766)
>       at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:7036)
> {code}
> The cause is in CompactionPipeline#getScanners() where there is no 
> synchronization around iterating pipeline.
> The code causing ConcurrentModificationException:
> {code}
>     for (Segment segment : this.pipeline) {
> {code}
> was introduced by HBASE-17081



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-17379) Lack of synchronization in CompactionPipeline#getScanners()

Reply via email to