[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623041#comment-16623041 ] Lars Hofhansl commented on HBASE-13082: --- [~ram_krish] Above sounded a bit more whiny than intended? I can't think of anything better either (perhaps the old code was slower but more less error-prone...?) At the very least let's add the message improvements suggested on HBASE-20704. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan >Priority: Major > Fix For: 1.3.0, 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622426#comment-16622426 ] ramkrishna.s.vasudevan commented on HBASE-13082: HBASE-20704 though I started seeing it I lost track of it. Just checking the patch there, seems good to me. Probably these are some hidden cases that are found when we are making the compaction archival asynchronous. Good one. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan >Priority: Major > Fix For: 1.3.0, 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622421#comment-16622421 ] ramkrishna.s.vasudevan commented on HBASE-13082: [~lhofhansl] I agree to your point that reference counting is risky is some cases where there is no unified way of incrementing/decrementing. Is there a better way we can do it? Or shall we unify the way the store files scanners are accessed and closed? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan >Priority: Major > Fix For: 1.3.0, 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622352#comment-16622352 ] Andrew Purtell commented on HBASE-13082: bq. I don't like reference counting... In this case you forget to close just one StoreFileScanner, and BOOM compactions now can never succeed; unless and until you move the region or bounce the region server. Not good! [~lhofhansl] See related HBASE-20704, which [~toffer] has a patch pending. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan >Priority: Major > Fix For: 1.3.0, 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621508#comment-16621508 ] Lars Hofhansl commented on HBASE-13082: --- Interestingly we just upgraded one of our clusters to 1.3.2 and now we see: "Can't archive compacted file because of either isCompactedAway = true or file has reference, isReferencedInReads = true, skipping for now." I looked through the code and it seems impossible to visually track where all we create StoreFileScanners and whether we always close them. I don't like reference counting... In this case you forget to close just one StoreFileScanner, and BOOM compactions now can *never* succeed; unless and until you move the region or bounce the region server. Not good! > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan >Priority: Major > Fix For: 1.3.0, 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181607#comment-15181607 ] Lars Hofhansl commented on HBASE-13082: --- [~ramkrishna.s.vasude...@gmail.com] [~anoop.hbase], Yep, that's what I suggest as downside ("risks") above. Everything that is calling updateReaders will have to wait behind active StoreScanners. Instead of a lot of huh hah, compactions are delayed until they can safely finish. But those would only be StoreScanner open at the time the compaction/flush tries to finish. So actually my point about delaying compactions potentially forever is not true. Instead as as soon as the old set or StoreScanners is done the compaction/flush can proceed and the new Scanner would only see the new files. The flush part is concerning, though. Thanks for pointing that out. That was already the cases before in other situations (like a Scanner that only encounters deleted Cells with has a Filter that filters most Cells, etc). But that would not be the case in trunk anymore - just that there we'd disallow the file to be removed, which is better. There is currently still the checkFlushed call, which needs a volatile and hence causes a memory barrier. I'll do some tests. This wasn't meant as an actual patch to be committed, just wanted to get some feedback. I'm always a fan of simplest code possible. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181578#comment-15181578 ] ramkrishna.s.vasudevan commented on HBASE-13082: [~lhofhansl] In the current trunk and 1.3 actually we are not doing any updateReaders after compactions. Only after flushes we are doing. So how does this patch work with flushes if you will wait on the latch that is created per StoreScanner. Because for flush we are forced to update the scanner because we are cleaning up the memstore snapshot. But yes we do have the reference to the older memstore snapshot but holding it up for a longer time is not good is what was discussed previously in this JIRA impl. So even if compactions get blocked if there are large scans then it will affect the compaction requests that comes newly? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181541#comment-15181541 ] Anoop Sam John commented on HBASE-13082: So this means when there are parallel scanners, the thread doing compaction will get blocked. Means it can delay other compaction requests in Q. Means if there are some parallel writes also that may delay flushes. So may block writes (?) Is these possible Lars? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, > HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, > HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, > HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181406#comment-15181406 ] Lars Hofhansl commented on HBASE-13082: --- So I found a deceptively simple way of (1) avoiding the locks in every method, (2) avoiding resetting the scanner stack, (3) avoiding an extra cleanup chore: # At StoreScanner creation time create a CountDownLatch initialized with 1. # When a StoreScanner is closed, call countDown on the latch. # When the compaction calls updateReaders in the StoreScanner it will block until the scanner is done. So we have an easy way to ensure that no compactions can finish while a scanner using referring to the store files is running. That also means that we no longer need to reset the scanner stack (a compaction only happens after the scanner is closed), and lastly the compactions will naturally stack up behind the scanners and eventually run. The risk is that scanner is not properly closed and we'll never count down that latch, and hence block compactions forever. I've not seen this happening in my tests, also every scanner is guarded by a lease, so eventually we will close it. The other risk is that we simply might not be able to get any compactions through in a busy system. Comments? Happy to attach a patch (or open another jira for this) [~stack], [~ram_krish], [~anoop.hbase], [~apurtell] > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114679#comment-15114679 ] Hudson commented on HBASE-13082: SUCCESS: Integrated in HBase-1.3-IT #460 (See [https://builds.apache.org/job/HBase-1.3-IT/460/]) HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 - recommit (ramkrishna: rev bc0e5fc048de3f18ab07bd5cb06509eb349e951a) * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischargeHandler.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114824#comment-15114824 ] Hudson commented on HBASE-13082: SUCCESS: Integrated in HBase-1.3 #513 (See [https://builds.apache.org/job/HBase-1.3/513/]) HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 - recommit (ramkrishna: rev bc0e5fc048de3f18ab07bd5cb06509eb349e951a) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischargeHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. >
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113280#comment-15113280 ] Hudson commented on HBASE-13082: SUCCESS: Integrated in HBase-1.3 #509 (See [https://builds.apache.org/job/HBase-1.3/509/]) Revert "HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 (stack: rev a5228a0b4d8b4c6b9bab58e1eee015393edaaade) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischargeHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischarger.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113298#comment-15113298 ] Hudson commented on HBASE-13082: SUCCESS: Integrated in HBase-1.3-IT #454 (See [https://builds.apache.org/job/HBase-1.3-IT/454/]) Revert "HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 (stack: rev a5228a0b4d8b4c6b9bab58e1eee015393edaaade) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischargeHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischarger.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110921#comment-15110921 ] Hudson commented on HBASE-13082: FAILURE: Integrated in HBase-1.3 #505 (See [https://builds.apache.org/job/HBase-1.3/505/]) HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 (Ram) (ramkrishna: rev 58521869b06a63894e422e9c9403e48b4b12f388) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java * hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischargeHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischarger.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111600#comment-15111600 ] Hudson commented on HBASE-13082: SUCCESS: Integrated in HBase-1.3-IT #451 (See [https://builds.apache.org/job/HBase-1.3-IT/451/]) HBASE-14970 Backport HBASE-13082 and its sub-jira to branch-1 (Ram) (ramkrishna: rev 58521869b06a63894e422e9c9403e48b4b12f388) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischargeHandler.java * hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java * hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ChangedReadersObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108407#comment-15108407 ] Hudson commented on HBASE-13082: FAILURE: Integrated in HBase-Trunk_matrix #645 (See [https://builds.apache.org/job/HBase-Trunk_matrix/645/]) HBASE-15101 Leaked References to StoreFile.Reader after HBASE-13082 (ramkrishna: rev 93e200d52b29d35ad5a98eed9eea05783960f6b2) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug > Components: Performance, Scanners >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048506#comment-15048506 ] ramkrishna.s.vasudevan commented on HBASE-13082: Oops. I need to prepare a patch for 1.3 which I started but left it due to some conflicts. Let me complete it by the end of this week. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15041432#comment-15041432 ] Anoop Sam John commented on HBASE-13082: Can we have it in 1.3 also? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15041873#comment-15041873 ] stack commented on HBASE-13082: --- Wahoo. +1 on 1.3. Probably too late for 1.2. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15041227#comment-15041227 ] Hudson commented on HBASE-13082: FAILURE: Integrated in HBase-Trunk_matrix #530 (See [https://builds.apache.org/job/HBase-Trunk_matrix/530/]) HBASE-13082 Coarsen StoreScanner locks to RegionScanner (Ram) (ramkrishna: rev 8b3d1f144408e4a7a014c5ac46418c9e91b9b0db) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionReplayEvents.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestSnapshotFromMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestEncryptionKeyRotation.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStripeStoreFileManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestCompactedHFilesDischarger.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactedHFilesDischarger.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicas.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedStoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFileManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/TestIOFencing.java > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034534#comment-15034534 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12775074/HBASE-13082_19.patch against master branch at commit a7673ed7552542c9848d95f9492bf3a55d2060e2. ATTACHMENT ID: 12775074 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 50 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated new checkstyle errors. Check build console for list of new errors. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16719//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16719//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16719//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16719//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033927#comment-15033927 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12775032/HBASE-13082_18.patch against master branch at commit 7979ac46cce36f21033f8ed03c8d0dd5fddde005. ATTACHMENT ID: 12775032 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 46 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:red}-1 checkstyle{color}. The applied patch generated new checkstyle errors. Check build console for list of new errors. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.io.TestHeapSize Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16717//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16717//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16717//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16717//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16717//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_18.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, > HBASE-13082_withpatch.jpg, LockVsSynchronized.java, gc.png, gc.png, gc.png, > hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033578#comment-15033578 ] Anoop Sam John commented on HBASE-13082: sortCompactedfiles This is already been called on new temp ArrayList. Do we still need create a new list? May be to be best way is create ImmutableList here at end for setting to instance variable private ChoreService choreService; We dont need to keep a ref to this. bq.LOG.trace("No compacted files to compact"); Say no compacted files to archive. {code} this.storeEngine.getStoreFileManager().addCompactionResults(compactedFiles, result); // Mark the files as compactedAway once the storefiles and compactedfiles list is finalised // Let a background thread close the actual reader on these compacted files and also // ensure to evict the blocks from block cache so that they are no longer in // cache this.storeEngine.getStoreFileManager().markCompactedAway(compactedFiles); {code} Do we need 2 API calls for this? Can the marking be done within addCompactionResults? May be we need a better name for that API then. {code} /** * @return true if the file is compacted and if there are no references */ public boolean isReferencedInReads() { return !(this.compactedAway && refCount.get() == 0); } {code} That looks bad this API checks the status of compactedAway also. We incr refCount with out any check. Either name the API accordingly or just it return state of refCount. {code} public boolean isCompactedAway() { if (this.reader != null) { return this.reader.isCompactedAway(); } return true; } {code} We will have not null reader on every file every time right? As per this logic when there is no reader on a file we treat it as compacted away. Return false instead of true? Seems we use this only in tests. Why because the other API isReferencedInReads is checking both ref count and compactedAway state! {code} StoreFile.Reader r = file.getReader(); if (r != null && !r.isReferencedInReads()) { {code} Better add similar API in StoreFile also and use that rather than using directly from reader. removeCompactedFiles -> Every time the chore runs, we seems to make a ThreadPoolExecutor and ExecutorService which seems expensive. Why we need multi threaded reader close? checkResetHeap -> This is working with out any lock now. Say flushed was true and so we did nullifyCurrentHeap(). Another reader by this time created new heap. Then only the old thread got a chance. This will try set the var to false. In btw that another flush happened and variable was set to ON. But we may miss this reset of heap? It will make the ref to that snapshot for longer time max till all reader on that snapshot is completed. Should be ok as it is rare case also.. Just saying. private HRegion region; -> Do we need this ref of HRegion type? Can be Region? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033596#comment-15033596 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.sortCompactedfiles This is already been called on new temp ArrayList. Do we still need create a new list? May be to be best way is create ImmutableList here at end for setting to instance variable Stack suggested not to use Guava DS and APIs. bq.Can the marking be done within addCompactionResults? If we do this way then we have to add the markcompactedAway in all the compaction impl. This way it will be better. Also better we do it inside the store level lock. bq.That looks bad this API checks the status of compactedAway also. The name is Ok I think. Any other name you suggest? bq.isCompactedAway() { This is used in tests only because wanted to check some conditions for the tests. bq.Better add similar API in StoreFile also and use that rather than using directly from reader. This cannot be done so easily because all our StorefileScanners are created over the Reader. So if we want to access from Storefile then it will be a bigger change. Any way there is a follow up that was discussed to make things work with StoreFileManager and StorefileInfo getting used by the manager. bq.removeCompactedFiles -> Every time the chore runs, we seems to make a ThreadPoolExecutor and ExecutorService which seems expensive. Why we need multi threaded reader close? The close() is a costly operation hence did not want to do these things serially when we know that we can do it parallely. The reason for creating the executor every time was that not sure what is the thread count to be allocated to the executor. bq.Should be ok as it is rare case also.. Just saying. YA if there are very frequent flushes then this will happen, till then the GC will not be cleared for that snapshot till the scan is over. bq.HRegion region; -> Do we need this ref of HRegion type? Can be Region? Okie that can be done I think. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033701#comment-15033701 ] ramkrishna.s.vasudevan commented on HBASE-13082: Changed the way isReferencedInREads is called and instead replaced it with r.isCompactedAway() && !r.isReferencedInReads(). The isReferencedInReads is now {code} public boolean isReferencedInReads() { return refCount.get() != 0; } {code} > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031384#comment-15031384 ] ramkrishna.s.vasudevan commented on HBASE-13082: Will commit this patch shortly. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027356#comment-15027356 ] stack commented on HBASE-13082: --- +1 Thanks [~ram_krish] StoreFileManager and StoreFileInfo all needs cleaning up -- where does StoreFile accounting get done, where is StoreFile metadata kept -- but not as part of this patch. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024811#comment-15024811 ] Hadoop QA commented on HBASE-13082: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12774054/HBASE-13082_17.patch against master branch at commit 0bae444b34b6be3a28b5ccc036afb5add23818c6. ATTACHMENT ID: 12774054 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 46 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16651//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16651//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16651//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16651//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024368#comment-15024368 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.Could also have the method internally do these checks for null and check for zero count? Done. bq.closeAfterCompaction seems misnamed. The return is whether there are references outstanding and if the file can be safely removed/closed? isReferencedInReads() is better? bq.Does StoreFileScanner have reference to StoreFileManager? No there is no reference. Coming to this StorefileManager and refcounting, as you can see that SFM has a list of storeFiles that are under the current store. But the StorefileScanner is created by the StoreFile only. In the sense the StoreScanner creates a StorefileScanner on each of this storefiles returned by the StorefileManager. The decrement in ref counting we will have to do when the scanner on the Storefile is done (ie StoreFileScanner.close()). So this how can the SFM do this ref counting? SFM is not involved in the scan process except for maintiaining the list of storefiles eligible for scanning. What do you think? If we need to support this way, then SFM needs to be modified iin such a way that the scanner creation also happen thro the SFM. bq.Gets complicated here on the end in closeAndArchiveCompactedFiles where there is a lock for the Store being used to close out storefiles And, you are just following on from what is in here already. Ugh. I thought it is better to have the lock there (though it is a lock at the store level) where the compacted file is updated. One thing can be done is maintain a new lock only for the compacted file but doing so one problem is that addCompactionResults we need to split into two like first hold the current lock and update the storeFile list and then hold the new compaciton lock and update the compactedFiles list. And use this comapcted lock every time we remove (in write mode) and select the compacted files ( in read mode). bq.On how often to call checkResetHeap, we have to be prompt because we are carrying a snapshot until we do reset? Yes. We are carrying the snapshot hence thought it is better if we can check on every scan API so that the snapshot can be released as soon as possible. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021943#comment-15021943 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq. Do it up here where compactedfiles is introduced. Ya okie. I will add it in StorefileManager where we introduce this compactedfiles. Also removed getCompactedfiles from the Store.java. bq.ImmutableList is a guava-ism? Removed the usage of guava API. bq.Yeah, below has a side-effect. Instead return what was sorted and have the caller assign: Ya done. bq.Why every '2' minutes? Any reason? 5 minutes? The TTLCleaner works every 5 mins for the secondary region replica. So before that we need to move the compacted files to the archive. Hence it is 2 mins. bq.You are looking at an enum? Why not just look at the refcount? Have a method isReferenced? Or isAliveStill? Or isInScan? I have changed these things now.. Also the logic is pretty much simple. Every time do a refCount increment while getting the files for scans. (We are going to do that only on the active Store files). Scan completiong will decrement the count. Once compacted done mark the boolean as compactedAway. So for compaction cleaner chore we only check if the count and the boolean is true. If so move it to archive. bq.So, we need Store Interface to have public Collection getCompactedfiles() { Not needed any more. Will use the StorefileManager API. I think configuring the Cleaner per region makes sense because the region can collectively issue the cleaner across all its stores. bq.Is this added in this patch? If so, closeAfterCompaction? or isCloseAfterCompaction? closeAfterCompaction() will be the name. bq.We make filesToRemove even if we may not use it it? i.e. compacted is false. We create this array to hold one file only? Then clear it? Changed this. Will directly collect the compactedfiles and just remove them. bq.Who calls this closeAndArchiveCompactedFiles ? The chore thread? Yes. bq.We do a copy here and operate on the copy? The original can change if just after we getThecompactedFiles another set of compaction gets completed. Like we do in other areas by obtaininig a read lock before getting the files for scanners similarly I have added the read lock here just for copying the compacted files. bq..down in archiveAndRemoveCompactedFiles? There are no references to the file, right? Currently the storefiles are updated under the write lock. Similarly the compacted files are also updated with the same write lock. So the removal of the compacted files are also under the write lock. This is basically to make the update to the compactedfiles list atomic. bq.{ 156ACTIVE, DISCARDED; 157 } Now no more enums. bq.doing it in StoreFile is not right.. .this is meta info on the StoreFile instances. StoreFileManager? Agree that Storefile is not right and StorefileInfo is better. I think we can do this in a seperate JIRA. But if we do in StorefileManager currently it is an interface so every impl of the SFManager should take care of this state and ref counting. (like default and StripeSFM). bq.Shoud return Collection and internally you do the Immutable stuff (good practice) Okie. I just followed what clearStoreFiles() does. bq.In StoreFileScanner, when would scanner be null? Changed this now. Considering the fact that we have two seperated lists for the current storefiles and compacted files this may no longer be needed. bq.Maybe a timer on scans? If goes on longer than a minute have it return and then clean up compacted files New issue. Ya true. New Issue. bq.Is checkFlushed the right name for the method? Changed to checkResetHeap. bq.Why is it a CompactedHFilesCleaner cleaner and not a HFileCleaner? Changed to CompactedHFilesDischarger. Does that sound good? bq.Yeah, why call checkFlushed in shipped? I rechecked this flow fully. Calling this in shipped or close() may not be needed because after the shipped() call any way we wil be calling one of the scan API like next(), reseek(), peek() etc. But regarding not calling checkResetHeap in reseek(), peek(), seek() etc. I think its okie. The reason is because every next() may call reseek or seek() internally and that time if we can ensure that if we can reset the heap on flush it will ensure that we don't hold up the snapshot for a longer time. But one downside could be if there is no flush during a scan and there are lot of reseeks() we end up checking the volatile every time. But I think it is okie? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt,
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023787#comment-15023787 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.Do you have to have 'Compacted' in there since files are discharged by the compactor always? I first thought it is a cleaner because this chore moves the compacted files alone to the archive dir. It does not deal with any other hfile. That is why I added 'Compacted'. You can review this patch except for the name of the Chore service . > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023832#comment-15023832 ] stack commented on HBASE-13082: --- Nice explanation on compactedfiles bq.// Sorting may not be really needed here for the compacted files? Yeah. They are just going to be deleted, right... No harm sorting though I suppose.. you'll delete oldest to newest? Keep CompactedHFilesDischarger name. It ties his chore to this stuff going on in the StoreManager. Could also have the method internally do these checks for null and check for zero count? 2351if (copyCompactedfiles != null && copyCompactedfiles.size() != 0) { 2352 removeCompactedFiles(copyCompactedfiles); 2353} Then all in one place. closeAfterCompaction seems misnamed. The return is whether there are references outstanding and if the file can be safely removed/closed? 470 /** 471* Closes and archives the compacted files under this store 472*/ 473 void closeAndArchiveCompactedFiles() throws IOException; We'll only close and archive if no references and if it is marked as compacted, right? Otherwise, we'll do it at a later place? So, we are going to change this in another issue? compactedFile.markCompacted(); i.e. marking a file as compacted rather than telling StoreFileManager it is compacted? So, the StoreFile has new attributes of current refcount and if compacted away. StoreFileManager has list of compacted files. StoreFileManager is in charge of the list of StoreFlies in a Store. It makes ruling on what to include in a Scan. It does clearFiles... and compaction. Shouldn't it be in charge of knowing when files are not to be included in a Scan/can be removed? When we mark a file compacted, we should do it on StoreFileManager? Can it do the refcounting? When a Scan is done, tell the StoreFileManager so it can do the refcounting? >From your earlier comment: bq. We cannot have this in StorefileInfo because we only cache the Storefile (in the StorefileManager) and not the StorefileInfo. StoreFileInfos are created every time from the hfile path. Can StoreFileManager then do refcounting and knowing what files are compacted? Would that be doable and put these attributes in one location? This should be happening internal to StoreFileManager rather than out here in HStore? 853 Collection compactedfiles = 854 storeEngine.getStoreFileManager().clearCompactedFiles(); 855 // clear the compacted files 856 if (compactedfiles != null && !compactedfiles.isEmpty()) { 857 removeCompactedFiles(compactedfiles); 858 } Gets complicated here on the end in closeAndArchiveCompactedFiles where there is a lock for the Store being used to close out storefiles And, you are just following on from what is in here already. Ugh. In StoreFile you have public boolean isCompacted() { and then later you have on the Reader isCompactedAway. These methods should be named the same (And see above where hopefully, we don't have to do this on the StoreFile itself at all). Ditto for getRefCount (Does StoreFileManager know refcount?) Looking at the additions to StoreFileManager, if compaction was kept internal to StoreFileManager, would you have to add these new methods? Does StoreFileScanner have reference to StoreFileManager? On how often to call checkResetHeap, we have to be prompt because we are carrying a snapshot until we do reset? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023099#comment-15023099 ] stack commented on HBASE-13082: --- bq. The TTLCleaner works every 5 mins for the secondary region replica. So before that we need to move the compacted files to the archive. Hence it is 2 mins. Add your reasoning as a comment. bq. Changed to CompactedHFilesDischarger. Does that sound good? No. But more descriptive (smile). Do you have to have 'Compacted' in there since files are discharged by the compactor always? bq. But one downside could be if there is no flush during a scan and there are lot of reseeks() we end up checking the volatile every time. But I think it is okie? Yeah, my suggestion was try to minimize the checks... and if possible, avoid doing multiple checks inside in the one access when one is all that is needed... one on the way out of the next call. Otherwise, you start to lose alot of the nice gains you have here. Want me to review this version of patch or you want to do another first? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, > HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021609#comment-15021609 ] stack commented on HBASE-13082: --- bq. Oops. Latest comments. I will check them too. Thanks. Sorry. Done now reviewing. Sorry for overlap. Patch is almost there I think. Lets get it in. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, > HBASE-13082_withpatch.jpg, LockVsSynchronized.java, gc.png, gc.png, gc.png, > hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021589#comment-15021589 ] ramkrishna.s.vasudevan commented on HBASE-13082: Oops. Latest comments. I will check them too. Thanks. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, > HBASE-13082_withpatch.jpg, LockVsSynchronized.java, gc.png, gc.png, gc.png, > hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021607#comment-15021607 ] stack commented on HBASE-13082: --- You think we need to do checkFlushed(); on each seek call? Would doing it on next be enough? Does checkReseek end up calling seek? (Which calls checkFlushed?) Yeah, how often we calling checkFlushed? The less the better. I love the removal of all those locks. Looks beautiful. Is checkFlushed the right name for the method? I see you set flushed to true in 670 flushed = true; in updateReaders ... so should it be checkReadersChanged or checkResetStoreFiles or something? Yeah, why call checkFlushed in shipped? Is that a good place to do it? I'm wondering why we'd call checkFlushed in any place but on the way out of a next call? Why is it a CompactedHFilesCleaner cleaner and not a HFileCleaner? Is it a Cleaner? It closes storefiles. Discharger as in it is discharging the hfile from duty... not it is eligable for archiving... and later delete? Or Acquiter I think this patch is almost there and its a very nice one. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_15.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, > HBASE-13082_withpatch.jpg, LockVsSynchronized.java, gc.png, gc.png, gc.png, > hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021585#comment-15021585 ] stack commented on HBASE-13082: --- Looking at v14: Yeah, talk about how stuff is different now in the javadoc on compactedfiles... of how now we keep the files that were compacted away around rather than immediately archive IFF referenced by scanners. Do it up here where compactedfiles is introduced. ImmutableList is a guava-ism? We want to use this in our APIs? (They tend to change faster than we do and we are on an old version of Guava -- we should update) Yeah, are these guava Immutable* in later versions of Guava? This bit repeated? Worth making a private method? ArrayList newCompactedfiles = null; 133 if (compactedfiles != null) { 134 newCompactedfiles = Lists.newArrayList(compactedfiles); 135 } else { 136 newCompactedfiles = Lists.newArrayList(); 137 } i.e. return compactedFiles == null? Lists.newArrayList(): Lists.newArrayList(compactedfiles); Yeah, below has a side-effect. Instead return what was sorted and have the caller assign: 204 private void sortAndSetCompactedFiles(List storeFiles) { 205 // Sorting may not be really needed here for the compacted files? 206 Collections.sort(storeFiles, StoreFile.Comparators.SEQ_ID); 207 compactedfiles = ImmutableList.copyOf(storeFiles); 208 } Yeah man, say what this does when it is introduced in HRegion: 815 // Start the CompactedHFileCleaner here Why every '2' minutes? Any reason? 5 minutes? Help me understand, so a file goes into the compactedfiles list but it may still have references... its refcount needs to go to zero. Only when its refcount is zero can it be removed... so when you do this: // check if the references are cleared now by seeing if the ref files are in DISCARDED state You are looking at an enum? Why not just look at the refcount? Have a method isReferenced? Or isAliveStill? Or isInScan? Hmm... So, we need Store Interface to havepublic Collection getCompactedfiles() { because.. the cleaner is on the Region level? What if the Cleaner was on the Store level? We'd not need to expose 'compactedfiles' outside of Store? That could be cleaner? Yeah, these guava'isms in our API ain't the best? Just return a List and the fact that it is a guava ImmutableList is not exposed until someone tries to change the list externally... then they'll have a surprise that'd be fine. This is a mouthful: isReadyForCloseAfterCompaction(); Is this added in this patch? If so, closeAfterCompaction? or isCloseAfterCompaction? And shouldEvictOnClose should be evictOnClose? What is going on here? {code} 886 ArrayList filesToRemove = new ArrayList(); 887 if (compacted) { 888 filesToRemove.add(f); 889 fs.removeStoreFiles(getFamily().getNameAsString(), filesToRemove); 890 // we already have the lock here. 891 getStoreEngine().getStoreFileManager().removeCompactedFiles(filesToRemove); 892 filesToRemove.clear(); 893 return null; 894 } 895 res.add(f); {code} We make filesToRemove even if we may not use it it? i.e. compacted is false. We create this array to hold one file only? Then clear it? Yeah, this don't make sense anymore, at least not where it is now: // 4. Compute new store size Who calls this closeAndArchiveCompactedFiles ? The chore thread? We do a copy here and operate on the copy? 2351 Collection copyCompactedfiles = Lists.newArrayList(compactedfiles); 2352 removeCompactedFiles(copyCompactedfiles); Can the original change in the meantime? What is isReadyForCloseAfterCompaction ? Is it not after all references have gone away? It reads like it is close of the region or Store but sounds like it is close of the referencing Scanner? Why a lock here? lock.writeLock().lock(); ..down in archiveAndRemoveCompactedFiles? There are no references to the file, right? You are only going to expose one of the below, right?, in new patch? 476 /** 477* Closes and archives the compacted files under this store 478*/ 479 void closeAndArchiveCompactedFiles() throws IOException; 480 481 /** 482* Close and archive the compacted files under this store 483* @param compactedStorefiles the list of compacted files 484*/ 485void closeAndArchiveCompactedFiles(Collection compactedStorefiles) 486 throws IOException; Yeah, do we need these at all since the StoreFile is being managed at a higher level? 153 154 // Indicates whether the current store file is compacted or not 155 private enum
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15015577#comment-15015577 ] ramkrishna.s.vasudevan commented on HBASE-13082: Thanks for the reviews Stack. bq. Need to update comment and rename variable else we'll stay confused. Will update the comment and say that 'these files are not included in reads'? The name 'compactedfiles' is still better? bq.Only one thread involved here? bq.public ImmutableCollection clearCompactedFiles() { Yes it will be single threaded. Used only while closing the store files. bq.Suggest you change this method to return the Collection rather than set the data member internall: i.e remove the 'set' part from sortAndSetCompactedFiles Do the set on return. Methods like this with 'side effects' can be tough to follow. Okie. bq.You know the size of the array to allocate here: 124 newCompactedFiles = Lists.newArrayList(); ... Yes done some refactoring there. bq. DISCARDED and ACTIVE? We will make it ACTIVE and COMPACTEDAWAY (as you suggested in another comment).? bq.I don't follow how we were checking for references when we went to merge but in this patch it changes to a check for compactions: I think you were referring to some old patch. The patch that was latest was _14. bq.Fix formatting here abouts if you are doing a new patch: if (!SystemUtils.IS_OS_WINDOWS) { This is not there in the latest patch. bq.Where we explain what it does? There is a javadoc explaining what it does. bq.as to be public because its in the Interface? Does it have to be: We access Store not directly from HStore but from Store.java. So it is better to add there and anyway this is going to be common for that store. bq.Might want to note that expectation is that access on methods like this one are single-threaded: clearCompactedFiles Okie. bq.Do you have to stop the chore in the region or store close? Before you do your close? Yes good catch. Done now. bq.void closeAndArchiveCompactedFiles(List compactedStorefiles) throws IOException; In my next version will remove this but keep the other one void closeAndArchiveCompactedFiles() throws IOException; bq.Do they have to be so specific? Can they be made more generic? Generic in the sense? bq.t seems like the compacted or not belongs in StoreFileInfo rather than in StoreFile. Is this fact persisted across open/close? We cannot have this in StorefileInfo because we only cache the Storefile (in the StorefileManager) and not the StorefileInfo. StoreFileInfos are created every time from the hfile path. bq.Maybe 'compactedAway'? Ya we can have ACTIVE and COMPACTED_AWAY.? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018970#comment-15018970 ] stack commented on HBASE-13082: --- bq. Will update the comment and say that 'these files are not included in reads'? The name 'compactedfiles' is still better? Yeah. When I see the change in context, compactedfiles is good name. You need to do a bunch of explaining of what there are though... and here is a good place to do it ("List of the files that were replaced by new ones on compaction; will be deleted once there are no longer any references by ongoing scanners to these files..." bq. ...Used only while closing the store files. One threaded because close is done with a lock held, right that's good. Might just say on the method that caller needs to ensure single-thread-at-a-time access. bq. I think you were referring to some old patch. The patch that was latest was _14. Argh Was looking at v9. My fault. bq. Generic in the sense? Its ok. You removed one of the two methods. That is good enough. bq. We cannot have this in StorefileInfo because we only cache the Storefile (in the StorefileManager) and not the StorefileInfo. StoreFileInfos are created every time from the hfile path. Hmmm... this don't seem right. Ok for now but we should address this. If SFI, should be not confined in its use... we keep meta data sometimes in SFI and then other times we modify SF. bq. Ya we can have ACTIVE and COMPACTED_AWAY.? Do we even need two states? Could we not just stamp a StoreFile STALE or COMPACTED_AWAY? A SF has this attribute or it does not. Do we even have to stamp the SF with this property at all? We are keeping a running list up in SFManager. Why we need this? Let me go back to v14. Will be back w/ more comments. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15015253#comment-15015253 ] ramkrishna.s.vasudevan commented on HBASE-13082: Did some JMH microbenchmark test by executing fixed set of code in a loop under 4 different conditions 1) With a reentrant lock 2) with a synchronized block 3) with a volatile boolean check 4) with no locks/synchronizations/volaties This is the JMH result that I got {code} Benchmark Mode Cnt Score Error Units LockVsSynchronized.operationUnderLock thrpt957.696 ± 0.322 ops/s LockVsSynchronized.operationUnderSynchronized thrpt944.692 ± 0.333 ops/s LockVsSynchronized.operationUnderVolatile thrpt9 1056.632 ± 5.584 ops/s LockVsSynchronized.operationWithoutLockthrpt9 1428.580 ± 5.372 ops/s {code} So we can see that when our operations are mostly single threaded and only at times we need the parallelism like when we want to reset the scanner stack, going with volatile is significantly faster than going with a read lock. (though it is not as fast as without lock). > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15015321#comment-15015321 ] stack commented on HBASE-13082: --- bq. // Start the CompactedHFileCleaner here This is new? Where we explain what it does? This comment is no longer correct, right? // notify scanners, close file readers, and recompute store size Has to be public because its in the Interface? Does it have to be: public Collection getCompactedfiles() { Might want to note that expectation is that access on methods like this one are single-threaded: clearCompactedFiles Do you have to stop the chore in the region or store close? Before you do your close? Not your change but remove it since it obviously wrong now: // 4. Compute new store size Yeah, this needs better explaining especially if in an Interface 67/** 68 * Get the compacted store files 69 * @return the list of compacted files 70 */ 71Collection getCompactedfiles(); You need these in the Interface? 475 boolean isPrimaryReplicaStore(); 476 477 /** 478* Closes and archives the compacted files under this store 479*/ 480 void closeAndArchiveCompactedFiles() throws IOException; 481 482 /** 483* Close and archive the compacted files under this store 484* @param compactedStorefiles the list of compacted files 485*/ 486 void closeAndArchiveCompactedFiles(List compactedStorefiles) throws IOException; Do they have to be so specific? Can they be made more generic? Yeah, this is hard... we have both nomenclatures going on ... compacted and ACTIVE, DISCARDED; 157 } This is in the StoreFile. It seems like the compacted or not belongs in StoreFileInfo rather than in StoreFile. Is this fact persisted across open/close? Maybe 'compactedAway'? Got as far as StoreFileManager. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15015281#comment-15015281 ] stack commented on HBASE-13082: --- An order of magnitude improvement * 2. Not bad [~ramkrishna]. Thanks for the jmh patch too. Let me look at the actual patch bq. The NON_COMPACTED indicates that it is an ACTIVE file which can be used in reads. COMPACTED indicates that it is no longer an active file and its contents are already copied to a new file. In this patch you still talk of 'compacted'... /** 58 * List of compacted files inside this store 59 */ 60private volatile ImmutableList compactedfiles = null; These are files that are not to be included in a Scan, right. Need to update comment and rename variable else we'll stay confused. BUT, reading the patch and code I see why you are calling them compacted and think you are right to do do. You just need to explain better what they are when there is no context around (e.g. in comment on data member... no need to explain when inside addCompactionResult because plenty context here). Only one thread involved here? public ImmutableCollection clearCompactedFiles() { Suggest you change this method to return the Collection rather than set the data member internall: i.e remove the 'set' part from sortAndSetCompactedFiles Do the set on return. Methods like this with 'side effects' can be tough to follow. nit: You know the size of the array to allocate here: 124 newCompactedFiles = Lists.newArrayList(); ... Is the below the new 'terminology'? 6585// check if the references are cleared now by seeing if the ref files are in DISCARDED state 6586// There should be only one file in the ACTIVE state and that is the new file i.e. DISCARDED and ACTIVE? I don't follow how we were checking for references when we went to merge but in this patch it changes to a check for compactions: 6593List compactedFiles = new ArrayList(); 6594for (Store s : dstRegion.getStores()) { 6595 compactedFiles.clear(); 6596 if (!dstRegion.isCompacted((HStore) s, dstRegion.getRegionFileSystem())) { 6597throw new IOException("Merged region " + dstRegion 6598+ " still has files that are not yet compacted, is compaction canceled?"); 6599 } nit: change this 6630if (nonCompactedFiles > 1) { 6631 return false; 6632} 6633return true; to: return nonCompactedFiles <= 1; Fix formatting here abouts if you are doing a new patch:if (!SystemUtils.IS_OS_WINDOWS) { Got to HStore. Will be back. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15006602#comment-15006602 ] ramkrishna.s.vasudevan commented on HBASE-13082: Ping for reviews. !!! [~saint@gmail.com], [~lhofhansl], [~anoop.hbase]. Once this is in we can try to do the sub task of reseeking only to the newly flushed file on scanner reset instead of all the files. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, > gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000264#comment-15000264 ] ramkrishna.s.vasudevan commented on HBASE-13082: Withpatch === Avg: 324secs Withoutpatch === Avg : 410secs Using the command {code} ./hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --oneCon=true --caching=5000 --filterAll=true --rows=1 scanRange1 50 {code} With YCSB scanRange can see that there is no significant gain because of network bottleneck but there is no degrade. Withpatch [OVERALL], Throughput(ops/sec), 3723.210569152307 Withoutpatch == [OVERALL], Throughput(ops/sec), 3752.08334427691 > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1432#comment-1432 ] ramkrishna.s.vasudevan commented on HBASE-13082: Ping for reviews !! Will do some more tests with YCSB today. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14996209#comment-14996209 ] Hadoop QA commented on HBASE-13082: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12771291/HBASE-13082_14.patch against master branch at commit 1cbcf1175e6ce497936f12c60fb2e897833ace39. ATTACHMENT ID: 12771291 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 46 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16455//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16455//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16455//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16455//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, > HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, > HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, > HBASE-13082_9.patch, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995572#comment-14995572 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12771229/HBASE-13082_13.patch against master branch at commit 305ecaf224340b0f6e248d4fdabf0b53e1cd3b03. ATTACHMENT ID: 12771229 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 46 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1728 checkstyle errors (more than the master's current 1726 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16451//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16451//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16451//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16451//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_1_WIP.patch, > HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993688#comment-14993688 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12771006/HBASE-13082_12.patch against master branch at commit bfa36891901b96b95d82f5307642c35fd2b9f534. ATTACHMENT ID: 12771006 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 46 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1728 checkstyle errors (more than the master's current 1726 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.io.TestHeapSize Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16429//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16429//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16429//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16429//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_12.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991542#comment-14991542 ] ramkrishna.s.vasudevan commented on HBASE-13082: I just saw this {code} if (FSUtils.WINDOWS) { // compaction cannot move files while they are open in secondary on windows. Skip remaining. return; } {code} In one of the test case - so may this is what I was observing. Anyway will verify once in linux box also. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992285#comment-14992285 ] Enis Soztutar commented on HBASE-13082: --- bq. In one of the test case - so may this is what I was observing. Anyway will verify once in linux box also. Yeah, when FS is local filesystem on windows, the rename while having readers does not work. If fs is hdfs on top of windows, it works. This particular unit test is using the local file system, hence the workaround needed, but in production, we are not using ntfs as a local fs, we are using hdfs on top of ntfs. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14993245#comment-14993245 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.The locks are not the problem, it's the memory barriers on every call to next and pretty that cause the performance issue. The locks are (almost) never contended Yes Lars. I too agree with this. I am trying to profile before I could come to any conclusion on the memory barrier that this volatile introduces. Ideally for the compaction cases we don't even need the notifying mechansim and can always be going on without any locks or resets of the heap but for flushes we may need it because if we don't reset then the flushed snapshot cannot be GCed and if this is going to be bigger scans then may lead to GC issues. Other than that as you always say Scanner is >99% single threaded. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991667#comment-14991667 ] Lars Hofhansl commented on HBASE-13082: --- bq. The volatile check is still a memory fence barrier similar to the lock although much less expensive. Right. The locks are not the problem, it's the memory barriers on every call to next and pretty that cause the performance issue. The locks are (almost) never contended. The point was to lock once for the scan, and then use the versioned access proxy without locking. Cool that even with the volatile it's faster, though! > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990891#comment-14990891 ] Enis Soztutar commented on HBASE-13082: --- The volatile check is still a memory fence barrier similar to the lock although much less expensive. Were you able to profile this implementation similar to Lars' numbers above? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990872#comment-14990872 ] Enis Soztutar commented on HBASE-13082: --- bq. HDFS does not allow to rename if there are any active readers. I don't think this is how hdfs works. The FileLink.FileLinkInputStream is especially for this particular reason where an open file may get moved to a different directory. The way region replicas work is that they rely on having a high TTL for the hfiles so that even if primary replica compacts the file away and moves the file to the archive directory, the secondary replica will be able to still read from the archive location. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991079#comment-14991079 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq. The way region replicas work is that they rely on having a high TTL for the hfiles so that even if primary replica compacts the file away and moves the file to the archive directory, the secondary replica will be able to still read from the archive location. Is it? So is that because my tests are running in the Windows environment (I mean my Junits) this issue occurs. But one thing is that after this patch comes in - the Secondary replica does not close the reader when an update happens on the secondary region. I have not yet run this secondary replica thing in my Linux box still but I found even in linux some junits failed due to this behaviour. {code} - // let the archive util decide if we should archive or delete the files - LOG.debug("Removing store files after compaction..."); - for (StoreFile compactedFile : compactedFiles) { -compactedFile.closeReader(true); - } - if (removeFiles) { -this.fs.removeStoreFiles(this.getColumnFamilyName(), compactedFiles); - } {code} bq.The volatile check is still a memory fence barrier similar to the lock although much less expensive. Were you able to profile this implementation similar to Lars' numbers above? Yes, I did some basic PE tool runs. I got 6 to 7% improvement. This is the case where it is pure read (scan workload on the server side with filterAll enabled) and there were no compactions or flushes. I can do some more tests on a bigger load and I think the main advantage of this patch is going to be with scans rather than gets if am not wrong. The other advantage of this patch is that because we use the older files only after compactions we avoid the reset and reseek on those new compacted files and other files already in heap. Discussing with Anoop, even for flushes we can ensure that instead of reset we can only allow the heap to be changed and ensure only the new flushed file we do a seek to the current key the scaner was at which the memstore was flushed. But that can be done as a later improvement. Thanks Enis for your reviews. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991115#comment-14991115 ] ramkrishna.s.vasudevan commented on HBASE-13082: I think [~enis] your point makes sense. In current approach, if the primary compacts a file it closes the reader and archives it but still the secondary may be having a reader on that file but the archiving process is successful. So I think even in the current updated design things should work as now. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, > HBASE-13082_1_WIP.patch, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, > HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983917#comment-14983917 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12769932/HBASE-13082_9.patch against master branch at commit 683f84e6a217dfd872e5f1be82c7aa4cdf232ec1. ATTACHMENT ID: 12769932 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 38 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1731 checkstyle errors (more than the master's current 1730 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16332//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16332//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16332//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16332//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, HBASE-13082_9.patch, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982585#comment-14982585 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12769780/HBASE-13082_9.patch against master branch at commit 23fa18184cb68ca05246beb2189f8801200bdd7c. ATTACHMENT ID: 12769780 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 38 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16307//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, > HBASE-13082_9.patch, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982561#comment-14982561 ] ramkrishna.s.vasudevan commented on HBASE-13082: After doing the code changes and testing it in a single node cluster found that having a single List in the StorefileManager and adding all the newly compacted files and not removing the older one till the compaction chore runs - creates some problems like in flush where we try to count the storefiles count and decide if there are so many files and wait for the compaciton to compact it. Now if we have one single list we are bound to iterate thro the list and find if each of the file is already compacted by checking its state and then return the count based on that. It may be costlier operation and also considering the fact that all the test cases depend on the count - the patch becomes bigger. So instead if we maintain two lists one for the actual store files and other one for the compacted files and the compaction cleaner will remove the files from the comapcted list - things should work fine. Just attaching a patch for QA - some more things am just verifying. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980263#comment-14980263 ] ramkrishna.s.vasudevan commented on HBASE-13082: One interesting part while doing this is that of the Secondary replicas. As per the idea of read replicas the same underlying store file will be having a reader from the Secondary region always opened. In the current code whenever we refreshStoreFiles we close the reader that we think is no longer needed. So the archiving of the file if done by the primary region will be successul. But now after this change we cannot do that because we need the file to be available and only the compacted file cleaner can close the reader. Now there are two case If suppose the compacted file cleaner has not closed the reader then the secondary region will have a reader opened on the file and so any external archive option will fail. Similarly though the compacted file cleaner will close the reader and try to archive the compacted file still if there is a reader opened by the secondary region this archival will fail. Anyway the next cleaner thread will make it successful. So may be need to see if there are any other implications. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979769#comment-14979769 ] ramkrishna.s.vasudevan commented on HBASE-13082: Regarding the memstore files getting updated in the current scanner, I think we can solve it this way - let me try to explain here In the current code {code} this.lock.writeLock().lock(); try { this.storeEngine.getStoreFileManager().insertNewFiles(sfs); if (snapshotId > 0) { this.memstore.clearSnapshot(snapshotId); } } finally { // We need the lock, as long as we are updating the storeFiles // or changing the memstore. Let us release it before calling // notifyChangeReadersObservers. See HBASE-4485 for a possible // deadlock scenario that could have happened if continue to hold // the lock. this.lock.writeLock().unlock(); } notifyChangedReadersObservers(); {code} As you can see that before we notify the observer we clear the snapshot. So this notification wil try to hold the storescanner's lock and do the heap nullify and the scanner will continue to reset the heap seeing it to be null every time it enters any scanner API like next(), seek() etc. So here if there is an ongoing scanner going on assume it is in StoreScanner#next() it would have already held the lock and so the notify has to wait for nullifying the heap. So though the snapshot is cleared the storescanner operates on the older reference of the snapshot which cannot be GCed till the current scanner finishes the current API operation. So we can change this logic slightly in such a way that let the flush code set a volatile boolean in all the StoreScanner ( the observers). Note that the flush also has cleared the snapshot but still the reference cannot be GCed. Now the current on going scanner every time it enters the scan APIs like next(), seek(), reseek() etc. check for the volatile boolean if it is true. If so, it means that the flush has happened so go ahead and nullfiy the heap and create a new heap. All this will be done by the Storescanner itself. Note that within a storescanner it is always single threaded. Now even while entering a scan API if the scanner does not see the volatile boolean as true - it will continue to use the existing heap only and the scan will still work because we have the older memstore snapshot ref that is not yet GCed - (similar to how things work now). So this is more of a lazy model and there may be lag in nullifying the heap and resetting it but should be okie considering the existing way things work. Thoughts?!! > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964695#comment-14964695 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.This status is in reader? It suits better in StoreFile no? The reason for doing this (I just now checked the code once again) is because when we create a scanner on the Storefile we do it on the reader associated with the Storefile and not on the store file. So hence do determine whether this store file can be used in the scanner or not the state and ref count if it is in the reader it will be easy to use that info. Already info like isBulkLoad and setSeqId everything is happening on the reader now and not on the StoreFile. So may be if we can introduce a getStoreFileScanner at the Storefile level rather than the Reader as how it is currently is, then we can make this change of adding the ref count and the state to the StoreFile. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963297#comment-14963297 ] Anoop Sam John commented on HBASE-13082: So we wont reset the heap of the files during the scan. Compaction case is fine.. the versioned approach and we wont move the old compacted files until the scanners finished will do the solution. What about flush? Once the flush is finished, we will have to clear the snapshot of cells in Memstore no? Then we will have to open this newly added file for reading.. If we dont open it, we can not release the memstore snapshot. Means we will have to keep the cells as long as the scanner completes! That will not be acceptable IMO. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963400#comment-14963400 ] Anoop Sam John commented on HBASE-13082: Good doc explaining the change. bq. Along with the ref count, we also have CompactionStatus (COMPACTED and NON_COMPACTED) added to these store file readers This status is in reader? It suits better in StoreFile no? Even the ref count is in reader? I would say it is better suited in StoreFile. That call the state as Compacted we better call it as a file to be discarded? When we mark a file as compacted the confusion can be like whether this file is a file created out of compaction. (?) bq. Ensure that a background thread runs periodically runs that scans the list of store files and checks for the state as COMPACTED... This is a new Chore thread adding to the system.. Mention about it clearly. It will be one thread per RS. Also what is its interval? Is it configurable? How? bq.Hence it requires us to add new APIs which ensures that a split can go ahead even if reference files are present but they are in the COMPACTED state hmm.. making this more complex :-) > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963502#comment-14963502 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.What about flush? Once the flush is finished, we will have to clear the snapshot of cells in Memstore no? Then we will have to open this newly added file for reading.. If we dont open it, we can not release the memstore snapshot. We do open the reader for sure. But one thing may be to check is that if we open but still the current scanner heap is not reset will it have any impact on the current scan is what needs to be checked? Because during flush the reads can still be served from the snapshot. Only after flush is a point to be noted. bq.This status is in reader? It suits better in StoreFile no? The reader is the common object here. Hence was referencing it from there. Initially had it in storefile only but felt that StoreFile is more volatile. Let me check. Can be done. Infact I was also not very sure of having the state in the reader. We can change the states no problem. bq.This is a new Chore thread adding to the system.. Mention about it clearly. It will be one thread per RS. Also what is its interval? Is it configurable? How? For to add that config details. It is per store and the chore is started by the HStore and not the RS. bq.Hence it requires us to add new APIs which ensures that a split can go ahead even if reference files are present but they are in the COMPACTED state This is bit of sticky area. In case of merge I have tried to forcefully clear the compacted files. May be we need to do the same with split also. But in split do we explicitly call compact? I was not pretty sure on that. But in merge we do. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964005#comment-14964005 ] stack commented on HBASE-13082: --- The one pager is great ([~lhofhansl]!). Nice summary of the issue up top. Explanation of the approach your fix takes helps. I am trying to understand the COMPACTED vs NON_COMPACTED state. Is NON_COMPACTED a freshly-flushed file? Is a COMPACTED file a file that is made up of N other storefiles and you want to make sure the scan doesn't include duplicated info -- the compacted file and the compactions inputs? Or I think you are saying that when a file is COMPACTED, then its content can be found in another file and so it should not be included in a scan? Should the state be COMPACTED_AWAY or REPLACED (by file...). Do we need the NON_COMPACTED state? It is the default. No need to call this state anything (Active?) How can the following happen? "Now in case of the versioned storefile approach change, there is a chance that there are reference files which are not yet archived after compaction because of the change in the store file management." Where will you write the COMPACTED_AWAY state? In memory or into the file? (I suppose you can't write it to the file because it is already written) Doc is great [~ram_krish] > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964031#comment-14964031 ] stack commented on HBASE-13082: --- bq. When we mark a file as compacted the confusion can be like whether this file is a file created out of compaction. Yeah, that was my first thought too... same as Anoop. Is that a new Chore per Store per Region? Every two minutes seems like a long time to hold on to files? Yeah, this is a good point by [~anoop.hbase]: bq. What about flush? .So we wont change the store file's heap during the scan is not really possible? The other thing to consider is getting bulk loaded files in here while scans are going on. Seems like they could go in on open of a new scan. That'll work nicely. A file will be bullk loaded but won't be seen by ongoing scans. The hard thing then is what to do about flush (we've been here before!) On flush, what if we let all Scans complete before letting go the snapshot? More memory pressure. Simpler implementation. Otherwise, flush registers it has happened at the region level. Scanners check for flush events every-so-often (we already have checkpoints in the scan to ensure we don't go over size or time constraints... could check at these times) and when they find one, they swap in the flushed file. When all Scanners have done this, then we let go the snapshot. This might be a different sort of event to the one described in the doc here.. where we swap in compacted files on new scan creation.. but yeah, implementation would be cleaner if swap in of flushed files and compacted files all happened in the one manner. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963860#comment-14963860 ] ramkrishna.s.vasudevan commented on HBASE-13082: But again if handling flushes is a problem then resetting the heap should be done at any case. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963858#comment-14963858 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.So we wont change the store file's heap during the scan is not really possible? I think our comments were published at the same time and in between JIRA was down. Still I think we can do this for a pure bulk loaded case. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964476#comment-14964476 ] Anoop Sam John commented on HBASE-13082: Not getting cells from bulk loaded files which is loaded in btw a scan seems ok and seems that is the correct way. Regarding flush bq.On flush, what if we let all Scans complete before letting go the snapshot? More memory pressure. Simpler implementation Yes simple but heap pressure will be much more. And when the scan will get over is not really in our hands. I dont think we can think of doing this :-) The other option you said seems fine. We need to see how this will look.. Will be much more complex.. So how much is the perf gain we get by fully avoiding this lock? Worth of doing all these complex stuff. I think it is still worth as the gain may be high. And this will allow to make full use of the cached blocks of compacted files is another adv. Again nice doc Ram.. Just in 2 mins we can understand the whole change and why to do it.. Getting the impl details reading through the patch is much harder. Thanks man. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964555#comment-14964555 ] Duo Zhang commented on HBASE-13082: --- I mean maybe we could have a fast path without locking for normal reading since for most scan operation there is no flush involved. And if a flush happen, then we change back to use a lock. I think this is something like a BiasedLock? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964567#comment-14964567 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.I mean maybe we could have a fast path without locking for normal reading since for most scan operation there is no flush involved. I thought on this but not sure run time if we can decide this. But may be if we can say that use my scan only on files then this feature can easily be used. Already we have something like InternalScan#checkOnlyStoreFiles(). So that can be an indicator. But need to see how the client can specify this. So we can have two types of StoreScanner one which is lock free and other one with lock. Ideally only for the flush case we need this lock behaviour whereas the compactions and bulkload can still go with the versioned approach. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964536#comment-14964536 ] ramkrishna.s.vasudevan commented on HBASE-13082: We can rename the states to ACTIVE and DISCARDED or may be IN_ACTIVE something of this sort. The NON_COMPACTED indicates that it is an ACTIVE file which can be used in reads. COMPACTED indicates that it is no longer an active file and its contents are already copied to a new file. bq.Is that a new Chore per Store per Region? Every two minutes seems like a long time to hold on to files? Ya it is a per store per region. But I think we need to pass the region details for selecting only those files pertaining to that region. Or do you think it is better to start this per region? bq.Now in case of the versioned storefile approach change, there is a chance that there are reference files which are not yet archived after compaction because of the change in the store file management." My thinking was that once split is done - we will have reference files still they are removed by compaction. Now since we don't immediately remove the compacted files and depend on the cleaner to do it, there is a chance that when the next split occurs (may be due to user issuing split) there may be reference files but they are actually in the COMPACTED state (here i mean they are IN_ACTIVE or DISCARDED state) and we can safely proceed with splits? bq.Otherwise, flush registers it has happened at the region level. Scanners check for flush events every-so-often (we already have checkpoints in the scan to ensure we don't go over size or time constraints... could check at these times) and when they find one, they swap in the flushed file. When all Scanners have done this, then we let go the snapshot. This seems fine but again swapping in the flushed file into the current scan will need to us to hold the locks? bq.So how much is the perf gain we get by fully avoiding this lock? Worth of doing all these complex stuff. I think it is still worth as the gain may be high. I was able to get 10-12% latency gain just by using the PE tool with default load. In that case all the contents were flushed and only pure reads/scans were issued. bq.So I think resetting heap is necessary when there is a flush, and the problem is how to do it without a expensive overhead? Maybe something like a Biased Locking? Biased locking - yes. You mean a new type of lock implementaion? By default JVM works with biased locking enabled or did you mean something else? In latest processors Re-entrant locking is considered to be much efficient. http://mechanical-sympathy.blogspot.in/2011/11/java-lock-implementations.html. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964481#comment-14964481 ] Duo Zhang commented on HBASE-13082: --- {quote} On flush, what if we let all Scans complete before letting go the snapshot? More memory pressure. Simpler implementation. {quote} How much time a {{RegionScanner}} can continue sometimes depends on the user of HBase. Think of a {{Filter}} that matches nothing. A {{RegionScanner}} could scan the whole region if that {{Filter}} is applied. So I do not think having a ref count on snapshot is a good idea. So I think resetting heap is necessary when there is a flush, and the problem is how to do it without a expensive overhead? Maybe something like a Biased Locking? Thanks. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963561#comment-14963561 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.Because during flush the reads can still be served from the snapshot. Only after flush is a point to be noted. This is a concern. Though we are able to complete the flush from the snapshot since the files cannot be added to the storescanner's heap that could be a problem. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963558#comment-14963558 ] Anoop Sam John commented on HBASE-13082: bq.But one thing may be to check is that if we open but still the current scanner heap is not reset will it have any impact on the current scan is what needs to be checked? Because during flush the reads can still be served from the snapshot Yes during the process of flush (before the new file got committed) we will serve the scan from the snapshot. Then we will call clearSnapshot which will remove the cells from memory. So we must include the new file into the heap. Else we can not release the snapshot until the scanner is done. That will be too costly for us as we will tend to keep the cells in memory for much longer time and also which is not in our control. So we wont change the store file's heap during the scan is not really possible? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961093#comment-14961093 ] ramkrishna.s.vasudevan commented on HBASE-13082: Next week beginning I will upload a one pager. Thanks Stack. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14960548#comment-14960548 ] ramkrishna.s.vasudevan commented on HBASE-13082: https://reviews.apache.org/r/39393/- RB link. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14960851#comment-14960851 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12767056/HBASE-13082_4.patch against master branch at commit 30cf4e3761e95f3cceaf8c1aa154695e18198cd6. ATTACHMENT ID: 12767056 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 102 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1750 checkstyle errors (more than the master's current 1747 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + private Map>> doClose(final boolean abort, MonitoredTask status) +public Pair >> call() throws IOException { {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.regionserver.compactions.TestCompactionWithThroughputController org.apache.hadoop.hbase.regionserver.TestStore org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents org.apache.hadoop.hbase.regionserver.wal.TestLogRolling Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16065//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16065//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16065//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16065//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14960961#comment-14960961 ] stack commented on HBASE-13082: --- This is an important fix. I started to look at the rb and then got lost on why we were doing certain things. I started to read over the issue but the issue has been going on for a long time with us flip/flopping over whether this a good idea or not, lists of why this issue is good and then downsides, and then stuff like the scan implementation changed while this issue was going on. I think a one-pager is in order so can see in one place current approach, upsides and downsides and that we have this before we do the code (or at least, it looks like with this last patch, that we are trying an experiment -- which is sweet... lets factor the findings back into the one-pager). With a one-pager, we'll be able to do a better review. Thanks. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, > gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958882#comment-14958882 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12766763/HBASE-13082_3.patch against master branch at commit e7defd7d9a76f44e3089db3fe522fe400fe6dcd7. ATTACHMENT ID: 12766763 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 78 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.compactions.TestCompactionWithThroughputController org.apache.hadoop.hbase.regionserver.TestCompactionWithCoprocessor org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster org.apache.hadoop.hbase.util.TestMergeTool org.apache.hadoop.hbase.regionserver.TestMobStoreCompaction org.apache.hadoop.hbase.client.TestFromClientSide3 org.apache.hadoop.hbase.regionserver.TestStore org.apache.hadoop.hbase.regionserver.TestHRegionReplayEvents org.apache.hadoop.hbase.regionserver.wal.TestLogRolling org.apache.hadoop.hbase.regionserver.TestCompactionState org.apache.hadoop.hbase.util.TestMergeTable org.apache.hadoop.hbase.regionserver.TestCompaction org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster {color:red}-1 core zombie tests{color}. There are 5 zombie test(s): at org.apache.hadoop.hbase.backup.TestHFileArchiving.testArchiveOnTableFamilyDelete(TestHFileArchiving.java:322) at org.apache.hadoop.hbase.replication.TestReplicationSmallTests.testDisableEnable(TestReplicationSmallTests.java:308) at org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager.test(TestReplicationWALReaderManager.java:190) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation.testExcessReservationWillBeUnreserved(TestContainerAllocation.java:359) at org.apache.hadoop.hbase.replication.TestReplicationSyncUpTool.testSyncUpTool(TestReplicationSyncUpTool.java:178) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/16025//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/16025//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/16025//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/16025//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, gc.png, gc.png, gc.png, > hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956369#comment-14956369 ] Hadoop QA commented on HBASE-13082: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12766471/HBASE-13082_2_WIP.patch against master branch at commit e5580c247c06d8c708b92e96a5622853ec06a77d. ATTACHMENT ID: 12766471 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 71 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1756 checkstyle errors (more than the master's current 1748 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +Collection storeFileInfosForRegion = fs.getStoreFiles(store.getColumnFamilyName()); +int cleanerInterval = conf.getInt("hbase.hfile.compactions.cleaner.interval", 5 * 60 * 1000); +return StoreUtils.hasCompactedReferences(this.storeEngine.getStoreFileManager().getStorefiles()); +boolean mayBeStuck = (nonCompactedCandidateSelection.size() - filesCompacting.size() + futureFiles) +LOG.debug("Selecting compaction from " + nonCompactedCandidateSelection.size() + " store files, " + {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestDefaultCompactSelection {color:red}-1 core zombie tests{color}. There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15997//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15997//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15997//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15997//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15997//console This message is automatically generated. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082_1_WIP.patch, > HBASE-13082_2_WIP.patch, gc.png, gc.png, gc.png, hits.png, next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946193#comment-14946193 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.So, bulk load won't show mid-scan... you have to get to the end? That would be fine. New files added by bulk load during an on going scan won't be included in the current scan. The new files will be included only when a new scan is started. Currently during a course of a scan the bulk loaded files can be included too. bq.On the patch, can we get more of Lars comments in on what is going on Could we get rid of some of these getReaderLocks too... in hstorefile, in hstore, etc would be good to not let this stuff out if we can. Yes we can. I have a working version of a patch. Some issues with Merge regions and splits. Working on them. All the locks will be removed that are currently in StoreScanner and also the notifyReaders() will also go away totally. Just because we are updating the readers during a course of a scan we needed all those locks. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946198#comment-14946198 ] stack commented on HBASE-13082: --- bq. New files added by bulk load during an on going scan won't be included in the current scan. The new files will be included only when a new scan is started. Currently during a course of a scan the bulk loaded files can be included too. That makes more sense than our current implementation IMO. bq. Yes we can. I have a working version of a patch. Some issues with Merge regions and splits. Working on them. All the locks will be removed that are currently in StoreScanner and also the notifyReaders() will also go away totally. Just because we are updating the readers during a course of a scan we needed all those locks. Sweet. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14945926#comment-14945926 ] stack commented on HBASE-13082: --- bq. But in case of bulk loaded files, currently in between a scan if a new file is bulk loaded it gets included, so after this it will not be. is that behavioral change fine? Sorry, say more [~ram_krish]. So, bulk load won't show mid-scan... you have to get to the end? That would be fine. On the patch, can we get more of Lars comments in on what is going on Could we get rid of some of these getReaderLocks too... in hstorefile, in hstore, etc would be good to not let this stuff out if we can. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940078#comment-14940078 ] ramkrishna.s.vasudevan commented on HBASE-13082: bq.I'm a little late to the party but this versioned data structure sounds neat. If I'm understanding correctly, it sounds like this versioned data structure would also allow us to remove the lingering lock in updateReaders (and potentially remove updateReaders completely?). I totally agree and can see this is going to happen after this change is done. Just wanted to confirm one thing , In case of compaction it is old data but in new file. So there is no problem. But in case of bulk loaded files, currently in between a scan if a new file is bulk loaded it gets included, so after this it will not be. is that behavioral change fine? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934759#comment-14934759 ] Lars Hofhansl commented on HBASE-13082: --- Feel free [~ram_krish]. I just happened to look at the code again a few days ago, we need to fix this. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934576#comment-14934576 ] ramkrishna.s.vasudevan commented on HBASE-13082: Thanks Stack. Better to go ahead with the versioned way of reading. Will look into it and get back here. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934572#comment-14934572 ] ramkrishna.s.vasudevan commented on HBASE-13082: Any one planning to work on this. I am plan to take this up if there is no one working on it currently. Thoughts!!!? > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934574#comment-14934574 ] stack commented on HBASE-13082: --- Please bring it home [~ram_krish] Its an important one. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left of. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to be remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() and > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564821#comment-14564821 ] Anoop Sam John commented on HBASE-13082: [~larsh] are you working on the idea of scanners continue to read from old files when a compaction happened? Otherwise I can look at it. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14565613#comment-14565613 ] Lars Hofhansl commented on HBASE-13082: --- I am backed up at the moment, would love if you had a look at it. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503889#comment-14503889 ] Jonathan Lawlor commented on HBASE-13082: - I'm a little late to the party but this versioned data structure sounds neat. If I'm understanding correctly, it sounds like this versioned data structure would also allow us to remove the lingering lock in updateReaders (and potentially remove updateReaders completely?). Instead of having to update the readers, the compaction/flush would occur in the background and be made visible to new readers via a new latest version in the data structure, is that correct? In other words, would the introduction of this new versioned data structure make StoreScanner single threaded (and thus remove any need for synchronization)? Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504350#comment-14504350 ] Lars Hofhansl commented on HBASE-13082: --- Correct. No more locking other than to fix the current version of access data structure at the beginning of the scan, and StoreScanner would indeed be single threaded (which is it 99.% of the already :) ). That would be bigger change. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377564#comment-14377564 ] Anoop Sam John commented on HBASE-13082: If scanners can continue read from the old files, it can continue to use the data from cache (if that file blocks were in cache). Otherwise a compaction will result in need for scan from a new file whose data is not in cache. (Assume no cache on write) I was also thinking on this especially wrt Cache usage.. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14378784#comment-14378784 ] stack commented on HBASE-13082: --- I ran this patch on my rig and see ~6-7% more throughput (high concurrency, scan1000, medium-sized rows), 7750k/s vs 8250k/s. No big CPU difference that I can discern... perhaps smoother with this patch in place. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14378969#comment-14378969 ] Lars Hofhansl commented on HBASE-13082: --- Thanks [~stack]. That's with 1000 bytes values, right? The cost saving is per row, so the smaller the rows the higher the gain (percentage-wise) Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14379236#comment-14379236 ] stack commented on HBASE-13082: --- Sorry.. average size about 180k rows. So, yeah, probably better benefits when smaller cells. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14379235#comment-14379235 ] stack commented on HBASE-13082: --- [~lhofhansl] 10 columns zipfian sized from 0 to 8k. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14378169#comment-14378169 ] Lars Hofhansl commented on HBASE-13082: --- Exactly. Logically it's relatively simple. Folding this into the current HBase will be more tricky. Some kind of abstraction around StoreFileManager providing a versioned view of all currently visible HFiles for the store. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left of. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to be remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)