[ https://issues.apache.org/jira/browse/HBASE-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-14460:
--------------------------
Attachment: 0.98.test.patch
m.test.patch
0.94.test.patch
flamegraph-26636.094.100.svg
flamegraph-28767.098.100.svg
flamegraph-31647.master.100.svg
If I run a test with 100 threads, each updating its own row (i.e. no
contention on any row), then master completes before 0.94 does; i.e. master
is faster. This is in spite of the master thread dump resembling the one
reported as problematic at the top of this issue.
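For readers without the attached test patches, the shape of such an uncontended run can be sketched roughly as below. This is only an illustration, not the TestIncrement code from the patches: the class name `UncontendedIncrementSketch` and its methods are hypothetical, and a plain `AtomicLong` stands in for a row, so nothing here touches the WAL or MVCC.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the test shape: N threads, each owning one "row".
final class UncontendedIncrementSketch {

    // One counter per thread, so no two threads ever touch the same "row".
    static AtomicLong[] newRows(int n) {
        AtomicLong[] rows = new AtomicLong[n];
        for (int i = 0; i < n; i++) {
            rows[i] = new AtomicLong();
        }
        return rows;
    }

    // Starts one worker per row, releases them together, and returns the
    // elapsed wall-clock time in nanoseconds once all have finished.
    static long run(int perThread, AtomicLong[] rows) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[rows.length];
        for (int i = 0; i < rows.length; i++) {
            final AtomicLong myRow = rows[i];
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int n = 0; n < perThread; n++) {
                    myRow.incrementAndGet();
                }
            });
            workers[i].setDaemon(true);
            workers[i].start();
        }
        long t0 = System.nanoTime();
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        AtomicLong[] rows = newRows(100);
        long nanos = run(10_000, rows);
        System.out.println("elapsed ms: " + nanos / 1_000_000);
    }
}
```

Comparing branches then amounts to running the same loop against each one and comparing the elapsed times.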
In 0.94, all are stuck waiting on the WAL syncer to come in:
{code}
"50" #74 daemon prio=5 os_prio=0 tid=0x00007f7a78661000 nid=0x3364 waiting for monitor entry [0x00007f7a30ecd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1334)
        - waiting to lock <0x00000004cde22390> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1476)
        at org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:6160)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:5571)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:5454)
        at org.apache.hadoop.hbase.regionserver.TestIncrement$SingleCellIncrementer.run(TestIncrement.java:84)
{code}
In master they are stuck here:
{code}
"17" #55 daemon prio=5 os_prio=0 tid=0x00007f0374c6d000 nid=0x3a0b in Object.wait() [0x00007f030c346000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForRead(MultiVersionConcurrencyControl.java:218)
        - locked <0x00000004d2e26208> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.completeAndWait(MultiVersionConcurrencyControl.java:149)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.await(MultiVersionConcurrencyControl.java:137)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7360)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7315)
        at org.apache.hadoop.hbase.regionserver.TestIncrement$SingleCellIncrementer.run(TestIncrement.java:86)
{code}
The flame graphs show basically the same profile across all versions (master
spends a bit less time appending, which I suppose is expected).
> [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed
> Increments, CheckAndPuts, batch operations
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-14460
> URL: https://issues.apache.org/jira/browse/HBASE-14460
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Attachments: 0.94.test.patch, 0.98.test.patch, 14460.txt,
> flamegraph-13120.svg.master.singlecell.svg, flamegraph-26636.094.100.svg,
> flamegraph-28066.098.singlecell.svg, flamegraph-28767.098.100.svg,
> flamegraph-31647.master.100.svg, flamegraph-9466.094.singlecell.svg,
> m.test.patch, region_lock.png, testincrement.094.patch,
> testincrement.098.patch, testincrement.master.patch
>
>
> As reported by 鈴木俊裕 up on the mailing list -- see "Performance degradation
> between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)" -- our unification of
> sequenceid and MVCC slows Increments (and other ops) as the mvcc needs to
> 'catch up' to our current point before we can read the last Increment value
> that we need to update.
> We could say that our Increment is simply done wrong (we should just be
> writing Increments and summing on read), but checkAndPut as well as batching
> operations have the same issue. Fix.
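The 'catch up' behaviour described above can be illustrated with a toy model. This is a minimal sketch, not HBase's MultiVersionConcurrencyControl: the class `ToyMvcc` is hypothetical, the method names are only loosely borrowed from the stack trace, and it assumes the read point may advance solely over a contiguous prefix of completed writes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model (assumption, not HBase code): writes get increasing numbers, and
// the read point only moves forward over writes that completed in order.
final class ToyMvcc {
    private static final class Entry {
        final long writeNumber;
        boolean completed;
        Entry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private final Deque<Entry> queue = new ArrayDeque<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0;

    // Registers a new in-flight write and returns its write number.
    synchronized long begin() {
        Entry e = new Entry(nextWriteNumber++);
        queue.addLast(e);
        return e.writeNumber;
    }

    // Marks a write as done, then advances the read point over the
    // completed prefix of the queue and wakes any waiters.
    synchronized void complete(long writeNumber) {
        for (Entry e : queue) {
            if (e.writeNumber == writeNumber) {
                e.completed = true;
                break;
            }
        }
        while (!queue.isEmpty() && queue.peekFirst().completed) {
            readPoint = queue.removeFirst().writeNumber;
        }
        notifyAll();
    }

    // Completes a write and blocks until the read point has caught up to it,
    // i.e. until every earlier write has completed as well.
    synchronized void completeAndWait(long writeNumber) {
        complete(writeNumber);
        try {
            while (readPoint < writeNumber) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    synchronized long readPoint() {
        return readPoint;
    }
}
```

In this model a writer that finishes early still waits in `completeAndWait` for every earlier write to finish, even one on an unrelated row, which is the kind of stall an Increment would see if it had to wait for the read point before reading the value it is about to update.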