[jira] [Updated] (HBASE-15091) Forward-port to 1.2+ HBASE-15031 "Fix merge of MVCC and SequenceID performance regression in branch-1.0 for Increments"

stack (JIRA) Fri, 05 Feb 2016 16:48:07 -0800

     [ 
https://issues.apache.org/jira/browse/HBASE-15091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


stack updated HBASE-15091:
--------------------------
    Release Note: 
UPDATE: This forward port turned out to be unnecessary. hbase-1.2.0 does not 
suffer the performance regression: HBASE-12751, which went into hbase-1.2.0, 
also fixed the regression in increment and append. Ignore everything below! 

Since HBase 1.0.0 (HBASE-8763), Increments can be 10x slower (or more) under 
high concurrency. 

This 'fix' adds back a fast increment but speed is achieved by relaxing 
row-level consistency for Increments (only). The default remains the old, slow, 
consistent Increment behavior. 

Set "hbase.increment.fast.but.narrow.consistency" to true in hbase-site.xml to 
enable 'fast' increments, then do a rolling restart of your cluster. The 
setting is read on the server side. 
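
As a rough sketch of the configuration step, the snippet below prints the 
property element one would paste inside the <configuration> element of 
hbase-site.xml on each server. The helper name print_fast_increment_property 
is made up for illustration; only the property name and value come from this 
note, and per the UPDATE above none of this is needed on hbase-1.2.0. 

```shell
#!/bin/sh
# Hypothetical helper (not part of HBase): prints the <property> element
# to add inside <configuration> in hbase-site.xml on every server.
print_fast_increment_property() {
  cat <<'EOF'
<property>
  <name>hbase.increment.fast.but.narrow.consistency</name>
  <value>true</value>
</property>
EOF
}

print_fast_increment_property
```

The function only writes to stdout; editing hbase-site.xml and restarting the 
servers remain manual steps. 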

Intermixing fast Increments with other Mutations gives indeterminate 
results; e.g. a Put and an Increment against the same Cell will not always 
give the result you expect. Fast Increments are consistent among themselves. 
To read the latest increment value, either do a Get with 
IsolationLevel#READ_UNCOMMITTED or issue an Increment of amount zero (beware 
doing a Get on a cell that has not been incremented yet: it returns no 
results). 

The difference between fastAndNarrowConsistencyIncrement and 
slowButConsistentIncrement is that the former holds the row lock until the WAL 
sync completes. Because the lock is held that long, we can reason that no 
other writers are afoot when we read the current increment value, so we do not 
need to wait for mvcc reads to catch up to writes before reading it; that wait 
is the root of the slowdown seen in HBASE-14460. The fast path also does not 
wait on mvcc to complete before returning to the client (though the write has 
been synced and put into the memstore by then). 

Also adds a simple performance test tool that runs against an existing 
cluster. It expects the table to exist already (by default, a table named 
'tableName' with a column family 'columnFamilyName'): 

{code} 
$ ./bin/hbase org.apache.hadoop.hbase.IncrementPerformanceTest 
{code} 

Configure it by passing -D options. The tool logs the settings in effect at 
startup: 

2015-12-23 19:33:36,941 INFO [main] hbase.IncrementPerformanceTest: Running 
test with hbase.zookeeper.quorum=localhost, tableName=tableName, 
columnFamilyName=columnFamilyName, threadCount=80, incrementCount=10000 

... so to set the tableName pass -DtableName=SOME_TABLENAME 
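
To make the option handling concrete, here is a small sketch of a wrapper that 
assembles the command line from the four tool options named in the log line, 
falling back to the logged defaults. The function name increment_perf_cmd and 
the column family 'cf' are assumptions for illustration; the -D options are 
placed after the class name, as in the example in this note. 

```shell
#!/bin/sh
# Hypothetical wrapper (not part of HBase): builds the command line for the
# test tool, using the defaults from the log line when an argument is omitted.
# It only prints the command; it does not run it.
increment_perf_cmd() {
  table="${1:-tableName}"
  family="${2:-columnFamilyName}"
  threads="${3:-80}"
  count="${4:-10000}"
  printf '%s\n' "./bin/hbase org.apache.hadoop.hbase.IncrementPerformanceTest -DtableName=$table -DcolumnFamilyName=$family -DthreadCount=$threads -DincrementCount=$count"
}

# Example: override only the table name and column family.
increment_perf_cmd SOME_TABLENAME cf
```

Printing rather than executing keeps the sketch safe to try without a cluster; 
prefix the printed command with 'time' to measure a real run. 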

Here is an example use of the test tool: 

{code} 
$ time ./bin/hbase --config ~/conf_hbase \
    org.apache.hadoop.hbase.IncrementPerformanceTest -DincrementCount=50000 
{code} 



> Forward-port to 1.2+ HBASE-15031 "Fix merge of MVCC and SequenceID 
> performance regression in branch-1.0 for Increments"
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15091
>                 URL: https://issues.apache.org/jira/browse/HBASE-15091
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 1.2.0, 1.3.0
>
>         Attachments: 15031.addendum, 15091v2.branch-1.2.patch, 
> 15091v3.branch-1.2.patch, 15091v4.branch-1.2.patch, 15091v5.branch-1.2.patch, 
> 15091v6.branch-1.2.patch, 15091v6.branch-1.patch, 15091v7.branch-1.2.patch, 
> 15091v8.branch-1.2.patch, 15091v9.branch-1.2.patch, 
> HBASE-15091-branch-1.2.patch, HBASE-15091-branch-1.2_v1.patch, 
> HBASE-15091.png, HBASE-15091.v1.branch-1.2.patch, 
> HBASE-15091.v9.branch-1.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)