[ 
https://issues.apache.org/jira/browse/HBASE-14951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14951:
--------------------------------------
    Release Note: 
WAL rolling events across a cluster can be highly correlated, and so can the
memstore flushes they cause and the minor compactions those flushes trigger,
some of which can be promoted to major ones. These events are highly
correlated in time when the write load is balanced across the regions of a
table. The default maximum number of WAL files
(*hbase.regionserver.maxlogs*), which controls WAL rolling events, is 32,
which is too small for many modern deployments.
If the user has not defined this value, it is now calculated dynamically
using the following formula:

maxLogs = Math.max(32, HBASE_HEAP_SIZE * memstoreRatio * 2 / LogRollSize), where

memstoreRatio is *hbase.regionserver.global.memstore.size*
LogRollSize is maximum WAL file size (default 0.95 * HDFS block size)

We want to avoid, or at least minimize, cases where a RS has to flush
memstores prematurely only because it has reached the artificial limit of
hbase.regionserver.maxlogs. This is why the formula includes the 2 x
multiplier: it gives a maximum WAL capacity of 2 x the RS memstore size.
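
The formula above can be sketched in Java as follows. This is a minimal
illustration, not the HBase implementation: the class, method, and parameter
names are hypothetical, the 128 MB HDFS block size is an assumed value, and
truncating the quotient to an integer is an assumption about rounding that the
release note does not specify.

```java
// Sketch of the default maxLogs calculation described in the release note.
// All names here are illustrative; this is not HBase's actual code.
class MaxLogsSketch {
    // Floor taken from the formula: the old default of 32 is the minimum.
    static final int MIN_MAX_LOGS = 32;

    /**
     * @param heapBytes     region server heap size, in bytes
     * @param memstoreRatio value of hbase.regionserver.global.memstore.size
     * @param logRollBytes  maximum WAL file size, in bytes
     */
    static int maxLogs(long heapBytes, double memstoreRatio, long logRollBytes) {
        // Total WAL capacity targeted by the formula: 2 x global memstore size.
        double walCapacity = heapBytes * memstoreRatio * 2;
        // Assumption: truncate toward zero; the note does not state rounding.
        int computed = (int) (walCapacity / logRollBytes);
        return Math.max(MIN_MAX_LOGS, computed);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;           // assumed HDFS block size
        long logRollSize = (long) (0.95 * blockSize);  // default per the note
        long heap = 32L * 1024 * 1024 * 1024;          // 32 GiB heap
        System.out.println("maxLogs = " + maxLogs(heap, 0.4, logRollSize));
    }
}
```

For small heaps the computed quotient falls below 32, so the floor applies and
the result matches the previous default.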

Runaway WAL files.

The default log rolling period (1h) allows up to 2 x memstore size of data to
accumulate in the WAL. For a 32G heap with all other settings at their
defaults, this is ~26GB of data. Under heavy write load, the number of WAL
files can increase dramatically. The RegionServer LogRoller archives old WALs
periodically. To control runaway WALs, the user has three options: override
the default hbase.regionserver.maxlogs, override (decrease) the default
hbase.regionserver.logroll.period, or do both.

For systems with a bursty write load, hbase.regionserver.logroll.period can
be decreased. In this case the maximum number of WAL files will be determined
by the total size of the memstore (unflushed data), not by
hbase.regionserver.maxlogs. For the majority of applications the defaults will
cause no issues: data will be flushed periodically from the memstore, the
LogRoller will archive old WAL files, and the system will never reach the new
default for hbase.regionserver.maxlogs unless it is under extreme load for a
prolonged period of time. Even then, decreasing
hbase.regionserver.logroll.period allows us to control runaway WAL files.

The following table gives the new default maximum log files values for several 
different Region Server heap sizes:

heap    memstore perc   maxLogs
1G      40%             32
2G      40%             32
10G     40%             80
20G     40%             160
32G     40%             256
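
The table values can be reproduced with a short Java sketch. Note the
assumptions: the figures above match the formula when heap sizes are read as
decimal gigabytes and the WAL roll size is taken as a round 100 MB. That roll
size is inferred from the numbers, not stated in the release note, and the
class and method names are hypothetical.

```java
// Reproduces the table of default maxLogs values under stated assumptions.
// Names are illustrative; this is not HBase's actual code.
class MaxLogsTable {
    // Same formula as in the release note, with the floor of 32.
    static int maxLogs(long heapBytes, double memstoreRatio, long logRollBytes) {
        // Assumption: truncate the quotient toward zero.
        int computed = (int) (heapBytes * memstoreRatio * 2 / logRollBytes);
        return Math.max(32, computed);
    }

    public static void main(String[] args) {
        long gb = 1_000_000_000L;        // assumption: decimal gigabytes
        long rollSize = 100_000_000L;    // assumption: ~100 MB WAL roll size
        double memstoreRatio = 0.4;      // 40%, as in the table
        long[] heapsGb = {1, 2, 10, 20, 32};

        System.out.println("heap    memstore perc   maxLogs");
        for (long heapGb : heapsGb) {
            int logs = maxLogs(heapGb * gb, memstoreRatio, rollSize);
            System.out.printf("%-8s%-16s%d%n", heapGb + "G", "40%", logs);
        }
    }
}
```

With a roll size of 0.95 x a 128 MB block, the same formula yields somewhat
lower values for the larger heaps, so the table is best read as approximate.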



  

  was:
Rolling WAL events across a cluster can be highly correlated, hence flushing 
memstores, hence triggering minor compactions, that can be promoted to major 
ones. These events are highly correlated in time if there is a balanced 
write-load on the regions in a table. Default value for maximum WAL files (* 
hbase.regionserver.maxlogs*), which controls WAL rolling events - 32 is too 
small for many modern deployments. 
Now we calculate this value dynamically (if not defined by user), using the 
following formula:

maxLogs = Math.max( 32, HBASE_HEAP_SIZE * memstoreRatio * 2/ LogRollSize), where

memstoreRatio is *hbase.regionserver.global.memstore.size*
LogRollSize is maximum WAL file size (default 0.95 * HDFS block size)

We need to make sure that we avoid fully or minimize events when RS has to 
flush memstores prematurely only because it reached artificial limit of 
hbase.regionserver.maxlogs, this is why we put this 2 x multiplier in equation, 
this gives us maximum WAL capacity of 2 x RS memstore-size. 

Runaway WAL files.

The default log rolling period (1h) allows to accumulate up to 2 X Memstore 
Size data in a WAL. For heap size - 32G and all other default setting, this 
gives ~ 26GB of data. Under heavy write load, the number of WAL files can 
increase dramatically. RegionServer LogRoller will be archiving old WALs 
periodically. User has three options, either override default 
hbase.regionserver.maxlogs or override default 
hbase.regionserver.logroll.period (decrease), or both to control runaway WALs.

The following table gives the new default maximum log files values for several 
different Region Server heap sizes:

heap    memstore perc   maxLogs
1G              40%                             32
2G              40%                             32
10G             40%                             80
20G             40%                             160
32G             40%                             256



  


> Make hbase.regionserver.maxlogs obsolete
> ----------------------------------------
>
>                 Key: HBASE-14951
>                 URL: https://issues.apache.org/jira/browse/HBASE-14951
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, wal
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Minor
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-14951-v1.patch, HBASE-14951-v2.patch
>
>
> There was a discussion in HBASE-14388 about the maximum number of log files.
> It was agreed that we should calculate this number in code but still honor
> the user's setting.
> The maximum number of log files is now calculated as follows:
>  maxLogs = HEAP_SIZE * memstoreRatio * 2 / LogRollSize



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
