[
https://issues.apache.org/jira/browse/HBASE-14951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Rodionov updated HBASE-14951:
--------------------------------------
Release Note:
Rolling WAL events across a cluster can be highly correlated, and so can the
memstore flushes they cause and the minor compactions those flushes trigger,
which can be promoted to major ones. These events are highly correlated in
time if there is a balanced write load on the regions in a table. The default
value for the maximum number of WAL files (*hbase.regionserver.maxlogs*),
which controls WAL rolling events, is 32, which is too small for many modern
deployments.
This value is now calculated dynamically (unless it is defined by the user),
using the following formula:
maxLogs = Math.max(32, HBASE_HEAP_SIZE * memstoreRatio * 2 / LogRollSize)
where
memstoreRatio is *hbase.regionserver.global.memstore.size*
LogRollSize is the maximum WAL file size (default 0.95 * HDFS block size)
We want to avoid, or at least minimize, cases where a RS has to flush
memstores prematurely only because it reached the artificial limit of
hbase.regionserver.maxlogs. This is why the equation includes the 2 x
multiplier: it gives a maximum WAL capacity of 2 x the RS memstore size.
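The formula can be sketched in Java as follows. This is a minimal
illustration, not the code from the patch: the class, method, and parameter
names are hypothetical, truncating the quotient is an assumption (the release
note does not state a rounding mode), and the roll size of about 0.1 GiB in
main() is inferred from the values in this release note's table rather than
stated anywhere in the source.

```java
// Sketch only: class, method, and parameter names are illustrative,
// not taken from the HBase source.
final class MaxLogsSketch {
    // Floor from the release note: the computed value never drops below 32.
    static final int MIN_MAX_LOGS = 32;

    /**
     * @param heapBytes        maximum RegionServer heap, in bytes
     * @param memstoreRatio    hbase.regionserver.global.memstore.size (e.g. 0.4)
     * @param logRollSizeBytes maximum WAL file size, in bytes
     * @param userMaxLogs      explicit hbase.regionserver.maxlogs, or null if unset
     */
    static int computeMaxLogs(long heapBytes, double memstoreRatio,
                              long logRollSizeBytes, Integer userMaxLogs) {
        if (userMaxLogs != null) {
            return userMaxLogs; // a user-defined value is honored as-is
        }
        // WAL capacity target: twice the global memstore size.
        double walCapacityBytes = heapBytes * memstoreRatio * 2;
        // Assumption: truncate the quotient; the rounding mode is not specified.
        long computed = (long) (walCapacityBytes / logRollSizeBytes);
        return (int) Math.max(MIN_MAX_LOGS, computed);
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // Assumed roll size of about 0.1 GiB, inferred from the table's values.
        long rollSize = gib / 10;
        for (long heapGib : new long[] {1, 2, 10, 20, 32}) {
            System.out.println(heapGib + "G -> "
                + computeMaxLogs(heapGib * gib, 0.4, rollSize, null));
        }
    }
}
```

With a different roll size the results change accordingly; for example, a
roll size of 0.95 * 128 MB would give a smaller value for the same heap.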
Runaway WAL files.
The default log rolling period (1h) allows up to 2 x memstore size of data to
accumulate in the WALs. For a 32G heap and all other settings at their
defaults, this gives ~26GB of data. Under heavy write load, the number of WAL
files can increase dramatically. The RegionServer LogRoller will archive old
WALs periodically. To control runaway WALs, the user has three options:
override the default hbase.regionserver.maxlogs, override (decrease) the
default hbase.regionserver.logroll.period, or both.
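The ~26GB figure follows directly from the 2 x multiplier, and an explicit
maxlogs setting bounds the same quantity from the other side. A small sketch
of that arithmetic, with hypothetical class and method names that are not
HBase APIs:

```java
// Sketch only: names are illustrative, not HBase APIs.
final class WalCapacitySketch {
    /** Upper bound on WAL data implied by the 2 x multiplier, in bytes. */
    static double walCapacityBytes(long heapBytes, double memstoreRatio) {
        return heapBytes * memstoreRatio * 2;
    }

    /** Upper bound on WAL data when maxlogs is set explicitly, in bytes. */
    static long cappedCapacityBytes(int maxLogs, long logRollSizeBytes) {
        return (long) maxLogs * logRollSizeBytes;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // 32G heap with the default memstore ratio of 0.4: 32 * 0.4 * 2 = 25.6
        double capacityGib = walCapacityBytes(32 * gib, 0.4) / gib;
        System.out.printf("32G heap, ratio 0.4: %.1f GiB of WAL data%n", capacityGib);
    }
}
```

Lowering hbase.regionserver.maxlogs shrinks the second bound directly, while
lowering hbase.regionserver.logroll.period makes rolls, and therefore
archiving, happen more often.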
The following table gives the new default maximum number of log files for
several different RegionServer heap sizes:
heap    memstore perc    maxLogs
1G      40%              32
2G      40%              32
10G     40%              80
20G     40%              160
32G     40%              256
was:
Rolling WAL events across a cluster can be highly correlated, and so can the
memstore flushes they cause and the minor compactions those flushes trigger,
which can be promoted to major ones. These events are highly correlated in
time if there is a balanced write load on the regions in a table. The default
value for the maximum number of WAL files (*hbase.regionserver.maxlogs*),
which controls WAL rolling events, is 32, which is too small for many modern
deployments.
This value is now calculated dynamically (unless it is defined by the user),
using the following formula:
maxLogs = Math.max(32, HBASE_HEAP_SIZE * memstoreRatio * 2 / LogRollSize)
where
memstoreRatio is *hbase.regionserver.global.memstore.size*
LogRollSize is the maximum WAL file size (default 0.95 * HDFS block size)
The following table gives the new default maximum number of log files for
several different RegionServer heap sizes:
heap    memstore perc    maxLogs
1G      40%              32
2G      40%              32
10G     40%              80
20G     40%              160
32G     40%              256
> Make hbase.regionserver.maxlogs obsolete
> ----------------------------------------
>
> Key: HBASE-14951
> URL: https://issues.apache.org/jira/browse/HBASE-14951
> Project: HBase
> Issue Type: Improvement
> Components: Performance, wal
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14951-v1.patch, HBASE-14951-v2.patch
>
>
> There was a discussion in HBASE-14388 about the maximum number of log files.
> It was agreed that we should calculate this number in code but still
> honor the user's setting.
> The maximum number of log files is now calculated as follows:
> maxLogs = HEAP_SIZE * memstoreRatio * 2 / LogRollSize