[
https://issues.apache.org/jira/browse/HBASE-24544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133953#comment-17133953
]
Michael Stack commented on HBASE-24544:
---------------------------------------
Old issue HBASE-4246 was closed as stale, but it is actually still an issue.
> Recommend upping zk jute.maxbuffer in all but minor installs
> ------------------------------------------------------------
>
> Key: HBASE-24544
> URL: https://issues.apache.org/jira/browse/HBASE-24544
> Project: HBase
> Issue Type: Bug
> Components: documentation
> Reporter: Michael Stack
> Priority: Major
>
> Add a note to the upgrade and ZooKeeper sections of the doc recommending
> upping zk jute.maxbuffer above its default of just under 1MB.
> Here is the description of jute.maxbuffer from the ZooKeeper doc.
> {code}
> jute.maxbuffer:
> (Java system property: jute.maxbuffer)
> This option can only be set as a Java system property. There is no zookeeper
> prefix on it. It specifies the maximum size of the data that can be stored in
> a znode. The default is 0xfffff, or just under 1M. If this option is changed,
> the system property must be set on all servers and clients otherwise problems
> will arise. This is really a sanity check. ZooKeeper is designed to store
> data on the order of kilobytes in size.
> {code}
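> As a hedged sketch only (the 4MB value is illustrative, not a tested
> recommendation), raising the limit might look like the following;
> SERVER_JVMFLAGS and HBASE_MASTER_OPTS are the usual hooks in
> zookeeper-env.sh and hbase-env.sh:
> {code}
> # ZooKeeper server side (zookeeper-env.sh, or however the quorum JVMs
> # get their flags); 4MB here is an illustrative value:
> export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djute.maxbuffer=4194304"
>
> # Client side, at least for the HBase Master (hbase-env.sh); keep the
> # value consistent with what the zk servers use:
> export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Djute.maxbuffer=4194304"
> {code}
> A restart of the zk servers and the Master is needed for either change
> to take effect.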
> It seems easy enough to blow past the 1MB default. Here is one such
> scenario: a replication peer is disabled, so WALs back up on each
> RegionServer, or a bug keeps us from clearing WALs out from under the
> RegionServers promptly. Backed-up WALs get into the hundreds... easy
> enough on a busy cluster. Then a power outage takes the whole cluster
> down.
> Recovery may then require a ServerCrashProcedure (SCP) recovering
> hundreds of WALs. The way our SCP works, we can end up with a
> /hbase/splitWAL dir holding hundreds -- even thousands -- of WAL
> entries. The 1MB buffer limit in zk can't carry listings this big.
> Of note, jute.maxbuffer needs to be set on the zk servers -- with a
> restart so the change is noticed -- and on the client side, in the
> HBase Master at least.
> This issue is about highlighting this old gotcha in our doc, from which
> it is currently absent entirely.
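For a rough sense of scale (back-of-the-envelope only; 300 bytes per
encoded split-WAL znode name is an assumed figure, not measured), here is
a sketch of when a getChildren listing of /hbase/splitWAL overruns the
default buffer:
{code}
# Back-of-the-envelope: how many split-WAL znode names fit in one
# ZooKeeper getChildren response under the default jute.maxbuffer.
JUTE_MAX_BUFFER = 0xfffff  # default limit, just under 1MB

# Split-WAL znode names encode the full WAL path and can run a few
# hundred bytes each; 300 bytes is an assumption for illustration.
AVG_NAME_BYTES = 300
PER_ENTRY_OVERHEAD = 4  # length prefix per serialized string

def max_children(avg_name_bytes=AVG_NAME_BYTES):
    """Approximate child count a single listing response can carry."""
    return JUTE_MAX_BUFFER // (avg_name_bytes + PER_ENTRY_OVERHEAD)

print(max_children())  # on the order of a few thousand entries
{code}
Under these assumptions a listing tops out at a few thousand entries, so
a splitWAL dir in the "hundreds -- even thousands" range described above
is enough to hit the limit.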
--
This message was sent by Atlassian Jira
(v8.3.4#803005)