Michael Stack created HBASE-24544:
-------------------------------------
Summary: Recommend upping zk jute.maxbuffer in all but minor
installs
Key: HBASE-24544
URL: https://issues.apache.org/jira/browse/HBASE-24544
Project: HBase
Issue Type: Bug
Components: documentation
Reporter: Michael Stack
Add a doc note to the upgrade and ZooKeeper sections recommending raising
jute.maxbuffer above its default of just under 1MB.
Here is the description of jute.maxbuffer from the ZooKeeper documentation:
{code}
jute.maxbuffer:
(Java system property: jute.maxbuffer)
This option can only be set as a Java system property. There is no zookeeper
prefix on it. It specifies the maximum size of the data that can be stored in a
znode. The default is 0xfffff, or just under 1M. If this option is changed, the
system property must be set on all servers and clients otherwise problems will
arise. This is really a sanity check. ZooKeeper is designed to store data on
the order of kilobytes in size.
{code}
It is easy enough to blow past the 1MB default. Here is one such scenario. A
peer is disabled so WALs back up on each RegionServer, or a bug prevents us
from clearing WALs out from under the RegionServer promptly. Backed-up WALs
get into the hundreds... easy enough on a busy cluster. Next, there is a power
outage and the cluster crashes down.
Recovery may require an SCP recovering hundreds of WALs. As our SCP currently
works, we can end up with a /hbase/splitWAL dir with hundreds -- even thousands
-- of WALs in it. The 1MB buffer limit in ZooKeeper cannot carry listings this
big.
Of note, jute.maxbuffer needs to be set on the ZooKeeper servers -- with a
restart so the change takes effect -- and on the client side, in the HBase
Master at least.
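A minimal sketch of where the property might go. The 4MB value is an
assumption, not an official recommendation -- pick a size that covers your
largest expected znode listing. File locations assume a stock layout:
ZooKeeper's bin/zkEnv.sh sources conf/java.env if present, and hbase-env.sh's
HBASE_MASTER_OPTS is appended to the Master's JVM options.

```shell
# ZooKeeper server side: conf/java.env (sourced by bin/zkEnv.sh).
# Requires a ZooKeeper server restart to take effect.
SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djute.maxbuffer=4194304"

# HBase client side: conf/hbase-env.sh, for the Master at least.
# 4194304 bytes = 4MB; an illustrative value, tune for your cluster.
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Djute.maxbuffer=4194304"
```

The value must match on servers and clients; per the ZooKeeper doc quoted
above, a mismatch causes problems.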
This issue is about highlighting this old gotcha in our doc. It is currently
absent entirely.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)