[ 
https://issues.apache.org/jira/browse/HBASE-24544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133953#comment-17133953
 ] 

Michael Stack commented on HBASE-24544:
---------------------------------------

The old issue HBASE-4246 was closed as stale, but it is actually still a problem.

> Recommend upping zk jute.maxbuffer in all but minor installs
> ------------------------------------------------------------
>
>                 Key: HBASE-24544
>                 URL: https://issues.apache.org/jira/browse/HBASE-24544
>             Project: HBase
>          Issue Type: Bug
>          Components: documentation
>            Reporter: Michael Stack
>            Priority: Major
>
> Add a doc note in the upgrade and ZooKeeper sections recommending raising zk 
> jute.maxbuffer above the default of 1M.
> Here is the jute.maxbuffer entry from the zk doc.
> {code}
> jute.maxbuffer:
> (Java system property: jute.maxbuffer)
> This option can only be set as a Java system property. There is no zookeeper 
> prefix on it. It specifies the maximum size of the data that can be stored in 
> a znode. The default is 0xfffff, or just under 1M. If this option is changed, 
> the system property must be set on all servers and clients otherwise problems 
> will arise. This is really a sanity check. ZooKeeper is designed to store 
> data on the order of kilobytes in size.
> {code}
> It is easy enough to exceed the 1MB default. Here is one such scenario. A 
> replication peer is disabled, so WALs back up on each RegionServer, or a bug 
> keeps us from clearing WALs out from under the RegionServer promptly. 
> Backed-up WALs get into the hundreds, which is easy enough on a busy cluster. 
> Next, there is a power outage and the cluster crashes.
> Recovery may require an SCP (ServerCrashProcedure) to recover hundreds of 
> WALs. The way our SCP works today, we can end up with a /hbase/splitWAL znode 
> with hundreds -- even thousands -- of WALs under it. The 1MB buffer limit in 
> zk can't carry listings this big.
> Of note, jute.maxbuffer needs to be set on the zk servers -- with a restart 
> so the change is picked up -- and on the client side, in the hbase master at 
> least.
> This issue is about highlighting this old problem in our doc. It seems to be 
> totally absent.
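
The /hbase/splitWAL scenario in the quoted description can be sanity-checked with a rough size estimate. The sketch below is not from the issue and is not HBase or ZooKeeper code. It assumes a child listing is serialized as a 4-byte count followed by each name as a 4-byte length plus its UTF-8 bytes, ignores any reply-header overhead, and uses a made-up, URL-encoded WAL path as the child name. The helper names `listing_bytes` and `max_children` are hypothetical.

```python
# Rough estimate of how large a child listing gets relative to jute.maxbuffer.
# Assumptions (not from the issue): a listing is a 4-byte count followed by
# each name as a 4-byte length plus its UTF-8 bytes; header overhead is ignored.

JUTE_MAXBUFFER_DEFAULT = 0xFFFFF  # 1,048,575 bytes, "just under 1M" per the zk doc


def listing_bytes(child_names):
    """Approximate serialized size, in bytes, of a list of child znode names."""
    return 4 + sum(4 + len(name.encode("utf-8")) for name in child_names)


def max_children(name_len, limit=JUTE_MAXBUFFER_DEFAULT):
    """How many children with names of name_len bytes fit under the limit."""
    return (limit - 4) // (4 + name_len)


# Hypothetical URL-encoded WAL path standing in for one child znode name.
SAMPLE_NAME = (
    "hdfs%3A%2F%2Fnn.example.com%3A8020%2Fhbase%2FWALs%2F"
    "rs1.example.com%2C16020%2C1591900000000-splitting%2F"
    "rs1.example.com%252C16020%252C1591900000000.1591900012345"
)

if __name__ == "__main__":
    name_len = len(SAMPLE_NAME.encode("utf-8"))
    print(f"sample name length: {name_len} bytes")
    print(f"children that fit under the default: {max_children(name_len)}")
    for count in (100, 1000, 10000):
        size = listing_bytes([SAMPLE_NAME] * count)
        verdict = "exceeds" if size > JUTE_MAXBUFFER_DEFAULT else "fits under"
        print(f"{count} WALs -> ~{size} bytes, {verdict} the default")
```

Under these assumptions the limit is reached at a few thousand long WAL names, which is consistent with the "hundreds -- even thousands" range described above. Real name lengths vary with hostnames and filesystem paths, so treat the numbers as an order-of-magnitude check only.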



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
