We aren't running with those patches, but would it be possible for you to heap dump one client? At least we would see exactly what's eating all the memory.
Thx,
J-D

On Fri, Oct 14, 2011 at 12:12 PM, Shrijeet Paliwal <[email protected]> wrote:
> Hi All,
>
> HBase version: 0.90.3 + Patches
> Hadoop version: CDH3u0
> Relevant Jiras: https://issues.apache.org/jira/browse/HBASE-2937,
> https://issues.apache.org/jira/browse/HBASE-4003
>
> We have been using the 'hbase.client.operation.timeout' knob
> introduced in 2937 for quite some time now. It helps us enforce SLA.
> We have two HBase clusters and two HBase client clusters. One of them
> is much busier than the other.
>
> We have seen a deterministic behavior in clients running in the busy
> cluster. Their (the clients') memory footprint increases consistently
> after they have been up for roughly 24 hours.
> This memory footprint almost doubles from its usual value (usual case
> == RPC timeout disabled). After much investigation nothing concrete
> came out, and we had to put in a hack
> which keeps heap size in control even when RPC timeout is enabled. Also
> please note, the same behavior is not observed in the 'not so busy'
> cluster.
>
> The patch is here: https://gist.github.com/1288023
>
> Can someone who is also running RPC timeout in production under fair
> load please share their experience?
>
> -Shrijeet
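
For reference, one way to capture the heap dump requested above from inside the client JVM is sketched below. It assumes a HotSpot-based JVM (the `HotSpotDiagnosticMXBean` is HotSpot-specific), and the class name `ClientHeapDump` and the output file name are hypothetical, chosen only for illustration. Running `jmap -dump:live,format=b,file=client.hprof <pid>` against the client process should produce an equivalent file without touching the code.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of HBase: dumps the heap of the JVM it runs in.
public class ClientHeapDump {

    /**
     * Writes an .hprof heap dump of the current JVM to the given path.
     * The target file must not exist yet. With liveOnly=true, only
     * reachable objects are written, which keeps the dump smaller.
     */
    public static Path dump(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Write into a fresh temp directory so the target file cannot already exist.
        Path dir = Files.createTempDirectory("hbase-client-dump");
        Path out = dump(dir.resolve("client.hprof"), true);
        System.out.println("Heap dump written to " + out
            + " (" + Files.size(out) + " bytes)");
    }
}
```

The resulting `.hprof` file can be opened in a heap analyzer to see which object types dominate the retained memory once the client has been up long enough for the footprint to grow.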
