[
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611189#comment-14611189
]
stack commented on HBASE-11747:
-------------------------------
bq. JMX isn't a transport for messages, is it?
No. Generally, JMX is for management. HBase uses it to publish server attributes
and metrics. HBase also puts up a JMX bean server so you can query the beans
over the net. This mechanism uses Java's crazy RMI, which is mostly unusable by
systems other than Java, and even then has a ping-pong random-port mechanism
that requires open port ranges. The nice thing about https://jolokia.org/
is that it REST/JSON-ifies our JMX, making it more palatable to more systems.
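For instance, here is a rough sketch of what a non-Java system gets once a
Jolokia agent is up (the host, the port -- 8778 is Jolokia's default -- and the
bean name are just placeholders):
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Read a JMX attribute over plain HTTP/JSON via a Jolokia agent.
// No RMI, no random ports: one well-known HTTP port.
public class JolokiaJmxRead {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://regionserver.example.com:8778/jolokia/read/"
        + "java.lang:type=Memory/HeapMemoryUsage");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // JSON payload, e.g. {"value":{"used":...}}
      }
    }
  }
}
{code}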
bq. Could you describe JMX approach?
ClusterStatus is made up of various attributes, including a ServerLoad for every
node in the cluster. ServerLoad is not actually server load. Rather, it is a
dumping ground for all and sundry: server attributes, configuration, and
metrics. Redoing ServerLoad so it carries just load vitals would be a
nice-to-have so we don't flood the master once a second as all the servers
report in with fat messages on their heartbeats. Server metrics are also
published out of our metrics system. Metrics are published variously -- as text
in a servlet and as JMX beans available on each server (JMX publishes on a
period IIRC; the servlet is polled). That we dump our metrics out on a period
via JMX and then go and collect them all again to put on a heartbeat is silly.
Would be nice to refactor. If ServerLoad were slimmed, it would help here,
given we make one up for each server and insert it into ClusterStatus.
That is, at a high level, what I was thinking. A separate issue I'd say, a
background consideration when addressing this one.
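For reference, here is a minimal sketch of how a client pulls the whole thing
today and picks a few vitals out of each ServerLoad (0.98-era API; method names
from memory, so treat it as approximate):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerLoad;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ClusterStatusDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      // One RPC to the master; the response carries a full ServerLoad per
      // server, which is what blows past the protobuf limit at 1M regions.
      ClusterStatus status = admin.getClusterStatus();
      for (ServerName sn : status.getServers()) {
        ServerLoad load = status.getLoad(sn);
        System.out.println(sn + " regions=" + load.getNumberOfRegions()
            + " heapMB=" + load.getUsedHeapMB() + "/" + load.getMaxHeapMB());
      }
    } finally {
      admin.close();
    }
  }
}
{code}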
bq. Question - how would this new RPC overlap with metrics functionality?
I was thinking they'd be distinct. If you want metrics, use our metrics system;
we are publishing our metrics per server anyway.
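The poll side is just an HTTP GET against each server's JMX servlet; e.g. (a
sketch -- 60030 is the 0.98 default regionserver info port, and the bean name
in the qry filter is from memory):
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

// Poll the metrics each server already publishes on its info server.
public class MetricsServletPoll {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://regionserver.example.com:60030/jmx"
        + "?qry=Hadoop:service=HBase,name=RegionServer,sub=Server");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // JSON dump of the matching beans
      }
    }
  }
}
{code}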
> ClusterStatus is too bulky
> ---------------------------
>
> Key: HBASE-11747
> URL: https://issues.apache.org/jira/browse/HBASE-11747
> Project: HBase
> Issue Type: Sub-task
> Reporter: Virag Kothari
> Attachments: exceptiontrace
>
>
> The following exception occurs on 0.98 with 1M regions on a cluster with 160 region servers:
> {code}
> Caused by: java.io.IOException: Call to regionserverhost:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1482)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1454)
>         at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
>         at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
>         at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.getClusterStatus(MasterProtos.java:42555)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.getClusterStatus(HConnectionManager.java:2132)
>         at org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2166)
>         at org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2162)
>         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
>         ... 43 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> {code}
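> The limit the trace points at is protobuf's 64MB default. A sketch of the raw protobuf-level workaround it suggests (illustrative only; the HBase RpcClient builds its CodedInputStream internally, so applying this would need a patch or a config knob):
> {code}
> import com.google.protobuf.CodedInputStream;
> import java.io.InputStream;
> import org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos;
>
> public class BigStatusParse {
>   // Parse a serialized ClusterStatus with a raised size limit. Protobuf's
>   // default limit is 64MB; 256MB below is an arbitrary example value.
>   static ClusterStatusProtos.ClusterStatus parse(InputStream in) throws Exception {
>     CodedInputStream cis = CodedInputStream.newInstance(in);
>     cis.setSizeLimit(256 * 1024 * 1024);
>     return ClusterStatusProtos.ClusterStatus.parseFrom(cis);
>   }
> }
> {code}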