
[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-08-20 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705451#comment-14705451
 ] 

Andrew Purtell commented on HBASE-11747:


bq. Do we want to just bump CodedInputStream#limit to higher numbers and see if 
that addresses the problem at hand

I did this on HBASE-13825.

 ClusterStatus is too bulky 
 ---

 Key: HBASE-11747
 URL: https://issues.apache.org/jira/browse/HBASE-11747
 Project: HBase
  Issue Type: Sub-task
Reporter: Virag Kothari
 Attachments: exceptiontrace


 Following exception on 0.98 with 1M regions on cluster with 160 region servers
 {code}
 Caused by: java.io.IOException: Call to regionserverhost:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
   at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1482)
   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1454)
   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.getClusterStatus(MasterProtos.java:42555)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.getClusterStatus(HConnectionManager.java:2132)
   at org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2166)
   at org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2162)
   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
   ... 43 more
 Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
   at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread Thiruvel Thirumoolan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610932#comment-14610932
 ] 

Thiruvel Thirumoolan commented on HBASE-11747:
--

We are also exploring the option of compressing the status and sending from the 
server.
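
As a rough illustration of why compression might help here, the sketch below gzips a synthetic payload made of many similar region-name-like strings and compares sizes. The payload format, the class name `StatusCompressionSketch`, and its helper methods are all hypothetical; this is not the HBase serialization, only a stand-in showing that highly repetitive data tends to shrink well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StatusCompressionSketch {

    // Hypothetical stand-in for a serialized status: one line per region,
    // each resembling a region name (table, start key, timestamp, hash).
    public static byte[] syntheticPayload(int regions) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < regions; i++) {
            sb.append("table_a,row-").append(i)
              .append(",1400000000000.")
              .append(Integer.toHexString(i * 31))
              .append(".\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress the bytes with gzip (what a server might do before sending).
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Decompress gzip bytes (what a client would do on receipt).
    public static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = syntheticPayload(10_000);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
    }
}
```

Note that compression only reduces bytes on the wire; once decompressed, the message is still parsed at full size, so a parser-side size limit would still apply to it.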


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610877#comment-14610877
 ] 

Mikhail Antonov commented on HBASE-11747:
-

Wondering about next steps/directions here. Do we want to just bump 
CodedInputStream#limit to higher numbers and see if that addresses the problem at 
hand (I think it should), or do we want to optimize the protocol? I've seen 3 
options here: 1) streaming instead of a single message, 2) decoupling region/RS load 
info from the cluster status itself, 3) trying to make the data pieces themselves more 
compact (region names etc.).


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610905#comment-14610905
 ] 

stack commented on HBASE-11747:
---

Let's up CIS#limit for sure.

In a new JIRA, optimize the protocol. There are a few already if you search 'hbase 
clusterstatus'. I like #2 and #3 from your list. For #2, was looking at 
exporting jmx so say the Master could read cluster metrics instead of getting 
metrics recast and served on the heartbeat. Was looking at https://jolokia.org/ 
Seems more sensible than JMX federation (Seems like it's possible to hook up as 
src for D3 graphing). Do we poll rather than have the stuff pushed? What 
happens in a big cluster?

Good on you [~mantonov]


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611054#comment-14611054
 ] 

Mikhail Antonov commented on HBASE-11747:
-

[~thiruvel] 

bq. We are also exploring the option of compressing the status and sending from 
the server.

Could you please describe your case a little more? Are you facing this error, or 
just trying to optimize the traffic, or something else? Would be interested to 
know the size of the cluster and the number of regions you're serving.


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611051#comment-14611051
 ] 

Mikhail Antonov commented on HBASE-11747:
-

[~stack] 

bq. For #2, was looking at exporting jmx so say the Master could read cluster 
metrics instead of getting metrics recast and served on the heartbeat
Did you mean rpc, not jmx? I briefly looked at where it's actually used, and 
unless I'm missing something, we don't really use it in any heartbeats. Cluster 
status is used:

 - for subscribers on (multicast) publishing (that's the only push as far as I 
can tell?)
 - in the separate MasterRpcServices#GetClusterStatus rpc call and accordingly in 
the Admin interface wrapping it (which is in the log posted in the jira)
 - in REST messages

For regular heartbeats we just use the MRS#regionServerReport rpc call, which only 
pushes the RS server name/load (including region load) to the master. So as far as I 
can tell, those are already mostly decoupled. So I think the options (aside from 
bumping the size of the message) drift to something like: check on the server side 
whether the monolithic cluster status is looking too big (over a defined limit), and 
return it with empty load in that case, setting some flag indicating that the 
message is partially constructed so it does not fail at the transport level, and that 
the client should use a separate call to request server/region load for the list of 
RSs it's interested in?

In other words, I guess I see 2 basic options:
 - bump the size of message in this jira (trivial patch)
 - leave current ClusterStatus format as is for compatibility, but add handling 
to return empty LiveServerInfo list if it's coming up too big, add new rpc call 
to retrieve list of LiveServerInfo for a list (range?) of region servers. 
Here's where RS groups would be handy. What do you think?
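
The second option above could be sketched roughly as follows. Everything in this block is hypothetical and for illustration only: `PartialClusterStatusSketch`, `StatusReply`, the `loadOmitted` flag, and `getServerLoad` are invented names, and a per-server byte count stands in for the real load payload. It is not the HBase API, just one way the "return empty load when too big, fetch load separately" idea might look.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialClusterStatusSketch {

    /** Hypothetical reply: server names are always present, load only if it fits. */
    public static final class StatusReply {
        public final List<String> serverNames;
        public final Map<String, Long> loadBytesByServer; // empty when loadOmitted is true
        public final boolean loadOmitted;                  // flag telling the client to ask separately

        StatusReply(List<String> serverNames, Map<String, Long> load, boolean loadOmitted) {
            this.serverNames = serverNames;
            this.loadBytesByServer = load;
            this.loadOmitted = loadOmitted;
        }
    }

    // server name -> size in bytes of its serialized load (stand-in for the real load object)
    private final Map<String, Long> loadBytesByServer;
    private final long maxReplyBytes; // assumed server-side limit

    public PartialClusterStatusSketch(Map<String, Long> loadBytesByServer, long maxReplyBytes) {
        this.loadBytesByServer = new LinkedHashMap<>(loadBytesByServer);
        this.maxReplyBytes = maxReplyBytes;
    }

    /** Existing call, unchanged in shape: drops the load list when it would be too big. */
    public StatusReply getClusterStatus() {
        long total = 0;
        for (long size : loadBytesByServer.values()) {
            total += size;
        }
        List<String> names = new ArrayList<>(loadBytesByServer.keySet());
        if (total > maxReplyBytes) {
            return new StatusReply(names, Collections.<String, Long>emptyMap(), true);
        }
        return new StatusReply(names, new LinkedHashMap<>(loadBytesByServer), false);
    }

    /** Hypothetical new call: load for just the servers the client asks about. */
    public Map<String, Long> getServerLoad(List<String> servers) {
        Map<String, Long> out = new LinkedHashMap<>();
        for (String server : servers) {
            Long load = loadBytesByServer.get(server);
            if (load != null) {
                out.put(server, load);
            }
        }
        return out;
    }
}
```

A client would check `loadOmitted` on the reply and, if set, call `getServerLoad` for the subset of region servers it cares about, which keeps each individual message bounded.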

bq. Seems like it's possible to hook up as src for D3 graphing
Hmm, that's something different, drawing metrics in the UI?


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611166#comment-14611166
 ] 

Mikhail Antonov commented on HBASE-11747:
-

(also curious to hear more opinions?)


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread Mikhail Antonov (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611159#comment-14611159
]

Mikhail Antonov commented on HBASE-11747:
-

bq. JMX. Idea is to help shrink ClusterStatus by moving metrics out.
Hmm. JMX isn't a transport for messages, is it? I think I'm missing something
here.. I thought only of RPC messaging overhaul here. Could you describe JMX
approach?

bq. Rather than have a protocol that cuts in only when we are too big, could we
not slim ClusterStatus so vitals only and always require client use a separate
call for detail (or go to metrics system if it is counts, etc., that it is
interested in). I like your suggestion of adding a new call for doing new
protocol. That'd be best.

So you think we should just modify the ClusterStatus proto server-side wiring so that
we never include load info in the message (we can avoid completely removing this
field to maintain wire compatibility?), and add a new rpc method? That's what I'm
thinking now too. Question - how would this new RPC overlap with metrics
functionality?

Let me walk through the users of ClusterStatus and see which of them actually use load
info and for what (balancer, what else).

bq. Yes. Pardon my conflation. Will restrain myself in future.
Oh, I just meant, are there more aspects of this problem than what I see now,
which should be considered while deciding what approach to take?


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611125#comment-14611125
]

stack commented on HBASE-11747:
---

bq. Did you mean rpc, not jmx?

JMX. Idea is to help shrink ClusterStatus by moving metrics out.

bq. I'm missing something, we don't really use it in any heartbeats

Right (I thought we did but it is just a sub-element, the ServerLoad, that is
passed on the heartbeat -- pardon me).

bq. check if monolithic cluster status is looking too big (over defined limit)
on server side, and return it with empty load in this case, setting some flag
indicating that message is partially constructed to not fail at transport
level, and that client should use separate call to request server/region load
for the list of RSs it's interested to know about?

Rather than have a protocol that cuts in only when we are too big, could we not
slim ClusterStatus so it is vitals only and always require the client to use a separate
call for detail (or go to the metrics system if it is counts, etc., that it is
interested in)?

I like your suggestion of adding a new call for doing the new protocol. That'd be
best.

bq. Hmm, that's something different, drawing metrics in the UI?

Yes. Pardon my conflation. Will restrain myself in future.


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-07-01 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611189#comment-14611189
]

stack commented on HBASE-11747:
---

bq. JMX isn't a transport for messages, is it?

No. Generally JMX is for management. HBase uses it to publish server attributes
and metrics. HBase also puts up a JMX Bean Server so you can query the beans
over the net. This mechanism uses Java's crazy RMI, which is mostly unusable by
systems other than Java and, even then, has a ping-pong random port mechanism
that requires open port ranges. The nice thing about https://jolokia.org/
is that it REST/JSON-ifies our JMX, making it more palatable to more systems.

bq. Could you describe JMX approach?

ClusterStatus is made of various attributes including ServerLoad for every node
in the cluster. ServerLoad is not actually server load. Rather, it is a dumping
ground for all and sundry including server attributes, configuration, and
metrics. Redoing ServerLoad so it is just load vitals would be a nice to have
so we don't flood the master once a second as all report in with fat messages
on their heartbeats. Server metrics are also available published out of our
metrics system. Metrics are published variously -- as text in a servlet and as
jmx beans available on each server (jmx is on a period IIRC, servlet is poll).
That we are dumping out our metrics on a period via JMX and that we then go and
collect them all again to put on a heartbeat is silly. Would be nice to
refactor. If ServerLoad is slimmed, then it would help here given we do one up
for each server and insert in ClusterStatus.

That was high-level what I was thinking. Separate issue I'd say, a background
consideration when addressing this one.

bq. Question - how would this new RPC overlap with metrics functionality?

Was thinking they'd be distinct. If you want metrics, use our metrics system;
we are publishing our metrics per server anyways.


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-06-02 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14570287#comment-14570287
 ] 

Andrew Purtell commented on HBASE-11747:


One option is to use CodedInputStream#setSizeLimit in the client to effectively 
disable this check by setting it to Integer.MAX_VALUE.


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2015-04-27 Thread Dev Lakhani (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514224#comment-14514224
 ] 

Dev Lakhani commented on HBASE-11747:
-

Is there any progress on this, or a workaround we can make use of? The comments 
above by [~virag] mention setting CodedInputStream.setSizeLimit(). Where can we 
do this? Is it possible to do this in the application code? Or is it possible 
to set some other config param? For example, will 
replication.source.size.capacity help as a workaround until a fix is 
implemented? Thanks


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2014-08-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101323#comment-14101323
 ] 

stack commented on HBASE-11747:
---

Good one. Every RS sending 100MB of 'status' to the master every second or so 
is just obnoxious, especially so when much of this info is being duplicated on 
our metrics 'channel'.  Thanks for bringing this one up Virag. We need a bit of 
fixup in here.




--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2014-08-14 Thread Virag Kothari (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097370#comment-14097370
 ] 

Virag Kothari commented on HBASE-11747:
---

This exception is thrown when the message size exceeds 64MB. With 1M 
regions (open but holding no data) on 160 servers, the size is around 100MB. 
For now, I worked around it by setting CodedInputStream.setSizeLimit() to a 
very high value. 
Do we need thinner APIs? I assume RegionLoad is quite heavy.
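
As a rough sanity check on those numbers, the sketch below works out the average payload per region and how many regions would still fit under the 64MB limit. It assumes the payload grows linearly with region count and that 1MB means 2^20 bytes. The class and method names are made up for illustration and are not part of HBase or protobuf.

```java
// Back-of-the-envelope sizing for the figures quoted in the comment above.
// Assumption: ClusterStatus size scales linearly with the number of regions.
final class StatusSizeEstimate {
    static final long MB = 1024L * 1024L;

    // Average serialized bytes per region for a given payload and region count.
    static long bytesPerRegion(long totalBytes, long regions) {
        return totalBytes / regions;
    }

    // Largest region count whose payload still fits under the size limit.
    static long maxRegionsUnderLimit(long limitBytes, long bytesPerRegion) {
        return limitBytes / bytesPerRegion;
    }

    public static void main(String[] args) {
        long perRegion = bytesPerRegion(100 * MB, 1_000_000L);
        long maxRegions = maxRegionsUnderLimit(64 * MB, perRegion);
        System.out.println("approx bytes per region: " + perRegion);
        System.out.println("approx regions that fit in 64MB: " + maxRegions);
    }
}
```

Under these assumptions each region costs on the order of 100 bytes, so the limit would be hit somewhere around 650,000 regions. Raising the limit only moves that ceiling.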


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2014-08-14 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097387#comment-14097387
 ] 

Andrew Purtell commented on HBASE-11747:


bq. Do we need thinner APIs? I assume RegionLoad is quite heavy.

Yes, but we have to be careful in 0.98 not to change APIs in a breaking way. I 
think increasing the message size limit to work around the problem is fine 
given that consideration. How often do you plan to call 
HBaseAdmin#getClusterStatus?


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2014-08-14 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097392#comment-14097392
 ] 

Andrew Purtell commented on HBASE-11747:


We can add *new* APIs. I wonder if it would be workable to introduce a 
streaming status API where the client uses a cursor to iterate over the 
master's picture of the cluster. Might be tricky wherever regions have migrated 
or servers have come and gone. The master would have to provide either a 
consistent snapshot of state or track changes since the client opened the 
cursor and mix in change deltas with iteration results.
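
A minimal sketch of the "consistent snapshot" variant of that idea follows. Everything here is hypothetical: StatusCursor, its methods, and the use of plain strings for region entries are illustration only and do not correspond to any existing HBase API. The cursor copies the master's view when it is opened and hands it out in pages, so later region moves do not affect an iteration already in progress.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical cursor over a point-in-time copy of the cluster picture.
final class StatusCursor {
    private final List<String> snapshot; // frozen copy taken when the cursor opens
    private int position = 0;

    StatusCursor(List<String> liveEntries) {
        this.snapshot = Collections.unmodifiableList(new ArrayList<>(liveEntries));
    }

    boolean hasNext() {
        return position < snapshot.size();
    }

    // Returns up to pageSize entries and advances the cursor.
    List<String> nextPage(int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int end = Math.min(position + pageSize, snapshot.size());
        List<String> page = snapshot.subList(position, end);
        position = end;
        return page;
    }
}

final class StatusCursorDemo {
    public static void main(String[] args) {
        List<String> live = new ArrayList<>(Arrays.asList("r1", "r2", "r3", "r4", "r5"));
        StatusCursor cursor = new StatusCursor(live);
        live.add("r6"); // a change after the cursor opened is not visible to it
        while (cursor.hasNext()) {
            System.out.println(cursor.nextPage(2));
        }
    }
}
```

The cost is that the master holds a full copy per open cursor. The change-delta alternative mentioned above would avoid that copy at the price of more bookkeeping.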


[jira] [Commented] (HBASE-11747) ClusterStatus is too bulky

2014-08-14 Thread Elliott Clark (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097506#comment-14097506
 ] 

Elliott Clark commented on HBASE-11747:
---

We should look at using smaller region names as well.  There's no need to send 
the whole region name across.
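
To illustrate the potential saving, here is a small sketch that replaces a long region name with a fixed-length identifier. The choice of an MD5 hex digest, the ShortRegionName class, and the sample region name are assumptions made for this example only. The sketch does not claim to match how HBase derives or transmits region identifiers.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derive a fixed-length id from a full region name.
final class ShortRegionName {
    static String shortId(String fullRegionName) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(fullRegionName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex chars per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Made-up region name with a long start key, for illustration only.
        String full = "usertable,row-0000000000000000000000000000000000500000,1408000000000";
        String id = shortId(full);
        System.out.println("full name length: " + full.length());
        System.out.println("short id length:  " + id.length());
    }
}
```

The identifier has the same length no matter how long the start key is, so the saving grows with key size. The receiver would need some way to map the short form back to a region.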

