[jira] [Commented] (HBASE-12141) ClusterStatus message might exceed max datagram payload limits

2014-10-02 Thread Nicolas Liochon (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156267#comment-14156267 ]

Nicolas Liochon commented on HBASE-12141:
-----------------------------------------

Yeah, the strategy was to keep the message small enough (if multiple servers 
fail simultaneously, we send multiple messages instead of one). We also send 
the message multiple times in case it gets lost somewhere. I had issues with 
Netty 3.x when I tried to add frames; I haven't tried very hard. We could make 
MAX_SERVER_PER_MESSAGE configurable for networks with a very small MTU? It's 
also possible to compress the message, though once again I had issues with 
Netty 3.x for that in the past.
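For illustration, a minimal sketch of that chunking idea (the cap value of 10 
and the publish() helper are hypothetical here, not the actual HBase code):

{code}
import java.util.List;

public class ChunkedStatusPublisher {
  // Assumed cap on dead servers per datagram; the real constant in HBase
  // may use a different value.
  static final int MAX_SERVER_PER_MESSAGE = 10;

  // If many servers fail at once, send several small messages instead of
  // one large one, so each datagram stays well under the MTU.
  void publishDeadServers(List<String> deadServers) {
    for (int i = 0; i < deadServers.size(); i += MAX_SERVER_PER_MESSAGE) {
      int end = Math.min(i + MAX_SERVER_PER_MESSAGE, deadServers.size());
      publish(deadServers.subList(i, end));
    }
  }

  // Hypothetical send method: serialize a ClusterStatus carrying only this
  // chunk and write it to the multicast channel (possibly several times,
  // since datagrams can be lost).
  void publish(List<String> chunk) {
  }
}
{code}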

That said, I would be interested to understand the network config. 

 ClusterStatus message might exceed max datagram payload limits
 ---------------------------------------------------------------

 Key: HBASE-12141
 URL: https://issues.apache.org/jira/browse/HBASE-12141
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.3
Reporter: Andrew Purtell

 The multicast ClusterStatusPublisher and its companion listener are using 
 datagram channels without any framing. I think this is an issue because 
 Netty's ProtobufDecoder expects a complete PB message to be available in the 
 ChannelBuffer, yet ClusterStatus messages can be large and might exceed the 
 maximum datagram payload size. As one user reported on list:
 {noformat}
 org.apache.hadoop.hbase.client.ClusterStatusListener - ERROR - Unexpected exception, continuing.
 com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
         at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
         at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
         at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
         at org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos$ClusterStatus.<init>(ClusterStatusProtos.java:7554)
         at org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos$ClusterStatus.<init>(ClusterStatusProtos.java:7512)
         at org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos$ClusterStatus$1.parsePartialFrom(ClusterStatusProtos.java:7689)
         at org.apache.hadoop.hbase.protobuf.generated.ClusterStatusProtos$ClusterStatus$1.parsePartialFrom(ClusterStatusProtos.java:7684)
         at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:141)
         at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:176)
         at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:182)
         at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
         at org.jboss.netty.handler.codec.protobuf.ProtobufDecoder.decode(ProtobufDecoder.java:122)
         at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
         at org.jboss.netty.channel.socket.oio.OioDatagramWorker.process(OioDatagramWorker.java:52)
         at org.jboss.netty.channel.socket.oio.AbstractOioWorker.run(AbstractOioWorker.java:73)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:745)
 {noformat}
 The javadoc for ProtobufDecoder says:
 {quote}
 Decodes a received ChannelBuffer into a Google Protocol Buffers Message and 
 MessageLite. Please note that this decoder must be used with a proper 
 FrameDecoder such as ProtobufVarint32FrameDecoder or 
 LengthFieldBasedFrameDecoder if you are using a stream-based transport such 
 as TCP/IP.
 {quote}
 and even though we are using a datagram transport, we have related issues, 
 depending on what the sending and receiving OS does with overly large 
 datagrams:
 - We may receive a datagram with a truncated message
 - We may get an upcall when processing one fragment of a fragmented datagram, 
 where the complete message is not available yet
 - We may not be able to send the overly large ClusterStatus in the first 
 place. Linux claims to do PMTU discovery and return EMSGSIZE if a datagram 
 packet payload exceeds the MTU, but will send a fragmented datagram if PMTU 
 discovery is disabled. I'm surprised we have the above report given that the 
 default is to reject overly large datagram payloads, so perhaps the user is 
 using a different server OS, or Netty datagram channels do their own 
 fragmentation (I haven't checked).
 In any case, the server and client pipelines are definitely not doing any 
 kind of framing. This is the multicast status listener from 0.98 for example:
 {code}
 b.setPipeline(Channels.pipeline(
     new ProtobufDecoder(ClusterStatusProtos.ClusterStatus.getDefaultInstance()),
     new ClusterStatusHandler()));
 {code}
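For contrast, the framed arrangement the ProtobufDecoder javadoc calls for on 
stream transports would look roughly like this (a sketch against the Netty 3.x 
API, not a proposed patch; ClusterStatusProtos and ClusterStatusHandler are the 
HBase classes from the snippet above):

{code}
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.handler.codec.protobuf.ProtobufDecoder;
import org.jboss.netty.handler.codec.protobuf.ProtobufVarint32FrameDecoder;

ChannelPipeline framedPipeline() {
  // The frame decoder buffers bytes until one complete varint32-length-
  // prefixed message has arrived, so ProtobufDecoder always sees a whole
  // ClusterStatus rather than a truncated fragment.
  return Channels.pipeline(
      new ProtobufVarint32FrameDecoder(),
      new ProtobufDecoder(ClusterStatusProtos.ClusterStatus.getDefaultInstance()),
      new ClusterStatusHandler());
}
{code}

The sending side would need the matching ProtobufVarint32LengthFieldPrepender, 
and none of this changes the underlying constraint the bullets above describe: 
a UDP payload larger than the path MTU either fails with EMSGSIZE or gets 
fragmented.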
[jira] [Commented] (HBASE-12141) ClusterStatus message might exceed max datagram payload limits

2014-10-02 Thread Andrew Purtell (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156857#comment-14156857 ]

Andrew Purtell commented on HBASE-12141:


See 
http://mail-archives.apache.org/mod_mbox/hbase-user/201410.mbox/%3C3256288.x8cyWY5ZEW%40localhost.localdomain%3E 
. The network configuration is interesting.



[jira] [Commented] (HBASE-12141) ClusterStatus message might exceed max datagram payload limits

2014-10-02 Thread Andrew Purtell (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-12141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156860#comment-14156860 ]

Andrew Purtell commented on HBASE-12141:


... which could explain the fragmentation, I think. The MTU of OpenVPN tunnels 
will be smaller than normal by the packet header plus tunnel protocol 
overheads, and possibly not subject to PMTU discovery. I'd wager our channel 
handler is being invoked when the first fragment is received, so the PB is 
truncated but the remainder will show up soon.
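
One defensive variant along those lines (just a sketch, not something HBase 
does): have the publisher prefix each datagram payload with its serialized 
length, so the listener can drop a truncated datagram instead of feeding a 
partial buffer to the protobuf parser:

{code}
import java.nio.ByteBuffer;

final class DatagramFraming {
  // Encode: 4-byte big-endian length followed by the serialized message.
  static byte[] frame(byte[] pbBytes) {
    return ByteBuffer.allocate(4 + pbBytes.length)
        .putInt(pbBytes.length)
        .put(pbBytes)
        .array();
  }

  // Decode: return the message bytes, or null if the datagram is shorter
  // than advertised (e.g. only the first fragment of an over-MTU message
  // made it through the tunnel).
  static byte[] unframe(byte[] datagram) {
    if (datagram.length < 4) {
      return null;
    }
    ByteBuffer buf = ByteBuffer.wrap(datagram);
    int len = buf.getInt();
    if (len < 0 || buf.remaining() < len) {
      return null;  // truncated or corrupt: drop rather than parse
    }
    byte[] pb = new byte[len];
    buf.get(pb);
    return pb;
  }
}
{code}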
