ege-st commented on code in PR #11496:
URL: https://github.com/apache/pinot/pull/11496#discussion_r1319899550
##########
pinot-core/src/main/java/org/apache/pinot/core/transport/DataTableHandler.java:
##########
@@ -83,5 +98,12 @@ protected void channelRead0(ChannelHandlerContext ctx,
ByteBuf msg) {
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
LOGGER.error("Caught exception while handling response from server: {}",
_serverRoutingInstance, cause);
_brokerMetrics.addMeteredGlobalValue(BrokerMeter.RESPONSE_FETCH_EXCEPTIONS, 1);
+ if (cause instanceof java.lang.OutOfMemoryError) {
Review Comment:
@gortiz
Re broker sending a max size: I don't have strong opinions on implementation
details, but once we get a bandaid for the Direct OOM issue then it'd be
awesome to dive into discussing ways we might do proactive resource management.
Re the Netty config option: I will test that proposal but I'm not sure that
it will prevent Netty from loading the entire message into Direct Memory. We'll
see with testing.
One thing is we don't have any message length field in our payload format
(which is unfortunate) which seems to be what this feature is built around. I
think the implication of that is this might work to provide the "max buffer
size" so that messages which exceed that buffer get dropped, but we since we
don't know how many bytes a message should have we can't reconstruct a message
from multiple buffers. Put another way: without changing our message format,
what we can do is set a max buffer size that won't cause an OOM and attempt to
deserialize the message and if it fails then drop the message and return an
error to the application layer. I think, that without having a length field in
the message we can't reconstruct the message because we won't know when to stop
appending byte buffers to our aggregate.
However, even if it works and we use it as the fix, I still think we need
explicit handling of DM OOMs in Netty, there will always be some confluence of
events that will make it happen and making sure we handle it gracefully is,
imo, a base line need.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]