Hi Gabor,

I currently can't offer much advice there because I don't know which
components from the Flink framework depend on pekko's frame size.

I know there are warnings logged for network memory and network buffers,
something like "110% of network memory requested, max value X, consider
increasing it." Maybe it's possible to have something like that for RPC
frame size? Ideally exposed with a metric that shows when requests exceed a
certain percentage of the max value.
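
To make that a bit more concrete, here is a rough sketch of the kind of check
I have in mind. It is plain Java with no Flink or Pekko dependencies.
FrameSizeMonitor, record and peakUsageRatio are names I made up for
illustration, not existing APIs, and I don't know where in the RPC layer such
a check would actually have to be hooked in.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, NOT part of Flink or Pekko: tracks serialized RPC
// payload sizes against a configured frame size limit.
final class FrameSizeMonitor {
    private final long maxFrameSizeBytes;
    private final double warnRatio;
    private final AtomicLong largestPayloadBytes = new AtomicLong();

    FrameSizeMonitor(long maxFrameSizeBytes, double warnRatio) {
        if (maxFrameSizeBytes <= 0) {
            throw new IllegalArgumentException("maxFrameSizeBytes must be positive");
        }
        if (warnRatio <= 0 || warnRatio > 1) {
            throw new IllegalArgumentException("warnRatio must be in (0, 1]");
        }
        this.maxFrameSizeBytes = maxFrameSizeBytes;
        this.warnRatio = warnRatio;
    }

    /** Records one payload size; returns true if it reaches the warning threshold. */
    boolean record(long payloadBytes) {
        // Remember the largest payload seen so far (thread-safe).
        largestPayloadBytes.accumulateAndGet(payloadBytes, Math::max);
        boolean overThreshold = payloadBytes >= warnRatio * maxFrameSizeBytes;
        if (overThreshold) {
            System.err.printf(
                "WARN: RPC payload of %d bytes is %.0f%% of the %d-byte frame size limit,"
                    + " consider increasing it.%n",
                payloadBytes, 100.0 * payloadBytes / maxFrameSizeBytes, maxFrameSizeBytes);
        }
        return overThreshold;
    }

    /** Largest payload seen so far as a fraction of the limit; a gauge could report this. */
    double peakUsageRatio() {
        return (double) largestPayloadBytes.get() / maxFrameSizeBytes;
    }
}
```

With a limit of 10485760 bytes (which I believe is the default for
pekko.framesize) and a warning ratio of 0.8, record(9_000_000L) would log the
warning and return true, while record(1_000_000L) would stay silent. Alerting
on peakUsageRatio() would then give some notice before the limit is reached.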

Regards,
Alexis.

On Wed, 21 Jan 2026, 12:42 Gabor Somogyi, <[email protected]> wrote:

> Hi Alexis,
>
> I'm not aware of such a feature. Just for my own understanding, how would
> you imagine such a feature working?
>
> BR,
> G
>
>
> On Wed, Jan 21, 2026 at 11:20 AM Alexis Sarda-Espinosa <
> [email protected]> wrote:
>
>> Hello,
>>
>> If I recall correctly, pekko's frame size (and also akka's in the past)
>> was always an issue. I think documentation said that sometimes the
>> application just needs a larger size and it's not possible to know in
>> advance when that can happen. Today we saw a job restart and subsequently
>> crashloop with this exception cause:
>>
>> Caused by: java.util.concurrent.TimeoutException: Invocation of
>> [RemoteRpcInvocation(TaskExecutorGateway.submitTask(TaskDeploymentDescriptor,
>> JobMasterId, Duration))] at recipient [pekko.tcp://
>> [email protected]:6122/user/rpc/taskmanager_0] timed out. This is
>> usually caused by: 1) Pekko failed sending the message silently, due to
>> problems like oversized payload or serialization failures. In that case,
>> you should find detailed error information in the logs. 2) The recipient
>> needs more time for responding, due to problems like slow machines or
>> network jitters. In that case, you can try to increase pekko.ask.timeout.
>>
>> To fix this, I increased both pekko.ask.timeout & pekko.framesize
>> simultaneously, so I'm not sure which one was the root cause, but in any
>> case, is there still no way to monitor if this limit could be reached
>> before it happens?
>>
>> This was with Flink 2.1.1
>>
>> Regards,
>> Alexis.
>>
>
