[ https://issues.apache.org/jira/browse/FLINK-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jacob Sevart updated FLINK-14618:
---------------------------------
Description:
I'm hitting the akka framesize limit in production with some regularity, often
when the job has been running for a long time and we try to deploy or restart.
I suspect it's checkpoint related because clearing the checkpoint enables the
job to start up.
The [guidance|https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html] says:
{quote}If Flink fails because messages exceed this limit, then you should
increase it.
{quote}
The [error message|#L270]] is not very helpful towards that end. How large does
it need to be? How do I know whether increasing the size will fix it, or if the
message is unreasonably large due to a bug?
I'd like to modify the exception message to report the value of size.
This is related to FLINK-4399 but should be a much simpler fix.
was:
I'm hitting the akka framesize limit in production with some regularity, often
when the job has been running for a long time and we try to deploy or restart.
I suspect it's checkpoint related because clearing the checkpoint enables the
job to start up.
The [guidance|https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html] says:
{quote}If Flink fails because messages exceed this limit, then you should
increase it.
{quote}
The [error message|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/akka/AkkaInvocationHandler.java#L270]
is not very helpful towards that end. How large does it need to be? How do I
know whether increasing the size will fix it, or if the message is unreasonably
large due to a bug?
I'd like to modify the exception message to report the value of size.
> Give more detailed debug information on akka framesize exception
> ----------------------------------------------------------------
>
> Key: FLINK-14618
> URL: https://issues.apache.org/jira/browse/FLINK-14618
> Project: Flink
> Issue Type: Improvement
> Components: Documentation, Runtime / Network
> Affects Versions: 1.6.3
> Reporter: Jacob Sevart
> Priority: Minor
>
> I'm hitting the akka framesize limit in production with some regularity,
> often when the job has been running for a long time and we try to deploy or
> restart. I suspect it's checkpoint related because clearing the checkpoint
> enables the job to start up.
> The [guidance|https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html] says:
> {quote}If Flink fails because messages exceed this limit, then you should
> increase it.
> {quote}
> The [error message|#L270]] is not very helpful towards that end. How large
> does it need to be? How do I know whether increasing the size will fix it, or
> if the message is unreasonably large due to a bug?
> I'd like to modify the exception message to report the value of size.
> This is related to FLINK-4399 but should be a much simpler fix.
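
The change proposed above could look roughly like the following. This is only a sketch under stated assumptions, not the actual Flink code: the class name FramesizeCheck, the helper names, and the message wording are all hypothetical. It assumes the check compares a serialized payload size against the configured limit, and uses 10485760 bytes as an example limit. The point is that the exception text reports both the offending size and the limit, so an operator can tell how far to raise akka.framesize or whether the message is unreasonably large.

```java
import java.io.IOException;

// Hypothetical sketch; not the real AkkaInvocationHandler.
final class FramesizeCheck {

    // Builds an error message that reports the actual size and the limit.
    static String buildMessage(String methodName, long serializedSize, long maximumFramesize) {
        return "The rpc invocation size " + serializedSize
            + " exceeds the maximum akka framesize " + maximumFramesize
            + " (method: " + methodName + ").";
    }

    // Throws if the serialized payload is larger than the configured framesize.
    static void checkFramesize(String methodName, long serializedSize, long maximumFramesize)
            throws IOException {
        if (serializedSize > maximumFramesize) {
            throw new IOException(buildMessage(methodName, serializedSize, maximumFramesize));
        }
    }

    public static void main(String[] args) {
        try {
            // Example values only: a 12,000,000-byte payload against a 10485760-byte limit.
            checkFramesize("exampleRpcMethod", 12_000_000L, 10_485_760L);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a message like this, a size slightly above the limit suggests raising the limit, while a size orders of magnitude larger points towards a bug.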