[ 
https://issues.apache.org/jira/browse/FLINK-22729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-22729:
---------------------------------
    Comment: was deleted

(was: This issue was labeled "stale-major" 7 days ago and has not received any 
updates so it is being deprioritized. If this ticket is actually Major, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.
)

> Truncated Messages in Python workers
> ------------------------------------
>
>                 Key: FLINK-22729
>                 URL: https://issues.apache.org/jira/browse/FLINK-22729
>             Project: Flink
>          Issue Type: Bug
>          Components: Stateful Functions
>    Affects Versions: statefun-2.2.2
>         Environment: The Stateful Functions version is 2.2.2, Java 8. The Java
> app as well as the external Python workers are deployed in the same
> Kubernetes cluster.
>            Reporter: Stephan Ewen
>            Priority: Critical
>             Fix For: statefun-3.1.0
>
>
> Recently we started seeing the following faulty behavior in the Flink
> Stateful Functions HTTP communication with external Python workers.
> It occurs only when the system is under heavy load.
> The Java application sends HTTP messages to an external Python
> function, but the function fails to parse the message with a
> "Truncated Message" error. Printouts show that the truncated message
> looks as follows:
> {code}
> <Start of Message>
> my.protobuf.MyClass: <Protobuf Content>
> my.protobuf.MyClass: <Protobuf Content>
> my.protobuf.MyClass: <Protobuf Content>
> my.protobuf.MyClass: <Protob
> {code}
> This leads to the following error in the Python worker:
> {code}
> Error Parsing Message: Truncated Message
> {code}
> Either the sender or the receiver (or something in between) seems to be
> truncating some (not all) messages at a random point in the payload.
> The source code in both Flink SDKs looks correct. We temporarily
> worked around this by setting the "maxNumBatchRequests" parameter in the
> external function definition to a very low value. This is not an ideal
> solution, however, as we believe it adds considerable communication
> overhead between the Java and the Python functions.
> The Stateful Functions version is 2.2.2, Java 8. The Java app as well as
> the external Python workers are deployed in the same Kubernetes cluster.
> ----
> This was reported on the mailing list at
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Statefun-Truncated-Messages-in-Python-workers-td43831.html
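
The report leaves open whether the sender, the receiver, or something in between truncates the payload. One possible way to narrow this down, sketched below, is to compare the length announced in the Content-Length header with the number of body bytes the Python worker actually received. The function name missing_bytes is hypothetical and not part of any StateFun SDK, and the sketch assumes the request carries a Content-Length header at all (it would not under chunked transfer encoding).

```python
from typing import Mapping, Optional


def missing_bytes(headers: Mapping[str, str], body: bytes) -> Optional[int]:
    """Return how many announced bytes were not received.

    Hypothetical diagnostic helper, not part of the StateFun SDK.
    Returns None when no Content-Length header is present, because
    then there is no declared length to compare against.
    """
    declared = None
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-length":
            declared = int(value)
            break
    if declared is None:
        return None
    # A positive result means the body is shorter than announced.
    return max(declared - len(body), 0)


if __name__ == "__main__":
    # Example: 10 bytes announced, only 5 received.
    print(missing_bytes({"Content-Length": "10"}, b"12345"))
```

Under these assumptions, a positive result logged in the worker would suggest that bytes were lost in transit or while the worker's HTTP server read the request, whereas a result of 0 on a message that still fails to parse would suggest the payload was already incomplete when the sender computed its length.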



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
